Checking for an array overflow

André Hänsel · Jul 11, 2010

Hello list,

while programming for an embedded platform in C, I have an array and a
pointer to a single element of the array:

uint8_t buffer[100], *bp = buffer;

*bp = U0RXBUF; // Read a character
bp++; // Move pointer forward
if (bp - buffer == sizeof(buffer)/sizeof(uint8_t)) // Buffer overflow?
bp = buffer; // Reset pointer to beginning

When I look at the disassembly there is no code generated for "bp =
buffer;".

Is the compiler allowed to assume that bp - buffer will never be 100
because the last element of buffer is 99 and optimize it away? Or did
I make some other mistake?

Regards,
André

Ian Collins · Jul 12, 2010

Hello list,

while programming for an embedded platform in C, I have an array and a
pointer to a single element of the array:

uint8_t buffer[100], *bp = buffer;

*bp = U0RXBUF; // Read a character
bp++; // Move pointer forward
if (bp - buffer == sizeof(buffer)/sizeof(uint8_t)) // Buffer overflow?
bp = buffer; // Reset pointer to beginning

When I look at the disassembly there is no code generated for "bp =
buffer;".

Is the compiler allowed to assume that bp - buffer will never be 100
because the last element of buffer is 99 and optimize it away?

It can't because bp - buffer can be 100. There isn't any array bounds
checking in C.

Or did I make some other mistake?

Is that the exact code?

Thad Smith · Jul 12, 2010

Ian said:
On 07/12/10 10:40 AM, André Hänsel wrote:

while programming for an embedded platform in C, I have an array and a
pointer to a single element of the array:

uint8_t buffer[100], *bp = buffer;

*bp = U0RXBUF; // Read a character
bp++; // Move pointer forward
if (bp - buffer == sizeof(buffer)/sizeof(uint8_t)) // Buffer overflow?
bp = buffer; // Reset pointer to beginning

When I look at the disassembly there is no code generated for "bp =
buffer;".

Is the compiler allowed to assume that bp - buffer will never be 100
because the last element of buffer is 99 and optimize it away?

Click to expand...

It can't because bp - buffer can be 100.

More precisely, behavior for bp = buffer+100 is defined and (bp - buffer) is
well defined for that value.

Is that the exact code?

If it is, bp == buffer+1 when tested, so the compiler can safely omit the
assignment (I assume that the real code has a loop somewhere).

Keith Thompson · Jul 12, 2010

Ian Collins said:
while programming for an embedded platform in C, I have an array and a
pointer to a single element of the array:

uint8_t buffer[100], *bp = buffer;

*bp = U0RXBUF; // Read a character
bp++; // Move pointer forward
if (bp - buffer == sizeof(buffer)/sizeof(uint8_t)) // Buffer overflow?
bp = buffer; // Reset pointer to beginning

When I look at the disassembly there is no code generated for "bp =
buffer;".

Is the compiler allowed to assume that bp - buffer will never be 100
because the last element of buffer is 99 and optimize it away?

Click to expand...

It can't because bp - buffer can be 100. There isn't any array bounds
checking in C.

More precisely, C doesn't *require* array bounds checking.
It indirectly *permits* it, by making the behavior of any program
that violates array bounds undefined; an implementation can take
advantage of that by defining the behavior any way it likes.

For code similar to the above, I can imagine the compiler assuming that
nothing whose behavior is undefined can have happened, and therefore
chooses to omit code that can only be executed if the behavior is
already undefined.

But for this particular code, I don't think that can happen.
buffer+100 is a perfectly legal value for bp (it points one past
the end of an array, so you can't dereference it, but you don't
try so that's ok).

Is that the exact code?

That's the question. It might also be better to examine the visible
behavior of the code rather than (or in addition to) the generated code.

nothing · Jul 12, 2010

When I look at the disassembly there is no code generated for "bp =
buffer;".

It's probably a conditional branch to the first assignment.

Also, I would have written the code as

#define BUFSIZE 100
uint8_t buffer[BUFSIZE];

void foo () {
static uint8_t *bp = buffer;

bp++ = U0RXBUF;
if(bp >= buffer + BUFSIZE)
bp = buffer;
}

But that's just personal choice.

Eric Sosman · Jul 12, 2010

When I look at the disassembly there is no code generated for "bp =
buffer;".

Click to expand...

It's probably a conditional branch to the first assignment.

Also, I would have written the code as

#define BUFSIZE 100
uint8_t buffer[BUFSIZE];

void foo () {
static uint8_t *bp = buffer;

bp++ = U0RXBUF;
if(bp>= buffer + BUFSIZE)
bp = buffer;
}

But that's just personal choice.

Two notable points here: First, the original's division by
sizeof(uint8_t) was a bad idea, and would not have worked at all
had the buffer elements been of some type with a sizeof>1. Second,
the offered correction is missing a rather important asterisk, and
will not compile ...

Ben Bacarisse · Jul 12, 2010

[...] the original's division by
sizeof(uint8_t) was a bad idea, and would not have worked at all
had the buffer elements been of some type with a sizeof>1.

Just to clarify (because I was momentarily taken aback) you mean using a
fixed type name (uint8_t) is a bad idea because it would need to be
changed if the buffer element type were changed. I think your remark
could be misconstrued to mean that dividing by the element size is, of
itself, a bad idea in this situation.

<snip>

Eric Sosman · Jul 12, 2010

[restoring the original code:]

| uint8_t buffer[100], *bp = buffer;
|
| *bp = U0RXBUF; // Read a character
| bp++; // Move pointer forward
| if (bp - buffer == sizeof(buffer)/sizeof(uint8_t)) // Buffer overflow?
| bp = buffer; // Reset pointer to beginning

[...] the original's division by
sizeof(uint8_t) was a bad idea, and would not have worked at all
had the buffer elements been of some type with a sizeof>1.

Click to expand...

Just to clarify (because I was momentarily taken aback) you mean using a
fixed type name (uint8_t) is a bad idea because it would need to be
changed if the buffer element type were changed. I think your remark
could be misconstrued to mean that dividing by the element size is, of
itself, a bad idea in this situation.

That is exactly the construction I intended -- but alas! I was
guilty of over-hasty reading, and was mistaken in my intent. The
original code was

uint8_t buffer[100], *bp = buffer;
...
if (bp - buffer == sizeof(buffer)/sizeof(uint8_t)) // Buffer overflow?

.... and I said to myself "The writer has forgotten that pointer
subtraction already `divides' by the element size!" Of course, he
had not: His `sizeof(buffer)/sizeof(uint8_t)' is a perfectly valid
calculation of the number of elements in `buffer', and I think I was
expecting to see the more usual `sizeof buffer / sizeof buffer[0]',
and it was evil companions, Your Honor, evil companions who led me
astray, and I ask for clemency because of my clean(ish) arrest
record, and I'll promise faithfully not to do it again -- until
next time.

I will now undertake to disengage my toenails from my tonsils.

nothing · Jul 13, 2010

*bp++, oops

Dean

When I look at the disassembly there is no code generated for "bp =
buffer;".

Click to expand...

It's probably a conditional branch to the first assignment.

Also, I would have written the code as

#define BUFSIZE 100
uint8_t buffer[BUFSIZE];

void foo () {
static uint8_t *bp = buffer;

bp++ = U0RXBUF;
if(bp>= buffer + BUFSIZE)
bp = buffer;
}

But that's just personal choice.

Click to expand...

Two notable points here: First, the original's division by
sizeof(uint8_t) was a bad idea, and would not have worked at all
had the buffer elements been of some type with a sizeof>1. Second,
the offered correction is missing a rather important asterisk, and
will not compile ...

André Hänsel · Jul 15, 2010

Sorry, sorry, I did not look properly.

It does actually generate code for the assignment, but it does it
entirely inside the registers and then my IDE does not show the
corresponding line of C code:
00E198 4F0D mov.w R15,R13
00E19A 8E0D sub.w R14,R13
00E19C 903D 0064 cmp.w #0x64,R13
00E1A0 2001 jne 0xE1A4
00E1A2 4E0F mov.w R14,R15 <--- Here it is!
00E1A4 4130 ret

Anyway, I thank you for the enjoyable discussion and in addition
learnt the more common "sizeof buffer / sizeof buffer[0]" construction.

Keith Thompson · Jul 15, 2010

AndrÃ© HÃ¤nsel said:
Sorry, sorry, I did not look properly.

It does actually generate code for the assignment, but it does it
entirely inside the registers and then my IDE does not show the
corresponding line of C code:
00E198 4F0D mov.w R15,R13
00E19A 8E0D sub.w R14,R13
00E19C 903D 0064 cmp.w #0x64,R13
00E1A0 2001 jne 0xE1A4
00E1A2 4E0F mov.w R14,R15 <--- Here it is!
00E1A4 4130 ret

Anyway, I thank you for the enjoyable discussion and in addition
learnt the more common "sizeof buffer / sizeof buffer[0]" construction.

The lesson here (sorry if I'm stating the obvious) is that the
visible behavior of a program is the only real criterion for
determining whether the compiler is generating correct code.
Examining the generated code can be useful, but as you've seen
optimizing compilers can be sufficiently clever as to obscure what's
going on unless you read the code *very* carefully. The CPU is
better at interpreting machine code than you are (certainly better
than I am).

How to be sure a structure field is aligned	19	Apr 19, 2012
Passing an array of structures back through the argument list	9	Sep 7, 2013
Pointer Equality for Different Array Objects	11	Feb 3, 2012
An idea for heap allocation at near stack allocation speed	14	Feb 13, 2011
C Is Not Assembly	6	Apr 13, 2010
Array and Pointer Tutorial	53	May 11, 2006
Okay to move an "object" will-nilly?	33	Jun 28, 2006
Can a static array contain a dynamic array of pointers?	8	Apr 19, 2004

Checking for an array overflow

André Hänsel

Ian Collins

Thad Smith

Keith Thompson

nothing

Eric Sosman

Ben Bacarisse

Eric Sosman

nothing

André Hänsel

Keith Thompson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads