Index out of bounds question

Method Man

Say I have the following:

int main(void) {
char* p, q;
p = (char*) malloc(sizeof(char)*10);
q = (p + 100) - 99; /* legal? */
free(q - 1); /* legal? */
....
return 0;
}

Will this program always produce UB, always work, or is it compiler
dependent?
 
E. Robert Tisdale

Method said:
Say I have the following:

#include <stdlib.h>
int main(int argc, char* argv[]) {
char* p = (char*)malloc(sizeof(char)*10);
char* q = (p + 100) - 99; // illegal!
free(q - 1); // illegal!
// ....
return 0;
}

Will this program always produce UB?

This is an improper question.
Undefined Behavior (UB) is undefined.
There is no specific behavior to "produce".

Always work?

It works everywhere.

Or is it compiler dependent?

There are no ANSI/ISO C99 compliant compilers
that will not accept this code
and generate the expected output.
 
Ben Pfaff

Method Man said:
Say I have the following:

int main(void) {
char* p, q;

This is deceptive syntax. It *looks* like it's meant to declare
two pointers, but it *actually* declares a pointer and an
integer.
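
A minimal sketch of the declaration pitfall described above. The function names are hypothetical and exist only for illustration; the point is that the `*` attaches to a single declarator, not to the type.

```c
#include <stddef.h>

/* In `char* p, q;` the `*` binds to p only, so q is a plain char. */
size_t size_of_q_as_posted(void)
{
    char* p, q;
    (void)sizeof p;      /* reference p so it is not reported as unused */
    return sizeof q;     /* sizeof(char), which is 1 by definition */
}

/* Repeating the `*` on each declarator declares two pointers. */
size_t size_of_q_corrected(void)
{
    char *p, *q;
    (void)sizeof p;
    return sizeof q;     /* sizeof(char *) */
}
```

Writing the `*` next to the variable name rather than the type makes the per-declarator binding easier to see.
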
p = (char*) malloc(sizeof(char)*10);

I don't recommend casting the return value of malloc():

* The cast is not required in ANSI C.

* Casting its return value can mask a failure to #include
<stdlib.h>, which leads to undefined behavior.

* If you cast to the wrong type by accident, odd failures can
result.

Some others do disagree, such as P.J. Plauger (see article
<[email protected]>).

When calling malloc(), I recommend using the sizeof operator on
the object you are allocating, not on the type. For instance,
*don't* write this:

int *x = malloc (128 * sizeof (int)); /* Don't do this! */

Instead, write it this way:

int *x = malloc (128 * sizeof *x);

There are a few reasons to do it this way:

* If you ever change the type that `x' points to, it's not
necessary to change the malloc() call as well.

This is more of a problem in a large program, but it's still
convenient in a small one.

* Taking the size of an object makes writing the statement
less error-prone. You can verify that the sizeof syntax is
correct without having to look at the declaration.

Finally, sizeof(char) is always 1.
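
The recommendations above might be combined as in the following sketch. The helper name `make_filled` and its parameters are assumptions made for this example, not something from the thread.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n ints and set each one to `value`.
 * Returns NULL if the allocation fails. */
int *make_filled(size_t n, int value)
{
    /* No cast, and the element size is taken from the object x points to,
     * so changing x's type does not require touching this line. */
    int *x = malloc(n * sizeof *x);
    if (x == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        x[i] = value;
    return x;
}
```

The caller owns the returned block and releases it with free(). Because <stdlib.h> is included, the compiler sees the real prototype of malloc() and the conversion from void * needs no cast.
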
q = (p + 100) - 99; /* legal? */

Constraint violation that requires a diagnostic. See C99
6.5.16.1 "Simple assignment". Also, the pointer arithmetic
yields undefined behavior, because you're going beyond
one-past-the-end in an array.
free(q - 1); /* legal? */

Also a constraint violation. See C99 6.5.2.2 "Function calls"
para 2.
....
return 0;
}

Will this program always produce UB, always work, or is it compiler
dependent?

It won't compile without diagnostics. It also produces undefined
behavior.
 
Dave Vandervies

Say I have the following:

int main(void) {
char* p, q;
p = (char*) malloc(sizeof(char)*10);

Don't Do That.
This line is broken, since you forgot to #include <stdlib.h>; the compiler
incorrectly assumes (as required by the language definition) that malloc
returns int, and your cast prevents it from complaining about attempting
an invalid conversion (from int to pointer).
Preferred form:

p = malloc(10 * sizeof *p);

Since sizeof(char) is required to be 1, in this case you can even do:

p = malloc(10);

q = (p + 100) - 99; /* legal? */

No, but unlikely to cause problems on systems with a flat memory space
and general-purpose registers used for both pointer and integer operations
(that is, pretty much any system you're ever likely to use).
free(q - 1); /* legal? */

If q is a valid pointer to 1 past the pointer you got from malloc (which,
as noted above, is the only result you're likely to see from the line
above), this is legal and will do exactly what you appear to expect.

Badly formed code.
return 0;
}

Will this program always produce UB, always work, or is it compiler
dependent?

Always produce UB, and almost always (but compiler and, more likely,
hardware dependent) do the "exactly what you expect" that's the worst
possible kind of UB (except perhaps the "exactly what you expect, until
somebody important is watching" kind).

A system that checks every pointer value generated (such systems are
well within the bounds of the requirements on implementations, though
I'm not sure if any actually exist) can trap after evaluating `(p+100)'
(the left operand of the '-' operator in the line of code you're asking
about), since this generates a pointer that's 90 bytes past the end
of the chunk of memory allocated by malloc. Most systems only check
pointers (if at all) when you dereference them and not when you create
them, and since you never dereference this particular invalid pointer,
this check won't catch it.


dave
 
Keith Thompson

E. Robert Tisdale said:
Method said:
Say I have the following:

#include <stdlib.h>
int main(int argc, char* argv[]) {
char* p = (char*)malloc(sizeof(char)*10);
char* q = (p + 100) - 99; // illegal!
free(q - 1); // illegal!
// ....
return 0;
}
Will this program always produce UB?

This is an improper question.
Undefined Behavior (UB) is undefined.
There is no specific behavior to "produce".
Always work?

It works everywhere.
Or is it compiler dependent?

There are no ANSI/ISO C99 compliant compilers
that will not accept this code
and generate the expected output.

Tisdale has lied to us yet again. The code quoted above is not what
Method Man wrote. It's obvious that Tisdale isn't going to respond to
complaints, so I'll just post this as a warning to others.

The actual code was:

] int main(void) {
] char* p, q;
] p = (char*) malloc(sizeof(char)*10);
] q = (p + 100) - 99; /* legal? */
] free(q - 1); /* legal? */
] ....
] return 0;
] }

Method Man's code had a serious error: "char *p, q;" declares p as a
pointer to char, and q as a char. Tisdale, for some unfathomable
reason, decided to quietly pretend the error didn't exist rather than
tell Method Man about it.

(Note to Mabden: Based on your past behavior I expect you'll jump in
and flame me for calling Tisdale on his lie. I know your opinion on
the matter and I'm really not interested in hearing about it again.)

Assuming the declaration is corrected to

char *p, *q;

the evaluation of p + 100 invokes undefined behavior, because it
yields a value outside the bounds of the memory allocated by malloc().
Once undefined behavior is invoked, all bets are off.

If you change the statement
q = (p + 100) - 99;
to
q = (p + 10) - 9;
there's no problem; p+10 points just past the last element of the
allocated memory (which is ok as long as you don't dereference it),
and q then points to p[1]. q - 1 is then equal to p, and passing that
value to free() is valid.
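
The corrected arithmetic can be sketched as a small function. The name `one_past_end_round_trip` and its return convention are assumptions made for this example.

```c
#include <stdlib.h>

/* Returns 1 if q - 1 compares equal to p, 0 if it does not,
 * and -1 if malloc failed. */
int one_past_end_round_trip(void)
{
    char *p = malloc(10);
    if (p == NULL)
        return -1;
    char *end = p + 10;      /* one past the last element: valid to form */
    char *q = end - 9;       /* same as &p[1] */
    int ok = (q - 1 == p);
    free(q - 1);             /* the same value malloc returned */
    return ok;
}
```

Every intermediate pointer here stays within the allocation or exactly one past its end, which is the range in which pointer arithmetic is defined.
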

Will it "work"? Quite possibly. The possible consequences of
undefined behavior always include behaving just as you expect
(assuming you have any expectation). It may or may not be the case
that the code "works" in all existing implementations, but a
bounds-checking implementation with fat pointers could easily trap.
The only sensible thing to do is avoid the undefined behavior in the
first place.
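
To make the "fat pointer" idea concrete, here is a rough simulation in portable C. It is only an illustration of what a bounds-checking implementation could do internally; `struct fat_ptr` and `fat_add` are hypothetical names, and no particular compiler is claimed to work this way.

```c
#include <stddef.h>

/* Hypothetical "fat pointer": a base address plus the bounds it may roam in. */
struct fat_ptr {
    char  *base;    /* start of the allocation */
    size_t len;     /* number of elements in the allocation */
    size_t off;     /* current offset, always kept within 0..len */
};

/* Moves fp by delta elements. Returns 0 on success, or -1 (standing in for
 * a trap) if the result would leave the range [base, base + len]. */
int fat_add(struct fat_ptr *fp, ptrdiff_t delta)
{
    if (delta >= 0) {
        if ((size_t)delta > fp->len - fp->off)
            return -1;
        fp->off += (size_t)delta;
    } else {
        /* Unsigned negation yields the magnitude of delta. */
        size_t mag = (size_t)0 - (size_t)delta;
        if (mag > fp->off)
            return -1;
        fp->off -= mag;
    }
    return 0;
}
```

With a 10-element allocation, a step of +100 is rejected at once, before any subtraction can bring the offset back into range, while +10 followed by -9 succeeds and leaves the offset at 1.
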
 
Joona I Palaste

Chris Dollin said:
E. Robert Tisdale said:
Method said:
Say I have the following:

#include <stdlib.h>
int main(int argc, char* argv[]) {
char* p = (char*)malloc(sizeof(char)*10);
char* q = (p + 100) - 99; // illegal!
Excuse me, Sir, but you are mis-quoting the Man.
Don't do that.

Telling Tisdale not to mis-quote people is like telling P.J.Plauger not
to advertise his compiler, Dan Pop not to tell people to engage their
brains, or me not to insult people. I.e. like talking to a brick wall.
 
Dave Vandervies

In <[email protected]> Joona I Palaste


Huh?!?

Since, as far as I know, PJP doesn't have a compiler to advertise,
telling him not to advertise it wouldn't do much good, would it?

(Though I think Joona really meant to say Jacob Navia here.)


dave
 
Joona I Palaste

Dave Vandervies said:
Since, as far as I know, PJP doesn't have a compiler to advertise,
telling him not to advertise it wouldn't do much good, would it?
(Though I think Joona really meant to say Jacob Navia here.)

Yes, I meant Jacob Navia. Sorry.
 
Malcolm

Method Man said:
int main(void) {
char* p, q;
p = (char*) malloc(sizeof(char)*10);
q = (p + 100) - 99; /* legal? */

Technically not, since p + 100 could load an illegal address into an address
register and trigger a trap, or something equally nasty.

free(q - 1); /* legal? */

Covered by the first question. On most systems it will of course free the
pointer allocated by malloc(), but a perverse implementation or one with
funny constraints could crash you whilst keeping within the standard.

return 0;
}

Will this program always produce UB, always work, or is it compiler
dependent?

It is always UB. However, on most systems the UB will be "correct" behaviour.
 
Method Man

Method Man said:
Say I have the following:

int main(void) {
char* p, q;
p = (char*) malloc(sizeof(char)*10);
q = (p + 100) - 99; /* legal? */
free(q - 1); /* legal? */
....
return 0;
}

Will this program always produce UB, always work, or is it compiler
dependent?

Thanks for the answers. Apologies for the missing stdlib.h header and
misdeclared char* pointer. I was a bit over-anxious. ;-)

I thought that the C standard might have a rule for performing the constant
arithmetic first (for efficiency reasons) so that 'q = (p + 100) - 99' would
always evaluate to 'q = p + 1'. I've now learned that this is not the case and
that UB should be expected (regardless of whether the result looks right or wrong).
 
Dave Vandervies

I thought that the C standard might have a rule for performing the constant
arithmetic first (for efficiency reasons) so that 'q = (p + 100) - 99' would
always evaluate to 'q = p + 1'. I've now learned that this is not the case and
that UB should be expected (regardless of whether the result looks right or wrong).

Note that there's nothing stopping a compiler from recognizing that doing
the constant arithmetic first is correctness-preserving (that is, won't
make correct code incorrect) and doing it, preventing this (incorrect)
code from causing a bad pointer to be generated as a side effect.

Undefined behavior is like that; there are no requirements on it (except
as defined by something other than the C standard), so doing what you
expect (and even transforming the code into something that does what
you expect without invoking UB) is allowed. Relying on it is just Not
A Good Idea.
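
Rather than hoping the compiler folds the constants, the programmer can do the integer arithmetic first. A minimal sketch, with `advance_net` as a hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical helper: apply `up` then `down` as one net offset, so no
 * out-of-range intermediate pointer is ever formed. The caller must still
 * ensure the final result stays within the array or one past its end. */
char *advance_net(char *p, ptrdiff_t up, ptrdiff_t down)
{
    ptrdiff_t net = up - down;   /* plain integer arithmetic */
    return p + net;              /* a single pointer addition */
}
```

For the code in question this amounts to writing `q = p + (100 - 99);`, which is well defined for a 10-byte allocation, instead of `q = (p + 100) - 99;`, which is not.
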


dave
 
pete

E. Robert Tisdale said:
Method said:
Say I have the following:

#include <stdlib.h>
int main(int argc, char* argv[]) {
char* p = (char*)malloc(sizeof(char)*10);
char* q = (p + 100) - 99; // illegal!
free(q - 1); // illegal!
// ....
return 0;
}
Will this program always produce UB?

This is an improper question.
Undefined Behavior (UB) is undefined.
There is no specific behavior to "produce".

The answer to the question is "yes"

3.18
[#1] undefined behavior
behavior, upon use of a nonportable or erroneous program
construct, of erroneous data, or of indeterminately valued
objects, for which this International Standard imposes no
requirements
[#2] NOTE Possible undefined behavior ranges from ignoring
the situation completely with unpredictable results, to
behaving during translation or program execution in a
documented manner characteristic of the environment (with or
without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of
a diagnostic message).
[#3] EXAMPLE An example of undefined behavior is the
behavior on integer overflow.
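
As with out-of-range pointer arithmetic, the usual remedy for the integer overflow example is to test before the operation rather than after. A sketch, assuming a hypothetical helper `checked_add`:

```c
#include <limits.h>

/* Stores a + b in *out and returns 1, or returns 0 (leaving *out
 * untouched) if the sum would not fit in an int. */
int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;    /* cannot overflow once the checks above have passed */
    return 1;
}
```

The comparisons are arranged so that none of them can overflow themselves, which is why the test is written against `INT_MAX - b` and `INT_MIN - b` instead of against `a + b`.
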
 
Tim Rentsch

Ben Pfaff said:
I don't recommend casting the return value of malloc():

* The cast is not required in ANSI C.

How about the case where the code is intended for
both ANSI and pre-ANSI compilers?
* Casting its return value can mask a failure to #include
<stdlib.h>, which leads to undefined behavior.

* If you cast to the wrong type by accident, odd failures can
result.

Just curious - if stdlib.h has been #include'd, can there
still be odd failures when a malloc() return value has
been cast? If so, what are some examples?
 
Ben Pfaff

Tim Rentsch said:
How about the case where the code is intended for
both ANSI and pre-ANSI compilers?

If you're still using a pre-ANSI compiler, I pity you. You're
working with technology that's 15 years old. Feel free to use
whatever workarounds are needed.

Most posters to this newsgroup have no such need. We tend to
assume that code is in ANSI C unless otherwise specified.
Just curious - if stdlib.h has been #include'd, can there
still be odd failures when a malloc() return value has
been cast? If so, what are some examples?

I could envision an implementation that discards bits on
conversion to a pointer type with a bigger-than-byte required
alignment. In general, converting from A* to B* via C* is not
guaranteed to work.
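
The failing case cannot be demonstrated portably, but the conversion that C does guarantee can: an object pointer converted to void * and back compares equal to the original. The sketch below shows only that guaranteed round trip; the function name is an assumption made for this example.

```c
#include <stdlib.h>

/* Returns 1 if an int * survives a round trip through void *,
 * 0 if it does not, and -1 if malloc failed. */
int void_round_trip(void)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    void *v = p;        /* implicit conversion, no cast needed */
    int *back = v;      /* converting back yields a pointer equal to p */
    int same = (back == p);
    free(p);
    return same;
}
```

Assigning the result of malloc() directly to the destination pointer keeps the conversion to this single guaranteed step, whereas a cast to some other pointer type adds an intermediate conversion that carries no such guarantee.
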
 
Keith Thompson

Tim Rentsch said:
How about the case where the code is intended for
both ANSI and pre-ANSI compilers?

Then you've got more problems than deciding whether to cast the result
of malloc(). You can't use prototypes (except perhaps conditionally),
you can't assume that malloc is declared in <stdlib.h> rather than in,
say, <malloc.h>, etc. etc.

Fortunately, the need to write pre-ANSI-compatible C has pretty much
vanished. (I think the latest gcc even assumes an ANSI-compliant
bootstrap compiler.)
 
Tim Rentsch

Ben Pfaff said:
If you're still using a pre-ANSI compiler, I pity you.

I don't normally use such compilers myself. Certain software
that I work on forces me to consider such issues. But I'll
gladly accept the pity. :)

Most posters to this newsgroup have no such need. We tend to
assume that code is in ANSI C unless otherwise specified.

Right; that's why I posed the question as a question and
specifically mentioned pre-ANSI compilers. Other things
being equal, it seems like more widely applicable is better.
So in some sense the question is, how unequal are the two
things? That may be weighted by the relative likelihood
of the different environments if one wishes.

I could envision an implementation that discards bits on
conversion to a pointer type with a bigger-than-byte required
alignment. In general, converting from A* to B* via C* is not
guaranteed to work.

You're right; that could cause a problem. Does that case
ever come up if the types on the two sides have been checked?
It seems to me that this could cause a problem only if some
conversion has been invoked unknowingly, and I don't see a way
for that to happen in the presence of good type checking.
 
Tim Rentsch

Keith Thompson said:
Then you've got more problems than deciding whether to cast the result
of malloc(). You can't use prototypes (except perhaps conditionally),
you can't assume that malloc is declared in <stdlib.h> rather than in,
say, <malloc.h>, etc. etc.

Fortunately, the need to write pre-ANSI-compatible C has pretty much
vanished. (I think the latest gcc even assumes an ANSI-compliant
bootstrap compiler.)

Thank you for pointing out the obvious and failing to respond
to the question.
 
Herbert Rosenau

How about the case where the code is intended for
both ANSI and pre-ANSI compilers?

See the answer from Keith.
Just curious - if stdlib.h has been #include'd, can there
still be odd failures when a malloc() return value has
been cast? If so, what are some examples?

When the prototype is known to the compiler: No.

When the prototype is NOT known, you are always in undefined behavior
land, not only for malloc() but for any function returning a
pointer. The default behavior is to assume that a function returns
int. Some implementations use a different mechanism to return pointers
than to return other values. Leaving out the prototype and casting the
(int) result to a pointer will cast something, but not the (complete)
pointer that was returned, ending in undefined behavior. God willing,
your program crashes immediately after that; if not, your program can
do anything, starting with formatting the whole disk.

Casting is the most dangerous thing you can ever do. Never, never cast
something to get rid of a compiler warning. Casting is mostly a way to
hide the bug, not to resolve it. Whenever the compiler warns you, you
should check and recheck what the warning really means. The compiler
is stupid enough to tell you the formal bug but not the real one, so
check and recheck to find what the real bug is. In the case of
p = malloc() it is always that you have failed to present the compiler
with the prototype of malloc, even when the compiler whines about
something else.

Casting only tells the compiler: be quiet, because I know what I am
doing. But here you know nothing; you are only lying.

Don't cast! Don't cast at all. Don't cast, unless you know exactly
why you need to cast. Casting is almost always unnecessary, but there
are exceptions, and they are the only reason casting is allowed at
all. Casting only to keep the compiler quiet is an error, always!
 
