Index out of bounds question

Discussion in 'C Programming' started by Method Man, Oct 14, 2004.

  1. Method Man

    Method Man Guest

    Say I have the following:

    int main(void) {
    char* p, q;
    p = (char*) malloc(sizeof(char)*10);
    q = (p + 100) - 99; /* legal? */
    free(q - 1); /* legal? */
    ....
    return 0;
    }

    Will this program always produce UB, always work, or is it compiler
    dependent?
    Method Man, Oct 14, 2004
    #1

  2. Method Man wrote:

    > Say I have the following:


    #include <stdlib.h>
    >
    > int main(int argc, char* argv[]) {
    > char* p = (char*)malloc(sizeof(char)*10);
    > char* q = (p + 100) - 99; // illegal!
    > free(q - 1); // illegal!
    > // ....
    > return 0;
    > }


    > Will this program always produce UB?


    This is an improper question.
    Undefined Behavior (UB) is undefined.
    There is no specific behavior to "produce".

    > Always work?


    It works everywhere.

    > Or is it compiler dependent?


    There are no ANSI/ISO C99 compliant compilers
    that will not accept this code
    and generate the expected output.
    E. Robert Tisdale, Oct 14, 2004
    #2

  3. Method Man

    Ben Pfaff Guest

    "Method Man" <> writes:

    > Say I have the following:
    >
    > int main(void) {
    > char* p, q;


    This is deceptive syntax. It *looks* like it's meant to declare
    two pointers, but it *actually* declares a pointer and an
    integer.
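
    The difference between the two declarations can be checked with
    sizeof. The sketch below is only an illustration; the function names
    are made up for this example and do not come from the original post.

```c
#include <stddef.h>

/* Same shape as the declaration in the question: only p is a pointer.
   q is a plain char, so its size is 1 by definition. */
size_t size_of_second_declarator(void)
{
    char* p = NULL, q = 0;
    (void)p;
    return sizeof q;
}

/* Each declarator carries its own '*', so both p and q are pointers.
   The size of a char pointer is implementation-defined. */
size_t size_of_second_pointer(void)
{
    char *p = NULL, *q = NULL;
    (void)p;
    return sizeof q;
}
```

    The first function returns 1 on every implementation; the second
    returns sizeof(char *), whatever that happens to be on the platform.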

    > p = (char*) malloc(sizeof(char)*10);


    I don't recommend casting the return value of malloc():

    * The cast is not required in ANSI C.

    * Casting its return value can mask a failure to #include
    <stdlib.h>, which leads to undefined behavior.

    * If you cast to the wrong type by accident, odd failures can
    result.

    Some others do disagree, such as P.J. Plauger (see article
    <9sFIb.9066$>).

    When calling malloc(), I recommend using the sizeof operator on
    the object you are allocating, not on the type. For instance,
    *don't* write this:

    int *x = malloc (128 * sizeof (int)); /* Don't do this! */

    Instead, write it this way:

    int *x = malloc (128 * sizeof *x);

    There are a few reasons to do it this way:

    * If you ever change the type that `x' points to, it's not
    necessary to change the malloc() call as well.

    This is more of a problem in a large program, but it's still
    convenient in a small one.

    * Taking the size of an object makes writing the statement
    less error-prone. You can verify that the sizeof syntax is
    correct without having to look at the declaration.

    Finally, sizeof(char) is always 1.
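
    Put together, the recommendations above might look like the sketch
    below. The helper name make_int_buffer and the overflow check are
    additions for this illustration, not part of the original advice.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates and zeroes room for n ints using the object-based sizeof
   idiom. Returns NULL on failure; the caller must free() the result. */
int *make_int_buffer(size_t n)
{
    int *x;

    /* Reject requests whose byte count would wrap around. */
    if (n > SIZE_MAX / sizeof *x)
        return NULL;

    x = malloc(n * sizeof *x);   /* no cast; the size follows x's type */
    if (x == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        x[i] = 0;
    return x;
}
```

    If the element type later changes from int to something else, only
    the declaration of x and the return type need editing; the malloc()
    call stays as it is.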

    > q = (p + 100) - 99; /* legal? */


    Constraint violation that requires a diagnostic. See C99
    6.5.16.1 "Simple assignment". Also, the pointer arithmetic
    yields undefined behavior, because you're going beyond
    one-past-the-end in an array.

    > free(q - 1); /* legal? */


    Also a constraint violation. See C99 6.5.2.2 "Function calls"
    para 2.

    > ....
    > return 0;
    > }
    >
    > Will this program always produce UB, always work, or is it compiler
    > dependent?


    It won't compile without diagnostics. It also produces undefined
    behavior.
    --
    Ben Pfaff
    email:
    web: http://benpfaff.org
    Ben Pfaff, Oct 14, 2004
    #3
  4. In article <spnbd.1453$>, Method Man <> wrote:
    >Say I have the following:
    >
    >int main(void) {
    > char* p, q;
    > p = (char*) malloc(sizeof(char)*10);


    Don't Do That.
    This line is broken, since you forgot to #include <stdlib.h>; the compiler
    incorrectly assumes (as required by the language definition) that malloc
    returns int, and your cast prevents it from complaining about attempting
    an invalid conversion (from int to pointer).
    Preferred form:
    p = malloc(10 * sizeof *p);
    Since sizeof(char) is required to be 1, in this case you can even do:
    p = malloc(10);

    > q = (p + 100) - 99; /* legal? */


    No, but unlikely to cause problems on systems with a flat memory space
    and general-purpose registers used for both pointer and integer operations
    (that is, pretty much any system you're ever likely to use).

    > free(q - 1); /* legal? */


    If q is a valid pointer to 1 past the pointer you got from malloc (which,
    as noted above, is the only result you're likely to see from the line
    above), this is legal and will do exactly what you appear to expect.

    > ....


    Badly formed code.

    > return 0;
    >}



    >Will this program always produce UB, always work, or is it compiler
    >dependent?


    Always produce UB, and almost always (but compiler and, more likely,
    hardware dependent) do the "exactly what you expect" that's the worst
    possible kind of UB (except perhaps the "exactly what you expect, until
    somebody important is watching" kind).

    A system that checks every pointer value generated (such systems are
    well within the bounds of the requirements on implementations, though
    I'm not sure if any actually exist) can trap after evaluating `(p+100)'
    (the left operand of the '-' operator in the line of code you're asking
    about), since this generates a pointer that's 90 bytes past the end
    of the chunk of memory allocated by malloc. Most systems only check
    pointers (if at all) when you dereference them and not when you create
    them, and since you never dereference this particular invalid pointer,
    this check won't catch it.


    dave

    --
    Dave Vandervies
    Since you're a hobbyist, I'm sure you'll want to write the code more
    correctly than a mere professional might do.
    --Richard Heathfield in comp.lang.c
    Dave Vandervies, Oct 14, 2004
    #4
  5. "E. Robert Tisdale" <> writes:
    > Method Man wrote:
    >
    >> Say I have the following:

    >
    > #include <stdlib.h>
    >> int main(int argc, char* argv[]) {
    >> char* p = (char*)malloc(sizeof(char)*10);
    >> char* q = (p + 100) - 99; // illegal!
    >> free(q - 1); // illegal!
    >> // ....
    >> return 0;
    >> }

    >
    >> Will this program always produce UB?

    >
    > This is an improper question.
    > Undefined Behavior (UB) is undefined.
    > There is no specific behavior to "produce".
    >
    >> Always work?

    >
    > It works everywhere.
    >
    >> Or is it compiler dependent?

    >
    > There are no ANSI/ISO C99 compliant compilers
    > that will not accept this code
    > and generate the expected output.


    Tisdale has lied to us yet again. The code quoted above is not what
    Method Man wrote. It's obvious that Tisdale isn't going to respond to
    complaints, so I'll just post this as a warning to others.

    The actual code was:

    ] int main(void) {
    ] char* p, q;
    ] p = (char*) malloc(sizeof(char)*10);
    ] q = (p + 100) - 99; /* legal? */
    ] free(q - 1); /* legal? */
    ] ....
    ] return 0;
    ] }

    Method Man's code had a serious error: "char *p, q;" declares p as a
    pointer to char, and q as a char. Tisdale, for some unfathomable
    reason, decided to quietly pretend the error didn't exist rather than
    tell Method Man about it.

    (Note to Mabden: Based on your past behavior I expect you'll jump in
    and flame me for calling Tisdale on his lie. I know your opinion on
    the matter and I'm really not interested in hearing about it again.)

    Assuming the declaration is corrected to

    char *p, *q;

    the evaluation of p + 100 invokes undefined behavior, because it
    yields a value outside the bounds of the memory allocated by malloc().
    Once undefined behavior is invoked, all bets are off.

    If you change the statement
    q = (p + 100) - 99;
    to
    q = (p + 10) - 9;
    there's no problem; p+10 points just past the last element of the
    allocated memory (which is ok as long as you don't dereference it),
    and q then points to p[1]. q - 1 is then equal to p, and passing that
    value to free() is valid.
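
    As a rough sketch of the corrected version described above, with
    <stdlib.h> included and both variables declared as pointers (the
    function name and the return convention are assumptions made for
    this example):

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns 1 if the in-bounds arithmetic lands where expected,
   0 if it does not, and -1 if malloc() fails. */
int in_bounds_round_trip(void)
{
    char *p, *q;
    int ok;

    p = malloc(10);
    if (p == NULL)
        return -1;

    q = (p + 10) - 9;   /* p + 10 is one past the end: valid to form */
    ok = (q == &p[1]) && (q - 1 == p);

    free(q - 1);        /* same value that malloc() returned */
    return ok;
}
```

    Every intermediate pointer here stays between p and p + 10, so no
    step depends on how the implementation treats out-of-range pointers.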

    Will it "work"? Quite possibly. The possible consequences of
    undefined behavior always include behaving just as you expect
    (assuming you have any expectation). It may or may not be the case
    that the code "works" in all existing implementations, but a
    bounds-checking implementation with fat pointers could easily trap.
    The only sensible thing to do is avoid the undefined behavior in the
    first place.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Oct 14, 2004
    #5
  6. Method Man

    Chris Dollin Guest

    E. Robert Tisdale wrote:

    > Method Man wrote:
    >
    >> Say I have the following:

    >
    > #include <stdlib.h>
    >>
    >> int main(int argc, char* argv[]) {
    >> char* p = (char*)malloc(sizeof(char)*10);
    >> char* q = (p + 100) - 99; // illegal!


    Excuse me, Sir, but you are mis-quoting the Man.

    Don't do that.

    --
    Chris "electric hedgehog" Dollin
    Chris Dollin, Oct 14, 2004
    #6
  7. Chris Dollin <> scribbled the following:
    > E. Robert Tisdale wrote:
    >> Method Man wrote:
    >>> Say I have the following:

    >>
    >> #include <stdlib.h>
    >>>
    >>> int main(int argc, char* argv[]) {
    >>> char* p = (char*)malloc(sizeof(char)*10);
    >>> char* q = (p + 100) - 99; // illegal!


    > Excuse me, Sir, but you are mis-quoting the Man.


    > Don't do that.


    Telling Tisdale not to mis-quote people is like telling P.J.Plauger not
    to advertise his compiler, Dan Pop not to tell people to engage their
    brains, or me not to insult people. I.e. like talking to a brick wall.

    --
    /-- Joona Palaste () ------------- Finland --------\
    \-------------------------------------------------------- rules! --------/
    "This is a personnel commuter."
    - Train driver in Scientific American
    Joona I Palaste, Oct 14, 2004
    #7
  8. Method Man

    Dan Pop Guest

    In <ckm0k0$fma$> Joona I Palaste <> writes:

    >Telling Tisdale not to mis-quote people is like telling P.J.Plauger not
    >to advertise his compiler,


    Huh?!?

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
    Currently looking for a job in the European Union
    Dan Pop, Oct 14, 2004
    #8
  9. In article <ckmar4$3ga$>, Dan Pop <> wrote:
    >In <ckm0k0$fma$> Joona I Palaste
    ><> writes:
    >
    >>Telling Tisdale not to mis-quote people is like telling P.J.Plauger not
    >>to advertise his compiler,

    >
    >Huh?!?


    Since, as far as I know, PJP doesn't have a compiler to advertise,
    telling him not to advertise it wouldn't do much good, would it?

    (Though I think Joona really meant to say Jacob Navia here.)


    dave

    --
    Dave Vandervies
    I should also have said that it's perfectly possible to do this in
    multiple dimensions. Just hurts a bit to think about...
    --Peter Boyle in comp.arch
    Dave Vandervies, Oct 14, 2004
    #9
  10. Dave Vandervies <> scribbled the following:
    > In article <ckmar4$3ga$>, Dan Pop <> wrote:
    >>In <ckm0k0$fma$> Joona I Palaste
    >><> writes:
    >>
    >>>Telling Tisdale not to mis-quote people is like telling P.J.Plauger not
    >>>to advertise his compiler,

    >>
    >>Huh?!?


    > Since, as far as I know, PJP doesn't have a compiler to advertise,
    > telling him not to advertise it wouldn't do much good, would it?


    > (Though I think Joona really meant to say Jacob Navia here.)


    Yes, I meant Jacob Navia. Sorry.

    --
    /-- Joona Palaste () ------------- Finland --------\
    \-------------------------------------------------------- rules! --------/
    "C++ looks like line noise."
    - Fred L. Baube III
    Joona I Palaste, Oct 14, 2004
    #10
  11. Method Man

    Malcolm Guest

    "Method Man" <> wrote
    >
    > int main(void) {
    > char* p, q;
    > p = (char*) malloc(sizeof(char)*10);
    > q = (p + 100) - 99; /* legal? */
    >

    Technically not, since p + 100 could load an illegal address into an address
    register and trigger a trap, or something equally nasty.
    >
    > free(q - 1); /* legal? */
    >

    Covered by the first question. On most systems it will of course free the
    pointer allocated by malloc(), but a perverse implementation or one with
    funny constraints could crash you whilst keeping within the standard.
    >
    > return 0;
    > }
    >
    > Will this program always produce UB, always work, or is it compiler
    > dependent?
    >

    It is always UB. However on most systems the UB will be "correct" behaviour.
    Malcolm, Oct 14, 2004
    #11
  12. Method Man

    Method Man Guest

    "Method Man" <> wrote in message
    news:spnbd.1453$...
    > Say I have the following:
    >
    > int main(void) {
    > char* p, q;
    > p = (char*) malloc(sizeof(char)*10);
    > q = (p + 100) - 99; /* legal? */
    > free(q - 1); /* legal? */
    > ....
    > return 0;
    > }
    >
    > Will this program always produce UB, always work, or is it compiler
    > dependent?
    >
    >


    Thanks for the answers. Apologies for the missing stdlib.h header and
    misdeclared char* pointer. I was a bit over-anxious. ;-)

    I thought that the C standard might have a rule for performing the constant
    arithmetic first (for efficiency reasons) so that 'q = (p + 100) - 99' would
    always evaluate to 'q = p + 1'. So I've learned, this is not the case and UB
    should be expected (regardless if it's right or wrong).
    Method Man, Oct 14, 2004
    #12
  13. In article <DXCbd.1608$>, Method Man <> wrote:

    >I thought that the C standard might have a rule for performing the constant
    >arithmetic first (for efficiency reasons) so that 'q = (p + 100) - 99' would
    >always evaluate to 'q = p + 1'. So I've learned, this is not the case and UB
    >should be expected (regardless if it's right or wrong).


    Note that there's nothing stopping a compiler from recognizing that doing
    the constant arithmetic first is correctness-preserving (that is, won't
    make correct code incorrect) and doing it, preventing this (incorrect)
    code from causing a bad pointer to be generated as a side effect.

    Undefined behavior is like that; there are no requirements on it (except
    as defined by something other than the C standard), so doing what you
    expect (and even transforming the code into something that does what
    you expect without invoking UB) is allowed. Relying on it is just Not
    A Good Idea.
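
    One way to avoid relying on the compiler is to do the constant
    arithmetic on integers yourself, so that only a single pointer
    addition is ever performed. A minimal sketch follows; advance_net is
    a hypothetical helper invented for this example.

```c
#include <stddef.h>

/* Combines the two offsets as plain integers before touching the
   pointer. The caller must still ensure that base + (forward - back)
   lies within the array or one past its end. */
char *advance_net(char *base, ptrdiff_t forward, ptrdiff_t back)
{
    ptrdiff_t net = forward - back;   /* e.g. 100 - 99 == 1 */
    return base + net;                /* the only pointer arithmetic */
}
```

    With a 10-byte buffer, advance_net(p, 100, 99) yields p + 1 without
    ever forming the out-of-range value p + 100.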


    dave

    --
    Dave Vandervies
    If writing an OS for the DeathStation 9000, which UBs should invoke nasal
    demons (or daemons), how many, which nostril (or nostrils) and what support
    code needs to be written for them? --Geoff Field in comp.lang.c
    Dave Vandervies, Oct 14, 2004
    #13
  14. Method Man

    pete Guest

    E. Robert Tisdale wrote:
    >
    > Method Man wrote:
    >
    > > Say I have the following:

    >
    > #include <stdlib.h>
    > >
    > > int main(int argc, char* argv[]) {
    > > char* p = (char*)malloc(sizeof(char)*10);
    > > char* q = (p + 100) - 99; // illegal!
    > > free(q - 1); // illegal!
    > > // ....
    > > return 0;
    > > }

    >
    > > Will this program always produce UB?

    >
    > This is an improper question.
    > Undefined Behavior (UB) is undefined.
    > There is no specific behavior to "produce".


    The answer to the question is "yes"

    3.18
    [#1] undefined behavior
    behavior, upon use of a nonportable or erroneous program
    construct, of erroneous data, or of indeterminately valued
    objects, for which this International Standard imposes no
    requirements
    [#2] NOTE Possible undefined behavior ranges from ignoring
    the situation completely with unpredictable results, to
    behaving during translation or program execution in a
    documented manner characteristic of the environment (with or
    without the issuance of a diagnostic message), to
    terminating a translation or execution (with the issuance of
    a diagnostic message).
    [#3] EXAMPLE An example of undefined behavior is the
    behavior on integer overflow.

    --
    pete
    pete, Oct 15, 2004
    #14
  15. Method Man

    Tim Rentsch Guest

    Ben Pfaff <> writes:

    > I don't recommend casting the return value of malloc():
    >
    > * The cast is not required in ANSI C.


    How about the case where the code is intended for
    both ANSI and pre-ANSI compilers?

    > * Casting its return value can mask a failure to #include
    > <stdlib.h>, which leads to undefined behavior.
    >
    > * If you cast to the wrong type by accident, odd failures can
    > result.


    Just curious - if stdlib.h has been #include'd, can there
    still be odd failures when a malloc() return value has
    been cast? If so then what are some examples?
    Tim Rentsch, Oct 21, 2004
    #15
  16. Method Man

    Ben Pfaff Guest

    Tim Rentsch <> writes:

    > Ben Pfaff <> writes:
    >
    >> I don't recommend casting the return value of malloc():
    >>
    >> * The cast is not required in ANSI C.

    >
    > How about the case where the code is intended for
    > both ANSI and pre-ANSI compilers?


    If you're still using a pre-ANSI compiler, I pity you. You're
    working with technology that's 15 years old. Feel free to use
    whatever workarounds are needed.

    Most posters to this newsgroup have no such need. We tend to
    assume that code is in ANSI C unless otherwise specified.

    >> * Casting its return value can mask a failure to #include
    >> <stdlib.h>, which leads to undefined behavior.
    >>
    >> * If you cast to the wrong type by accident, odd failures can
    >> result.

    >
    > Just curious - if stdlib.h has been #include'd, can there
    > still be odd failures when a malloc() return value has
    > been cast? If so then what are some examples?


    I could envision an implementation that discards bits on
    conversion to a pointer type with a bigger-than-byte required
    alignment. In general, converting from A* to B* via C* is not
    guaranteed to work.
    --
    Ben Pfaff
    email:
    web: http://benpfaff.org
    Ben Pfaff, Oct 21, 2004
    #16
  17. Tim Rentsch <> writes:
    > Ben Pfaff <> writes:
    >
    >> I don't recommend casting the return value of malloc():
    >>
    >> * The cast is not required in ANSI C.

    >
    > How about the case where the code is intended for
    > both ANSI and pre-ANSI compilers?


    Then you've got more problems than deciding whether to cast the result
    of malloc(). You can't use prototypes (except perhaps conditionally),
    you can't assume that malloc is declared in <stdlib.h> rather than in,
    say, <malloc.h>, etc. etc.

    Fortunately, the need to write pre-ANSI-compatible C has pretty much
    vanished. (I think the latest gcc even assumes an ANSI-compliant
    bootstrap compiler.)

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Oct 21, 2004
    #17
  18. Method Man

    Tim Rentsch Guest

    Ben Pfaff <> writes:

    > Tim Rentsch <> writes:
    >
    > > Ben Pfaff <> writes:
    > >
    > >> I don't recommend casting the return value of malloc():
    > >>
    > >> * The cast is not required in ANSI C.

    > >
    > > How about the case where the code is intended for
    > > both ANSI and pre-ANSI compilers?

    >
    > If you're still using a pre-ANSI compiler, I pity you.


    I don't normally use such compilers myself. Certain software
    that I work on forces me to consider such issues. But I'll
    gladly accept the pity. :)


    > Most posters to this newsgroup have no such need. We tend to
    > assume that code is in ANSI C unless otherwise specified.


    Right; that's why I posed the question as a question and
    specifically mentioned pre-ANSI compilers. Other things
    being equal, it seems like more widely applicable is better.
    So in some sense the question is, how unequal are the two
    things? That may be weighted by the relative likelihood
    of the different environments if one wishes.


    > >> * Casting its return value can mask a failure to #include
    > >> <stdlib.h>, which leads to undefined behavior.
    > >>
    > >> * If you cast to the wrong type by accident, odd failures can
    > >> result.

    > >
    > > Just curious - if stdlib.h has been #include'd, can there
    > > still be odd failures when a malloc() return value has
    > > been cast? If so then what are some examples?

    >
    > I could envision an implementation that discards bits on
    > conversion to a pointer type with a bigger-than-byte required
    > alignment. In general, converting from A* to B* via C* is not
    > guaranteed to work.


    You're right; that could cause a problem. Does that case
    ever come up if the types on the two sides have been checked?
    It seems to me that this could cause a problem only if some
    conversion has been invoked unknowingly, and I don't see a way
    for that to happen in the presence of good type checking.
    Tim Rentsch, Oct 21, 2004
    #18
  19. Method Man

    Tim Rentsch Guest

    Keith Thompson <> writes:

    > Tim Rentsch <> writes:
    > > Ben Pfaff <> writes:
    > >
    > >> I don't recommend casting the return value of malloc():
    > >>
    > >> * The cast is not required in ANSI C.

    > >
    > > How about the case where the code is intended for
    > > both ANSI and pre-ANSI compilers?

    >
    > Then you've got more problems than deciding whether to cast the result
    > of malloc(). You can't use prototypes (except perhaps conditionally),
    > you can't assume that malloc is declared in <stdlib.h> rather than in,
    > say, <malloc.h>, etc. etc.
    >
    > Fortunately, the need to write pre-ANSI-compatible C has pretty much
    > vanished. (I think the latest gcc even assumes an ANSI-compliant
    > bootstrap compiler.)


    Thank you for pointing out the obvious and failing to respond
    to the question.
    Tim Rentsch, Oct 21, 2004
    #19
  20. On Thu, 21 Oct 2004 01:31:01 UTC, Tim Rentsch
    <> wrote:

    > Ben Pfaff <> writes:
    >
    > > I don't recommend casting the return value of malloc():
    > >
    > > * The cast is not required in ANSI C.


    > How about the case where the code is intended for
    > both ANSI and pre-ANSI compilers?


    See the answer from Keith.

    > > * Casting its return value can mask a failure to #include
    > > <stdlib.h>, which leads to undefined behavior.
    > >
    > > * If you cast to the wrong type by accident, odd failures can
    > > result.

    >
    > Just curious - if stdlib.h has been #include'd, can there
    > still be odd failures when a malloc() return value has
    > been cast? If so then what are some examples?


    When the prototype is known to the compiler: No.

    When the prototype is NOT known, you are always living in undefined
    behavior land - not only for malloc() but for any function returning a
    pointer. The default behavior is to assume that a function returns
    int. Some implementations use a different mechanism to return pointers
    than to return other values. Leaving out the prototype and casting the
    assumed int to a pointer will convert something, but not necessarily
    the (complete) pointer that was returned - ending in undefined
    behavior. If you are lucky, your program crashes immediately after
    that; if not, your program can do anything at all - starting with
    formatting the whole disk.

    Casting is the most dangerous thing you can do. Never, never cast
    something just to get rid of a compiler warning. Most of the time a
    cast is a way to hide the bug, not to resolve it. Whenever the
    compiler warns you, you should check and recheck what the warning
    really means. The compiler is stupid enough to report the formal bug
    rather than the real one, so check and recheck to find out what the
    real bug is. In the case of p = malloc() it is always that you have
    failed to give the compiler the prototype of malloc, even when the
    compiler whines about something else.

    A cast only tells the compiler: be quiet, because I know what I am
    doing - but here you know nothing; you are only lying.

    Don't cast! Don't cast at all - unless you know exactly why you need
    to cast. Casting is usually unnecessary, but there are exceptions,
    and those are the only reason casting is allowed at all. Casting
    only to keep the compiler quiet is an error - always!

    --
    Tschau/Bye
    Herbert

    Visit http://www.ecomstation.de the home of german eComStation
    Herbert Rosenau, Oct 21, 2004
    #20