which is the better way to declare dynamic single dimension array inside struct

Discussion in 'C Programming' started by Geetesh, Feb 27, 2004.

  1. Geetesh

    Geetesh Guest

    Recently I saw some code containing a structure definition similar to
    the one below:
    struct foo
    {
    int dummy1;
    int dummy2;
    int last[1];
    };
    In the application the struct is always allocated at runtime using
    malloc. The last member, "int last[1]", is not actually used as an
    array with a single element: when space for struct foo is allocated,
    extra memory is requested and last is used as an array with more than
    one element. My question is: what are the advantages of using the
    above definition instead of the one shown below?
    struct foo
    {
    int dummy1;
    int dummy2;
    int *last;
    };
    The only advantages I can think of are that the first declaration needs
    a single malloc call while the second needs two, and that with the
    first declaration all the allocated memory is contiguous, which may
    lead to less fragmentation and better cache utilization. Is accessing
    elements faster with the first definition than with the second? If
    yes, why?
    Thanks in advance.
     
    Geetesh, Feb 27, 2004
    #1

  2. Geetesh

    Jack Klein Guest

    On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    wrote in comp.lang.c:

    > Recently i saw a code in which there was a structer defination similar
    > as bellow:
    > struct foo
    > {
    > int dummy1;
    > int dummy2;
    > int last[1]
    > };


    This causes undefined behavior and is invalid code under all versions
    of the C language standard.

    > In application the above array is always allocated at runtime using
    > malloc.In this last member of the structer "int last[1]" is not
    > actually used as array with single element but when alloacting space
    > for struct foo extra memory is allocated and last is used as array
    > with more then one element. my question is what are the advantages of
    > using the above defination instead of the shown below.


    The advantages are that some programmers in any language are hot-shots
    who think they know everything and happen on a trick that might work
    with their particular compiler and think clever trickery proves that
    they are good programmers.

    > struct foo
    > {
    > int dummy1;
    > int dummy2;
    > int *last;
    > };
    > The only advantage i can think of is that we will have to call single
    > malloc in first declaration and two malloc in second declaration and
    > also that in first declaration all the memeory allocated will be
    > contigous which may lead to less framgmentation and better cache
    > utilization. My question is does using first defination for accessing
    > of elements faster when compared to second. If yes why?
    > Thanks in advance.


    You can still do a single malloc allocation:

    foo *fp = malloc(sizeof *foo + how_many_characters_i_want);

    /* error checking omitted */

    fp->last = (char *)fp + sizeof *fp;

    As to "faster", that doesn't apply when you are talking about illegal
    code that produces undefined behavior.

    Even when comparing two different legal methods of doing something,
    the C standard does not specify the relative performance of anything.
    The answer could be exactly opposite from one compiler to another.

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Feb 27, 2004
    #2

  4. Geetesh

    Nejat AYDIN Guest

    Re: which is the better way to declare dynamic single dimension array inside struct

    Jack Klein wrote:
    >

    [...]
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int *last;
    > > };

    [...]
    > You can still do a single malloc allocation:
    >
    > foo *fp = malloc(sizeof *foo + how_many_characters_i_want);

    ^^^ ^^^^
    struct foo *fp = malloc(sizeof *fp + how_many_characters_i_want);
     
    Nejat AYDIN, Feb 27, 2004
    #4
  5. Geetesh

    Richard Bos Guest

    Jack Klein <> wrote:

    > On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    > wrote in comp.lang.c:
    >
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int *last;
    > > };
    > > The only advantage i can think of is that we will have to call single
    > > malloc in first declaration and two malloc in second declaration

    >
    > You can still do a single malloc allocation:
    >
    > foo *fp = malloc(sizeof *foo + how_many_characters_i_want);


    Um... sizeof *type?

    > /* error checking omitted */
    >
    > fp->last = (char *)fp + sizeof *fp;


    Mind you, that only works with characters; the resulting space is not
    guaranteed to be properly aligned for other types. With a bit of trouble
    this should be solvable; for example, I think that

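    struct foo {
        float bar;
        long baz;
        mytype *ptr;
        mytype dummy; /* Really _is_ a dummy, for alignment only. */
    };
    /* Room for the struct plus the desired number of mytype objects. */
    struct foo *fooptr = malloc(sizeof *fooptr +
        desired_number_of_objects_of_type_mytype * sizeof *fooptr->ptr);
    /* error checking omitted */
    fooptr->ptr = (mytype *)((char *)fooptr + sizeof *fooptr
                             - sizeof *fooptr->ptr);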

    should work no matter what mytype is. Note that the dummy object should
    be of the base type of the pointer for this trick to work, and that it's
    still a dirty piece of code and I give no guarantees; I wouldn't use
    this myself. The code with two malloc()s is clearer and cleaner.
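
    For comparison, a minimal sketch of the two-malloc() version for the
    original struct foo. The helper names foo_create and foo_destroy are
    made up for illustration and are not from the thread.

```c
#include <stdlib.h>

struct foo {
    int dummy1;
    int dummy2;
    int *last;      /* points to a separately allocated array */
};

/* Hypothetical helper: allocate a struct foo plus an array of n ints.
   Returns NULL if either allocation fails. */
struct foo *foo_create(size_t n)
{
    struct foo *fp;

    /* Reject element counts whose byte size would overflow size_t. */
    if (n > (size_t)-1 / sizeof *fp->last)
        return NULL;

    fp = malloc(sizeof *fp);
    if (fp == NULL)
        return NULL;

    fp->last = malloc(n * sizeof *fp->last);
    if (fp->last == NULL && n != 0) {   /* malloc(0) may return NULL */
        free(fp);
        return NULL;
    }
    fp->dummy1 = 0;
    fp->dummy2 = 0;
    return fp;
}

/* Hypothetical helper: release both allocations; accepts NULL. */
void foo_destroy(struct foo *fp)
{
    if (fp != NULL) {
        free(fp->last);
        free(fp);
    }
}
```

    Every access to fp->last[i] stays inside an object of the right size,
    at the price of a second allocation and one extra pointer per struct.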

    Richard
     
    Richard Bos, Feb 27, 2004
    #5
  6. On Fri, 27 Feb 2004 05:45:07 UTC, (Geetesh)
    wrote:

    > Recently i saw a code in which there was a structer defination similar
    > as bellow:
    > struct foo
    > {
    > int dummy1;
    > int dummy2;
    > int last[1]
    > };
    > In application the above array is always allocated at runtime using
    > malloc.In this last member of the structer "int last[1]" is not
    > actually used as array with single element but when alloacting space
    > for struct foo extra memory is allocated and last is used as array
    > with more then one element. my question is what are the advantages of
    > using the above defination instead of the shown below.
    > struct foo
    > {
    > int dummy1;
    > int dummy2;
    > int *last;
    > };
    > The only advantage i can think of is that we will have to call single
    > malloc in first declaration and two malloc in second declaration and
    > also that in first declaration all the memeory allocated will be
    > contigous which may lead to less framgmentation and better cache
    > utilization. My question is does using first defination for accessing
    > of elements faster when compared to second. If yes why?
    > Thanks in advance.


    It saves memory: at least the amount of memory a pointer costs.
    It saves time, as two malloc() calls are not required every time a
    whole struct is filled.

    No, it is NOT undefined behavior as Jack Klein says. But it is
    implementation defined.

    Look at the APIs of your OS. The chance is high that there is at
    least one API that delivers or receives this kind of struct.


    --
    Tschau/Bye
    Herbert

    Visit http://www.ecomstation.de the home of german eComStation
     
    The Real OS/2 Guy, Feb 27, 2004
    #6
  7. On Fri, 27 Feb 2004 06:14:54 UTC, Jack Klein <>
    wrote:

    > On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    > wrote in comp.lang.c:
    >
    > > Recently i saw a code in which there was a structer defination similar
    > > as bellow:
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int last[1]
    > > };

    >
    > This causes undefined behavior and is invalid code under all versions
    > of the C language standard.


    Chapter and verse please, in both ANSI C 89 and ANSI C99.

    ANSI C 99 even allows <type> last[0] in this place, which makes it
    clearer that this is only a placeholder giving a name to the extra
    space.

    You may still call this implementation defined - but NOT undefined.

    --
    Tschau/Bye
    Herbert

    Visit http://www.ecomstation.de the home of german eComStation
     
    The Real OS/2 Guy, Feb 27, 2004
    #7
  8. Geetesh

    Richard Bos Guest

    "The Real OS/2 Guy" <> wrote:

    > On Fri, 27 Feb 2004 06:14:54 UTC, Jack Klein <>
    > wrote:
    >
    > > On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    > > wrote in comp.lang.c:
    > >
    > > > Recently i saw a code in which there was a structer defination similar
    > > > as bellow:
    > > > struct foo
    > > > {
    > > > int dummy1;
    > > > int dummy2;
    > > > int last[1]
    > > > };

    > >
    > > This causes undefined behavior and is invalid code under all versions
    > > of the C language standard.

    >
    > Chapter and verse please in both ANSI C 89 and ANSI C99.


    Note: it is not the declaration as such which invokes UB, but its use as
    a variable-sized struct. See the original post.

    In ISO (not ANSI; neither you nor I are in the United States - what have
    we to do with the US Standards Institute?) C99, 6.7.2.1, and note
    especially the indication of "undefined behaviour" in example #19. Since
    examples aren't normative, see also 6.5.6#8, and note that the result of
    adding a pointer and an integer which is larger than the size of that
    pointer is not defined, hence remains undefined. (Note also the
    equivalence of array subscription and pointer addition, 6.5.2.1).

    In C89, see 3.3.6, which defines pointer-integer addition essentially
    identically to C99. Since there's no incomplete final array type in C89
    structs, you won't find anything interesting in 3.5.2.1

    > ANSI C 99 allows even <type> last[0] on this place what makes more
    > clean that this is only a place holder to have a name for the extra
    > space.


    Not quite. It allows an incomplete array - that is, one _without_ a
    size, not with size 0 - as the last member of a structure. An array with
    size 0 is, AFAICT, simply not allowed; and using an array with size 1 as
    if it were an incomplete array is as undefined as in C89.
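
    A minimal sketch of the C99 form described above, assuming a C99
    compiler. The helper name foo_alloc is hypothetical; only the
    declaration with empty brackets comes from the standard.

```c
#include <stdlib.h>

struct foo {
    int dummy1;
    int dummy2;
    int last[];     /* C99 flexible array member: no size, must be last */
};

/* Hypothetical helper: one allocation holding the struct and n ints. */
struct foo *foo_alloc(size_t n)
{
    /* Refuse sizes that would overflow the size computation below. */
    if (n > ((size_t)-1 - sizeof(struct foo)) / sizeof(int))
        return NULL;

    struct foo *fp = malloc(sizeof *fp + n * sizeof fp->last[0]);
    if (fp != NULL) {
        fp->dummy1 = 0;
        fp->dummy2 = 0;
    }
    return fp;      /* release with a single free(fp) */
}
```

    With this declaration fp->last[0] through fp->last[n - 1] are ordinary
    in-bounds accesses, and sizeof(struct foo) does not count the array.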

    > You may still call this as implementation defined - but NOT undefined.


    It _is_ undefined. Sorry <g>.

    Richard
     
    Richard Bos, Feb 27, 2004
    #8
  9. Geetesh

    Richard Bos Guest

    "The Real OS/2 Guy" <> wrote:

    > On Fri, 27 Feb 2004 05:45:07 UTC, (Geetesh)
    > wrote:
    >
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int last[1]
    > > };
    > > In application the above array is always allocated at runtime using
    > > malloc.In this last member of the structer "int last[1]" is not
    > > actually used as array with single element but when alloacting space
    > > for struct foo extra memory is allocated and last is used as array
    > > with more then one element.


    > No, it is NOT undefined behavior as Jack Klein says. But it is
    > implementation defined.


    Yes, it is. Pointer addition beyond the end of the array is undefined.

    > Look at the APIs of your OS. The chance is high that there is at least
    > one or more APIs who deliver or receive such kind of structs.


    That some OSes choose to make this kind of undefined behaviour "work"
    does not mean that it suddenly is defined.

    Richard
     
    Richard Bos, Feb 27, 2004
    #9
  10. On Fri, 27 Feb 2004 11:11:47 UTC, (Richard
    Bos) wrote:

    > "The Real OS/2 Guy" <> wrote:
    >
    > > On Fri, 27 Feb 2004 05:45:07 UTC, (Geetesh)
    > > wrote:
    > >
    > > > struct foo
    > > > {
    > > > int dummy1;
    > > > int dummy2;
    > > > int last[1]
    > > > };
    > > > In application the above array is always allocated at runtime using
    > > > malloc.In this last member of the structer "int last[1]" is not
    > > > actually used as array with single element but when alloacting space
    > > > for struct foo extra memory is allocated and last is used as array
    > > > with more then one element.

    >
    > > No, it is NOT undefined behavior as Jack Klein says. But it is
    > > implementation defined.

    >
    > Yes, it is. Pointer addition beyond the end of the array is undefined.


    So you are saying that any action in an array allocated with malloc
    ends up in undefined behavior.

    > > Look at the APIs of your OS. The chance is high that there is at least
    > > one or more APIs who deliver or receive such kind of structs.

    >
    > That some OSes choose to make this kind of undefined behaviour "work"
    > does not mean that it suddenly is defined.


    You mean that

    int *p = malloc(4000),

    stat *p1 = p + sizeof(stat) * 100;
    stat *p2 = p1++;

    is undefined behavior? So please, never use malloc, as it always
    results in undefined behavior.

    If it were really undefined behavior, we could not have produced the
    30 million lines of code we produced over the last years to run on 5
    different OSes (Linux, AIX, HP-UX, Mac OS and OS/2), where the same
    source runs unmodified, only recompiled, on all those machines.

    We should warn our customers that their code inspections and regression
    checks failed 5 years ago and that all of their critical
    applications are failing every minute - even though production in
    their time-critical environments has run well ever since.

    You should warn every OS producer that their OS will always fail
    because it requires undefined behavior, as all of them have APIs based
    on that technique.

    I think you should inform yourself about what pointer arithmetic can
    really do for you - when you know what you are doing.

    Where is undefined behavior here?

    struct x {
    size_t cb;
    struct a *pa;
    int val;
    unsigned int flags;
    char *sa[1000]; /* we need 3 to 999 chars here */
    };

    struct y {
    size_t cb;
    struct a *pa;
    int val;
    unsigned int flags;
    char s[1]; /* we have to compile ANSI C 89! */
    };


    struct x *p1 = malloc(sizeof(struct x) * 1000); /* UB? */
    .....
    struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
    */
    .....
    strcpy(y->s, data); /* UB? on what? */

    Show one single ANSI C 89 compiler that will give undefined behavior on
    that. I can't find one.

    I can see no UB in the code fragments above. But I see that every byte
    address gets addressed well.

    Tell me what the difference is between UB and implementation
    defined. I see some there.

    Whenever you allocate memory in the size you need - not a single byte
    less - then you CAN'T get UB when you know how to handle pointer
    arithmetic. Even in struct y there is not a single byte that is UB,
    because everything is well aligned and well addressed.

    --
    Tschau/Bye
    Herbert

    Visit http://www.ecomstation.de the home of german eComStation
     
    The Real OS/2 Guy, Feb 27, 2004
    #10
  11. [snips]

    On Fri, 27 Feb 2004 16:42:50 +0000, The Real OS/2 Guy wrote:

    > struct y {
    > size_t cb;
    > struct a *pa;
    > int val;
    > unsigned int flags;
    > char s[1]; /* we have to compile ANSI C 89! */
    > };
    >
    >
    > struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
    > */


    This, AFAICT, is not UB; you can allocate whatever size you want.


    > strcpy(y->s, data); /* UB? on what? */


    This, however...

    I *think* - not sure - that C99 offers explicit support for this. In c89,
    however, the code is broken if "data" contains anything but an empty
    string, since s is defined to be one byte long. A compiler which supports
    bounds-checking, for example, can happily trap here, since anything other
    than an empty string in data will overrun the array bounds.


    > Show one single ANSI C 89 compiler who will give undefined behavior on
    > that. I can't find one.


    All of them. The compiler doesn't define UB, the standard does. That
    your compiler collection happens to allow this behaviour is irrelevant.

    > Tell me what is the difference between UB and and implementation
    > defined. I see there some.


    UB is anything the standard either explicitly defines to be UB, or, by
    failure to define in another category, leaves undefined. Notable examples
    are anything which violates a "shall" clause, such as "main shall return
    an int" - thus void main() is UB.

    Implementation-defined behaviour is things the particular implementation
    has some freedom to "play with" - shifting of signed values, IIRC, falls
    into this category. However, the implementation is required to document
    the behaviour; the behaviour is _defined_... but defined by the
    implementation, not by the standard.
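
    A small sketch of the shifting case mentioned above, under the
    assumption that the C99 wording is meant: right-shifting a negative
    signed value gives an implementation-defined result, while the same
    shift on an unsigned value is fully specified. The function names are
    made up for illustration.

```c
/* Fully defined by the standard: an unsigned right shift divides by 2. */
unsigned shift_unsigned(unsigned v)
{
    return v >> 1;
}

/* Defined by the standard for non-negative v. For negative v the
   resulting value is implementation-defined: the implementation must
   pick a behaviour and document it, but implementations may differ. */
int shift_signed(int v)
{
    return v >> 1;
}
```

    Calling shift_signed(-8) is therefore legal everywhere, but the value
    you get back has to be looked up in the compiler's documentation.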

    > Whenever you allocs memory in the size you needs - not a single byte
    > less - then you CAN'T get UB when you knows how to hanlde pointer
    > arithmetic.


    The problem isn't with malloc or with pointer arithmetic; the problem is
    that you're accessing something - s - which has a definite size - 1 byte -
    but not staying within the limits of the object's size.
     
    Kelsey Bjarnason, Feb 27, 2004
    #11
  12. [snips]

    On Fri, 27 Feb 2004 06:06:25 +0000, Jack Klein wrote:

    > On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    > wrote in comp.lang.c:
    >
    >> Recently i saw a code in which there was a structer defination similar
    >> as bellow:
    >> struct foo
    >> {
    >> int dummy1;
    >> int dummy2;
    >> int last[1]
    >> };

    >
    > This causes undefined behavior and is invalid code under all versions
    > of the C language standard.


    I thought C99 brought in support for the "struct hack"?
     
    Kelsey Bjarnason, Feb 27, 2004
    #12
  13. Geetesh

    Ben Pfaff Guest

    Re: which is the better way to declare dynamic single dimension array inside struct

    Kelsey Bjarnason <> writes:

    > [snips]
    >
    > On Fri, 27 Feb 2004 06:06:25 +0000, Jack Klein wrote:
    >
    >> On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    >> wrote in comp.lang.c:
    >>
    >>> Recently i saw a code in which there was a structer defination similar
    >>> as bellow:
    >>> struct foo
    >>> {
    >>> int dummy1;
    >>> int dummy2;
    >>> int last[1]
    >>> };

    >>
    >> This causes undefined behavior and is invalid code under all versions
    >> of the C language standard.

    >
    > I thought C99 brought in support for the "struct hack"?


    Yes, but the C99 version is expressed with empty brackets [], not
    with [1].
    --
    "...Almost makes you wonder why Heisenberg didn't include postinc/dec operators
    in the uncertainty principle. Which of course makes the above equivalent to
    Schrodinger's pointer..."
    --Anthony McDonald
     
    Ben Pfaff, Feb 27, 2004
    #13
  14. On Fri, 27 Feb 2004, Kelsey Bjarnason wrote:
    >
    > On Fri, 27 Feb 2004 16:42:50 +0000, The Real OS/2 Guy wrote:
    > >
    > > struct y {
    > > size_t cb;
    > > struct a *pa;
    > > int val;
    > > unsigned int flags;
    > > char s[1]; /* we have to compile ANSI C 89! */
    > > };
    > >
    > >
    > > struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
    > > */

    >
    > This, AFAICT, is not UB; you can allocate whatever size you want.
    >
    >
    > > strcpy(y->s, data); /* UB? on what? */


    UB on the fact that perhaps
    (sizeof(struct x)+strlen(data)) < (sizeof *y), for one thing. But
    I'll assume that you made a typo in the malloc line, and meant to write

    > > struct y *p2 = malloc(sizeof(struct y) + strlen(data));

    ^^^^^^^^

    (Incidentally, I'll evangelize again: The canonical c.l.c idiom
    for malloc calls would avoid this bug.)
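
    A minimal sketch of that idiom applied to the quoted struct y: the size
    is written as sizeof *p2, so it always follows the type of the pointer
    being assigned. The helper name y_alloc_for is hypothetical, and the
    sketch shows only the size expression; it deliberately never writes
    past s[0].

```c
#include <stdlib.h>
#include <string.h>

struct a;   /* only pointed to, so an incomplete type is enough here */

struct y {
    size_t cb;
    struct a *pa;
    int val;
    unsigned int flags;
    char s[1];
};

/* Hypothetical helper: sizeof *p2 cannot name the wrong struct by
   accident, unlike sizeof(struct x) in the quoted line. */
struct y *y_alloc_for(const char *data)
{
    size_t extra = strlen(data);
    struct y *p2 = malloc(sizeof *p2 + extra);

    if (p2 != NULL) {
        p2->cb = sizeof *p2 + extra;    /* record the allocated size */
        p2->pa = NULL;
        p2->val = 0;
        p2->flags = 0;
        p2->s[0] = '\0';                /* stays inside s[1] */
    }
    return p2;
}
```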

    > > strcpy(y->s, data);


    > This, however...
    >
    > I *think* - not sure - that C99 offers explicit support for this.


    Not really. C99 offers a *new syntax* for variable-sized arrays
    in the last part of a struct, but the old C90 "struct hack" is still
    not supported. (In fact, it's explicitly "un-supported," since it
    explicitly invokes undefined behavior.)
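
    A minimal sketch of the new syntax applied to the struct under
    discussion, assuming a C99 compiler. The names y99 and y99_from_string
    are made up for illustration; the members are copied from the quoted
    struct y.

```c
#include <stdlib.h>
#include <string.h>

struct a;   /* only pointed to, so an incomplete type is enough here */

struct y99 {
    size_t cb;
    struct a *pa;
    int val;
    unsigned int flags;
    char s[];       /* flexible array member instead of char s[1] */
};

/* Hypothetical helper: allocate the struct with room for a copy of
   data, including its terminating null character. */
struct y99 *y99_from_string(const char *data)
{
    size_t len = strlen(data);
    struct y99 *p = malloc(sizeof *p + len + 1);

    if (p == NULL)
        return NULL;
    p->cb = sizeof *p + len + 1;
    p->pa = NULL;
    p->val = 0;
    p->flags = 0;
    memcpy(p->s, data, len + 1);    /* in bounds: len + 1 bytes were added */
    return p;
}
```

    Here the copy into p->s is defined for any string length, because s no
    longer has a declared size of one byte.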

    > In c89,
    > however, the code is broken if "data" contains anything but an empty
    > string, since s is defined to be one byte long. A compiler which supports
    > bounds-checking, for example, can happily trap here, since anything other
    > than an empty string in data will overrun the array bounds.

    <large snip>
    > The problem isn't with malloc or with pointer arithmetic; the problem is
    > that you're accessing something - s - which has a definite size - 1 byte -
    > but not staying within the limits of the object's size.


    Exactly.

    -Arthur
     
    Arthur J. O'Dwyer, Feb 27, 2004
    #14
  15. On Fri, 27 Feb 2004 18:59:09 UTC, Kelsey Bjarnason
    <> wrote:

    > [snips]
    >
    > On Fri, 27 Feb 2004 16:42:50 +0000, The Real OS/2 Guy wrote:
    >
    > > struct y {
    > > size_t cb;
    > > struct a *pa;
    > > int val;
    > > unsigned int flags;
    > > char s[1]; /* we have to compile ANSI C 89! */
    > > };
    > >
    > >
    > > struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what?
    > > */

    >
    > This, AFAICT, is not UB; you can allocate whatever size you want.
    >
    >
    > > strcpy(y->s, data); /* UB? on what? */

    >
    > This, however...
    >
    > I *think* - not sure - that C99 offers explicit support for this. In c89,
    > however, the code is broken if "data" contains anything but an empty
    > string, since s is defined to be one byte long. A compiler which supports
    > bounds-checking, for example, can happily trap here, since anything other
    > than an empty string in data will overrun the array bounds.


    There is no bounds checking in the standard.

    >
    > > Show one single ANSI C 89 compiler who will give undefined behavior on
    > > that. I can't find one.

    >
    > All of them. The compiler doesn't define UB, the standard does. That
    > your compiler collection happens to allow this behaviour is irrelevant.


    Where? I can't find it, so chapter and verse please.

    > > Tell me what is the difference between UB and and implementation
    > > defined. I see there some.

    >
    > UB is anything the standard either explicitly defines to be UB, or, by
    > failure to define in another category, leaves undefined. Notable examples
    > are anything which violates a "shall" clause, such as "main shall return
    > an int" - thus void main() is UB.


    Ah, as you say, the standard explicitly defines UB - but where is the
    UB that belongs to this?

    > Implementation-defined behaviour is things the particular implementation
    > has some freedom to "play with" - shifting of signed values, IIRC, falls
    > into this category. However, the implementation is required to document
    > the behaviour; the behaviour is _defined_... but defined by the
    > implementation, not by the standard.
    >
    > > Whenever you allocs memory in the size you needs - not a single byte
    > > less - then you CAN'T get UB when you knows how to hanlde pointer
    > > arithmetic.

    >
    > The problem isn't with malloc or with pointer arithmetic; the problem is
    > that you're accessing something - s - which has a definite size - 1 byte -
    > but not staying within the limits of the object's size.
    >

    Hm, the object size is defined by the size with which it is allocated
    through malloc(). Sure, you fail miserably when you declare such a
    struct statically - but you can't fail when you use dynamic allocation
    correctly. That is the trick by which you avoid UB: because YOU define
    the real size through malloc, you always get enough contiguous memory
    to hold the content you need in it.


    --
    Tschau/Bye
    Herbert

    Visit http://www.ecomstation.de the home of german eComStation
     
    The Real OS/2 Guy, Feb 28, 2004
    #15
  16. Geetesh

    Jack Klein Guest

    On Fri, 27 Feb 2004 11:00:06 -0800, Kelsey Bjarnason
    <> wrote in comp.lang.c:

    > [snips]
    >
    > On Fri, 27 Feb 2004 06:06:25 +0000, Jack Klein wrote:
    >
    > > On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    > > wrote in comp.lang.c:
    > >
    > >> Recently i saw a code in which there was a structer defination similar
    > >> as bellow:
    > >> struct foo
    > >> {
    > >> int dummy1;
    > >> int dummy2;
    > >> int last[1]
    > >> };

    > >
    > > This causes undefined behavior and is invalid code under all versions
    > > of the C language standard.

    >
    > I thought C99 brought in support for the "struct hack"?


    Indeed it did, but not with [1], so this code is not valid for the C99
    struct with flexible array either.

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Feb 28, 2004
    #16
  17. Geetesh

    Jack Klein Guest

    On Fri, 27 Feb 2004 09:49:30 +0000 (UTC), "The Real OS/2 Guy"
    <> wrote in comp.lang.c:

    > On Fri, 27 Feb 2004 05:45:07 UTC, (Geetesh)
    > wrote:
    >
    > > Recently i saw a code in which there was a structer defination similar
    > > as bellow:
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int last[1]
    > > };
    > > In application the above array is always allocated at runtime using
    > > malloc.In this last member of the structer "int last[1]" is not
    > > actually used as array with single element but when alloacting space
    > > for struct foo extra memory is allocated and last is used as array
    > > with more then one element. my question is what are the advantages of
    > > using the above defination instead of the shown below.
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int *last;
    > > };
    > > The only advantage i can think of is that we will have to call single
    > > malloc in first declaration and two malloc in second declaration and
    > > also that in first declaration all the memeory allocated will be
    > > contigous which may lead to less framgmentation and better cache
    > > utilization. My question is does using first defination for accessing
    > > of elements faster when compared to second. If yes why?
    > > Thanks in advance.

    >
    > It save memory. At lest the amount of memory a pointer costs.
    > It saves time as not every time are 2 malloc() required to fill a
    > whole struct.
    >
    > No, it is NOT undefined behavior as Jack Klein says. But it is
    > implementation defined.


    The term "implementation-defined" has a precise meaning in the C
    standard, and in fact is specifically defined in the standard. The
    only things which are implementation-defined in C are those which the
    C standard specifically states are implementation-defined, using that
    exact term, hyphen and all.

    The standard cannot, and makes no attempt to, prevent compilers from
    providing extensions beyond the language. Such extensions are thus
    that, extensions provided by an implementation. That does not mean
    that they are "implementation-defined" as far as the C language is
    concerned.

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Feb 28, 2004
    #17
  18. Geetesh

    Richard Bos Guest

    "The Real OS/2 Guy" <> wrote:

    > On Fri, 27 Feb 2004 11:11:47 UTC, (Richard
    > Bos) wrote:
    >
    > > "The Real OS/2 Guy" <> wrote:
    > >
    > > > On Fri, 27 Feb 2004 05:45:07 UTC, (Geetesh)
    > > > wrote:
    > > >
    > > > > struct foo
    > > > > {
    > > > > int dummy1;
    > > > > int dummy2;
    > > > > int last[1]
    > > > > };
    > > > > In application the above array is always allocated at runtime using
    > > > > malloc.In this last member of the structer "int last[1]" is not
    > > > > actually used as array with single element but when alloacting space
    > > > > for struct foo extra memory is allocated and last is used as array
    > > > > with more then one element.

    > >
    > > > No, it is NOT undefined behavior as Jack Klein says. But it is
    > > > implementation defined.

    > >
    > > Yes, it is. Pointer addition beyond the end of the array is undefined.

    >
    > Sou you says any action int an array allocated with malloc ends up in
    > undefined behavior.


    Of course not, don't be daft! However did you get that idea?

    > > That some OSes choose to make this kind of undefined behaviour "work"
    > > does not mean that it suddenly is defined.

    >
    > You means that
    >
    > int *p = malloc(4000),
    >
    > stat *p1 = p + sizeof(stat) * 100;
    > stat *p2 = p1++;
    >
    > is undefined behavior?


    I don't know how you can construe what I wrote as meaning that all
    pointer arithmetic is UB, but that's neither what I meant nor what I
    wrote.

    > Where is undefined behavior here?
    >
    > struct x {
    > size_t cb;
    > struct a *pa;
    > int val;
    > unsigned int flags;
    > char *sa[1000]; /* we need 3 to 999 chars here */


    Whereas what you're getting is a thousand pointers to char - something
    completely different.

    > };
    >
    > struct y {
    > size_t cb;
    > struct a *pa;
    > int val;
    > unsigned int flags;
    > char s[1]; /* we have to compile ANSI C 89! */
    > };
    >
    > struct x *p1 = malloc(sizeof(struct x) * 1000); /* UB? */


    Of course not. Massive over-allocation, probably, but not UB.

    > struct y *p2 = malloc(sizeof(struct x) + strlen(data)); /* UB on what? */


    No, this is not yet UB.

    > strcpy(y->s, data); /* UB? on what? */


    Yes, this _is_ UB if more than one char is written - that is, if data
    contains more than an empty string. You are writing to the array and
    (most likely, since data is likely to contain something printable)
    writing beyond its boundary.
    Since the behaviour of code which does this is not defined by the ISO C
    Standard, it is, surprise, surprise, _undefined_.

    > Show one single ANSI C 89 compiler who will give undefined behavior


    Compilers do not "give" undefined behaviour. The Standard _defines_
    defined behaviour, and any _code_ which is not defined by the Standard
    ipso facto invokes undefined behaviour.
    Compilers have nothing to do with it, except insofar as they _must_
    compile correct, not-undefined, code correctly, and _may_, but need not,
    compile code invoking undefined behaviour into something which happens
    to work.

    > I can see no UB in the code fragments above. But I see that any byte
    > address gets addressed well.


    Non sequitur, AFAICT.

    > Tell me what is the difference between UB and implementation
    > defined.


    RTBS:

    # [#1] implementation-defined behavior
    # unspecified behavior where each implementation documents how
    # the choice is made
    # [#1] unspecified behavior
    # behavior where this International Standard provides two or
    # more possibilities and imposes no requirements on which is
    # chosen in any instance

    IOW, implementation-defined behaviour _must_ work, although it may work
    differently using different implementations.

    # [#1] undefined behavior
    # behavior, upon use of a nonportable or erroneous program
    # construct, of erroneous data, or of indeterminately valued
    # objects, for which this International Standard imposes no
    # requirements
    # [#2] NOTE Possible undefined behavior ranges from ignoring
    # the situation completely with unpredictable results, to
    # behaving during translation or program execution in a
    # documented manner characteristic of the environment (with or
    # without the issuance of a diagnostic message), to
    # terminating a translation or execution (with the issuance of
    # a diagnostic message).

    IOW, undefined behaviour _need not_ work, though it may do so if the
    compiler writer feels like being helpful, or may appear to work but fail
    every Wednesday at four o'clock if he feels like being a bastard; in any
    case, it _cannot_ be relied on.
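
To make the distinction concrete, here is a minimal sketch. The function names are made up for illustration and come from neither the Standard nor this thread. Whether plain char is signed is implementation-defined, so the first function gives a documented, reliable answer on every implementation. Indexing past the end of an array is undefined, so the second function checks the index and refuses rather than relying on what a particular compiler happens to do.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Implementation-defined: whether plain char is signed. Each conforming
   implementation documents its choice, so the result is always 0 or 1. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Reading a[i] with i >= n would be undefined behaviour, so check first.
   Returns 1 and stores the element in *out on success, 0 if i is out of
   range (in which case *out is left untouched). */
int checked_get(const int *a, size_t n, size_t i, int *out)
{
    assert(a != NULL && out != NULL);
    if (i >= n)
        return 0;
    *out = a[i];
    return 1;
}
```

With a three-element array a, checked_get(a, 3, 3, &v) returns 0 instead of evaluating a[3].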

    > Whenever you allocate memory in the size you need - not a single byte
    > less - then you CAN'T get UB when you know how to handle pointer
    > arithmetic.


    Which, apparently, you do not.

    Look, in a post elsewhere in this thread I pointed you at some examples
    in the Standard that made this quite clear. Since you seem unable to
    read printed matter or to download a file, I'll give one other example
    myself, to point out exactly where the crux is:

    struct foo {
    int bar;
    char ptr[10];
    };

    struct foo *afoo;
    char *chptr;

    afoo=malloc(sizeof *afoo + 100);
    afoo->bar=10;
    afoo->ptr[5]='a'; /* Legal; within the array. */
    afoo->ptr[15]='b'; /* Illegal; outside the array. */
    chptr=(char *)afoo;
    chptr[sizeof *afoo + 50]='x';
    /* Legal, since chptr points to the memory area, _not_ to the array
    member. Also quite useless from ptr's POV, since we don't know
    how much padding there is at the end of a struct foo. */

    Richard
     
    Richard Bos, Mar 1, 2004
    #18
  19. In article <wmzsGguTDN6N-pn2-vhV7UNrERfsl@moon>, "The Real OS/2 Guy" <> writes:
    > On Fri, 27 Feb 2004 18:59:09 UTC, Kelsey Bjarnason
    > <> wrote:
    > > A compiler which supports
    > > bounds-checking, for example, can happily trap here, since anything other
    > > than an empty string in data will overrun the array bounds.

    >
    > There is no boundschecking in the standard.


    The standard does not prohibit a conforming implementation from
    performing bounds checking.

    C99 supports a form of the struct hack, but not the one you gave.
    Previous versions of C do not support it in any form. That it works
    with some C implementations is utterly beside the point.
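
For reference, the C99 form is the flexible array member: the last member is declared with empty brackets, and the extra space comes from the same single malloc. A minimal sketch follows; struct msg and msg_create are hypothetical names used only for illustration, and a C99 or later compiler is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: no declared length, and it must be the
   last member of the struct. */
struct msg {
    size_t len;
    char text[];
};

/* Hypothetical helper: one malloc holds the header plus a copy of s.
   Returns NULL if the allocation fails; the caller frees the result. */
struct msg *msg_create(const char *s)
{
    size_t len;
    struct msg *m;

    assert(s != NULL);
    len = strlen(s);
    m = malloc(sizeof *m + len + 1);   /* +1 for the terminating '\0' */
    if (m == NULL)
        return NULL;
    m->len = len;
    memcpy(m->text, s, len + 1);       /* text[] covers the extra space */
    return m;
}
```

Unlike the last[1] version, accessing m->text[i] for any i up to len is defined here, because the Standard treats the member as an array as large as the allocation allows.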

    --
    Michael Wojcik

    It wasn't fair; my life was now like everyone else's. -- Eric Severance
     
    Michael Wojcik, Mar 1, 2004
    #19
  20. Geetesh

    Yakov Lerner Guest

    Jack Klein <> wrote:
    > On 26 Feb 2004 21:45:07 -0800, (Geetesh)
    > wrote in comp.lang.c:
    >
    > > Recently i saw a code in which there was a structer defination similar
    > > as bellow:
    > > struct foo
    > > {
    > > int dummy1;
    > > int dummy2;
    > > int last[1]
    > > };

    >
    > This [[meaning - access beyond last[1] -JL ]]
    > causes undefined behavior and is invalid code under all versions
    > of the C language standard.


    With caution, it is possible to write standard-compliant code that
    accesses last[] up to the allocated length. Let's call it the
    "offsetof-hack"; see below. We rely on address equivalence with the
    underlying malloced block: address arithmetic on that block is valid
    up to the allocated length.

    Here is how to do it:

    typedef ... T;
    typedef struct S {
        ...some members defined here ...
        T last[1];
    } S;
    int max_last = 100, k;
    size_t alloc_size = sizeof(S) + max_last * sizeof(T);
    /* or (max_last-1) */
    char *base = malloc( alloc_size );
    /* assuming malloc succeeds */
    S *p = (S*)base;
    for( k = 0; k < max_last; k++)
    {
        /* **** offsetof-hack here---> ******/
        T* plast = (T*)(base + offsetof(S,last));
        /* **** ----------------------- ******/
        /* since valid indexes on base[] are 0..alloc_size-1, */
        /* valid indexes on plast[] are 0..max_last-1 */

        T tmp;
        tmp = plast[k]; // valid
        plast[k] = tmp; // valid
    }

    Such use of offsetof() is explicitly blessed by the standard in
    6.7.2.1 #16 (I'm using the Draft, from
    http://anubis.dkuug.dk/JTC1/SC22/WG14/www/docs/n869/)

    BTW I am not sure whether the slightly altered expression:
    T* plast = (T*)((char*)p + offsetof(S,last));
    is still valid for accessing plast[0..max_last-1].
    I am positive that the expression:
    T* plast = (T*)(base + offsetof(S,last));
    is OK for accessing plast[0..max_last-1].


    Additionally, if the last member is declared 'T last[]' rather than
    'T last[1]', and the compiler is C99-compliant, then direct access
    through p->last is legal up to the allocated size, with no need
    for the 'offsetof-hack'.
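
A compilable rendering of the approach described in this post might look like the sketch below. It assumes T is int, invents a leading member called count and a helper called sum_via_offsetof, and follows the post's reasoning that the element pointer is derived from the malloced block rather than from p->last. It does not settle whether the technique is strictly conforming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int T;                 /* assumption: T is int for illustration */

typedef struct S {
    int count;                 /* hypothetical leading member */
    T last[1];
} S;

/* Allocates room for n elements, stores 1..n through a pointer computed
   from the malloc'ed block, and returns their sum (-1 if malloc fails). */
long sum_via_offsetof(int n)
{
    size_t alloc_size;
    char *base;
    S *p;
    T *plast;
    long sum = 0;
    int k;

    assert(n > 0);
    alloc_size = offsetof(S, last) + (size_t)n * sizeof(T);
    if (alloc_size < sizeof(S))
        alloc_size = sizeof(S);     /* never allocate less than one S */
    base = malloc(alloc_size);
    if (base == NULL)
        return -1;

    p = (S *)base;
    p->count = n;

    /* Element pointer derived from base, not from p->last. */
    plast = (T *)(base + offsetof(S, last));
    for (k = 0; k < p->count; k++)
        plast[k] = k + 1;
    for (k = 0; k < p->count; k++)
        sum += plast[k];

    free(base);
    return sum;
}
```

Calling sum_via_offsetof(100) fills and reads back 100 elements without ever indexing p->last beyond its declared bound.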

    Jacob
     
    Yakov Lerner, Mar 3, 2004
    #20