Storing the size of an array in the structure itself

Discussion in 'C Programming' started by Wynand Winterbach, Jul 6, 2004.

  1. I think every C programmer can relate to the frustrations that malloc
    allocated arrays bring. In particular, I've always found the fact that
    the size of an array must be stored separately to be a nightmare.

    There are of course many solutions, but they all end up forcing you to
    abandon the array syntax in favour of macros or functions.

    Now I have two questions - one is historical, and the other practical.

    1.) Surely malloc (and friends) must store the size allocated for a
    particular memory allocation, if malloc is to know how much to
    deallocate when a free() occurs? Thus, why was the C library designed
    in such a fashion as not to make this information available? Or am I
    seriously missing something here?

    2.) Why not store the size of the array in its first four bytes (or
    first sizeof( size_t ) bytes ), and then shift the pointer to the
    array on by four bytes? Thus one has:

     first 4 bytes     everything else
    [     size     ][       data       ]
                     /\
    void * blah -----'

    Then it should behave as a "normal" array, with the added advantage of
    knowing its size. The reason I have doubts here, is that if this was
    such a good idea, I'm sure it would already have been widely used. Any
    compelling reason for avoiding this? This is a bit of hackery, but the
    hackery will be confined to the functions for allocating, resizing and
    checking the size of the array.

    The code could work as follows:

    void*
    malloc_array( size_t element_size, size_t items )
    {
        size_t sz = element_size * items;
        /* allocate memory for array and for size chunk */
        size_t* result = malloc( sz + sizeof( size_t ) );
        if( result == NULL )
            return NULL;
        *result = items;   /* assign the size to the first few bytes */
        return result + 1; /* return a pointer to the array, just beyond size chunk */
    }

    size_t
    sizeof_array( void *array )
    {
        /* step back over the size chunk; arithmetic on void* is not standard C */
        return *( (size_t*) array - 1 );
    }

    This technique of course could also be used to store the byte size of
    the elements in the array. Oh yes, and in order to detect whether the
    size value was corrupted by accidentally writing over it, one could
    use a magic number (which would again be added by the same technique),
    which could be consulted with debugging code.
    Wynand Winterbach, Jul 6, 2004
    #1

  2. Wynand Winterbach

    Mike Wahler Guest

    "Wynand Winterbach" <> wrote in message
    news:...
    > I think every C programmer can relate to the frustrations that malloc
    > allocated arrays bring.


    I don't find them frustrating at all.


    > In particular, I've always found the fact that
    > the size of an array must be stored separately to be a nightmare.


    Why? Why is it any more 'frustrating' than keeping track
    of any other piece of information in a program? Also note
    that memory obtained with 'malloc()' isn't inherently an 'array',
    it's just a 'chunk' of memory, which you might or might not use
    to store an array.


    > There are of course many solutions, but they all end up forcing you to
    > abandon the array syntax in favour of macros or functions.


    Not at all. The 'best' solution imo is to simply save the
    size you allocate. And again, allocated memory isn't required
    to be used as an array.


    > Now I have two questions - one is historical, and the other practical.
    >
    > 1.) Surely malloc (and friends) must store the size allocated for a
    > particular memory allocation, if malloc is to know how much to
    > deallocate when a free() occurs?


    An implementation of 'malloc()' must of course keep 'housekeeping'
    information. But each implementation is free to implement 'malloc()'
    with whatever method is most appropriate for the target platform.
    The language standard only dictates the *behavior* of 'malloc()',
    not how it is to be implemented.


    > Thus, why was the C library designed
    > in such a fashion as not to make this information available?


    Think about it. When you call 'malloc()', you *have* this information.
    Otherwise you couldn't tell it how much to allocate.

    Also, if you're willing to go nonstandard and platform-specific,
    many implementations do provide a function to give the information
    you're after. Check your documentation.
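
    For readers who want to see what such a nonstandard query looks like,
    here is a minimal sketch. The wrapper name allocation_size is made up
    for illustration; the functions it calls (malloc_usable_size in glibc,
    malloc_size on macOS, _msize in Microsoft's C runtime) are real but
    platform-specific, and none of them is part of standard C.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__GLIBC__)
#include <malloc.h>          /* malloc_usable_size */
#elif defined(__APPLE__)
#include <malloc/malloc.h>   /* malloc_size */
#elif defined(_MSC_VER)
#include <malloc.h>          /* _msize */
#endif

/* Hypothetical wrapper: returns the usable size of a malloc'd block,
   or 0 if this platform offers no query function known to this sketch. */
size_t allocation_size(void *p)
{
    if (p == NULL)
        return 0;
#if defined(__GLIBC__)
    return malloc_usable_size(p);
#elif defined(__APPLE__)
    return malloc_size(p);
#elif defined(_MSC_VER)
    return _msize(p);
#else
    return 0;
#endif
}
```

    The value reported is the size of the block actually handed out, which
    may be larger than the size that was requested, so it cannot stand in
    for an element count.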

    >Or I am
    > seriously missing something here?.


    I think you're just being lazy. :)

    >
    > 2.) Why not store the size of the array in its first four bytes (or
    > first sizeof( size_t ) bytes ), and then shift the pointer to the
    > array on by four bytes? Thus one has:
    >
    > first 4 bytes everything else
    > [ size ][ data ]
    > /\
    > void * blah ---'


    This might indeed be the way it is done for some implementations,
    but it's not required. Perhaps for a given architecture it's
    simply not possible or too inefficient.

    > Then it should behave as a "normal" array,


    IMO you need to stop automatically thinking of allocated memory as
    an 'array'. It's simply allocated memory, to be used as desired.


    >with the added advantage of
    > knowing its size.


    You allocated it, you already know its size. Also note that
    the requirement for 'malloc()' is that it allocate *at least*
    the number of requested bytes, but it's allowed to allocate more
    (would typically be done in the interest of meeting the target
    platform's alignment requirements and/or of efficiency).

    > The reason I have doubts here, is that if this was
    > such a good idea, I'm sure it would already have been widely used.


    It would unnecessarily restrict implementors and possibly which
    platforms the C standard library could be implemented for.


    >Any
    > compelling reason for avoiding this? This is a bit of hackery, but the
    > hackery will be confined to the functions for allocating, resizing and
    > checking the size of the array.


    Right, it's 'hackery'. Keep It Simple. Just Remember The Size.
    (Pass it to any functions that need it).
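
    A minimal sketch of that advice follows; the function name sum_doubles
    is invented for illustration. The array syntax is untouched, and the
    element count simply travels alongside the pointer.

```c
#include <stddef.h>

/* The caller knows how many elements it allocated, so it passes the count. */
double sum_doubles(const double *values, size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; i++)
        total += values[i];      /* ordinary array indexing */
    return total;
}
```

    A caller that did `double *v = malloc(n * sizeof *v);` would later
    call `sum_doubles(v, n)` with the same n.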


    > The code could work as follows:
    >
    > void*
    > malloc_array( size_t element_size, size_t items )
    > {
    > size_t sz = element_size * items;
    > void* result = malloc( sz + sizeof( size_t ) ); /* allocate memory
    > for array and for size chunk */
    > *((size_t*) result) = items; /* assign the size to
    > the first few bytes */
    > return sz + sizeof( size_t ); /* return a pointer
    > to the array pointing just beyond size chunk */
    > }
    >
    > size_t
    > sizeof_array( void *array )
    > {
    > return *( (size_t*) (array - sizeof( size_t )) );
    > }


    If you want to go to all that trouble, be my guest. But I wouldn't
    bother.


    > This technique of course could also be used to store the byte size of
    > the elements in the array.


    But the memory allocated by 'malloc()' needn't necessarily be
    used as an array.


    >Oh yes, and in order to detect whether the
    > size value was corrupted by accidentally writing over it, one could
    > use a magic number (which would again be added by the same technique),
    > which could be consulted with debugging code.


    Perhaps some implementations do this. But again, they're not
    required to.

    -Mike
    Mike Wahler, Jul 6, 2004
    #2

  3. Wynand Winterbach

    Malcolm Guest

    "Wynand Winterbach" <> wrote
    >
    > I think every C programmer can relate to the frustrations that malloc
    > allocated arrays bring. In particular, I've always found the fact that
    > the size of an array must be stored separately to be a nightmare.
    >

    "Nightmare" is way too strong. It is a slight inconvenience to have to keep
    track of array size separately.
    >
    > There are of course many solutions, but they all end up forcing you to
    > abandon the array syntax in favour of macros or functions.
    >

    So these are basically non-solutions. If you want a higher level language
    that does array management for you, then use C++. Trying to use some sort of
    hand-rolled definearray() macro just makes your C code harder to read and
    to maintain.
    >
    > Now I have two questions - one is historical, and the other practical.
    >
    > 1.) Surely malloc (and friends) must store the size allocated for a
    > particular memory allocation, if malloc is to know how much to
    > deallocate when a free() occurs? Thus, why was the C library designed
    > in such a fashion as not to make this information available? Or I am
    > seriously missing something here?
    >

    I wouldn't say "seriously missing". ANSI C could easily have demanded that
    the library provide an msize() function, and it could have been added with
    minor overhead. However in their wisdom they decided against this, probably
    to keep old implementations in business.
    >
    > 2.) Why not store the size of the array in its first four bytes (or
    > first sizeof( size_t ) bytes ), and then shift the pointer to the
    > array on by four bytes?
    >

    Internally a lot of libraries do this. The problem with doing it yourself is
    that it is not the convention, so it will confuse anyone else reading your
    code. You've also got to consider that, strictly, if you allocate an array
    of structures alignment issues may preclude you from grabbing the first four
    bytes. This problem can be solved, but it's another bit of fiddling and
    ugliness.

    malloc() and free() provide a clean, conceptually simple pair of routines
    for memory allocation and deallocation. Once you start messing with them you
    also begin to destroy the essential simplicity of the C language.
    Malcolm, Jul 6, 2004
    #3
  4. Wynand Winterbach

    Chris Torek Guest

    In article <>
    Wynand Winterbach <> writes:
    >1.) Surely malloc (and friends) must store the size allocated for a
    >particular memory allocation, if malloc is to know how much to
    >deallocate when a free() occurs?


    Perhaps. Or perhaps the size is computed via a long, painstaking
    process when you call free(), rather than being stored explicitly.

    Moreover, which size do you suspect that different malloc()
    implementations remember: the size you asked for, or the size
    you got? (You may get more than you asked for -- some malloc()s
    will round the size up in some cases. For instance, certain fast
    but somewhat-space-wasteful malloc()s will give you 4096 bytes
    when you ask for 2100. Indeed, almost all malloc()s probably
    round up in many cases, if not quite so severely.)

    None of this would prohibit a future Standard C from requiring
    some kind of "mallocsize" function, but it would require some
    debate as to whether mallocsize() must return n for all successful
    malloc(n) calls, or whether it could return rounded_up(n). It
    might also constrain future implementors (if mallocsize() were
    "expected" to be fast, and/or if it must return n rather than
    rounded_up(n)).

    All of this adds up to: "It is certainly possible, and not necessarily
    a bad idea, but it is not as simple as it looks at first either."

    >2.) Why not store the size of the array in its first four bytes (or
    >first sizeof( size_t ) bytes ), and then shift the pointer to the
    >array on by four bytes? Thus one has:
    >
    > first 4 bytes everything else
    >[ size ][ data ]
    > /\
    >void * blah ---'


    If you try this on a Sun SPARCstation (in 32-bit "size_t" mode),
    you will find that this technique works for "int"s, "long"s, and
    "float"s, but fails for "long long"s and "double"s. (In 64-bit
    mode it will work if size_t is itself a 64-bit type.) The reason
    is that the hardware requires 8-byte alignment for 8-byte data
    types loaded or stored via ldd/std/ldx/stx/lddf/stdf, and the
    compiler tends to use those instructions for those datatypes (with
    some exceptions -- function parameters of type "double" are misaligned
    in some of the subroutine-call protocols).

    Many other architectures have similar restrictions. Even the
    otherwise-quite-liberal x86 architecture has strong alignment
    constraints for its MMX and SSE instructions (at least if you
    want to use the "fast" MOVAPS instruction). Here the required
    alignment is not 8 but 16 bytes. (It gets even worse for special
    SMP instructions, where "good performance" uses 128-byte alignment!)
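
    The arithmetic behind this failure can be sketched as follows. The
    helper name padded_header_size is hypothetical, and the sketch assumes
    the alignment is a nonzero power of two, as it is on the machines
    discussed here.

```c
#include <stddef.h>

/* Smallest multiple of `alignment` that is >= header_size.
   Assumes alignment is a nonzero power of two. */
size_t padded_header_size(size_t header_size, size_t alignment)
{
    return (header_size + alignment - 1) & ~(alignment - 1);
}
```

    With a 4-byte size_t and 8-byte alignment for double,
    padded_header_size(4, 8) is 8, so skipping only 4 bytes leaves the
    doubles misaligned. For 16-byte SSE data the header would have to
    grow to 16 bytes.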

    There are nonportable ways to make this work -- basically you need
    to prefix the allocated space with a union of a single size_t, and
    the machine's most restrictive data type (whatever that is) or an
    array of bytes of the size of the most restrictive type (whatever
    that is, again). Unfortunately, the C Standard gives you no help
    in finding this most-restrictive type or its size. Such a type/size
    must in fact exist (because malloc() works), but the Standard does
    not export it to user code.
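
    One way such a prefix might look is sketched below. The names
    array_header, array_alloc, array_count and array_free are invented
    here. One thing postdates this thread: C11 added max_align_t to
    <stddef.h>, a type whose alignment is as strict as that of any
    fundamental type, and the sketch assumes a C11 compiler so that it
    can use it. With an older compiler one would have to fall back on a
    union of guessed types such as long, double and void *.

```c
#include <stddef.h>   /* size_t, max_align_t (C11) */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>

/* Header padded to the strictest fundamental alignment. */
typedef union {
    size_t      count;   /* number of elements that follow */
    max_align_t align;   /* forces the size and alignment of the header */
} array_header;

void *array_alloc(size_t elem_size, size_t count)
{
    array_header *h;

    /* Refuse requests whose byte size would overflow size_t. */
    if (elem_size != 0 && count > (SIZE_MAX - sizeof *h) / elem_size)
        return NULL;
    h = malloc(sizeof *h + elem_size * count);
    if (h == NULL)
        return NULL;
    h->count = count;
    return h + 1;              /* first byte after the header */
}

size_t array_count(const void *array)
{
    return ((const array_header *)array - 1)->count;
}

void array_free(void *array)
{
    if (array != NULL)
        free((array_header *)array - 1);
}
```

    Note that max_align_t covers only the fundamental types; data needing
    stricter alignment, such as the 16-byte SSE operands mentioned above
    on some platforms, is not necessarily covered.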
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Jul 7, 2004
    #4
  5. Wynand Winterbach

    jacob navia Guest

    I used an implementation of malloc that I wrote for 16-bit Windows
    that was essentially what you propose: I stored a cookie (magic number)
    and the size, and at the end I stored another cookie, to check if the block was
    overwritten.
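
    A rough sketch of that layout, not the implementation described
    above: the names guard_header, guard_malloc, guard_check and
    guard_free and the cookie value are all made up, and the header uses
    C11's max_align_t so that the user data stays aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_COOKIE 0xC0FFEE42u   /* arbitrary magic number */

typedef union {
    struct { unsigned cookie; size_t size; } info;
    max_align_t align;             /* keeps the user data aligned */
} guard_header;

/* Layout: [ cookie | size ][ user data ... ][ cookie ] */
void *guard_malloc(size_t size)
{
    guard_header *h;
    unsigned tail = GUARD_COOKIE;

    if (size > SIZE_MAX - sizeof *h - sizeof tail)
        return NULL;
    h = malloc(sizeof *h + size + sizeof tail);
    if (h == NULL)
        return NULL;
    h->info.cookie = GUARD_COOKIE;
    h->info.size = size;
    /* The tail may be unaligned, so copy it byte-wise. */
    memcpy((unsigned char *)(h + 1) + size, &tail, sizeof tail);
    return h + 1;
}

/* Returns 1 if both cookies are intact, 0 if either was overwritten. */
int guard_check(const void *p)
{
    const guard_header *h = (const guard_header *)p - 1;
    unsigned tail;

    if (h->info.cookie != GUARD_COOKIE)
        return 0;
    memcpy(&tail, (const unsigned char *)p + h->info.size, sizeof tail);
    return tail == GUARD_COOKIE;
}

void guard_free(void *p)
{
    if (p != NULL)
        free((guard_header *)p - 1);
}
```

    A check like this only detects damage after the fact, and only if the
    stray write happens to hit one of the cookies.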

    But basically what you want is a bounded pointer.

    A bounded pointer is a pointer that can move within a certain memory area
    and not elsewhere.

    Support for bounded pointers is nonexistent in standard C and you must figure
    it out yourself. You do:

    someFn(ptr,siz);

    It is up to you to never make a mistake.
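
    What a hand-rolled bounded pointer might look like in plain C is
    sketched below; the type int_span and the function int_span_get are
    hypothetical names. The pointer and its length are kept in one struct
    so that they cannot drift apart, and every access goes through a check.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer bundled with the number of elements it may address. */
typedef struct {
    int    *data;
    size_t  length;
} int_span;

/* Stores element `index` in *out and returns true, or returns false
   without touching *out when the index is out of bounds. */
bool int_span_get(int_span span, size_t index, int *out)
{
    if (index >= span.length)
        return false;
    *out = span.data[index];
    return true;
}
```

    The price is that access goes through a function call rather than
    plain a[i], and nothing stops code from reaching into span.data
    directly.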

    This is a hole in the language, and recently I wrote an article about this
    in comp.lang.lcc. You can participate in that discussion if you wish.

    jacob
    jacob navia, Jul 7, 2004
    #5
  6. Wynand Winterbach

    Mike Wahler Guest

    "jacob navia" <> wrote in message
    news:ccg414$ti9$...
    > I used an implementation of malloc that I wrote for windows 16 bit
    > that essentially was what you propose: I stored a cookie (magic number)
    > the size, and at the end I stored another cookie, to check if the block was
    > overwritten.
    >
    > But basicaly what you want is a bounded pointer.
    >
    > A bounded pointer is a pointer that can move within a certain memory area
    > and not elsewhee.
    >
    > Support for bounded pointers is inexistent in standard C and you must figure
    > it out yourself. You do:
    >
    > someFn(ptr,siz);
    >
    > It is up to you to never make a mistake.
    >
    > This is a hole in the language,


    I wouldn't call it (lack of a memory access 'safety net')
    a 'hole' in the language at all. Some might deem having it
    a nicety, but it's certainly not needed. It would also impose
    an unnecessary restriction on implementations, especially those
    tailored for maximum efficiency (rather than 'safety').
    Someone here recently said "C is a sharp tool". I agree, and
    I like it that way. :)

    -Mike
    Mike Wahler, Jul 7, 2004
    #6
  7. Wynand Winterbach

    jacob navia Guest

    "Mike Wahler" <> a écrit dans le message de
    news:b2NGc.7647$...
    > Someone here recently said "C is a sharp tool". I agree, and
    > I like it that way. :)


    Mike:

    The prototype of a sharp tool is a knife. But you have surely remarked that
    it has two sides:

    A sharp side, the blade, that YOU DO NOT TOUCH.

    A blunt side, the handle, that allows you to drive the blade SAFELY.

    Without this blunt side, a sharp knife would be unusable because
    you would always CUT YOUR FINGERS when using it,
    unless extremely careful.

    What I am missing in C is exactly this blunt side that would allow you to
    use a sharp tool safely. A sharp tool without this is UNUSABLE.
    You always end up with bleeding fingers. You are bound to
    make a mistake.

    jacob
    jacob navia, Jul 7, 2004
    #7
  8. Wynand Winterbach

    Harti Brandt Guest

    On Wed, 7 Jul 2004, jacob navia wrote:

    jn>
    jn>"Mike Wahler" <> a écrit dans le message de
    jn>news:b2NGc.7647$...
    jn>> Someone here recently said "C is a sharp tool". I agree, and
    jn>> I like it that way. :)
    jn>
    jn>Mike:
    jn>
    jn>The prototype of a sharp tool is a knife. But you have surely remarked that
    jn>it as two sides:
    jn>
    jn>A sharp side, the blade, that YOU DO NOT TOUCH.
    jn>
    jn>A blunt side, the handle, that allows you to drive the blade SAFELY.
    jn>
    jn>Without this blunt side, a sharp knife would be unusable because
    jn>you would always CUT YOURSELF THE FINGERS when using it,
    jn>unless extremely careful.
    jn>
    jn>What I am missing in C is exactly this blunt side that would allow you to
    jn>use safely a sharp tool. A sharp tool without this is UNUSABLE.
    jn>You end always with bleeding fingers at the end. You are bound to
    jn>make a mistake.

    I fail to see why one should bloat the C standard just to allow
    programmers not to think. In every language that allows you to do what C
    allows you to do, programmers who are too lazy to think will shoot
    themselves in the foot. snprintf() has been around for years, yet in
    new code you'll find that people just use sprintf() in places where they
    really shouldn't. On the other hand, someone has still to show me why

    void
    foo(int i)
    {
        char buf[100];

        sprintf(buf, "%d", i);

        ...
    }

    is unsafe (unless the implementation has an int with a size of thousands
    of bits). sprintf() is not unsafe by itself, but the usage of it may be
    unsafe.
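
    For comparison, a bounded version of the same call might look like
    this. The function name format_int is invented; snprintf itself is
    standard since C99 and returns the number of characters the full
    output would have needed, which makes truncation detectable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Formats i into buf (capacity `size` bytes). Returns false if the
   result did not fit or an encoding error occurred. */
bool format_int(char *buf, size_t size, int i)
{
    int needed = snprintf(buf, size, "%d", i);
    return needed >= 0 && (size_t)needed < size;
}
```

    With a 100-byte buffer this never fails for an int of ordinary width,
    which is the point made above; the return value only starts to matter
    when the buffer size or the format is less obviously sufficient.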

    The only way to make people program more securely is to teach them
    (and simply to avoid the unteachable ones) and, if appropriate, to layer
    policies on top of the language. There are standards available
    that cripple the usage of C (not C itself) down to whatever seems 'secure'
    to the writers of the standard. I've been told, for example, that for satellite
    on-board control software the use of dynamic memory (malloc() and friends)
    is excluded. But again, this is not a technical matter, but one of policy.

    harti
    Harti Brandt, Jul 7, 2004
    #8
  9. Wynand Winterbach

    -berlin.de Guest

    jacob navia <> wrote:

    > "Mike Wahler" <> a écrit dans le message de
    > news:b2NGc.7647$...
    >> Someone here recently said "C is a sharp tool". I agree, and
    >> I like it that way. :)


    > Mike:


    > The prototype of a sharp tool is a knife. But you have surely remarked that
    > it as two sides:


    Perhaps Mike was referring to a two-edged sword; knives you can even
    put (more or less safely) in the hands of children ;-)

    Regards, Jens
    --
    \ Jens Thoms Toerring ___ -berlin.de
    \__________________________ http://www.toerring.de
    -berlin.de, Jul 7, 2004
    #9
  10. Wynand Winterbach

    jacob navia Guest

    <-berlin.de> a écrit dans le message de
    news:...
    > jacob navia <> wrote:
    >
    >
    > Perhaps Mike was refering to a two-edged sword, knives you can even
    > put (more or less safely) in the hands of children;-)


    Swords (as knives) have handles too. You do NOT touch the sharp edge
    with your hands. All sharp tools have blunt edges. We need this for
    C. It is not a matter of making C what it isn't, it is just making C
    safer to use and KEEPING its qualities!
    jacob navia, Jul 7, 2004
    #10
  11. Wynand Winterbach

    Michael Mair Guest

    Hiho,

    > 2.) Why not store the size of the array in its first four bytes (or
    > first sizeof( size_t ) bytes ), and then shift the pointer to the
    > array on by four bytes? Thus one has:
    >
    > first 4 bytes everything else
    > [ size ][ data ]
    > /\
    > void * blah ---'
    >
    > Then it should behave as a "normal" array, with the added advantage of
    > knowing its size. The reason I have doubts here, is that if this was
    > such a good idea, I'm sure it would already have been widely used. Any
    > compelling reason for avoiding this? This is a bit of hackery, but the
    > hackery will be confined to the functions for allocating, resizing and
    > checking the size of the array.
    >
    > The code could work as follows:
    >
    > void*
    > malloc_array( size_t element_size, size_t items )
    > {
    > size_t sz = element_size * items;
    > void* result = malloc( sz + sizeof( size_t ) ); /* allocate memory
    > for array and for size chunk */
    > *((size_t*) result) = items; /* assign the size to
    > the first few bytes */
    > return sz + sizeof( size_t ); /* return a pointer
    > to the array pointing just beyond size chunk */
    > }
    >
    > size_t
    > sizeof_array( void *array )
    > {
    > return *( (size_t*) (array - sizeof( size_t )) );
    > }
    >
    > This technique of course could also be used to store the byte size of
    > the elements in the array. Oh yes, and in order to detect whether the
    > size value was corrupted by accidentally writing over it, one could
    > use a magic number (which would again be added by the same technique),
    > which could be consulted with debugging code.


    Other people have pointed out reasons why not to use this approach
    for pointers and how it could go wrong.

    However, as you seem to think "array" when you hear or say "pointer",
    you should perhaps have a look at a flexible array member as the last
    entry of a structure (a C99 feature). Just put the size in as the
    first element of that structure. This, in essence, produces the same
    thing you want in a clean way, and if you need a pointer,
    you can generate it from the address of the array; however, you have to
    do this for every element type if you do not want to run into memory
    alignment issues.

    Something like this:

    struct arrayplussize_double {
        size_t size;
        double array[];
    };

    with sizeof(struct arrayplussize_double) *ignoring* the flexible
    array member but extending the struct size such that the flexible
    array member would be correctly aligned.
    Allocation works like this:

    struct arrayplussize_double *myarray =
        malloc(sizeof(struct arrayplussize_double) + sizeof(double)*desiredsize);

    where desiredsize is the desired size of the array.
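
    Fleshed out a little, the idea might look like the following sketch.
    The struct is the one suggested in this post; the helper name
    new_double_array and its overflow check are additions made here for
    illustration. Flexible array members require C99 or later.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct arrayplussize_double {
    size_t size;
    double array[];      /* flexible array member (C99) */
};

/* Hypothetical helper: allocates room for `desiredsize` doubles and
   records the count. Returns NULL on failure or overflow. */
struct arrayplussize_double *new_double_array(size_t desiredsize)
{
    struct arrayplussize_double *a;

    if (desiredsize > (SIZE_MAX - sizeof *a) / sizeof(double))
        return NULL;
    a = malloc(sizeof *a + sizeof(double) * desiredsize);
    if (a != NULL)
        a->size = desiredsize;
    return a;
}
```

    Elements are then reached with ordinary indexing as a->array[i], the
    count is a->size, and a single free(a) releases both.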


    Cheers,
    Michael
    Michael Mair, Jul 7, 2004
    #11
  12. Wynand Winterbach

    Michael Mair Guest

    Hi Jacob,


    >>Perhaps Mike was refering to a two-edged sword, knives you can even
    >>put (more or less safely) in the hands of children;-)

    >
    >
    > Swords (as knives) have handles too. You do NOT touch the sharp edge
    > with your hands. All sharp tools have blunt edges. We need this for
    > C. It is not a matter of making C what it isn't, it is just making C
    > safer to use and KEEPING its qualities!


    I have not had a look at lcc nor at your proposals, so
    I cannot say anything to what you actually did.
    I just wanted to point out that the others do not criticise
    your wanting to fit a more convenient and safe handle to a sharp
    tool but your remaking this sharp tool into a spoon (in their
    opinion)... whether this keeps the qualities you need but
    not theirs, I cannot say.

    Back to something more on-topic: You are free to sell your bounded
    pointers as compiler extensions. I certainly consider it safer to see
    some people getting kitchen knives and spoons when learning to use
    tools instead of swords and daggers.

    However, I am perhaps not up to the sword yet but I certainly
    appreciate having a dagger when I need one.


    Apart from all these nice pictures:
    None of us is unhappy when you point out possible *additional*
    solutions or mention (together with a caveat) that there is
    an incredibly fool proof extension in your compiler suite.
    The thing is more that we do not want to get these things
    *instead* and do not want to discuss your compiler only.

    Especially if you are handling things in a way which is not
    conforming to the standard as you sometimes point out
    then you are making your comments off-topic. Also, some people
    react to a tone sounding too much like heralding the only true
    and of course different from all others solution.


    I think that you have presented some nice ideas up to now and
    come time I certainly will have a look at lcc if I need a
    compiler in an environment where lcc fits in.


    Cheers
    Michael
    Michael Mair, Jul 7, 2004
    #12
  13. Wynand Winterbach

    Mike Wahler Guest

    "jacob navia" <> wrote in message
    news:ccg8ta$vvg$...
    >
    > "Mike Wahler" <> a écrit dans le message de
    > news:b2NGc.7647$...
    > > Someone here recently said "C is a sharp tool". I agree, and
    > > I like it that way. :)

    >
    > Mike:
    >
    > The prototype of a sharp tool is a knife. But you have surely remarked that
    > it as two sides:
    >
    > A sharp side, the blade, that YOU DO NOT TOUCH.


    Right. Or only touch it very lightly.

    >
    > A blunt side,


    Some knife blades have a blunt side, others are sharp on
    both edges.

    > the handle, that allows you to drive the blade SAFELY.


    Right. My 'handle' is my mind, my judgement, my experience.


    >
    > Without this blunt side, a sharp knife would be unusable because
    > you would always CUT YOURSELF THE FINGERS when using it,
    > unless extremely careful.


    Right. That's why when I write C, I think carefully while doing it.

    >
    > What I am missing in C is exactly this blunt side that would allow you to
    > use safely a sharp tool.


    I'm not missing it. It's part of me.


    > A sharp tool without this is UNUSABLE.


    So think while you code in C.

    > You end always with bleeding fingers at the end. You are bound to
    > make a mistake.


    Everyone makes mistakes, with or without 'safety features'. How
    many times have you heard of folks seriously injured or killed
    in auto accidents because they failed to fasten their seatbelts?

    -Mike
    Mike Wahler, Jul 7, 2004
    #13
  14. In article <ccg8ta$vvg$>,
    jacob navia <> wrote:
    >
    >"Mike Wahler" <> a écrit dans le message de
    >news:b2NGc.7647$...
    >> Someone here recently said "C is a sharp tool". I agree, and
    >> I like it that way. :)


    <snip>
    >What I am missing in C is exactly this blunt side that would allow you to
    >use safely a sharp tool. A sharp tool without this is UNUSABLE.
    >You end always with bleeding fingers at the end. You are bound to
    >make a mistake.


    Funny, I thought that being able to keep the size around so you knew
    how big the array is was that blunt side.

    There's nothing wrong with requiring programmers to be careful with
    languages like C, any more than with requiring anybody else to be careful
    using potentially dangerous tools. If you want Java, you know where to
    find it.


    dave

    --
    Dave Vandervies
    > The smartest people I know aren't programmers. What does that say?

    Nothing surprising!
    --Andrew Dalke and Coby Beck in comp.lang.scheme
    Dave Vandervies, Jul 7, 2004
    #14
  15. On Wed, 7 Jul 2004 15:07:35 +0200, "jacob navia"
    <> wrote:

    >
    ><-berlin.de> a écrit dans le message de
    >news:...
    >> jacob navia <> wrote:
    >>
    >>
    >> Perhaps Mike was refering to a two-edged sword, knives you can even
    >> put (more or less safely) in the hands of children;-)

    >
    >Swords (as knives) have handles too. You do NOT touch the sharp edge
    >with your hands. All sharp tools have blunt edges. We need this for
    >C. It is not a matter of making C what it isn't, it is just making C
    >safer to use and KEEPING its qualities!
    >
    >


    With C, you are free to design the hilt in any way you like. You may
    encrust it with diamonds, you can wrap a towel around the blade's
    base, or just use it bare and with bare hands, risking very nasty
    cuts. You have done a great job with LCC (and I love using it), but
    IMHO _the_standard_ is fine the way it is. But this HO is coming from
    a person who doesn't like the fact that functions can return structs,
    so feel free to discard it :).

    --
    aib

    ISP e-mail accounts are good for receiving spam.
    Orhan Kavrakoglu, Jul 8, 2004
    #15
  16. "Malcolm" <> wrote in message news:<ccfa6i$64a$>...
    > "Wynand Winterbach" <> wrote
    > >
    > > I think every C programmer can relate to the frustrations that malloc
    > > allocated arrays bring. In particular, I've always found the fact that
    > > the size of an array must be stored separately to be a nightmare.
    > >

    > "Nightmare" is way too strong. It is a slight inconvenience to have to keep
    > track of array size separately.


    Ok, well, it annoys the hell out of me. Really.

    > >
    > > There are of course many solutions, but they all end up forcing you to
    > > abandon the array syntax in favour of macros or functions.
    > >

    > So these are basically non-solutions. If you want a higher level language
    > that does array management for you, then use C++. Trying to use some sort to
    > hand-rolled definearray() macro just makes your C code harder to read and
    > to maintain.


    I agree that macros make code harder to read, but this is no reason to
    go for C++. C++ has the major drawback, that one is not guaranteed that
    &a[0], where a is a std::vector of some type, will point to a chunk of
    memory containing your values. It happens to work for my g++ library's
    implementation, but AFAIK, it is not required to work. This makes it
    a real burden to ensure portable code that must also work with C functions.

    So why not write a CArray type, which gives a function to do this? I could,
    but I happen to want to port my code to Plan 9, and hence I want to keep it
    in C. I don't find the GCC port to Plan 9 satisfactory, and I like C,
    besides.

    > > Now I have two questions - one is historical, and the other practical.
    > >
    > > 1.) Surely malloc (and friends) must store the size allocated for a
    > > particular memory allocation, if malloc is to know how much to
    > > deallocate when a free() occurs? Thus, why was the C library designed
    > > in such a fashion as not to make this information available? Or I am
    > > seriously missing something here?
    > >

    > I wouldn't say "seriously missing". ANSI C could easily have demanded that
    > the library provide an msize() function, and it could have been added with
    > minor overhead. However in their wisdom they decided against this, probably
    > to keep old implementations in business.
    > >
    > > 2.) Why not store the size of the array in its first four bytes (or
    > > first sizeof( size_t ) bytes ), and then shift the pointer to the
    > > array on by four bytes?
    > >

    > Internally a lot of libraries do this. The problem with doing it yourself is
    > that it is not the convention, so it will confuse anyone else reading your
    > code. You've also got to consider that, strictly, if you allocate an array
    > of structures alignment issues may preclude you from grabbing the first four
    > bytes. This problem can be solved, but it's another bit of fiddling and
    > ugliness.
    >
    > malloc() and free() provide a clean, conceptually simple pair of routines
    > for memory allocation and deallocation. Once you start messing with them you
    > also begin to destroy the essential simplicity of the C language.


    No, that is patently untrue. I cannot see how this would destroy the
    simplicity of C. My method is also conceptually simple (I think).
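
The alignment concern quoted above can be handled inside the allocation functions themselves. What follows is only a minimal sketch, assuming a C11 compiler (for max_align_t); union array_header and free_array are hypothetical names that are not part of the original proposal. The header is padded to the strictest alignment, so the data that follows it is aligned for any element type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored in front of the array data. */
union array_header {
    size_t      items;  /* number of elements that follow the header */
    max_align_t align;  /* pads the header to the strictest alignment (C11) */
};

void *malloc_array(size_t element_size, size_t items)
{
    union array_header *h;

    /* Refuse requests whose total byte count would overflow size_t. */
    if (element_size != 0 && items > (SIZE_MAX - sizeof *h) / element_size)
        return NULL;

    h = malloc(sizeof *h + element_size * items);
    if (h == NULL)
        return NULL;

    h->items = items;
    return h + 1;       /* first byte just past the header */
}

size_t sizeof_array(const void *array)
{
    assert(array != NULL);
    return ((const union array_header *)array - 1)->items;
}

void free_array(void *array)
{
    /* The pointer handed to free() must be the one malloc() returned. */
    if (array != NULL)
        free((union array_header *)array - 1);
}
```

The cost is that such a pointer must never be passed to plain free() or realloc(), which is probably the strongest practical argument against the scheme.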
    Wynand Winterbach, Jul 20, 2004
    #16
  17. "jacob navia" <> wrote in message news:<ccg414$ti9$>...
    > I used an implementation of malloc that I wrote for windows 16 bit
    > that essentially was what you propose: I stored a cookie (magic number)
    > and the size, and at the end I stored another cookie, to check if the
    > block was overwritten.
    >
    > But basically what you want is a bounded pointer.
    >
    > A bounded pointer is a pointer that can move within a certain memory area
    > and not elsewhere.
    >
    > Support for bounded pointers is nonexistent in standard C and you must
    > figure it out yourself. You do:
    >
    > someFn(ptr,siz);
    >
    > It is up to you to never make a mistake.
    >
    > This is a hole in the language, and recently I wrote an article about this
    > in comp.lang.lcc. You can participate in that discussion if you wish.
    >
    > jacob


    Actually I don't want a bounded pointer, although one could certainly
    provide a bounds checking indexing function. I only really want to be
    able to forget about maintaining an array's size. However, I think
    what you implemented was pretty much what I have in mind.
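
The cookie scheme described in the quoted post could look roughly like the sketch below. This is not jacob's implementation, only a guess at its shape: GUARD_MAGIC, union guard_header and the guarded_* names are hypothetical, and C11 is assumed for max_align_t. A cookie and the size sit in front of the data, and a second cookie sits directly behind it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_MAGIC 0xDEADBEEFu  /* arbitrary value chosen for this sketch */

/* Hypothetical header placed in front of the user data. */
union guard_header {
    struct {
        uint32_t magic;  /* front cookie */
        size_t   size;   /* number of user bytes requested */
    } info;
    max_align_t align;   /* keeps the user data suitably aligned */
};

void *guarded_malloc(size_t size)
{
    const uint32_t magic = GUARD_MAGIC;
    union guard_header *h;

    if (size > SIZE_MAX - sizeof *h - sizeof magic)
        return NULL;

    h = malloc(sizeof *h + size + sizeof magic);
    if (h == NULL)
        return NULL;

    h->info.magic = magic;
    h->info.size = size;
    /* The rear cookie may be unaligned, so it is copied byte-wise. */
    memcpy((unsigned char *)(h + 1) + size, &magic, sizeof magic);
    return h + 1;
}

/* Returns 1 if both cookies are intact, 0 if either was overwritten. */
int guarded_check(const void *p)
{
    const union guard_header *h = (const union guard_header *)p - 1;
    uint32_t rear;

    if (h->info.magic != GUARD_MAGIC)
        return 0;
    memcpy(&rear, (const unsigned char *)p + h->info.size, sizeof rear);
    return rear == GUARD_MAGIC;
}

void guarded_free(void *p)
{
    if (p != NULL) {
        assert(guarded_check(p));  /* catch overruns in debug builds */
        free((union guard_header *)p - 1);
    }
}
```

A check like this only detects writes that land on a cookie; it says nothing about reads or about writes that jump past it.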

    I don't know whether I consider the lack of bounded pointers to be a
    hole in the language. I don't even consider my criticism really to
    indicate a hole in the language either. I just find it a pain to
    maintain an array size separately.
    Wynand Winterbach, Jul 20, 2004
    #17
  18. Michael Mair <-stuttgart.de> wrote in message news:<ccgts1$kqi$-stuttgart.de>...
    > Hiho,
    >
    > > 2.) Why not store the size of the array in its first four bytes (or
    > > first sizeof( size_t ) bytes ), and then shift the pointer to the
    > > array on by four bytes? Thus one has:
    > >
    > > first 4 bytes everything else
    > > [ size ][ data ]
    > > /\
    > > void * blah ---'
    > >
    > > Then it should behave as a "normal" array, with the added advantage of
    > > knowing its size. The reason I have doubts here, is that if this was
    > > such a good idea, I'm sure it would already have been widely used. Any
    > > compelling reason for avoiding this? This is a bit of hackery, but the
    > > hackery will be confined to the functions for allocating, resizing and
    > > checking the size of the array.
    > >
    > > The code could work as follows:
    > >
    > > void*
    > > malloc_array( size_t element_size, size_t items )
    > > {
    > > size_t sz = element_size * items;
    > > /* allocate memory for array and for size chunk */
    > > void* result = malloc( sz + sizeof( size_t ) );
    > > if ( result == NULL ) return NULL;
    > > /* assign the size to the first few bytes */
    > > *((size_t*) result) = items;
    > > /* return a pointer to the array pointing just beyond size chunk */
    > > return (char*) result + sizeof( size_t );
    > > }
    > >
    > > size_t
    > > sizeof_array( void *array )
    > > {
    > > /* step back over the size chunk; arithmetic on void* is not standard C */
    > > return *( (size_t*) ((char*) array - sizeof( size_t )) );
    > > }
    > >
    > > This technique of course could also be used to store the byte size of
    > > the elements in the array. Oh yes, and in order to detect whether the
    > > size value was corrupted by accidentally writing over it, one could
    > > use a magic number (which would again be added by the same technique),
    > > which could be consulted with debugging code.

    >
    > Other people have pointed out reasons why not to use this approach
    > for pointers and how it could go wrong.
    >
    > However, as you seem to think "array" when you hear or say pointer,


    Err no. I don't know why everyone is so eager to assume that because I
    was talking about arrays that I think all pointers point to arrays.
    It's a bit patronising to make such assumptions.

    > you maybe should have a look at a flexible array member (an array of
    > unspecified length as the last member of a structure). Just put the size
    > in as a first element of that structure. This, in essence, produces the same
    > thing as you want to have in a clean way, and if you need a pointer,
    > you can generate it from the address of the array; however, you have to
    > do this for every type if you do not want to run into memory alignment
    > issues.
    >
    > Something like that:
    >
    > struct arrayplussize_double {
    > size_t size;
    > double array[];
    > };
    >
    > with sizeof(struct arrayplussize_double) *ignoring* the flexible
    > array member but extending the struct size such that the flexible
    > array member would be correctly aligned.
    > Allocation works like this
    >
    > struct arrayplussize_double *myarray = malloc(sizeof(struct
    > arrayplussize_double)+sizeof(double)*desiredsize);
    >
    > where desiredsize is the desired size of the array.


    I don't like this approach. It's used by GLib, and it's exactly this
    that made me think of another approach.
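
For reference, the flexible-array-member layout quoted above can be sketched as follows, assuming C99 or later. The struct is the one from the quoted post; new_double_array and sum_double_array are hypothetical helper names added only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct arrayplussize_double {
    size_t size;
    double array[];  /* flexible array member: must be the last member */
};

/* Allocates the struct and desiredsize doubles in a single block. */
struct arrayplussize_double *new_double_array(size_t desiredsize)
{
    struct arrayplussize_double *a;

    /* Guard against overflow in the size computation. */
    if (desiredsize > (SIZE_MAX - sizeof *a) / sizeof a->array[0])
        return NULL;

    a = malloc(sizeof *a + sizeof a->array[0] * desiredsize);
    if (a != NULL)
        a->size = desiredsize;
    return a;
}

/* The size travels with the data, so no separate length argument is needed. */
double sum_double_array(const struct arrayplussize_double *a)
{
    double total = 0.0;
    size_t i;

    assert(a != NULL);
    for (i = 0; i < a->size; i++)
        total += a->array[i];
    return total;
}
```

Elements are then reached as a->array[i] rather than a[i], and one such struct is needed per element type, which is the part I find unattractive.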

    >
    > Cheers,
    > Michael
    Wynand Winterbach, Jul 20, 2004
    #18
  19. Harti Brandt <> wrote in message news:<>...
    > On Wed, 7 Jul 2004, jacob navia wrote:
    >
    > jn>
    > jn>"Mike Wahler" <> wrote in message
    > jn>news:b2NGc.7647$...
    > jn>> Someone here recently said "C is a sharp tool". I agree, and
    > jn>> I like it that way. :)
    > jn>
    > jn>Mike:
    > jn>
    > jn>The prototype of a sharp tool is a knife. But you have surely remarked
    > jn>that it has two sides:
    > jn>
    > jn>A sharp side, the blade, that YOU DO NOT TOUCH.
    > jn>
    > jn>A blunt side, the handle, that allows you to drive the blade SAFELY.
    > jn>
    > jn>Without this blunt side, a sharp knife would be unusable because
    > jn>you would always CUT YOUR FINGERS when using it,
    > jn>unless extremely careful.
    > jn>
    > jn>What I am missing in C is exactly this blunt side that would allow you to
    > jn>safely use a sharp tool. A sharp tool without this is UNUSABLE.
    > jn>You always end up with bleeding fingers. You are bound to
    > jn>make a mistake.
    >
    > I fail to see why one should bloat the C standard just to allow
    > programmers not to think. With every language that allows you what C
    > allows you to do, programmers that are too lazy to think will shoot
    > themselves in the knee. snprintf() has been around for years, yet in
    > new code you'll find that people just use sprintf() in places where they
    > really shouldn't. On the other hand someone has still to show me why


    Because even smart programmers make mistakes. If we took the argument
    that programmers "have to think" to its fullest, then I see no reason
    why we abandoned assembler. Sure, assembler isn't portable, but I hardly
    think people made the shift for this reason. The less you have to keep
    programming details in mind, the more you can concentrate on the problem
    at hand.

    > void
    > foo(int i)
    > {
    > char buf[100];
    >
    > sprintf(buf, "%d", i);
    >
    > ...
    > }
    >
    > is unsafe (unless the implementation has an int with a size of thousands
    > of bits). sprintf() is not unsafe by itself, but the usage of it may be
    > unsafe.
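
The quoted sprintf() call is indeed fine for any realistic int, but the safety argument lives in the reader's head rather than in the code. A small sketch of the snprintf() alternative mentioned above, assuming C99; format_int is a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Formats i into buf, writing at most bufsize bytes including the
 * terminating '\0'. Returns 1 if the whole number fit, 0 if the output
 * was truncated or an encoding error occurred.
 */
int format_int(char *buf, size_t bufsize, int i)
{
    int n;

    assert(buf != NULL && bufsize > 0);
    n = snprintf(buf, bufsize, "%d", i);
    return n >= 0 && (size_t)n < bufsize;
}
```

With this form the bound is stated at the call site, e.g. format_int(buf, sizeof buf, i), and truncation becomes a testable result instead of a buffer overrun.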
    >
    > The only way to make people program more securely is to teach them
    > (and to avoid the unteachable ones) and, if appropriate, to layer
    > policies on top of the language. There are appropriate standards available
    > that restrict the usage of C (not C itself) to what seems 'secure' to the
    > writers of the standard. I've been told, for example, that for satellite
    > on-board control software the use of dynamic memory (malloc() and friends)
    > is excluded. But again, this is not a technical matter, but one of policy.


    This has more to do with the fact that such programs must meet
    realtime demands. This isn't generally possible with dynamic allocation.

    Proper usage of tools like splint and spin is far more beneficial
    than the application of some Draconian policy on how to do secure
    programming.

    > harti
    > --
    Wynand Winterbach, Jul 21, 2004
    #19
  20. "Mike Wahler" <> wrote in message news:<dRFGc.7333$>...
    > "Wynand Winterbach" <> wrote in message
    > news:...
    > > I think every C programmer can relate to the frustrations that malloc
    > > allocated arrays bring.

    >
    > I don't find them frustrating at all.
    >
    >
    > > In particular, I've always found the fact that
    > > the size of an array must be stored separately to be a nightmare.

    >
    > Why? Why is it any more 'frustrating' than keeping track
    > of any other piece of information in a program?


    Anything that decouples information in this way makes the life
    of the programmer harder, and the code harder to read! Array size
    is an integral part of the concept of an array. Would you have
    an employee struct, only to store an employee's birth date separately?
    I doubt it.
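
To make the coupling argument concrete: the conventional way of keeping the size next to the data, without hiding anything in front of the pointer, is a small struct. This is only a sketch; struct int_array and its helper functions are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The length and the data pointer are stored, passed and returned together. */
struct int_array {
    size_t len;
    int   *data;
};

/* Returns 1 on success, 0 if the allocation failed. */
int int_array_init(struct int_array *a, size_t len)
{
    assert(a != NULL);
    a->len = 0;
    /* calloc() zero-fills and checks the len * size multiplication itself. */
    a->data = calloc(len ? len : 1, sizeof *a->data);
    if (a->data == NULL)
        return 0;
    a->len = len;
    return 1;
}

void int_array_free(struct int_array *a)
{
    assert(a != NULL);
    free(a->data);
    a->data = NULL;
    a->len = 0;
}
```

Indexing becomes a.data[i] instead of a[i], and a separate struct is needed for each element type, which is exactly the syntactic overhead I would like to avoid.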

    > Also note
    > that memory obtained with 'malloc()' isn't inherently an 'array',
    > it's just a 'chunk' of memory, which you might or might not use
    > to store an array.


    All I'm going to say here is DUH. Geesh, I cannot think of a single
    mammal capable of C programming that would assume a malloc chunk
    is anything more than just that - a chunk of memory. I never even
    once intimated any such thing.

    >
    > > There are of course many solutions, but they all end up forcing you to
    > > abandon the array syntax in favour of macros or functions.

    >
    > Not at all. The 'best' solution imo is to simply save the
    > size you allocate. And again, allocated memory isn't required
    > to be used as an array.
    >
    >
    > > Now I have two questions - one is historical, and the other practical.
    > >
    > > 1.) Surely malloc (and friends) must store the size allocated for a
    > > particular memory allocation, if malloc is to know how much to
    > > deallocate when a free() occurs?

    >
    > An implementation of 'malloc()' must of course keep 'housekeeping'
    > information. But each implementation is free to implement 'malloc()'
    > with whatever method is most appropriate for the target platform.
    > The language standard only dictates the *behavior* of 'malloc()',
    > not how it is to be implemented.
    >
    >
    > > Thus, why was the C library designed
    > > in such a fashion as not to make this information available?

    >
    > Think about it. When you call 'malloc()', you *have* this information.
    > Otherwise you couldn't tell it how much to allocate.


    Yes, but again you're missing the point. Of course you have it! But
    this is not my principal problem.

    > Also, if you're willing to go nonstandard and platform-specific,
    > many implementations do provide a function to give the information
    > you're after. Check your documentation.
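
For what it's worth, the extensions I am aware of are malloc_usable_size() (glibc and musl), malloc_size() (macOS) and _msize() (Microsoft's C runtime). None of them is standard C, and the wrapper below is only a sketch: usable_size and USABLE_SIZE are hypothetical names, and the platform detection is an assumption that may need adjusting:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(__linux__)
#include <malloc.h>          /* malloc_usable_size() */
#define USABLE_SIZE(p) malloc_usable_size(p)
#elif defined(__APPLE__)
#include <malloc/malloc.h>   /* malloc_size() */
#define USABLE_SIZE(p) malloc_size(p)
#elif defined(_MSC_VER)
#include <malloc.h>          /* _msize() */
#define USABLE_SIZE(p) _msize(p)
#else
#define USABLE_SIZE(p) ((size_t)0)  /* no known extension on this platform */
#endif

/*
 * Returns the number of usable bytes in the block p points to, or 0 when
 * the platform offers no way to ask. p must come from malloc() and friends.
 */
size_t usable_size(void *p)
{
    assert(p != NULL);
    return USABLE_SIZE(p);
}
```

The value reported is the capacity of the block, which may be larger than the size originally requested, so it cannot stand in for an element count.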
    >
    > >Or I am
    > > seriously missing something here?.

    >
    > I think you're just being lazy. :)


    There is no virtue in the Calvinistic discipline wrought onto the
    programmer who has to keep track of so many details of the language
    at the expense of the problem s/he is solving.

    This is why Python is so great - the ultimate language for the lazy
    programmer. You also happen to produce pretty solid code with it,
    due to the concentration on the problem at hand. And when you need
    speed, use SWIG and C to build a module.

    > >
    > > 2.) Why not store the size of the array in its first four bytes (or
    > > first sizeof( size_t ) bytes ), and then shift the pointer to the
    > > array on by four bytes? Thus one has:
    > >
    > > first 4 bytes everything else
    > > [ size ][ data ]
    > > /\
    > > void * blah ---'

    >
    > This might indeed be the way it is done for some implementations,
    > but it's not required. Perhaps for a given architecture it's
    > simply not possible or too inefficient.
    >
    > > Then it should behave as a "normal" array,

    >
    > IMO you need to stop automatically thinking of allocated memory as
    > an 'array'. It's simply allocated memory, to be used as desired.
    >
    >
    > >with the added advantage of
    > > knowing its size.

    >
    > You allocated it, you already know its size. Also note that
    > the requirement for 'malloc()' is that it allocate *at least*
    > the number of requested bytes, but it's allowed to allocate more
    > (would typically be done in the interest of meeting the target
    > platform's alignment requirements and/or of efficiency).
    >
    > > The reason I have doubts here, is that if this was
    > > such a good idea, I'm sure it would already have been widely used.

    >
    > It would unnecessarily restrict implementors and possibly which
    > platforms the C standard library could be implemented for.
    >
    >
    > >Any
    > > compelling reason for avoiding this? This is a bit of hackery, but the
    > > hackery will be confined to the functions for allocating, resizing and
    > > checking the size of the array.

    >
    > Right, it's 'hackery'. Keep It Simple. Just Remember The Size.
    > (Pass it to any functions that need it).
    >
    >
    > > The code could work as follows:
    > >
    > > void*
    > > malloc_array( size_t element_size, size_t items )
    > > {
    > > size_t sz = element_size * items;
    > > /* allocate memory for array and for size chunk */
    > > void* result = malloc( sz + sizeof( size_t ) );
    > > if ( result == NULL ) return NULL;
    > > /* assign the size to the first few bytes */
    > > *((size_t*) result) = items;
    > > /* return a pointer to the array pointing just beyond size chunk */
    > > return (char*) result + sizeof( size_t );
    > > }
    > >
    > > size_t
    > > sizeof_array( void *array )
    > > {
    > > /* step back over the size chunk; arithmetic on void* is not standard C */
    > > return *( (size_t*) ((char*) array - sizeof( size_t )) );
    > > }

    >
    > If you want to go to all that trouble, be my guest. But I wouldn't
    > bother.
    >
    >
    > > This technique of course could also be used to store the byte size of
    > > the elements in the array.

    >
    > But the memory allocated by 'malloc()' needn't necessarily be
    > used as an array.
    >
    >
    > >Oh yes, and in order to detect whether the
    > > size value was corrupted by accidentally writing over it, one could
    > > use a magic number (which would again be added by the same technique),
    > > which could be consulted with debugging code.

    >
    > Perhaps some implementations do this. But again, they're not
    > required to.
    >
    > -Mike
    Wynand Winterbach, Jul 21, 2004
    #20