Passing an array of structures back through the argument list

Discussion in 'C Programming' started by DSF, Sep 7, 2013.

  1. DSF

    DSF Guest

    Hello,

    My desire is to pass an array of structures, created within the
    called function, back through one of its parameters.

    I have had no problem doing this with arrays of strings, which I
    would think is more difficult. You start with a pointer to a pointer
    of arrays **strings, and then pass the address of **strings, to the
    function so the function takes parameters like char ***string. But I
    did get that to work.

    I've found doing the same thing with an array of structs to be more
    difficult. I feel I'm missing the obvious. The string example above
    returns an array of pointers because each pointer is pointing to a
    different string, since the strings vary in length. In an array of
    structs, each structure is the same size. So there is no need for
    that level of indirection.
    A memory allocation of sizeof(struct) * numberofstructs is all that
    is needed/desired.

    Of course, there is a FreeAlternateStreams function to free the
    names and the struct array!
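
    The free function is mentioned but not shown in this post. A minimal
    sketch of what it might look like, assuming the NULL-name sentinel and
    the streamdata layout from the example code in this post; the
    signature and the int return value (a count of names freed) are
    assumptions made for illustration:

```c
#include <stdlib.h>

typedef struct tag_streamdata
{
    char *name;
    int size;
    int attr;
} streamdata;

/* Hypothetical sketch: frees every name up to the NULL-name sentinel,
   then frees the array itself. Returns the number of names freed. */
int FreeAlternateStreams(streamdata *sd)
{
    int n = 0;

    if (sd == NULL)
        return 0;

    while (sd[n].name != NULL)
    {
        free(sd[n].name);
        n++;
    }
    free(sd);
    return n;
}
```

    Because the array is one contiguous block, a single free() releases
    all of the structures; only the separately allocated names need a loop.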

    It works properly when I use a proxy pointer (*ds) to build the
    array. I then assign the proxy pointer to the start of the array to
    *sd, which assigns it back to the calling function.

    It seems I could just use *sd and eliminate the proxy, but when I do
    that, *sd points to an array of pointers to struct instead of the
    start of the structure.

    The code below works when VERSION1 is defined (proxy *ds). It is an
    example written purely for this message. There is no error checking
    on malloc for clarity's sake.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct tag_streamdata
    {
        char *name;
        int size;
        int attr;
    } streamdata;

    int GetAlternateStreams(streamdata **sd);

    #define VERSION1

    int main()
    {
        streamdata *sd;
        int n = 0;

        GetAlternateStreams(&sd);

        while(sd[n].name != NULL)
        {
            printf("Name: %s\n", sd[n].name);
            n++;
        }
        return EXIT_SUCCESS;
    }

    #ifdef VERSION1
    int GetAlternateStreams(streamdata **sd)
    {
        char *names[] = {"name1", "name2", "name3"};
        int nameidx = 0;

        /* tsd is a temp for realloc failure */
        streamdata *tsd, *ds;

        int i;
        int numstreams = 0;

        ds = NULL;

        for(i = 0; i < 3; i++)
        {
            tsd = realloc(ds, sizeof(streamdata) * (numstreams+2));
            ds = tsd;
            ds[numstreams].name = malloc(strlen(names[nameidx])+1);
            strcpy(ds[numstreams].name, names[nameidx++]);
            ds[numstreams].size = 4;
            ds[numstreams].attr = 3;
            numstreams++;
            /* A NULL name member marks the end of the array */
            ds[numstreams].name = NULL;
            *sd = ds;
        }
        return 0;
    }
    #else

    /* I would think that ds could be replaced with *sd, which would
    dereference it back to the calling function's *sd, but I can't get it to
    work. When I use the above code, sd[0] is the first structure, sd[1]
    is located sizeof(streamdata) away from sd[0], a nice linear array.
    In the following code sd[1] is 4 bytes away from sd[0], it's
    creating an array of pointers instead of structs. I'm sure *sd below
    can be made to work without the proxy *ds, but I can't seem to get the
    syntax correct so that I get a pointer to the first element of an
    array of structures, not an array of pointer to structures.
    */

    int GetAlternateStreams(streamdata **sd)
    {
        char *names[] = {"name1", "name2", "name3"};
        int nameidx = 0;

        streamdata *tsd;

        int i;
        int numstreams = 0;

        *sd = NULL;

        /* This section simplifies data entry that is much more complicated */
        for(i = 0; i < 3; i++)
        {
            tsd = realloc(*sd, sizeof(streamdata) * (numstreams+2));
            *sd = tsd;
            (*sd)[numstreams].name = malloc(strlen(names[nameidx])+1);
            strcpy((*sd)[numstreams].name, names[nameidx++]);
            (*sd)[numstreams].size = 4;
            (*sd)[numstreams].attr = 3;
            numstreams++;
            /* A NULL name member marks the end of the array */
            (*sd)[numstreams].name = NULL;

            /* Note that I have also used (*sd[numstreams]).name with the same
               results */
        }
        return 0;
    }

    #endif

    There has to be something simple I'm missing. **sd dereferenced by
    *sd[element] should be the same as *ds dereferenced by ds[element],
    but it's not. I think it relates to the * in *sd[element] indicating
    an array of pointers, but I'm not sure how.

    Thanks for any help!

    DSF
     
    DSF, Sep 7, 2013
    #1

  2. DSF

    Ian Collins Guest

    I'll keep my comments general for now.

    One technique I find helps with tasks like this is to think of the type
    in your array as a generic type T, or a char. Then you just have to
    think about returning an array of char (or T) and the problem tends to
    become clearer. You can apply this to the code by extracting the code
    that builds a record entry into a discrete function. Then you would
    only have the logic that builds the array of them to analyse. This is
    often the way function templates are built in C++, which is probably why
    I break these tasks down in this way.
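
    A minimal sketch of the extraction described above, assuming the
    streamdata layout from the original post. The helper name
    init_streamdata and its parameters are hypothetical, chosen only for
    illustration:

```c
#include <stdlib.h>
#include <string.h>

typedef struct tag_streamdata
{
    char *name;
    int size;
    int attr;
} streamdata;

/* Hypothetical helper: fills in one record with a private copy of name.
   Returns 0 on success, -1 if the name could not be allocated. */
int init_streamdata(streamdata *rec, const char *name, int size, int attr)
{
    rec->name = malloc(strlen(name) + 1);
    if (rec->name == NULL)
        return -1;

    strcpy(rec->name, name);
    rec->size = size;
    rec->attr = attr;
    return 0;
}
```

    With the record-building code isolated like this, the array-building
    function only has to grow an array of T and call the helper on each
    new element.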
     
    Ian Collins, Sep 7, 2013
    #2

  3. DSF

    Tim Rentsch Guest

    Some guesses/suggestions:

    Part of your confusion is some of the names are confusing. In
    particular the name 'sd' is used in main() and as the name of a
    parameter in GetAlternateStreams(), but with very different
    meanings in the two places. First change the name of the
    variable in main(), also giving it an initial value -

    streamdata *streams = 0; /* or NULL */

    Now we can use expressions like 'streams[i]' to refer to the
    i'th stream structure. Next, change the name of the parameter
    in GetAlternateStreams() -

    int GetAlternateStreams( streamdata **address_of_streams )

    This name should make it clear that '(*address_of_streams)' is
    the same as 'streams', so '(*address_of_streams)[i]' is an
    expression that will refer to the i'th stream structure.

    Another source of confusion has to do with the precedence of
    * and [], and related to that how we can switch between them.
    You probably know that

    (*(x)) === ((x)[0])

    where '===' means semantically equivalent, with parentheses
    added to avoid reader uncertainty. What happens when * and []
    are used in combination? Applying the above formula

    *(x[k]) === (x[k])[0] === x[k][0]
    (*x)[k] === (x[0])[k] === x[0][k]

    Obviously there is a big difference between these. Now let's
    use the 'address_of_streams' parameter, still using 'k' as
    the index -

    *(address_of_streams[k]) === address_of_streams[k][0]
    (*address_of_streams)[k] === address_of_streams[0][k]

    The first of these treats 'address_of_streams' like it is
    (pointing to the first element of) an array; we get the
    k'th element of the array, which is a POINTER, and then
    access the 0'th element of whatever that pointer points to.
    Of course this is a problem, because 'address_of_streams'
    does not point to an array, it points to a single variable,
    namely 'streams'.

    The second of these treats 'address_of_streams' sort of like an
    array, but an array with only one element; we get the 0'th
    element of that "array" (which is just the single variable
    'streams'), and use the pointer that was in 'streams' to access
    the array of structures, accessing the k'th structure in the
    array of streams. So the second expression is what we want.

    Note that the unparenthesized expression is the same as the
    first (ie, wrong) case:

    *address_of_streams[k] === address_of_streams[k][0]

    which is not what is wanted.
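
    The two groupings can be compared side by side in a small sketch. The
    'item' type and both function names are hypothetical stand-ins, not
    part of the original code:

```c
#include <stddef.h>

typedef struct { int id; } item;   /* hypothetical stand-in for streamdata */

/* (*pp)[k]: follow pp to the caller's pointer variable first,
   then index into the array of structures it points to. */
int id_via_deref_then_index(item **pp, size_t k)
{
    return (*pp)[k].id;
}

/* *pp[k] parses as *(pp[k]): index pp as an array of pointers first,
   then dereference the k'th pointer. Valid only when pp really points
   to at least k+1 pointers. */
int id_via_index_then_deref(item **pp, size_t k)
{
    return (*pp[k]).id;
}
```

    For k == 0 the two forms read the same object, which can make the
    wrong grouping appear to work until the second element is touched.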

    Now let's look at writing GetAlternateStreams(). Because the
    expression '*address_of_streams' is rather cumbersome, we use a
    local variable 's' to hold the address of the streams array (so
    it will have the same value as the 'streams' variable in the
    main() function). Using 's' in this way also lets us assign
    the value of calling 'realloc()' directly to 's', since 's' is
    just a "cached" copy of 'streams', without having to worry
    about saving a copy of the address being realloc()'ed, in case
    it fails. Taking out incidental variables and reformatting
    slightly, we have

    int
    GetAlternateStreams( streamdata **address_of_streams ){
        int i;
        streamdata *s = *address_of_streams;

        for( i = 0; i < 3; i++ ){
            s = realloc( s, (i+2) * sizeof *s );
            if( s == 0 ){ ... handle error, return ... }
            *address_of_streams = s;

            s[i].name = malloc( ... );
            /* ... test here for malloc failure ... */
            strcpy( s[i].name, ... );
            s[i].size = ...;
            s[i].attr = ...;

            s[i+1].name = NULL;
        }
        return 0;
    }

    One additional note: the call to 'realloc()' was re-written
    using a well-known idiom for avoiding using an explicit type
    with 'sizeof'.
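
    For reference, here is one way the elided parts might be filled in.
    This is a sketch under assumptions, not the code from either post: the
    names and the size/attr values are taken from the original example,
    the function resets the caller's pointer to NULL itself, a separate
    'tmp' pointer is used for realloc(), and the return value (a count,
    or -1 on allocation failure) is an invented convention.

```c
#include <stdlib.h>
#include <string.h>

typedef struct tag_streamdata
{
    char *name;
    int size;
    int attr;
} streamdata;

/* Sketch: builds a NULL-name-terminated array and stores its address
   in the caller's pointer. Returns the number of entries, or -1 on
   allocation failure (the array built so far stays terminated). */
int GetAlternateStreams(streamdata **address_of_streams)
{
    static const char *const names[] = {"name1", "name2", "name3"};
    size_t count = sizeof names / sizeof names[0];
    size_t i;
    streamdata *s = NULL;

    *address_of_streams = NULL;

    for (i = 0; i < count; i++)
    {
        /* Room for i+1 entries plus the terminating entry. */
        streamdata *tmp = realloc(s, (i + 2) * sizeof *s);
        if (tmp == NULL)
            return -1;          /* old block is untouched and still valid */
        s = tmp;
        *address_of_streams = s;

        s[i].name = malloc(strlen(names[i]) + 1);
        if (s[i].name == NULL)
            return -1;          /* the NULL name doubles as the terminator */
        strcpy(s[i].name, names[i]);
        s[i].size = 4;
        s[i].attr = 3;

        s[i + 1].name = NULL;   /* keep the array terminated at every step */
    }
    return (int)count;
}
```

    A caller would declare 'streamdata *streams = NULL;', pass '&streams',
    and then index 'streams[n]' until it reaches a NULL name.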

    Hopefully these comments and suggestions have cleared up what
    was confusing you.
     
    Tim Rentsch, Sep 8, 2013
    #3
  4. DSF

    DSF Guest

    {Everything snipped}

    And here's why:

    The alternate (second) version of my function works perfectly.

    In brief, my IDE's debugger is nuckin' futs!

    In the variable watch window, when numstreams = 1 on the second
    loop, (*sd[0])'s values are overwritten. But not in reality. The
    addresses of the various elements were 4 bytes apart at one point,
    leading me to believe it was an array of pointers (32-bit system,
    four-byte pointers.) But not in reality.

    I found this out by running through my entire original program
    instead of stopping when I received values way out of range in the
    debugger, and it printed the correct values! A timely placing of
    printf confirmed this. Each element's address in an array of pointers
    would be 4 bytes apart. Each element's address in an array of
    structures would be sizeof(structure) apart. For laughs, here's what
    printf displays for the first three element's addresses, followed by
    the debugger's. Please note that the streamdata structure in the
    original program is slightly different than the example I posted here.
    The size of streamdata is 16 ( 0x10) bytes.

    printf:
    &(*sd[0]) = 02566C7C &(*sd[1]) = 02566C8C &(*sd[2]) = 02566C9C
    debugger:
    &(*sd[0]) = 02566C7C &(*sd[1]) = 0018FF78 &(*sd[2]) = 00415101

    Considering the address of any element in an array can be calculated
    from the starting address even if it doesn't exist (hasn't been
    allocated), the debugger's output is truly screwed-up.
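
    The spacing rule can be checked directly. A small sketch, with a
    hypothetical 'record' type standing in for the real structure and a
    hypothetical helper name:

```c
#include <stddef.h>

/* Hypothetical structure; the real streamdata differs. */
typedef struct
{
    char *name;
    int size;
    int attr;
    int flags;
} record;

/* Byte distance from the start of the array to element k.
   k is expected to stay within the array. */
size_t byte_offset_of_element(const record *base, size_t k)
{
    return (size_t)((const char *)&base[k] - (const char *)base);
}
```

    For an array of structures the result is always k * sizeof(record),
    never k * sizeof(record *).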

    DSF

    P.S. I changed (*sd)[numstreams] back to (*sd[numstreams]) because
    they seem to perform the same function and I like the looks of the
    latter better. It seems to be more "syntaxily" correct.
     
    DSF, Sep 8, 2013
    #4
  5. DSF

    DSF Guest

    Beep...beep...beep. That's me backing up from the statement above.

    As I quickly discovered (and Tim Rentsch explained in his post),
    (*sd)[numstreams] and (*sd[numstreams]) are *NOT* alike. However, the
    second version of GetAlternateStreams written as I posted does work
    properly. It creates an array of streamdata structures (not pointers
    to) and alters *sd in main to point to the 0th element. I stepped
    through the assembly code to confirm it.

    I still stand by my statements regarding the debugger!

    DSF
     
    DSF, Sep 8, 2013
    #5
  6. DSF

    gdotone Guest

    Don't you actually lose access to such a created structure?

    I mean, when the function is called it's placed on the program stack,
    and its local variables and parameters are local to that function, too.
    When the function is popped off the program stack, don't you lose access
    to that memory location? Well, by lose access, doesn't that memory go
    back to the system for it to reallocate any way it sees fit? So, even
    though you may have the actual address in memory where your structure
    resides and you may treat it as that structure, you really can't count
    on it, because the system will overwrite it when it has a need to do so.

    g. (just a question. i reread Deitel and Deitel's chapter 5 last night on functions)
     
    gdotone, Sep 9, 2013
    #6
  7. Exactly.
    On most systems you have a stack which is pushed with every call and popped
    with every return. So if you access memory after a pop but before the next
    push, it will still be as you left it.
    But for security reasons some systems "shred" the stack on popping. Others
    implement a "stack top" protection and trap on an attempt to exceed it. There
    might also be some weird and wonderful systems out there that use a more
    complicated arrangement than a stack. So you can't guarantee that local
    variables exist after a return, and an attempt to access them is undefined
    behaviour.
     
    Malcolm McLean, Sep 9, 2013
    #7
  8. DSF

    Eric Sosman Guest

    See whether D&D have anything to say about "storage duration,"
    and read that part.

    The technical term for what you've described is "automatic
    storage duration:" a variable with ASD ceases to exist when its
    containing {block} exits. A variable with "static storage duration"
    exists throughout the entire lifetime of the program. A memory area
    with "dynamic storage duration" is created by a call to malloc() and
    continues to exist until released by free() (calloc(), realloc(), etc.
    can also be used).
     
    Eric Sosman, Sep 9, 2013
    #8
  9. The standard uses the term "allocated storage duration", not "dynamic
    storage duration". (And C11 adds "thread storage duration" to the mix.
    It also adds objects with automatic storage duration and "temporary
    lifetime", to deal with an obscure corner case involving structures or
    unions with arrays as members.)
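
    A short sketch of the three storage durations side by side, which is
    also why the array built in the original question survives the return
    from the function that created it. The names call_count, make_counter
    and get_call_count are hypothetical:

```c
#include <stdlib.h>

/* Static storage duration: exists for the whole run of the program. */
static int call_count;

/* Returns a pointer to allocated storage holding the current call
   number, or NULL on failure. The block stays valid until free(). */
int *make_counter(void)
{
    int local;                         /* automatic: ends at return */
    int *heap = malloc(sizeof *heap);  /* allocated: ends at free() */

    call_count++;
    local = call_count;

    if (heap != NULL)
        *heap = local;   /* copy the value out; never return &local */
    return heap;
}

int get_call_count(void)
{
    return call_count;
}
```

    Only the pointer variable 'heap' is automatic. The block it points to
    outlives the call, so handing its address back to the caller, whether
    by return value or through a parameter, is well defined.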
     
    Keith Thompson, Sep 9, 2013
    #9
  10. DSF

    Eric Sosman Guest

    Sonuvagun. Thanks for the correction.

    (I wonder where I picked up "dynamic." Perhaps it was because the
    original ANSI Standard defined only static and automatic durations; as
    far as ANSI was concerned malloc'ed objects had no storage duration at
    all -- and no "lifetime," either! Yet it was essential to be able to
    speak of the lifetime of a dynamically-allocated object, and "dynamic"
    seemed an appropriate adjective. I've been using it so long that I'd
    forgotten it wasn't the Standard's term -- and I completely missed the
    boat when C99 corrected ANSI's oversight and adopted "allocated" as
    The Official Word.)
     
    Eric Sosman, Sep 9, 2013
    #10
