program design, what datastructure to choose question.....

Discussion in 'C Programming' started by jason, Nov 9, 2007.

  1. jason

    jason Guest

    Hello,

    I have a question about what kind of datastructure to use. I'm reading
    collumn based data in the form of:
    10\t12\t9\t11\n
    24\t11\t4\t10\n
    .....

    I now have a structure which allows me to access the data like this:
    x->row.coll[0].value.d;
    where coll is a structure containing a union and the type of data used in
    this field. x->row.coll[0].utype ;

    However in my inexperience I forgot that most of the functions I want to
    use on this data accept arrays only - Now I have added an extra step to
    create an array of pointers linking to all the values in the
    collumn.unions.

    When I think of it [what I should have done before starting to program] I
    have three choices;
    Leave it this way disregarding the extra overhead of creating an array of
    pointers, but maintaining all the preferences of each field.
    Remove the union and create void pointers
    or at runtime create an array for each collumn and a `super' structure
    that holds all the preferences of each collumn.

    Again I'm not very experienced with this - but what would be the best way
    to go for storing this type of data in such way that it becomes easy to
    preform operations on it ?

    Any suggestions are welcome.

    Thank you.

    Jas.
     
    jason, Nov 9, 2007
    #1
    1. Advertising

  2. jason <> writes:

    > I have a question about what kind of datastructure to use. I'm reading
    > collumn based data in the form of:
    > 10\t12\t9\t11\n
    > 24\t11\t4\t10\n
    > ....
    >
    > I now have a structure which allows me to access the data like this:
    > x->row.coll[0].value.d;
    > where coll is a structure containing a union and the type of data used in
    > this field. x->row.coll[0].utype ;


    Looks like overkill. You have a pointer to a struct containing a
    array of another struct containing an array of some union type? Wow.
    (BTW coll does not seem to be structure as you claim but rather an
    array.)

    The data looks tabular, so the first thing is so say why a 2D array is
    not suitable. If it is not, we can then go further.

    There may be very sound reasons for what you have done, but showing
    two rows of four integers does not capture the complexity. What is
    the data and how is it accessed? Do you need to process columns as
    well as rows?

    --
    Ben.
     
    Ben Bacarisse, Nov 9, 2007
    #2
    1. Advertising

  3. jason

    Jams Jamma Guest

    On Fri, 09 Nov 2007 16:36:16 +0000, Ben Bacarisse wrote:

    > jason <> writes:
    >
    >> I have a question about what kind of datastructure to use. I'm reading
    >> collumn based data in the form of:
    >> 10\t12\t9\t11\n
    >> 24\t11\t4\t10\n
    >> ....
    >>
    >> I now have a structure which allows me to access the data like this:
    >> x->row.coll[0].value.d;
    >> where coll is a structure containing a union and the type of data used
    >> in this field. x->row.coll[0].utype ;

    >
    > Looks like overkill. You have a pointer to a struct containing a array
    > of another struct containing an array of some union type? Wow. (BTW
    > coll does not seem to be structure as you claim but rather an array.)
    >
    > The data looks tabular, so the first thing is so say why a 2D array is
    > not suitable. If it is not, we can then go further.
    >
    > There may be very sound reasons for what you have done, but showing two
    > rows of four integers does not capture the complexity. What is the data
    > and how is it accessed? Do you need to process columns as well as rows?


    Well that's the thing... Youre right that it is a overkill...

    I would like todo something like function(A, B, nsize);
    Where `A' and `B' are the collumn designators.
    about accessing rows I would also like to do something like:
    function(A10, B10); so accessing the values in row[10] collumn `A' etc...

    I think that I'm going to scale back the complexity as I think of it
    more..

    Jas.
     
    Jams Jamma, Nov 9, 2007
    #3
  4. jason

    cr88192 Guest

    "jason" <> wrote in message
    news:47347214$0$21900$4all.nl...
    > Hello,
    >
    > I have a question about what kind of datastructure to use. I'm reading
    > collumn based data in the form of:
    > 10\t12\t9\t11\n
    > 24\t11\t4\t10\n
    > ....
    >
    > I now have a structure which allows me to access the data like this:
    > x->row.coll[0].value.d;
    > where coll is a structure containing a union and the type of data used in
    > this field. x->row.coll[0].utype ;
    >
    > However in my inexperience I forgot that most of the functions I want to
    > use on this data accept arrays only - Now I have added an extra step to
    > create an array of pointers linking to all the values in the
    > collumn.unions.
    >
    > When I think of it [what I should have done before starting to program] I
    > have three choices;
    > Leave it this way disregarding the extra overhead of creating an array of
    > pointers, but maintaining all the preferences of each field.
    > Remove the union and create void pointers
    > or at runtime create an array for each collumn and a `super' structure
    > that holds all the preferences of each collumn.
    >
    > Again I'm not very experienced with this - but what would be the best way
    > to go for storing this type of data in such way that it becomes easy to
    > preform operations on it ?
    >
    > Any suggestions are welcome.
    >


    this depends on what you are doing...

    however, here are a few typical ways to deal with 2D grids of values:
    a 2D array (say, of pointers);
    a 1D array indexing 1D arrays;
    a custom representation (say, the above, but combined with a sequence of
    "spans" or similar for applying in-memory compression);
    ....

    a 2D array is good for speed and simplicity, and can also be memory
    efficient (if the data is strictly a grid), but it can become very slow to
    resize or also very wasteful with memory in some cases.
    a 1D array of 1D arrays adds a little more complexity and overhead, but can
    make dynamic resizing a lot faster and easier (adding an item often only
    involves expanding 1 or 2 of these arrays).

    a pair of arrays with a system of 'spans' is useful for when items are
    spread sparsely over a large area, as this allows storing a fairly large
    space in a comparatively small number of array spots. however, this adds yet
    more complexity.


    value:
    this depends again very highly on usage.

    if the values are usually or almost always the same type, and there is
    little or no variation is the size, a slab may be a good option. this is an
    array of items (say, structs), where we grab off items as needed and use
    them where needed (usually a method exists for fairly quickly finding free
    items, with linear allocation, marks+rovers, and free-lists being common
    approaches).

    it types are far more hetrogeneous (one item may be an integer, a float, a
    string, another array, an object, ...) implementing some kind of dynamic
    type system may be sensible.

    for example, each item is an integer with a few low bits used to indicate
    type, with the rest being either a value, a pointer, or a handle.

    another approach is to use pointers, but implement a system for finding
    types associated with pointers (for example, hidden object headers, having
    the pointers point into 'special' areas of memory known to represent items
    of specific types, ...).


    and there are many other factors...

    or such...


    > Thank you.
    >
    > Jas.
     
    cr88192, Nov 9, 2007
    #4
  5. jason

    Tor Rustad Guest

    jason wrote:
    > Hello,
    >
    > I have a question about what kind of datastructure to use. I'm reading
    > collumn based data in the form of:
    > 10\t12\t9\t11\n
    > 24\t11\t4\t10\n
    > .....


    The first thing i thought of, was

    T matrix[n][m];


    --
    Tor < | tr i-za-h a-z>
     
    Tor Rustad, Nov 9, 2007
    #5
  6. Jams Jamma <> writes:

    > On Fri, 09 Nov 2007 16:36:16 +0000, Ben Bacarisse wrote:
    >
    >> jason <> writes:
    >>
    >>> I have a question about what kind of datastructure to use. I'm reading
    >>> collumn based data in the form of:
    >>> 10\t12\t9\t11\n
    >>> 24\t11\t4\t10\n
    >>> ....
    >>>
    >>> I now have a structure which allows me to access the data like this:
    >>> x->row.coll[0].value.d;
    >>> where coll is a structure containing a union and the type of data used
    >>> in this field. x->row.coll[0].utype ;

    >>
    >> Looks like overkill.
    >> The data looks tabular, so the first thing is so say why a 2D array is
    >> not suitable. If it is not, we can then go further.
    >>

    > Well that's the thing... Youre right that it is a overkill...
    >
    > I would like todo something like function(A, B, nsize);
    > Where `A' and `B' are the collumn designators.
    > about accessing rows I would also like to do something like:
    > function(A10, B10); so accessing the values in row[10] collumn `A'
    > etc...


    C has no way to access the columns of a 2D array (well, it has not
    way other than simply indexing the elements). If you need a function
    that can do the same sort of thing to either the rows or the columns,
    a common trick is to make the parameter a plain pointer and to pass a
    "stride" -- the gap between elements.

    So you build a 'T array[N][M];' and you pass:

    function(&array[x][0], 1); /* to access row x */
    and
    function(&array[0][y], M); /* to access column y */

    Inside 'function(T *p, size_t stride)', you inspect elements
    'p[i * stride]'.

    The other standard method is simply to transpose the matrix or extract
    the columns into a 1D array as needed. Obviously the trade-offs
    between methods depends on the details.

    Some languages allow you to do this directly.

    (Code caveat: check the details, it's late!)

    --
    Ben.
     
    Ben Bacarisse, Nov 10, 2007
    #6
  7. jason

    jason Guest

    On Fri, 09 Nov 2007 14:43:32 +0000, jason wrote:

    > Hello,
    >
    > I have a question about what kind of datastructure to use. I'm reading
    > collumn based data in the form of:
    > 10\t12\t9\t11\n
    > 24\t11\t4\t10\n
    > ....
    >
    > I now have a structure which allows me to access the data like this:
    > x->row.coll[0].value.d;
    > where coll is a structure containing a union and the type of data used
    > in this field. x->row.coll[0].utype ;
    >
    > However in my inexperience I forgot that most of the functions I want to
    > use on this data accept arrays only - Now I have added an extra step to
    > create an array of pointers linking to all the values in the
    > collumn.unions.
    >
    > When I think of it [what I should have done before starting to program]
    > I have three choices;
    > Leave it this way disregarding the extra overhead of creating an array
    > of pointers, but maintaining all the preferences of each field. Remove
    > the union and create void pointers or at runtime create an array for
    > each collumn and a `super' structure that holds all the preferences of
    > each collumn.
    >
    > Again I'm not very experienced with this - but what would be the best
    > way to go for storing this type of data in such way that it becomes easy
    > to preform operations on it ?
    >
    > Any suggestions are welcome.
    >
    > Thank you.
    >
    > Jas.


    Thank you for all your suggestions, Its more clear now. I made the
    overall structure a bit to complicated and I'm going for a plain 2D array
    with one structure that holds any preferences for each collumn.
    Simplicity is the key for me here..

    Thankx again.

    Jas.
     
    jason, Nov 10, 2007
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Anony!

    Set DataStructure

    Anony!, Aug 12, 2004, in forum: Java
    Replies:
    2
    Views:
    396
    Anony!
    Aug 13, 2004
  2. Sharp

    Index-based datastructure

    Sharp, Mar 14, 2005, in forum: Java
    Replies:
    1
    Views:
    327
    Chris Uppal
    Mar 14, 2005
  3. Replies:
    3
    Views:
    387
    shakah
    Jun 23, 2005
  4. Santosh

    Datastructure design

    Santosh, Nov 19, 2003, in forum: C Programming
    Replies:
    6
    Views:
    380
    pandy
    Nov 20, 2003
  5. jason
    Replies:
    4
    Views:
    304
    Bruno Desthuilliers
    Sep 9, 2006
Loading...

Share This Page