program design, what datastructure to choose question.....

J

jason

Hello,

I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
.....

I now have a structure which allows me to access the data like this:
x->row.coll[0].value.d;
where coll is a structure containing a union and the type of data used in
this field. x->row.coll[0].utype ;

However in my inexperience I forgot that most of the functions I want to
use on this data accept arrays only - Now I have added an extra step to
create an array of pointers linking to all the values in the
collumn.unions.

When I think of it [what I should have done before starting to program] I
have three choices;
Leave it this way disregarding the extra overhead of creating an array of
pointers, but maintaining all the preferences of each field.
Remove the union and create void pointers
or at runtime create an array for each collumn and a `super' structure
that holds all the preferences of each collumn.

Again I'm not very experienced with this - but what would be the best way
to go for storing this type of data in such way that it becomes easy to
preform operations on it ?

Any suggestions are welcome.

Thank you.

Jas.
 
B

Ben Bacarisse

jason said:
I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
....

I now have a structure which allows me to access the data like this:
x->row.coll[0].value.d;
where coll is a structure containing a union and the type of data used in
this field. x->row.coll[0].utype ;


Looks like overkill. You have a pointer to a struct containing a
array of another struct containing an array of some union type? Wow.
(BTW coll does not seem to be structure as you claim but rather an
array.)

The data looks tabular, so the first thing is so say why a 2D array is
not suitable. If it is not, we can then go further.

There may be very sound reasons for what you have done, but showing
two rows of four integers does not capture the complexity. What is
the data and how is it accessed? Do you need to process columns as
well as rows?
 
J

Jams Jamma

jason said:
I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
....

I now have a structure which allows me to access the data like this:
x->row.coll[0].value.d;
where coll is a structure containing a union and the type of data used
in this field. x->row.coll[0].utype ;


Looks like overkill. You have a pointer to a struct containing a array
of another struct containing an array of some union type? Wow. (BTW
coll does not seem to be structure as you claim but rather an array.)

The data looks tabular, so the first thing is so say why a 2D array is
not suitable. If it is not, we can then go further.

There may be very sound reasons for what you have done, but showing two
rows of four integers does not capture the complexity. What is the data
and how is it accessed? Do you need to process columns as well as rows?


Well that's the thing... Youre right that it is a overkill...

I would like todo something like function(A, B, nsize);
Where `A' and `B' are the collumn designators.
about accessing rows I would also like to do something like:
function(A10, B10); so accessing the values in row[10] collumn `A' etc...

I think that I'm going to scale back the complexity as I think of it
more..

Jas.
 
C

cr88192

jason said:
Hello,

I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
....

I now have a structure which allows me to access the data like this:
x->row.coll[0].value.d;
where coll is a structure containing a union and the type of data used in
this field. x->row.coll[0].utype ;

However in my inexperience I forgot that most of the functions I want to
use on this data accept arrays only - Now I have added an extra step to
create an array of pointers linking to all the values in the
collumn.unions.

When I think of it [what I should have done before starting to program] I
have three choices;
Leave it this way disregarding the extra overhead of creating an array of
pointers, but maintaining all the preferences of each field.
Remove the union and create void pointers
or at runtime create an array for each collumn and a `super' structure
that holds all the preferences of each collumn.

Again I'm not very experienced with this - but what would be the best way
to go for storing this type of data in such way that it becomes easy to
preform operations on it ?

Any suggestions are welcome.


this depends on what you are doing...

however, here are a few typical ways to deal with 2D grids of values:
a 2D array (say, of pointers);
a 1D array indexing 1D arrays;
a custom representation (say, the above, but combined with a sequence of
"spans" or similar for applying in-memory compression);
....

a 2D array is good for speed and simplicity, and can also be memory
efficient (if the data is strictly a grid), but it can become very slow to
resize or also very wasteful with memory in some cases.
a 1D array of 1D arrays adds a little more complexity and overhead, but can
make dynamic resizing a lot faster and easier (adding an item often only
involves expanding 1 or 2 of these arrays).

a pair of arrays with a system of 'spans' is useful for when items are
spread sparsely over a large area, as this allows storing a fairly large
space in a comparatively small number of array spots. however, this adds yet
more complexity.


value:
this depends again very highly on usage.

if the values are usually or almost always the same type, and there is
little or no variation is the size, a slab may be a good option. this is an
array of items (say, structs), where we grab off items as needed and use
them where needed (usually a method exists for fairly quickly finding free
items, with linear allocation, marks+rovers, and free-lists being common
approaches).

it types are far more hetrogeneous (one item may be an integer, a float, a
string, another array, an object, ...) implementing some kind of dynamic
type system may be sensible.

for example, each item is an integer with a few low bits used to indicate
type, with the rest being either a value, a pointer, or a handle.

another approach is to use pointers, but implement a system for finding
types associated with pointers (for example, hidden object headers, having
the pointers point into 'special' areas of memory known to represent items
of specific types, ...).


and there are many other factors...

or such...
 
T

Tor Rustad

jason said:
Hello,

I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
.....

The first thing i thought of, was

T matrix[n][m];
 
B

Ben Bacarisse

Jams Jamma said:
jason said:
I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
....

I now have a structure which allows me to access the data like this:
x->row.coll[0].value.d;
where coll is a structure containing a union and the type of data used
in this field. x->row.coll[0].utype ;


Looks like overkill.
The data looks tabular, so the first thing is so say why a 2D array is
not suitable. If it is not, we can then go further.

Well that's the thing... Youre right that it is a overkill...

I would like todo something like function(A, B, nsize);
Where `A' and `B' are the collumn designators.
about accessing rows I would also like to do something like:
function(A10, B10); so accessing the values in row[10] collumn `A'
etc...


C has no way to access the columns of a 2D array (well, it has not
way other than simply indexing the elements). If you need a function
that can do the same sort of thing to either the rows or the columns,
a common trick is to make the parameter a plain pointer and to pass a
"stride" -- the gap between elements.

So you build a 'T array[N][M];' and you pass:

function(&array[x][0], 1); /* to access row x */
and
function(&array[0][y], M); /* to access column y */

Inside 'function(T *p, size_t stride)', you inspect elements
'p[i * stride]'.

The other standard method is simply to transpose the matrix or extract
the columns into a 1D array as needed. Obviously the trade-offs
between methods depends on the details.

Some languages allow you to do this directly.

(Code caveat: check the details, it's late!)
 
J

jason

Hello,

I have a question about what kind of datastructure to use. I'm reading
collumn based data in the form of:
10\t12\t9\t11\n
24\t11\t4\t10\n
....

I now have a structure which allows me to access the data like this:
x->row.coll[0].value.d;
where coll is a structure containing a union and the type of data used
in this field. x->row.coll[0].utype ;

However in my inexperience I forgot that most of the functions I want to
use on this data accept arrays only - Now I have added an extra step to
create an array of pointers linking to all the values in the
collumn.unions.

When I think of it [what I should have done before starting to program]
I have three choices;
Leave it this way disregarding the extra overhead of creating an array
of pointers, but maintaining all the preferences of each field. Remove
the union and create void pointers or at runtime create an array for
each collumn and a `super' structure that holds all the preferences of
each collumn.

Again I'm not very experienced with this - but what would be the best
way to go for storing this type of data in such way that it becomes easy
to preform operations on it ?

Any suggestions are welcome.

Thank you.

Jas.


Thank you for all your suggestions, Its more clear now. I made the
overall structure a bit to complicated and I'm going for a plain 2D array
with one structure that holds any preferences for each collumn.
Simplicity is the key for me here..

Thankx again.

Jas.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,598
Members
45,161
Latest member
GertrudeMa
Top