Cheap way to tokenize variable array?

T

Travis

So here's something I've been working on (professionally not academic)
that has puzzled me.

Given an array (say of ints or bytes if easier to think about), the
begginning of the array tells you the overall length and each index in
the array is formatted as follows:

first byte: length
second / third byte: unique id
optional: data

Also keep in mind this array is never sorted in anyway. So to find a
specific unique ID, I'm forced to traverse the entire thing until I
find it.

What I'm curious about is if there's a more efficient way to do this.
Any STL mechanisms that might make this less expensive. Something that
can tokenize the array or something like (given that i know the status
ID I'm looking for).

What do you think?
 
J

James Kanze

So here's something I've been working on (professionally not academic)
that has puzzled me.
Given an array (say of ints or bytes if easier to think about), the
begginning of the array tells you the overall length and each index in
the array is formatted as follows:
first byte: length
second / third byte: unique id
optional: data
Also keep in mind this array is never sorted in anyway. So to find a
specific unique ID, I'm forced to traverse the entire thing until I
find it.
What I'm curious about is if there's a more efficient way to do this.
Any STL mechanisms that might make this less expensive. Something that
can tokenize the array or something like (given that i know the status
ID I'm looking for).

For a single search, no. If you're doing a lot of lookups, it
might pay to create an index: go through the array once,
creating a map of id's to the index of the element.
 
M

Mirco Wahab

Travis said:
Given an array (say of ints or bytes if easier to think about), the
begginning of the array tells you the overall length and each index in
the array is formatted as follows:

first byte: length
second / third byte: unique id
optional: data

Also keep in mind this array is never sorted in anyway. So to find a
specific unique ID, I'm forced to traverse the entire thing until I
find it.

What I'm curious about is if there's a more efficient way to do this.
Any STL mechanisms that might make this less expensive. Something that
can tokenize the array or something like (given that i know the status
ID I'm looking for).

It has been already mentioned (by Alf) that this would be
a good candidate for std::map. But what are the id's like?
If you have a two byte id, that means you have 65534 different
chunks. The simplest thing would be then - to create an offset
array:

...
size_t id_offset[65536]; // maximum unsigned short offset
...

and store the chunk offsets at the corresponding
array elements:

[pseudo]
...
typedef unsigned short int unique_id_t; // assert: this has to be 2 bytes!
...
const unsigned char *p = theArray;

for(size_t n=0; n<nchunks; n++) {
// store the unique_id's offsett located at p+1
id_offset[ *(unique_id_t *)(p+1) ] = p - theArray;
// advance to next chunk
p += *p;
}
...

If you want to access your "array", just use this offset

...
// access array chunks by stored offset
size_t id = 10333;
unsigned char *chunk = theArray + id_offset[id];

cout << "id:" << id << "/" << *(unique_id_t*)(chunk+1)
<< ", chunk length: " << size_t(*chunk)
<< " bytes, data:" << string(chunk+3, chunk+*chunk)
<< endl;
...

If this won't be possible, you can use the hash (std::map)
in the same spirit:

...

typedef unsigned short int unique_id_t; // the 2 byte thing
typedef map<unique_id_t, size_t> MyHash;

...

// create a hash map from the data array
void map_chunks_to(MyHash& hash, const unsigned char *theArray, size_t nchunks)
{
const unsigned char *p = theArray;
for(size_t n=0; n<nchunks; n++) {
hash[ *(unique_id_t *)(p+1) ] = p - theArray;
p += *p;
}
}

...

the access will then be similar to the above:

MyHash h;
map_chunks_to(h, theArray, nchunks); // see above

...

size_t id = 10333;
unsigned char *chunk = theArray + h[id];

...
cout << ...


Regards

M.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top