What's the best data structure for this job?

Rui Maciel · May 24, 2009

I'm writing a class that processes SI units, which represents each unit as a vector of unit/exponent pairs.
According to my current implementation plan, the force units (the Newton) would be represented as [N,1] or,
after converting to the SI base units, [kg,1],[m,1],[s,-2]. To avoid wasting space with redundant
information, the unit in the unit/exponent pair is a reference index to a hash table that stores the unit's
name and symbol.

So, according to this scenario, what would be the best data structure for this job? Currently I'm using
nothing more than the STL containers for the data structures and the GNU MP bignum library for the exponent.
More precisely, I'm storing the SI units as:

std::vector< std::pair<unsigned int, mpq_t> > expression;

The SI unit list is being stored as:

std::map<unsigned int, std::pair<std::string, std::string> > units;

Nonetheless, I feel that this solution is far from perfect, not to mention a bit overwhelming. Moreover,
the static and global nature of a SI unit list leads me to believe that there must be a simpler solution
than a singleton wrapper class on a simple standard map.

So, what's your take on this issue? How would you implement a data structure for this task?

Thanks in advance,
Rui Maciel

sharmashivkumar01 · May 25, 2009

Rui said:

I'm writing a class that processes SI units, which represents each unit as a vector of unit/exponent pairs.
According to my current implementation plan, the force units (the Newton) would be represented as [N,1] or,
after converting to the SI base units, [kg,1],[m,1],[s,-2]. To avoid wasting space with redundant
information, the unit in the unit/exponent pair is a reference index to a hash table that stores the unit's
name and symbol.

So, according to this scenario, what would be the best data structure for this job? Currently I'm using
nothing more than the STL containers for the data structures and the GNU MP bignum library for the exponent.
More precisely, I'm storing the SI units as:

std::vector< std::pair<unsigned int, mpq_t> > expression;

The SI unit list is being stored as:

std::map<unsigned int, std::pair<std::string, std::string> > units;

Nonetheless, I feel that this solution is far from perfect, not to mention a bit overwhelming. Moreover,
the static and global nature of a SI unit list leads me to believe that there must be a simpler solution
than a singleton wrapper class on a simple standard map.

So, what's your take on this issue? How would you implement a data structure for this task?

Have you considered using a unified list of base and compound SI units?
There are a variety of ways of implementing it:

class si_info {
public:
std::string symbol;
std::string name;
std::vector<unsigned int> base_units;

};

typedef std::map<unsigned int, si_info> units;

A compound SI unit, like the newton, would have a three-element base_units
vector containing the indices of the kg, m, and s base units. An SI base unit
is represented by a base_units vector with one element, the index of the
SI base unit itself.

You now have a uniform reference for either a base or a compound unit: the
single unsigned int handle. To be even more pedantic, rather than using
unsigned int explicitly as a handle, use a symbolic typedef:

typedef unsigned int si_handle_t;

class si_info {
public:
std::string symbol;
std::string name;
std::vector<si_handle_t> base_units;

};

typedef std::map<si_handle_t, si_info> units;

This is considered a more elegant design, and it adds additional flexibility
later down the road.

The value of an SI unit is then reduced to a single vector:

class si_value {
public:
si_handle_t si_unit;
std::vector<mpq_t> si_values;

};

si_unit gives the key to the units map. You can also use a simple
std::pair<si_handle_t, std::vector<mpq_t> > typedef here.

You'll have to enforce two restrictions with this design:

1) All elements of the base_units vector in the SI map must reference only
other elements whose base_units vector is of size 1, and contain the same
si_handle_t value as the element itself -- that is, the element is either a
base element, or consists of only other base elements.

2) The size of your mpq_t vector must match the size of the base_units
vector in the si_info map. The nth element of the mpq_t vector gives the
value for the SI unit given by the nth element of the base_units vector.

Seems to me that with this design you can proceed and write code that
uniformly deals with base SI units and compound SI units, using the same
logic.
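
As a rough illustration of the two restrictions above, here is a minimal sketch of how they might be checked at run time. The helper names (is_base_unit, table_is_valid, value_matches_table, make_example_table) and the handle numbering are hypothetical, and plain int stands in for mpq_t. Note that mpq_t is a C array type and cannot be stored directly in a std::vector, so a real version would more likely use GMP's C++ wrapper class, mpq_class.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef unsigned int si_handle_t;

struct si_info {
    std::string symbol;
    std::string name;
    std::vector<si_handle_t> base_units;
};

typedef std::map<si_handle_t, si_info> units;

// A base unit is an entry whose base_units vector holds only its own handle.
bool is_base_unit(const units& table, si_handle_t h) {
    units::const_iterator it = table.find(h);
    return it != table.end()
        && it->second.base_units.size() == 1
        && it->second.base_units[0] == h;
}

// Restriction 1: every handle listed in any base_units vector is a base unit.
bool table_is_valid(const units& table) {
    for (units::const_iterator it = table.begin(); it != table.end(); ++it) {
        const std::vector<si_handle_t>& b = it->second.base_units;
        if (b.empty()) return false;
        for (std::size_t i = 0; i < b.size(); ++i)
            if (!is_base_unit(table, b[i])) return false;
    }
    return true;
}

// int is an assumption standing in for mpq_t.
struct si_value {
    si_handle_t si_unit;
    std::vector<int> si_values;
};

// Restriction 2: one value per base unit of the referenced table entry.
bool value_matches_table(const units& table, const si_value& v) {
    units::const_iterator it = table.find(v.si_unit);
    return it != table.end()
        && it->second.base_units.size() == v.si_values.size();
}

// Hypothetical table: handles 0..2 are base units, 3 is the newton.
units make_example_table() {
    units t;
    t[0] = si_info{"kg", "kilogram", {0}};
    t[1] = si_info{"m", "metre", {1}};
    t[2] = si_info{"s", "second", {2}};
    t[3] = si_info{"N", "newton", {0, 1, 2}};
    return t;
}
```

With this table, a force value would carry the handle 3 and the three entries 1, 1, -2, and an entry whose base_units vector referenced the newton would be rejected by table_is_valid.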

Pascal J. Bourguignon · May 25, 2009

Rui Maciel said:
I'm writing a class that processes SI units, which represents each unit as a vector of unit/exponent pairs.
According to my current implementation plan, the force units (the Newton) would be represented as [N,1] or,
after converting to the SI base units, [kg,1],[m,1],[s,-2]. To avoid wasting space with redundant
information, the unit in the unit/exponent pair is a reference index to a hash table that stores the unit's
name and symbol.

So, according to this scenario, what would be the best data structure for this job? Currently I'm using
nothing more than the STL containers for the data structures and the GNU MP bignum library for the exponent.
More precisely, I'm storing the SI units as:

std::vector< std::pair<unsigned int, mpq_t> > expression;

The SI unit list is being stored as:

std::map<unsigned int, std::pair<std::string, std::string> > units;

Nonetheless, I feel that this solution is far from perfect, not to mention a bit overwhelming. Moreover,
the static and global nature of a SI unit list leads me to believe that there must be a simpler solution
than a singleton wrapper class on a simple standard map.

So, what's your take on this issue? How would you implement a data structure for this task?

The best data structure for this task is a simple rational.

Map each unit to a different prime, so you can easily compute the new unit or compare two units.

eg with:
m = 2
s = 3
kg = 5
N = kg*m/s² = 5*2/9 = 10/9
m/s = 2/3

N / (m/s) = (10/9) / (2/3) = 5/3 = kg/s
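
A minimal sketch of this prime encoding, assuming positive long values and no overflow protection. The UnitCode type and the helper names are hypothetical. Reading an exponent back out requires repeated division by the prime, as exponent_of shows.

```cpp
// A unit encoded as a reduced fraction of prime powers.
struct UnitCode {
    long num;
    long den;
};

// Euclid's algorithm, used to keep the fraction in lowest terms.
long gcd_of(long a, long b) {
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

UnitCode make_code(long num, long den) {
    long g = gcd_of(num, den);
    return UnitCode{num / g, den / g};
}

// Multiplying units multiplies their codes; dividing units divides them.
UnitCode multiply(UnitCode a, UnitCode b) {
    return make_code(a.num * b.num, a.den * b.den);
}

UnitCode divide(UnitCode a, UnitCode b) {
    return make_code(a.num * b.den, a.den * b.num);
}

// Two units are the same exactly when their reduced codes match.
bool same_unit(UnitCode a, UnitCode b) {
    return a.num == b.num && a.den == b.den;
}

// Recover the exponent of one base unit by counting prime factors.
int exponent_of(UnitCode u, long prime) {
    int e = 0;
    while (u.num % prime == 0) { u.num /= prime; ++e; }
    while (u.den % prime == 0) { u.den /= prime; --e; }
    return e;
}

// The prime assignment from the example: m = 2, s = 3, kg = 5.
const UnitCode M = {2, 1};
const UnitCode S = {3, 1};
const UnitCode KG = {5, 1};
```

Here divide(multiply(KG, M), multiply(S, S)) gives the code for the newton, and dividing that by divide(M, S) gives the code for kg/s.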

Otherwise, just use a map from unit to exponent.
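
The map alternative could look roughly like the sketch below. It assumes units are keyed by their symbol and that exponents are plain int rather than rationals. The names unit_expr, combine, unit_product and unit_quotient are hypothetical.

```cpp
#include <map>
#include <string>

// Unit symbol -> exponent; units with exponent 0 are not stored.
typedef std::map<std::string, int> unit_expr;

// Add (sign = +1) or subtract (sign = -1) the exponents of b to those of a.
unit_expr combine(const unit_expr& a, const unit_expr& b, int sign) {
    unit_expr r = a;
    for (unit_expr::const_iterator it = b.begin(); it != b.end(); ++it) {
        int e = (r[it->first] += sign * it->second);
        if (e == 0) r.erase(it->first);  // drop units that cancel out
    }
    return r;
}

unit_expr unit_product(const unit_expr& a, const unit_expr& b) {
    return combine(a, b, +1);
}

unit_expr unit_quotient(const unit_expr& a, const unit_expr& b) {
    return combine(a, b, -1);
}
```

Because cancelled units are erased, two expressions denote the same unit exactly when the maps compare equal with operator==.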

Rui Maciel · May 25, 2009

Mateusz said:
I've not analysed your problem in details, but I think it links well to
in-depth example of use of metafunctions to represent various units
explained by David Abrahams and Aleksey Gurtovoy in their book

http://www.boostpro.com/mplbook/

You can find some overview of this technique here:

http://www.boostpro.com/writing/ACCU_MPL_slides.ppt

I hope it will give you some ideas.

Best regards,

I've browsed the sample chapter and I have to say that, in spite of the probably needless template stuff,
the idea looks promising. The job of converting between units would end up being nothing more than altering
the unit's coefficient. Very nice.

The only drawback that I can see is that it doesn't offer a way to preserve information regarding unit
ratios. For example, I would like to express some units as the ratio of two units of the same type (cm^2/m,
m/m, etc...) but I don't see how it can be achieved through that method.
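
One way to keep such ratios, sketched here as a run-time structure and not as the template technique from the book, is to leave the unit/exponent pairs unreduced and only compute the coefficient and base dimension on demand. Everything in this sketch is an assumption: the unit_def fields, the reduce function, the double scale factor and the int exponents are all hypothetical stand-ins.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical table entry: a unit and how it scales to an SI base unit.
struct unit_def {
    std::string symbol;
    std::string base_symbol;  // SI base unit this unit converts to
    double factor;            // 1 unit == factor base units
};

// Unreduced expression: (unit index, exponent) pairs, kept as written.
typedef std::vector<std::pair<unsigned int, int> > expression;

struct reduced {
    double coefficient;                         // scale relative to base units
    std::map<std::string, int> base_exponents;  // net exponent per base unit
};

// Collapse an expression to base units without modifying the expression.
reduced reduce(const std::map<unsigned int, unit_def>& table,
               const expression& e) {
    reduced r;
    r.coefficient = 1.0;
    for (std::size_t i = 0; i < e.size(); ++i) {
        const unit_def& u = table.at(e[i].first);
        r.coefficient *= std::pow(u.factor, e[i].second);
        int total = (r.base_exponents[u.base_symbol] += e[i].second);
        if (total == 0) r.base_exponents.erase(u.base_symbol);
    }
    return r;
}
```

With m and cm in the table, the expression [cm,2],[m,-1] stays stored as two pairs, while reduce reports a coefficient of about 1e-4 and a net dimension of m to the first power.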

Rui Maciel

Rui Maciel · May 25, 2009

Pascal said:
The best data structure for this task is a simple rational.

Map each unit to a different prime, so you can easily compute the new unit
or compare two units.

eg with:
m = 2
s = 3
kg = 5
N = kg*m/s² = 5*2/9 = 10/9
m/s = 2/3

N / (m/s) = (10/9) / (2/3) = 5/3 = kg/s

Otherwise, just use a map from unit to exponent.

Probably that solution is a bit too complex and doesn't offer a simple way to convert between units, which
is the main purpose of storing dimensional units.

Rui Maciel

Martin Eisenberg · May 25, 2009

Pascal said:
Map each unit to a different prime, so you can easily compute
the new unit or compare two units.

How is it that people still come up with the wasteful and
practically one-way original Goedel numbering in this age of
computing? Using flag bits, the idea might make sense.

Martin

Martin Eisenberg · May 25, 2009

Sam said:
typedef unsigned int si_handle_t;

class si_info {
public:
std::string symbol;
std::string name;
std::vector<si_handle_t> base_units;
};

typedef std::map<si_handle_t, si_info> units;

This is considered a more elegant design, and it adds additional
flexibility later down the road.

The value of an SI unit is then reduced to a single vector:

class si_value {
public:
si_handle_t si_unit;
std::vector<mpq_t> si_values;
};

Why do you put the exponents in the value class? Aren't they going to
be invariant for a given unit once the decomposition into base units
is fixed?
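
A sketch of what that would look like if the exponents moved into the unit table, leaving a value with just a handle and a magnitude. The names si_info_with_exponents, exponent_table, si_quantity and exponent_of_base are hypothetical, and int and double stand in for whatever numeric types are actually used.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef unsigned int si_handle_t;

// Variant of si_info: each base unit carries its fixed exponent.
struct si_info_with_exponents {
    std::string symbol;
    std::string name;
    std::vector<std::pair<si_handle_t, int> > base_units;  // (handle, exponent)
};

typedef std::map<si_handle_t, si_info_with_exponents> exponent_table;

// A value then needs only its unit handle and one magnitude.
struct si_quantity {
    si_handle_t si_unit;
    double magnitude;
};

// Exponent of base unit `base` in the unit of `q`, or 0 if it does not occur.
int exponent_of_base(const exponent_table& table, const si_quantity& q,
                     si_handle_t base) {
    exponent_table::const_iterator it = table.find(q.si_unit);
    if (it == table.end()) return 0;
    const std::vector<std::pair<si_handle_t, int> >& b = it->second.base_units;
    for (std::size_t i = 0; i < b.size(); ++i)
        if (b[i].first == base) return b[i].second;
    return 0;
}
```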

Martin
