What's the best data structure for this job?

R

Rui Maciel

I'm writing a class that processes SI units, which represents each unit as a vector of unit/exponent pairs.
According to my current implementation plan, the force units (the Newton) would be represented as [N,1] or,
after converting to the SI base units, [kg,1],[m,1],[s,-2]. To avoid wasting space with redundant
information, the unit in the unit/exponent pair is a reference index to a hash table that stores the unit's
name and symbol.

So, according to this scenario, what would be the best data structure for this job? Currently I'm using
nothing more than the STL containers for the data stuctures and the GNU MP bignum library for the exponent.
More precisely, I'm storing the SI units as:

std::vector< std::pair<unsigned int, mpq_t> > expression;


The SI unit list is being stored as:

std::map<unsigned int, std::pair<std::string, std::string> > units;


Nonetheless, I feel that this solution is far from perfect, not to mention a bit overwhelming. Moreover,
the static and global nature of a SI unit list leads me to believe that there must be a simpler solution
than a singleton wrapper class on a simple standard map.

So, what's your take on this issue? How would you implement a data structure for this task?



Thanks in advance,
Rui Maciel
 
S

sharmashivkumar01

Rui said:
I'm writing a class that processes SI units, which represents each unit as a vector of unit/exponent pairs.
According to my current implementation plan, the force units (the Newton) would be represented as [N,1] or,
after converting to the SI base units, [kg,1],[m,1],[s,-2]. To avoid wasting space with redundant
information, the unit in the unit/exponent pair is a reference index to a hash table that stores the unit's
name and symbol.
So, according to this scenario, what would be the best data structure for this job? Currently I'm using
nothing more than the STL containers for the data stuctures and the GNU MP bignum library for the exponent.
More precisely, I'm storing the SI units as:
std::vector< std::pair<unsigned int, mpq_t> > expression;
The SI unit list is being stored as:
std::map<unsigned int, std::pair<std::string, std::string> > units;
Nonetheless, I feel that this solution is far from perfect, not to mention a bit overwhelming. Moreover,
the static and global nature of a SI unit list leads me to believe that there must be a simpler solution
than a singleton wrapper class on a simple standard map.
So, what's your take on this issue? How would you implement a data structure for this task?

Have you considered using a unified list of base and compound SI units?
There's a variety ways of implementing it:

class si_info {
public:
   std::string symbol;
   std::string name;
   std::vector<unsigned int> base_units;

};

typedef std::map<unsigned int, si_info> units;

A compount SI unit, like a newton, would have a three-element base_units
vector, containing the index of the kg, m, and s base units. An SI base unit
is represented with the base_units vector containing one element, the index
of the SI base unit itself.

You now have a uniform reference for either a base or a compount unit, the
single unsigned int handle. To be even more pedantic, rather than using
unsigned int explicitly as a handle, use a symbolic typedef:

typedef unsigned int si_handle_t;

class si_info {
public:
   std::string symbol;
   std::string name;
   std::vector<si_handle_t> base_units;

};

typedef std::map<si_handle_t, si_info> units;

This is considered a more elegant design, and it adds additional flexibility
later down the road.

The value of an SI unit is then reduced to a single vector:

class si_value {
public:
   si_handle_t si_unit;
   std::vector<mpq_t> si_values;

};

si_unit gives the key to the units map. You can also use a simple
std::pair<si_unit, std::vector<mpq_t> > typedef, here.

You'll have to enforce two restrictions with this design:

1) All elements of the base_units vector in the SI map must reference only
other elements whose base_units vector is of size 1, and contain the same
si_handle_t value as the element itself -- that is, the element is either a
base element, or consists of only other base elements.

2) The size of your mpq_t vector must match the size of the base_units
vector in the si_info map. The nth element of the mpq_t vector gives the
value for the SI unit given by the nth element of the base_units vector.

Seems to me that with this design you can proceed and write code that
uniformly deals with base SI units and compount SI units, using the same
logic.

 application_pgp-signature_part
< 1KViewDownload- Hide quoted text -

- Show quoted text -
 
P

Pascal J. Bourguignon

Rui Maciel said:
I'm writing a class that processes SI units, which represents each unit as a vector of unit/exponent pairs.
According to my current implementation plan, the force units (the Newton) would be represented as [N,1] or,
after converting to the SI base units, [kg,1],[m,1],[s,-2]. To avoid wasting space with redundant
information, the unit in the unit/exponent pair is a reference index to a hash table that stores the unit's
name and symbol.

So, according to this scenario, what would be the best data structure for this job? Currently I'm using
nothing more than the STL containers for the data stuctures and the GNU MP bignum library for the exponent.
More precisely, I'm storing the SI units as:

std::vector< std::pair<unsigned int, mpq_t> > expression;


The SI unit list is being stored as:

std::map<unsigned int, std::pair<std::string, std::string> > units;


Nonetheless, I feel that this solution is far from perfect, not to mention a bit overwhelming. Moreover,
the static and global nature of a SI unit list leads me to believe that there must be a simpler solution
than a singleton wrapper class on a simple standard map.

So, what's your take on this issue? How would you implement a data structure for this task?

The best data structure for this task is a simple rational.

Map each unit to a different prime, so you can easily compute the new unit or compare two units.

eg with:
m = 2
s = 3
kg = 5
N = kg*m/s² = 5*2/9 = 10/9
m/s = 2/9

N / (m/s) = 10/9 / 2/3 = 5/3 = kg/s



Otherwise, just use a map from unit to exponent.
 
R

Rui Maciel

Mateusz said:
I've not analysed your problem in details, but I think it links well to
in-depth example of use of metafunctions to represent various units
explained by David Abrahams and Aleksey Gurtovoy in their book

http://www.boostpro.com/mplbook/

You can find some overview of this technique here:

http://www.boostpro.com/writing/ACCU_MPL_slides.ppt

I hope it will give you some ideas.

Best regards,

I've browsed the sample chapter and I have to say that, in spite of the probably needless template stuff,
the idea looks promising. The job of converting between units would end up being nothing more than altering
the unit's coefficient. Very nice.

The only drawback that I can see is that it doesn't offer a way to preserve information regarding unit
ratios. For example, I would like to express some units as the ratio of two units of the same type (cm^2/m,
m/m, etc...) but I don't see how it can be achieved through that method.


Rui Maciel
 
R

Rui Maciel

Pascal said:
The best data structure for this task is a simple rational.

Map each unit to a different prime, so you can easily compute the new unit
or compare two units.

eg with:
m = 2
s = 3
kg = 5
N = kg*m/s² = 5*2/9 = 10/9
m/s = 2/9

N / (m/s) = 10/9 / 2/3 = 5/3 = kg/s



Otherwise, just use a map from unit to exponent.

Probably that solution is a bit too complex and doesn't offer a simple way to convert between units, which
is the main purpose of storing dimensional units.


Rui Maciel
 
M

Martin Eisenberg

Pascal said:
Map each unit to a different prime, so you can easily compute
the new unit or compare two units.

How is it that people still come up with the wasteful and
practically one-way original Goedel numbering in this age of
computing? Using flag bits, the idea might make sense.


Martin
 
M

Martin Eisenberg

Sam said:
typedef unsigned int si_handle_t;

class si_info {
public:
std::string symbol;
std::string name;
std::vector<si_handle_t> base_units;
};

typedef std::map<si_handle_t, si_info> units;

This is considered a more elegant design, and it adds additional
flexibility later down the road.

The value of an SI unit is then reduced to a single vector:

class si_value {
public:
si_handle_t si_unit;
std::vector<mpq_t> si_values;
};

Why do you put the exponents in the value class? Aren't they going to
be invariant for a given unit once the decomposition into base units
is fixed?


Martin
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,770
Messages
2,569,584
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top