Better data structure?

M

Marcus Kwok

I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data (Gaussian
distribution, some of you call it a "normal distribution"). There are
two values of relevance: the "z" value (telling how many standard
deviations away from the mean), and the "phi" value (cumulative
probability).

The data we are reading lists "z" values from -4.00 to +4.00, in 0.02
increments. As a result, the old data structure was an array declared
as such:

double probs[401][2];

where probs[0] is the "z" value and probs[1] is the "phi" value.

Later, we want to look up values in the table, and interpolate if the
exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];
else if (z >= 4.0)
phi = probs[400][1];
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];
else {
ratio = (z - probs[indexL][0]) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
}
}


Pretty ugly! So, I am trying to use better data structures, but it
doesn't seem to be a huge improvement:


std::vector< std::pair<double, double> > probs;

// z is calculated here

if (z <= -4.0) {
phi = probs.front().second;
}
else if (z >= 4.0) {
phi = probs.back().second;
}
else {
double tmp = (z + 4.0) / 0.02;
int indexL = static_cast<int>(std::floor(tmp));
int indexH = static_cast<int>(std::ceil(tmp));
if (indexL == indexH) {
phi = probs[indexL].second;
}
else {
double ratio = (z - probs[indexL].first) / 0.02;
phi = probs[indexL].second + ratio * (probs[indexH].second - probs[indexL].second);
}
}


Is there a better way to do this, maybe involving a std::map<double, double>?
However, my issue with the std::map<> is the floating-point inaccuracies
when comparing the keys to see if the exact key is in the table.
 
J

Jay Nabonne

double probs[401][2];

struct Prob
{
double z;
double phi;
};

Prob probs[401];
exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];

phi = probs[0].phi;
else if (z >= 4.0)
phi = probs[400][1];

phi = probs[400].phi;
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];

phi = probs[indexL].phi;
else {
ratio = (z - probs[indexL][0]) / 0.02;

ratio = (z - probs[indexL].z) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);

phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);

- Jay
 
M

Marcus Kwok

Jay Nabonne said:
double probs[401][2];

struct Prob
{
double z;
double phi;
};

Prob probs[401];
exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];

phi = probs[0].phi;
else if (z >= 4.0)
phi = probs[400][1];

phi = probs[400].phi;
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];

phi = probs[indexL].phi;
else {
ratio = (z - probs[indexL][0]) / 0.02;

ratio = (z - probs[indexL].z) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);

phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);

Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.
 
K

Karl Heinz Buchegger

Marcus said:
Jay Nabonne said:
double probs[401][2];

struct Prob
{
double z;
double phi;
};

Prob probs[401];
exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];

phi = probs[0].phi;
else if (z >= 4.0)
phi = probs[400][1];

phi = probs[400].phi;
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];

phi = probs[indexL].phi;
else {
ratio = (z - probs[indexL][0]) / 0.02;

ratio = (z - probs[indexL].z) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);

phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);

Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.

I don't think there is much you can do. It already is short, is easy to understand.
I really don't see a need to clean it up.
The only thing I would do is: I would check if it ever happens that IndexL equals
indexH. If it happens, how often does it happen? Due to round-off errors I don't think
it is likely that the calculation of ( z + 4.0 ) / 0.02 equals a whole number when
using double arithmetic. So there is no point in having an 'if' that is almost never
taken. But I may be wrong, only a test could show that.
 
M

makc.the.great

Marcus said:
I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data [is] Pretty ugly!
So, I am trying to use better data structures...

In a process of converting legacy code it is nod a good idea to change
anything. There's russian saying, the "better" is an enemy of the
"good".
 
M

Marcus Kwok

Marcus said:
I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data [is] Pretty ugly!
So, I am trying to use better data structures...

In a process of converting legacy code it is nod a good idea to change
anything. There's russian saying, the "better" is an enemy of the
"good".

I am reimplementing the entire program (it's not too terribly big) so I
have full control over the entire code base, so it's not too big of a
deal. Converting it to C++ has allowed me to not worry as much about
resource management (since I use RAII a lot), leaving me more time to
spot inefficiencies in the algorithms we're using.
 
M

Marcus Kwok

Jay Nabonne said:
On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:

double probs[401][2];

struct Prob
{
double z;
double phi;
};

Prob probs[401];


exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];

phi = probs[0].phi;

else if (z >= 4.0)
phi = probs[400][1];

phi = probs[400].phi;

else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];

phi = probs[indexL].phi;

else {
ratio = (z - probs[indexL][0]) / 0.02;

ratio = (z - probs[indexL].z) / 0.02;

phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);

phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);

}
}
}
}
Marcus said:
Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.

Karl Heinz Buchegger said:
I don't think there is much you can do. It already is short, is easy
to understand. I really don't see a need to clean it up. The only
thing I would do is: I would check if it ever happens that IndexL
equals indexH. If it happens, how often does it happen? Due to
round-off errors I don't think it is likely that the calculation of (
z + 4.0 ) / 0.02 equals a whole number when using double arithmetic.
So there is no point in having an 'if' that is almost never taken. But
I may be wrong, only a test could show that.

Thanks for your response. I may go back and look at it later, but a
separate, more urgent issue (completely unrelated to this) has come up
in a different area of the code. Since it works now, I may leave it as
is.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,598
Members
45,150
Latest member
MakersCBDReviews
Top