A malloc error in C++ - incorrect checksum for freed object


giacomomonari

Hello everyone,
I wrote a program in C++ (called lab3_1) that compiles without errors,
but when I run it, the Run Log shows this error and the program does
not terminate correctly:

lab3_1(1545) malloc: *** error for object 0x1804400: incorrect
checksum for freed object - object was probably modified after being
freed, break at szone_error to debug
lab3_1(1545) malloc: *** set a breakpoint in szone_error to debug

lab3_1 has exited due to signal 11 (SIGSEGV).

The main of the program is:

#define SLEEP_LGTH 4

#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include "dataAnalysis.h"
#include "gnuplot_i.h"


using namespace std;

namespace GMNLFS=Giacomo_Monari_Numerical_Library_For_Simulations;

int main (int argc, char * const argv[]) {

ifstream dataFile("/Users/giacomomonari/Programmi C++/lab3_1/3cr_1.txt");
if (dataFile.fail())
{
cout << "No data file!" << endl;
}

string dummy1;
getline(dataFile,dummy1);

vector<double> z(0);
vector<double> mv(0);
vector<double> s178(0);
vector<double> a150(0);
vector<double> las(0);
vector<double> lls(0);
vector<double> loglb(0);
vector<double> logp178(0);

char dummy2[19];
double currentVar;

int count = 0;
while(!dataFile.eof())
{
dataFile.ignore(19,*dummy2);
dataFile >> currentVar;
z.push_back(currentVar);
dataFile >> currentVar;
dataFile >> currentVar;
mv.push_back(currentVar);
dataFile >> currentVar;
s178.push_back(currentVar);
dataFile >> currentVar;
a150.push_back(currentVar);
dataFile >> currentVar;
las.push_back(currentVar);
dataFile >> currentVar;
lls.push_back(currentVar);
dataFile >> currentVar;
loglb.push_back(currentVar);
dataFile >> currentVar;
logp178.push_back(currentVar);
count++;
}

dataFile.close();


{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
zAnalysis(z.begin(),z.end());
cout << "Max z = " << zAnalysis.GetMax() << endl;
cout << "Min z = " << zAnalysis.GetMin() << endl;
cout << "Average z = " << zAnalysis.GetAverage() << endl;
cout << "Variance of z = " << zAnalysis.GetVariance() << endl;
cout << "Standard Deviation of z = " << zAnalysis.GetStdDeviation()
<< endl;
cout << "Skewness of z = " << zAnalysis.GetSkewness() << endl;
cout << "Kurtosis of z = " << zAnalysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
mvAnalysis(mv.begin(),mv.end());
cout << "Max mv = " << mvAnalysis.GetMax() << endl;
cout << "Min mv = " << mvAnalysis.GetMin() << endl;
cout << "Average mv = " << mvAnalysis.GetAverage() << endl;
cout << "Variance of mv = " << mvAnalysis.GetVariance() << endl;
cout << "Standard Deviation of mv = " << mvAnalysis.GetStdDeviation()
<< endl;
cout << "Skewness of mv = " << mvAnalysis.GetSkewness() << endl;
cout << "Kurtosis of mv = " << mvAnalysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
s178Analysis(s178.begin(),s178.end());
cout << "Max s178 = " << s178Analysis.GetMax() << endl;
cout << "Min s178 = " << s178Analysis.GetMin() << endl;
cout << "Average s178 = " << s178Analysis.GetAverage() << endl;
cout << "Variance of s178 = " << s178Analysis.GetVariance() << endl;
cout << "Standard Deviation of s178 = " <<
s178Analysis.GetStdDeviation() << endl;
cout << "Skewness of s178 = " << s178Analysis.GetSkewness() << endl;
cout << "Kurtosis of s178 = " << s178Analysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
a150Analysis(a150.begin(),a150.end());
cout << "Max a150 = " << a150Analysis.GetMax() << endl;
cout << "Min a150 = " << a150Analysis.GetMin() << endl;
cout << "Average a150 = " << a150Analysis.GetAverage() << endl;
cout << "Variance of a150 = " << a150Analysis.GetVariance() << endl;
cout << "Standard Deviation of a150 = " <<
a150Analysis.GetStdDeviation() << endl;
cout << "Skewness of a150 = " << a150Analysis.GetSkewness() << endl;
cout << "Kurtosis of a150 = " << a150Analysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
lasAnalysis(las.begin(),las.end());
cout << "Max las = " << lasAnalysis.GetMax() << endl;
cout << "Min las = " << lasAnalysis.GetMin() << endl;
cout << "Average las = " << lasAnalysis.GetAverage() << endl;
cout << "Variance of las = " << lasAnalysis.GetVariance() << endl;
cout << "Standard Deviation of las = " <<
lasAnalysis.GetStdDeviation() << endl;
cout << "Skewness of las = " << lasAnalysis.GetSkewness() << endl;
cout << "Kurtosis of las = " << lasAnalysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
llsAnalysis(lls.begin(),lls.end());

cout << "Max lls = " << llsAnalysis.GetMax() << endl;
cout << "Min lls = " << llsAnalysis.GetMin() << endl;
cout << "Average lls = " << llsAnalysis.GetAverage() << endl;
cout << "Variance of lls = " << llsAnalysis.GetVariance() << endl;
cout << "Standard Deviation of lls = " <<
llsAnalysis.GetStdDeviation() << endl;
cout << "Skewness of lls = " << llsAnalysis.GetSkewness() << endl;
cout << "Kurtosis of lls = " << llsAnalysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
loglbAnalysis(loglb.begin(),loglb.end());
cout << "Max loglb = " << loglbAnalysis.GetMax() << endl;
cout << "Min loglb = " << loglbAnalysis.GetMin() << endl;
cout << "Average loglb = " << loglbAnalysis.GetAverage() << endl;
cout << "Variance of loglb = " << loglbAnalysis.GetVariance() <<
endl;
cout << "Standard Deviation of loglb = " <<
loglbAnalysis.GetStdDeviation() << endl;
cout << "Skewness of loglb = " << loglbAnalysis.GetSkewness() <<
endl;
cout << "Kurtosis of loglb = " << loglbAnalysis.GetKurtosis() <<
endl;

vector<double> xAxeLoglb(50);
vector<double> yAxeLoglb(50);

loglbAnalysis.CreateHistogram(xAxeLoglb.begin(), xAxeLoglb.end(),
yAxeLoglb.begin(), yAxeLoglb.end());

Gnuplot g;
g.set_style("boxes");
g.plot_xy(xAxeLoglb,yAxeLoglb,"luminosity distribution");
sleep(SLEEP_LGTH);

loglbAnalysis.ComputeAlphaAndK(0.,50.);
cout << "Alpha = " << loglbAnalysis.GetAlpha() << endl;
cout << "K = " << loglbAnalysis.GetK() << endl;
cout << " ------------------------------------- " << endl;

}



{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
logp178Analysis(logp178.begin(),logp178.end());
cout << "Max logp178 = " << logp178Analysis.GetMax() << endl;
cout << "Min logp178 = " << logp178Analysis.GetMin() << endl;
cout << "Average logp178 = " << logp178Analysis.GetAverage() << endl;
cout << "Variance of logp178 = " << logp178Analysis.GetVariance() <<
endl;
cout << "Standard Deviation of logp178 = " <<
logp178Analysis.GetStdDeviation() << endl;
cout << "Skewness of logp178 = " << logp178Analysis.GetSkewness() <<
endl;
cout << "Kurtosis of logp178 = " << logp178Analysis.GetKurtosis() <<
endl;
cout << " ------------------------------------- " << endl;

}


return 0;
}

Note that before printing that error message the program does its work
correctly (although the graph it produces with gnuplot is wrong).
The program uses a header I wrote, "dataAnalysis.h", which has caused no
problems in other programs, so I can't tell whether the header is at
fault. What could it be?
Giacomo
 

Victor Bazarov

[...incomplete code with a bunch of 3rd party thingamajigs...]

<shrug> Memory overrun.

Either run it under a tool that would actually watch your memory
access, like BoundsChecker (if you're on Windows), or narrow it down
to the actual statement causing your memory overrun. I recommend
the latter, since it's easier (no need to look for the tool). Just
comment out anything non-essential, and see if it still bombs. If
it doesn't, start re-adding things back, and see when it again croaks.
Then analyse further.

V
 

Alf P. Steinbach

* (e-mail address removed):
> int main (int argc, char * const argv[]) {

argc and argv are apparently not used, so no need to declare them.

> ifstream dataFile("/Users/giacomomonari/Programmi C++/lab3_1/3cr_1.txt");
> if (dataFile.fail())
> {
> cout << "No data file!" << endl;
> }
>
> string dummy1;
> getline(dataFile,dummy1);
>
> vector<double> z(0);
> vector<double> mv(0);
> vector<double> s178(0);
> vector<double> a150(0);
> vector<double> las(0);
> vector<double> lls(0);
> vector<double> loglb(0);
> vector<double> logp178(0);
>
> char dummy2[19];
> double currentVar;
>
> int count = 0;
> while(!dataFile.eof())

You should not test for eof() here, but for failure.

> {
> dataFile.ignore(19,*dummy2);

Using uninitialized dummy2[0]'s value. This may not always ignore 19
characters.

> dataFile >> currentVar;
> z.push_back(currentVar);

Should check for failure.
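A minimal sketch of the loop shape these remarks point toward. It assumes each data row starts with a fixed-width 19-character label followed by whitespace-separated numbers; the three-column layout and the name readRows are made up for illustration (the real file has more columns). Instead of calling ignore() on the file stream, it reads one line at a time and lets the failed read end the loop, so eof() is never tested beforehand:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads rows of the form "<label of labelWidth characters><v0> <v1> <v2>".
// The layout is an assumption for this sketch, not the real 3cr_1.txt format.
std::vector<std::vector<double>> readRows(std::istream& in,
                                          std::size_t labelWidth = 19)
{
    std::vector<std::vector<double>> rows;
    std::string line;
    while (std::getline(in, line)) {               // stops when a read fails
        if (line.size() <= labelWidth) continue;   // blank or short line: skip
        std::istringstream fields(line.substr(labelWidth));
        std::vector<double> row;
        double value;
        while (fields >> value) row.push_back(value);   // each extraction is checked
        if (row.size() == 3) rows.push_back(row);       // keep complete rows only
    }
    return rows;
}
```

Because every push_back happens only after a successful extraction, a trailing blank line or a truncated last row cannot add a stale value to the data.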

Cheers, & hth.,

- Alf
 

giacomomonari



First of all thx Victor :)

OK, I narrowed it down the second way you suggested (I'm on Apple
Xcode, which has GuardMalloc, but I'm no expert at using it).
The offending part of main seems to be this:

{
GMNLFS::DataAnalysis<vector<double>::iterator,double>
llsAnalysis(lls.begin(),lls.end());

cout << "Max lls = " << llsAnalysis.GetMax() << endl;
cout << "Min lls = " << llsAnalysis.GetMin() << endl;
cout << "Average lls = " << llsAnalysis.GetAverage() << endl;
cout << "Variance of lls = " << llsAnalysis.GetVariance() << endl;
cout << "Standard Deviation of lls = " <<
llsAnalysis.GetStdDeviation() << endl;
cout << "Skewness of lls = " << llsAnalysis.GetSkewness() << endl;
cout << "Kurtosis of lls = " << llsAnalysis.GetKurtosis() << endl;
cout << " ------------------------------------- " << endl;
}


which doesn't seem very different from the other parts... I have no
idea... What could it be?...
Giacomo
 

Victor Bazarov

Take it one level down, my friend. There are several lines of code here.
Many function calls. Which one is doing it? Or is it only happening
when _all_ of them are called in this particular order? (I doubt it)

V
 

giacomomonari

Ok, sorry. I checked... It's the first line, the instantiation of the
object llsAnalysis:

GMNLFS::DataAnalysis<vector<double>::iterator,double>
llsAnalysis(lls.begin(),lls.end());

the other functions can't be called without it ;)







Thx Alf :) Could you please show me the edits I have to do?... I'm not
sure.



Giacomo
 

Victor Bazarov

> Ok, sorry. I checked... It's the first line, the instantiation of the
> object llsAnalysis:
>
> GMNLFS::DataAnalysis<vector<double>::iterator,double>
> llsAnalysis(lls.begin(),lls.end());
>
> the other functions can't be called without it ;)

Right, but as you've found out, they don't need to be called :)

Now, what's lls? Is it by any chance empty? If not, and it's valid
(from the 'DataAnalysis' point of view), then you need to keep digging
in, find what in the constructor of 'DataAnalysis' makes the program
crash (yes, I understand that sometimes you can't go without some
steps, but you need to know how your 'lls' is used, what's done to
the iterators, etc.) Be fully aware of how your data are handled,
even if by somebody else's library.

V
 

giacomomonari

lls isn't empty.
I wrote the header "dataAnalysis.h" myself; if you want I can post it
in full (and by the way, how did you hide part of your post? That would
be very useful if I post all the code ;))

I suspect the faulty part of my code looks like this:

template <typename Itr, typename TYPE>
TYPE DataAnalysis<Itr,TYPE>::ComputeVariance()
{
const TYPE initValue = (TYPE) (0.0);
Itr averageConstFirst = dataLast++;
std::fill_n(averageConstFirst,(dataLast - dataFirst), average);
TYPE tmp = std::inner_product(dataFirst, dataLast, averageConstFirst,
initValue, std::plus<TYPE>(), SquaredDifference<TYPE>()) ;
tmp /= (TYPE) (dataLast - dataFirst - 1);
return tmp;
}

This function (and others like it) is called directly from the
constructor.
I needed to use inner_product between the data vector and a vector of
the same size whose every component holds the value "average".
So I used these strange lines:
Itr averageConstFirst = dataLast++;
std::fill_n(averageConstFirst,(dataLast - dataFirst), average);
to create that all-average vector without writing over the data
vector (all using iterators)...
But this doesn't seem a very elegant solution and could be the source
of the problems... What do you think? Should I post the entire code?
Giacomo
 

Alf P. Steinbach

* (e-mail address removed):
> Thx Alf :) Could you please show me the edits I have to do?... I'm not
> sure.

In that case it gets complicated from /my/ point of view... Like
talking down a plane with a passenger as pilot. But OK, I had an idea.

0.
Do this IN ORDER.

I.e. do /not/ start with the last point, but instead with point (1) below.

1.
First, reorganize your program as follows, moving your current "main"
code to a new function called "cppMain":

#include <iostream>
#include <ostream>

#include <cstdlib>      // EXIT_SUCCESS, EXIT_FAILURE
#include <stdexcept>

void cppMain()
{
    // Here goes your earlier "main" code.
}

int main()
{
    try
    {
        cppMain();
        return EXIT_SUCCESS;
    }
    catch( std::exception const& x )
    {
        std::cerr << "!" << x.what() << std::endl;
        return EXIT_FAILURE;
    }
}

Note new headers <stdexcept> and <cstdlib>, which you need to include.

2.
Make this /compile successfully/ and reproduce earlier behavior exactly.

3.
Change the file stream to generate exceptions on failure, by (3a) adding
the following code[*]

bool throwStdX( char const s[] ) { throw std::runtime_error( s ); }

struct XEof: std::runtime_error
{
XEof( char const s[] = "XEof" ): std::runtime_error( s ) {}
};

struct InFileStream: std::ifstream
{
void throwEofOrStd( char const s[] )
{
if( eof() ) { throw XEof(); } else { throwStdX( s ); }
}

InFileStream( char const fname[] )
: std::ifstream( fname )
{
if( fail() ) { throwStdX( "file open failed" ); }
}

InFileStream& operator>>( double& x )
{
*this >> x; // careful: as written, this calls this same operator>> again
if( fail() ){ throwEofOrStd( ">>(double) failed" ); }
return *this;
}
};

and (3b) replacing std::ifstream with InFileStream,

and (3c) replacing each loop (I think there was only one, but I don't
have your code in sight)

while( !f.eof() )
{
// Blah blah
}

with

while( !f.eof() )
{
try
{
// Blah blah
}
catch( XEof const& ) {}
}


4.
Recompile and check that indeed, now the program produces some "!...."
exception message, indicating that some input operation failed.

5.
Fix the obvious "ignore" bug. For that, check the documentation of
"ignore". What arguments does it expect?
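For reference, std::istream::ignore takes a character count and, optionally, a delimiter value at which to stop early; it does not take a buffer. A small sketch under assumed input (a fixed-width label followed by a number; the function name readValueAfterLabel is hypothetical):

```cpp
#include <istream>
#include <limits>
#include <sstream>

// Skips a fixed-width label, reads one number, then discards the rest of
// the line. The "label then number" layout is an assumption for this sketch.
double readValueAfterLabel(std::istream& in, std::streamsize labelWidth)
{
    in.ignore(labelWidth);   // no delimiter given: skips labelWidth characters
                             // (fewer only if the input ends first)
    double value = 0.0;
    in >> value;
    // With a delimiter, ignore() stops right after extracting it; here that
    // drops everything up to and including the newline.
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return value;
}
```

Discarding the newline at the end matters: if it were left in the stream, the next fixed-width skip would count it as the first label character.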


Cheers, & hth.,

- Alf


Notes:
[*] Simply using the exceptions() member function is problematic because
reaching end of file may set failbit as well as eofbit, which then
generates an exception. The standard iostreams seemingly were not
designed, just arbitrarily evolved.
 

giacomomonari

Thank you very much, Alf! I did what you said... There are no bugs in
that part. The program is surely much more elegant now.
Too bad the problems I had aren't gone yet... :(
Giacomo
 

giacomomonari

* Alf P. Steinbach:
> Uh, don't do that. I was a bit hasty. Do something like
>
> static_cast<std::ifstream&>(*this) >> x;
>
> - Alf (hasty)

Thx Alf! I did what you said. There are no bugs. Now surely my program
is much more elegant...
Too bad the old problems aren't gone yet :(
Giacomo
 

Victor Bazarov

> lls isn't empty.
> I created the header "dataAnalysis.h", if you want I can entirely
> post it (and btw, how you hid part of your posts?

I am not versed in the problem domain as well as you (most people
here probably aren't), I don't have the source data you have (and
even if you share those, I am not sure I'd have the time to delve
into the details of your algorithm). *You* on the other hand hold
all the cards, you know what your program /should/ do, step after
step, so you're the best person to continue debugging it.

> It would be very
> useful if I'll post all the code ;))

No, it would not. Next after that you're going to post the source
data, so we can also run your program, and not just look at it.
And there are probably other pieces that you will either forget or
they are not allowed to be copied...

> I suspect that the bad part of my code could be like this:
>
> template <typename Itr, typename TYPE>
> TYPE DataAnalysis<Itr,TYPE>::ComputeVariance()
> {
> const TYPE initValue = (TYPE) (0.0);
> Itr averageConstFirst = dataLast++;
> std::fill_n(averageConstFirst,(dataLast - dataFirst), average);
> TYPE tmp = std::inner_product(dataFirst, dataLast, averageConstFirst,
> initValue, std::plus<TYPE>(), SquaredDifference<TYPE>()) ;
> tmp /= (TYPE) (dataLast - dataFirst - 1);
> return tmp;
> }

What's so bad about it?

> these (there are other similar) are functions of my code directly
> called from the constructor...

Well, if it's your constructor that's failing, try pulling it apart
and call those functions separately... You need to make sure that
the pieces from which you assemble your object are actually valid to
begin with. Are they? You don't have to answer, just figure it out
for yourself.

> I needed to use inner_product between the data vector and a vector of
> the same size that values "average" in every components.

Have those vectors been allocated (resized) to the proper length, or
are you using 'back_inserter'?

> So I used these strange lines:
> Itr averageConstFirst = dataLast++;
> std::fill_n(averageConstFirst,(dataLast - dataFirst), average);
> in order to create that all-average-vector and not write on data
> vector (all using iterators)...

OK. Does 'fill_n' insert or does it assign? Does your vector (to
which 'dataLast' and 'dataFirst' are iterators) have enough room in
it to contain your averages (in case 'fill_n' actually assigns and
doesn't insert)? If you have access to the vector, resize it right
before 'fill_n' to twice its current size, and see what happens.
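To make the insert-versus-assign distinction concrete, here is a small sketch (the function names are illustrative only and do not come from dataAnalysis.h). std::fill_n writes through the iterator it is given, so the destination must already hold n elements, unless that iterator is a std::back_inserter:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// fill_n assigns through the iterator; it never grows the container.
// Writing into a vector that already has n elements is fine.
std::vector<double> filledByAssignment(std::size_t n, double value)
{
    std::vector<double> v(n);            // room for n elements exists up front
    std::fill_n(v.begin(), n, value);    // overwrites v[0] .. v[n-1]
    return v;
}

// With back_inserter every assignment becomes a push_back, so the vector
// may start out empty and grows as needed.
std::vector<double> filledByInsertion(std::size_t n, double value)
{
    std::vector<double> v;
    std::fill_n(std::back_inserter(v), n, value);
    return v;
}
```

Passing an iterator that points at or past end() of a plain vector, as the quoted code does, fits neither pattern: the writes land in memory the vector does not own.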
> But this doesn't seem a very elegant solution and could be the spring
> of the problems... What do you think? Should I post the entire code?

<shrug> Post it if you must. Keep in mind that everybody in this
newsgroup also has other things to do in their lives than fix your
code, no offense intended. Besides, if you fix it yourself you not
only will have more pride in your work, you'll also learn more than
if somebody else does it.

V
 

giacomomonari

Hey Victor, your questions helped a lot :)
Yes, the part of my code I suspected was wrong... was wrong...
fill_n doesn't insert, it only assigns. I did know that, but I simply
hadn't allocated enough memory to fill. That happened because I tried to
avoid using vectors in that header file and do everything with
iterators. But it doesn't matter, because I use a vector only to
create that average vector...
I changed those parts of the code this way:

template <typename Itr, typename TYPE>
TYPE DataAnalysis<Itr,TYPE>::ComputeVariance()
{
const TYPE initValue = (TYPE) (0.0);
std::vector<TYPE> averageVec((dataLast - dataFirst),average);
TYPE tmp = std::inner_product(dataFirst, dataLast,
averageVec.begin(), initValue, std::plus<TYPE>(),
SquaredDifference<TYPE>()) ;
tmp /= (TYPE) (dataLast - dataFirst - 1);
return tmp;
}

so now I have an averageVec to use for the inner product... Before, I
was just writing over parts of the memory next to my data vectors, I
suppose... Perhaps it isn't that elegant, but the problem seems to be
solved (for now)...
Thank you very much! :D
Giacomo
 

BobR

> int main (int argc, char * const argv[]) {
[snip]
> vector<double> z(0);
> vector<double> mv(0);
> vector<double> s178(0);
> vector<double> a150(0);
> vector<double> las(0);
> vector<double> lls(0);
> vector<double> loglb(0);
> vector<double> logp178(0);

There is nothing wrong with those vectors, but, it looks 'cluttered' to me.
Therefore, I'd suggest something like:

enum Keys{ z = 0, mv, s178, a150, las, lls, loglb, logp178 };

std::map<size_t, std::vector<double> > mvD;

// example usage:

mvD[ z ] = std::vector<double>( 10, 3.14 );
mvD[ mv ] = std::vector<double>( 10, 0.001 );
std::cout<<"mvD[z].at(2) ="<<mvD[z].at(2)<<'\n'<<std::endl;
// out: mvD[z].at(2) =3.140000
mvD[ mv ].at(4) = 10.77;
std::cout<<"mvD[mv].at(4) ="<<mvD[mv].at(4)<<'\n'<<std::endl;
// out: mvD[mv].at(4) =10.770000
mvD[ logp178 ].push_back( 7.0125 );
// etc....

char dummy2[19];
double currentVar;

int count = 0;
while( !dataFile.eof() ){
dataFile.ignore(19,*dummy2);
dataFile >> currentVar;

// > z.push_back(currentVar);

mvD[ z ].push_back( currentVar );

// etc....

Just a thought....
 

Frank Birbacher

Hi!

> I changed those parts of code in this way...
>
> template <typename Itr, typename TYPE>
> TYPE DataAnalysis<Itr,TYPE>::ComputeVariance()
> {
> const TYPE initValue = (TYPE) (0.0);
> std::vector<TYPE> averageVec((dataLast - dataFirst),average);
> TYPE tmp = std::inner_product(dataFirst, dataLast,
> averageVec.begin(), initValue, std::plus<TYPE>(),
> SquaredDifference<TYPE>()) ;
> tmp /= (TYPE) (dataLast - dataFirst - 1);
> return tmp;
> }

I don't like the idea of having a vector full of identical values just
for using it with inner_product. I _like_ the idea of using std
algorithms, though.

Try:

template<typename TYPE>
struct SqDiffAccumulator
{
    TYPE const average;
    SqDiffAccumulator(TYPE const newAverage)
        : average(newAverage)
    {}
    TYPE operator() (TYPE const sum, TYPE const currentValue) const
    {
        const TYPE difference = currentValue - average;
        return sum + difference * difference;
    }
};


TYPE const sqSum = std::accumulate(
    dataFirst, dataLast,
    initValue,
    SqDiffAccumulator<TYPE>(average)
);
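Put together, the functor might be used as follows. This is only a sketch: the free function sampleVariance and its use of std::iterator_traits are assumptions made for illustration and are not part of the DataAnalysis class. It divides by n - 1, as ComputeVariance does:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Binary operation for std::accumulate: adds the squared distance of each
// value from a fixed average to the running sum.
template <typename T>
struct SqDiffAccumulator
{
    T const average;
    explicit SqDiffAccumulator(T const newAverage) : average(newAverage) {}
    T operator()(T const sum, T const currentValue) const
    {
        T const difference = currentValue - average;
        return sum + difference * difference;
    }
};

// Sample variance of [first, last); no temporary vector of averages needed.
template <typename Itr>
typename std::iterator_traits<Itr>::value_type sampleVariance(Itr first, Itr last)
{
    typedef typename std::iterator_traits<Itr>::value_type T;
    std::ptrdiff_t const n = std::distance(first, last);
    assert(n > 1);   // the n - 1 denominator needs at least two values
    T const mean = std::accumulate(first, last, T(0)) / static_cast<T>(n);
    T const sqSum = std::accumulate(first, last, T(0), SqDiffAccumulator<T>(mean));
    return sqSum / static_cast<T>(n - 1);
}
```

The range is traversed twice (once for the mean, once for the squared differences), which is the same amount of work as the inner_product version but without the extra allocation.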

Frank
 

Jerry Coffin

[ ... ]
> I don't like the idea of having a vector full of identical values just
> for using it with inner_product. I _like_ the idea of using std
> algorithms, though.

There is another way: make something that acts like a vector full of
identical values:

// warning: untested code
template <typename T>
class single_value {
    T v_;
    size_t size_;
public:
    single_value(T v, size_t size) : v_(v), size_(size) {}

    // every index yields the same value
    T operator[](size_t /*index*/) const {
        return v_;
    }

    class iterator {
        T v_;          // copy of the repeated value
        size_t pos_;   // position, used only for comparison
    public:
        iterator(single_value const &v, size_t pos = 0)
            : v_(v.v_), pos_(pos)
        {}

        iterator &operator++() {
            ++pos_;
            return *this;
        }

        T operator*() const { return v_; }
        bool operator==(iterator const &other) const {
            return pos_ == other.pos_;
        }
        bool operator!=(iterator const &other) const {
            return pos_ != other.pos_;
        }
    };

    iterator begin() const { return iterator(*this); }
    iterator end() const { return iterator(*this, size_); }
};

Another possibility would be to use valarrays. They were really intended
for this kind of computational work, and can often handle it quite
cleanly:

// Warning: only minimally tested
template <class T>
T const variance(T *values, size_t size) {
std::valarray<T> const v(values, size);

T average = v.sum() / v.size();

std::valarray<T> diffs = v-average;

diffs *= diffs;

return diffs.sum()/diffs.size();
}

As this is written at the moment, it accepts a pointer and size, but it
would be fairly easy to allow it to accept input as a pair of iterators
or such. Given the OP's problem, using valarrays throughout might be a
possibility as well, and I didn't want to spend a lot of code on
something that might easily not matter.
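One way the iterator-pair variant might look, as a sketch only (the name varianceOfRange is made up here). It copies the range into a buffer first because valarray is constructed from a pointer and a size, and like the version above it divides by the number of values, not by n - 1:

```cpp
#include <cstddef>
#include <iterator>
#include <valarray>
#include <vector>

// Variance of [first, last) computed with valarray arithmetic.
// Returns a value-initialized result for an empty range.
template <typename Itr>
typename std::iterator_traits<Itr>::value_type varianceOfRange(Itr first, Itr last)
{
    typedef typename std::iterator_traits<Itr>::value_type T;
    const std::vector<T> buffer(first, last);    // contiguous copy of the input
    if (buffer.empty()) return T();

    const std::valarray<T> v(buffer.data(), buffer.size());
    const T average = v.sum() / static_cast<T>(v.size());

    std::valarray<T> diffs = v - average;        // element-wise subtraction
    diffs *= diffs;                              // element-wise squaring

    return diffs.sum() / static_cast<T>(diffs.size());
}
```

The copy costs extra memory, so this trades efficiency for readability in the same way the pointer-and-size version does.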

Whether this is better or worse is open to some question -- on one hand,
it does temporarily store all the (squared) deviations from the mean, so
it uses extra memory that isn't strictly necessary. OTOH, I think it's
much easier to read and understand than either the original code or your
code. That's likely good in educational code, but not so good for a
library that will be used heavily but only rarely read. OTOH, for big
vector-based machines (e.g. Crays) most valarray operations are easy to
run on the vector processor, in which case they should be quite fast.
 

Frank Birbacher

Hi!

Jerry said:
> There is another way: make something that acts like a vector full of
> identical values:

:D Nice idea!

I thought about it: Your partial implementation needs to be completed.
This is a type of thing nobody does regularly. Thus it'll take much time
to get it correct. OTOH you only need a single iterator since
inner_product does not care about an end of the second sequence.

> // Warning: only minimally tested
> template <class T>
> T const variance(T *values, size_t size) {
> std::valarray<T> const v(values, size);
>
> T average = v.sum() / v.size();
>
> std::valarray<T> diffs = v-average;
>
> diffs *= diffs;
>
> return diffs.sum()/diffs.size();
> }

Wow, I didn't even know about valarrays.

> Whether this is better or worse is open to some question

There are numerous ways of calculating the variance. Hopefully we get
those soon:
http://boost-sandbox.sourceforge.net/libs/accumulators/doc/html/index.html

Frank
 

Jerry Coffin

> Hi!
>
> :D Nice idea!
>
> I thought about it: Your partial implementation needs to be completed.
> This is a type of thing nobody does regularly. Thus it'll take much time
> to get it correct. OTOH you only need a single iterator since
> inner_product does not care about an end of the second sequence.

This is true for inner_product, but not for other purposes -- then
again, I'm not sure there are many uses for this "container". :)

In any case, I don't see much reason not to support 'end()' -- about
all it requires is that you store the size, which strikes me as
sufficiently trivial that you just aren't gaining much of anything by
eliminating it.

> Wow, I didn't even know about valarrays.

Many people don't, or have only barely heard of them -- they've largely
been overshadowed by other parts of the standard library.

> There are numerous ways of calculating the variance. Hopefully we get
> those soon:
> http://boost-sandbox.sourceforge.net/libs/accumulators/doc/html/index.html

Yup; that's probably more open to generalization and extension.
 

Richard Herring

Frank Birbacher said:
> Hi!
>
> :D Nice idea!
>
> I thought about it: Your partial implementation needs to be completed.
> This is a type of thing nobody does regularly. Thus it'll take much time
> to get it correct. OTOH you only need a single iterator since
> inner_product does not care about an end of the second sequence.

So you don't even need the vector part. How about just something that
acts like an iterator over an (infinite) sequence of identical values:

// Warning - untested code
template <typename T>
class IteratorToConstant {
public:
    IteratorToConstant(T x) : v_(x) {}
    T const & operator*() const { return v_; }   // const member: hand out a const reference
    IteratorToConstant & operator++() { return *this; }
    IteratorToConstant operator++(int) { return *this; }
private:
    T v_;
};

Again, just a skeleton - you might want to add operator-- etc.
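For illustration, here is how such an iterator could be plugged into std::inner_product in place of the vector of averages. This is a sketch under assumptions: ConstantIterator restates the skeleton above with the usual iterator typedefs added, SquaredDifference is a guess at what the functor from the earlier posts does (its definition was never shown), and sumOfSquaredDeviations is a made-up name:

```cpp
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>
#include <vector>

// Iterator over an endless sequence of one repeated value.
template <typename T>
class ConstantIterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef T                       value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef const T*                pointer;
    typedef const T&                reference;

    explicit ConstantIterator(T x) : v_(x) {}
    reference operator*() const { return v_; }
    ConstantIterator& operator++() { return *this; }     // advancing changes nothing
    ConstantIterator operator++(int) { return *this; }
private:
    T v_;
};

// Assumed behaviour of the thread's SquaredDifference functor.
template <typename T>
struct SquaredDifference {
    T operator()(T a, T b) const { return (a - b) * (a - b); }
};

// Sum of (x - mean)^2 over the data; no second container is allocated.
double sumOfSquaredDeviations(const std::vector<double>& data, double mean)
{
    return std::inner_product(data.begin(), data.end(),
                              ConstantIterator<double>(mean), 0.0,
                              std::plus<double>(), SquaredDifference<double>());
}
```

inner_product only dereferences and increments its third argument, which is why the constant iterator needs no end position or comparison operators for this use.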
 
