Heap overflow/corruption problem in an arbitrary precision class

M

Martin the Third

Hi, I need some help!

I'm writing an infinite-precision floating point library called
ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
quick overview: I'm storing numbers as a char* mantissa, an int
exponent and a sign. For example, the number "123.45" would have a
mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
storing it as "1.2345x10^2".

The mantissa is allocated dynamically at runtime to allow arbitrary
precision in my numbers. I'm having trouble with heap overflows; let
me explain in more detail:

I'm trying to write a += or + operator overload. Originally I wrote a
+= function that simply expanded the mantissa of the left operand to
hold the entire result, and then return itself. This causes a heap
overflow, for example, with the following numbers:
0.0000000000000000001 + 13123123123.123123123

Here, the mantissa of the left operand is "1" but the right is
"13123123123123123123". When the memory is allocated, the left
mantissa is allocated, then the right one is allocated, so their
addresses are very close - the left operand has a tiny mantissa. When
I expand that mantissa to hold the entire result, this obviously
overflows into the right operand's mantissa, and I either get really
funky answers or a heap overflow and a segfault.

The solution, I thought, was to write a + operator overload instead.
Then, I'd have a left and right operand, but I wouldn't touch them at
all - instead I'd create a local "ipfloat result" and then store the
answer in result, and return result. This works!

ipfloat a, b;
a+b; //no errors!

But...when used in a real situation:
ipfloat a(1232.12312);
ipfloat b(123123.231231);
ipfloat c;
c = a+b;

There are, once again, heap overflows causing corruption. Here, the
ipfloat result that I'm returning from the addition is trying to be
stored in c, but c was allocated _before_ the operation took place and
so before result was created. Then the assignment operator tries to
expand c (which has a size of 2 - "0\0") to hold the whole answer, and
overwrites data in result, so again I have corruption, or overflow (or
both).

The only solution I can think of is to use memmove() in my assignment
operator. Is that viable? I'd really appreciate any help with how to
restructure this so that I can make it stable.

Thank you! I'll gladly post code if it will help.
 
K

Kai-Uwe Bux

Martin said:
Hi, I need some help!

I'm writing an infinite-precision floating point library called
ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
quick overview: I'm storing numbers as a char* mantissa, an int
exponent and a sign. For example, the number "123.45" would have a
mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
storing it as "1.2345x10^2".

The mantissa is allocated dynamically at runtime to allow arbitrary
precision in my numbers. I'm having trouble with heap overflows; let
me explain in more detail: [snip]
Thank you! I'll gladly post code if it will help.

Since my crystal ball is in repair, code would be highly appreciated.

Short of that, consider using std::vector< char > instead of char*. Very
likely, your problems stem from mistakes in pointer handling (allocation,
deallocation, _and_ reallocation).


Best

Kai-Uwe Bux
 
M

Martin the Third

Martin said:
Hi, I need some help!
I'm writing an infinite-precision floating point library called
ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
quick overview: I'm storing numbers as a char* mantissa, an int
exponent and a sign. For example, the number "123.45" would have a
mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
storing it as "1.2345x10^2".
The mantissa is allocated dynamically at runtime to allow arbitrary
precision in my numbers. I'm having trouble with heap overflows; let
me explain in more detail: [snip]
Thank you! I'll gladly post code if it will help.

Since my crystal ball is in repair, code would be highly appreciated.

Short of that, consider using std::vector< char > instead of char*. Very
likely, your problems stem from mistakes in pointer handling (allocation,
deallocation, _and_ reallocation).

Best

Kai-Uwe Bux

Here are the relevant functions. Keep in mind that I stopped my +
function halfway through, so it doesn't actually add yet. (I have
older versions that do...)

ipfloat::ipfloat(char* ch)
{
int len = strlen(ch);
int i = 0, j = 0; //i keeps place in ch, j keeps
place in man
sign = (ch[0]=='-')?'-':'+'; //assign the sign
int startZeros = 0, trailZeros = 0, manLen, decPos, decOffset;
while(ch!='\0' && ch!='.') //loop until we find the decimal,
whether explicitly typed or not
i++;
decPos = i;
i=0;
while(ch=='0' || ch=='.') //count number of 0's at the
beginning of the number
{
startZeros++; //this now contains 1 more than sz
if theres a dec
i++;
}
i=0;
while(ch[len-1-i]=='0') //count the number of trailing 0's
{
trailZeros++;
i++;
}
manLen = len - startZeros - trailZeros - 1;
man = (char*)malloc(sizeof(char)*(manLen+1));
for(i=startZeros; i<(startZeros+manLen+1); i++, j++) //copy
correct data from ch to man
{
if(ch!='.')
man[j] = ch;
else
j--;
}
man[j] = '\0';
decOffset = (startZeros>=decPos)?0:1; //combine this with
statement below? remove variable?
exp = decPos - startZeros - decOffset;
}
ipfloat::ipfloat(const ipfloat &rhs){
if(this!=&rhs)
{
setMan(rhs.getMan());
setExp(rhs.getExp());
setSign(rhs.getSign());
}
}
ipfloat::ipfloat(int size, char sgn)
{
man = (char*)malloc(sizeof(char)*size);
sign = sgn;
}
ipfloat::~ipfloat(){
free(man);
}
inline const char* const ipfloat::getMan() const{return man;}
inline const int ipfloat::getExp() const{return exp;}
inline const char ipfloat::getSign() const{return sign;}
inline void ipfloat::setManSize(int size)
{
free(man);
man = (char*)malloc(sizeof(char)*(size));
}
inline void ipfloat::setManChar(int pos, char ch)
{
man[pos] = ch;
}
inline void ipfloat::setMan(const char* const ch){
free(man);
man = (char*)malloc(sizeof(char)*(strlen(ch)+1));
strcpy(man, ch);
}
inline void ipfloat::setExp(int e){
exp = e;
}
inline void ipfloat::setSign(char ch){
sign = ch;
}
ipfloat &ipfloat::eek:perator =(const ipfloat &rhs){
if(this!=&rhs)
{
setMan(rhs.getMan());
setExp(rhs.getExp());
setSign(rhs.getSign());
}
return *this;
}
ipfloat ipfloat::eek:perator +(const ipfloat &rhs)
{
int ctr, c, d;
int thisBefore, thisAfter, rhsBefore, rhsAfter, resLen;
thisBefore = (getExp()>=0) ? getExp() + 1 : 1;
thisAfter = (strlen(getMan())<=getExp()) ? abs((int)
(strlen(getMan()) + getExp() - thisBefore)) : strlen(getMan()) -
getExp() - 1;
rhsBefore = (rhs.getExp()>=0) ? rhs.getExp() + 1 : 1;
rhsAfter = (strlen(rhs.getMan())<=rhs.getExp()) ? abs((int)
(strlen(rhs.getMan()) + rhs.getExp() - thisBefore)) :
strlen(rhs.getMan()) - rhs.getExp() - 1;
resLen = ((thisBefore>=rhsBefore)?thisBefore:rhsBefore)+
((thisAfter>=rhsAfter)?thisAfter:rhsAfter);
ipfloat result(resLen + 1, rhs.getSign());
result.setManChar(resLen, '\0');
for(int i=0; i<resLen; i++)
result.setManChar(i,'0');
cout<<result.getMan()<<endl;
if(thisBefore>rhsBefore)
{
for(ctr = 0; ctr<(thisBefore-rhsBefore); ctr++)
result.setManChar(ctr, getMan()[ctr]);
}
if(rhsBefore>thisBefore)
{
for(ctr = 0; ctr<(rhsBefore-thisBefore); ctr++)
result.setManChar(ctr, rhs.getMan()[ctr]);
}
if(thisAfter>rhsAfter)
{
c=((thisBefore>=rhsBefore)?thisBefore:rhsBefore) + rhsAfter; //
tracks result
d=strlen(getMan())-(thisAfter-
rhsAfter); //tracks this
while(getMan()[d]!='\0')
{
result.setManChar(c, getMan()[d]);
c++;
d++;
}
}
if(rhsAfter>thisAfter)
{
c=((rhsBefore>=thisBefore)?rhsBefore:thisBefore) +
thisAfter; //tracks result
d=strlen(rhs.getMan())-(rhsAfter-
thisAfter); //tracks this
while(rhs.getMan()[d]!='\0')
{
result.setManChar(c, rhs.getMan()[d]);
c++;
d++;
}
}
cout<<result.getMan()<<endl;
return result;
}

Left out some extra constructors, an itoa() for linux compatibility,
and a function for output.

I should also note that although c = a + b doesn't work, ifploat c = a
+ b; does. If I initialize it to the other value, it seems to work.

I'd rather implement it without vectors, for performance reasons. I'll
be iterating this structure hundreds of thousands of times, and I need
the fastest I can get.
 
K

Kai-Uwe Bux

Martin said:
Martin said:
Hi, I need some help!
I'm writing an infinite-precision floating point library called
ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
quick overview: I'm storing numbers as a char* mantissa, an int
exponent and a sign. For example, the number "123.45" would have a
mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
storing it as "1.2345x10^2".
The mantissa is allocated dynamically at runtime to allow arbitrary
precision in my numbers. I'm having trouble with heap overflows; let
me explain in more detail: [snip]
Thank you! I'll gladly post code if it will help.

Since my crystal ball is in repair, code would be highly appreciated.

Short of that, consider using std::vector< char > instead of char*. Very
likely, your problems stem from mistakes in pointer handling (allocation,
deallocation, _and_ reallocation).
[snip]
I'd rather implement it without vectors, for performance reasons. I'll
be iterating this structure hundreds of thousands of times, and I need
the fastest I can get.

Before I dive into your code, let me point out that char* will very likely
be less efficient that vector<char> in your particular implementation. At
various places you use strlen to get the length of the mantissa. That reads
through the whole thing just to get the length. A vector will have that
number stored and return it in constant time. The notions that char* is
faster, is more often than not an illusion

Also, if performance matters, you should go with one of the available
libraries. You will have a very hard time to come even within ballpark
proximity of their performance.


Best

Kai-Uwe Bux
 
M

Martin the Third

Martin said:
Martin the Third wrote:
Hi, I need some help!
I'm writing an infinite-precision floating point library called
ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
quick overview: I'm storing numbers as a char* mantissa, an int
exponent and a sign. For example, the number "123.45" would have a
mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
storing it as "1.2345x10^2".
The mantissa is allocated dynamically at runtime to allow arbitrary
precision in my numbers. I'm having trouble with heap overflows; let
me explain in more detail:
[snip]
Thank you! I'll gladly post code if it will help.
Since my crystal ball is in repair, code would be highly appreciated.
Short of that, consider using std::vector< char > instead of char*. Very
likely, your problems stem from mistakes in pointer handling (allocation,
deallocation, _and_ reallocation).
[snip]
I'd rather implement it without vectors, for performance reasons. I'll
be iterating this structure hundreds of thousands of times, and I need
the fastest I can get.

Before I dive into your code, let me point out that char* will very likely
be less efficient that vector<char> in your particular implementation. At
various places you use strlen to get the length of the mantissa. That reads
through the whole thing just to get the length. A vector will have that
number stored and return it in constant time. The notions that char* is
faster, is more often than not an illusion

Also, if performance matters, you should go with one of the available
libraries. You will have a very hard time to come even within ballpark
proximity of their performance.

Best

Kai-Uwe Bux


I can cut the number of calls to strlen() down, by just doing it once
and storing that value, or by calculating it from other values already
determined. I'll look in to vectors for this, but it seems that the
overhead would do more harm than good. (And I say this not really
knowing what the overhead will be).

Performance matters, but so does learning how to do this. This is as
much about expanding my skills with c++ as it is about creating a
functional piece of code. In general, is there a better way to do
this? I was thinking that an int array might prove more useful, then I
could pack more numbers in each array element.

And in the end when I finish this, if it turns out to just be too
slow, I'm sure I'll use apfloat or gmp or something else. But I would
really like to make this work, first.
 
K

Kai-Uwe Bux

Martin said:
Martin said:
Hi, I need some help!
I'm writing an infinite-precision floating point library called
ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
quick overview: I'm storing numbers as a char* mantissa, an int
exponent and a sign. For example, the number "123.45" would have a
mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
storing it as "1.2345x10^2".
The mantissa is allocated dynamically at runtime to allow arbitrary
precision in my numbers. I'm having trouble with heap overflows; let
me explain in more detail: [snip]
Thank you! I'll gladly post code if it will help.

Since my crystal ball is in repair, code would be highly appreciated.

Short of that, consider using std::vector< char > instead of char*. Very
likely, your problems stem from mistakes in pointer handling (allocation,
deallocation, _and_ reallocation).

Best

Kai-Uwe Bux

Here are the relevant functions. Keep in mind that I stopped my +
function halfway through, so it doesn't actually add yet. (I have
older versions that do...)

Ok. Here are some comments on the code. Since you did not post a complete
example, I test whether they address the problem you are experiencing.

ipfloat::ipfloat(char* ch)
{
int len = strlen(ch);
int i = 0, j = 0; //i keeps place in ch, j keeps
place in man
sign = (ch[0]=='-')?'-':'+'; //assign the sign
int startZeros = 0, trailZeros = 0, manLen, decPos, decOffset;
while(ch!='\0' && ch!='.') //loop until we find the decimal,
whether explicitly typed or not
i++;
decPos = i;
i=0;
while(ch=='0' || ch=='.') //count number of 0's at the
beginning of the number
{
startZeros++; //this now contains 1 more than sz
if theres a dec
i++;
}
i=0;
while(ch[len-1-i]=='0') //count the number of trailing 0's
{
trailZeros++;
i++;
}
manLen = len - startZeros - trailZeros - 1;
man = (char*)malloc(sizeof(char)*(manLen+1));
for(i=startZeros; i<(startZeros+manLen+1); i++, j++) //copy
correct data from ch to man
{
if(ch!='.')
man[j] = ch;
else
j--;
}
man[j] = '\0';
decOffset = (startZeros>=decPos)?0:1; //combine this with
statement below? remove variable?
exp = decPos - startZeros - decOffset;
}
ipfloat::ipfloat(const ipfloat &rhs){
if(this!=&rhs)


Within a copy-constructor, this test can never yield true (unless you
explicitly invoke the copy constructor with placement new re-constructing
an element from itself, which might formally be undefined behavior anyway).
{
setMan(rhs.getMan());

It looks as though setMan frees the pointer member man, which is not yet
initialized in this constructor. You might be freeing memory that you don't
own. Try:

man=0;
setMan( rhs.getMan() );

or put "man()" in the initializer list.

You might want to run your program in valgrind to find where you use
un-initialized variables.

setExp(rhs.getExp());
setSign(rhs.getSign());
}
}
ipfloat::ipfloat(int size, char sgn)
{
man = (char*)malloc(sizeof(char)*size);
sign = sgn;
}
ipfloat::~ipfloat(){
free(man);
}
inline const char* const ipfloat::getMan() const{return man;}
inline const int ipfloat::getExp() const{return exp;}
inline const char ipfloat::getSign() const{return sign;}
inline void ipfloat::setManSize(int size)
{
free(man);
man = (char*)malloc(sizeof(char)*(size));

sizeof(char) is guaranteed to be 1 by the standard.

}
inline void ipfloat::setManChar(int pos, char ch)
{
man[pos] = ch;
}
inline void ipfloat::setMan(const char* const ch){
free(man);
man = (char*)malloc(sizeof(char)*(strlen(ch)+1));
strcpy(man, ch);
}
inline void ipfloat::setExp(int e){
exp = e;
}
inline void ipfloat::setSign(char ch){
sign = ch;
}
ipfloat &ipfloat::eek:perator =(const ipfloat &rhs){
if(this!=&rhs)
{
setMan(rhs.getMan());
setExp(rhs.getExp());
setSign(rhs.getSign());
}
return *this;
}
ipfloat ipfloat::eek:perator +(const ipfloat &rhs)
{
int ctr, c, d;
int thisBefore, thisAfter, rhsBefore, rhsAfter, resLen;
thisBefore = (getExp()>=0) ? getExp() + 1 : 1;
thisAfter = (strlen(getMan())<=getExp()) ? abs((int)
(strlen(getMan()) + getExp() - thisBefore)) : strlen(getMan()) -
getExp() - 1;
rhsBefore = (rhs.getExp()>=0) ? rhs.getExp() + 1 : 1;
rhsAfter = (strlen(rhs.getMan())<=rhs.getExp()) ? abs((int)
(strlen(rhs.getMan()) + rhs.getExp() - thisBefore)) :
strlen(rhs.getMan()) - rhs.getExp() - 1;
resLen = ((thisBefore>=rhsBefore)?thisBefore:rhsBefore)+
((thisAfter>=rhsAfter)?thisAfter:rhsAfter);
ipfloat result(resLen + 1, rhs.getSign());
result.setManChar(resLen, '\0');
for(int i=0; i<resLen; i++)
result.setManChar(i,'0');
cout<<result.getMan()<<endl;
if(thisBefore>rhsBefore)
{
for(ctr = 0; ctr<(thisBefore-rhsBefore); ctr++)
result.setManChar(ctr, getMan()[ctr]);
}
if(rhsBefore>thisBefore)
{
for(ctr = 0; ctr<(rhsBefore-thisBefore); ctr++)
result.setManChar(ctr, rhs.getMan()[ctr]);
}
if(thisAfter>rhsAfter)
{
c=((thisBefore>=rhsBefore)?thisBefore:rhsBefore) + rhsAfter; //
tracks result
d=strlen(getMan())-(thisAfter-
rhsAfter); //tracks this
while(getMan()[d]!='\0')
{
result.setManChar(c, getMan()[d]);
c++;
d++;
}
}
if(rhsAfter>thisAfter)
{
c=((rhsBefore>=thisBefore)?rhsBefore:thisBefore) +
thisAfter; //tracks result
d=strlen(rhs.getMan())-(rhsAfter-
thisAfter); //tracks this
while(rhs.getMan()[d]!='\0')
{
result.setManChar(c, rhs.getMan()[d]);
c++;
d++;
}
}
cout<<result.getMan()<<endl;
return result;
}

[snip]


Hope this helps

Kai-Uwe Bux
 
M

Martin the Third

I fixed the immediate problem with heap overflow, but I still
appreciate input and suggestions about the overall design. Thank you!
 
M

Martin the Third

Ah, I didn't see your reply when I posted last. I solved my problem by
setting man=NULL in the default constructor, so essentially what you
were saying, although I think I do need to go check everywhere I free
and possibly add that elsewhere. Freeing null causes no problems (by
design).

Thanks for pointing out the problem with the copy constructor; I mixed
up my check for self assignment. It should have been in the the
operator= overload.

I know sizeof(char) is supposed to be 1, is that actually enforced on
most systems? If so I'll remove all of my sizeof() calls. Thanks for
your input, it's well appreciated.
 
K

Kai-Uwe Bux

Martin the Third wrote:

[snip]
I know sizeof(char) is supposed to be 1, is that actually enforced on
most systems? If so I'll remove all of my sizeof() calls.
[snip]

It is required to be 1 on all conforming implementations. Moreover, I am not
aware of any implementation that is non-conforming in this regard.


Best

Kai-Uwe Bux
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,020
Latest member
GenesisGai

Latest Threads

Top