Heap overflow/corruption problem in an arbitrary precision class

Discussion in 'C++' started by Martin the Third, Jun 12, 2008.

  1. Hi, I need some help!

    I'm writing an infinite-precision floating point library called
    ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
    quick overview: I'm storing numbers as a char* mantissa, an int
    exponent and a sign. For example, the number "123.45" would have a
    mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
    storing it as "1.2345x10^2".

    The mantissa is allocated dynamically at runtime to allow arbitrary
    precision in my numbers. I'm having trouble with heap overflows; let
    me explain in more detail:

    I'm trying to write a += or + operator overload. Originally I wrote a
    += function that simply expanded the mantissa of the left operand to
    hold the entire result, and then return itself. This causes a heap
    overflow, for example, with the following numbers:
    0.0000000000000000001 + 13123123123.123123123

    Here, the mantissa of the left operand is "1" but the right is
    "13123123123123123123". When the memory is allocated, the left
    mantissa is allocated, then the right one is allocated, so their
    addresses are very close - the left operand has a tiny mantissa. When
    I expand that mantissa to hold the entire result, this obviously
    overflows into the right operand's mantissa, and I either get really
    funky answers or a heap overflow and a segfault.

    The solution, I thought, was to write a + operator overload instead.
    Then, I'd have a left and right operand, but I wouldn't touch them at
    all - instead I'd create a local "ipfloat result" and then store the
    answer in result, and return result. This works!

    ipfloat a, b;
    a+b; //no errors!

    But...when used in a real situation:
    ipfloat a(1232.12312);
    ipfloat b(123123.231231);
    ipfloat c;
    c = a+b;

    There are, once again, heap overflows causing corruption. Here, the
    ipfloat result that I'm returning from the addition is trying to be
    stored in c, but c was allocated _before_ the operation took place and
    so before result was created. Then the assignment operator tries to
    expand c (which has a size of 2 - "0\0") to hold the whole answer, and
    overwrites data in result, so again I have corruption, or overflow (or
    both).

    The only solution I can think of is to use memmove() in my assignment
    operator. Is that viable? I'd really appreciate any help with how to
    restructure this so that I can make it stable.

    Thank you! I'll gladly post code if it will help.
     
    Martin the Third, Jun 12, 2008
    #1
    1. Advertising

  2. Martin the Third

    Kai-Uwe Bux Guest

    Martin the Third wrote:

    > Hi, I need some help!
    >
    > I'm writing an infinite-precision floating point library called
    > ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
    > quick overview: I'm storing numbers as a char* mantissa, an int
    > exponent and a sign. For example, the number "123.45" would have a
    > mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
    > storing it as "1.2345x10^2".
    >
    > The mantissa is allocated dynamically at runtime to allow arbitrary
    > precision in my numbers. I'm having trouble with heap overflows; let
    > me explain in more detail:

    [snip]
    > Thank you! I'll gladly post code if it will help.


    Since my crystal ball is in repair, code would be highly appreciated.

    Short of that, consider using std::vector< char > instead of char*. Very
    likely, your problems stem from mistakes in pointer handling (allocation,
    deallocation, _and_ reallocation).


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Jun 13, 2008
    #2
    1. Advertising

  3. On Jun 12, 6:11 pm, Kai-Uwe Bux <> wrote:
    > Martin the Third wrote:
    > > Hi, I need some help!

    >
    > > I'm writing an infinite-precision floating point library called
    > > ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
    > > quick overview: I'm storing numbers as a char* mantissa, an int
    > > exponent and a sign. For example, the number "123.45" would have a
    > > mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
    > > storing it as "1.2345x10^2".

    >
    > > The mantissa is allocated dynamically at runtime to allow arbitrary
    > > precision in my numbers. I'm having trouble with heap overflows; let
    > > me explain in more detail:

    > [snip]
    > > Thank you! I'll gladly post code if it will help.

    >
    > Since my crystal ball is in repair, code would be highly appreciated.
    >
    > Short of that, consider using std::vector< char > instead of char*. Very
    > likely, your problems stem from mistakes in pointer handling (allocation,
    > deallocation, _and_ reallocation).
    >
    > Best
    >
    > Kai-Uwe Bux


    Here are the relevant functions. Keep in mind that I stopped my +
    function halfway through, so it doesn't actually add yet. (I have
    older versions that do...)

    ipfloat::ipfloat(char* ch)
    {
    int len = strlen(ch);
    int i = 0, j = 0; //i keeps place in ch, j keeps
    place in man
    sign = (ch[0]=='-')?'-':'+'; //assign the sign
    int startZeros = 0, trailZeros = 0, manLen, decPos, decOffset;
    while(ch!='\0' && ch!='.') //loop until we find the decimal,
    whether explicitly typed or not
    i++;
    decPos = i;
    i=0;
    while(ch=='0' || ch=='.') //count number of 0's at the
    beginning of the number
    {
    startZeros++; //this now contains 1 more than sz
    if theres a dec
    i++;
    }
    i=0;
    while(ch[len-1-i]=='0') //count the number of trailing 0's
    {
    trailZeros++;
    i++;
    }
    manLen = len - startZeros - trailZeros - 1;
    man = (char*)malloc(sizeof(char)*(manLen+1));
    for(i=startZeros; i<(startZeros+manLen+1); i++, j++) //copy
    correct data from ch to man
    {
    if(ch!='.')
    man[j] = ch;
    else
    j--;
    }
    man[j] = '\0';
    decOffset = (startZeros>=decPos)?0:1; //combine this with
    statement below? remove variable?
    exp = decPos - startZeros - decOffset;
    }
    ipfloat::ipfloat(const ipfloat &rhs){
    if(this!=&rhs)
    {
    setMan(rhs.getMan());
    setExp(rhs.getExp());
    setSign(rhs.getSign());
    }
    }
    ipfloat::ipfloat(int size, char sgn)
    {
    man = (char*)malloc(sizeof(char)*size);
    sign = sgn;
    }
    ipfloat::~ipfloat(){
    free(man);
    }
    inline const char* const ipfloat::getMan() const{return man;}
    inline const int ipfloat::getExp() const{return exp;}
    inline const char ipfloat::getSign() const{return sign;}
    inline void ipfloat::setManSize(int size)
    {
    free(man);
    man = (char*)malloc(sizeof(char)*(size));
    }
    inline void ipfloat::setManChar(int pos, char ch)
    {
    man[pos] = ch;
    }
    inline void ipfloat::setMan(const char* const ch){
    free(man);
    man = (char*)malloc(sizeof(char)*(strlen(ch)+1));
    strcpy(man, ch);
    }
    inline void ipfloat::setExp(int e){
    exp = e;
    }
    inline void ipfloat::setSign(char ch){
    sign = ch;
    }
    ipfloat &ipfloat::eek:perator =(const ipfloat &rhs){
    if(this!=&rhs)
    {
    setMan(rhs.getMan());
    setExp(rhs.getExp());
    setSign(rhs.getSign());
    }
    return *this;
    }
    ipfloat ipfloat::eek:perator +(const ipfloat &rhs)
    {
    int ctr, c, d;
    int thisBefore, thisAfter, rhsBefore, rhsAfter, resLen;
    thisBefore = (getExp()>=0) ? getExp() + 1 : 1;
    thisAfter = (strlen(getMan())<=getExp()) ? abs((int)
    (strlen(getMan()) + getExp() - thisBefore)) : strlen(getMan()) -
    getExp() - 1;
    rhsBefore = (rhs.getExp()>=0) ? rhs.getExp() + 1 : 1;
    rhsAfter = (strlen(rhs.getMan())<=rhs.getExp()) ? abs((int)
    (strlen(rhs.getMan()) + rhs.getExp() - thisBefore)) :
    strlen(rhs.getMan()) - rhs.getExp() - 1;
    resLen = ((thisBefore>=rhsBefore)?thisBefore:rhsBefore)+
    ((thisAfter>=rhsAfter)?thisAfter:rhsAfter);
    ipfloat result(resLen + 1, rhs.getSign());
    result.setManChar(resLen, '\0');
    for(int i=0; i<resLen; i++)
    result.setManChar(i,'0');
    cout<<result.getMan()<<endl;
    if(thisBefore>rhsBefore)
    {
    for(ctr = 0; ctr<(thisBefore-rhsBefore); ctr++)
    result.setManChar(ctr, getMan()[ctr]);
    }
    if(rhsBefore>thisBefore)
    {
    for(ctr = 0; ctr<(rhsBefore-thisBefore); ctr++)
    result.setManChar(ctr, rhs.getMan()[ctr]);
    }
    if(thisAfter>rhsAfter)
    {
    c=((thisBefore>=rhsBefore)?thisBefore:rhsBefore) + rhsAfter; //
    tracks result
    d=strlen(getMan())-(thisAfter-
    rhsAfter); //tracks this
    while(getMan()[d]!='\0')
    {
    result.setManChar(c, getMan()[d]);
    c++;
    d++;
    }
    }
    if(rhsAfter>thisAfter)
    {
    c=((rhsBefore>=thisBefore)?rhsBefore:thisBefore) +
    thisAfter; //tracks result
    d=strlen(rhs.getMan())-(rhsAfter-
    thisAfter); //tracks this
    while(rhs.getMan()[d]!='\0')
    {
    result.setManChar(c, rhs.getMan()[d]);
    c++;
    d++;
    }
    }
    cout<<result.getMan()<<endl;
    return result;
    }

    Left out some extra constructors, an itoa() for linux compatibility,
    and a function for output.

    I should also note that although c = a + b doesn't work, ifploat c = a
    + b; does. If I initialize it to the other value, it seems to work.

    I'd rather implement it without vectors, for performance reasons. I'll
    be iterating this structure hundreds of thousands of times, and I need
    the fastest I can get.
     
    Martin the Third, Jun 13, 2008
    #3
  4. Martin the Third

    Kai-Uwe Bux Guest

    Martin the Third wrote:

    > On Jun 12, 6:11 pm, Kai-Uwe Bux <> wrote:
    >> Martin the Third wrote:
    >> > Hi, I need some help!

    >>
    >> > I'm writing an infinite-precision floating point library called
    >> > ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
    >> > quick overview: I'm storing numbers as a char* mantissa, an int
    >> > exponent and a sign. For example, the number "123.45" would have a
    >> > mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
    >> > storing it as "1.2345x10^2".

    >>
    >> > The mantissa is allocated dynamically at runtime to allow arbitrary
    >> > precision in my numbers. I'm having trouble with heap overflows; let
    >> > me explain in more detail:

    >> [snip]
    >> > Thank you! I'll gladly post code if it will help.

    >>
    >> Since my crystal ball is in repair, code would be highly appreciated.
    >>
    >> Short of that, consider using std::vector< char > instead of char*. Very
    >> likely, your problems stem from mistakes in pointer handling (allocation,
    >> deallocation, _and_ reallocation).

    [snip]
    > I'd rather implement it without vectors, for performance reasons. I'll
    > be iterating this structure hundreds of thousands of times, and I need
    > the fastest I can get.


    Before I dive into your code, let me point out that char* will very likely
    be less efficient that vector<char> in your particular implementation. At
    various places you use strlen to get the length of the mantissa. That reads
    through the whole thing just to get the length. A vector will have that
    number stored and return it in constant time. The notions that char* is
    faster, is more often than not an illusion

    Also, if performance matters, you should go with one of the available
    libraries. You will have a very hard time to come even within ballpark
    proximity of their performance.


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Jun 13, 2008
    #4
  5. On Jun 12, 9:26 pm, Kai-Uwe Bux <> wrote:
    > Martin the Third wrote:
    > > On Jun 12, 6:11 pm, Kai-Uwe Bux <> wrote:
    > >> Martin the Third wrote:
    > >> > Hi, I need some help!

    >
    > >> > I'm writing an infinite-precision floating point library called
    > >> > ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
    > >> > quick overview: I'm storing numbers as a char* mantissa, an int
    > >> > exponent and a sign. For example, the number "123.45" would have a
    > >> > mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
    > >> > storing it as "1.2345x10^2".

    >
    > >> > The mantissa is allocated dynamically at runtime to allow arbitrary
    > >> > precision in my numbers. I'm having trouble with heap overflows; let
    > >> > me explain in more detail:
    > >> [snip]
    > >> > Thank you! I'll gladly post code if it will help.

    >
    > >> Since my crystal ball is in repair, code would be highly appreciated.

    >
    > >> Short of that, consider using std::vector< char > instead of char*. Very
    > >> likely, your problems stem from mistakes in pointer handling (allocation,
    > >> deallocation, _and_ reallocation).

    > [snip]
    > > I'd rather implement it without vectors, for performance reasons. I'll
    > > be iterating this structure hundreds of thousands of times, and I need
    > > the fastest I can get.

    >
    > Before I dive into your code, let me point out that char* will very likely
    > be less efficient that vector<char> in your particular implementation. At
    > various places you use strlen to get the length of the mantissa. That reads
    > through the whole thing just to get the length. A vector will have that
    > number stored and return it in constant time. The notions that char* is
    > faster, is more often than not an illusion
    >
    > Also, if performance matters, you should go with one of the available
    > libraries. You will have a very hard time to come even within ballpark
    > proximity of their performance.
    >
    > Best
    >
    > Kai-Uwe Bux



    I can cut the number of calls to strlen() down, by just doing it once
    and storing that value, or by calculating it from other values already
    determined. I'll look in to vectors for this, but it seems that the
    overhead would do more harm than good. (And I say this not really
    knowing what the overhead will be).

    Performance matters, but so does learning how to do this. This is as
    much about expanding my skills with c++ as it is about creating a
    functional piece of code. In general, is there a better way to do
    this? I was thinking that an int array might prove more useful, then I
    could pack more numbers in each array element.

    And in the end when I finish this, if it turns out to just be too
    slow, I'm sure I'll use apfloat or gmp or something else. But I would
    really like to make this work, first.
     
    Martin the Third, Jun 13, 2008
    #5
  6. Martin the Third

    Kai-Uwe Bux Guest

    Martin the Third wrote:

    > On Jun 12, 6:11 pm, Kai-Uwe Bux <> wrote:
    >> Martin the Third wrote:
    >> > Hi, I need some help!

    >>
    >> > I'm writing an infinite-precision floating point library called
    >> > ipfloat (I know infinite is a misnomer - but arbitrary was taken). A
    >> > quick overview: I'm storing numbers as a char* mantissa, an int
    >> > exponent and a sign. For example, the number "123.45" would have a
    >> > mantissa of 12345, an exponent of 2, and a sign of +. I'm essentially
    >> > storing it as "1.2345x10^2".

    >>
    >> > The mantissa is allocated dynamically at runtime to allow arbitrary
    >> > precision in my numbers. I'm having trouble with heap overflows; let
    >> > me explain in more detail:

    >> [snip]
    >> > Thank you! I'll gladly post code if it will help.

    >>
    >> Since my crystal ball is in repair, code would be highly appreciated.
    >>
    >> Short of that, consider using std::vector< char > instead of char*. Very
    >> likely, your problems stem from mistakes in pointer handling (allocation,
    >> deallocation, _and_ reallocation).
    >>
    >> Best
    >>
    >> Kai-Uwe Bux

    >
    > Here are the relevant functions. Keep in mind that I stopped my +
    > function halfway through, so it doesn't actually add yet. (I have
    > older versions that do...)


    Ok. Here are some comments on the code. Since you did not post a complete
    example, I test whether they address the problem you are experiencing.


    > ipfloat::ipfloat(char* ch)
    > {
    > int len = strlen(ch);
    > int i = 0, j = 0; //i keeps place in ch, j keeps
    > place in man
    > sign = (ch[0]=='-')?'-':'+'; //assign the sign
    > int startZeros = 0, trailZeros = 0, manLen, decPos, decOffset;
    > while(ch!='\0' && ch!='.') //loop until we find the decimal,
    > whether explicitly typed or not
    > i++;
    > decPos = i;
    > i=0;
    > while(ch=='0' || ch=='.') //count number of 0's at the
    > beginning of the number
    > {
    > startZeros++; //this now contains 1 more than sz
    > if theres a dec
    > i++;
    > }
    > i=0;
    > while(ch[len-1-i]=='0') //count the number of trailing 0's
    > {
    > trailZeros++;
    > i++;
    > }
    > manLen = len - startZeros - trailZeros - 1;
    > man = (char*)malloc(sizeof(char)*(manLen+1));
    > for(i=startZeros; i<(startZeros+manLen+1); i++, j++) //copy
    > correct data from ch to man
    > {
    > if(ch!='.')
    > man[j] = ch;
    > else
    > j--;
    > }
    > man[j] = '\0';
    > decOffset = (startZeros>=decPos)?0:1; //combine this with
    > statement below? remove variable?
    > exp = decPos - startZeros - decOffset;
    > }
    > ipfloat::ipfloat(const ipfloat &rhs){
    > if(this!=&rhs)


    Within a copy-constructor, this test can never yield true (unless you
    explicitly invoke the copy constructor with placement new re-constructing
    an element from itself, which might formally be undefined behavior anyway).

    > {
    > setMan(rhs.getMan());


    It looks as though setMan frees the pointer member man, which is not yet
    initialized in this constructor. You might be freeing memory that you don't
    own. Try:

    man=0;
    setMan( rhs.getMan() );

    or put "man()" in the initializer list.

    You might want to run your program in valgrind to find where you use
    un-initialized variables.


    > setExp(rhs.getExp());
    > setSign(rhs.getSign());
    > }
    > }
    > ipfloat::ipfloat(int size, char sgn)
    > {
    > man = (char*)malloc(sizeof(char)*size);
    > sign = sgn;
    > }
    > ipfloat::~ipfloat(){
    > free(man);
    > }
    > inline const char* const ipfloat::getMan() const{return man;}
    > inline const int ipfloat::getExp() const{return exp;}
    > inline const char ipfloat::getSign() const{return sign;}
    > inline void ipfloat::setManSize(int size)
    > {
    > free(man);
    > man = (char*)malloc(sizeof(char)*(size));


    sizeof(char) is guaranteed to be 1 by the standard.


    > }
    > inline void ipfloat::setManChar(int pos, char ch)
    > {
    > man[pos] = ch;
    > }
    > inline void ipfloat::setMan(const char* const ch){
    > free(man);
    > man = (char*)malloc(sizeof(char)*(strlen(ch)+1));
    > strcpy(man, ch);
    > }
    > inline void ipfloat::setExp(int e){
    > exp = e;
    > }
    > inline void ipfloat::setSign(char ch){
    > sign = ch;
    > }
    > ipfloat &ipfloat::eek:perator =(const ipfloat &rhs){
    > if(this!=&rhs)
    > {
    > setMan(rhs.getMan());
    > setExp(rhs.getExp());
    > setSign(rhs.getSign());
    > }
    > return *this;
    > }
    > ipfloat ipfloat::eek:perator +(const ipfloat &rhs)
    > {
    > int ctr, c, d;
    > int thisBefore, thisAfter, rhsBefore, rhsAfter, resLen;
    > thisBefore = (getExp()>=0) ? getExp() + 1 : 1;
    > thisAfter = (strlen(getMan())<=getExp()) ? abs((int)
    > (strlen(getMan()) + getExp() - thisBefore)) : strlen(getMan()) -
    > getExp() - 1;
    > rhsBefore = (rhs.getExp()>=0) ? rhs.getExp() + 1 : 1;
    > rhsAfter = (strlen(rhs.getMan())<=rhs.getExp()) ? abs((int)
    > (strlen(rhs.getMan()) + rhs.getExp() - thisBefore)) :
    > strlen(rhs.getMan()) - rhs.getExp() - 1;
    > resLen = ((thisBefore>=rhsBefore)?thisBefore:rhsBefore)+
    > ((thisAfter>=rhsAfter)?thisAfter:rhsAfter);
    > ipfloat result(resLen + 1, rhs.getSign());
    > result.setManChar(resLen, '\0');
    > for(int i=0; i<resLen; i++)
    > result.setManChar(i,'0');
    > cout<<result.getMan()<<endl;
    > if(thisBefore>rhsBefore)
    > {
    > for(ctr = 0; ctr<(thisBefore-rhsBefore); ctr++)
    > result.setManChar(ctr, getMan()[ctr]);
    > }
    > if(rhsBefore>thisBefore)
    > {
    > for(ctr = 0; ctr<(rhsBefore-thisBefore); ctr++)
    > result.setManChar(ctr, rhs.getMan()[ctr]);
    > }
    > if(thisAfter>rhsAfter)
    > {
    > c=((thisBefore>=rhsBefore)?thisBefore:rhsBefore) + rhsAfter; //
    > tracks result
    > d=strlen(getMan())-(thisAfter-
    > rhsAfter); //tracks this
    > while(getMan()[d]!='\0')
    > {
    > result.setManChar(c, getMan()[d]);
    > c++;
    > d++;
    > }
    > }
    > if(rhsAfter>thisAfter)
    > {
    > c=((rhsBefore>=thisBefore)?rhsBefore:thisBefore) +
    > thisAfter; //tracks result
    > d=strlen(rhs.getMan())-(rhsAfter-
    > thisAfter); //tracks this
    > while(rhs.getMan()[d]!='\0')
    > {
    > result.setManChar(c, rhs.getMan()[d]);
    > c++;
    > d++;
    > }
    > }
    > cout<<result.getMan()<<endl;
    > return result;
    > }


    [snip]


    Hope this helps

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Jun 13, 2008
    #6
  7. I fixed the immediate problem with heap overflow, but I still
    appreciate input and suggestions about the overall design. Thank you!
     
    Martin the Third, Jun 13, 2008
    #7
  8. Ah, I didn't see your reply when I posted last. I solved my problem by
    setting man=NULL in the default constructor, so essentially what you
    were saying, although I think I do need to go check everywhere I free
    and possibly add that elsewhere. Freeing null causes no problems (by
    design).

    Thanks for pointing out the problem with the copy constructor; I mixed
    up my check for self assignment. It should have been in the the
    operator= overload.

    I know sizeof(char) is supposed to be 1, is that actually enforced on
    most systems? If so I'll remove all of my sizeof() calls. Thanks for
    your input, it's well appreciated.
     
    Martin the Third, Jun 13, 2008
    #8
  9. Martin the Third

    Kai-Uwe Bux Guest

    Martin the Third wrote:

    [snip]
    > I know sizeof(char) is supposed to be 1, is that actually enforced on
    > most systems? If so I'll remove all of my sizeof() calls.

    [snip]

    It is required to be 1 on all conforming implementations. Moreover, I am not
    aware of any implementation that is non-conforming in this regard.


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Jun 13, 2008
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Thinkit
    Replies:
    8
    Views:
    528
    Thinkit
    Dec 24, 2003
  2. Replies:
    2
    Views:
    425
    Alf P. Steinbach
    Dec 13, 2004
  3. Replies:
    2
    Views:
    3,817
    Walter
    Apr 29, 2005
  4. Replies:
    1
    Views:
    614
  5. Replies:
    5
    Views:
    183
    Roger Pack
    Feb 5, 2009
Loading...

Share This Page