integer abs() overflow

J

Jonathan Lee

Hi all,
I have a piece of code which I kindof overlooked that converts a
signed long to an unsigned long plus sign bit. I was just doing this:

unsigned long abs(signed long a) {
unsigned long b;
b = static_cast<unsigned long>(b < 0 ? -b : b);
return b;
}

But then I remember that in two's complement, -b might not exist. For
example, -32768 is representable in a 16-bit two's complement type.
But -(-32768) = 32768 *isn't*.

To make things worse, there are other representations than two's
complement. So I figure the most portable way to find out if I'm out
of range is to check the relative magnitudes of numeric_limits<T>::max
() and min(). But then I've gone in a circle.. if I could check *that*
then I'd have some way of calculating absolute value. :/

Any one got a good idea how to do this cleanly and absolutely
portably? Or should I just add a check for the two's complement
extreme, and trust that the other possibilities are sign/magnitude and
one's complement where the problem won't happen? Should I just leave
those CPUs that use biased representation out? (Not that I can name a
processor that does that... )

--Jonathan
 
J

Jonathan Lee

Use std::abs.
It exhibits the same problem (see below). It seems
that std::abs returns a signed value, not unsigned.
I'm using a 32-bit Intel CPU with G++ 4.3.2

--Jonathan

// Example ----------------------------------------
#include <cstdlib>
#include <iostream>
#include <limits>

using ::std::cout;
using ::std::endl;

int main() {
int x = std::numeric_limits<int>::min();
cout << std::abs(x) << endl;
cout << std::abs(x + 1) << endl;
}

// Output ----------------------------------------
-2147483648
2147483647
 
J

Juha Nieminen

Jonathan said:
unsigned long abs(signed long a) {
unsigned long b;
b = static_cast<unsigned long>(b < 0 ? -b : b);
return b;
}

Btw, that looks needlessly complicated. What's wrong with:

unsigned long abs(signed long a) {
But then I remember that in two's complement, -b might not exist. For
example, -32768 is representable in a 16-bit two's complement type.
But -(-32768) = 32768 *isn't*.

This is literally a hardware limitation, and there is no way around
it, so I don't think there's nothing you can do. The only thing you have
to decide is whether the erroneous situation should be detected (eg. by
throwing an exception) or whether it should simply fail silently.
 
J

Jonathan Lee

  Btw, that looks needlessly complicated.

It is. It was just an example.
  This is literally a hardware limitation,

Well, yes and no. I recognize that 16-bit 2's complement can't hold
32768. But unsigned can, and I need my result to be unsigned anyway. I
can work around this by first incrementing by one, flipping the sign,
casting, then incrementing by one again. Ex.,

-32768 ---inc---> -32767 ---abs---> 32767
---cast---> 32767U ---inc---> 32768U

It's just I have to detect when I need to do this. I think Pete's
suggestion of using std::abs can do this portably.

--Jonathan
 
K

Kai-Uwe Bux

Jonathan said:
Hi all,
I have a piece of code which I kindof overlooked that converts a
signed long to an unsigned long plus sign bit. I was just doing this:

unsigned long abs(signed long a) {
unsigned long b;
b = static_cast<unsigned long>(b < 0 ? -b : b);
return b;
}

But then I remember that in two's complement, -b might not exist. For
example, -32768 is representable in a 16-bit two's complement type.
But -(-32768) = 32768 *isn't*.

To make things worse, there are other representations than two's
complement. So I figure the most portable way to find out if I'm out
of range is to check the relative magnitudes of numeric_limits<T>::max
() and min(). But then I've gone in a circle.. if I could check *that*
then I'd have some way of calculating absolute value. :/

First off, there are some guarantees from the C++ standard about the
representation of signed integers. I don't remember the exact reasoning,
but the upshot is that only three are available for a compliant
implementations: two complement, one complement, and signed magnitude.
Any one got a good idea how to do this cleanly and absolutely
portably?

What about

unsigned long abs ( signed long a ) {
if ( a < 0 ) {
unsigned long result = -1L - a;
++ result;
return ( result );
}
return ( a );
}

Or should I just add a check for the two's complement
extreme, and trust that the other possibilities are sign/magnitude and
one's complement where the problem won't happen?

Yup, those are the only possible choices.
Should I just leave
those CPUs that use biased representation out? (Not that I can name a
processor that does that... )

Ok, now it's time to look things up. Here is a quote from the standard
[3.9.1/7]:

The representations of integral types shall define values by use of a pure
binary numeration system.44) [Example: this International Standard permits
2?s complement, 1?s complement and signed magnitude representations for
integral types. ]

The footnote 44) defines:

A positional representation for integers that uses the binary digits 0 and
1, in which the values represented by successive bits are additive, begin
with 1, and are multiplied by successive integral power of 2, except
perhaps for the bit with the highest position.
(Adapted from the American National Dictionary for Information Processing
Systems.)


Best

Kai-Uwe Bux
 
J

Jonathan Lee

No, the code that uses it exhibits the same problem. <g>

Then I misunderstood you. I thought you were suggesting that it would
do the conversion for me. Instead you were suggesting that it would
allow me to detect values that were out of range. I suppose then I
could do

unsigned long my_abs(long a) {
unsigned long c = 0;
long const m = std::numeric_limits<long>::max();
long b = std::abs(a);

while (b < 0) { // move b into range for std::abs()
c += static_cast<unsigned long>(m);
b += m;
b = std::abs(b);
}
c += static_cast<unsigned long>(b);
return c;
}

I guess I would still have to check for c overflowing.

I thought there would be something simpler.

Regards
--Jonathan
 
J

Jonathan Lee

Yup, those are the only possible choices.

Perfect! That's exactly what I needed to know. Then I can just do
(much as you suggested):

return static_cast said:
Ok, now it's time to look things up. Here is a quote from the standard
[3.9.1/7]:

Much obliged!

--Jonathan
 
J

Jonathan Lee

  return static_cast<unsigned long>(a < 0 ? -(a + 1L) + 1L : a);

Ugh.. no can't do that. You had it right.

if (a < 0) {
return 1UL + static_cast<unsigned long>(-1L-a));
} else return static_cast<unsigned long>(a);
 
S

SG

Ugh.. no can't do that. You had it right.

if (a < 0) {
    return 1UL + static_cast<unsigned long>(-1L-a));

} else return static_cast<unsigned long>(a);

unsigned long my_abs(signed long x)
{
unsigned long u = x; // u === x mod 2^N
if (x<0) return ~u+1; else return u;
}

C++ allows other representations than two's complement for signed
numbers but a conversion to an unsigned integer always results in a
bit pattern that matches two's complement. This enables you to negate
the number with ~u+1.

I'd still prefer std::abs, though.

Cheers!
SG
 
J

James Kanze

No, the code that uses it exhibits the same problem. <g>
Yes, that's what it's required to do.

Note however that his original code didn't. It returned the
corresponding unsigned type.
If the returned value is negative then the result wasn't
representable as a non-negative value. The point, though, is
that this is managed for you by the standard library, so you
don't have to deal with the details of your particular
implementation's representation.

The standard library function has undefined behavior if passed a
value for which the absolute value is not representable. That's
not necessarily "dealing with the details of your particular
implementation's representation" in an acceptable fashion.

Given that the standard does defined conversion of signed to
unsigned, I think something like the following would work:

return ( input >= - std::numeric_limits< long >::max()
&& input < 0 )
? static_cast< unsigned long >( -input )
: static_cast< unsigned long >( input ) ;

Of course, in practice, if the machine doesn't use 2's
complement, or uses 2's complement without any checking, then
just casting the results of the standard abs() to the unsigned
type will do the trick. And off hand, I've never heard of an
implementation which did any checking here.
 
S

SG

The standard library function has undefined behavior if passed a
value for which the absolute value is not representable.  That's
not necessarily "dealing with the details of your particular
implementation's representation" in an acceptable fashion.

Given that the standard does defined conversion of signed to
unsigned, I think something like the following would work:

    return ( input >= - std::numeric_limits< long >::max()
             && input < 0 )
        ?   static_cast< unsigned long >( -input )
        :   static_cast< unsigned long >( input ) ;

or better yet:

unsigned long my_abs(signed long input) {
return (input<0)
? -static_cast<unsigned long>(input)
: static_cast<unsigned long>(input);
}

and without a branch:

unsigned long my_abs(signed long input) {
const unsigned long u = input;
const unsigned sign = u >> (
std::numeric_limits<unsigned long>::digits-1);
return (u ^ -sign) + sign;
}

Cheers!
SG
 
S

SG

and without a branch:

  unsigned long my_abs(signed long input) {
    const unsigned long u = input;
    const unsigned sign = u >> (
^
I forgot "long" here
      std::numeric_limits<unsigned long>::digits-1);
    return (u ^ -sign) + sign;
  }

Oops. The variable called "sign" should also be an unsigned long. The
code above fails in case the number of bits in unsigned and unsigned
long differs.

Cheers!
SG
 
K

Kai-Uwe Bux

SG said:
unsigned long my_abs(signed long x)
{
unsigned long u = x; // u === x mod 2^N
if (x<0) return ~u+1; else return u;

I like the idea, but why not

if (x<0) return -u; else return u;
}

C++ allows other representations than two's complement for signed
numbers but a conversion to an unsigned integer always results in a
bit pattern that matches two's complement. This enables you to negate
the number with ~u+1.

True, but using unary - conveys intent more clearly.


Best

Kai-Uwe Bux
 
P

Pascal J. Bourguignon

Jonathan Lee said:
It exhibits the same problem (see below). It seems
that std::abs returns a signed value, not unsigned.
I'm using a 32-bit Intel CPU with G++ 4.3.2

--Jonathan

// Example ----------------------------------------
#include <cstdlib>
#include <iostream>
#include <limits>

using ::std::cout;
using ::std::endl;

int main() {
int x = std::numeric_limits<int>::min();
cout << std::abs(x) << endl;
cout << std::abs(x + 1) << endl;
}

// Output ----------------------------------------
-2147483648
2147483647


#include <limits.h>

/* INT_MIN (-INT_MAX - 1) */
/* INT_MAX 2147483647 */
/* UINT_MAX 4294967295U */

unsigned int uabs(int x){
if(0<=x){
return(x);
}else{
if(-INT_MAX<=INT_MIN){
return(-x);
}else{
unsigned int d=-(x+INT_MAX);
return(d+INT_MAX);
}
}
}

#include <iostream>
using namespace std;
int main(){
cout<<"|INT_MIN| ="<<uabs(INT_MIN)<<endl;
cout<<"|INT_MIN+1| ="<<uabs(INT_MIN+1)<<endl;
return(0);
}

/*
-*- mode: compilation; default-directory: "/tmp/" -*-
Compilation started at Mon Jul 6 15:04:44

SRC="/tmp/s.c++" ; EXE="s" ; g++ -g3 -ggdb3 -o ${EXE} ${SRC} && ./${EXE} && echo status = $?
|INT_MIN| =2147483648
|INT_MIN+1| =2147483647
status = 0

Compilation finished at Mon Jul 6 15:04:45
*/
 
J

Jonathan Lee

Whoops, I utterly missed your point.

That's fine. I appreciate the advice anyway.
Of course, this assumes that -std::numeric_limits<long>::max() can be
represented as a long.

See, and that's what got me started on the whole question. Several
people since last night have posted code of the form

if (x < -maxLong) { /* blah */ }

but I wasn't sure at the time of the original post that -maxLong was
guaranteed to be representable. Similar to what you pointed out, I
thought there could be a type like {-1, 0, 1, 2} where -max = -2 isn't
representable. It seems now that if we're limited to the three
representations then that situation could never happen, and the "if"
just above is always valid.

Regards
--Jonathan
 
J

Jonathan Lee

or better yet:

...

and without a branch:

...

Cheers!
SG

Thanks SG, for the alternatives. In your previous post you said that
you would "still prefer std::abs"; could you say why? Several
solutions to this problem have been posted without it, and they appear
to be valid (read: standard conforming).

--Jonathan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top