From unsigned int to int and back

Philliam Auriemma

I am writing a program and I have a function that returns an unsigned
int, but I need to turn it into an int. Can I just do that by doing
like

unsigned int a = 1;
int b = (int)a;

As long as a is not too large to fit inside a regular int? Or will
this lose some data or something?
Also when getting an int, can I just do the opposite of the above:

unsigned int c = (unsigned int)b;

?
 
Robert Fendt

As long as a is not too large to fit inside a regular int? Or will
this lose some data or something?

Note: actually your question is more C than C++. To add a C++
specific note, consider using static_cast<>, reinterpret_cast<>
and the like instead of the 'automagic' C cast. The C++ form
allows you to specify exactly what you want to achieve, thus
enabling better diagnostics from the compiler.

Casting if the target is large enough for the number will always
work. If it is not, it will still 'work', but the result is
unspecified. I.e. this works without problem (on a compiler that
supports long long, as do at least GCC and VC++):

unsigned long long val1 = 10;
char val2 = static_cast<char>(val1);

but the result of this is another matter:

unsigned short x = 16384;
unsigned char y = static_cast<unsigned char>(x);

I think the standard specifies for integer casts that the number
is 'capped' on a bit level. Usually, the above example will thus
yield y==0, since 16384==100000000000000b;

However just casting from signed to unsigned is (IIRC)
guaranteed to work even without loss of data, meaning that for
any int 'x' the following holds true:

static_cast<int>(static_cast<unsigned>(x)) == x

This even works if for example x is negative. In that case
though there's another caveat: the semantics of interpreting a
negative number as unsigned are AFAIK implementation-defined.
The sizes of most base types (and AFAIK even the exact bit
representation format of a neg. number) are not enforced by the
standard (it only specifies constraints on value ranges), so one
has to be extra-careful.

Regards,
Robert
 
James Kanze

I am writing a program and I have a function that returns an
unsigned int, but I need to turn it into an int. Can I just do
that by doing like

unsigned int a = 1;
b = (int)a;

As long as a is not too large to fit inside a regular int?

If the value can be represented in an int (i.e. it is less than
or equal to INT_MAX), the conversion is not only well formed,
but fully defined. If the value is in the range
INT_MAX+1...UINT_MAX, the results of the conversion are
implementation defined, and may result in an implementation
defined signal, but in practice, on most machines today,
you'll end up with a negative number that, when converted back
to an unsigned, will result in the same value.
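
To stay in the fully defined case, the range check can be made explicit before converting. This is a minimal sketch, not something posted in the thread: the helper names `fits_in_int` and `to_int_or` are hypothetical, and returning a caller-supplied fallback is just one possible policy for out-of-range values.

```cpp
#include <cassert>   // only needed for assert-based checks like the one below
#include <climits>

// Hypothetical helper: true when u can be represented in an int,
// i.e. when u <= INT_MAX.
bool fits_in_int(unsigned int u)
{
    return u <= static_cast<unsigned int>(INT_MAX);
}

// Hypothetical helper: converts when the value fits; otherwise returns the
// caller-supplied fallback instead of relying on the
// implementation-defined result of an out-of-range conversion.
int to_int_or(unsigned int u, int fallback)
{
    return fits_in_int(u) ? static_cast<int>(u) : fallback;
}
```

With these, `assert(to_int_or(1u, -1) == 1);` holds, and a value such as `UINT_MAX` yields the fallback rather than an implementation-defined number.
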

Or will this lose some data or something?
Also when getting an int, can I just do the opposite of the above:

c = (unsigned int)b;

That's well defined. If the value is negative, the result is
UINT_MAX+1 + b. Basically, the same bit pattern on 2's
complement machines (which are most of those you're likely to
see---as far as I know, Unisys mainframes are the only non 2's
complement machines being sold today).
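
The formula UINT_MAX+1 + b can be spelled out in unsigned arithmetic and compared with what the cast produces. This is a small sketch; `wrapped_value` is a hypothetical name, and the subtraction is arranged so that no intermediate step overflows an int.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: computes UINT_MAX + 1 + b for a negative b without
// overflowing, which should equal static_cast<unsigned int>(b).
unsigned int wrapped_value(int b)
{
    assert(b < 0);
    // -(b + 1) lies in [0, INT_MAX], so it is safe even for b == INT_MIN.
    unsigned int distance = static_cast<unsigned int>(-(b + 1));
    return UINT_MAX - distance;   // == UINT_MAX + 1 + b
}
```

For instance, `wrapped_value(-1)` is `UINT_MAX`, the same value as `static_cast<unsigned int>(-1)`.
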
 
James Kanze

And thus spake Philliam Auriemma
Fri, 12 Feb 2010 18:19:20 -0800 (PST):
Note: actually your question is more C than C++.

What makes you say that? The code is fully legal C++, and the
formal requirements for it are slightly different in C and in
C++ (although in practice, almost everyone does something that
is legal in both).

To add a C++ specific note, consider using static_cast<>,
reinterpret_cast<>

The only new-style cast which would be legal here would be
static_cast, and most people I know continue to prefer either
the old style casts or function style casts when neither
pointers nor references are involved.

Casting if the target is large enough for the number will
always work.

For some definition of "work" and "large enough". If floating
point is involved, the rules obviously change.

If it is not, it will still 'work', but the result is
unspecified. I.e. this works without problem (on a compiler
that supports long long, as do at least GCC and VC++):

unsigned long long val1 = 10;
char val2 = static_cast<char>(val1);

This is guaranteed to work, on all compilers, since 10 can
always be represented in a char.

but the result of this is another matter:

unsigned short x = 16384;
unsigned char y = static_cast<unsigned char>(x);

This is also guaranteed to "work", with strictly defined
semantics.

I think the standard specifies for integer casts that the
number is 'capped' on a bit level. Usually, the above example
will thus yield y==0, since 16384==100000000000000b;

The results of the conversion are the initial value modulo 2^n,
where n is the number of significant bits in a char (at least
8).
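
The modulo rule can be checked against the cast directly. This is a sketch with hypothetical helper names (`narrow`, `modulo_uchar`); it assumes unsigned char is narrower than unsigned int, so that `UCHAR_MAX + 1u` does not wrap to zero.

```cpp
#include <cassert>
#include <climits>

// The conversion under discussion: well defined for an unsigned target.
unsigned char narrow(unsigned short x)
{
    return static_cast<unsigned char>(x);
}

// Hypothetical helper: the same result computed as x modulo 2^n, where
// 2^n == UCHAR_MAX + 1 (256 when a char has 8 bits).
// Assumption: UCHAR_MAX < UINT_MAX, so the sum below does not wrap.
unsigned int modulo_uchar(unsigned short x)
{
    return x % (UCHAR_MAX + 1u);
}
```

With 8-bit chars, `narrow(16384)` is 0 because 16384 is a multiple of 256, and both functions agree for every unsigned short value.
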

However just casting from signed to unsigned is (IIRC)
guaranteed to work even without loss of data,

For some meaning of "loss of data". There's no guarantee of a
round trip.

meaning that for any int 'x' the following holds true:

static_cast<int>(static_cast<unsigned>(x)) == x

This even works if for example x is negative.

There are machines where this doesn't hold.
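
If the round trip matters, the conversion back can be written so that it only ever converts in-range values. This is a sketch of one common idiom, not something posted in the thread; `to_int_wrapping` is a hypothetical name, and it is only meant for values of u that were produced by converting an int to unsigned in the first place.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: recovers the int x for which
// static_cast<unsigned int>(x) == u, without performing an out-of-range
// unsigned -> int conversion.
// Assumption: u originally came from converting an int to unsigned int.
int to_int_wrapping(unsigned int u)
{
    if (u <= static_cast<unsigned int>(INT_MAX)) {
        return static_cast<int>(u);          // value fits: fully defined
    }
    // Here u == UINT_MAX + 1 + x for some negative x, so
    // UINT_MAX - u == -(x + 1), which fits in an int.
    return -static_cast<int>(UINT_MAX - u) - 1;
}
```

For example, `to_int_wrapping(static_cast<unsigned int>(-1))` gives back -1 using only the well-defined direction of the conversion rules.
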

In that case though there's another caveat: the semantics of
interpreting a negative number as unsigned are AFAIK
implementation-defined.

The semantics of interpreting any type as another type is
undefined behavior, unless a character type is involved.
According to the standard; the standard also implies, in its
language concerning reinterpret_cast, that implementations
should do something reasonable (because otherwise
reinterpret_cast is useless).
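
To look at the bit pattern itself, as opposed to converting the value, one option that stays within defined behavior is to copy the bytes with std::memcpy, which works on the object representation as if through unsigned char. This is a minimal sketch; `bits_of` is a hypothetical name, and what the result means for negative inputs depends on the machine's representation.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns the object representation of x read as an
// unsigned int, copied byte by byte instead of reinterpreting the object
// through a pointer of another type.
unsigned int bits_of(int x)
{
    static_assert(sizeof(unsigned int) == sizeof(int),
                  "int and unsigned int must have the same size");
    unsigned int u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}
```

On a 2's complement machine without padding bits, `bits_of(-1)` equals `static_cast<unsigned int>(-1)`; elsewhere the two may differ, which is the distinction between converting a value and interpreting its bits.
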
 
Philliam Auriemma

Wow thank you all for the thoughtful and extremely thorough replies.
This makes my life a lot easier.
 
