Question re. integral promotion, signed->unsigned


Fred Ma

Hello,

I've been trying to clear up a confusion about
integer promotions during expression evaluation.
I've checked the C FAQ and C++ FAQ (they are
different languages, but I was hoping one would
clear up the confusion), as well as googling
groups and the web.

The confusion concerns a binary operator where
both operands are of integral type.
If either operand is unsigned long, the other
operand is converted to unsigned long. The
confusing thing is that this has a higher
priority than checking for longs. Similarly,
unsigned ints are checked for before ints.
So it is likely that an int is converted to
an unsigned int, or a long is converted to an
unsigned long. I'm getting this from Schildt's
"The Complete Reference C++", 3rd ed.. I
realize his book is not exactly lauded here, but
I also checked an old draft of the c++ standard:
http://www.csci.csusb.edu/dick/c++std/cd2/conv.html,
and it says something similar (but not the same).
In item 1 of section 4.5, they also say that if
something can't fit into an int, it is interpreted
as an unsigned int.

Why is it OK to reinterpret a signed integral type as
its unsigned counterpart? If it's a negative number,
then all of a sudden, the reinterpretation yields a
large positive number. I'm assuming that the signed
number is modelled as a 2's complement number, even
though it may be implemented differently in hardware.
Likewise, the unsigned is modelled as a straight binary
coded decimal (BCD). Does the sign bit of the 2's
complement number become the MSB of the unsigned
number, with all other bits remaining unchanged?

I've TA'd digital electronics before, so I know that
the mechanics of adding a 2's complement negative
number are the same as adding the BCD interpretation of
those same bits (as was pointed out in one of my google
groups searches). It seems (to me, and perhaps
wrongly) that the rule of converting signed integral
types to their unsigned counterparts is due to lack of
a better way to handle it. Perhaps the reasoning was
that the resulting bits would still be accurate if one
operand was negative while the other was positive.

I was wondering if anyone could confirm or correct my
understanding of this, and maybe offer some insight as
to why this order of promotion is desirable?

Fred

P.S. An interesting thing is that Schildt points out an
exception. If one operand is a long while the other is
an unsigned int whose value can't fit into a long (e.g.
on a platform where both had the same number of bits),
then both operands convert to unsigned longs. Again, the
conversion from a signed integral type to an unsigned
counterpart.
 

Dan Pop

Fred Ma said:
The confusion concerns a binary operator where
both operands are of integral type.
If either operand is unsigned long, the other
operand is converted to unsigned long. The
confusing thing is that this has a higher
priority than checking for longs. Similarly,
unsigned ints are checked for before ints.
So it is likely that an int is converted to
an unsigned int, or a long is converted to an
unsigned long.

This is correct.
Why is it OK to reinterpret a signed integral type as
its unsigned counterpart? If it's a negative number,
then all of a sudden, the reinterpretation yields a
large positive number.

Consider the alternative: the unsigned might have to be converted to signed,
but such a conversion isn't well defined if the value of the unsigned
cannot be represented by the signed. In such cases, the result is
usually a negative value, and this isn't any better than the actual
scenario you have described above.

The moral of the story: the programmer MUST know what he's doing when
combining signed and unsigned operands.
I'm assuming that the signed
number is modelled as a 2's complement number, even
though it may be implemented differently in hardware.

No need for such an assumption: the result of the conversion is well
defined, regardless of the representation of negative values.
Likewise, the unsigned is modelled as a straight binary
coded decimal (BCD).

The standard does not allow this. The unsigned must be using a pure
binary encoding.
Does the sign bit of the 2's
complement number becomes the MSB of the unsigned
number, with all other bits remaining unchanged?

If the signed value is negative, the maximum value representable
by the unsigned type, plus one, is added to it. This is true regardless
of representation, but, if the representation is two's complement, no
operation needs to be actually performed: the bit pattern of the signed
is merely reinterpreted as unsigned.

While this conversion is well defined and yields the same result,
regardless of implementation, converting an unsigned value that cannot
be represented by a signed type yields an implementation-defined result
(or, in C99, may even raise a signal). So, the standard has chosen the
well defined conversion for this case, which is a good thing.

Dan
 

Fred Ma

Dan said:
Consider the alternative: the unsigned might have to be converted to signed,
but such a conversion isn't well defined if the value of the unsigned
cannot be represented by the signed. In such cases, the result is
usually a negative value, and this isn't any better than the actual
scenario you have described above.

The moral of the story: the programmer MUST know what he's doing when
combining signed and unsigned operands.

Yes, it certainly seems so.
No need for such an assumption: the result of the conversion is well
defined, regardless of the representation of negative values.

I use it as a conceptual aid, though I realize that it may
not reflect actual implementation.
The standard does not allow this. The unsigned must be using a pure
binary encoding.

Sorry, getting my terminology mixed up. I meant binary encoding.
If the signed value is negative, the maximum value representable
by the unsigned type, plus one, is added to it. This is true regardless
of representation, but, if the representation is two's complement, no
operation needs to be actually performed: the bit pattern of the signed
is merely reinterpreted as unsigned.

While this conversion is well defined and yields the same result,
regardless of implementation, converting an unsigned value that cannot
be represented by a signed type yields an implementation-defined result
(or, in C99, may even raise a signal). So, the standard has chosen the
well defined conversion for this case, which is a good thing.

Yes, and if the standard had defined it the other way (so that
converting unsigned->signed yields a reinterpretation of the
potentially fictitious 2's complement representation), then that would
be well defined too. It seems like an arbitrary choice. Thanks
for confirming how it works.

Fred
 

Dan Pop

Fred Ma said:
Yes, and if the standard had defined it the other way (so that
converting unsigned->signed yields a reinterpretation of the
potentially fictitious 2's complement representation), then that would
be well defined too.

That would not be possible, given that the standard doesn't require
two's complement representation and other allowed representations (one's
complement and sign-magnitude) don't have a range as wide as two's
complement.
It seems like an arbitrary choice.

It's less arbitrary than it seems to you. OTOH, the way integral
promotions (not to be confused with the usual arithmetic conversions)
work is based on an arbitrary and suboptimal choice (value preserving
instead of the more natural signedness preserving).

Dan
 

Fred Ma

Dan said:
That would not be possible, given that the standard doesn't require
two's complement representation and other allowed representations (one's
complement and sign-magnitude) don't have a range as wide as two's
complement.

Well, as I said, I'm using two's complement more as a pictorial
guide. Couldn't they just as easily define unsigned->signed
conversion as subtraction of the maximum number representable
by the unsigned type, plus one?
It's less arbitrary than it seems to you. OTOH, the way integral
promotions (not to be confused with the usual arithmetic conversions)
work is based on an arbitrary and suboptimal choice (value preserving
instead of the more natural signedness preserving).

Actually, the way they do it seems to make sense to me. There
is no way to avoid losing information, so why bother pretending
to try? Why not just make the operations work properly for a small,
anticipatable subset of cases? The cases they target are
probably motivated by simple hardware visualization, e.g. if someone
was using C integers to emulate hardware. This is done in DSP
that is eventually meant to be implemented as hardware. OTOH,
trying to preserve information (even signedness) may give a softer
failure, or less gross error, but the argument against that is
similar to the argument of wanting a programming bug to be easy
to find. You want a bug to have clear and overt symptoms. The
more dramatic the error and its outward signs, the better.

Sort of makes sense, eh?

Fred
 

Dan Pop

Fred Ma said:
Actually, the way they do it seems to make sense to me. There
is no way to avoid losing information, so why bother pretending
to try?

You don't know what you're talking about: no loss of information is
possible in the integral promotions.

Dan
 

Old Wolf

The confusion is that for a binary operator,
This is correct.

I don't think it's correct for the binary operator ">>", when
the right-hand operand is an unsigned long
(either that, or my compiler is buggy)
The moral of the story: the programmer MUST know what he's doing when
combining signed and unsigned operands.

A subset of the moral: the programmer MUST know what he's doing
when using C.
 

Fred Ma

Dan said:
You don't know what you're talking about: no loss of information is
possible in the integral promotions.

Dan

I don't see why you say no information is lost. You basically
lost the original value of the number by adding the maximum
value representable by the unsigned integer. Sure, you can get
it back, but that entails that you keep track of which operands
in an expression were subjected to this reinterpretation of the
bits. That is extra information you need, which is just another
manifestation of lost information.

Fred
 

Dave Thompson

(Binary arithmetic, comparison or bitwise operator; the "usual
arithmetic conversions" do not apply for shifts and && and || . Or
(noncompound) assignment or comma, which are syntactically binary
although not what people usually think of as binary operations.)
This is correct.
Nit: correct for C89, where ulong is the highest type; in C99 only if
the other operand does not have rank higher than ulong.

Otherwise concur.


- David.Thompson1 at worldnet.att.net
 
