# conversion of signed integer to unsigned integer

Discussion in 'C Programming' started by junky_fellow@yahoo.co.in, Jun 17, 2005.

1. ### Guest

Can anybody please explain this:

[ N869 6.3.1.3 ]

When a value with integer type is converted to another integer type
other than _Bool,
if the new type is unsigned, the value is converted by repeatedly
adding or subtracting one more than the maximum value that can be
represented in the new type until the value is in the range of the new
type.

Thanks ...

, Jun 17, 2005

2. ### Eric Sosman (Guest)

wrote:

> Can anybody please explain this:
>
> [ N869 6.3.1.3 ]
>
> When a value with integer type is converted to another integer type
> other than _Bool,
> if the new type is unsigned, the value is converted by repeatedly
> adding or subtracting one more than the maximum value that can be
> represented in the new type until the value is in the range of the new
> type.

unsigned short us; /* assume USHRT_MAX == 65535 */
us = -1000000;

The variable is unsigned, and not capable of expressing a
negative value. The assignment must set it to a non-negative
value, and 6.3.1.3 describes how that value is computed:

-1000000
+65536 (USHRT_MAX+1)
========
-934464
+65536
========
-868928
:
:
========
-16960
+65536
========
48576

Various computational shortcuts are available; all those
additions need not actually be performed to get to the answer.
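
A minimal sketch of the walk-through above, for readers who want to try it. The function names are hypothetical (they do not come from the thread), and `uint16_t` stands in for an unsigned short with USHRT_MAX == 65535, which is the assumption the example already makes. The first function applies the wording of 6.3.1.3 literally; the second is the ordinary conversion a compiler performs.

```c
#include <stdint.h>

/* Hypothetical helper: applies the rule from 6.3.1.3 literally, adding or
 * subtracting 65536 until the value fits in 16 bits. */
uint16_t convert_by_repeated_steps(long long value)
{
    const long long modulus = 65536LL;  /* one more than the maximum, 65535 */

    while (value < 0)
        value += modulus;               /* too small: add the modulus */
    while (value >= modulus)
        value -= modulus;               /* too large: subtract it instead */

    return (uint16_t)value;             /* now in range, so the value is kept */
}

/* The ordinary conversion: no loop is needed to get the same result. */
uint16_t convert_directly(long long value)
{
    return (uint16_t)value;
}
```

Both functions should return 48576 for -1000000, since 16 * 65536 - 1000000 = 48576.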

--
Eric Sosman

Eric Sosman, Jun 17, 2005

3. ### Guest

Eric Sosman wrote:
> wrote:
>
> > Can anybody please explain this:
> >
> > [ N869 6.3.1.3 ]
> >
> > When a value with integer type is converted to another integer type
> > other than _Bool,
> > if the new type is unsigned, the value is converted by repeatedly
> > adding or subtracting one more than the maximum value that can be
> > represented in the new type until the value is in the range of the new
> > type.

>
> unsigned short us; /* assume USHRT_MAX == 65535 */
> us = -1000000;
>
> The variable is unsigned, and not capable of expressing a
> negative value. The assignment must set it to a non-negative
> value, and 6.3.1.3 describes how that value is computed:
>
> -1000000
> +65536 (USHRT_MAX+1)
> ========
> -934464
> +65536
> ========
> -868928
> :
> :
> ========
> -16960
> +65536
> ========
>
> Various computational shortcuts are available; all those
> additions need not actually be performed to get to the answer.
>
> --
> Eric Sosman

Consider,
signed char sc = -4; /* binary = 11111100 */
unsigned char uc = sc;

Now, if I print the value of sc it is 252 (binary 11111100).

So, as you can see, no conversion has been done. The bit values in
"uc" are exactly the same as in "sc".

Then, what's the need of performing so many additions/subtractions?

, Jun 17, 2005
4. ### Jean-Claude Arbaut (Guest)

> Consider,
> signed char sc = -4; /* binary = 11111100 */
> unsigned char uc = sc;
>
> Now, if I print the value of sc it is 252 (binary 11111100).
>
> So, if you see no conversion has been done. The bit values in
> "uc" are exactly the same as in "sc".

There is a great confusion here between bit pattern and actual value.
First, 252 = -4 + 256, which is needed to get a valid unsigned char.
Second, not all machines are two's complement, though many are.

As Eric said, there are computational shortcuts, and you
should note they depend on the machine (hence, on the
implementation of the standard). It's pure coincidence
if a signed char and an unsigned char have actually the same
representation for "s" and "256+s" respectively.

Jean-Claude Arbaut, Jun 17, 2005
5. ### Jean-Claude Arbaut (Guest)

On 17/06/2005 15:49, in BED8A19F.50FE%,
"Jean-Claude Arbaut" <> wrote:

>
>> Consider,
>> signed char sc = -4; /* binary = 11111100 */
>> unsigned char uc = sc;
>>
>> Now, if I print the value of sc it is 252 (binary 11111100).
>>
>> So, if you see no conversion has been done. The bit values in
>> "uc" are exactly the same as in "sc".

>
> Great confusion between bit pattern and actual value, here.
> First, 252 = -4 + 256, needed to get a valid unsigned char.
> Second, not all machines are 2-complement, though many are
>
> As Eric said, there are computational shortcuts, and you
> should note they depend on the machine (hence, on the
> implementation of the standard). It's pure coincidence
> if a signed char and an unsigned char have actually the same
> representation for "s" and "256+s" respectively.
>

Just for reference:

ISO 9899-1999, section 6.2.6.2#2 p39

Jean-Claude Arbaut, Jun 17, 2005
6. ### Me (Guest)

> Can anybody please explain this:
>
> When a value with integer type is converted to another integer type
> other than _Bool,
> if the new type is unsigned, the value is converted by repeatedly
> adding or subtracting one more than the maximum value that can be
> represented in the new type until the value is in the range of the new
> type.

It's basically describing a mod operation. Let's say you want to convert
any signed integer value to an unsigned int; a table of those values
looks like:

....
UINT_MAX+2 1
UINT_MAX+1 0
UINT_MAX UINT_MAX
UINT_MAX-1 UINT_MAX-1
UINT_MAX-2 UINT_MAX-2
....
2 2
1 1
0 0
-1 UINT_MAX
-2 UINT_MAX-1
....
-UINT_MAX+1 2
-UINT_MAX 1
-UINT_MAX-1 0
-UINT_MAX-2 UINT_MAX
-UINT_MAX-3 UINT_MAX-1
....

Where the left side is the signed integer value and the right side is
the resulting value when converted to unsigned int. I'm sure you can
figure it out from there.
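
The table can be reproduced with a small sketch. The function name is hypothetical, and the code assumes a 32-bit unsigned int (so that UINT_MAX + 1 is 2^32) by using `uint32_t` and a 64-bit input type wide enough to hold values outside the unsigned range.

```c
#include <stdint.h>

/* Hypothetical helper: reduces a wide signed value modulo 2^32, which is
 * UINT_MAX + 1 when unsigned int has 32 bits (an assumption). */
uint32_t wrap_to_u32(int64_t value)
{
    const int64_t modulus = (int64_t)UINT32_MAX + 1;
    int64_t remainder = value % modulus;  /* % keeps the sign of the dividend */

    if (remainder < 0)
        remainder += modulus;             /* move negative remainders into range */

    return (uint32_t)remainder;
}
```

For any input this should agree with a plain `(uint32_t)value` conversion; for example, -1 gives UINT32_MAX and UINT32_MAX + 2 gives 1, matching the rows of the table.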

Me, Jun 17, 2005
7. ### Mac (Guest)

On Fri, 17 Jun 2005 06:35:09 -0700, junky_fellow wrote:

>
>
> Eric Sosman wrote:
>> wrote:
>>
>> > Can anybody please explain this:
>> >
>> > [ N869 6.3.1.3 ]
>> >
>> > When a value with integer type is converted to another integer type
>> > other than _Bool,
>> > if the new type is unsigned, the value is converted by repeatedly
>> > adding or subtracting one more than the maximum value that can be
>> > represented in the new type until the value is in the range of the new
>> > type.

>>
>> unsigned short us; /* assume USHRT_MAX == 65535 */
>> us = -1000000;
>>
>> The variable is unsigned, and not capable of expressing a
>> negative value. The assignment must set it to a non-negative
>> value, and 6.3.1.3 describes how that value is computed:
>>
>> -1000000
>> +65536 (USHRT_MAX+1)
>> ========
>> -934464
>> +65536
>> ========
>> -868928
>> :
>> :
>> ========
>> -16960
>> +65536
>> ========
>>
>> Various computational shortcuts are available; all those
>> additions need not actually be performed to get to the answer.
>>
>> --
>> Eric Sosman

>
> Consider,
> signed char sc = -4; /* binary = 11111100 */
> unsigned char uc = sc;
>
> Now, if I print the value of sc it is 252 (binary 11111100).
>
> So, if you see no conversion has been done. The bit values in
> "uc" are exactly the same as in "sc".
>
> Then, whats the need of performing so many additions/subtractions ?

junky,

Eric is pretty sharp. He dumbed down his answer a bit because he was
afraid of confusing you. Your first post made it look like you were prone
to confusion. Your second post has not changed that appearance.

But now you are coming back like a smart-alec. ;-)

Even so, nothing in Eric's post is incorrect, as far as I can see, and

He specifically said that the additions and/or subtractions need not
actually be performed to get the answer.

In the case of typical architectures, converting from unsigned to signed
of the same size may well be a no-op. The conversion really just means
that the compiler will change how it thinks of the bit pattern, not the
bit-pattern itself. As it turns out, this behavior satisfies the
mathematical rules laid out in the standard. This is probably not a
coincidence. I believe the intent of the rule was to force any non two's
complement architectures to emulate two's complement behavior. This is
convenient for programmers.

Even in the cases where a conversion from signed to unsigned involves
types of different sizes, typical architectures will have minimal work to
do to perform the conversion as specified in the standard. Non-typical
architectures, if there really are any, might have to do some arithmetic.
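
As an illustration of how little work a narrowing conversion needs, here is a sketch with hypothetical function names. It relies only on the value-based rules: converting to unsigned int is a reduction modulo UINT_MAX + 1, and UCHAR_MAX + 1 divides that modulus, so masking gives the same answer as converting directly.

```c
#include <limits.h>

/* Hypothetical helper: narrows an int to unsigned char with a mask. */
unsigned char narrow_by_mask(int value)
{
    /* The conversion to unsigned int is itself defined on values, so the
     * masked result does not depend on how negative ints are stored. */
    return (unsigned char)((unsigned int)value & UCHAR_MAX);
}

/* The same narrowing written as a plain conversion. */
unsigned char narrow_by_conversion(int value)
{
    return (unsigned char)value;
}
```

With an 8-bit char, both should give 252 for -4 and 232 for 1000.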

--Mac

Mac, Jun 17, 2005
8. ### Jean-Claude Arbaut (Guest)

Re : conversion of signed integer to unsigned integer

On 17/06/2005 18:20, in , "Mac"
<> wrote:

> On Fri, 17 Jun 2005 06:35:09 -0700, junky_fellow wrote:

> In the case of typical architectures, converting from unsigned to signed
> of the same size may well be a no-op. The conversion really just means
> that the compiler will change how it thinks of the bit pattern, not the
> bit-pattern itself.

Only on two's complement architectures, but the standard envisages three
possibilities.

On other machines, a negative signed char and its unsigned char counterpart
cannot have the same bit pattern.

> As it turns out, this behavior satisfies the
> mathematical rules laid out in the standard. This is probably not a
> coincidence. I believe the intent of the rule was to force any non two's
> complement architectures to emulate two's complement behavior.

I don't see why. The additions required by the standard are on mathematical
values, not on registers. Sections 6.2.6.2 and 6.3.1.3 do not rely
particularly on (or emulate) two's complement behaviour.

> This is
> convenient for programmers.
>
> Even in the cases where a conversion from signed to unsigned involves
> types of different sizes, typical architectures will have minimal work to
> do to perform the conversion as specified in the standard. Non-typical
> architectures, if there really are any, might have to do some arithmetic.

At least there were. The IBM 704 had 36-bit words, with 1 sign bit and 35
bits of magnitude. There may be more modern machines with the same kind of
arithmetic; I just looked for one. The reference is: "IBM 704, Manual
Of Operation, 1955" at www.bitsavers.org. Of course this has nothing to do
with the C language; it is just an example of a different machine. If
anybody knows of modern ones, I'm of course interested.

Jean-Claude Arbaut, Jun 17, 2005
9. ### Lawrence Kirby (Guest)

On Fri, 17 Jun 2005 06:35:09 -0700, junky_fellow wrote:

....

> Consider,
> signed char sc = -4; /* binary = 11111100 */

That is the representation on your implementation, it may be something
else on another implementation.

> unsigned char uc = sc;
>
> Now, if I print the value of sc it is 252 (binary 11111100).
>
> So, if you see no conversion has been done.

Yes it has. You had a value of -4, now you have a value of 252. A very
real conversion has happened that has produced a different value.

> The bit values in
> "uc" are exactly the same as in "sc".

That's a happy coincidence, well not entirely a coincidence. The
conversion rules are designed to be efficiently implementable on common
architectures as well as being useful.

The underlying representation is not important; the result of the
conversion is defined on VALUES. You started with a value and, following
the conversion rules, you added (UCHAR_MAX+1), in this case 256, to
produce a result of 252. You didn't need to know anything about the
underlying representation to determine that. On a 1's complement
implementation -4 would be represented in 8 bits as 11111011, but uc = sc
would still produce the result 252 because the conversion is defined in
terms of value. On such systems the implementation must change the
representation to produce the correct result. That's the price you pay
for portability and consistent results.
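
To make the value-versus-representation point concrete, here is a sketch that decodes an 8-bit pattern under each of the three signed representations C99 permits and then applies the conversion rule to the value alone. All function names are hypothetical, and 8-bit chars are assumed.

```c
#include <stdint.h>

/* Hypothetical decoders: the mathematical value of an 8-bit pattern under
 * each signed representation. */
int value_twos_complement(uint8_t bits)
{
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}

int value_ones_complement(uint8_t bits)
{
    return (bits & 0x80) ? (int)bits - 255 : (int)bits;
}

int value_sign_magnitude(uint8_t bits)
{
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* The conversion rule applied to the value only (UCHAR_MAX == 255 assumed). */
unsigned to_uchar_value(int value)
{
    while (value < 0)
        value += 256;
    while (value > 255)
        value -= 256;
    return (unsigned)value;
}
```

The patterns 11111100, 11111011 and 10000100 all decode to -4 under their respective representations, and `to_uchar_value(-4)` is 252 in every case; only the two's complement pattern happens to equal the pattern of 252.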

> Then, whats the need of performing so many additions/subtractions ?

The additions/subtractions are just a means in the standard to specify
what the correct result should be. In practice a compiler would not
perform lots of additions or subtractions to actually calculate the result.

As you've noted in a lot of cases it doesn't have to do anything at all
except reinterpret a bit pattern according to a new type.

Lawrence

Lawrence Kirby, Jun 17, 2005
10. ### Mac (Guest)

Re: Re : conversion of signed integer to unsigned integer

On Fri, 17 Jun 2005 18:46:03 +0200, Jean-Claude Arbaut wrote:

> On 17/06/2005 18:20, in , "Mac"
> <> wrote:
>
>> On Fri, 17 Jun 2005 06:35:09 -0700, junky_fellow wrote:

>
>
>> In the case of typical architectures, converting from unsigned to signed
>> of the same size may well be a no-op. The conversion really just means
>> that the compiler will change how it thinks of the bit pattern, not the
>> bit-pattern itself.

>
> Only on 2-complement architectures, but the standard envisages three
> possibilities.

Right. As far as I'm concerned, all typical architectures use two's
complement representations. Also note I say "may well be," not "is" or
"must be."

[snip]
>> As it turns out, this behavior satisfies the
>> mathematical rules laid out in the standard. This is probably not a
>> coincidence. I believe the intent of the rule was to force any non two's
>> complement architectures to emulate two's complement behavior.

>
> I don't see why. The additions required by the standard are on mathematical
> values, not on registers. Sections 6.2.6.2 and 6.3.1.3 do no rely
> particularly on (or emulate) 2-complement behaviour.
>
>

I'm only talking about the conversion from signed to unsigned here. The
rule doesn't explicitly say that the result must be the same as if two's
complement representation is used, but that is the result. Why would this
be a coincidence?

Probably the folks writing the standard did not want to leave
signed-to-unsigned conversions implementation-defined, so they specified
the behavior to be the most natural thing for two's-complement machines.
This is just a guess on my part.

I did not mean to make any claims regarding any other arithmetic
issues.

>
>> This is
>> convenient for programmers.
>>
>> Even in the cases where a conversion from signed to unsigned involves
>> types of different sizes, typical architectures will have minimal work to
>> do to perform the conversion as specified in the standard. Non-typical
>> architectures, if there really are any, might have to do some arithmetic.

>
> At least there were. IBM 704 had 36 bits words, with 1 sign bit and 35 bits
> of magnitude. There may be more modern machines with same kind of
> arithmetic, I just looked for one Reference is: "IBM 704, Manual
> Of Operation, 1955" at www.bitsavers.org. Of course nothing to do with the C
> language, just an example of a different machine. If anybody knows of modern
> ones, I'm of course interested.

On a system which uses sign-magnitude representation, aren't all positive
integers represented the same way, regardless of whether the type is
signed or unsigned? Or is the sign convention that 1 is positive?

Anyway, I know there are lots of architectures out there, but I hesitate
to call most of them typical. And non two's-complement machines seem to be
getting rarer with every passing decade. Note that I am not advocating
ignoring the standard, or writing code which has undefined behavior.

--Mac

Mac, Jun 17, 2005
11. ### Jean-Claude Arbaut (Guest)

On 17/06/2005 23:03, in , "Mac"
<> wrote:

>> Only on 2-complement architectures, but the standard envisages three
>> possibilities.

>
> Right. As far as I'm concerned, all typical architectures use two's
> complement representations. Also note I say "may well be," not "is" or
> "must be."
>
> [snip]
>>> As it turns out, this behavior satisfies the
>>> mathematical rules laid out in the standard. This is probably not a
>>> coincidence. I believe the intent of the rule was to force any non two's
>>> complement architectures to emulate two's complement behavior.

>>
>> I don't see why. The additions required by the standard are on mathematical
>> values, not on registers. Sections 6.2.6.2 and 6.3.1.3 do no rely
>> particularly on (or emulate) 2-complement behaviour.
>>
>>

>
> I'm only talking about the conversion from signed to unsigned here. The
> rule doesn't explicitly say that the result must be the same as if two's
> complement representation is used, but that is the result. Why would this
> be a coincidence?
>
> Probably the folks writing the standard did not want to leave
> signed-to-unsigned conversions implementation-defined, so they specified
> the behavior to be the most natural thing for two's-complement machines.
> This is just a guess on my part.

I didn't understand it that way the first time. Your guess is quite
reasonable.

> I did not mean to make any claims regarding any other arithmetic
> issues.

>
>
> On a system which uses sign-magnitude representation, aren't all positive
> integers represented the same way, regardless of whether the type is
> signed or unsigned? Or is the sign convention that 1 is positive?

Yes, but we were interested in the conversion -4 -> 252, so *negative*
signed chars. If you convert a positive signed char to an unsigned
char, section 6.3.1.3#1 says the value shall not change if it is
representable. Since signed and unsigned types have the same size by
6.2.5#6, I assume a positive signed char is always representable as an
unsigned char. Hence the *value* won't change. Now for the
representation: section 6.2.6.2#2 says there is sign correction only
when the sign bit is one. This means a positive signed char always has
sign bit 0, hence there is *nothing* to do during conversion. I hope I
got my interpretation of the standard right. Otherwise, a guru will
soon yell at me, _again_ ;-)

> Anyway, I know there are lots of architectures out there, but I hesitate
> to call most of them typical.

Well, I do too, but anyone who says so is often accused of thinking
there are only Wintels in the world.

> And non two's-complement machines seem to be
> getting rarer with every passing decade. Note that I am not advocating
> ignoring the standard, or writing code which has undefined behavior.

You say that to ME !!! Well, if you read my recent posts, you'll
see I am not advocating enforcing the standard too strongly ;-)
I am merely discovering the standard, and I must admit it's a shame
I have programmed in C for years without knowing a line from it.
It's arid at first glance, but it deserves deeper reading.
Wow, I said *that* ? ;-)

Jean-Claude Arbaut, Jun 17, 2005
12. ### pete (Guest)

wrote:

> Consider,
> signed char sc = -4; /* binary = 11111100 */
> unsigned char uc = sc;
>
> Now, if I print the value of sc it is 252 (binary 11111100).

I think you mean uc

> So, if you see no conversion has been done.
> The bit values in "uc" are exactly the same as in "sc".

That doesn't matter.
252 isn't -4.
A conversion has been done.

> Then, whats the need of performing so many additions/subtractions ?

-4 could also be either 11111011 or 10000100

The subtractions are a procedure that produces
the correct result regardless of representation.

When the representation is known, then easier ways can be used,
like interpreting
((unsigned char)sc)
as
(*(unsigned char *)&sc),
as your two's complement system may.
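
The two expressions can be compared side by side. This is only a sketch with hypothetical function names: the first result is fixed by the standard, while the second depends on how the implementation stores negative values.

```c
/* Hypothetical helper: the conversion, defined on the value. */
unsigned char by_conversion(signed char sc)
{
    return (unsigned char)sc;      /* adds UCHAR_MAX + 1 if sc is negative */
}

/* Hypothetical helper: reads the stored bits through an unsigned char pointer. */
unsigned char by_reinterpretation(signed char sc)
{
    return *(unsigned char *)&sc;  /* result depends on the representation */
}
```

On a two's complement system with 8-bit chars, both should give 252 for -4; on a 1's complement or sign-magnitude system only the first one would.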

--
pete

pete, Jun 18, 2005
13. ### Guest

Re: Re : conversion of signed integer to unsigned integer

Mac wrote:
> On Fri, 17 Jun 2005 18:46:03 +0200, Jean-Claude Arbaut wrote:
>
> > On 17/06/2005 18:20, in , "Mac"
> > <> wrote:
> >
> >> On Fri, 17 Jun 2005 06:35:09 -0700, junky_fellow wrote:

> >
> >
> >> In the case of typical architectures, converting from unsigned to signed
> >> of the same size may well be a no-op. The conversion really just means
> >> that the compiler will change how it thinks of the bit pattern, not the
> >> bit-pattern itself.

> >
> > Only on 2-complement architectures, but the standard envisages three
> > possibilities.

>
> Right. As far as I'm concerned, all typical architectures use two's
> complement representations. Also note I say "may well be," not "is" or
> "must be."
>
> [snip]
> >> As it turns out, this behavior satisfies the
> >> mathematical rules laid out in the standard. This is probably not a
> >> coincidence. I believe the intent of the rule was to force any non two's
> >> complement architectures to emulate two's complement behavior.

> >
> > I don't see why. The additions required by the standard are on mathematical
> > values, not on registers. Sections 6.2.6.2 and 6.3.1.3 do no rely
> > particularly on (or emulate) 2-complement behaviour.
> >
> >

>
> I'm only talking about the conversion from signed to unsigned here. The
> rule doesn't explicitly say that the result must be the same as if two's
> complement representation is used, but that is the result. Why would this
> be a coincidence?
>
> Probably the folks writing the standard did not want to leave
> signed-to-unsigned conversions implementation-defined, so they specified
> the behavior to be the most natural thing for two's-complement machines.
> This is just a guess on my part.
>
> I did not mean to make any claims regarding any other arithmetic
> issues.
>
> >
> >> This is
> >> convenient for programmers.
> >>
> >> Even in the cases where a conversion from signed to unsigned involves
> >> types of different sizes, typical architectures will have minimal work to
> >> do to perform the conversion as specified in the standard. Non-typical
> >> architectures, if there really are any, might have to do some arithmetic.

> >
> > At least there were. IBM 704 had 36 bits words, with 1 sign bit and 35 bits
> > of magnitude. There may be more modern machines with same kind of
> > arithmetic, I just looked for one Reference is: "IBM 704, Manual
> > Of Operation, 1955" at www.bitsavers.org. Of course nothing to do with the C
> > language, just an example of a different machine. If anybody knows of modern
> > ones, I'm of course interested.

>
> On a system which uses sign-magnitude representation, aren't all positive
> integers represented the same way, regardless of whether the type is
> signed or unsigned? Or is the sign convention that 1 is positive?
>
> Anyway, I know there are lots of architectures out there, but I hesitate
> to call most of them typical. And non two's-complement machines seem to be
> getting rarer with every passing decade. Note that I am not advocating
> ignoring the standard, or writing code which has undefined behavior.
>
> --Mac

OK. I got it. I had this doubt because I thought that two's complement
was the only way to represent a negative integer. That is why I was
wondering why we need to do so many operations to convert a signed int
to an unsigned int.

But, in practice, is there any significance to converting a signed int
to an unsigned int? Do we ever do this in the real world?
If we don't, and it doesn't have any practical significance, then why
not give an error at compile time?

Similarly, is there any rule for converting an unsigned char to a signed
char? For example, how will unsigned char = 0xFF be converted to signed
char? And is there any significance to this?

, Jun 18, 2005
14. ### Clark S. Cox III (Guest)

Re: Re : conversion of signed integer to unsigned integer

On 2005-06-18 04:46:02 -0400, said:

>
>
> Mac wrote:
>> On Fri, 17 Jun 2005 18:46:03 +0200, Jean-Claude Arbaut wrote:
>>
>> On a system which uses sign-magnitude representation, aren't all positive
>> integers represented the same way, regardless of whether the type is
>> signed or unsigned? Or is the sign convention that 1 is positive?
>>
>> Anyway, I know there are lots of architectures out there, but I hesitate
>> to call most of them typical. And non two's-complement machines seem to be
>> getting rarer with every passing decade. Note that I am not advocating
>> ignoring the standard, or writing code which has undefined behavior.
>>
>> --Mac

>
> OK. I got it. I had this doubt because I thought that 2's complement is
> the only way to represent the negative integer. That is why I was
> wondering why we need to do so many operations to convert a signed int
> to unsigned int.

We don't actually *need* all of the operations, as long as we get the
same result as we would have had we performed them all.

> But, in practical, is there any significance of converting signed int
> to unsigned int ? Do we ever do this in real world ?

Sure:

unsigned int u;
u = 1; //Converts the signed int (1) to an unsigned int

> If we don't and it doesn't have any practical significance, then why
> not give an error just at compile time ?
>
> Similarly, is there any rule for converting an unsigned char to signed
> char.

No, that is implementation-defined. From the standard:
"Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised."
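
Where code must not depend on that implementation-defined result, one possible workaround is to do the arithmetic in int, where every intermediate value is representable. This is only a sketch: the function name is hypothetical, and it assumes that int is wider than char and that SCHAR_MIN == -SCHAR_MAX - 1 (the two's complement range).

```c
#include <limits.h>

/* Hypothetical helper: maps an unsigned char to a signed char without
 * relying on the implementation-defined out-of-range conversion. */
signed char uchar_to_schar(unsigned char u)
{
    if (u <= SCHAR_MAX)
        return (signed char)u;   /* representable: the value is unchanged */

    /* Subtract UCHAR_MAX + 1 in int arithmetic; the result lies between
     * SCHAR_MIN and -1 under the stated assumption, so it converts exactly. */
    return (signed char)((int)u - UCHAR_MAX - 1);
}
```

With 8-bit chars, 0xFF should map to -1 and 128 to -128, which is what typical two's complement implementations produce for the direct conversion as well.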

> For eg: How, unsigned char = 0xFF will be converted to signed char ?
> And is there any significance of this ?
>

--
Clark S. Cox, III

Clark S. Cox III, Jun 18, 2005
15. ### CBFalconer (Guest)

Re: Re : conversion of signed integer to unsigned integer

wrote:
>

.... snip ...
>
> But, in practical, is there any significance of converting signed
> int to unsigned int ? Do we ever do this in real world ? If we
> don't and it doesn't have any practical significance, then why
> not give an error just at compile time ?
>
> Similarly, is there any rule for converting an unsigned char to
> signed char. For eg: How, unsigned char = 0xFF will be
> converted to signed char ? And is there any significance of this ?

If the unsigned entity's value doesn't fit in the signed type,
behaviour is undefined. So conversion in that direction is
dangerous.