promotion and narrowing integer conversion

lovecreatesbeauty

I was reading about value preserving and unsigned preserving in the
c-faq, but I didn't find much detailed discussion of them in K&R2 or
C:ARM5. Some posts say that these two terms come from the standard
document. Does it mean that it's OK for a mediocre C programmer (e.g.
me) to focus on promotion (conversion to a larger type) and
narrowing conversion?

The c-faq states that both value preserving and unsigned preserving
are about promotion, but a narrowing conversion may happen under value
preserving as well.

There's a warning on Ln 36, but none on Ln 31 and 32. Why?

Does this trivial code demonstrate most kinds of integer conversion
in C? If it doesn't, could you please suggest more situations?

Narrowing conversion may lead to data loss, so this is the only
dangerous conversion, right?

Thank you for your time.


$ cat a.c
#include <stdio.h>
#include <limits.h>

int main(void)
{
    char c = CHAR_MAX;
    unsigned char uc = UCHAR_MAX;
    short s = SHRT_MAX;
    unsigned short us = USHRT_MAX;
    int i = INT_MAX;
    unsigned int ui = UINT_MAX;
    long l = LONG_MAX;
    unsigned long ul = ULONG_MAX;

    printf("\n");
    printf("LIMITS: c:%d, uc:%u, s:%d, us:%u\n"
           "LIMITS: i:%d, ui:%u, l:%ld, ul:%lu\n",
           c, uc, s, us, i, ui, l, ul);

    /* promotion */
    printf("\n");
    printf("i = s: %d, ", i = s);
    printf("i = us: %d, ", i = us);
    printf("ui = s: %u, ", ui = s);
    printf("ui = us: %u\n", ui = us);

    /* narrowing */
    printf("\n");
    printf("c = s: %d, ", c = s);
    printf("c = us: %d, ", c = us);
    printf("uc = s: %d, ", uc = s); /*Ln:31*/
    printf("uc = us: %d\n", uc = us);

    printf("\n");
    printf("uc = UCHAR_MAX (%u) + 1: %u, ",
           UCHAR_MAX, uc = UCHAR_MAX + 1); /* Ln:36 */
    printf("uc = -1: %d\n", uc = -1);
    return 0;
}

$ make && ./a.out
gcc -ansi -pedantic -Wall -W -c -o a.o a.c
a.c: In function ‘main’:
a.c:36: warning: large integer implicitly truncated to unsigned type
gcc a.o -o a.out

LIMITS: c:127, uc:255, s:32767, us:65535
LIMITS: i:2147483647, ui:4294967295, l:2147483647, ul:4294967295

i = s: 32767, i = us: 65535, ui = s: 32767, ui = us: 65535

c = s: -1, c = us: -1, uc = s: 255, uc = us: 255

uc = UCHAR_MAX (255) + 1: 0, uc = -1: 255
$
 
Ben Bacarisse

I was reading about value preserving and unsigned preserving in the
c-faq, but I didn't find much detailed discussion of them in K&R2 or
C:ARM5. Some posts say that these two terms come from the standard
document. Does it mean that it's OK for a mediocre C programmer (e.g.
me) to focus on promotion (conversion to a larger type) and
narrowing conversion?

Sorry, I don't know what that means.

The c-faq states that both value preserving and unsigned preserving
are about promotion, but a narrowing conversion may happen under value
preserving as well.

There's a warning on Ln 36, but none on Ln 31 and 32. Why?

You are not asking for enough warnings. I get them on 24, 29, 30, 31,
32 and 27 as well (I use -Wconversion when testing code).

Does this trivial code demonstrate most kinds of integer conversion
in C? If it doesn't, could you please suggest more situations?

Converting a value of one signed integer type to another that can't
represent that value is implementation defined and may raise a
signal. It's not explicitly undefined behaviour, but it is about as
close as you can get. I.e. you are observing what one implementation
decides to do, not what C says will happen.
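
One way to stay clear of the implementation-defined case is to test the range before converting. The following is only a sketch: the helper names fits_in_schar and to_schar_or are made up for illustration and do not come from the thread.

```c
#include <limits.h>

/* Hypothetical helper: nonzero when `value` can be stored in a
   signed char without changing it. */
int fits_in_schar(long value)
{
    return value >= SCHAR_MIN && value <= SCHAR_MAX;
}

/* Hypothetical helper: converts only when the value fits; otherwise
   returns `fallback`, so the out-of-range signed conversion is never
   performed. */
signed char to_schar_or(long value, signed char fallback)
{
    return fits_in_schar(value) ? (signed char)value : fallback;
}
```

With the limits shown in the first post, fits_in_schar(SHRT_MAX) is 0, so the caller decides what happens instead of observing what one compiler chose to do.
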

Narrowing conversion may lead to data loss, so this is the only
dangerous conversion, right?

No. See above.

<snip code>
 
lovecreatesbeauty

If INT_MAX equals UCHAR_MAX then (1 + UCHAR_MAX) is undefined.

Provided ULONG_MAX is larger than UCHAR_MAX, will the expression (1 +
UCHAR_MAX) be promoted to type unsigned long?

Will (ul + UCHAR_MAX) be of type unsigned long, given unsigned long
ul and ULONG_MAX larger than UCHAR_MAX?
 
frank

lc,

I'm glad you wrote this, because I'd been putting off writing it myself.
I prefer to use commands for contemporary C as opposed to the ones
that you have that would be more appropriate for C90:

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra lc1.c -o out; ./out
lc1.c: In function ‘main’:
lc1.c:36: warning: large integer implicitly truncated to unsigned type

LIMITS: c:127, uc:255, s:32767, us:65535
LIMITS: i:2147483647, ui:4294967295, l:2147483647, ul:4294967295

i = s: 32767, i = us: 65535, ui = s: 32767, ui = us: 65535

c = s: -1, c = us: -1, uc = s: 255, uc = us: 255

uc = UCHAR_MAX (255) + 1: 0, uc = -1: 255
dan@dan-desktop:~/source$ cat lc1.c
#include <stdio.h>
#include <limits.h>

int main(void)
{
    char c = CHAR_MAX;
    unsigned char uc = UCHAR_MAX;
    short s = SHRT_MAX;
    unsigned short us = USHRT_MAX;
    int i = INT_MAX;
    unsigned int ui = UINT_MAX;
    long l = LONG_MAX;
    unsigned long ul = ULONG_MAX;

    printf("\n");
    printf("LIMITS: c:%d, uc:%u, s:%d, us:%u\n"
           "LIMITS: i:%d, ui:%u, l:%ld, ul:%lu\n",
           c, uc, s, us, i, ui, l, ul);

    /* promotion */
    printf("\n");
    printf("i = s: %d, ", i = s);
    printf("i = us: %d, ", i = us);
    printf("ui = s: %u, ", ui = s);
    printf("ui = us: %u\n", ui = us);

    /* narrowing */
    printf("\n");
    printf("c = s: %d, ", c = s);
    printf("c = us: %d, ", c = us);
    printf("uc = s: %d, ", uc = s); /*Ln:31*/
    printf("uc = us: %d\n", uc = us);

    printf("\n");
    printf("uc = UCHAR_MAX (%u) + 1: %u, ",
           UCHAR_MAX, uc = UCHAR_MAX + 1); /* Ln:36 */
    printf("uc = -1: %d\n", uc = -1);
    return 0;
}



// gcc -std=c99 -Wall -Wextra lc1.c -o out; ./out
dan@dan-desktop:~/source$

I think the possibilities are all right here:

Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:

    If both operands have the same type, then no further conversion
    is needed.

    Otherwise, if both operands have signed integer types or both
    have unsigned integer types, the operand with the type of lesser
    integer conversion rank is converted to the type of the operand
    with greater rank.

    Otherwise, if the operand that has unsigned integer type has rank
    greater or equal to the rank of the type of the other operand,
    then the operand with signed integer type is converted to the
    type of the operand with unsigned integer type.

    Otherwise, if the type of the operand with signed integer type
    can represent all of the values of the type of the operand with
    unsigned integer type, then the operand with unsigned integer
    type is converted to the type of the operand with signed integer
    type.

    Otherwise, both operands are converted to the unsigned integer
    type corresponding to the type of the operand with signed integer
    type.
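
The rule about an unsigned operand of equal or greater rank is the one that tends to surprise people. Here is a small sketch of it in action; the function names are made up for illustration. When an int meets an unsigned int, the int operand is converted to unsigned int before the operation is carried out.

```c
#include <limits.h>

/* Hypothetical demo: -1 is converted to unsigned int (becoming
   UINT_MAX) before the comparison, so the result is 0. */
int minus_one_less_than_one_unsigned(void)
{
    int i = -1;
    unsigned int u = 1u;
    return i < u;   /* compares UINT_MAX < 1u */
}

/* Same comparison with both operands signed: the result is 1. */
int minus_one_less_than_one_signed(void)
{
    int i = -1;
    int j = 1;
    return i < j;
}

/* int + unsigned int has type unsigned int, so -1 + 1u is 0u
   and -1 + 0u is UINT_MAX. */
unsigned int add_int_to_unsigned(int i, unsigned int u)
{
    return i + u;
}
```

gcc reports the mixed comparison when asked to (it is covered by -Wextra for C), which is one more reason to turn the warnings up.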
 
lovecreatesbeauty

...snip...

I added the minimum values to my new test this time. There are
warnings when shorts are assigned to smaller chars by implicit
conversion; no warnings are issued when shorts are assigned to larger
ints.

1. Assigning / converting a larger value to an object of a smaller
type loses data and is dangerous. It's also undefined behavior and
should be avoided, right?

2. I heard that assigning a value to a smaller unsigned type is not
undefined behavior, though it's dangerous, right?

3. How about doing these assignments or conversions with a cast
(explicit conversion)?

4. What are the rules for handling data values in conversions?

4.1 Which are undefined behavior and should be avoided?

4.2 Which are well defined and can be performed smoothly?

Thank you for your time


$ cat a.c
#include <stdio.h>
#include <limits.h>

int main(void)
{
    char c;
    unsigned char uc;
    int i;
    unsigned int ui;

    printf("LIMITS:\n"
           "SCHAR_MIN: %d, SCHAR_MAX: %d, UCHAR_MAX: %d,\n"
           "CHAR_MIN : %d, CHAR_MAX : %d,\n"
           "SHRT_MIN: %d, SHRT_MAX: %d, USHRT_MAX: %d,\n"
           "INT_MIN : %d, INT_MAX : %d, UINT_MAX : %u,\n"
           "LONG_MIN: %ld, LONG_MAX: %ld, ULONG_MAX: %lu\n",
           SCHAR_MIN, SCHAR_MAX, UCHAR_MAX,
           CHAR_MIN, CHAR_MAX,
           SHRT_MIN, SHRT_MAX, USHRT_MAX,
           INT_MIN, INT_MAX, UINT_MAX,
           LONG_MIN, LONG_MAX, ULONG_MAX);

    /* assign shorts to chars */
    printf("\n");
    printf("c = SHRT_MIN: %d, ", c = SHRT_MIN); /* Ln 25 */
    printf("c = SHRT_MAX: %d, ", c = SHRT_MAX);
    printf("c = USHRT_MAX: %d\n", c = USHRT_MAX);

    printf("uc = SHRT_MIN: %d, ", uc = SHRT_MIN); /* Ln 29 */
    printf("uc = SHRT_MAX: %d, ", uc = SHRT_MAX);
    printf("uc = USHRT_MAX: %d\n", uc = USHRT_MAX);

    /* assign shorts to ints */
    printf("\n");
    printf("i = SHRT_MIN: %d, ", i = SHRT_MIN);
    printf("i = SHRT_MAX: %d, ", i = SHRT_MAX);
    printf("i = USHRT_MAX: %d\n", i = USHRT_MAX);

    printf("ui = SHRT_MIN: %d, ", ui = SHRT_MIN);
    printf("ui = SHRT_MAX: %d, ", ui = SHRT_MAX);
    printf("ui = USHRT_MAX: %d\n", ui = USHRT_MAX);
    return 0;
}

$ make && ./a.out
gcc -ansi -pedantic -Wall -W -c -o a.o a.c
a.c: In function ‘main’:
a.c:25: warning: overflow in implicit constant conversion
a.c:26: warning: overflow in implicit constant conversion
a.c:27: warning: overflow in implicit constant conversion
a.c:29: warning: large integer implicitly truncated to unsigned type
a.c:30: warning: large integer implicitly truncated to unsigned type
a.c:31: warning: large integer implicitly truncated to unsigned type
gcc a.o -o a.out
LIMITS:
SCHAR_MIN: -128, SCHAR_MAX: 127, UCHAR_MAX: 255,
CHAR_MIN : -128, CHAR_MAX : 127,
SHRT_MIN: -32768, SHRT_MAX: 32767, USHRT_MAX: 65535,
INT_MIN : -2147483648, INT_MAX : 2147483647, UINT_MAX : 4294967295,
LONG_MIN: -2147483648, LONG_MAX: 2147483647, ULONG_MAX: 4294967295

c = SHRT_MIN: 0, c = SHRT_MAX: -1, c = USHRT_MAX: -1
uc = SHRT_MIN: 0, uc = SHRT_MAX: 255, uc = USHRT_MAX: 255

i = SHRT_MIN: -32768, i = SHRT_MAX: 32767, i = USHRT_MAX: 65535
ui = SHRT_MIN: -32768, ui = SHRT_MAX: 32767, ui = USHRT_MAX: 65535
$
 
lovecreatesbeauty

printf("ui = SHRT_MIN: %d, ", ui = SHRT_MIN);
printf("ui = SHRT_MAX: %d, ", ui = SHRT_MAX);
printf("ui = USHRT_MAX: %d\n", ui = USHRT_MAX);

printf("ui = SHRT_MIN: %u, ", ui = SHRT_MIN);
printf("ui = SHRT_MAX: %u, ", ui = SHRT_MAX);
printf("ui = USHRT_MAX: %u\n", ui = USHRT_MAX);
ui = SHRT_MIN: -32768, ui = SHRT_MAX: 32767, ui = USHRT_MAX: 65535

ui = SHRT_MIN: 4294934528, ui = SHRT_MAX: 32767, ui = USHRT_MAX: 65535
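
The 4294934528 printed for ui = SHRT_MIN is SHRT_MIN + UINT_MAX + 1 on that platform (16-bit short, 32-bit unsigned int). A sketch of the same arithmetic done by hand, next to the conversion itself; both helper names are hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: the value a negative int takes when converted
   to unsigned int, worked out as value + (UINT_MAX + 1) without
   letting any intermediate step overflow.  Only meaningful for
   value < 0. */
unsigned int wrapped_by_hand(int value)
{
    /* -(value + 1) is non-negative and cannot overflow for value < 0 */
    unsigned int below_max = (unsigned int)(-(value + 1));
    return UINT_MAX - below_max;
}

/* The conversion the language performs on assignment or cast. */
unsigned int wrapped_by_conversion(int value)
{
    return (unsigned int)value;
}
```

The two functions agree for every negative int. When printing the result, %u is the specifier that matches unsigned int.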
 
Tim Rentsch

pete said:
It is not undefined behavior to assign an out of range value
to an integer type object.

Strictly speaking (and under the presumption that the type being
converted to is a signed integer type), it's implementation-defined
behavior, but the consequence of such implementation-defined behavior
can be an undefined result.
 
lovecreatesbeauty

Integer promotions only go to either (signed int) or (unsigned int).

Which way only depends upon
if INT_MAX is greater than or equal to UCHAR_MAX.

Thank you, pete.
I read more and might get it. Please correct me as usual. Literal
integers are type of int. (1 + UCHAR_MAX) are type of 1 /*literal
one*/ and won't go to long or unsigned long.
(1U  + UCHAR_MAX) is of type (unsigned int).
(1LU + UCHAR_MAX) is of type (long unsigned int).

Yes, thank you, I also get it this time. 1LU is of type unsigned long
and the entire expression is therefore of type unsigned long as well.
 
lovecreatesbeauty

It is not undefined behavior to assign an out of range value
to an integer type object.

I think the result might not be foreseen, should I avoid this?
It's tricky to assign an out of range value to an unsigned type,
but not unheard of.

Tricky? And you had not heard of this? Or are these just my own silly
thoughts?

You might lose warnings with a cast.

This expression: ((unsigned char)-1)
appears in some of my code examples.

But I think I may lose data.

What's good and what's bad, what should I avoid when I write numeric
expressions?
 
Ersek, Laszlo

I think the result might not be foreseen, should I avoid this?

Yes. At least whenever I write up an integer arithmetic expression, I
like to be clear about promotions and implicit conversions and possible
value ranges of all subexpressions.


I use ((size_t)-1) all the time.
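
A sketch of why that idiom is dependable: converting -1 to any unsigned type reduces it modulo MAX + 1, which always lands on the type's maximum value. The function names are made up for illustration, and SIZE_MAX comes from <stdint.h>, so this assumes C99 or later.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Each function returns the maximum value of its return type. */
unsigned char all_ones_uchar(void) { return (unsigned char)-1; }
unsigned long all_ones_ulong(void) { return (unsigned long)-1; }
size_t        all_ones_size(void)  { return (size_t)-1; }
```

So ((unsigned char)-1) is just another spelling of UCHAR_MAX, and ((size_t)-1) of SIZE_MAX; no data is lost because the result is exactly what was intended.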

What's good and what's bad, what should I avoid when I write numeric
expressions?

If you also do floating point, I think you should be aware of at least
the floating point environment and any accumulated error that has crept
in. My personal approach is that unless I do actual numerical analysis
in C, I won't touch the floating point types.


* David Goldberg: What Every Computer Scientist Should Know About
Floating-Point Arithmetic

http://docs.sun.com/source/806-3568/ncg_goldberg.html


* David Monniaux: The Pitfalls of Verifying Floating-Point Computations

http://portal.acm.org/citation.cfm?id=1353446

Page 12:2:
shall be particularly interested in programs written in the C
programming language, because this language is often favoured for
embedded systems. We shall in particular discuss some implications of
the most recent standard of that language, "C99" [ISO/IEC 1999].<<

....

Page 12:4:
beliefs about how floating-point computations behave and what one can
safely suppose about them for program analysis, an opposite
misconception exists: that floating-point is inherently so complex and
tricky that it is impossible to do any kind of sound analysis, or do any
kind of sound reasoning, on programs using floating-point, except
perhaps in very simple cases.<<


Cheers,
lacos
 
Keith Thompson

I added the minimum values to my new test this time. There are
warnings when shorts are assigned to smaller chars by implicit
conversion; no warnings are issued when shorts are assigned to larger
ints.

1. Assigning / converting a larger value to an object of a smaller
type loses data and is dangerous. It's also undefined behavior and
should be avoided, right?

2. I heard that assigning a value to a smaller unsigned type is not
undefined behavior, though it's dangerous, right?

This is fully explained in section 6.3.1 of the C99 standard.
Grab the latest draft
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>
if you haven't already.

To summarize:

If the target type is _Bool (bool), the result is 0 (false) if the
source value is zero, 1 (true) if the source value is non-zero

Otherwise, if the source value fits in the target type, the conversion
just gives you the same value (possibly rounded if the target is a
floating-point type).

If the source value *doesn't* fit in the target type:

    Otherwise, if either the source or the target is floating-point,
    the behavior is undefined.

    Otherwise, if the target is unsigned, the result is reduced modulo
    MAX+1 (in effect the high-order bits are discarded).

    Otherwise (the target type is signed and the source value doesn't
    fit), either the result is implementation-defined or (new in C99)
    an implementation-defined signal is raised.

I *think* I got all that right. If any of this contradicts the
standard, then the standard is right and I'm wrong.

I strongly advise treating the implementation-defined result or
implementation-defined signal for an out-of-range signed target as
if it were undefined behavior. Assume it will make demons fly out
of your nose. If you intend your code to be at all portable, just
don't let a signed conversion overflow happen in the first place.
(Or you can choose to depend on implementation-defined behavior,
but then you're on your own.)
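
The floating-point case in the summary is the one with genuinely undefined behavior, so it is worth guarding. A minimal sketch, with hypothetical helper names, assuming double can represent INT_MAX + 1 exactly (true for 32-bit int with IEEE 754 double, not guaranteed everywhere):

```c
#include <limits.h>

/* Hypothetical helper: nonzero when (int)d is well defined.  The
   conversion truncates toward zero, so any d strictly between
   INT_MIN - 1 and INT_MAX + 1 is safe.  A NaN fails the first
   comparison and is rejected. */
int double_fits_in_int(double d)
{
    return d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0;
}

/* Hypothetical helper: converts only when it is safe to do so,
   otherwise returns `fallback`. */
int double_to_int_or(double d, int fallback)
{
    return double_fits_in_int(d) ? (int)d : fallback;
}
```

For example, double_to_int_or(3.9, -1) gives 3, while double_to_int_or(1e30, -1) gives the fallback instead of invoking undefined behavior.
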

3. How about doing these assignments or conversions with a cast
(explicit conversion)?

That makes no difference; the behavior of a conversion is exactly the
same whether it's implicit or explicit. (And thank you for getting
the terminology right.)
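
A tiny sketch of that point, with made-up function names: the two functions below perform the same conversion and always return the same value. The cast changes only what the compiler is inclined to warn about.

```c
#include <limits.h>

/* Implicit conversion: the int is converted to unsigned char by the
   return statement. */
unsigned char narrow_implicit(int value)
{
    return value;
}

/* Explicit conversion: the same conversion, spelled with a cast. */
unsigned char narrow_explicit(int value)
{
    return (unsigned char)value;
}
```

Both return 0 for UCHAR_MAX + 1 and UCHAR_MAX for -1, matching the output of the first program in the thread.
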

4. What are the rules for handling data values in conversions?

Be Careful.

4.1 Which are undefined behavior and should be avoided?

4.2 Which are well defined and can be performed smoothly?

See above.
 
Keith Thompson

Thank you, pete.
I read more and might get it. Please correct me as usual. Literal
integers are type of int. (1 + UCHAR_MAX) are type of 1 /*literal
one*/ and won't go to long or unsigned long.

The type of an integer constant depends on its value. See C99 6.4.4.1
for details, particularly the table on the 3rd page of that section.
Any unsuffixed integer constant whose value does not exceed INT_MAX
is of type int.

As for (1 + UCHAR_MAX), I assume that "of type 1" was a typo.

1 is of type int (C99 6.4.4.1). UCHAR_MAX is of whatever
type unsigned char is promoted to (C99 5.2.4.2.1p1). On most
implementations, where UCHAR_MAX <= INT_MAX, this is int, so
(1 + UCHAR_MAX) is of type int. If UCHAR_MAX > INT_MAX
(possible only if CHAR_BIT is at least 16), then UCHAR_MAX is
of type unsigned int, so (1 + UCHAR_MAX) is of type unsigned int
(and has the value 0).

(I think it may be theoretically possible for UCHAR_MAX to exceed
UINT_MAX if CHAR_BIT >= 16 and unsigned int has enough padding bits,
but that would be a really bizarre implementation which would have
serious problems just evaluating UCHAR_MAX; I'm not sure even the
DS9K goes that far.)

Yes, thank you, I also get it this time. 1LU is of type unsigned long
and the entire expression is therefore of type unsigned long as well.

Right.
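
If you have a C11 compiler, you can ask it for these types directly. This is only a sketch: KIND_OF, enum kind and kind_name are hypothetical names, and _Generic is a C11 feature, so it will not build with -ansi or -std=c99.

```c
#include <limits.h>

enum kind { KIND_INT, KIND_UINT, KIND_LONG, KIND_ULONG, KIND_OTHER };

/* _Generic selects a branch from the type of its controlling
   expression; the expression itself is not evaluated. */
#define KIND_OF(expr) _Generic((expr), \
    int:           KIND_INT,           \
    unsigned int:  KIND_UINT,          \
    long:          KIND_LONG,          \
    unsigned long: KIND_ULONG,         \
    default:       KIND_OTHER)

/* Readable name for a kind, suitable for printing with %s. */
const char *kind_name(enum kind k)
{
    switch (k) {
    case KIND_INT:   return "int";
    case KIND_UINT:  return "unsigned int";
    case KIND_LONG:  return "long";
    case KIND_ULONG: return "unsigned long";
    default:         return "other";
    }
}
```

On an implementation where UCHAR_MAX <= INT_MAX, kind_name(KIND_OF(1 + UCHAR_MAX)) is "int", kind_name(KIND_OF(1U + UCHAR_MAX)) is "unsigned int", and kind_name(KIND_OF(1LU + UCHAR_MAX)) is "unsigned long".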
 
Tim Rentsch

Keith Thompson said:
[...]
(I think it may be theoretically possible for UCHAR_MAX to exceed
UINT_MAX if CHAR_BIT >= 16 and unsigned int has enough padding bits,
[...]

It isn't, because of the rule about value ranges: "For any two
integer types with the same signedness and different integer
conversion rank (see 6.3.1.1), the range of values of the type with
smaller integer conversion rank is a subrange of the values of the
other type."

(That's 6.2.5p7.)
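
The subrange rule can be written down as a compile-time check. This is a sketch only; a conforming implementation should never trip the #error, and the helper name uchar_promotes_to_int is made up for illustration.

```c
#include <limits.h>

/* The range of each unsigned type must be a subrange of the next
   one's, so UCHAR_MAX can never exceed UINT_MAX. */
#if UCHAR_MAX > USHRT_MAX || USHRT_MAX > UINT_MAX || UINT_MAX > ULONG_MAX
#error "unsigned integer ranges are not nested"
#endif

/* Hypothetical helper: nonzero when int can represent every unsigned
   char value, i.e. when unsigned char promotes to int rather than to
   unsigned int. */
int uchar_promotes_to_int(void)
{
    return UCHAR_MAX <= INT_MAX;
}
```

On the platforms discussed in this thread uchar_promotes_to_int() returns 1, which is why (1 + UCHAR_MAX) has type int there.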
 
Keith Thompson

Tim Rentsch said:
Keith Thompson said:
[...]
(I think it may be theoretically possible for UCHAR_MAX to exceed
UINT_MAX if CHAR_BIT >= 16 and unsigned int has enough padding bits,
[...]

It isn't, because of the rule about value ranges: "For any two
integer types with the same signedness and different integer
conversion rank (see 6.3.1.1), the range of values of the type with
smaller integer conversion rank is a subrange of the values of the
other type."

(That's 6.2.5p7.)

Thanks, you're right.
 
lovecreatesbeauty

This is fully explained in section 6.3.1 of the C99 standard.
Grab the latest draft
    <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>
if you haven't already.

To summarize:

If the target type is _Bool (bool), the result is 0 (false) if the
source value is zero, 1 (true) if the source value is non-zero

Otherwise, if the source value fits in the target type, the conversion
just gives you the same value (possibly rounded if the target is a
floating-point type).

If the source value *doesn't* fit in the target type:

    Otherwise, if either the source or the target is floating-point,
    the behavior is undefined.

    Otherwise, if the target is unsigned, the result is reduced modulo
    MAX+1 (in effect the high-order bits are discarded).

    Otherwise (the target type is signed and the source value doesn't
    fit), either the result is implementation-defined or (new in C99)
    an implementation-defined signal is raised.

I *think* I got all that right.  If any of this contradicts the
standard, then the standard is right and I'm wrong.

Keith, thank you very much for writing this post. It's a great help to
me. I'll need some time to digest what you've written and apply it
when I write C code. Maybe I'll post new code here in the future to
ask for corrections.

I strongly advise treating the implementation-defined result or
implementation-defined signal for an out-of-range signed target as
if it were undefined behavior.  Assume it will make demons fly out
of your nose.  If you intend your code to be at all portable, just
don't let a signed conversion overflow happen in the first place.
(Or you can choose to depend on implementation-defined behavior,
but then you're on your own.)

I totally agree with you and think this is good advice. I'm afraid of
breaking my own C code without knowing the undefined behavior already
involved in my code.
 
