condition true or false? -> (-1 < sizeof("test"))

John Reye

Hello,

please can someone explain why
(-1 < sizeof("test"))
is false.

What do you think the following will print?

/***********************/
#include <stdio.h>

int main(void)
{
int i;

if (-1 < sizeof("test1")) {
printf("Cool line 1\n");
}

if (-1 < -1 + sizeof("test2")) {
printf("Cool line 2\n");
}

for (i = -1; i >= -3; i--) {
if (i < i + 1U) {
printf("Here %d\n", i);
}
}
return 0;
}




I am truly shocked and amazed.

Is there any coding suggestion that will (in future) save me half an
hour of sprinkling printf's like wild and thinking my compiler is
buggy, and that logic has just died?!!!!!!

What do I need to know to avoid these surprises and is there some
"coding style" that can guard against it.

Thanks!




For the real-world code, that caused my confusion:
#include <stdio.h>
#include <limits.h>

#if EOF != -1
#error EOF is not -1
#endif

int arr[UCHAR_MAX+2]; // storage for every unsigned char, and one additional value

#define NUM_ARRAY_ELEMENTS(arr) (sizeof(arr)/sizeof(arr[0]))

int main(void)
{
int i;
for (i = EOF; i < EOF + NUM_ARRAY_ELEMENTS(arr); i++) {
printf("%d\n", i);
arr[i+1] = i;
}
return 0;
}
 
John Reye

For some more (similar) surprises:

#include <stdio.h>

int main(void)
{
int i;
char arr[1000];

if (-5 < sizeof(arr))
printf("Will this print?\n");

if (-99999999999999999 < sizeof(arr))
printf("What about this?\n");

return 0;
}
 
BartC

Robert Wessel said:
On Thu, 17 May 2012 01:40:08 -0700 (PDT), John Reye
Somewhat simplified*, C's "usual arithmetic conversions" convert
signed integers to unsigned when paired with an unsigned value for
some operator.
As to coding techniques... Don't compare signed and unsigned numbers
unless you really mean it. Your compiler should flag that sort of
thing at higher warning levels.

I always thought C was really a multitude of languages, not one, when the
myriad compiler options are taken into account.

Having one module allow mixed arithmetic, and not another, is a good
example. (Someone else uses a different compiler and different switches, and
different things will happen.)

MSVC will flag that sort of thing at
-W2 and above (and there is a way to turn on that specific warning even
at -W1). In any event, you should really be using -W3 (assuming MSVC)
as a minimum anyway (-W4 can be somewhat painful). For GCC, turn on
"-Wconversion", which, I think, gets turned on with "-Wall".

-Wall doesn't seem to turn on -Wconversion.

However, -Wconversion doesn't give a warning in this example:

unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n\n", a, b, a<b);
printf("%d<%d = %d\n\n", 4, b, 4<b);
printf("%u<%d = %d\n\n", a, -2, a<-2);
printf("%d<%d = %d\n\n", 4, -2, 4<-2);

which gives conflicting results. (Presumably an integer literal such as "4"
is assumed to be signed? Having it as unsigned would be more intuitive, as
it would be impossible for it to be negative.)
 
Ben Bacarisse

BartC said:
I always thought C was really a multitude of languages, not one, when the
myriad compiler options are taken into account.

Having one module allow mixed arithmetic, and not another, is a good
example. (Someone else uses a different compiler and different switches, and
different things will happen.)

That's an odd way to look at it. Mixed arithmetic is always allowed. I
don't regard the presence of more or fewer warnings as indicating that I
am using a different language!

(Presumably an integer literal such as "4"
is assumed to be signed? Having it as unsigned would be more intuitive, as
it would be impossible for it to be negative.)

But that would turn x < 4 into an unsigned comparison. Worse, -1 would
always be unsigned (and large). You'd need to alter a lot of details if
you change something as fundamental as the type of a literal constant.
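
To make the point about `x < 4` concrete, here is a small sketch comparing the same negative int against a signed and an unsigned constant. The function names are made up for illustration and are not from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: 4 has type int, so this is a signed comparison. */
int less_than_signed_four(int x)
{
    return x < 4;
}

/* Hypothetical helper: 4u has type unsigned int, so x is converted to
   unsigned int first and a negative x becomes a very large value. */
int less_than_unsigned_four(int x)
{
    return x < 4u;
}

/* Self-check plus a printout of both results for x == -1. */
void demo_literal_signedness(void)
{
    assert(less_than_signed_four(-1) == 1);
    assert(less_than_unsigned_four(-1) == 0);
    printf("-1 < 4  -> %d\n", less_than_signed_four(-1));
    printf("-1 < 4u -> %d\n", less_than_unsigned_four(-1));
}
```

Calling `demo_literal_signedness()` from `main` runs the checks. If every constant were unsigned, the first function would behave like the second.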
 
gwowen

I am truly shocked and amazed.

Utterly horrible isn't it? A bizarre misfeature.
Is there any coding suggestion that will (in future) save me half an
hour of sprinkling printf's like wild and thinking my compiler is
buggy, and that logic has just died?!!!!!!

What do I need to know to avoid these surprises and is there some
"coding style" that can guard against it.

All I can suggest is to turn the compiler warnings on to the
absolutely highest possible, and understand why they warn. Compiled
with

gcc -Wall -Wextra -Werror

for example, the above code will fail to compile and the diagnostic
will be something like:

"Error: comparison between signed and unsigned types"

It'll also flag things like
if(x = 1){
...
}

where you (probably) meant

if(x==1){
...
}

and tell you what to do if you *meant* if(x=1)...
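
Beyond turning on warnings, one defensive pattern is to handle the negative case explicitly before any conversion happens. The helper below is only a sketch under that idea; `int_less_than_size` is a hypothetical name, not something from the thread or the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a is mathematically less than b. */
int int_less_than_size(int a, size_t b)
{
    if (a < 0)
        return 1;                       /* negative is below every size */
    /* a >= 0 here, so both values fit in uintmax_t without changing. */
    return (uintmax_t)a < (uintmax_t)b;
}

/* Self-check plus a printout for the original poster's expression. */
void demo_int_less_than_size(void)
{
    assert(int_less_than_size(-1, sizeof("test")) == 1);
    printf("-1 < sizeof(\"test\") -> %d\n",
           int_less_than_size(-1, sizeof("test")));
}
```

Both operands are widened to `uintmax_t` so the sketch does not depend on how `size_t` relates to `int` on a given implementation.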
 
Andreas Perstinger

However, -Wconversion doesn't give a warning in this example:

You need -Wsign-compare (or -Wextra) with gcc in your example.
unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n\n", a, b, a<b);
printf("%d<%d = %d\n\n", 4, b, 4<b);
printf("%u<%d = %d\n\n", a, -2, a<-2);
printf("%d<%d = %d\n\n", 4, -2, 4<-2);

which gives conflicting results.

Why are the results conflicting?

1) a < b: you are comparing unsigned int and signed int -> signed int
gets converted to unsigned -> 4 is smaller than -2 + UINT_MAX + 1

2) 4 < b: you are comparing a decimal integer constant (which is
normally of type signed int) with signed int -> no implicit conversion
-> 4 isn't smaller than -2

3) a < -2: you are comparing unsigned int with a decimal integer
constant (type signed int) -> -2 is converted to unsigned int -> 4 is
smaller than -2 + UINT_MAX + 1

4) 4 < -2: you are comparing a decimal integer constant with another
decimal integer constant -> no implicit conversion -> 4 isn't smaller
than -2

I think your problem is that in example 1 and 3 you print the signed
values (b and -2) with the %d format specifier but to see what's going
on you have to use %u because they get converted to unsigned int before
the comparison is evaluated:

printf("%u<%u = %d\n\n", a, b, a<b);
printf("%d<%d = %d\n\n", 4, b, 4<b);
printf("%u<%u = %d\n\n", a, -2, a<-2);
printf("%d<%d = %d\n\n", 4, -2, 4<-2);
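
The conversion described in cases 1 and 3 can be made visible with an explicit cast, which also keeps the `%u` format matched to an unsigned argument. This is a sketch only; `as_unsigned` is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: the value a signed int takes on when it is
   converted to unsigned int, as happens in a mixed comparison. */
unsigned int as_unsigned(int v)
{
    return (unsigned int)v;
}

/* Shows the operands as the comparison actually sees them. */
void demo_converted_operands(void)
{
    unsigned int a = 4;
    int b = -2;

    /* -2 becomes -2 + UINT_MAX + 1, that is UINT_MAX - 1. */
    assert(as_unsigned(b) == UINT_MAX - 1u);

    printf("%u < %u = %d\n", a, as_unsigned(b), a < as_unsigned(b));
    printf("%d < %d = %d\n", 4, b, 4 < b);
}
```

The first line printed shows the unsigned comparison with its converted right operand; the second shows the purely signed comparison.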

Bye, Andreas
 
Eric Sosman

Hello,

please can someone explain why
(-1< sizeof("test"))
is false.

Because of the "usual arithmetic conversions," 6.3.1.8.
Most arithmetic operators require operands of the same type,
so for differing types the UAC's operate to reconcile them
before the operation is performed. In this case you have an
int and a size_t, and the UAC's convert the int to size_t:

if ( (size_t)-1 < sizeof("test") )

Since size_t is an unsigned type, (size_t)-1 is the largest
value the type can represent, and is a good deal greater
than (size_t)5.
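
As a quick illustration of that conversion, the sketch below checks that `(size_t)-1` equals `SIZE_MAX` and evaluates the comparison with the cast written out. The function names are hypothetical, and `SIZE_MAX` comes from `<stdint.h>` (C99 or later).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: -1 converted to size_t. */
size_t minus_one_as_size(void)
{
    return (size_t)-1;
}

/* Hypothetical helper: the comparison with the conversion made explicit. */
int explicit_conversion_compare(void)
{
    return (size_t)-1 < sizeof("test");
}

/* Prints both operands and the result of the comparison. */
void demo_size_t_conversion(void)
{
    assert(minus_one_as_size() == SIZE_MAX);
    printf("(size_t)-1     = %zu\n", minus_one_as_size());
    printf("sizeof(\"test\") = %zu\n", sizeof("test"));
    printf("comparison     = %d\n", explicit_conversion_compare());
}
```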

Aside: I suppose that on a perverse implementation the
outcome might be different. If (size_t)-1 is mathematically
no greater than INT_MAX the conversion would go the other way.
The size_t would convert to an int before the comparison and
you'd have:

if ( -1 < (int)sizeof("test") )

I've never heard of an implementation where size_t is so
narrow, and I'm not 100% sure it would be conforming -- but
I'm not 100% sure it would be forbidden, either.
Is there any coding suggestion that will (in future) save me half an
hour of sprinkling printf's like wild and thinking my compiler is
buggy, and that logic has just died?!!!!!!

What do I need to know to avoid these surprises and is there some
"coding style" that can guard against it.

Try cranking up the warning levels on your compiler. If
you use gcc, "-W -Wall" will produce warnings for comparisons
between signed and unsigned types (there's surely a more specific
"-Wsomething" for just that particular warning, but I can't be
bothered to go ferret out just what it is).
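
For a loop like the one in the original post, one possible way to avoid the mixed comparison is to convert the element count to `int` once, so that the loop condition is signed on both sides. This is a sketch, not the original poster's final code; `fill_table` and `count` are names introduced here, and it assumes the element count fits in an `int`.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#if EOF != -1
#error EOF is not -1
#endif

#define NUM_ARRAY_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

int arr[UCHAR_MAX + 2];   /* one slot for EOF plus one per unsigned char */

/* Hypothetical helper: fills arr so that arr[i + 1] == i for every i
   from EOF up to UCHAR_MAX, and returns the number of iterations. */
int fill_table(void)
{
    const int count = (int)NUM_ARRAY_ELEMENTS(arr); /* signed copy */
    int iterations = 0;
    int i;

    for (i = EOF; i < EOF + count; i++) {  /* int compared with int */
        arr[i + 1] = i;
        iterations++;
    }
    return iterations;
}
```

With the count held in an `int`, `EOF + count` is computed in signed arithmetic and the loop body runs once per array element.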
 
BartC

Andreas Perstinger said:
You need -Wsign-compare (or -Wextra) with gcc in your example.


Why are the results conflicting?

1) a < b: you are comparing unsigned int and signed int -> signed int
gets converted to unsigned -> 4 is smaller than -2 + UINT_MAX + 1
I think your problem is that in example 1 and 3 you print the signed
values (b and -2) with the %d format specifier but to see what's going

Because they are signed!
on you have to use %u because they get converted to unsigned int before
the comparison is evaluated:

printf("%u<%u = %d\n\n", a, b, a<b);

This is the point. You're just demonstrating here *how* it manages to print
the wrong result!

But the fact is that b *is* signed, and needs to be displayed with %d. So in
my original example, the 4 and -2 were printed correctly in each case.

Also my gcc didn't give any warnings despite using -Wconversion.
 
BartC

Ben Bacarisse said:
But that would turn x < 4 into an unsigned comparison. Worse, -1 would
always be unsigned (and large).

1 would be unsigned.

-1, assuming constant folding by the compiler, would be equivalent to a
signed integer literal of "-1".

(If not, then it remains the negation of unsigned 1, performed at runtime.
For this purpose, negating an unsigned value would need to be allowed, and I
can't see a problem with that, except the usual overflow issues).
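
For comparison with that proposal, standard C as it exists already accepts unary minus on an unsigned operand; the result stays unsigned and wraps around modulo `UINT_MAX + 1`. A minimal sketch, with `negate_unsigned` as a hypothetical name:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: unary minus applied to an unsigned int. */
unsigned int negate_unsigned(unsigned int u)
{
    return -u;   /* well defined: reduced modulo UINT_MAX + 1 */
}

/* Shows that negating 1u gives the largest unsigned int, not -1. */
void demo_negate_unsigned(void)
{
    assert(negate_unsigned(1u) == UINT_MAX);
    printf("-(1u) = %u\n", negate_unsigned(1u));
}
```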
 
Eric Sosman

I always thought C was really a multitude of languages, not one, when the
myriad compiler options are taken into account.

Not a multitude of languages, but a multitude of implementations.
Changing the compiler options is equivalent to changing the compiler.
The Standard is silent on whether different implementations must
interoperate smoothly, but the compiler's documentation should tell
you which sets of flags are and are not compatible. (For example,
it might be impossible to mix "-ILP32" and "-LP64" modules in the
same executable, even if "the same" compiler generates both.)
which gives conflicting results. (Presumably an integer literal such as "4"
is assumed to be signed? Having it as unsigned would be more intuitive, as
it would be impossible for it to be negative.)

Well, "4" is not an integer literal, but that's a markup issue :)

The types of integer constants depend on their magnitude and
on the notation used, as described in 6.4.4.1p5. The constant 4
has type int, and is signed even though the value is positive.
If you think it would be "more intuitive" for the 4 to be unsigned,
please ponder `-8 / 4' and `-8 / 4u' and explain why having them
be identical would be unsurprising.
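
The two divisions can be tried directly. The sketch below uses hypothetical function names and expresses the mixed result in terms of `UINT_MAX` rather than a fixed number, since the width of `unsigned int` varies between implementations.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: both operands are int, so the division is signed. */
int divide_signed(int n, int d)
{
    return n / d;
}

/* Hypothetical helper: d is unsigned, so n is converted to unsigned int
   before the division takes place. */
unsigned int divide_mixed(int n, unsigned int d)
{
    return n / d;
}

/* Prints both quotients for -8 divided by 4. */
void demo_division(void)
{
    assert(divide_signed(-8, 4) == -2);
    assert(divide_mixed(-8, 4u) == (UINT_MAX - 7u) / 4u);
    printf("-8 / 4  = %d\n", divide_signed(-8, 4));
    printf("-8 / 4u = %u\n", divide_mixed(-8, 4u));
}
```

In the mixed case -8 is first converted to `UINT_MAX - 7`, so the quotient is a large positive number rather than -2.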
 
BartC

Eric Sosman said:
On 5/17/2012 6:16 AM, BartC wrote:
The types of integer constants depend on their magnitude and
on the notation used, as described in 6.4.4.1p5. The constant 4
has type int, and is signed even though the value is positive.
If you think it would be "more intuitive" for the 4 to be unsigned,
please ponder `-8 / 4' and `-8 / 4u' and explain why having them
be identical would be unsurprising.

If you're going to make integer literals be unsigned by default, then you
would probably also change mixed arithmetic to be signed and not unsigned.
 
Andreas Perstinger

Because they are signed!


This is the point. You're just demonstrating here *how* it manages to print
the wrong result!

But the fact is that b *is* signed, and needs to be displayed with %d. So in
my original example, the 4 and -2 were printed correctly in each case.

I thought you were interested in the behaviour of the comparison,
weren't you? At least that was the problem of the OP.

Of course b as such is signed but in regard to the comparison with an
unsigned value it gets converted to unsigned int. Therefore I think
printing b as int just adds to the confusion.
Also my gcc didn't give any warnings despite using -Wconversion.

$ cat test.c
#include <stdio.h>

int main(void)
{
unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n\n", a, b, a<b);
printf("%d<%d = %d\n\n", 4, b, 4<b);
printf("%u<%d = %d\n\n", a, -2, a<-2);
printf("%d<%d = %d\n\n", 4, -2, 4<-2);

return 0;
}
$ gcc -o test -Wconversion test.c
$ gcc -o test -Wextra test.c
test.c: In function ‘main’:
test.c:8:37: warning: comparison between signed and unsigned integer expressions
test.c:10:38: warning: comparison between signed and unsigned integer expressions
$ gcc --version
gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2

Bye, Andreas
 
Stefan Ram

Eric Sosman said:
Well, "4" is not an integer literal, but that's a markup issue :)

That's why I write:

,,an int literal, such as »4«,`` and

,,a string literal, such as »"4"«,``

. I am giving classes for beginners. The "»«" quotes make sure that
my students will not type in these quotes into their C programs,
because they do not know how to enter them using their keyboards.
Moreover, these quotes cannot be confused for C string quote »""«.
Moreover, they are part of the widespread charset ISO-8859-1,
so it is justifiable to use them in Usenet posts tagged

Content-Type: text/plain; charset=ISO-8859-1
 
BartC

Andreas Perstinger said:
On 2012-05-17 14:00, BartC wrote:
Of course b as such is signed but in regard to the comparison with an
unsigned value it gets converted to unsigned int. Therefore I think
printing b as int just adds to the confusion.

Printing signed b as "%u" would probably itself have induced a comment, and
generated confusion of its own.
$ gcc -o test -Wconversion test.c
$ gcc -o test -Wextra test.c

OK, so there are even more warning levels beyond -Wall and -Wconversion.
I thought you were interested in the behaviour of the comparison, weren't
you? At least that was the problem of the OP.

(Regarding the OP's problem, the suggestion I made elsewhere, that mixed
arithmetic should be signed not unsigned, would have fixed it!

Well, except in the unlikely scenario that sizeof("test") was bigger than
2GB or so.)
 
Tim Prince

Because of the "usual arithmetic conversions," 6.3.1.8.
Most arithmetic operators require operands of the same type,
so for differing types the UAC's operate to reconcile them
before the operation is performed. In this case you have an
int and a size_t, and the UAC's convert the int to size_t:

if ( (size_t)-1 < sizeof("test") )

Since size_t is an unsigned type, (size_t)-1 is the largest
value the type can represent, and is a good deal greater
than (size_t)5.

Aside: I suppose that on a perverse implementation the
outcome might be different. If (size_t)-1 is mathematically
no greater than INT_MAX the conversion would go the other way.
The size_t would convert to an int before the comparison and
you'd have:

if ( -1 < (int)sizeof("test") )

I've never heard of an implementation where size_t is so
narrow, and I'm not 100% sure it would be conforming -- but
I'm not 100% sure it would be forbidden, either.

I've tried to work with an organization whose coding policies were
incompatible with size_t taking a wider data type than int. The
assignment probably was punishment for my background in Fortran, which
may still account for not fully understanding the possibilities you
describe.
 
James Kuyper

On 05/17/2012 05:38 AM, Robert Wessel wrote:
....
Somewhat simplified*, C's "usual arithmetic conversions" convert
signed integers to unsigned when paired with an unsigned value for
some operator. ....
*an exception exists when all possible values of the unsigned type can be
represented in the signed type, but it doesn't apply here

The first part of the usual arithmetic conversions is to apply the
integer promotions, which means that if int can represent all values of
the unsigned type, it will be promoted to int (6.3.1.1p2). Your
assertion to the contrary notwithstanding, that can apply here: SIZE_MAX
< INT_MAX is possible, though rather unusual.

I believe that the exception you're referring to is the one that applies
after the integer promotions, and only if they don't convert the
unsigned value to 'int'. It applies only if the integer conversion rank
of the unsigned type is lower than the rank of the signed type. For
instance, if UINT_MAX < LLONG_MAX (which is pretty likely to be true,
but not a requirement), the expression -1LL < 5U will be evaluated using
long long, and will therefore be true.
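
The `-1LL < 5U` case can be set next to the plain `int` case. This is a sketch with hypothetical function names; the first result depends on `UINT_MAX < LLONG_MAX` holding, as noted above.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: when long long can represent every unsigned int
   value, 5U is converted to long long and the comparison is signed. */
int long_long_vs_unsigned(void)
{
    return -1LL < 5U;
}

/* Hypothetical helper: int and unsigned int have the same rank, so -1
   is converted to unsigned int and becomes UINT_MAX. */
int int_vs_unsigned(void)
{
    return -1 < 5U;
}

/* Prints the result of both comparisons. */
void demo_rank_exception(void)
{
    assert(int_vs_unsigned() == 0);
    printf("-1LL < 5U -> %d\n", long_long_vs_unsigned());
    printf("-1   < 5U -> %d\n", int_vs_unsigned());
}
```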
 
James Kuyper

On 05/17/2012 07:38 AM, Eric Sosman wrote:
....
outcome might be different. If (size_t)-1 is mathematically
no greater than INT_MAX the conversion would go the other way.
The size_t would convert to an int before the comparison and
you'd have:

if ( -1 < (int)sizeof("test") )

I've never heard of an implementation where size_t is so
narrow, and I'm not 100% sure it would be conforming -- but
I'm not 100% sure it would be forbidden, either.

Can you give a justification for your doubts about whether such an
implementation could be conforming?

The lower limit for SIZE_MAX is 65535, and there's no upper limit for
INT_MAX, so I don't see why an implementation where SIZE_MAX < INT_MAX
could not be fully conforming.
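
A program can inspect these limits for the implementation it is compiled on. The sketch below uses hypothetical function names and takes `SIZE_MAX` from `<stdint.h>` (C99 or later); it reports whether every `size_t` value fits in an `int` and what the original comparison yields.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#if SIZE_MAX < 65535
#error SIZE_MAX is below the minimum the standard allows
#endif

/* Hypothetical helper: 1 if every size_t value fits in an int here. */
int size_fits_in_int(void)
{
    return SIZE_MAX <= INT_MAX;
}

/* Hypothetical helper: the original poster's comparison, unchanged. */
int minus_one_less_than_sizeof(void)
{
    return -1 < sizeof("test");
}

/* Prints what this particular implementation does. */
void demo_size_limits(void)
{
    printf("size_t fits in int: %d\n", size_fits_in_int());
    printf("-1 < sizeof(\"test\"): %d\n", minus_one_less_than_sizeof());
}
```

On implementations where `SIZE_MAX` exceeds `INT_MAX`, which covers the common ones, the second function returns 0.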
 
Ben Bacarisse

BartC said:
1 would be unsigned.

-1, assuming constant folding by the compiler, would be equivalent to
a signed integer literal of "-1".

(If not, then it remains the negation of unsigned 1, performed at
runtime. For this purpose, negating an unsigned value would need to be
allowed, and I can't see a problem with that, except the usual
overflow issues).

This discussion is confusing because it is not clear what changes you
are thinking of. Are you proposing that the operand of - not be subject
to integer promotion, or are you proposing to change how integer
promotion is defined?

What about x < 4 being an unsigned comparison? Are you proposing a change
to the definition of <, to the usual arithmetic conversions, or something
else?

If all you are saying is that there is probably a language a bit like C in
which integer constants are all unsigned, and that such a language might
have fewer surprises for people learning it, then I am happy to agree.
 
BartC

This discussion is confusing because it is not clear what changes you
are thinking of. Are you proposing that the operand of - not be subject
to integer promotion, or are you proposing to change how integer
promotion is defined?

I hadn't planned to propose any changes! Just clarifying the signedness of
4, to make sense of my examples where the < was giving conflicting results
even though values being compared were apparently the same, yet there was no
warning given. (As it turns out, gcc needs -Wextra to give the warning.)
What about x < 4 being an unsigned comparison? Are you proposing a change
to the definition of <, to the usual arithmetic conversions, or something
else?

If literals are unsigned, then yes, it would cause problems *because* of C
using unsigned modes for mixed arithmetic. So it would be a massive change
compared to, say, getting rid of trigraphs, which hardly anyone would
notice.
If all you are saying is that there is probably a language a bit like C in
which integer constants are all unsigned, and that such a language might
have fewer surprises for people learning it, then I am happy to agree.

(Actually I am working on such a language. There, I found things fell into
place better if literals were unsigned. Also it will use signed arithmetic
where operands are mixed, having briefly considered doing what C does.

That doesn't remove all problems, but I felt it was generally more useful.
And would have given the expected result in "-1<sizeof("test")", whether
size_t was signed or not.)
 
James Kuyper

On 05/17/2012 06:16 AM, BartC wrote:
....
However, -Wconversion doesn't give a warning in this example:

unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n\n", a, b, a<b);
printf("%d<%d = %d\n\n", 4, b, 4<b);
printf("%u<%d = %d\n\n", a, -2, a<-2);
printf("%d<%d = %d\n\n", 4, -2, 4<-2);

which gives conflicting results. (Presumably an integer literal such as "4"
is assumed to be signed? Having it as unsigned would be more intuitive, as
it would be impossible for it to be negative.)

Would you be less confused by the following:

printf("%u<%u = %d\n\n", a, (unsigned)b, a <b);
printf("%u<%u = %d\n\n", 4U, (unsigned)b, 4U<b);
printf("%u<%u = %d\n\n", a, (unsigned)-2, a <-2);
printf("%u<%u = %d\n\n", 4U, (unsigned)-2, 4U<-2);
 
