condition true or false? -> (-1 < sizeof("test"))

T

Tim Rentsch

Eric Sosman said:
Hello,

please can someone explain why
(-1 < sizeof("test"))
is false.

[snip]

Aside: I suppose that on a perverse implementation the
outcome might be different. If (size_t)-1 is mathematically
no greater than INT_MAX the conversion would go the other way.
The size_t would convert to an int before the comparison and
you'd have:

if ( -1 < (int)sizeof("test") )

I've never heard of an implementation where size_t is so
narrow, and I'm not 100% sure it would be conforming -- but
I'm not 100% sure it would be forbidden, either.

More specifically, if the integer conversion rank of size_t is
less than the integer conversion rank of int, and if the two
limits involved satisfy SIZE_MAX <= INT_MAX, then expressions
like '-1 < sizeof x' can be true. For size_t, the Standard
requires only that it be an unsigned integer type and that its
maximum value be at least 65535. So, it is certainly possible
for an implementation to satisfy the listed conditions and
still be conforming.
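
A small sketch (added here for illustration, not from the thread) of how one could check which way the conversion goes on a given implementation; the #if test and the cast to int are mine:

#include <stdint.h>
#include <limits.h>
#include <stdio.h>

int main(void)
{
#if SIZE_MAX <= INT_MAX
    puts("size_t promotes to int here, so -1 < sizeof \"test\" is true");
#else
    puts("int converts to size_t here, so -1 < sizeof \"test\" is false");
#endif
    /* Forcing the signed comparison gives the "expected" answer everywhere. */
    printf("%d\n", -1 < (int)sizeof "test");   /* prints 1 */
    return 0;
}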
 
E

Eric Sosman

1 would be unsigned.

-1, assuming constant folding by the compiler, would be equivalent to a
signed integer literal of "-1".

There is no such thing as a "signed integer literal of `-1'."

More exactly, there is no such thing as an "integer literal."
Of course, the "integer-constant" exists and is well-defined, but
even so `-1' is not an "integer-constant."

Quiz question: Write a constant whose type is `int' and whose
value is negative one, without resorting to implementation-defined
or undefined or locale-specific behavior. (Hint: There is at least
one completely portable way to do this.)
(If not, then it remains the negation of unsigned 1, performed at
runtime. For this purpose, negating an unsigned value would need to be
allowed, and I can't see a problem with that, except the usual overflow
issues).

Negation of unsigned 1 (which can be written `-1u') is already
defined in C, although there are implementation-defined aspects.
In particular, there are no "overflow issues," usual or otherwise.
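
A minimal illustration (not from the post; the first printed value assumes a 32-bit unsigned int):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("%u\n", -1u);              /* 4294967295 with 32-bit unsigned int */
    printf("%d\n", -1u == UINT_MAX);  /* 1: holds for any width of unsigned int */
    return 0;
}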

BartC, your whinings about C and your ideas of how to improve
it would be far more credible if there were evidence that you knew
some C in the first place. Go, and correct that deficiency, and
you will receive more respectful attention than you now merit.
 
E

Eric Sosman

On 05/17/2012 07:38 AM, Eric Sosman wrote:
...

Can you give a justification for your doubts about whether such an
implementation could be conforming?

Just a reluctance to assert the contradiction, and then come
a cropper when a language lawyer combines the fourth sentence of
6.2.6.3p5 with the second of 5.1.3.2p4 and the entirety of 4.4 to
show that one or the other conclusion is definite.

In short, laziness.
The lower limit for SIZE_MAX is 65535, and there's no upper limit for
INT_MAX, so I don't see why an implementation where SIZE_MAX < INT_MAX
could not be fully conforming.

Nor do I, but "meddle not in the affairs of Standards, for they
are subtle and quick to anger." I have in the past asserted "the
Standard requires" or "the Standard forbids" only to be shown wrong,
so I now try to refrain from such assertions when they're not central
to the matter at hand. Trying to avoid unforced errors, if you like.
 
K

Kaz Kylheku

Quiz question: Write a constant whose type is `int' and whose
value is negative one, without resorting to implementation-defined
or undefined or locale-specific behavior. (Hint: There is at least
one completely portable way to do this.)

You didn't say "integer constant". In the grammar, a constant
generates one of four productions, one of which is "enumeration constant".

Hence:

enum x { y = -1 };

y; // <- constant, type int, value -1.
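
A compilable sketch of the same idea (the file-scope check and the function are added for illustration only):

enum x { y = -1 };               /* y has type int and value -1 */

/* The array size is 1 only if y really is -1; anything else is a compile error. */
static char check[y == -1 ? 1 : -1];

int f(void)
{
    switch (0) {
    case y:                      /* case labels require integer constant expressions */
        return 1;
    default:
        return 0;
    }
}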
 
E

Eric Sosman

You didn't say "integer constant".

Indeed, I did not. That's what's known as a Clue.
In the grammar, a constant
generates one of four productions, one of which is "enumeration constant".

Give the man a cigar!

As far as I know, an enum named constant is the only portable
way to produce a negative integer constant in C. On some systems,
constructs like '£' or 'pound' or L'sterling' might be negative
constants, but they're non-portable. (I believe that '\377' must
be positive, although some pre-Standard compilers I dimly recall
made it negative, IIDimlyRC.)
 
B

BartC

Eric Sosman said:
On 5/17/2012 8:05 AM, BartC wrote:
Negation of unsigned 1 (which can be written `-1u') is already
defined in C, although there are implementation-defined aspects.
In particular, there are no "overflow issues," usual or otherwise.

That's true; the value of -3000000000u on my 32-bit C is well-defined;
completely wrong, but well-defined according to the Standard.

Actually only lcc-win32, out of my handful of C compilers, bothers to tell
me that that expression has an overflow.
BartC, your whinings about C and your ideas of how to improve
it would be far more credible if there were evidence that you knew
some C in the first place. Go, and correct that deficiency, and
you will receive more respectful attention than you now merit.

The 'whinings' were to do with being dependent on compiler options for
figuring why programs like this:

unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n", a, b, a<b);
printf("%d<%d = %d\n", 4, b, 4<b);
printf("%u<%d = %d\n", a, -2, a<-2);
printf("%d<%d = %d\n", 4, -2, 4<-2);

(notice the integer literals, or constants, or whatever you like to call
them today, have been correctly displayed as signed values) produce output
like this:

4<-2 = 1
4<-2 = 0
4<-2 = 1
4<-2 = 0

You don't need to know any C, or any language, for it to raise eyebrows. And
as it happened, I had trouble getting any of my four compilers to give any
warning, until someone told me to try -Wextra on gcc.
BartC, your whinings about C and your ideas of how to improve
it would be far more credible if there were evidence that you knew
some C in the first place.

How much C does someone need to know, to complain about -1 being silently
converted to something like 4294967295?

A lot of my 'whinings' are backed up by people who know the language
inside-out. And although nothing can be done because the Standard is always
right, and the language is apparently set in stone, at least discussion
about various pitfalls can increase awareness.
 
J

James Kuyper

....
That's true; the value of -3000000000u on my 32-bit C is well-defined;
completely wrong, but well-defined according to the Standard.

The standard defines what "right" means in the context of C code. If
-3000000000u has the value which the standard says it must have, then it
is the right value, even if it's not the value you think the standard
should specify.

....
unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n", a, b, a<b);
printf("%d<%d = %d\n", 4, b, 4<b);
printf("%u<%d = %d\n", a, -2, a<-2);
printf("%d<%d = %d\n", 4, -2, 4<-2);

(notice the integer literals, or constants, or whatever you like to call
them today, have been correctly displayed as signed values) produce output
like this:

4<-2 = 1
4<-2 = 0
4<-2 = 1
4<-2 = 0

You don't need to know any C, or any language, for it to raise eyebrows.

You're printing out the values of "b" or -2, rather than the values
which are being compared with "a". a<b does not compare the value of "a"
with the value of "b", it compares it with the value of (unsigned)b.
Similarly for a < -2. You need to know C to realize this fact. If you
print out the actual values being compared, you'll see nothing surprising.
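
For instance, a minimal sketch (added for illustration, not part of the original post) that prints the values actually being compared:

#include <stdio.h>

int main(void)
{
    unsigned int a = 4;
    int b = -2;

    /* The usual arithmetic conversions make a < b compare a with (unsigned)b. */
    printf("%u < %u = %d\n", a, (unsigned)b, a < (unsigned)b);
    /* With 32-bit unsigned int this prints: 4 < 4294967294 = 1 */
    return 0;
}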
How much C does someone need to know, to complain about -1 being silently
converted to something like 4294967295?

Not much; in fact, the less you know about C, the easier it is to feel
the need to complain about such things.
A lot of my 'whinings' are backed up by people who know the language
inside-out. And although nothing can be done because the Standard is always
right, and the language is apparently set in stone, at least discussion
about various pitfalls can increase awareness.

The language has not been set in stone; C2011 made a lot of changes.
However, this is a very fundamental feature of the language. The
standard already allows a warning for such code; if your compiler
doesn't provide one, complain to the compiler vendor. The C standard
could mandate a diagnostic only at the cost of rendering the behavior
undefined if the code is compiled and executed despite generation of
that diagnostic. Such a change would break a LOT of code, and would
therefore be unacceptable.
 
B

Ben Bacarisse

BartC said:
That's true; the value of -3000000000u on my 32-bit C is well-defined;
completely wrong, but well-defined according to the Standard.

Why is this wrong? Negation preserves the type of the operand which
makes it a closed operation on a group -- it's simple to explain and
mathematically justifiable. Why is your idea of what '-' should mean
for unsigned operands better than anyone else's?
Actually only lcc-win32, out of my handful of C compilers, bothers to tell
me that that expression has an overflow.

That's because there is none, but I am not just making a semantic point
about "wrap-around" vs. "overflow". I don't think a compiler should
warn about negating an unsigned value because it makes perfect sense --
both doing it and the result you get. Unsigned arithmetic in C is
arithmetic modulo 2^n, where n is the width of the unsigned type. This
is, in almost every case, exactly what I want it to be. I *want* the
arithmetic operators to have the meaning they do for unsigned integer
types. Singling out unary '-' would make it a mess.
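
A minimal illustration of the modulo-2^n model (not from the post; the printed values assume a 32-bit unsigned int):

#include <stdio.h>

int main(void)
{
    unsigned int u = 3;

    printf("%u\n", u - 5u);          /* (3 - 5) mod 2^32 = 4294967294 */
    printf("%u\n", -u);              /* (0 - 3) mod 2^32 = 4294967293 */
    printf("%u\n", (u - 5u) + 5u);   /* 3 again: every value has an additive inverse */
    return 0;
}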

Also, look at the way you expressed this. It sounds very dismissive.
Do you really think all those compiler writers simply thought "let's not
bother to warn about this"? Is it not possible they had sound,
well-supported reasons for writing the compiler they did? There are
more kinds of arithmetic than are dreamt of in your philosophy.
The 'whinings' were to do with being dependent on compiler options for
figuring why programs like this:

unsigned int a=4;
signed int b=-2;

printf("%u<%d = %d\n", a, b, a<b);
printf("%d<%d = %d\n", 4, b, 4<b);
printf("%u<%d = %d\n", a, -2, a<-2);
printf("%d<%d = %d\n", 4, -2, 4<-2);

(notice the integer literals, or constants, or whatever you like to call
them today, have been correctly displayed as signed values) produce output
like this:

4<-2 = 1
4<-2 = 0
4<-2 = 1
4<-2 = 0

You don't need to know any C, or any language, for it to raise
eyebrows.

No, but it helps. The same goes for all sorts of operations in all
programming languages. None of them are free of surprises for those not
in the know. Does C have more scope for surprise? Probably, but it's
not certain. Haskell, for example, has operators that are as close to
what you want as is possible, but you might find lots of other things
about it surprising.

In all these complaints you don't present a cohesive alternative. C's
rules for determining a common type are (despite the presentation of
them in the standard) relatively simple. They stem from a design that
tries to keep close to what most hardware can do without extra hidden costs.
as it happened, I had trouble getting any of my four compilers to give any
warning, until someone told me to try -Wextra on gcc.

That's another issue altogether. One of the first things I do when
using any compiler is to find out how to get it to tell me as much as
possible about my code. I always plan to shut it up later when I'm
more familiar with the language, but I rarely do.
How much C does someone need to know, to complain about -1 being silently
converted to something like 4294967295?

You are mixing issues again. C does not say the conversion must be
silent and most compilers will help you out here. If you don't like how
they do that, complain to the compiler writers.

As for what C does say, that in signed to unsigned comparisons -1 is
(usually) converted to unsigned, I think some understanding of C is
needed for the complaint to be taken seriously. I would only complain
about language feature X if I was pretty sure I knew why it was designed
that way (even if that turns out to be simply "the designer made a
mistake") and I could demonstrate a better design that meets the same
objectives. I am not saying that people who don't know C can't express
an opinion about it, but that's not the same as declaring a language
feature "wrong" or suggesting that it's obvious that it should be done
differently.

No hardware that I know can compare all signed and unsigned integers in
the way that you seem to want. Something has to give -- either more
code has to be generated or some surprising cases will occur.
Explaining what you want need not be a lot of work. You could just say
C should do it like X does it. Surely at least one programming language
has got integer arithmetic and comparison right as far as you are
concerned.

<snip>
 
B

BartC

Why is this wrong? Negation preserves the type of the operand which
makes it a closed operation on a group -- it's simple to explain and
mathematically justifiable. Why is your idea of what '-' should mean
for unsigned operands better than anyone else's?

My idea is that when you take a positive number, and negate it, that you
usually end up with a negative number! My apologies if that sounds too
far-fetched!

Translated into a language with a simple type system like C's, that would
imply that applying negation to an unsigned type, would necessarily need to
yield a signed result. Unless you apply a totally different meaning to
negation than is usually understood.
You are mixing issues again. C does not say the conversion must be
silent and most compilers will help you out here. If you don't like how
they do that, complain to the compiler writers.

As for what C does say, that in signed to unsigned comparisons -1 is
(usually) converted to unsigned, I think some understanding of C is
needed for the complaint to be taken seriously. I would only complain
about language feature X if I was pretty sure I knew why it was designed
that way (even if that turns out to be simply "the designer made a
mistake") and I could demonstrate a better design that meets the same
objectives.

(I have a fledgling language and compiler, which while it doesn't do much at
present, can at least take:

println -3000000000

and output -3000000000. The type of 3000000000 is 32-bit unsigned. The type
of -3000000000 is 64-bit signed. I'm not interested in selling or making
available this language, but it's just serving here as an example of a
slightly different approach to doing things.)
No hardware that I know can compare all signed and unsigned integers in
the way that you seem to want. Something has to give -- either more
code has to be generated or some surprising cases will occur.
Explaining what you want need not be a lot of work. You could just say
C should do it like X does it. Surely at least one programming language
has got integer arithmetic and comparison right as far as you are
concerned.

Hardware either does sign-less add/subtract, or offers signed/unsigned
multiply/divide. So it's mostly a language issue. Since this last has
already been sorted out for C, this is just for interest.

Using fixed-width types for operands and results, there are always going to
be some problems when doing arithmetic: values could overflow for example.
This is acceptable in this context; dealing properly with overflow can be
difficult, you might decide to leave it to a higher level language than
we're dealing with here. (And C decides that unsigned overflow isn't
overflow at all.)

But then you have mixed-signed arithmetic. The proper way to deal with this
is to convert both sides to a common type that can contain both kinds of
value. But this is not always practical (maybe there is no larger type), and
it can be inefficient. So we compromise by keeping the same width, not ideal
because you can't reliably convert one type to the other without possible
overflows. But ignoring overflow for the minute:

You're doing arithmetic between a number that can be negative and a number
that is positive. The result of course could be negative or positive.
Naturally, you'd expect the result to be a signed type so that it could
accommodate both. But C decides the result is always going to be positive!
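
A minimal sketch of the "wider common type" approach described above (added for illustration; it assumes unsigned int is narrower than long long, as on typical platforms):

#include <stdio.h>

int main(void)
{
    unsigned int a = 4;
    int b = -2;

    /* Convert both operands to a signed type wide enough for either range. */
    printf("%d\n", (long long)b < (long long)a);   /* 1: -2 < 4, as expected */
    return 0;
}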

Taking 32-bit ints as an example: although there are always going to be
problems with overflows, at least using signed arithmetic, things will work
out if the signed operand is within roughly -2 billion to 2 billion, the
unsigned one is within 0 to 2 billion, and the result is within -2
billion to 2 billion. That will include the OP's -1<4 or -1<8 comparison in
addition to many, many ordinary everyday expressions.

But with unsigned arithmetic, both operands and result must lie within 0 to
2 billion, to give a sensible result. There's a strong chance of the wrong
result if *any* negative value, such as -1, is involved. Unless you are
lucky, for example by performing addition, and the result happens to be
positive.

(I did once ask for C's rationale for using unsigned mode for mixed
arithmetic, but I didn't get an answer that explained why.)
 
B

Ben Bacarisse

BartC said:
My idea is that when you take a positive number, and negate it, that you
usually end up with a negative number! My apologies if that sounds too
far-fetched!

- is not negation, or at least not only negation. That symbol is used
in lots of contexts to mean all sorts of things. In C it means both
negation and "additive inverse" -- the latter being nothing more than
negation in a finite group. In unsigned arithmetic, the additive
inverse is not a negative number (so people shy away from calling it
negation).

The meaning you accept is the one that does not correspond to any usual
mathematical meaning. Signed negation can overflow, making it only an
approximation to the mathematical version. C's unsigned negation always
works and corresponds exactly to the mathematical notion on which it is
modeled.
Translated into a language with a simple type system like C's, that would
imply that applying negation to an unsigned type, would necessarily need to
yield a signed result. Unless you apply a totally different meaning to
negation than is usually understood.

I don't want to use programming languages that can only perform
operations as they are "usually understood". The "usual" meaning is
already covered by C's signed arithmetic, so C is adding something here.
I've used "scare quotes" because I'm not 100% sure what you mean by
usual. If it's statistical -- ask 100 people and give '-' only the
meaning the majority would expect -- I don't want that to be a factor in
the design of C.
(I have a fledgling language and compiler, which while it doesn't do much at
present, can at least take:

println -3000000000

and output -3000000000. The type of 3000000000 is 32-bit unsigned. The type
of -3000000000 is 64-bit signed. I'm not interested in selling or making
available this language, but it's just serving here as an example of a
slightly different approach to doing things.)

That does not give me any clue about why you think C is "wrong". It
tells me what you've done in some language with no context. Does this
language have the same design objectives that C had or has?
Hardware either does sign-less add/subtract, or offers signed/unsigned
multiply/divide. So it's mostly a language issue. Since this last has
already been sorted out for C, this is just for interest.

Using fixed-width types for operands and results, there are always going to
be some problems when doing arithmetic: values could overflow for example.
This is acceptable in this context; dealing properly with overflow can be
difficult, you might decide to leave it to a higher level language than
we're dealing with here. (And C decides that unsigned overflow isn't
overflow at all.)

You brought up compares (well, the OP did but you extended the example)
and so I talked about compares. It would be simpler to stick to
explaining what C could do that's "better" whilst sticking to C's design
objectives. You've gone off to talk about arithmetic in general. What
do you think C could do with mixed-type compares?
But then you have mixed-signed arithmetic. The proper way to deal with this
is to convert both sides to a common type that can contain both kinds of
value. But this is not always practical (maybe there is no larger type), and
it can be inefficient. So we compromise by keeping the same width, not ideal
because you can't reliably convert one type to the other without possible
overflows. But ignoring overflow for the minute:

You're doing arithmetic between a number that can be negative and a number
that is positive. The result of course could be negative or positive.
Naturally, you'd expect the result to be a signed type so that it could
accommodate both. But C decides the result is always going to be positive!

Taking 32-bit ints as an example: although there are always going to be
problems with overflows, at least using signed arithmetic, things will work
out if the signed operand is within roughly -2 billion to 2 billion, the
unsigned one is within 0 to 2 billion, and the result is within -2
billion to 2 billion. That will include the OP's -1<4 or -1<8 comparison in
addition to many, many ordinary everyday expressions.

But with unsigned arithmetic, both operands and result must lie within 0 to
2 billion, to give a sensible result. There's a strong chance of the
wrong result if *any* negative value, such as -1, is involved. Unless
you are lucky, for example by performing addition, and the result
happens to be positive.

Where is this going? I know all of the above. I disagree with some of
it (for example that C's unsigned arithmetic produces results that are
not sensible). None of it helps me to see why you think C is wrong.
 
B

BartC

- is not negation, or at least not only negation. That symbol is used
in lots of contexts to mean all sorts of things. In C it means both
negation and "additive inverse" -- the latter being nothing more than
negation in a finite group. In unsigned arithmetic, the additive
inverse is not a negative number (so people shy away from calling it
negation).

You mean like this:

unsigned int a,b;

a=1;
b=-a;

printf("A=%u\n",a);
printf("B=%u\n",b);
printf("A+B=%u\n",a+b);

producing (for example):

A=1
B=4294967295
A+B=0 /* A+(-A) */ ?
The meaning you accept is the one that does not correspond to any usual
mathematical meaning. Signed negation can overflow, making it only an
approximation to the mathematical version. C's unsigned negation always
works and corresponds exactly to the mathematical notion on which it is
modeled.

The trouble is that when you look at why people use unsigned numbers, it's
probably not because they particularly want 'closed sets', they usually just
want an extra bit of precision. In that case they are just ordinary numbers
that don't happen to be negative.

For example the unsigned int result (size_t) of 'sizeof'. Here surely the
thing of interest is the quantity involved, not the bit pattern. If you add
the sizes of two large objects, you might expect the result to sometimes
overflow, and not be smaller than either!

(But overall you're right: the way C has dealt with this is actually quite
neat. Perhaps it wasn't so crazy after all..)
[it] can at least take:

println -3000000000

and output -3000000000. The type of 3000000000 is 32-bit unsigned. The
type
of -3000000000 is 64-bit signed.
That does not give me any clue about why you think C is "wrong". It
tells me what you've done in some language with no context. Does this
language have the same design objectives that C had or has?

(Yes. It's a version of older languages I used when I couldn't get hold of C
for various practical reasons. (But I had the book..)

It is a bit more open-ended than C in the way expressions are evaluated: in
C, usually the result of an operation has to be shoe-horned into a type
determined by the programmer. This can hide some of its idiosyncrasies, so
if in my example above, b was signed, or printed with "%d", then it would
magically be 'converted' into the -1 that is expected.

With an open expression as in my print statement, more care is needed by the
language to get the expected results. But programming is simpler; it took a
bit of trial and error to get my -3000000000 printed in C!
(printf("%lld\n", -3000000000LL);)
You brought up compares (well, the OP did but you extended the example)
and so I talked about compares. It would be simpler to stick to
explaining what C could do that's "better" whilst sticking to C's design
objectives. You've gone off to talk about arithmetic in general. What
do you think C could do with mixed-type compares?

Presumably the language could not be changed so that it does something other
than unsigned compare. In that case, perhaps make it much more likely that a
warning is given. It's been mentioned that this is a compiler issue, and not a
language one, but I suspect that compilers don't warn, without being
explicitly told to do so (and unleashing a host of other warnings), because
the language doesn't take the issue seriously.

If a positive number is compared with one that is quite likely to be
negative, but you assume that both are positive, then there's a very good
chance the result will be wrong! Especially in the example in the subject
line, where it is *known* it's going to be wrong.

(Well, except using Digital Mars' C compiler, where (-1<sizeof("test")) was
true, and 'size_t' was unsigned from what I could determine. Is it possible
it has a bug because it gives the right answer?)
 
J

James Kuyper

The trouble is that when you look at why people use unsigned numbers, it's
probably not because they particularly want 'closed sets', they usually just
want an extra bit of precision. In that case they are just ordinary numbers
that don't happen to be negative.

That's a bad reason for making that decision; the existence of people
who make that bad decision should not influence the design of the language.
Presumably the language could not be changed so that it does something other
than unsigned compare. ...

That would break a lot of existing code which was correctly written to
take into consideration the way the language is actually designed to
work, so you're right that it's unlikely to be changed.
... In that case, perhaps make it much more likely that a
warning is given. It's been mentioned that this is a compiler issue, and not a
language one, but I suspect that compilers don't warn, without being
explicitly told to do so (and unleashing a host of other warnings), because
the language doesn't take the issue seriously.

Compilers routinely warn about things the standard does not require them
to diagnose. Compiler makers are sensitive to the needs of their users;
the ones who aren't, tend not to have many users. If you want a warning,
let your vendor know.

....
(Well, except using Digital Mars' C compiler, where (-1<sizeof("test")) was
true, and 'size_t' was unsigned from what I could determine. Is it possible
it has a bug because it gives the right answer?)

It can't be a bug because it gives the right answer. It could be a bug
if it gives the answer you expected it to give. However, if SIZE_MAX <
INT_MAX, the right answer would be the same as the one you expected it
to give. What are the values of SIZE_MAX and INT_MAX for that
implementation?
 
R

Richard Damon

The language has not been set in stone; C2011 made a lot of changes.
However, this is a very fundamental feature of the language. The
standard already allows a warning for such code; if your compiler
doesn't provide one, complain to the compiler vendor. The C standard
could mandate a diagnostic only at the cost of rendering the behavior
undefined if the code is compiled and executed despite generation of
that diagnostic. Such a change would break a LOT of code, and would
therefore be unacceptable.

The language could define another "level" of messages besides just a
single type of "diagnostic". There are several things now in the
language for which, if the standard had wording allowing it to require
the emission of a "warning" (vs an "error"), such a requirement could
help programmers. In many cases "good" compilers already implement them.
One key is that the implementation would need to define how to
distinguish a "warning" diagnostic from an "error" diagnostic, and allow
for possible other levels of messages (like "informational") which the
standard doesn't define/mandate.

Programs that created an "error" would have undefined behavior if
executed (if that was even possible), while programs which only
generated "warnings" should be able to be executed.
 
B

BartC

James Kuyper said:
On 05/19/2012 06:14 AM, BartC wrote:

That's a bad reason for making that decision; the existence of people
who make that bad decision should not influence the design of the
language.

I would guess that a lot of people don't know about that. All they know is
that a signed char might range from -128 to 127, and unsigned from 0 to 255.
If they need to store small positive numbers with a maximum above 127 but
below 256, then unsigned char makes sense, and uses half the memory of the
next unsigned type. And - most of the time - it works perfectly well.

And if they choose the 'char' type, then they might not even know if the
type is signed or unsigned. That means they will someday get stung with the
subtle differences between the two types.
It can't be a bug because it gives the right answer. It could be a bug
if it gives the answer you expected it to give. However, if SIZE_MAX <
INT_MAX, the right answer would be the same as the one you expected it
to give. What are the values of SIZE_MAX and INT_MAX for that
implementation?

SIZE_MAX was 2**32-1 (according to the include files; I don't know what
header that is in); INT_MAX is 2**31-1.
 
E

Eric Sosman

The language could define another "level" of messages besides just a
single type of "diagnostic". There are several things now in the
language for which, if the standard had wording allowing it to require
the emission of a "warning" (vs an "error"), such a requirement could
help programmers. In many cases "good" compilers already implement them.
One key is that the implementation would need to define how to
distinguish a "warning" diagnostic from an "error" diagnostic, and allow
for possible other levels of messages (like "informational") which the
standard doesn't define/mandate.

Programs that created an "error" would have undefined behavior if
executed (if that was even possible), while programs which only
generated "warnings" should be able to be executed.

Doesn't this suggestion lead to four types of diagnostics?

1) "Errors" whose issuance is required by the Standard, and
which are required to produce translation failure,

2) "Errors" whose issuance is required by the Standard, and
which are allowed (but not required) to produce translation
failure,

3) "Warnings" whose issuance is required by the Standard, and
which are not allowed to produce translation failure, and

4) "Informational messages" whose issuance is entirely optional
(the Standard might not even mention them), and which are not
allowed to cause translation failure.

We already have [1] (one instance), [2] (many instances), and [4]
(an open-ended set); the new feature of your proposal is [3]. Are
you convinced [3] is the proper province of a language Standard?
Even if "yes," I think it would be hard to get universal agreement
on what particular diagnostics should be promoted from [4] to [3].
Note, for example, that some compilers can be told to suppress
specific diagnostics; this shows that the line between [3] and [4]
is indistinct and situational, not easily drawn by a Committee many
miles and years removed from a particular slice of code.

The business of a compiler is to compile if it can and to tell
you potentially useful things it discovers in the process, but
setting policy about the use of its outputs seems to me outside its
proper sphere. There's a natural tendency to dump extra work onto
the compiler, simply because it's always "there" and in plain sight;
people might fail to run lint but they can't avoid the compiler.
But if you've got a people problem you should talk with the problem
people, not with ISO/IEC!
 
J

James Kuyper

SIZE_MAX was 2**32-1 (according to the include files; I don't know what
header that is in); INT_MAX is 2**31-1.

SIZE_MAX is supposed to be defined in <stdint.h>; if you're using a C90
compiler, substitute (size_t)-1.
With those values for SIZE_MAX and INT_MAX, having -1 < sizeof "test"
yield a value of true makes the implementation non-conforming. This is
sufficiently unusual that I wonder if you'd be willing to perform some
tests? What does it give if you calculate the following values, convert
them to unsigned long, and print them using "%lu"?

(size_t)-1
sizeof "test"
-1 + sizeof "test"
INT_MAX
SIZE_MAX
 
B

BartC

James Kuyper said:
SIZE_MAX is supposed to be defined in <stdint.h>; if you're using a C90
compiler, substitute (size_t)-1.
With those values for SIZE_MAX and INT_MAX, having -1 < sizeof "test"
yield a value of true makes the implementation non-conforming. This is
sufficiently unusual that I wonder if you'd be willing to perform some
tests? What does it give if you calculate the following values, convert
them to unsigned long, and print them using "%lu"?

I used this code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void){
printf("(size_t)-1 %lu\n", (unsigned long)((size_t)-1));
printf("sizeof \"test\" %lu\n",(unsigned long)(sizeof "test"));
printf("-1+sizeof \"test\" %lu\n",(unsigned long)(-1 + sizeof "test"));
printf("INT_MAX %lu\n", (unsigned long)INT_MAX);
printf("SIZE_MAX %lu\n", (unsigned long)SIZE_MAX);
printf("-1<sizeof \"test\" %lu\n",(unsigned long)(-1<sizeof "test"));
}

On gcc, it gave this output:

(size_t)-1 4294967295
sizeof "test" 5
-1+sizeof "test" 4
INT_MAX 2147483647
SIZE_MAX 4294967295
-1<sizeof "test" 0

On DMC (Digital Mars' C) it gives this:

(size_t)-1 4294967295
sizeof "test" 5
-1+sizeof "test" 4
INT_MAX 2147483647
SIZE_MAX 4294967295
-1<sizeof "test" 1

Exactly the same, except for the final compare.

(BTW, why does sizeof "test" give 5? I thought that was the size of the
pointer. It can't be the string length (plus terminator), as it always gives
5 no matter how long the string.)
 
B

Ben Bacarisse

BartC said:
I used this code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void){
printf("(size_t)-1 %lu\n", (unsigned long)((size_t)-1));
printf("sizeof \"test\" %lu\n",(unsigned long)(sizeof "test"));
printf("-1+sizeof \"test\" %lu\n",(unsigned long)(-1 + sizeof "test"));
printf("INT_MAX %lu\n", (unsigned long)INT_MAX);
printf("SIZE_MAX %lu\n", (unsigned long)SIZE_MAX);
printf("-1<sizeof \"test\" %lu\n",(unsigned long)(-1<sizeof "test"));
}

On gcc, it gave this output:

(size_t)-1 4294967295
sizeof "test" 5
-1+sizeof "test" 4
INT_MAX 2147483647
SIZE_MAX 4294967295
-1<sizeof "test" 0

On DMC (Digital Mars' C) it gives this:

(size_t)-1 4294967295
sizeof "test" 5
-1+sizeof "test" 4
INT_MAX 2147483647
SIZE_MAX 4294967295
-1<sizeof "test" 1

Exactly the same, except for the final compare.

I won't interrupt James's question by replying here.
(BTW, why does sizeof "test" give 5? I thought that was the size of
the pointer.

It's an exception: 6.3.2.1 p3:

"Except when it is the operand of the sizeof operator or the unary &
operator, or is a string literal used to initialize an array, an
expression that has type 'array of type' is converted to an
expression with type 'pointer to type'..."
It can't be the string length (plus terminator), as it
always gives 5 no matter how long the string.)

It should be the size of the array that the string gives rise to, so it
should vary with the string's length:

#include <stdio.h>

int main(void)
{
printf("%zu\n", sizeof "a");
printf("%zu\n", sizeof "abc");
printf("%zu\n", sizeof "abacadabara");
}

which produces:

2
4
12
 
E

Eric Sosman

I used this code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void){
printf("(size_t)-1 %lu\n", (unsigned long)((size_t)-1));
printf("sizeof \"test\" %lu\n",(unsigned long)(sizeof "test"));
printf("-1+sizeof \"test\" %lu\n",(unsigned long)(-1 + sizeof "test"));
printf("INT_MAX %lu\n", (unsigned long)INT_MAX);
printf("SIZE_MAX %lu\n", (unsigned long)SIZE_MAX);
printf("-1<sizeof \"test\" %lu\n",(unsigned long)(-1<sizeof "test"));
}

On gcc, it gave this output:

(size_t)-1 4294967295
sizeof "test" 5
-1+sizeof "test" 4
INT_MAX 2147483647
SIZE_MAX 4294967295
-1<sizeof "test" 0

On DMC (Digital Mars' C) it gives this:

(size_t)-1 4294967295
sizeof "test" 5
-1+sizeof "test" 4
INT_MAX 2147483647
SIZE_MAX 4294967295
-1<sizeof "test" 1

Exactly the same, except for the final compare.

Obvious bug in the second implementation.
(BTW, why does sizeof "test" give 5? I thought that was the size of the
pointer. It can't be the string length (plus terminator), as it always
gives 5 no matter how long the string.)

Nonsense: It would not give 5 for `sizeof "BartC"', for example,
nor for `sizeof "Trolling? Who? Me?"'. Have you not read or at least
skimmed the Standard? Any version at all? Have you not read or at
least skimmed K&R -- again, any version at all? Have you never heard
of the FAQ?

"Go, and never darken my towels again." -- G. Marx
 
R

Richard Damon

The language could define another "level" of messages besides just a
single type of "diagnostic". There are several things now in the
language for which, if the standard had wording allowing it to require
the emission of a "warning" (vs an "error"), such a requirement could
help programmers. In many cases "good" compilers already implement them.
One key is that the implementation would need to define how to
distinguish a "warning" diagnostic from an "error" diagnostic, and allow
for possible other levels of messages (like "informational") which the
standard doesn't define/mandate.

Programs that created an "error" would have undefined behavior if
executed (if that was even possible), while programs which only
generated "warnings" should be able to be executed.

Doesn't this suggestion lead to four types of diagnostics?

1) "Errors" whose issuance is required by the Standard, and
which are required to produce translation failure,

2) "Errors" whose issuance is required by the Standard, and
which are allowed (but not required) to produce translation
failure,

3) "Warnings" whose issuance is required by the Standard, and
which are not allowed to produce translation failure, and

4) "Informational messages" whose issuance is entirely optional
(the Standard might not even mention them), and which are not
allowed to cause translation failure.

We already have [1] (one instance), [2] (many instances), and [4]
(an open-ended set); the new feature of your proposal is [3]. Are
you convinced [3] is the proper province of a language Standard?
Even if "yes," I think it would be hard to get universal agreement
on what particular diagnostics should be promoted from [4] to [3].
Note, for example, that some compilers can be told to suppress
specific diagnostics; this shows that the line between [3] and [4]
is indistinct and situational, not easily drawn by a Committee many
miles and years removed from a particular slice of code.

The business of a compiler is to compile if it can and to tell
you potentially useful things it discovers in the process, but
setting policy about the use of its outputs seems to me outside its
proper sphere. There's a natural tendency to dump extra work onto
the compiler, simply because it's always "there" and in plain sight;
people might fail to run lint but they can't avoid the compiler.
But if you've got a people problem you should talk with the problem
people, not with ISO/IEC!

The reason to want a warning class of diagnostics, in my mind, is that
it would be most useful for moving more features that clutter the
language to deprecated status. For example, a rule that a warning shall
be generated for any program that uses trigraphs, unless the first use
of them is the sequence ??= or an option has been provided to the
translator signifying that trigraphs are expected.

It might even be possible to define an official way (likely with
_Pragma()) to mark a function as deprecated, so that any use of that
function shall generate a warning. Again, with possible wording to allow
the user to disable such a warning, although the implementation could
provide, and already generally does provide, such an ability anyway; it
just would become non-conforming with such an unauthorized option (but
that hasn't stopped many compilers from having many non-conforming
configurations).
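
As a sketch of the existing practice such a rule would standardize: GCC and Clang already offer a non-standard attribute that produces a warning diagnostic at every use of a marked function (the function name below is made up for illustration):

/* Non-standard GCC/Clang extension, shown only as existing practice. */
__attribute__((deprecated("use the replacement API instead")))
int old_api(int x) { return x; }

int caller(int x)
{
    return old_api(x);   /* warning: 'old_api' is deprecated (exact wording varies) */
}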

As for your grouping of messages, I would NOT advise the standard
distinguishing between 1 and 2 the way you have; if anything, the
difference should be based on whether the implementation HAS generated a
translation result (not just on whether it was allowed to), with the
disclaimer that any use of such a result is undefined behavior.

As to levels 3 and 4, in my experience there is a distinction among
optional messages between "warning" and "informational": the former
implies that the program, while still "legal" in the sense of the
standard, is likely not to do what the programmer thinks it ought to be
doing, while the latter conveys information about the program without
any implication that there may be a problem -- things like output size,
compilation time, or other statistics about the program. The warning
level could be further divided into those diagnostics required by the
standard and those not required, but I am not sure that really adds
much. It would also mean that, if the standard requires documenting how
to determine what class a given diagnostic falls into, and the standard
later adds an additional required warning, then implementations that
previously generated said warning (and there almost surely would be one,
as a test of existing practice) would need to change the message format
in some way to change its classification.

I would agree that requiring a warning diagnostic for a signed/unsigned
comparison operation is probably further than should be required by the
standard, and many of the warnings that come out of current compilers
are beyond what should be expected out of a "basic but conforming" C
compiler.
 
