sizeof 'A'


Keith Thompson

Peter Nilsson said:
Note that C++ didn't make 'A' a char to support overloading.
That is incidental. C++ made 'A' a char because it's common
sense to think of and _use_ 'A' as a character. The
situation is no different in C.

<OT>
According to the C++ standard, C++ made character constants type char
precisely because of overloading. See Annex C of the 1998 C++
standard:

Change: Type of character literal is changed from int to char
Rationale: This is needed for improved overloaded function
argument type matching.
The int-ness of 'A' may be surprising[*],

It is surprising.

Agreed (though I got over the surprise some time ago).
Can you think of any convention or practice elsewhere in
the language that would establish a pattern with which
'A' being a character would be inconsistent?

Yes. 'A' is a constant. No other constants are of any integer type
smaller than int. For example, the macro CHAR_MAX typically expands
to either 127 or 255; both of these are constants of type int. The
language provides suffixes for integer constants that make them of
type long, unsigned int, unsigned long, and so forth, but there are no
such suffixes for char, unsigned char, signed char, short, or unsigned
short.
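
A small sketch of that point, assuming a typical implementation where
int is wider than char:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* CHAR_MAX typically expands to an int constant, and 'A' is an
       int as well, so both have the size of int rather than of char. */
    printf("sizeof CHAR_MAX is %u\n", (unsigned)sizeof CHAR_MAX);
    printf("sizeof 'A'      is %u\n", (unsigned)sizeof 'A');
    printf("sizeof (char)   is %u\n", (unsigned)sizeof (char));
    return 0;
}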

All C expressions of integer type are of type int, unsigned int, or
something larger. Even the name of an object of type char is promoted
to int (or, very rarely, to unsigned int) unless it's the operand of
"&" or "sizeof". Why should character constants be a special case?

The only case where it makes any difference is when a character
constant is the operand of sizeof. How often do you really apply
sizeof to a character constant in real code (unless you're trying to
determine whether you're using a C or C++ compiler)?
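
The usual detection trick looks something like the sketch below; note,
as the follow-up points out, that it isn't reliable, because an
implementation may have sizeof(int) == 1:

#include <stdio.h>

int main(void)
{
    /* In C, 'A' has type int; in C++ it has type char.  On the common
       (but not guaranteed) assumption that sizeof(int) > 1, the test
       below distinguishes the two languages. */
    if (sizeof 'A' == 1)
        printf("probably C++\n");
    else
        printf("probably C\n");
    return 0;
}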
 

Richard Heathfield

Keith Thompson said:

How often do you really apply
sizeof to a character constant in real code (unless you're trying to
determine whether you're using a C or C++ compiler)?

And, quite possibly, failing, since sizeof 'A' can certainly be 1, even
though 'A' is an int.
 

Eric Sosman

Peter said:
[...]
So let me turn that back at you. If it's going to happen
anyhow, again I ask, why would making 'A' a char cause
any harm?

What type should 'AB' have?

Yes, it's implementation-defined. And yes, because of
its implementation-definedness it is (justifiably) denigrated.
But to deny its existence -- and occasional usefulness -- is
to deny the history of C. (Ritchie's writings on the descent
of C from B and the evanescent NB make illuminating reading.)

True, there's precedent for the value of a constant to
determine its type, over and above its lexical appearance:
42 is an int, but 42424242424242424242424242424242424 quite
likely isn't. But that's not really a "pre"cedent, since
it's a rule that was invented long after C began, a rule
that describes rather than prescribes.

Other than offering newbies a surprise -- and let's face
it, there are *many* things about C that surprise newbies --
can you think of even one deleterious consequence of 'A' being
an int?
 

pete

Keith Thompson wrote:
All C expressions of integer type are of type int, unsigned int, or
something larger. Even the name of an object of type char is promoted
to int (or, very rarely, to unsigned int) unless it's the operand of
"&" or "sizeof".

There's no promotion when the left and right operands
of the assignment operator are of type char.

There's no promotion when an expression of type char
is added to a pointer.

The logical operators don't cause the promotion
of their operands,
and neither does the comma operator.

It's not like for an array name.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
    char a = 0;
    char array[7] = "";

    printf("sizeof (a , a) is %u\n", (unsigned)sizeof (a , a));
    printf("sizeof ( a) is %u\n\n", (unsigned)sizeof ( a));

    printf("sizeof (array , array) is %u\n",
           (unsigned)sizeof (array , array));
    printf("sizeof ( array) is %u\n",
           (unsigned)sizeof ( array));
    return 0;
}

/* END new.c */
 

Richard Bos

Eric Sosman said:
jacob navia wrote on 07/10/07 08:40:

... but since it *isn't* a character,

Yes, it is. 'A' is a character constant. It just isn't of a character type.
That's the inconsistency: character constants don't have character type,
but type int.

Richard
 

Richard Bos

Eric Sosman said:
What type should 'AB' have?

Never mind type, what _value_ should it have? IMO, none. Character
constants should be limited to a single member of the execution
character set.

Richard
 

Richard Bos

Keith Thompson said:
It makes very little difference in C; expressions of type char are
almost always promoted to int anyway. Applying sizeof to a character
constant is nearly the only case where the difference shows up.

Oh, I know that; but it irks me for philosophical reasons.

Richard
 

CryptiqueGuy

There's no promotion when the left and right operands
of the assignment operator are of type char.

There's no promotion when an expression of type char
is added to a pointer.

The logical operators don't cause the promotion
of their operands,
and neither does the comma operator.

This also applies to a cast expression, and to the case where it is a
primary expression.

That is:

unsigned char c = 'A';
(unsigned)c;
c;
(c);
/* In the last three expression statements no promotion need occur,
   although performing the promotion would not affect the behavior
   of the code. */

In general, integer promotion occurs only when an operand with rank
less than int is the operand of an arithmetic operator.
It's not like for an array name.

What are you trying to prove with array names?

Array names are always converted into an rvalue equal to a pointer to
the first element, except when the name is the operand of sizeof or &.

Note that array names are NOT promoted. The pointer value you get from
an array name is analogous to the char value you get when an identifier
of type char appears in an rvalue context. Promotion is what
subsequently occurs when that char operand is an operand of an
arithmetic operator (and it need not occur otherwise).

Note that, for array names, there is no "promotion" defined by the
standard. If you are referring to the conversion of an array name to a
pointer, that is incorrect terminology!
/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
    char a = 0;
    char array[7] = "";

    printf("sizeof (a , a) is %u\n", (unsigned)sizeof (a , a));
    printf("sizeof ( a) is %u\n\n", (unsigned)sizeof ( a));

    printf("sizeof (array , array) is %u\n",
           (unsigned)sizeof (array , array));
    printf("sizeof ( array) is %u\n",
           (unsigned)sizeof ( array));
    return 0;
}

/* END new.c */
 

Keith Thompson

pete said:
There's no promotion when the left and right operands
of the assignment operator are of type char.

There's no promotion when an expression of type char
is added to a pointer.

The logical operators don't cause the promotion
of their operands,
and neither does the comma operator.
[snip]

Ok, you're right. Here's another good illustration of your point:

#include <stdio.h>
int main(void)
{
    char c;
    printf("sizeof c = %d\n", (int)sizeof c);
    printf("sizeof (c, c) = %d\n", (int)sizeof (c, c));
    printf("sizeof (c + 0) = %d\n", (int)sizeof (c + 0));
    return 0;
}

The output I get (on a system with sizeof(int)==4) is:

sizeof c = 1
sizeof (c, c) = 1
sizeof (c + 0) = 4

It's still the case that it hardly ever matters whether character
constants are of type int or of type char.
 

Richard

Eric Sosman said:
Peter said:
[...]
So let me turn that back at you. If it's going to happen
anyhow, again I ask, why would making 'A' a char cause
any harm?

What type should 'AB' have?

Well, a noob for sure would think that's how you define a character
string.

'AB' is not "a character" in any sense.
Yes, it's implementation-defined. And yes, because of
its implementation-definedness it is (justifiably) denigrated.
But to deny its existence -- and occasional usefulness -- is
to deny the history of C. (Ritchie's writings on the descent
of C from B and the evanescent NB make illuminating reading.)

True, there's precedent for the value of a constant to
determine its type, over and above its lexical appearance:
42 is an int, but 42424242424242424242424242424242424 quite
likely isn't. But that's not really a "pre"cedent, since
it's a rule that was invented long after C began, a rule
that describes rather than prescribes.

Other than offering newbies a surprise -- and let's face
it, there are *many* things about C that surprise newbies --
can you think of even one deleterious consequence of 'A' being
an int?

 

Richard

Yes, it is. 'A' is a character constant. It just isn't of a character type.
That's the inconsistency: character constants don't have character type,
but type int.

Richard

Which is the crux of the issue. I find it hard to believe that anyone,
even in this group, would argue that a character constant having type int
is a good thing.
 

Eric Sosman

Richard said:
[...]
Which is the crux of the issue. I find it hard to believe that anyone,
even in this group, would argue that a character constant having type int
is a good thing.

... but is it in any sense "bad?" Seems to me this
entire thread is a tempest in a type-pot.
 

Richard

Eric Sosman said:
Richard said:
[...]
Which is the crux of the issue. I find it hard to believe that anyone,
even in this group, would argue that a character constant having type int
is a good thing.

... but is it in any sense "bad?" Seems to me this
entire thread is a tempest in a type-pot.

Yes. It promotes a billion posts of

'A' is not type char ....

OK, it's not the end of the world, but ...
 

Richard Bos

Eric Sosman said:
Richard said:
[...]
Which is the crux of the issue. I find it hard to believe that anyone,
even in this group, would argue that a character constant having type int
is a good thing.

... but is it in any sense "bad?"

Since it violates the principle of least surprise, yes. Not _very_ bad,
I'll grant you, but still slightly bad. Let's call it naughty.

Richard
 

David T. Ashley

Nishu said:
OK. What is the reason for considering it as int? I think we use
single quotes to say that it is a char.

Single quotes in this context actually specify a type conversion, same as
ORD() did in old versions of BASIC.

The issue isn't the final data type ... the issue is how one gets from 'A'
to the ASCII value of 'A'.

In other languages:

i = ORD("A");

In C:

i = 'A';
 

santosh

David said:
Single quotes in this context actually specify a type conversion, same as
ORD() did in old versions of BASIC.

The issue isn't the final data type ... the issue is how one gets from 'A'
to the ASCII value of 'A'.

To the value of 'A' in the machine execution character set, which
needn't be ASCII.
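
For instance, a one-liner along these lines prints 65 on an ASCII-based
system but 193 on an EBCDIC one:

#include <stdio.h>

int main(void)
{
    /* The value of 'A' depends on the execution character set:
       65 in ASCII and its descendants, 193 in EBCDIC. */
    printf("'A' has the value %d\n", 'A');
    return 0;
}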
 

Flash Gordon

Richard Bos wrote on 11/07/07 15:30:
Eric Sosman said:
Richard said:
[...]
Which is the crux of the issue. I find it hard to believe that anyone,
even in this group, would argue that a character constant having type int
is a good thing.
... but is it in any sense "bad?"

Since it violates the principle of least surprise, yes. Not _very_ bad,
I'll grant you, but still slightly bad. Let's call it naughty.

I agree with Richard Bos here. I think the language would be cleaner if
character constants were of type char, with the obvious corollary that
multi-character constants were invalid (requiring a diagnostic). However,
the language has worse things in it than that.
 

pete

CryptiqueGuy said:
What are you trying to prove with array names?

The way that Keith Thompson used & and sizeof as exceptions,
to describe when the integer promotions take place,
reminded me of how the standard describes
the implicit conversions of expressions of array type,
by noting the exceptions instead of stating the cases.

The standard states each case
where expressions of low ranking types
are subject to the integer promotions.

The standard describes where the implicit conversions
of expressions of array type take place
by listing the few exceptions where they don't occur.

N869
6.3.2.1 Lvalues and function designators
[#3] Except when it is the operand of the sizeof operator or
the unary & operator, or is a string literal used to
initialize an array, an expression that has type ``array of
type'' is converted to an expression with type ``pointer to
type'' that points to the initial element of the array
object and is not an lvalue.
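
A small sketch of those two exceptions, in the style of the earlier
program (the two pointer sizes are whatever the implementation uses):

#include <stdio.h>

int main(void)
{
    char array[7] = "";

    /* sizeof and unary & see the array type itself... */
    printf("sizeof array       is %u\n", (unsigned)sizeof array);
    printf("sizeof &array      is %u\n", (unsigned)sizeof &array);

    /* ...elsewhere the array is converted to a pointer
       to its first element. */
    printf("sizeof (array + 0) is %u\n", (unsigned)sizeof (array + 0));
    return 0;
}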
 

pete

Keith said:
Here's another good illustration of your point:

#include <stdio.h>
int main(void)
{
    char c;
    printf("sizeof c = %d\n", (int)sizeof c);
    printf("sizeof (c, c) = %d\n", (int)sizeof (c, c));
    printf("sizeof (c + 0) = %d\n", (int)sizeof (c + 0));
    return 0;
}

I think that (c + c) would be a better example of integer promotions,
than (c + 0) is,
because even if there were no such thing as "integer promotions",
the "usual arithmetic conversions" would still operate to make
(c + 0) be an expression of type int.
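
That is, something like this, where the int-typed result can only come
from the integer promotions, since neither operand is an int to begin
with:

#include <stdio.h>

int main(void)
{
    char c = 0;

    /* Both operands are char; the result has type int purely because
       the integer promotions are applied to each operand. */
    printf("sizeof (c + c) = %d\n", (int)sizeof (c + c));
    return 0;
}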
 

Keith Thompson

David T. Ashley said:
Single quotes in this context actually specify a type conversion, same as
ORD() did in old versions of BASIC.

No, the single quotes specify a character constant, which is inherently
of type int.
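
These days C11's _Generic can report the type directly; a small sketch,
using a standard newer than this thread:

/* Requires C11 or later. */
#include <stdio.h>

int main(void)
{
    /* _Generic selects on the type of its controlling expression,
       so this prints "'A' has type int" under a conforming C compiler. */
    puts(_Generic('A',
                  int:     "'A' has type int",
                  char:    "'A' has type char",
                  default: "'A' has some other type"));
    return 0;
}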
 
