A question regarding Q20.1 from c-faq.com

Philip Potter

Peter said:
Because there are very few hosted systems where sizeof(int)
is 1. Indeed, it's held that such systems are not conforming.

Why are such systems not conforming? Surely a hosted implementation can
have CHAR_BIT == 16, sizeof(int) == 1 if it wants to?
 
James Kuyper

Peter Nilsson wrote:
....
Because there are very few hosted systems where sizeof(int)
is 1. Indeed, it's held that such systems are not conforming.

On what grounds?
 
santosh

Richard said:
santosh said:


How?

I'm sure I'm missing something simple here, but to me, if
sizeof(char)==sizeof(int), it seems impossible to simultaneously return
all possible character values and the out-of-band value EOF.
 
Richard Heathfield

santosh said:

I'm sure I'm missing something simple here, but to me, if
sizeof(char)==sizeof(int), it seems impossible to simultaneously return
all possible character values and the out-of-band value EOF.

All that this actually breaks is the assumption that a value of EOF
necessarily means "it failed" - on such systems, you have to check feof or
ferror as well before terminating your read loop.
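A minimal sketch of such a read loop, assuming the stream's end-of-file and error indicators are clear on entry. The name count_chars is hypothetical and used only for illustration:

```c
#include <stdio.h>

/* Count the characters in a stream without assuming that EOF is
   always out of band. Returns the count, or -1 on a read error.
   count_chars is a hypothetical name used only for this sketch. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;

    for (;;) {
        c = getc(fp);
        if (c == EOF) {
            if (feof(fp))
                break;          /* genuine end of file */
            if (ferror(fp))
                return -1;      /* genuine read error */
            /* Neither indicator is set: on an implementation where
               sizeof(int) == 1 this is a real character whose value
               happens to equal EOF, so fall through and count it. */
        }
        n++;
    }
    return n;
}
```

On the usual implementations, where EOF cannot collide with a character value, the extra checks cost nothing and the loop behaves like the familiar `while ((c = getc(fp)) != EOF)`.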
 
James Kuyper

santosh said:
Wouldn't they break getc and friends?

No, they would simply mean that checking for EOF is not sufficient to
determine whether an error has actually occurred; it might also
represent a valid character. However, you can always disambiguate by
checking ferror() and feof().
 
santosh

Richard said:
santosh said:



All that this actually breaks is the assumption that a value of EOF
necessarily means "it failed" - on such systems, you have to check
feof or ferror as well before terminating your read loop.

What about character values that are outside of the common set of values
for unsigned char and int? How can they be returned?
 
James Kuyper

santosh said:
What about character values that are outside of the common set of values
for unsigned char and int? How can they be returned?

getc() is supposed to read one character at a time as an unsigned char.
When UCHAR_MAX > INT_MAX (which is guaranteed when sizeof(int)==1, and
possible though unlikely when sizeof(int) > 1), the conversion from
unsigned char to int has either an implementation-defined result or
raises an implementation-defined signal. If such an implementation were
to raise a signal from that conversion, I would consider it broken, but
not necessarily non-conforming. Similarly, if such an implementation were
to define the conversion from unsigned char to int in such a way that it
was not a perfect inverse of the conversion from int to unsigned char, I
would also consider it broken, though not necessarily non-conforming.

On an implementation where sizeof(int)==1 and unsigned char=>int is a
perfect inverse for int=>unsigned char, I don't see any unavoidable
problem. Of course, there is a great deal of code which does not avoid
the possible problems.
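One way to test the "perfect inverse" property on a given implementation is sketched below. The name roundtrip_is_exact is hypothetical. If the implementation raises a signal on the conversion, the function never gets to report anything. Where UCHAR_MAX <= INT_MAX, as on most systems, the check passes trivially.

```c
#include <limits.h>

/* Returns 1 if every unsigned char value survives the trip
   unsigned char -> int -> unsigned char unchanged, 0 otherwise.
   roundtrip_is_exact is a hypothetical name for this sketch. */
int roundtrip_is_exact(void)
{
    unsigned char uc = 0;

    do {
        int i = (int)uc;   /* implementation-defined if uc > INT_MAX */
        if ((unsigned char)i != uc)
            return 0;
    } while (uc++ != UCHAR_MAX);   /* stop after testing UCHAR_MAX */

    return 1;
}
```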
 
santosh

James said:
getc() is supposed to read one character at a time as an unsigned
char. When UCHAR_MAX > INT_MAX (which is guaranteed when
sizeof(int)==1, and possible though unlikely when sizeof(int) > 1),
the conversion from unsigned char to int has either an
implementation-defined result or raises an implementation-defined
signal. If such an implementation were to raise a signal from that
conversion, I would consider it broken, but not necessarily
non-conforming. Similarly, if such an implementation were to define the
conversion from unsigned char to int in such a way that it was not a
perfect inverse of the conversion from int to unsigned char, I would
also consider it broken, though not necessarily non-conforming.

On an implementation where sizeof(int)==1 and unsigned char=>int is a
perfect inverse for int=>unsigned char, I don't see any unavoidable
problem. Of course, there is a great deal of code which does not
avoid the possible problems.

Also on systems where sizeof(char) == sizeof(int) and the range of
possible values of unsigned char is greater than that of int, how would
one pass a possibly legal character value to functions like isprint and
others?
 
Mark McIntyre

Geoff said:
And why has anyone not noticed that sizeof 'A' returned 4, not 1?

It's supposed to. In C a character constant is an int.

(and as for the %d thing, I was entirely aware when I wrote the code,
but I didn't want to either muddy the waters with casting or figure out
which unsigned type gcc was using for size_t. I also entirely expected
the discussion that ensued. It's normal).
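For reference, a small sketch that prints the sizes in question without guessing the type behind size_t. It assumes a C99 library for %zu, and the last line shows the usual C90 fallback of casting to unsigned long. The name print_sizes is hypothetical.

```c
#include <stdio.h>

/* print_sizes is a hypothetical helper name for this sketch. */
void print_sizes(void)
{
    /* A character constant has type int in C, so the first two match. */
    printf("sizeof 'A'    = %zu\n", sizeof 'A');
    printf("sizeof(int)   = %zu\n", sizeof(int));
    /* sizeof(char) is 1 by definition. */
    printf("sizeof(char)  = %zu\n", sizeof(char));

    /* C90 alternative: cast, since %zu is not available there. */
    printf("sizeof 'A'    = %lu\n", (unsigned long)sizeof 'A');
}
```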
 
Mark McIntyre

Richard said:
santosh said:



All that this actually breaks is the assumption that a value of EOF
necessarily means "it failed"

Yes but it's obligated to mean that - since many functions are obligated
to return EOF on error conditions.
- on such systems, you have to check feof or
ferror as well before terminating your read loop.

True. But that in itself sorta makes the point - you can't rely on what
the standard says about the return value.
 
jameskuyper

santosh wrote:
....
Also on systems where sizeof(char) == sizeof(int) and the range of
possible values of unsigned char is greater than that of int, how would
one pass a possibly legal character value to functions like isprint and
others?

You shouldn't be processing a value from getc() until you're sure that
it's a legal character, by checking ferror() and feof() if getc()
returns EOF. Alternatively, you could use the I/O routines which
handle more than one byte at a time; in that case, the appropriate
logic to distinguish an error return from getc() of EOF from a valid
character that compares equal to an EOF must reside in the higher
level routine, so you don't need to worry about it.
 
Richard Heathfield

Mark McIntyre said:
Yes but it's obligated to mean that - since many functions are obligated
to return EOF on error conditions.

I agree that it means that - but where does the Standard say that it can
*only* mean that? :)
 
Keith Thompson

Mark McIntyre said:
[I think that was Peter Nilsson.]
This is clearly incorrect.

No, it's clearly correct. The standard has a fairly strict definition
of what it means for two types to be "compatible". You can often
assign a value of type foo to an object of type bar (with an implicit
conversion) even if types foo and bar are not compatible. For
example:

char c = 'x';
double d = c;

The above is legal, but char and double are not compatible.

I don't think the standard has a term for types that can be assigned
to each other without a cast. I suppose you could call such types
"assignment-compatible".
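A short sketch of where the distinction shows up in practice: plain assignment converts implicitly, while pointers to the two types cannot be mixed without a cast. The function name char_to_double is hypothetical.

```c
/* char_to_double is a hypothetical name for this sketch. */
double char_to_double(char c)
{
    double d = c;          /* fine: implicit conversion on assignment */

    /* double *pd = &c; */ /* constraint violation: char * and double *
                              point to incompatible types, so a
                              diagnostic is required without a cast */
    return d;
}
```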
 
Keith Thompson

Mark McIntyre said:
It's supposed to. In C a character constant is an int.

(and as for the %d thing, I was entirely aware when I wrote the code,
but I didn't want to either muddy the waters with casting or figure
out which unsigned type gcc was using for size_t. I also entirely
expected the discussion that ensued. It's normal).

Adding the required casts would have *avoided* muddying the waters.
And you knew that.
 
Mark McIntyre

Richard said:
Mark McIntyre said:


I agree that it means that - but where does the Standard say that it can
*only* mean that? :)

Smiley noted. But since some functions are obligated to use EOF to
indicate an error, it's completely correct to check for that return.

If in some implementations a programmer can't distinguish the error
return from a non-error return, code which is completely conforming will
malfunction. My POV is that if an implementation can't translate and
_execute_ a conforming programme correctly, it can't be a properly
conforming implementation.

Otherwise it becomes trivial to write pathological implementations. Oh,
wait, did someone say MSVC... :)
gd&r
 
Mark McIntyre

Keith said:
Mark McIntyre said:
[I think that was Peter Nilsson.]
This is clearly incorrect.

No, it's clearly correct.

I still disagree...

The standard has a fairly strict definition
of what it means for two types to be "compatible".

Sure, but in those terms the entire discussion becomes meaningless,
since "Two types have compatible type if their types are the same."
The context of the discussion seemed different to merely a statement of
the words of the standard.
 
