EOF

Richard Tobin · Sep 19, 2007

lak said:
Why does EOF have the value -1? What is the purpose of making it -1?
Can anyone tell me the answer?

It doesn't have to have the value -1. It has a negative value so that
it can be easily distinguished from real characters when returned from
getc() and the like.

-- Richard

lak · Sep 19, 2007

Why does EOF have the value -1? What is the purpose of making it -1?
Can anyone tell me the answer?

Mark Bluemel · Sep 19, 2007

lak said:
Why does EOF have the value -1? What is the purpose of making it -1?
Can anyone tell me the answer?

RTFM is an obvious answer. Hint: The logic is essentially the same as
that which stipulates that getchar() (etc) return int not char...

I picked up my copy of K&R (first edition, somewhat battered) and
found that, at the time it was written, there were two common EOF
conventions, one of which could be handled by getchar() returning a char
rather than an int... (Of course it had its own problems).

CBFalconer · Sep 19, 2007

lak said:
Why does EOF have the value -1? What is the purpose of making it -1?
Can anyone tell me the answer?

The only requirement for EOF is that it be negative.

Martin Ambuhl · Sep 19, 2007

lak said:
Why does EOF have the value -1? What is the purpose of making it -1?
Can anyone tell me the answer?

It doesn't. It could be any negative integral constant expression. The
reason is simple: to make EOF an encoding other than those used for
legitimate characters. That is why functions (or macros) like fgetc,
getc, and getchar return ints instead of chars. Did you check the FAQ
(or an elementary textbook) before posting? It's a good idea to do so;
many people, even the sweetest on occasion, get quite steamed at
elementary questions already answered in the FAQ.

bluey · Sep 19, 2007

Why does EOF have the value -1? What is the purpose of making it -1?
Can anyone tell me the answer?

The actual answer is quite simple. You need to learn - as do many -
about the registers on the CPU itself. Given that 0 in ASCII is a
value (albeit a null value), the only way a processor can see past this
is by checking the carry bit flag. By passing a -1 to the register it
can see a processor condition of carry (or not, in this case).

Martin Ambuhl · Sep 20, 2007

bluey said:
The actual answer is quite simple. You need to learn - as do many -
about the registers on the CPU itself. Given that 0 in ASCII is a
value (albeit a null value), the only way a processor can see past this
is by checking the carry bit flag. By passing a -1 to the register it
can see a processor condition of carry (or not, in this case).

Either "bluey" is making a joke, which is in very poor taste since the
person asking the question may take him seriously, or he's an idiot. In
either case you should ignore his blathering.

Keith Thompson · Sep 20, 2007

bluey said:
The actual answer is quite simple. You need to learn - as do many -
about the registers on the CPU itself. Given that 0 in ASCII is a
value (albeit a null value), the only way a processor can see past this
is by checking the carry bit flag. By passing a -1 to the register it
can see a processor condition of carry (or not, in this case).

Knowing about CPU registers is not necessary at all, or even
particularly helpful in this case.

fgetc() returns an int value which is either:
the next character read from the input stream (interpreted as an
unsigned char and converted to int);
or:
a distinct value, namely the value of the macro EOF, to indicate
end-of-file or an error condition.

The standard requires EOF to have a negative value. (It doesn't
require it to be -1; I've never heard of an implementation where it's
defined as anything other than -1, but I still wouldn't assume
anything beyond what the standard guarantees).

Since a valid unsigned char value can be any non-negative value in the
range 0 to UCHAR_MAX, it makes sense to use a negative value if you
need something distinct from all possible unsigned char values.

Note the complete lack in this explanation of any mention of CPU
registers, condition codes, or carry bits. It's all about values, not
representations. And since most code checks whether the result is
equal (or unequal) to EOF, the fact that it's negative doesn't
typically even matter. The standard *could* have allowed EOF to be
defined as (UCHAR_MAX + 1), for example.
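
As a rough illustration of the "values, not representations" point, the sketch below writes three bytes (one of them 0xFF) to a temporary file and counts them back with getc(). The helper names count_bytes and demo_count are made up for this example, and it assumes the usual situation where UCHAR_MAX <= INT_MAX.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in fp.
 * c is an int, so every unsigned char value (0..UCHAR_MAX) stays
 * distinct from the negative EOF value. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                      /* int, not char */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/* Hypothetical demo: writes three bytes to a temporary file and counts
 * them back. Returns -1 if the temporary file cannot be created. */
long demo_count(void)
{
    FILE *fp = tmpfile();
    long n;

    if (fp == NULL)
        return -1;
    putc('A', fp);
    putc(0xFF, fp);   /* read back as 255, which is not equal to EOF */
    putc('B', fp);
    rewind(fp);
    n = count_bytes(fp);
    fclose(fp);
    return n;
}
```

If c were declared as a plain char instead, then on implementations where char is signed and EOF is -1 the 0xFF byte would typically compare equal to EOF and end the loop early.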

Ben Pfaff · Sep 20, 2007

bluey said:
The actual answer is quite simple. You need to learn - as do many -
about the registers on the CPU itself. Given that 0 in ASCII is a
value (albeit a null value), the only way a processor can see past this
is by checking the carry bit flag. By passing a -1 to the register it
can see a processor condition of carry (or not, in this case).

This answer is, at best, at the wrong level of abstraction for
the question. At worst, it is incorrect and misleading. I would
advise the OP to disregard it.

Richard Heathfield · Sep 20, 2007

CBFalconer said:

The only requirement for EOF is that it be negative.

No, it must also have integral type. #define EOF -3.14159 would not be
conforming.

Keith Thompson · Sep 20, 2007

Richard Heathfield said:
CBFalconer said:

No, it must also have integral type. #define EOF -3.14159 would not be
conforming.

Specifically, it must expand to an integer constant expression with
type int and a negative value.

#define EOF (-1L)

would not be conforming. Nor would

#define EOF -1
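
The post does not say why the unparenthesized form fails. My understanding is that the standard requires object-like library macros to be protected by parentheses so that they group like a single identifier in any expression. The sketch below shows one (admittedly contrived) expression where the two forms differ. MY_EOF_BARE, MY_EOF_PAREN and the two functions are hypothetical names used only for illustration.

```c
/* Hypothetical stand-ins for an implementation's definition of EOF. */
#define MY_EOF_BARE   -1
#define MY_EOF_PAREN  (-1)

/* In C, i[p] means the same as p[i]. Postfix [] binds tighter than
 * unary minus, so the bare macro groups differently. */
int index_bare(const int *p)
{
    return MY_EOF_BARE[p];      /* parsed as -(1[p]), i.e. -p[1] */
}

int index_paren(const int *p)
{
    return MY_EOF_PAREN[p];     /* parsed as (-1)[p], i.e. p[-1] */
}
```

Given int a[3] = {10, 20, 30}, calling both functions with a + 1 reads a[2] and negates it in the first case, but reads a[0] in the second.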

Charlie Gordon · Sep 20, 2007

Keith Thompson said:
Knowing about CPU registers is not necessary at all, or even
particularly helpful in this case.

fgetc() returns an int value which is either:
the next character read from the input stream (interpreted as an
unsigned char and converted to int);
or:
a distinct value, namely the value of the macro EOF, to indicate
end-of-file or an error condition.

The standard requires EOF to have a negative value. (It doesn't
require it to be -1; I've never heard of an implementation where it's
defined as anything other than -1, but I still wouldn't assume
anything beyond what the standard guarantees).

Since a valid unsigned char value can be any non-negative value in the
range 0 to UCHAR_MAX, it makes sense to use a negative value if you
need something distinct from all possible unsigned char values.

Note the complete lack in this explanation of any mention of CPU
registers, condition codes, or carry bits. It's all about values, not
representations. And since most code checks whether the result is
equal (or unequal) to EOF, the fact that it's negative doesn't
typically even matter. The standard *could* have allowed EOF to be
defined as (UCHAR_MAX + 1), for example.

That would have required that UCHAR_MAX < INT_MAX.

Interestingly, on architectures where UCHAR_MAX > INT_MAX, converting an
unsigned char to an int with a simple cast gives an implementation-defined
result (or raises an implementation-defined signal) for values above
INT_MAX. On architectures that do not use 2's complement, int can have
fewer distinguishable values than unsigned char. Even on 2's complement,
EOF could be indistinguishable from a valid char read from the stream.

The classic idiom:

int c;
while ((c = getc(fp)) != EOF) {
...
}

would need to be changed to:

int c;
while ((c = getc(fp)) != EOF || !feof(fp)) {
...
}

And still could not account for the exact contents of the stream.

How ugly!

Coos Haak · Sep 20, 2007

On Thu, 20 Sep 2007 09:33:55 +0200, Charlie Gordon wrote:

That would have required that UCHAR_MAX < INT_MAX.

Interestingly, on architectures where UCHAR_MAX > INT_MAX, converting an
unsigned char to an int with a simple cast gives an implementation-defined
result (or raises an implementation-defined signal) for values above
INT_MAX. On architectures that do not use 2's complement, int can have
fewer distinguishable values than unsigned char. Even on 2's complement,
EOF could be indistinguishable from a valid char read from the stream.

The classic idiom:

int c;
while ((c = getc(fp)) != EOF) {
...
}

would need to be changed to:

int c;
while ((c = getc(fp)) != EOF || !feof(fp)) {
...
}

while ((c = getc(fp)) != EOF && !feof(fp)) {

As long as c isn't EOF _and_ the end of the file is not yet reached ;-)

And still could not account for the exact contents of the stream.

How ugly!

Yes!

CBFalconer · Sep 20, 2007

Charlie said:
.... snip ...

int c;
while ((c = getc(fp)) != EOF || !feof(fp)) {
...
}

And still could not account for the exact contents of the stream.

How not? Your construct seems to me to work everywhere, since the
feof() is not called unless EOF == c, and thus does not slow up
operation. feof() is also called only after the possible getc that
reached EOF.

Charlie Gordon · Sep 20, 2007

Coos Haak said:
On Thu, 20 Sep 2007 09:33:55 +0200, Charlie Gordon wrote:

while ((c = getc(fp)) != EOF && !feof(fp)) {

As long as c isn't EOF _and_ the end of the file is not yet reached ;-)

No, this loop will stop if getc(fp) returns EOF, which is possible before
the end of stream if sizeof(char) == sizeof(int).

But my loop is indeed incorrect as it does not stop in case of a read error.

The loop could actually be written this way:

while (c = getc(fp), !feof(fp) && !ferror(fp)) {
...
}

But that makes it even uglier.
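
For what it is worth, here is a minimal sketch of that last loop wrapped in a function, so the end-of-file and error cases can be told apart afterwards. The name count_until_end and the -1 error convention are assumptions made for this example, not anything from the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: counts characters until the stream reports
 * end-of-file or an error. Returns the count, or -1 on a read error. */
long count_until_end(FILE *fp)
{
    long n = 0;
    int c;

    /* The comma operator discards the value of the assignment, so the
     * loop is controlled only by the stream's indicators and never by
     * comparing c with EOF. */
    while (c = getc(fp), !feof(fp) && !ferror(fp)) {
        (void)c;    /* a real loop would process c here */
        n++;
    }
    return ferror(fp) ? -1 : n;
}
```

On ordinary implementations, where every unsigned char value fits in an int, this behaves the same as the classic `!= EOF` loop. It only differs on the exotic platforms discussed in this thread.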

Tor Rustad · Sep 20, 2007

Keith Thompson wrote:

[...]

Note the complete lack in this explanation of any mention of CPU
registers, condition codes, or carry bits. It's all about values, not
representations. And since most code checks whether the result is
equal (or unequal) to EOF, the fact that it's negative doesn't
typically even matter. The standard *could* have allowed EOF to be
defined as (UCHAR_MAX + 1), for example.

It is interesting to note that, while EOF needs to be negative, the value
of WEOF need not be, and could be, e.g., (WCHAR_MAX + 1).
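
The wide-character counterpart of the usual read loop might look like the sketch below. The helper name count_wide is hypothetical. The point is only that the result of fgetwc() is kept in a wint_t and compared for equality with WEOF, without assuming anything about its sign.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: counts the wide characters remaining in fp.
 * wint_t can hold every wchar_t value plus the distinct value WEOF. */
long count_wide(FILE *fp)
{
    long n = 0;
    wint_t wc;                  /* wint_t, not wchar_t */

    while ((wc = fgetwc(fp)) != WEOF)
        n++;
    return n;
}
```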

Coos Haak · Sep 20, 2007

On Thu, 20 Sep 2007 18:36:55 +0200, Charlie Gordon wrote:

No, this loop will stop if getc(fp) returns EOF, which is possible before
the end of stream if sizeof(char) == sizeof(int).

But my loop is indeed incorrect as it does not stop in case of a read error.

The loop could actually be written this way:

while (c = getc(fp), !feof(fp) && !ferror(fp)) {
...
}

But that makes it even uglier.

So the shortcut evaluation of C is somewhat different from mathematical
logic. || is indeed somewhat prettier to the eye than &&. Thanks for
reminding me.

Peter J. Holzer · Sep 21, 2007

How not? Your construct seems to me to work everywhere,

Assume that int is 16 bits with a sign-magnitude representation and
unsigned char is also 16 bits. Then there are only 65535 different int
values, but 65536 different unsigned char values. So two different
unsigned char values will be mapped to the same int value (probably 0
and 32768 will both be mapped to 0) and cannot be distinguished any
more (immediately assigning c to an unsigned char variable may help,
though).

One could argue that such an implementation is not conforming, however.
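
A program that wants to know whether it is on such a platform can ask the preprocessor. This is only a sketch; the function name eof_is_unambiguous is invented for this example.

```c
#include <limits.h>

/* Returns 1 if every unsigned char value fits in a non-negative int,
 * so that a negative EOF cannot collide with a character returned by
 * getc(). Returns 0 on the exotic platforms described above. */
int eof_is_unambiguous(void)
{
#if UCHAR_MAX <= INT_MAX
    return 1;
#else
    return 0;   /* e.g. char and int both 16 bits wide */
#endif
}
```

On the common implementations with 8-bit char and an int of 16 bits or more, this returns 1 and the classic `!= EOF` idiom is safe.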

hp

Peter J. Holzer · Sep 21, 2007

On Thu, 20 Sep 2007 18:36:55 +0200, Charlie Gordon wrote:

So the shortcut evaluation of C is somewhat different from mathematical
logic.

How is that different from mathematical logic?

hp
