Reading binary file finding EOF

spideyman99

How can I find the true EOF for a file that contains 0xFF? When I use
fgetc() and it encounters 0xFF, it thinks it's the end of the file, but
it really isn't.
 
Jens.Toerring

> How can I find the true EOF for a file that contains 0xFF. When I use
> fgetc and it encounters 0xFF it thinks it's the end of file but it
> really isn't.

EOF is not a character, it's a condition. The confusion probably comes
from some very old DOS versions where ^Z was an end-of-file marker.
But every value from 0x00 to 0xFF is a completely normal and valid
character that can be returned by fgetc(). That's why the return
value of fgetc() is an int, not a char: since all possible chars are
valid return values, a value outside this range, defined as EOF (don't
even speculate about its value; while it's often -1, that's not
necessary and 1725 would also do), gets returned when the end of the
file is detected. So always store what you get from fgetc() in an int
(and _not_ a char) and compare that to EOF.

Regards, Jens
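
To illustrate the advice above, here is a minimal sketch. The helper name count_bytes is an assumption made for this example and does not come from the thread. It stores the result of fgetc() in an int, so a 0xFF byte arrives as 255 and stays distinct from EOF:

```c
#include <stdio.h>

/* Hypothetical helper: counts every byte in a stream opened in binary mode.
   Returns the number of bytes read before fgetc() reported EOF. */
long count_bytes(FILE *fp)
{
    int ch;            /* int, not char: must hold 0..UCHAR_MAX and EOF */
    long count = 0;

    while ((ch = fgetc(fp)) != EOF) {
        count++;       /* a 0xFF byte arrives here as 255, not as EOF */
    }
    return count;
}
```

The stream would typically come from fopen(path, "rb"), so that no text-mode translation interferes with the byte values.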
 
Mike Wahler

(e-mail address removed) wrote:
> defined as EOF (don't
> even speculate about its value, while it's often -1 that's not
> necessary and 1725 would also do)

No it would not. EOF must be negative.

From ISO/IEC 9899:1999 (E) 7.19.1/3 :
--------------------------------------------------------------
EOF

which expands to an integer constant expression, with type int
and a negative value, that is returned by several functions to
indicate end-of-file, that is, no more input from a stream
--------------------------------------------------------------


Anyway, the value 1725 could easily be a valid character value
for implementations whose byte size can accommodate it. Also,
if a member of the basic execution character set is stored in
a char object, its value is guaranteed to be nonnegative
(6.2.5/3).

-Mike
 
Jens.Toerring

> No it would not. EOF must be negative.
> From ISO/IEC 9899:1999 (E) 7.19.1/3 :
> which expands to an integer constant expression, with type int
> and a negative value, that is returned by several functions to
> indicate end-of-file, that is, no more input from a stream
> --------------------------------------------------------------
> Anyway, the value 1725 could easily be a valid character value
> for implementations whose byte size can accommodate it. Also,
> if a member of the basic execution character set is stored in
> a char object, its value is guaranteed to be nonnegative
> (6.2.5/3).

I see. Thanks for the correction.
Regards, Jens
 
James McIninch

<posted & mailed>

0xFF is never EOF. You simply:

char ch;
FILE *fp;
..
..
..
for (ch = fgetc(fp); !feof(fp); ch = fgetc(fp))
{
    /* do something */
}


or

int ch;
FILE *fp;
..
..
..
for (ch = fgetc(fp); ch != EOF; ch = fgetc(fp))
{
    /* do something */
}
 
Jack Klein

On Sun, 12 Dec 2004 22:45:50 -0500, James McIninch

Note, top posting is considered rude here, and in most other technical
discussion groups. It makes discussions very hard to follow. Your
new material belongs after, or interspersed with, material you are
quoting. I have reformatted your post properly.

> 0xFF is never EOF. You simply:
>
> char ch;

WRONG!

As Keith correctly pointed out, see this question in the FAQ:

http://www.eskimo.com/~scs/C-faq/q12.1.html

> FILE *fp;
> .
> .
> .
> for (ch = fgetc(fp); !feof(fp); ch = fgetc(fp))
> {
> /* do something */
> }

The incorrect snippet above is almost certainly what the OP is doing
that is causing the problem he described.

> or
>
> int ch;

Yes, this one is correct.
 
Lawrence Kirby

On Sun, 12 Dec 2004 22:45:50 -0500, James McIninch

> Note, top posting is considered rude here, and in most other technical
> discussion groups. It makes discussions very hard to follow. Your
> new material belongs after, or interspersed with, material you are
> quoting. I have reformatted your post properly.

> WRONG!

Not wrong.

> As Keith correctly pointed out, see this question in the FAQ:
>
> http://www.eskimo.com/~scs/C-faq/q12.1.html

The code here doesn't use the value of ch to test for end of file, so ch
can be char, which appears to be the point the code is trying to make.
The code does contain a bug, however, because it fails to test ferror(). If
the fgetc() operation failed due to an error, the loop would never
terminate, appearing to process multiple (char)EOF valued "characters".

> The incorrect snippet above is almost certainly what the OP is doing
> that is causing the problem he described.

> Yes, this one is correct.

For completeness there should be a test like

if (ferror(fp)) {
    /* Handle error */
}

here after the loop. If a program encounters an input error it probably
shouldn't act as if it completed successfully.

Lawrence
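
The point about testing ferror() after the loop can be sketched as follows. The helper name sum_bytes and its return convention are assumptions made for this example only:

```c
#include <stdio.h>

/* Hypothetical helper: sums all bytes in a stream.
   Returns 0 if the loop ended at a genuine end of file,
   -1 if it ended because of a read error. */
int sum_bytes(FILE *fp, unsigned long *sum)
{
    int ch;

    *sum = 0;
    while ((ch = fgetc(fp)) != EOF) {
        *sum += (unsigned long)ch;
    }

    /* EOF alone does not say why reading stopped; ask the stream. */
    if (ferror(fp)) {
        return -1;     /* input error: the sum is incomplete */
    }
    return 0;          /* clean end of file */
}
```

A caller can then report the failure instead of treating a truncated read as a complete one.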
 
Herbert Rosenau

> How can I find the true EOF for a file that contains 0xFF. When I use
> fgetc and it encounters 0xFF it thinks it's the end of file but it
> really isn't.

fgetc() returns int, not char. So define the variable that receives
the value returned by fgetc() as int, not char, and everything
works as expected.


--
Tschau/Bye
Herbert

 
Keith Thompson

Lawrence Kirby said:
> Not wrong.

Yes, wrong.

> The code here doesn't use the value of ch to test for end of file, so ch
> can be char, which appears to be the point the code is trying to make.
> The code does contain a bug however because it fails to test ferror(). If
> the fgetc() operation failed due to an error the loop would never
> terminate, appearing to process multiple (char)EOF valued "characters".

fgetc(fp) returns a value of type int; ch is of type char. The
assignment implicitly converts the int result to type char. If char
is signed, and if the value returned by fgetc(fp) is outside the range
CHAR_MIN..CHAR_MAX, the conversion either yields an implementation-defined
value or raises an implementation-defined signal (C99 6.3.1.3p3).
This will happen either if fgetc() returns EOF or if it returns a
value greater than CHAR_MAX (but less than or equal to UCHAR_MAX).

Making ch an unsigned char might alleviate this, but of course the
correct solution is to use an int and check for ch==EOF.
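
A small sketch of the difference described above. Both helper names are assumptions made for this example. It uses signed char explicitly so that the narrowing does not depend on whether plain char is signed:

```c
#include <stdio.h>

/* Hypothetical helper: what the comparison sees when the fgetc() result
   is kept in an int. Only a real EOF return matches. */
int matches_eof_as_int(int fgetc_result)
{
    return fgetc_result == EOF;
}

/* Hypothetical helper: what the comparison sees when the result is first
   squeezed into a signed char. For values above SCHAR_MAX the conversion
   is implementation-defined; on many common platforms 0xFF becomes -1. */
int matches_eof_as_signed_char(int fgetc_result)
{
    signed char narrowed = (signed char)fgetc_result;
    return narrowed == EOF;
}
```

On an implementation where EOF is -1 and the conversion wraps, matches_eof_as_signed_char(0xFF) reports a match, which is the symptom from the original question. matches_eof_as_int(0xFF) cannot, because EOF is negative and 0xFF is not.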
 
Lawrence Kirby

On Mon, 13 Dec 2004 23:53:14 +0000, Keith Thompson wrote:

....
> fgetc(fp) returns a value of type int; ch is of type char. The
> assignment implicitly converts the int result to type char. If char
> is signed, and if the value returned by fgetc(fp) is outside the range
> CHAR_MIN..CHAR_MAX, the conversion either yields an implementation-defined
> value or raises an implementation-defined signal (C99 6.3.1.3p3).
> This will happen either if fgetc() returns EOF or if it returns a
> value greater than CHAR_MAX (but less than or equal to UCHAR_MAX).

Strictly correct but this is one of those cases where existing usage of
for example the form

int ch;
char buffer[SIZE];

while ((ch = fgetc(fp)) != EOF) {
    buffer[pos] = ch;

.
.
.

is so overwhelming that it just isn't worth worrying about. An
implementation that causes trouble here won't be very popular.

> Making ch an unsigned char might alleviate this, but of course the
> correct solution is to use an int and check for ch==EOF.

Except when sizeof(int)==1 or more specifically UCHAR_MAX > INT_MAX.

That is another possibility that we tend to ignore, but one that occurs on
real (if mostly freestanding) implementations.

Lawrence
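
For the unusual case mentioned above, where UCHAR_MAX > INT_MAX and a valid character could convert to the same int value as EOF, one commonly suggested hedge is to confirm an EOF return with feof() or ferror() before stopping. A minimal sketch follows; the helper name count_bytes_portable is an assumption made for this example:

```c
#include <stdio.h>

/* Hypothetical helper: counts bytes without assuming that EOF is distinct
   from every character value. */
long count_bytes_portable(FILE *fp)
{
    int ch;
    long count = 0;

    for (;;) {
        ch = fgetc(fp);
        /* Treat EOF as "stop" only if the stream confirms end of file or error. */
        if (ch == EOF && (feof(fp) || ferror(fp))) {
            break;
        }
        count++;
    }
    return count;
}
```

On mainstream hosted implementations the extra check is redundant, since EOF cannot collide with a character value there, but it costs little.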
 
