EOF for binary files

Discussion in 'C Programming' started by Registered User, Nov 11, 2006.

  1. I've read in a book:

    <quote>
    With a binary-mode stream, you can't detect the end-of-file by looking
    for EOF, because a byte of data from a binary stream could have that
    value, which would result in premature end of input. Instead, you can
    use the library function feof(), which can be used for both binary- and
    text-mode files:

    int feof(FILE *fp);
    </quote>

    Isn't it true that testing for EOF is valid for both text- and
    binary-mode files?

    Also, the FAQ recommends not to use feof():
    <quote>In virtually all cases, there's no need to use feof at all.
    </quote>
     
    Registered User, Nov 11, 2006
    #1

  2. Registered User said:

    > I've read in a book:
    >
    > <quote>
    > With a binary-mode stream, you can't detect the end-of-file by looking
    > for EOF, because a byte of data from a binary stream could have that
    > value, which would result in premature end of input.


    Ditch the book. It doesn't understand EOF.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: normal service will be restored as soon as possible. Please do not
    adjust your email clients.
     
    Richard Heathfield, Nov 11, 2006
    #2

  3. Richard Heathfield wrote:
    > Registered User said:
    >
    > > I've read in a book:
    > >
    > > <quote>
    > > With a binary-mode stream, you can't detect the end-of-file by looking
    > > for EOF, because a byte of data from a binary stream could have that
    > > value, which would result in premature end of input.

    >
    > Ditch the book. It doesn't understand EOF.
    >

    Oh, thanks Richard!! That part of the book really got me confused.
     
    Registered User, Nov 11, 2006
    #3
  4. Registered User

    CBFalconer Guest

    Registered User wrote:
    >
    > I've read in a book:
    >
    > <quote>
    > With a binary-mode stream, you can't detect the end-of-file by
    > looking for EOF, because a byte of data from a binary stream could
    > have that value, which would result in premature end of input.
    > Instead, you can use the library function feof(), which can be
    > used for both binary- and text-mode files:
    >
    > int feof(FILE *fp);
    > </quote>
    >
    > Isn't it true that testing for EOF is valid for both text- and
    > binary-mode files?


    Yes. The only possible exception occurs when (sizeof(int) == 1).
    A stream is a stream of bytes, and the routines to read them return
    ints formed from the unsigned char value involved. Thus the value
    of EOF, which is negative, is always distinct from any byte value.

    >
    > Also, the FAQ recommends not to use feof():
    > <quote>In virtually all cases, there's no need to use feof at all.
    > </quote>


    feof is primarily useful to distinguish between i/o errors and
    actual eof, either of which will make getc return EOF.

    if (EOF == (ch = getc(f))) {
        if (feof(f)) {
            /* actual file eof encountered */
        }
        else {
            /* use ferror etc. to determine the cause */
        }
    }
    else {
        /* use the value of ch, which is a valid unsigned char */
    }

    Note that ch must have been declared as an int.

    --
    Chuck F (cbfalconer at maineline dot net)
    Available for consulting/temporary embedded and systems.
    <http://cbfalconer.home.att.net>
     
    CBFalconer, Nov 11, 2006
    #4
  5. Registered User said:

    > Richard Heathfield wrote:
    >> Registered User said:
    >>
    >> > I've read in a book:
    >> >
    >> > <quote>
    >> > With a binary-mode stream, you can't detect the end-of-file by looking
    >> > for EOF, because a byte of data from a binary stream could have that
    >> > value, which would result in premature end of input.

    >>
    >> Ditch the book. It doesn't understand EOF.
    >>

    > Oh, thanks Richard!! That part of the book really got me confused.


    The mistake the author makes is that he appears to believe EOF is a
    character. It isn't. It's a message from your I/O library which, freely
    translated, means "you asked me for more data, squire, but there ain't
    none. The pot's empty. Sorry, I'd love to help and all that...".

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: normal service will be restored as soon as possible. Please do not
    adjust your email clients.
     
    Richard Heathfield, Nov 11, 2006
    #5
  6. In article <>,
    Registered User <> wrote:

    >With a binary-mode stream, you can't detect the end-of-file by looking
    >for EOF, because a byte of data from a binary stream could have that
    >value, which would result in premature end of input.


    It would certainly be a mistake to compare a byte against EOF if the
    byte is a char, because EOF is an int value and a char converted to
    an int might have the same value as EOF. But getc() doesn't return
    a char; it returns an unsigned char converted to an int, so there
    is no possibility of a real byte appearing to be equal to EOF, because
    EOF is guaranteed to be negative.

    So you can perfectly well compare against EOF provided you don't
    convert the value to a char first.
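
    A minimal sketch of the difference, for illustration only. The function
    names count_bytes_int and count_bytes_narrowed are hypothetical, and the
    second one assumes EOF fits in a signed char (it is -1 on common
    implementations, but the Standard only promises a negative int):

```c
#include <assert.h>
#include <stdio.h>

/* Counts bytes up to end-of-input, keeping getc()'s result in an int. */
long count_bytes_int(FILE *f)
{
    long n = 0;
    int ch;                 /* int: holds every byte value and also EOF */

    assert(f != NULL);
    while ((ch = getc(f)) != EOF)
        n++;
    return n;
}

/* The mistaken variant: the result is narrowed before the comparison.
   Assumption: EOF fits in a signed char; the result of narrowing an
   out-of-range value is implementation-defined. */
long count_bytes_narrowed(FILE *f)
{
    long n = 0;
    signed char ch;         /* too narrow: a data byte may now look like EOF */

    assert(f != NULL);
    while ((ch = (signed char)getc(f)) != EOF)
        n++;
    return n;
}
```

    Given a binary stream holding the bytes 0x41 0xFF 0x42, the int version
    counts all three. Where EOF is -1 and narrowing 0xFF to signed char
    yields -1 (the common case), the narrowed version stops at the 0xFF
    byte, which is the "premature end of input" the book worries about.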

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
     
    Richard Tobin, Nov 11, 2006
    #6
  7. Registered User

    Eric Sosman Guest

    Registered User wrote:
    > I've read in a book:
    >
    > <quote>
    > With a binary-mode stream, you can't detect the end-of-file by looking
    > for EOF, because a byte of data from a binary stream could have that
    > value, which would result in premature end of input. Instead, you can
    > use the library function feof(), which can be used for both binary- and
    > text-mode files:
    >
    > int feof(FILE *fp);
    > </quote>
    >
    > Isn't it true that testing for EOF is valid for both text- and
    > binary-mode files?


    The book is right in the sense that it is possible for a
    byte read from a stream (text or binary) to have the value
    EOF, but only on "exotic" machines where bytes and ints have
    the same size. That is, the book is right if it's trying to
    be "fully general" -- but if it's writing about "mainstream"
    C implementations it's wrong.

    The Standard defines all input operations as if they used
    the fgetc() function as many times as necessary (the actual
    implementation might do something more intricate, but the end
    result must be the same). The fgetc() function returns an int
    value: either EOF to indicate failure, or an actual input byte
    represented as unsigned char converted to int. If int is
    wider than char, converting an unsigned char to an int yields
    a non-negative value, and since the EOF macro expands to a
    negative number there can be no confusion.

    On those exotic architectures, though, things get sticky.
    If sizeof(int) == 1, there must be unsigned char values that
    are too large for int: for example, on a system with sixteen-bit
    chars and sixteen-bit ints, INT_MAX will be 32767 but UCHAR_MAX
    will be 65535. Since fgetc() must be able to read back any
    character values fputc() might have written (subject to some
    restrictions that don't matter here), on this system it must
    be able to return 65536 distinguishable int values. Half of
    those will necessarily be negative, and one of them will have
    the same value as EOF. So on exotic architectures, it is
    possible for fgetc() to return EOF when reading "real" data,
    and the only way to tell whether the EOF is actual data or an
    indication of input failure is to call both feof() and ferror().
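
    As a rough sketch of the reasoning above (the names eof_is_unambiguous,
    read_result and classify are made up for this example, not library
    functions), one can test whether the platform is "exotic" and sort an
    fgetc() return value into the three possible outcomes:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Returns nonzero when every unsigned char value fits in a non-negative
   int, so no data byte returned by fgetc() can compare equal to EOF. */
int eof_is_unambiguous(void)
{
    return UCHAR_MAX <= INT_MAX;
}

enum read_result { READ_BYTE, READ_END, READ_ERROR };

/* Classifies ch, the value just returned by fgetc(stream). */
enum read_result classify(FILE *stream, int ch)
{
    assert(stream != NULL);
    if (ch != EOF)
        return READ_BYTE;
    if (ferror(stream))
        return READ_ERROR;
    if (feof(stream))
        return READ_END;
    return READ_BYTE;   /* reachable only where UCHAR_MAX > INT_MAX */
}
```

    On a "mainstream" system eof_is_unambiguous() is nonzero and the last
    return in classify() is never reached.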

    > Also, the FAQ recommends not to use feof():
    > <quote>In virtually all cases, there's no need to use feof at all.
    > </quote>


    I'm not the FAQ author, but I'd read "in virtually all cases"
    to mean "whenever int is wider than char," or "on virtually all
    `mainstream' machines." It would be nice, IMHO, if the FAQ were
    more explicit about this, but it's not a big failing.

    The FAQ is right in implying that feof() is seldom used,
    because after receiving an EOF return value (on a "mainstream"
    system) your immediate concern should be "End-of-input, or error?"
    and it seems more natural to use ferror() for that question:

    int ch;
    while ( (ch = fgetc(stream)) != EOF ) {
        /* process the character just read */
    }
    /* "Why did we get EOF?" */
    if (ferror(stream)) {
        /* do something about the I/O error */
    }
    else {
        /* normal end-of-input */
    }

    This code assumes that EOF can only appear as the result of
    end-of-input or I/O error, so if there's no I/O error the stream
    must have reached its end. Of course, the same reasoning would
    hold for using feof(stream) and swapping the bodies of the two
    if statements, but "ferror?" seems a more direct inquiry.
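
    For a concrete instance of the pattern above, here is a small sketch
    that copies one stream to another. The name copy_stream and its return
    convention are assumptions made for this example; it relies on the
    "mainstream" case where EOF cannot be a data byte:

```c
#include <assert.h>
#include <stdio.h>

/* Copies in to out byte by byte. Returns the number of bytes copied,
   or -1 if a read or write error occurred. */
long copy_stream(FILE *in, FILE *out)
{
    long n = 0;
    int ch;                       /* must be int, not char */

    assert(in != NULL && out != NULL);
    while ((ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF)
            return -1;            /* write error */
        n++;
    }
    /* "Why did we get EOF?" */
    return ferror(in) ? -1 : n;
}
```

    Bytes such as 0x00 and 0xFF pass through unchanged, because the
    comparison against EOF is made on the int that getc() returned. Open
    both streams in binary mode ("rb" and "wb") when the data is not text.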

    On "exotic" architectures the either/or reasoning breaks down
    because there's a third possibility: an EOF return might be actual
    input data. If you're writing with such a system in mind you need
    to use both feof() and ferror() to distinguish the three outcomes,
    and the loop might look something like

    int ch;
    while ( (ch = fgetc(stream)) , /* comma operator */
            (!feof(stream) && !ferror(stream)) ) {
        /* process the character just read */
    }
    /* "Was it error or end-of-input?" */
    if (ferror(stream)) {
        /* do something about the I/O error */
    }
    else {
        /* normal end-of-input */
    }

    Of course, this can be written in many other rearrangements. One
    likely change would be to call feof() and ferror() only when an EOF
    shows up instead of every single time, by changing the while clause
    to something like

    while ( (ch = fgetc(stream)) != EOF
            || (!feof(stream) && !ferror(stream)) )

    Since most I/O devices are pathetically slow compared to most CPUs,
    this "optimization" probably doesn't save noticeable time -- but
    it is in the tradition of C to worry about tiny efficiencies while
    ignoring gross waste. ;-) (That same tradition, by the way, calls
    for using getc() instead of fgetc() wherever possible.)

    --
    Eric Sosman
     
    Eric Sosman, Nov 11, 2006
    #7
  8. Registered User

    Coos Haak Guest

    Op 11 Nov 2006 14:34:44 GMT schreef Richard Tobin:

    > In article <>,
    > Registered User <> wrote:
    >
    >>With a binary-mode stream, you can't detect the end-of-file by looking
    >>for EOF, because a byte of data from a binary stream could have that
    >>value, which would result in premature end of input.

    >
    > It would certainly be a mistake to compare a byte against EOF if the
    > byte is a char, because EOF is an int value and a char converted to
    > an int might have the same value as EOF. But getc() doesn't return
    > a char; it returns an unsigned char converted to an int, so there
    > is no possibility of a real byte appearing to be equal to EOF, because
    > EOF is guaranteed to be negative.


    getc returns an int, not a char, be it signed or unsigned.
    #include <stdio.h>
    int getc(FILE *FP);
    And yes, if no EOF condition is reached, the int may be regarded as char.
    EOF does not fit in a char so it well may be some negative number.

    > So you can perfectly well compare against EOF provided you don't
    > convert the value to a char first.


    Yes.
    --
    Coos
     
    Coos Haak, Nov 11, 2006
    #8
  9. Registered User

    Flash Gordon Guest

    Coos Haak wrote:
    > Op 11 Nov 2006 14:34:44 GMT schreef Richard Tobin:
    >
    >> In article <>,
    >> Registered User <> wrote:
    >>
    >>> With a binary-mode stream, you can't detect the end-of-file by looking
    >>> for EOF, because a byte of data from a binary stream could have that
    >>> value, which would result in premature end of input.

    >> It would certainly be a mistake to compare a byte against EOF if the
    >> byte is a char, because EOF is an int value and a char converted to
    >> an int might have the same value as EOF. But getc() doesn't return
    >> a char; it returns an unsigned char converted to an int, so there
    >> is no possibility of a real byte appearing to be equal to EOF, because
    >> EOF is guaranteed to be negative.

    >
    > getc returns an int, not a char, be it signed or unsigned.


    Richard said that.

    > #include <stdio.h>
    > int getc(FILE *FP);
    > And yes, if no EOF condition is reached, the int may be regarded as char.


    By *definition* if EOF is not returned the value is that of an
    *unsigned* char as, again, Richard said.

    > EOF does not fit in a char so it well may be some negative number.


    EOF is *defined* as being a negative number, so there is no "may well
    be" about it.

    >> So you can perfectly well compare against EOF provided you don't
    >> convert the value to a char first.

    >
    > Yes.


    Everything Richard said in that post is correct, not just that last
    sentence.
    --
    Flash Gordon
     
    Flash Gordon, Nov 11, 2006
    #9
  10. Registered User

    Coos Haak Guest

    Op Sat, 11 Nov 2006 17:21:14 +0000 schreef Flash Gordon:

    > Coos Haak wrote:
    >> Op 11 Nov 2006 14:34:44 GMT schreef Richard Tobin:
    >>
    >>> In article <>,
    >>> Registered User <> wrote:
    >>>
    >>>> With a binary-mode stream, you can't detect the end-of-file by looking
    >>>> for EOF, because a byte of data from a binary stream could have that
    >>>> value, which would result in premature end of input.
    >>> It would certainly be a mistake to compare a byte against EOF if the
    >>> byte is a char, because EOF is an int value and a char converted to
    >>> an int might have the same value as EOF. But getc() doesn't return

    My mistake, I overlooked the line just above.
    Sorry for reading and replying too hastily ;-(
    --
    Coos
     
    Coos Haak, Nov 11, 2006
    #10
  11. "Registered User" <> writes:
    > I've read in a book:
    >
    > <quote>
    > With a binary-mode stream, you can't detect the end-of-file by looking
    > for EOF, because a byte of data from a binary stream could have that
    > value, which would result in premature end of input. Instead, you can
    > use the library function feof(), which can be used for both binary- and
    > text-mode files:
    >
    > int feof(FILE *fp);
    > </quote>


    Who is the author? If it's Schildt, we already know about him (and
    warn people away from his books whenever possible). If it's someone
    else, we may have another name for The List.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Nov 11, 2006
    #11
  12. Registered User

    SM Ryan Guest

    "Registered User" <> wrote:
    # I've read in a book:
    #
    # <quote>
    # With a binary-mode stream, you can't detect the end-of-file by looking
    # for EOF, because a byte of data from a binary stream could have that
    # value, which would result in premature end of input. Instead, you can
    # use the library function feof(), which can be used for both binary- and
    # text-mode files:

    It's referring to getw(fp) which can return the same value as EOF
    without actually being at the end of file.

    --
    SM Ryan http://www.rawbw.com/~wyrmwif/
    The little stoner's got a point.
     
    SM Ryan, Nov 11, 2006
    #12
  13. SM Ryan <> writes:
    > "Registered User" <> wrote:
    > > I've read in a book:
    > >
    > > <quote>
    > > With a binary-mode stream, you can't detect the end-of-file by looking
    > > for EOF, because a byte of data from a binary stream could have that
    > > value, which would result in premature end of input. Instead, you can
    > > use the library function feof(), which can be used for both binary- and
    > > text-mode files:

    >
    > It's referring to getw(fp) which can return the same value as EOF
    > without actually being at the end of file.


    What makes you think it's referring to getw()? There is no such function
    in standard C.

    <OT>
    There is a non-standard function getw() that reads a word (defined as
    an int) from a stream. It's not even POSIX; it's defined by SVID, and
    one man page recommends using fread() instead. The text quoted from
    the book doesn't even make sense in terms of getw(), since it talks
    about a *byte* of data having the value EOF.
    </OT>
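
    To illustrate the fread() suggestion, a minimal sketch follows. The
    helper name read_int is hypothetical, and it reads the int in the
    host's native size and byte order, which is an assumption about the
    file format:

```c
#include <assert.h>
#include <stdio.h>

/* Reads one int in the host's native representation.
   Returns 1 on success, 0 on end-of-input or error. */
int read_int(FILE *f, int *value)
{
    assert(f != NULL && value != NULL);
    return fread(value, sizeof *value, 1, f) == 1;
}
```

    Because success is reported through the return value rather than
    through the data, a stored int that happens to equal EOF is read
    back like any other; after a 0 return, feof() and ferror() tell the
    two failure causes apart.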

    It's far more likely that the author of the book just doesn't know
    what he's talking about.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Nov 11, 2006
    #13