ifstream::get() surprise

Discussion in 'C++' started by Jacek Dziedzic, Aug 26, 2004.

  1. Hi!

    Consider the following program

    #include <fstream>
    #include <iostream>
    using namespace std;

    int main() {
        ifstream in("test.txt");
        char buf[40];
        in.get(buf,40);
        cerr << "Read: *" << buf << "*, trouble: " << !in << endl;
    }

    and a file, test.txt, starting with an empty line, i.e. a lone EOL
    character on the first line.

    I was quite surprised to find that under these circumstances
    the aforementioned program produced
    "Read: **, trouble: 1".

    Why does 'in' go to a fail state? I thought 'get' reads up to
    the terminator, stores all characters into 'buf' and leaves the
    terminator inside the stream. That would mean 'buf' containing
    just a \0 char (no chars read), the EOL still in the stream, but
    why a failed state? There are more lines in the file, so we're
    not eof(), and my understanding of this situation is a
    "successful read of zero characters" rather than "read error".

    How then can I distinguish a successful reading of an empty
    line from an I/O error during reading a line? Of course I have
    no a priori knowledge whether these empty lines exist in my parsed
    file or not.

    TIA,
    - J.
    Jacek Dziedzic, Aug 26, 2004
    #1

  2. Jacek Dziedzic wrote:
    > Consider the following program
    >
    > #include <fstream>
    > #include <iostream>
    > using namespace std;
    >
    > int main() {
    >     ifstream in("test.txt");
    >     char buf[40];
    >     in.get(buf,40);
    >     cerr << "Read: *" << buf << "*, trouble: " << !in << endl;
    > }
    >
    > and a file, test.txt, starting with an empty line, i.e. a lone EOL
    > character on the first line.
    >
    > I was quite surprised to find that under these circumstances
    > the aforementioned program produced
    > "Read: **, trouble: 1".
    >
    > Why does 'in' go to a fail state?


    It does that whenever it stores no characters.

    > I thought 'get' reads up to
    > the terminator, stores all characters into 'buf' and leaves the
    > terminator inside the stream. That would mean 'buf' containing
    > just a \0 char (no chars read), the EOL still in the stream, but
    > why a failed state?


    There is no other way to tell you that no characters have been read.

    > There are more lines in the file, so we're
    > not eof(), and my understanding of this situation is a
    > "successful read of zero characters" rather than "read error".


    Right.

    > How then can I distinguish a successful reading of an empty
    > line from an I/O error during reading a line? Of course I have
    > no a priori knowledge whether these empty lines exist in my parsed
    > file or not.


    If your stream is in good standing before the operation and has its
    'failbit' set after the operation, no characters have been stored. If
    the file (stream) has somehow lost integrity, 'badbit' is set. That's
    how you distinguish.
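
    A minimal sketch of that check, assuming C++11. The names LineStatus,
    read_line and classify_lines are hypothetical and only serve the
    illustration. It treats "failbit set, but neither eofbit nor badbit" as
    an empty line, clears the state, and skips the '\n' that get() left in
    the stream.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical status codes; none of these names come from the standard library.
enum class LineStatus { Text, Empty, EndOfInput, Error };

// Reads one line into buf (capacity n) with istream::get, then consumes the
// '\n' that get() leaves in the stream. Lines longer than n - 1 characters
// come back in several Text pieces.
LineStatus read_line(std::istream& in, char* buf, std::streamsize n) {
    assert(buf != nullptr && n > 1);
    if (in.eof()) return LineStatus::EndOfInput;   // a previous read hit the end
    if (!in.good()) return LineStatus::Error;      // not in good standing beforehand

    in.get(buf, n);
    if (in.bad()) return LineStatus::Error;        // stream lost integrity
    if (in.fail()) {                               // failbit: no characters stored
        if (in.eof()) return LineStatus::EndOfInput;
        in.clear();                                // empty line: recover ...
        if (in.peek() == '\n') in.ignore();        // ... and skip its '\n'
        return LineStatus::Empty;
    }
    if (in.good() && in.peek() == '\n') in.ignore();
    return LineStatus::Text;
}

// Demo driver: classifies each line of an in-memory "file".
std::vector<LineStatus> classify_lines(const std::string& text) {
    std::istringstream in(text);
    char buf[40];
    std::vector<LineStatus> result;
    for (;;) {
        const LineStatus s = read_line(in, buf, sizeof buf);
        if (s == LineStatus::EndOfInput || s == LineStatus::Error) break;
        result.push_back(s);
    }
    return result;
}
```

    For the input "\nabc\n\nxyz", classify_lines reports Empty, Text, Empty,
    Text. The initial good() test matters: without it, a stream that failed
    to open would look like an endless run of empty lines.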

    Victor
    Victor Bazarov, Aug 26, 2004
    #2

  3. Victor Bazarov:
    >> the EOL still in the stream, but
    >> why a failed state?

    >
    > There is no other way to tell you that no characters have been read.


    Why not just store '\0' into the buffer and leave the stream ok?

    > If your stream is in good standing before the operation and has its
    > 'failbit' set after the operation, no characters have been stored. If
    > the file (stream) has somehow lost integrity, 'badbit' is set. That's
    > how you distinguish.


    Oh I see. So if someone e.g. pulls the floppy containing the file
    out of the drive, then in.bad() will be true? Yes, that sounds
    reasonable!

    Thanks for the quick reply,
    - J.
    Jacek Dziedzic, Aug 26, 2004
    #3
  4. Mike Wahler (Guest)

    "Jacek Dziedzic" <> wrote in message
    news:cgl2kt$grf$...
    > Hi!
    >
    > Consider the following program
    >
    > #include <fstream>
    > #include <iostream>
    > using namespace std;
    >
    > int main() {
    > ifstream in("test.txt");


    You need to check here whether the file was opened
    successfully or not, and not proceed if it wasn't.

    Also note that 'get()' is an unformatted input function.
    Using it with a text-mode stream (the default) could have
    unexpected results. If you want to read unformatted,
    open with 'std::ios::binary'. But I think you're probably
    just using the wrong function. See below.

    > char buf[40];
    > in.get(buf,40);
    > cerr << "Read: *" << buf << "*, trouble: " << !in << endl;
    > }
    >
    > and a file, test.txt, starting with an empty line, i.e. a lone EOL
    > character on the first line.
    >
    > I was quite surprised to find that under these circumstances
    > the aforementioned program produced
    > "Read: **, trouble: 1".


    It's not surprising when you read the specification of 'std::istream::get()'.

    >
    > Why does 'in' go to a fail state?


    By design.

    >I thought 'get' reads up to
    > the terminator, stores all characters into 'buf' and leaves the
    > terminator inside the stream. That would mean 'buf' containing
    > just a \0 char (no chars read), the EOL still in the stream, but
    > why a failed state? There are more lines in the file, so we're
    > not eof(), and my understanding of this situation is a
    > "successful read of zero characters" rather than "read error".


    No such thing as 'successful read of zero characters.' If characters
    were requested and none were extracted, that's a 'failure'.
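
    This rule can be checked with an in-memory stream. The sketch below is
    only an illustration: GetResult and try_get are hypothetical names, and
    std::istringstream stands in for the file. It records the stream state
    after a single get() call.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record of what one call to get() did.
struct GetResult {
    bool failed;               // fail() right after get()
    bool at_eof;               // eof() right after get()
    std::streamsize extracted; // gcount() right after get()
    std::string stored;        // buffer contents
    bool delimiter_left;       // is '\n' still the next character?
};

GetResult try_get(const std::string& text) {
    std::istringstream in(text);
    char buf[40];
    buf[0] = 'X';                       // sentinel, to show get() overwrites it
    in.get(buf, sizeof buf);

    GetResult r;
    r.failed = in.fail();
    r.at_eof = in.eof();
    r.extracted = in.gcount();
    assert(buf[r.extracted] == '\0');   // a null character is stored in any case
    r.stored = buf;
    in.clear();                         // reset the state so peek() can look ahead
    r.delimiter_left = (in.peek() == '\n');
    return r;
}
```

    With text starting with an empty line, such as "\nsecond line\n", the
    result has failed == true, at_eof == false, extracted == 0, an empty
    stored string, and delimiter_left == true. With "abc\ndef\n" the call
    succeeds, stores "abc", and still leaves the '\n' in the stream.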


    ============== begin quote ===========================
    ISO/IEC 14882:1998(E)

    27.6.1.3 Unformatted input functions

    basic_istream<charT,traits>& get(char_type* s, streamsize n,
    char_type delim );

    7 Effects: Extracts characters and stores them into successive locations
    of an array whose first element is designated by s. (286) Characters
    are extracted and stored until any of the following occurs:

    -- n - 1 characters are stored;

    -- end-of-file occurs on the input sequence (in which case the function
    calls setstate(eofbit));

    -- c == delim for the next available input character c (in which case c
    is not extracted).

    8 If the function stores no characters, it calls setstate(failbit) (which
    may throw ios_base::failure (27.4.4.3)). In any case, it then stores a
    null character into the next successive location of the array.

    9 Returns: *this.

    basic_istream<charT,traits>& get(char_type* s, streamsize n)

    10 Effects: Calls get(s,n,widen('\n'))

    11 Returns: Value returned by the call.
    ============== end quote ===========================

    >
    > How then can I distinguish a successful reading of an empty
    > line from an I/O error during reading a line? Of course I have
    > no a priori knowledge whether these empty lines exist in my parsed
    > file or not.


    I recommend you eschew the array and use std::strings and
    std::getline to parse your file.

    std::string s;
    while (std::getline(in, s))
        std::cout << s << '\n';

    if (!in.eof())
        std::cerr << "Error reading\n";

    Now you don't have to worry if your array is big enough,
    and you can get at individual characters the same way
    as from an array, e.g.

    char c = s[0];
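
    A self-contained sketch of the same loop, with hypothetical names
    (ReadOutcome, read_all_lines, read_all_lines_from) and an in-memory
    stream standing in for the file. It shows that std::getline hands back
    an empty line as an empty string without putting the stream into a
    fail state.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct ReadOutcome {
    std::vector<std::string> lines; // every line, empty ones included
    bool reached_eof;               // false suggests an I/O error stopped the loop
};

ReadOutcome read_all_lines(std::istream& in) {
    assert(in.good());  // the caller is expected to have checked the open
    ReadOutcome out;
    std::string s;
    while (std::getline(in, s))
        out.lines.push_back(s);
    out.reached_eof = in.eof();
    return out;
}

// Demo driver over an in-memory "file".
ReadOutcome read_all_lines_from(const std::string& text) {
    std::istringstream in(text);
    return read_all_lines(in);
}
```

    For "\nkey=value\n\nlast" this yields four lines, the first and third
    of them empty, and reached_eof is true.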

    HTH,
    -Mike
    Mike Wahler, Aug 26, 2004
    #4
  5. tom_usenet (Guest)

    On Thu, 26 Aug 2004 18:44:50 +0200, Jacek Dziedzic wrote:

    >Victor Bazarov:
    >>> the EOL still in the stream, but
    >>> why a failed state?

    >>
    >> There is no other way to tell you that no characters have been read.

    >
    > Why not just store '\0' into the buffer and leave the stream ok?


    Well, you asked it to get characters, and it failed to get any, I
    suppose. It's arguable whether this behaviour is the most useful and
    least surprising or not, of course; I think I agree with you that
    going into a fail state isn't the intuitive behaviour.

    Note that '\0' is stored into the buffer anyway.

    >
    >> If your stream is in good standing before the operation and has its
    >> 'failbit' set after the operation, no characters have been stored. If
    >> the file (stream) has somehow lost integrity, 'badbit' is set. That's
    >> how you distinguish.

    >
    > Oh I see. So if someone e.g. pulls the floppy containing the file
    >out of the drive, then in.bad() will be true? Yes, that sounds
    >reasonable!


    That's the intent of badbit, I think. But how a removed floppy is
    reported is implementation-defined: on some implementations you may
    just get eof.

    Tom
    tom_usenet, Aug 26, 2004
    #5
  6. Mike Wahler wrote:

    >>
    >>int main() {
    >> ifstream in("test.txt");

    >
    > You need to check here whether the file was opened
    > successfully or not, and not proceed if it wasn't.


    Obviously, but not in the famous "shortest possible program
    displaying the behaviour", right?

    > Also note that 'get()' is an unformatted input function.
    > Using it with a text-mode stream (the default) could have
    > unexpected results.


    What unexpected results, could you clarify?

    > If you want to read unformatted,
    > open with 'std::ios::binary'.


    Nope, I want to read formatted. Binary mode is out because
    I don't want to mind CR/LF differences. Plus I'm reading
    trivial configuration text files.

    > But I think you're probably
    > just using the wrong function. See below.
    >


    So what's wrong with get()?

    >
    > It's not surprising when you read the specification of 'std::istream::get()'
    >


    Yes, but I don't have a copy of the specification.

    > No such thing as 'successful read of zero characters.' If characters
    > were requested and none were extracted, that's a 'failure'.


    I see that now, but it's not obvious a priori. Some read functions
    don't complain about reading zero bytes, and some seek functions
    don't complain about seeking zero bytes. I mean it's obvious to
    someone who KNOWS already, but for me it was counter-intuitive.

    > ============== begin quote ===========================
    > ISO/IEC 14882:1998(E)
    >
    > 27.6.1.3 Unformatted input functions
    >
    > basic_istream<charT,traits>& get(char_type* s, streamsize n,
    > char_type delim );
    >
    > 7 Effects: Extracts characters and stores them into successive locations
    > of an array whose first element is designated by s. (286) Characters
    > are extracted and stored until any of the following occurs:
    >
    > -- n - 1 characters are stored;
    >
    > -- end-of-file occurs on the input sequence (in which case the function
    > calls setstate(eofbit));
    >
    > -- c == delim for the next available input character c (in which case c
    > is not extracted).
    >
    > 8 If the function stores no characters, it calls setstate(failbit) (which
    > may throw ios_base::failure (27.4.4.3)). In any case, it then stores a
    > null character into the next successive location of the array.
    >
    > 9 Returns: *this.
    >
    > basic_istream<charT,traits>& get(char_type* s, streamsize n)
    >
    > 10 Effects: Calls get(s,n,widen('\n'))
    >
    > 11 Returns: Value returned by the call.
    > ============== end quote ===========================
    >


    Yes, that was helpful.

    > I recommend you eschew the array and use std::strings and
    > std::getline to parse your file.
    >
    > std::string s;
    > while(std::getline(in, s))
    > cout << s << '\n';
    >
    > if(!in.eof())
    > cerr << "Error reading\n";
    >
    > Now you don't have to worry if your array is big enough,
    > and you can get at individual characters the same way
    > as from an array, e.g.
    >
    > char c = s[0];


    You overdid that one a bit, Mike :). I am in the middle of
    coding a 'better_ifstream' class which inherits from ifstream
    and supplies facilities like get_string(), get_word(),
    parse_phrase(), etc.
    std::strings are useless in my problem, since I need
    byte-copiable POD types to be transferred across different
    processors in a parallel system. But you couldn't have known that,
    since the rule is to post the shortest code suffering from the
    questioned behaviour, not the *neatest* shortest code, right? :)
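
    For fixed-size buffers like these, one possible alternative (not
    something proposed in this thread) is the member function
    istream::getline(char*, streamsize). Unlike get(), it extracts and
    discards the '\n', so an empty line counts as one character extracted
    and does not set failbit. A rough sketch, with hypothetical names
    LineRecord, read_fixed and count_records:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical fixed-size record; a plain char array keeps it byte-copyable.
struct LineRecord {
    char text[40];
};

// Reads one line into buf (capacity n) with the member getline, which extracts
// and discards the '\n'. Returns false at end of input, on an I/O error, or
// when the line does not fit into n - 1 characters.
bool read_fixed(std::istream& in, char* buf, std::streamsize n) {
    assert(buf != nullptr && n > 1);
    return !in.getline(buf, n).fail();
}

// Demo driver: counts the records, empty lines included, in an in-memory "file".
int count_records(const std::string& text) {
    std::istringstream in(text);
    LineRecord rec;
    int count = 0;
    while (read_fixed(in, rec.text, sizeof rec.text)) ++count;
    return count;
}
```

    count_records("\nabc\n") returns 2: the empty first line is read
    successfully as an empty string. The trade-off is that a line too long
    for the buffer also sets failbit, so the caller still has to look at
    eof() and bad() to tell the cases apart.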

    > HTH,
    > -Mike


    yes, the quote did,
    - J.
    Jacek Dziedzic, Aug 27, 2004
    #6