ifstream::get() surprise

Jacek Dziedzic

Hi!

Consider the following program

#include <fstream>
#include <iostream>
using namespace std;

int main() {
    ifstream in("test.txt");
    char buf[40];
    in.get(buf,40);
    cerr << "Read: *" << buf << "*, trouble: " << !in << endl;
}

and a file, test.txt, starting with an empty line, i.e. a lone EOL
character on the first line.

I was quite surprised to find that, under these circumstances,
the aforementioned program produced
"Read: **, trouble: 1".

Why does 'in' go to a fail state? I thought 'get' reads up to
the terminator, stores all characters into 'buf' and leaves the
terminator inside the stream. That would mean 'buf' containing
just a \0 char (no chars read), the EOL still in the stream, but
why a failed state? There are more lines in the file, so we're
not eof(), and my understanding of this situation is a
"successful read of zero characters" rather than "read error".

How then can I distinguish a successful read of an empty
line from an I/O error while reading a line? Of course I have
no a priori knowledge of whether these empty lines exist in my
parsed file or not.

TIA,
- J.
 
Victor Bazarov

Jacek said:
Consider the following program

#include <fstream>
#include <iostream>
using namespace std;

int main() {
    ifstream in("test.txt");
    char buf[40];
    in.get(buf,40);
    cerr << "Read: *" << buf << "*, trouble: " << !in << endl;
}

and a file, test.txt, starting with an empty line, i.e. a lone EOL
character on the first line.

I was quite surprised to find that, under these circumstances,
the aforementioned program produced
"Read: **, trouble: 1".

Why does 'in' go to a fail state?

Yes. It does that if it reads no characters.

I thought 'get' reads up to
the terminator, stores all characters into 'buf' and leaves the
terminator inside the stream. That would mean 'buf' containing
just a \0 char (no chars read), the EOL still in the stream, but
why a failed state?

There is no other way to tell you that no characters have been read.

There are more lines in the file, so we're
not eof(), and my understanding of this situation is a
"successful read of zero characters" rather than "read error".

Right.

How then can I distinguish a successful read of an empty
line from an I/O error while reading a line? Of course I have
no a priori knowledge of whether these empty lines exist in my
parsed file or not.

If your stream is in good standing before the operation and has its
'failbit' set after the operation, no characters have been stored. If
the file (stream) has somehow lost integrity, 'badbit' is set. That's
how you distinguish.
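
The check described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the names LineResult, read_line and classify_all are made up, the stream is assumed to be good() before each call, and lines are assumed to fit into the buffer.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical names, introduced only for this sketch.
enum class LineResult { Text, Empty, EndOfInput, IoError };

// Reads one line into buf (capacity n) with get() and classifies the outcome.
// Assumes the stream was good() on entry and that lines fit into n - 1 chars.
LineResult read_line(std::istream& in, char* buf, std::streamsize n) {
    assert(n > 1);
    in.get(buf, n);                        // stops before '\n' and leaves it in the stream
    if (in.bad()) return LineResult::IoError;
    if (in.gcount() == 0) {                // nothing stored, so failbit is set
        if (in.eof()) return LineResult::EndOfInput;
        in.clear();                        // failbit alone: the line was empty
        in.ignore();                       // consume the '\n' that get() left behind
        return LineResult::Empty;
    }
    if (in.good() && in.peek() == '\n')
        in.ignore();                       // consume the line terminator
    return LineResult::Text;
}

// Demonstration helper: one letter per line, 'E' for empty, 'T' for text.
std::string classify_all(const std::string& text) {
    std::istringstream in(text);
    char buf[40];
    std::string out;
    for (;;) {
        LineResult r = read_line(in, buf, sizeof buf);
        if (r == LineResult::EndOfInput) break;
        if (r == LineResult::IoError) { out += '!'; break; }
        out += (r == LineResult::Empty) ? 'E' : 'T';
    }
    return out;
}
```

With this sketch, classify_all("\nabc\n\nxyz") should yield "ETET": the two empty lines are reported as Empty rather than as errors, because only failbit (not badbit or eofbit) was set for them.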

Victor
 
Jacek Dziedzic

Victor Bazarov:
There is no other way to tell you that no characters have been read.

Why not just store '\0' into the buffer and leave the stream ok?

If your stream is in good standing before the operation and has its
'failbit' set after the operation, no characters have been stored. If
the file (stream) has somehow lost integrity, 'badbit' is set. That's
how you distinguish.

Oh I see. So if someone e.g. pulls the floppy containing the file
out of the drive, then in.bad() will be true? Yes, that sounds
reasonable!

Thanks for the quick reply,
- J.
 
Mike Wahler

Jacek Dziedzic said:
Hi!

Consider the following program

#include <fstream>
#include <iostream>
using namespace std;

int main() {
    ifstream in("test.txt");

You need to check here whether the file was opened
successfully or not, and not proceed if it wasn't.

Also note that 'get()' is an unformatted input function.
Using it with a text-mode stream (the default) could have
unexpected results. If you want to read unformatted,
open with 'std::ios::binary'. But I think you're probably
just using the wrong function. See below.

    char buf[40];
    in.get(buf,40);
    cerr << "Read: *" << buf << "*, trouble: " << !in << endl;
}

and a file, test.txt, starting with an empty line, i.e. a lone EOL
character on the first line.

I was quite surprised to find that, under these circumstances,
the aforementioned program produced
"Read: **, trouble: 1".

It's not surprising when you read the specification of 'std::istream::get()'.

Why does 'in' go to a fail state?

By design.

I thought 'get' reads up to
the terminator, stores all characters into 'buf' and leaves the
terminator inside the stream. That would mean 'buf' containing
just a \0 char (no chars read), the EOL still in the stream, but
why a failed state? There are more lines in the file, so we're
not eof(), and my understanding of this situation is a
"successful read of zero characters" rather than "read error".

No such thing as 'successful read of zero characters.' If characters
were requested and none were extracted, that's a 'failure'.


============== begin quote ===========================
ISO/IEC 14882:1998(E)

27.6.1.3 Unformatted input functions

basic_istream<charT,traits>& get(char_type* s, streamsize n,
                                 char_type delim);

7 Effects: Extracts characters and stores them into successive locations
of an array whose first element is designated by s. (286) Characters
are extracted and stored until any of the following occurs:

-- n - 1 characters are stored;

-- end-of-file occurs on the input sequence (in which case the function
calls setstate(eofbit));

-- c == delim for the next available input character c (in which case c
is not extracted).

8 If the function stores no characters, it calls setstate(failbit) (which
may throw ios_base::failure (27.4.4.3)). In any case, it then stores a
null character into the next successive location of the array.

9 Returns: *this.

basic_istream<charT,traits>& get(char_type* s, streamsize n)

10 Effects: Calls get(s,n,widen('\n'))

11 Returns: Value returned by the call.
============== end quote ===========================
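
To see paragraph 8 in action, here is a small sketch that runs get() on an in-memory stream and records what happened. The GetOutcome struct and try_get function are hypothetical names used only for this illustration, and an istringstream stands in for the file.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record of the stream state after one get(buf, n) call.
struct GetOutcome {
    bool fail;
    bool bad;
    bool eof;
    std::streamsize count;  // gcount() after the call
    char first;             // buf[0] after the call
    int next;               // next character still waiting in the stream
};

GetOutcome try_get(const char* text) {
    assert(text != nullptr);
    std::istringstream in(text);
    char buf[40] = "unchanged";
    in.get(buf, sizeof buf);               // delimiter defaults to '\n'
    GetOutcome o{in.fail(), in.bad(), in.eof(), in.gcount(), buf[0], 0};
    in.clear();                            // reset failbit so peek() can look ahead
    o.next = in.peek();
    return o;
}
```

For try_get("\nmore") this should report fail set, bad and eof clear, a count of 0, a null character in buf[0], and '\n' still waiting in the stream, which matches the behaviour the quoted paragraphs describe.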

How then can I distinguish a successful read of an empty
line from an I/O error while reading a line? Of course I have
no a priori knowledge of whether these empty lines exist in my
parsed file or not.

I recommend you eschew the array and use std::strings and
std::getline to parse your file.

std::string s;
while(std::getline(in, s))
    cout << s << '\n';

if(!in.eof())
    cerr << "Error reading\n";

Now you don't have to worry if your array is big enough,
and you can get at individual characters the same way
as from an array, e.g.

char c = s[0];
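
A slightly fuller sketch of the std::getline approach, showing that empty lines come through as empty strings without putting the stream into a fail state. The Lines struct and read_all_lines function are made-up names for this example, and the error check simply mirrors the !in.eof() test above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct Lines {
    std::vector<std::string> lines;
    bool io_error;
};

Lines read_all_lines(std::istream& in) {
    Lines result{{}, false};
    std::string s;
    while (std::getline(in, s))      // an empty line yields s == "" and the loop goes on
        result.lines.push_back(s);
    assert(in.fail());               // the loop only ends once an extraction has failed
    result.io_error = !in.eof();     // stopping anywhere but end-of-file is treated as an error
    return result;
}
```

Fed the text "\nabc\n\nxyz\n" through a std::istringstream, this should return four lines, the first and third of them empty, with io_error left false.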

HTH,
-Mike
 
tom_usenet

Victor Bazarov:

Why not just store '\0' into the buffer and leave the stream ok?

Well, you asked it to get characters, and it failed to get any, I
suppose. It's arguable whether this behaviour is the most useful and
least surprising or not, of course; I think I agree with you that
going into a fail state isn't the intuitive behaviour.

'\0' is stored into the buffer though.

Oh I see. So if someone e.g. pulls the floppy containing the file
out of the drive, then in.bad() will be true? Yes, that sounds
reasonable!

That's the intent of badbit I think. But floppy disks are
implementation defined - it's possible you just get eof on some
implementations.

Tom
 
Jacek Dziedzic

Mike said:
You need to check here whether the file was opened
successfully or not, and not proceed if it wasn't.

Obviously, but not in the famous "shortest possible program
displaying the behaviour", right?

Also note that 'get()' is an unformatted input function.
Using it with a text-mode stream (the default) could have
unexpected results.

What unexpected results? Could you clarify?

If you want to read unformatted,
open with 'std::ios::binary'.

Nope, I want to read formatted. Binary mode is out because
I don't want to deal with CR/LF differences. Plus I'm reading
trivial configuration text files.

But I think you're probably
just using the wrong function. See below.

So what's wrong with get()?

It's not surprising when you read the specification of 'std::istream::get()'

Yes, but I don't have the specification.

No such thing as 'successful read of zero characters.' If characters
were requested and none were extracted, that's a 'failure'.

I see that now, but it's not obvious a priori. Some read functions
don't complain about reading zero bytes, and seek functions don't
complain about seeking zero bytes. I mean it's obvious for someone
who KNOWS already, but for me it was counter-intuitive.

============== begin quote ===========================
ISO/IEC 14882:1998(E)

27.6.1.3 Unformatted input functions

basic_istream<charT,traits>& get(char_type* s, streamsize n,
                                 char_type delim);

7 Effects: Extracts characters and stores them into successive locations
of an array whose first element is designated by s. (286) Characters
are extracted and stored until any of the following occurs:

-- n - 1 characters are stored;

-- end-of-file occurs on the input sequence (in which case the function
calls setstate(eofbit));

-- c == delim for the next available input character c (in which case c
is not extracted).

8 If the function stores no characters, it calls setstate(failbit) (which
may throw ios_base::failure (27.4.4.3)). In any case, it then stores a
null character into the next successive location of the array.

9 Returns: *this.

basic_istream<charT,traits>& get(char_type* s, streamsize n)

10 Effects: Calls get(s,n,widen('\n'))

11 Returns: Value returned by the call.
============== end quote ===========================

Yes, that was helpful.

I recommend you eschew the array and use std::strings and
std::getline to parse your file.

std::string s;
while(std::getline(in, s))
    cout << s << '\n';

if(!in.eof())
    cerr << "Error reading\n";

Now you don't have to worry if your array is big enough,
and you can get at individual characters the same way
as from an array, e.g.

char c = s[0];

You overdid that one a bit, Mike :). I am in the middle of
coding a 'better_ifstream' class which inherits from ifstream
and supplies facilities like get_string(), get_word(),
parse_phrase(), etc.
std::strings are useless in my problem, since I need
byte-copiable POD types to be transferred across different
processors in a parallel system. But you couldn't have known that,
since the rule is to post the shortest code suffering from the
questioned behaviour, not the *neatest* shortest code, right? :)

HTH,
-Mike

Yes, the quote did,
- J.
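
For the fixed-size, byte-copyable buffers mentioned above, one option that was not raised in the thread is the member function istream::getline(char*, streamsize). Unlike get(), it extracts the delimiter, so an empty line counts as a successful read. The sketch below is only an illustration under that assumption; LineRecord, read_record, Counts and count_lines are hypothetical names, and a line longer than the buffer would set failbit and stop the loop.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <type_traits>

// Hypothetical POD record that can be byte-copied between processes.
struct LineRecord {
    char text[40];
};
static_assert(std::is_trivially_copyable<LineRecord>::value,
              "LineRecord must stay byte-copyable");

// Returns true if a line (possibly empty) was read into rec.text.
// Member getline() extracts the '\n', so an empty line does not set failbit.
bool read_record(std::istream& in, LineRecord& rec) {
    in.getline(rec.text, sizeof rec.text);
    return !in.fail();
}

// Hypothetical tally used to demonstrate the helper.
struct Counts {
    int total;
    int empty;
    bool io_error;
};

Counts count_lines(std::istream& in) {
    Counts c{0, 0, false};
    LineRecord rec;
    while (read_record(in, rec)) {
        ++c.total;
        if (rec.text[0] == '\0') ++c.empty;
    }
    assert(in.fail());          // the loop only ends once getline() has failed
    c.io_error = in.bad();      // badbit signals a loss of stream integrity
    return c;
}
```

Run on the text "\nabc\n\nxyz", count_lines should report four lines, two of them empty, and no I/O error.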
 
