end of stream for std::cin

Sanyi

What is cin's streambuffer's state and content after the following
(using namespace std)?

string((istreambuf_iterator<char>(cin)),
       istreambuf_iterator<char>());

I am trying to use this in a loop (to get a 'multiple cat'-like
behavior) and the second time around it returns an empty string
instead of waiting for more input. This makes me think that either the
end-of-stream stays in the buffer or cin or its buffer is in some bad
state. Either way, how can this be fixed to get more input?

Thanks,
S.
 
Christopher

What is wrong with using cin's unformatted input functions (get or
getline), and then checking the state via cin.bad(), cin.fail(),
cin.eof(), and cin.good()?
 
Christopher

I must admit I haven't used these iterators, but comparing your code
with some example code I found in a reference, I notice two things:

--------------
// construct an istreambuf_iterator pointing to
// the ofstream object's underlying streambuffer
std::istreambuf_iterator<char, std::char_traits<char> > iter(out.rdbuf());

// construct an end-of-stream iterator
const std::istreambuf_iterator<char, std::char_traits<char> > end;

std::cout << std::endl;

// output the content of the file
while (!iter.equal(end))
{
    // use both operator++ and operator*
    std::cout << *iter++;
}

std::cout << std::endl;
---------------

1) They construct the begin iterator from the stream's buffer
(out.rdbuf()), whereas you passed the cin stream itself.
2) You are assuming that the iterator created with the default
constructor somehow magically points to the end of cin's stream
buffer, when in fact it knows nothing about cin. How would it? It
seems to be used only for comparison with the first iterator.

The moral of the story is that stream buffer iterators differ from
your everyday iterator.
 
Sanyi

What is wrong with using cin's unformatted input functions (get or
getline), and then checking the state via cin.bad(), cin.fail(),
cin.eof(), and cin.good()?

Performance. See e.g. http://www.ddj.com/cpp/184401357, or Josuttis'
book, on the unnecessary overhead of get. I am not sure about getline,
but it seems unnatural to break the input up into lines and then
reconstruct it in memory. If there is no better solution, I can
certainly do it this way, but if the most efficient solution is also
conceptually simple, then I prefer to use that one.
I must admit I haven't used these iterators, but I can tell from
looking at your code vs some example code I found in a reference:

--------------
// construct an istreambuf_iterator pointing to
// the ofstream object's underlying streambuffer
std::istreambuf_iterator<char, std::char_traits<char> > iter(out.rdbuf());

// construct an end-of-stream iterator
const std::istreambuf_iterator<char, std::char_traits<char> > end;

std::cout << std::endl;

// output the content of the file
while (!iter.equal(end))
{
    // use both operator++ and operator*
    std::cout << *iter++;
}

std::cout << std::endl;
---------------

1) That they construct the begin iterator with the cin's stream buffer
as a parameter where you passed the cin stream itself.

They are equivalent. The istreambuf_iterator constructed with a stream
argument automatically uses the argument's streambuf.
2) You are assuming that the iterator created with the default
constructor somehow magically points to the end of cin's stream
buffer, when in fact it has no clue of anything about cin. How would
it? It seems to only be used for comparison with the first iterator.
Moral of the story is stream buffer iterators differ from your
everyday iterator.

No, I do not assume that. The istreambuf_iterator constructed without
arguments is the end-of-stream iterator, which is generic in the
sense that it is not tied to any particular stream. std::cin's input
iterator will compare unequal to it as long as there are more
characters to read, so my code works on the same principles as yours.
It probably also means that cin's iterator became the end-of-stream
iterator when the ^D (or Ctrl-Z or whatever is the end-of-file on the
given platform) was read.
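
The equivalence claimed here can be sketched roughly as follows. This is only an illustration, not code from the thread: the function names slurp_with_constructor and slurp_with_loop are made up, and a std::istringstream can stand in for cin when trying it out.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>  // only needed for the istringstream stand-in
#include <string>

// Read everything left in in's stream buffer with the range constructor.
// (Hypothetical helper name, for illustration only.)
std::string slurp_with_constructor(std::istream& in) {
    // The extra parentheses around the first argument matter when this
    // form is used to declare a named variable; here they are harmless.
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}

// The same read written out as an explicit loop.
std::string slurp_with_loop(std::istream& in) {
    std::string result;
    std::istreambuf_iterator<char> it(in);     // uses in.rdbuf()
    const std::istreambuf_iterator<char> end;  // end-of-stream iterator
    for (; it != end; ++it) {
        result.push_back(*it);
    }
    assert(it == end);  // the loop only exits once end of stream is seen
    return result;
}
```

Called on two streams with the same content, both functions should return the same string. Neither one helps with reading past end of file; they only differ in how the copy is spelled.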

But that still does not resolve my problem that I am no longer able to
read from cin. I need somehow to tell my program, "I know you read an
EOF, that is perfectly fine, but believe me, cin is still all right
and has more data for you".

S.
 
James Kanze

What is cin's streambuffer's state and content after the following
(using namespace std)?
string((istreambuf_iterator<char>(cin)),
       istreambuf_iterator<char>());
I am trying to use this in a loop (to get a 'multiple cat'-like
behavior) and the second time around it returns an empty string
instead of waiting for more input.

What else do you expect? The first time around, you read until
end of file. What do you expect to read further once you've
reached end of file?
This makes me think that either the end-of-stream stays in the
buffer or cin or its buffer is in some bad state.

Once you've reached end of file, you've reached end of file,
yes. And once a read has failed, the istream remains in failed
state until you clear the error.
Either way, how can this be fixed to get more input?

What more input do you expect? It's easy to clear the failbit:
cin.clear(). But you're still at end of file, and there is
nothing more to read.
 
Christopher

No, I do not assume that. The istreambuf_iterator constructed without
arguments is the end-of-stream iterator, which is generic in the
sense that it is not tied to any particular stream. std::cin's input
iterator will compare unequal to it as long as there are more
characters to read, so my code works on the same principles as yours.

I am not too sure about that. I admit I am wading in unfamiliar
territory. So, I am not saying you are wrong, but I would like some
clarity in this statement for my own reasons.

I would think that they are _only_ equivalent if the internals of the
std::string constructor you are using copy one character at a time
until the begin iterator is equal to the end iterator. However, I
don't know if there is a guarantee on that.

If we have that guarantee then I see your point.

I still wonder, though, why you want to read past the EOF. If you've
reached the EOF, then how is there more to read?
This is a little fuzzy to me because we are using a stream that is not
filled until input is provided (by the user or pipe or what have you).
So, to "reuse" it you want to allow for that new input to be provided
again and stored in the stream. Does that std::string constructor call
result in the user being prompted for input? What happens in the case
where input is piped in?
I suppose more context is needed. If you are using it in a loop, then I
can't imagine anything other than expecting the user to provide input,
as opposed to it working with piped input. In that case I believe you
would want to discard its contents via cin.ignore() and clear the
stream error bits via cin.clear().
 
Sanyi

And once a read has failed, the istream remains in failed
state until you clear the error.

This is not correct. When working with the streambuf directly, the
stream's state flags do not get set. Just to be sure, I checked, and
cin itself is still in a good state after the read.
For the same reason cin.clear() will not work.
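
The observation about the state flags can be checked with a small sketch along these lines. The names SlurpReport and slurp_and_report are invented for the illustration, and a std::istringstream is assumed as a stand-in for cin so that the result does not depend on a terminal.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>  // only needed for the istringstream stand-in
#include <string>

// What the read produced, plus the stream's own flags afterwards.
// (Hypothetical type, for illustration only.)
struct SlurpReport {
    std::string text;
    bool good_after;  // in.good() after the read
    bool eof_after;   // in.eof() after the read
};

SlurpReport slurp_and_report(std::istream& in) {
    assert(in.rdbuf() != nullptr);  // the sketch needs an attached buffer
    SlurpReport r;
    // istreambuf_iterator talks to in.rdbuf() directly, so none of the
    // istream's state bits are touched by this read.
    r.text.assign(std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>());
    r.good_after = in.good();
    r.eof_after = in.eof();
    return r;
}
```

On a string stream, a second call returns an empty string while the stream still reports good(), which matches the behavior described for cin: the end-of-file condition lives in the stream buffer, not in the flags that clear() resets.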
What more input do you expect? It's easy to clear the failbit:
cin.clear(). But you're still at end of file, and there is
nothing more to read.

Huh? We are talking about the console input here. Last time I checked,
when I pressed ^D, my keyboard did not disappear in a cloud of smoke,
neither did it refuse to accept any more input. It is also easy to
imagine a similar need with files, say in a log-file monitoring
program. In one read pass I might reach the end-of-file, but by the
next pass some external process might have added to it. End of file is
not end of life.

S.
 
Sanyi

I am not too sure about that. I admit I am wading in unfamiliar
territory. So, I am not saying you are wrong, but I would like some
clarity in this statement for my own reasons.

I would think that they are _only_ equivalent if the internals of the
std::string constructor you are using:


...
copies one character at a time until the begin iterator is not equal
to the end iterator. However, I don't know if there is a guarantee on
that.

If we have that guarantee then I see your point.

I agree with you, I am assuming that the string constructor works the
way you say, and this might be unsafe. I will try to rewrite into a
loop, but I am not sure it will help. I am also on fairly unfamiliar
territory, and trying to figure out the details to see why it does not
work.
I still wonder though, why you want to read past the EOF. If you've
reached the EOF, then how is there more to read?

Well, the idea is that the user pastes a large chunk of text, presses
end-of-file, the chunk is processed, and then the program goes back to
waiting for more input. As I said, just like *nix's cat, only several
times over. It can probably be designed differently, but that's off
topic here, and I would rather not give up on it just because of my
lack of understanding.
This is a little fuzzy to me because we are using a stream that is not
filled until input is provided (by the user or pipe or what have you).
So, to "reuse" it you want to allow for that new input to be provided
again and stored in the stream. Does that std::string constructor call
result in the user being prompted for input?

This is a good point, and might be at the heart of the problem; first
time yes, second time around, no.
What happens in the case
where input is piped in?

Same thing; the piped file gets read and processed, but in the second
pass the empty string is read automatically.
I suppose more context is needed. If you are using it in a loop then I
can't imagine anything other than expecting the user to provide input
vs it working with piped input. In that case I believe you would want
to discard its contents via cin.ignore() and cin.clear() the stream
error bits.

I am not sure I understand this.

S.
 
James Kanze

This is not correct.

It's what the standard says, and what all conformant
implementations do.
When working with the streambuf directly, the
stream's state flags do not get set.

Obviously, since there is no stream.

So my comment concerning istream isn't relevant to the example.
Sorry about that.
Just to be sure, I checked, and cin itself is still in a good
state after the read. For the same reason cin.clear() will
not work.
Huh? We are talking about the console input here.

Are we? I thought we were talking about cin. Which may or may
not be console input.
Last time I checked, when I pressed ^D, my keyboard did not
disappear in a cloud of smoke, neither did it refuse to accept
any more input.

Last time I checked on a Windows machine, if I pressed ^D, my
program read a ^D.

Under Unix, the convention is that ^D will terminate any read
(system level) in progress, immediately. (That's not actually
true, since you can easily configure this to be any character
you want. But ^D is the default.) Under Unix, the convention
is also that a system level read which returns 0 bytes read is
an end of file. Period. A well written filebuf which reads 0
bytes will set an internal flag, and always return EOF
afterwards. (Note that the conventions under Windows are very
different: a ^Z in a text file IS end of file. Period. In
every text file, not just console input, and regardless of where
it appears, even in the middle of a line.)

The standard is somewhat vague about this, but in practice, if
anything is to work, any calls to streambuf::sgetc() without an
intervening streambuf::sbumpc() or streambuf::snextc() must
return the same value. And this means that the streambuf must
have internal state, and memorize the end of file.

(Pragmatically, I follow the rule "be liberal in what you
accept, conservative in what you require". My streambuf classes
do memorize end of file, even if they are filtering streambuf's,
and get it from another streambuf. And my >> operators make a
point of never calling any of the read functions of streambuf
once I've seen EOF---and set eofbit---just in case the streambuf
doesn't use a flag. Experience with a number of different
implementations of iostream has led me to be very cautious
about such things.)
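
A filtering streambuf that memorizes end of file, in the spirit of the description above, might look roughly like this. It is a minimal sketch, not the poster's actual class: StickyEofBuf, reset_eof() and drain() are hypothetical names, and the explicit reset is an assumption added to show how a program could opt in to "try again".

```cpp
#include <cassert>
#include <iterator>
#include <sstream>  // only needed for a stringbuf stand-in source
#include <streambuf>
#include <string>

// Hypothetical filtering streambuf: forwards characters from src, and once
// src has reported end of file it keeps reporting it until reset_eof().
class StickyEofBuf : public std::streambuf {
public:
    explicit StickyEofBuf(std::streambuf* src) : src_(src), seen_eof_(false) {
        assert(src != nullptr);
    }

    // Assumed extension: forget the remembered end of file so that the
    // next read asks the source again.
    void reset_eof() { seen_eof_ = false; }

protected:
    int_type underflow() override {
        if (seen_eof_) {
            return traits_type::eof();  // never touch the source again
        }
        const int_type c = src_->sbumpc();
        if (traits_type::eq_int_type(c, traits_type::eof())) {
            seen_eof_ = true;  // memorize the end of file
            return c;
        }
        ch_ = traits_type::to_char_type(c);
        setg(&ch_, &ch_, &ch_ + 1);  // one-character get area
        return traits_type::to_int_type(ch_);
    }

private:
    std::streambuf* src_;
    bool seen_eof_;
    char ch_;
};

// Reads from sb until it reports end of file.
std::string drain(std::streambuf& sb) {
    std::string out;
    for (std::istreambuf_iterator<char> it(&sb), end; it != end; ++it) {
        out.push_back(*it);
    }
    return out;
}
```

With a std::stringbuf as the source, draining once, appending more data to the source, and draining again should yield an empty string the second time; only after reset_eof() does the new data show up. Whether the real source (a filebuf attached to a terminal, say) delivers more data after a reset remains implementation dependent, as the post points out.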
It is also easy to imagine a similar need with files, say in a
log-file monitoring program. In one read pass I might reach
the end-of-file, but by the next pass some external process
might have added to it. End of file is not end of life.

No, but it is a definite state for a stream. The C++ I/O model
doesn't take into account the fact that file data might change
while the file is being read.

Of course, a specific implementation might offer additional
functions to reset this sort of state (and clear the
buffer---other processes might write into the middle of a file
as well), forcing the filebuf to retry next time around. But
it's very implementation dependent.

If you're really only concerned with Unix, the obvious solution
is to open "/dev/tty" each time. But such an interface
definitely conflicts with the Unix conventions, where the user
expects a ^D at the start of a line to be treated as an end of
file, with no more input coming from the console.
 
James Kanze

On Feb 22, 7:20 am, Sanyi wrote:

[...]
Moral of the story is stream buffer iterators differ from your
everyday iterator.

istreambuf_iterator is an input iterator. It fulfills all of
the requirements of an input iterator. It's not like a forward
iterator, of course, but that's why there are different
categories.
 
James Kanze

I am not too sure about that. I admit I am wading in
unfamiliar territory. So, I am not saying you are wrong, but I
would like some clarity in this statement for my own reasons.
I would think that they are _only_ equivalent if the internals of the
std::string constructor you are using:
...
copies one character at a time until the begin iterator is not
equal to the end iterator. However, I don't know if there is a
guarantee on that.

And what else should it do? The standard is very clear about
this. I don't see how you could have any doubts.
If we have that guarantee then I see your point.

[...]
I still wonder though, why you want to read past the EOF. If
you've reached the EOF, then how is there more to read?

There isn't. By definition. His problem (I think) is that
under Unix, there is no real EOF from a tty. There is only a
convention: a low level system read is terminated by either a
newline or a ^D. (What is considered a newline, and for that
matter, what character is used for the ^D, is configurable.)
The '\n' is transmitted as part of the buffer read; the ^D is
not (which means that if the ^D is the first character in a
line, read() will return 0 characters read). Since Unix doesn't
have any other means of signaling end of file, reading 0 bytes
is considered by convention end of file. With the result that a
user entering text at a keyboard expects ^D to be treated as end
of file.
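
The "zero bytes read means end of file" convention can be sketched at the POSIX level like this. The sketch is POSIX-only and not portable C++; read_until_zero is a made-up name, and treating any error other than EINTR as a reason to stop is a simplification.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>  // POSIX: read()

// Collects bytes from fd until read() returns 0, which Unix treats as end
// of file by convention. (Hypothetical helper, for illustration only.)
std::string read_until_zero(int fd) {
    assert(fd >= 0);
    std::string data;
    char buf[4096];
    for (;;) {
        const ssize_t n = ::read(fd, buf, sizeof buf);
        if (n > 0) {
            data.append(buf, static_cast<std::size_t>(n));
        } else if (n == 0) {
            break;  // zero bytes read: "end of file" by convention
        } else if (errno != EINTR) {
            break;  // real error: give up in this sketch
        }
        // n < 0 with EINTR: interrupted by a signal, so retry
    }
    return data;
}
```

On a pipe, the zero-byte read only arrives once every write end has been closed. On a terminal, going by the explanation above, it is merely what a ^D at the start of a line produces, which is why code working at this level could choose to call the function again on the same descriptor.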
This is a little fuzzy to me because we are using a stream
that is not filled until input is provided (by the user or
pipe or what have you). So, to "reuse" it you want to allow
for that new input to be provided again and stored in the
stream. Does that std::string constructor call result in the
user being prompted for input?

Obviously not. He has to take care of that upstream in his
program.
What happens in the case where input is piped in?

His program won't work. Another Unix convention is that you
only get end of file on a pipe if there are no possible future
writes.

I think he's trying to play games at the wrong level. First, I
think he's violating Unix conventions, and will surprise his
users. (I hope, at least, that he's verified with isatty() that
standard in is a terminal.) But if you know what you are doing, you
can play such games at the Posix API level---the behavior at that
level is very well defined (even if what he wants to do doesn't
correspond to the usual conventions in the milieu). It should
be very simple for him to write a streambuf which will do
exactly what he wants.
I suppose more context is needed. If you are using it in a
loop then I can't imagine anything other than expecting the
user to provide input vs it working with piped input.

Exactly. And of course, one of the most omnipresent Unix
conventions is that any program which reads from standard in can
receive its input from a pipe or a file. It's really one of
the most fundamental conventions in the Unix world.
In that case I believe you would want to discard its contents
via cin.ignore() and cin.clear() the stream error bits.

As he pointed out in his response to me, he isn't actually using
cin. Just the streambuf which is attached to it. And his only
real problem is that the implementation of that streambuf
(actually a filebuf) enforces the usual Unix conventions.
 
Pavel

Sanyi wrote:
...
Huh? We are talking about the console input here. Last time I checked,
when I pressed ^D, my keyboard did not disappear in a cloud of smoke,
neither did it refuse to accept any more input. It is also easy to
imagine a similar need with files, say in a log-file monitoring
program. In one read pass I might reach the end-of-file, but by the
next pass some external process might have added to it. End of file is
not end of life.
The continuous log file "tailing" problem of the kind you describe can
be solved in C++, because the file is not closed. You can use
streambuf::in_avail() to implement your own tailing in C++ and you won't
even need to mess with the flags.

In your original problem, the behavior of the "underlying sequence" of a
stream buffer on which eof occurred is not defined by the Standard, and I
believe it cannot be defined usefully, because the sequence, rather than
your C++ program, is the active party here (that is, the agent that
creates or at least first detects the eof condition).
In your case, when the user presses Ctrl-D, s/he instructs the terminal
(which is not a part of your program) to stop sending characters to your
process for good -- and I am not aware of a standard way for a program
to re-connect to any terminal again after this happened.

What I believe you are trying to achieve is to implement some protocol
to allow the user to instruct the program to process the piece of data
s/he already entered and then wait for another piece. Ctrl-D will not
work for this and maybe it is not at all bad because, if it did, you
would have another problem -- how to notify the program that the user is
done-done. You need to introduce your own protocol.

This problem is not new either, so there used to be some "standards" in
UNIX (and not only) for this part of console user interface and some of
your older users may be comfortable if you use these standards. For
example I vaguely remember these (it was so long ago that I do not
remember what programs did it, one with dot was probably some primitive
mail client):

1. Enter a text line with a single dot character (these days, ClearCase
command-line tool does it, too)
2. Enter 2 empty lines in a row

Then, to actually "disconnect" from your program, the user would have
to enter Ctrl-D, or maybe 2 dot-only lines in a row, or whatever you
decide.
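
The first convention (a line holding a single dot ends the chunk, real end of file ends the session) could be sketched as below. read_chunk is a hypothetical name, and keeping a trailing newline after every line of the chunk is just one possible choice.

```cpp
#include <cassert>
#include <istream>
#include <sstream>  // only needed for the istringstream stand-in
#include <string>

// Reads lines from in into chunk until a line consisting of a single "."
// is seen. Returns true if such a terminator line was found, false if the
// input ended first (chunk then holds any unterminated trailing text).
// (Hypothetical helper, for illustration only.)
bool read_chunk(std::istream& in, std::string& chunk) {
    chunk.clear();
    std::string line;
    while (std::getline(in, line)) {
        if (line == ".") {
            return true;  // terminator line: chunk is complete
        }
        chunk += line;
        chunk += '\n';
    }
    assert(in.fail());  // getline stopped because the input is exhausted
    return false;
}
```

A driver would then call read_chunk(std::cin, chunk) in a loop, process each chunk, and leave the loop when it returns false, so that Ctrl-D keeps its usual meaning of "no more input" and the program still works when its input comes from a pipe or a file.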

Hope this will help
-Pavel
 
James Kanze

Sanyi wrote:
..
The continuous log file "tailing" problem of the kind you
describe can be solved in C++, because the file is not closed.
You can use streambuf::in_avail() to implement your own
tailing in C++ and you won't even need to mess with the flags.

Have you tried it? Or read the documentation of in_avail(), to
know what it really does? It's not guaranteed to give you a
useful answer at all, and I've seen more than one system where
it doesn't.

[...]
1. Enter a text line with a single dot character (these days,
ClearCase command-line tool does it, too)

That's the traditional Unix convention.

Obviously, if the text you're entering might reasonably contain
a line with a single dot, then it won't work. But unless that's
the case, that's definitely what you should be doing under Unix.
2. Enter 2 empty lines in a row

Also a good alternative (IMHO, at least).
Then, to actually "disconnect" of your program user would
have to enter Ctrl-D or maybe enter 2 dot-only lines in a row
or whatever you decide.

Well, I don't think you really understand in_avail, but you do
show a good deal of good sense with regards to the application
domain. (Which is more important. When in_avail fails you, you
doubtlessly know where to read about what it does exactly.)
 
Sanyi

Thanks to all for the enlightening replies, and sorry if I made some
rash statements.
In your original problem, the behavior of the "underlying sequence" of a
stream buffer on which eof occurred is not defined by the Standard and I
believe it cannot be defined usefully, because it is more as the
sequence is the active piece here (that is, the agent that creates or at
least originally detects eof condition), rather than your C++ program.
In your case, when the user presses Ctrl-D, s/he instructs the terminal
(which is not a part of your program) to stop sending characters to your
process for good -- and I am not aware of a standard way for a program
to re-connect to any terminal again after this happened.

That pretty much clears everything up. I had the wrong idea that the
end-of-file is just another character in the input stream that can be
ignored by my program if it wishes. (The reason I chose it as the
terminator was that in a first version my chunk of text was indeed
piped in, and only later did I want to extend it to an interactive
multi-chunk version. It also seemed to me an easy way to abstract
away some differences between Linux and Windows/DOS.)
I see now that I had better use some other terminating sequence, as
recommended.

As for respecting Unix conventions, I will remember next time. Given
the very limited user base, in this case it should not have been a
problem.

Thanks,
S.
 
