Stream states questions

Discussion in 'C++' started by john, Sep 8, 2007.

  1. john

    john Guest

    I am reading TC++PL3 and in "21.3.3 Stream State", 4 member functions
    returning bool are mentioned:


    template <class Ch, class Tr= char_traits<Ch> >
    class basic_ios: public ios_base {
    public:
    // ...
    bool good() const; // next operation might succeed
    bool eof() const; // end of input seen
    bool fail() const; // next operation will fail
    bool bad() const; // stream is corrupted

    [...]
    };



    It is also mentioned:

    "If the state is good() the previous input operation succeeded. If the
    state is good(), the next input operation might succeed; otherwise, it
    will fail.

    ==> Applying an input operation to a stream that is not in the good()
    state is a null operation as far as the variable being read into is
    concerned."

    Q1: What does the above mean?


    ==> "If we try to read into a variable v and the operation fails,"

    Q2: Does this mean that fail() becomes true after this?


    "the value of v should be unchanged (it is unchanged if v is a variable
    of one of the types handled by istream or ostream member functions). The
    difference between the states fail() and bad() is subtle. When the state
    is fail() but not also bad(), it is assumed that the stream is
    uncorrupted and that no characters have been lost. When the state is
    bad(), all bets are off".


    Q3: When !good() is true, is fail() always true?
     
    john, Sep 8, 2007
    #1

  2. * john:
    It means that in an ungood state, stream input and output operations do
    nothing at all (except possibly throwing exceptions, if you have turned
    that on).


    No.

    It helps to consider that fail() checks one bit in a set of possible
    problem flags: badbit, eofbit and failbit. good() is not a separate
    bit, and it's not the inverted fail bit: good() says that /all/ the
    three problem bits are zero. Alles in ordnung.

    But note that the only way eof() is set automatically, is by failing to
    read beyond end of file, which tends to also set the fail bit.

    Also note that operator void* (used for conversion to logical boolean)
    and operator! (ditto) just check the failbit.

    In other words, let s be a stream object, then if(s){...} is not the
    same as if(s.good()){...}, it's the same as if(!s.fail()){...}. It's
    all very perplexing. But, remember, you can just say NO to iostreams.
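
    A minimal sketch of the difference between if(s) and if(s.good()),
    assuming C++11 or later and using std::istringstream in place of a file.
    The struct StateAfterRead and both function names are made up for this
    illustration.

```cpp
#include <cassert>
#include <sstream>

// Snapshot of the stream state after one extraction (illustrative type).
struct StateAfterRead {
    bool good, eof, fail, bad, as_bool;
};

// Reads a single int from `text` and reports the state the stream is left in.
StateAfterRead read_one_int(const char* text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return {in.good(), in.eof(), in.fail(), in.bad(), static_cast<bool>(in)};
}

void demo_good_vs_bool() {
    // "42": the read succeeds, but the extractor hit end of input while
    // looking for more digits, so eofbit is set and good() is false.
    StateAfterRead s = read_one_int("42");
    assert(!s.good && s.eof && !s.fail && s.as_bool);

    // "42 ": the trailing space stops the extraction before end of input.
    s = read_one_int("42 ");
    assert(s.good && !s.eof && !s.fail && s.as_bool);

    // "x": no digits at all, so the extraction fails and sets failbit.
    s = read_one_int("x");
    assert(!s.good && s.fail && !s.bad && !s.as_bool);
}
```

    In the first case if(s) is taken while if(s.good()) is not, even though
    the value was read correctly.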

    Cheers, & hth.,

    - Alf
     
    Alf P. Steinbach, Sep 8, 2007
    #2

  3. It means that if you try to read from a file into a variable 'value' and
    good() == false the data in 'value' will not be changed.
    Yes, there are three flags associated with the state of the stream, eof,
    fail, and bad. If none of those are set then good() == true.
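
    The "null operation" can be sketched as follows, assuming a string
    stream whose failbit is set by hand to stand in for an earlier failure.
    The function names are hypothetical.

```cpp
#include <cassert>
#include <sstream>

// Attempts a read on a stream that is already in a failed state.
int read_on_failed_stream() {
    std::istringstream in("7");
    in.setstate(std::ios::failbit);  // simulate an earlier failed operation
    int value = 5;
    in >> value;                     // no extraction is attempted
    return value;                    // still 5
}

// Same as above, but clears the error state before trying again.
int read_after_clear() {
    std::istringstream in("7");
    in.setstate(std::ios::failbit);
    int value = 5;
    in >> value;                     // ignored, as above
    in.clear();                      // reset the state to good
    in >> value;                     // now the 7 is actually read
    return value;
}
```

    Note that this covers a read on a stream that was already not good().
    A read that itself fails on a good stream is a different case: since
    C++11 a failed numeric extraction stores 0 in the variable.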
     
    Erik Wikström, Sep 8, 2007
    #3
  4. Correction, the above sentence should be "No, there are three flags...".
     
    Erik Wikström, Sep 8, 2007
    #4
  5. john

    john Guest

    So, do you suggest going back to the C subset fopen(), etc?
     
    john, Sep 8, 2007
    #5
  6. john

    john Guest

    I think that an implementation should understand when the end of file is
    encountered. Also I am still learning C++, but doesn't it make sense
    that the next read will fail after end of file is encountered?

    Also, if a stream's bad() is true, doesn't that imply that fail() should
    be true in real world?


    Also, what about using "if (s.good())" and only examining eof(), fail(),
    and bad() once that test becomes false, if we want to know why?

    The C subset functions return NULL in case of errors, we can check
    "s.good()" alone too.
     
    john, Sep 8, 2007
    #6
  7. Have you found one real advantage of iostreams over the 'C subset'?
     
    Roland Pibinger, Sep 8, 2007
    #7
  8. john

    john Guest


    I am still reading TC++PL3, but so far I liked istreams and ostreams.
    For example, the simplicity of getting whitespace separated strings:


    string s;

    do
    {
    cin>>s;

    // ...
    }while(cin);


    or even this:

    char v[4];

    cin.width(4);

    cin>> v;

    (it reads 3 characters max and adds 0 in the end).


    Isn't this more high level and elegant than the C subset I/O facilities?
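
    One caveat with the do/while form above: its body runs once more after
    the read that fails. A common alternative is to test the extraction
    itself. This is only a sketch; read_words is a hypothetical helper and
    the std::istream& parameter is an assumption so that it works with cin
    or a string stream alike.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects whitespace-separated words until an extraction fails.
std::vector<std::string> read_words(std::istream& in) {
    std::vector<std::string> words;
    std::string s;
    // The loop body only runs when `in >> s` actually produced a word.
    while (in >> s) {
        words.push_back(s);
    }
    return words;
}
```

    Calling read_words(std::cin) reads until end of input.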
     
    john, Sep 8, 2007
    #8
  9. Jerry Coffin

    Jerry Coffin Guest

    scanf("%s", s);

    does essentially the same thing, from the viewpoint of the stream. The
    one real difference is attributable to the string -- that it resizes
    itself as needed to accommodate the data being read.
    scanf("%3s", s);

    or:

    fgets(buffer, 4, stdin);

    Don't get me wrong: I'm not arguing that iostreams lack advantages --
    only that the things you've cited don't (directly) show much advantage
    for them.
     
    Jerry Coffin, Sep 8, 2007
    #9
  10. john

    john Guest


    Yes, but this leaves room for overflow, while C++ new I/O facilities
    together with the other high level facilities are more elegant, safe and
    convenient (as far as I have read until now).

    Yes, which is nice, safe and convenient. If string gets more than it can
    accommodate it throws a length_error exception. No way for overflow here.


    Well, the cin way looks more high level and convenient to me.
     
    john, Sep 8, 2007
    #10
  11. * john:
    Sense and sense... There have been long discussions here and in the
    moderated group about when exactly eofbit and failbit "should" be set
    according to the standard, and when they're actually set with given
    implementations. That aside, if the streams are set up to generate
    exceptions on failure, then you're not guaranteed to avoid exception on
    detecting end of file, which makes that feature rather useless.
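
    A small sketch of that effect, assuming failbit is the only bit in the
    exception mask and a string stream stands in for the file. The function
    name is made up.

```cpp
#include <cassert>
#include <sstream>

// Returns true if trying to read past the end of a one-number input throws.
bool second_read_throws() {
    std::istringstream in("1");
    in.exceptions(std::ios::failbit);  // throw whenever failbit gets set
    int a = 0;
    in >> a;  // succeeds; only eofbit is set, which is not in the mask
    try {
        in >> a;  // nothing left to read: failbit is set, so this throws
    } catch (const std::ios_base::failure&) {
        return true;
    }
    return false;
}
```

    So with this setup, simply reading until the input runs out ends in an
    exception rather than in a plain failed read.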

    I'd think so, but fail() only checks the failbit. If failbit were
    equivalent to eofbit||badbit, then failbit would be redundant,
    meaningless. So there must be situations where either eofbit doesn't
    imply failbit, or badbit doesn't imply failbit. My feelings about
    iostreams are such that I have not investigated what those situations are.


    The convention is to use "!s.fail()", expressed as "s".

    The C functions are simple and more efficient but not type safe.
    However, the iostreams' formatted input is defined in terms of C fscanf,
    and inherits the Undefined Behavior of fscanf. For example, when
    inputting hex -- yes, iostreams have built-in Undefined Behavior!

    The problem is that in the C and C++ standards there are no generally
    good i/o facilities (I think Posix defines the old Unix open etc. as a
    generally clean and good but very very low-level i/o facility).

    For very simple tasks the iostreams are good because they're relatively
    safe, thus, well suited to typical beginner's experimental programs, but
    bring in e.g. input of hex or numbers, or text handling like
    uppercasing, or anything not very simple, and the complexity,
    inefficiency, verbosity and unsafety meet you head on.


    Cheers, and sorry I don't have any more positive advice,

    - Alf
     
    Alf P. Steinbach, Sep 9, 2007
    #11
  12. john

    john Guest

    I think istream can be defined to input hex and other bases (there is a
    flag basefield that is mentioned in later pages from where I am now). In
    TC++PL3, in what I have read it is mentioned:

    "The format expected for input is specified by the current locale
    (21.7). By default, the bool values true and false are represented by 1
    and 0, respectively. Integers must be decimal and floating-point numbers
    of the form used to write them in a C++ program. By setting basefield
    (21.4.2), it is possible to read 0123 as an octal number with the
    decimal value 83 and 0xff as a hexadecimal number with the decimal value
    255. The format used to read pointers is completely
    implementation-dependent (have a look to see what your implementation
    does)".

    I think you gotta have a look at TC++PL3. Uppercasing is easy by using
    toupper().
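
    The basefield behaviour quoted above can be tried with a sketch like
    this one. The helper names are hypothetical, and a string stream is
    assumed in place of cin.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Clears basefield so that the prefix of the input selects the base.
int parse_with_prefix(const char* text) {
    std::istringstream in(text);
    in.unsetf(std::ios::basefield);  // no base selected: 0 and 0x prefixes count
    int value = -1;
    in >> value;
    return value;
}

// Forces hexadecimal input with the std::hex manipulator.
int parse_hex(const char* text) {
    std::istringstream in(text);
    int value = -1;
    in >> std::hex >> value;
    return value;
}
```

    With these, parse_with_prefix("0123") gives 83 and
    parse_with_prefix("0xff") gives 255, while parse_hex accepts both "ff"
    and "0xff".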
     
    john, Sep 9, 2007
    #12
  13. A rare day when I get to correct Alf, but fail() returns true if either
    failbit or badbit is set. I'm not sure, however, whether badbit can be
    set without failbit being set.
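
    A quick way to check this is to set badbit by hand and look at what
    fail() reports. This is only a sketch: the Flags struct and function
    name are made up, and setting the bit manually says nothing about when
    a real stream sets badbit on its own.

```cpp
#include <cassert>
#include <sstream>

// What the raw bits and the member functions report (illustrative type).
struct Flags {
    bool failbit_set;
    bool badbit_set;
    bool fail_result;
    bool bad_result;
};

Flags state_with_only_badbit() {
    std::istringstream in("data");
    in.setstate(std::ios::badbit);  // set badbit and nothing else
    const std::ios::iostate st = in.rdstate();
    Flags f;
    f.failbit_set = (st & std::ios::failbit) == std::ios::failbit;
    f.badbit_set  = (st & std::ios::badbit) == std::ios::badbit;
    f.fail_result = in.fail();  // true when failbit or badbit is set
    f.bad_result  = in.bad();
    return f;
}
```

    Here failbit itself stays clear, yet fail() returns true.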
    It is my hope that using variadic templates will allow rewriting many of
    the C IO-functions in a typesafe manner, Douglas Gregor hinted at this
    in N2087 (A Brief Introduction to Variadic Templates).
     
    Erik Wikström, Sep 9, 2007
    #13
  14. * Erik Wikström:
    Mea culpa. Sorry. There is a note (in the standard) explaining that it
    does that because of historical practice.

    I'm not sure either. ;-)

    Cheers, & thanks,

    - Alf
     
    Alf P. Steinbach, Sep 9, 2007
    #14
  15. * john:
    Yes, it can do that.

    The point is that if the user types something unexpected, the result is
    (by the definitional reliance on fscanf) Undefined Behavior -- which
    you can try out easily on e.g. older Visual C++ compilers, and perhaps
    even 8.0.

    UB on invalid input is just not the thing one would expect from a "type
    safe" i/o facility.

    Uppercasing is one of the thorniest problems in data processing, in
    general, because natural languages have all sorts of irrational rules.

    But it is of course easy if you restrict yourself to the English
    alphabet, /and/ are happy with using only the C library.

    More general C++ standard library compatible uppercasing functionality
    is provided by e.g. the Boost library.


    Cheers, & hth.,

    - Alf
     
    Alf P. Steinbach, Sep 9, 2007
    #15
  16. john

    john Guest

    I suppose we do not have undefined behaviour since we have the
    istream/ostream::good(),fail(),bad(),eof() etc facilities. So if a
    hexadecimal number is expected and a decimal one is entered, we get

    cin.good()==false.

    Yes, however we have wchar_t and its facilities. In most systems
    supporting Unicode, wchar_t is Unicode character type so in
    <cwctype>/<wctype.h> we have towupper() which does the job.
     
    john, Sep 9, 2007
    #16
  17. john

    john Guest

    TC++PL3 mentions: "When the state is fail() but not also bad(), it is
    assumed that the stream is uncorrupted and that no characters have been
    lost. When the state is bad(), all bets are off".

    Based on this, I assume that fail() checks failbit only. So is this
    TC++PL3 errata or something?
     
    john, Sep 9, 2007
    #17
  18. * john:
    Sorry, your supposition is incorrect. First, a decimal number
    specification is also valid as a hexadecimal one. Second, there is UB.

    Sorry, again (although it depends on what you mean by "does the job"):
    for the general case that function has too little information to go on,
    and has a too limited signature to be able to do the job in principle.

    As an example, uppercase of German "ß" (small letter sharp s) is "SS"
    which, as you may note, consists of /two/ characters, whereas towupper
    is a function that at most can produce /one/ character. With at least
    one implementation it just produces "ß" as uppercase. And "ß" is lowercase.

    Cheers, & hth.,

    - Alf
     
    Alf P. Steinbach, Sep 9, 2007
    #18
  19. Nope, it is just your interpretation that is wrong. Note the word *also*,
    which says that the failbit can be set without badbit, but tells us
    nothing about the reverse.
     
    Erik Wikström, Sep 9, 2007
    #19
  20. john

    john Guest

    In my system (CentOS, GCC and Anjuta) the code:

    #include <iostream>

    int main()
    {
    using namespace std;

    int x;

    cin>> x;

    cout<< boolalpha<< cin.good()<< endl;

    cin>> x;

    cout<< boolalpha<< cin.good()<< endl;

    if( cin.good() )
    cout<< "x= "<< x<< endl;
    }



    At first I enter a decimal number, then a hexadecimal one (while decimal
    is expected):


    [[email protected] src]$ ./foobar-cpp
    1
    true
    0xff
    true
    x= 0

    [[email protected] src]$

    It looks like it reads only the leading digit.


    [[email protected] src]$ ./foobar-cpp
    1
    true
    xff
    false

    [[email protected] src]$

    I think there is no more room for UB here than in the C subset I/O.
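
    The two runs above can be reproduced without a terminal. A sketch,
    assuming a string stream instead of cin; the Result struct and
    read_decimal are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Outcome of one decimal read (illustrative type).
struct Result {
    int value;             // what ended up in the variable
    bool good_after;       // good() right after the read
    std::string leftover;  // next whitespace-separated token still in the stream
};

Result read_decimal(const char* text) {
    std::istringstream in(text);
    Result r;
    r.value = -1;
    in >> r.value;               // decimal is the default base
    r.good_after = in.good();
    in.clear();                  // allow reading the rest even after a failure
    in >> r.leftover;
    return r;
}
```

    For "0xff" the read stops after the 0 and leaves "xff" in the stream.
    For "xff" nothing is consumed and the stream is no longer good().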



    OK. In any case the C subset I/O doesn't provide something more than
    itself, and as far as I know iostreams do not provide something
    equivalent (and more inferior) either.
     
    john, Sep 9, 2007
    #20
