Why getchar() doesn't quit if EOF isn't the first char

Discussion in 'C Programming' started by lovecreatesbea..., Nov 14, 2007.

  1. Thank you for your time.


    #include <stdio.h>

    int main(void)
    {
        int c;

        while ((c = getchar()) != EOF) {
            putchar(c);
            fflush(stdout);
        }
        return 0;
    }


    /* [a console interact session]

    aaa^Zbbb [INPUT surrounding ^Z followed by enter]
    [a single enter here]
    aaa [output]

    ^Z [here, ^Z followed by an enter]

    */
     
    lovecreatesbea..., Nov 14, 2007
    #1

  2. lovecreatesbea...

    Mark Bluemel Guest

    wrote:

    [His subject line was "Why getchar() doesn't quit if EOF isn't the first
    char"]

    It's a good habit to put your question in the body of the message. Some
    newsreaders, apparently, may not show the header and body together.

    The C language doesn't, as far as I can see, define the behaviour of
    interactive input streams in any detail. It's therefore down to the host
    environment to define under what circumstances an interactive stream
    indicates end-of-file.

    The behaviour documented for POSIX, according to one of the texts I
    have to hand, is that the EOF character (^D by default) only designates
    end of file if it starts a line of input. It looks like Windows follows
    the same convention.
     
    Mark Bluemel, Nov 14, 2007
    #2

  3. lovecreatesbea...

    Ben Pfaff Guest

    The answer to this question is actually Unix-specific. The best
    answer I have seen is in _The Unix Programming Environment_ by
    Kernighan and Pike. I'd encourage you to obtain a copy, because
    it is a good book. I see that amazon.com has used copies for
    under $10.
     
    Ben Pfaff, Nov 14, 2007
    #3
  4. Since his sample session shows ^Z, not ^D, it's probably not Unix.
     
    Keith Thompson, Nov 14, 2007
    #4
  5. wrote:
    [ Subject: Why getchar() doesn't quit if EOF isn't the first char ]
    EOF isn't a character. It's a value returned by getchar() to indicate an
    end-of-file condition; that value is distinct from any character value.

    Apparently when you enter control-Z in the middle of a line, it doesn't trigger
    an end-of-file condition. The manner in which this condition will be triggered
    depends on your operating system and your C implementation.
     
    Keith Thompson, Nov 14, 2007
    #5
  6. lovecreatesbea...

    CBFalconer Guest

    There is no such thing as "an EOF char". EOF, as returned by such
    routines as getc(), is an out of band value (which is why getc()
    returns an int, not a char). The only thing you know about it is
    that it is negative. This is also why getc returns the int version
    of the unsigned char input.
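
    A minimal sketch of the point above, not taken from the thread;
    count_bytes and count_bytes_narrowed are hypothetical helper names.
    The first keeps the result of getc() in an int. The second narrows it
    to signed char first, which on typical implementations (EOF defined
    as -1, 8-bit char) makes a data byte of 0xFF look like EOF. The result
    of that narrowing is implementation-defined, so the second function
    illustrates the hazard rather than guaranteed behaviour.

```c
#include <stdio.h>

/* Counts bytes read from `in` until getc() reports EOF.
   The result of getc() is kept in an int, so a data byte of 0xFF
   (returned as 255) cannot be mistaken for the negative EOF value. */
long count_bytes(FILE *in)
{
    long n = 0;
    int c;                      /* int, not char */

    while ((c = getc(in)) != EOF)
        n++;
    return n;
}

/* Same loop, but the result is narrowed to signed char first.
   Assumption: EOF is -1 and char is 8 bits, as on most systems.
   Under that assumption byte 0xFF becomes -1 and the loop stops early. */
long count_bytes_narrowed(FILE *in)
{
    long n = 0;
    signed char c;

    while ((c = (signed char)getc(in)) != EOF)
        n++;
    return n;
}
```

    Run against a stream holding the five bytes 'a' 'b' 0xFF 'c' 'd',
    count_bytes reports all five, while count_bytes_narrowed may stop
    as soon as it meets the 0xFF byte.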

    On the other hand various systems have ways of persuading a
    terminal to signal EOF. On Unix it is often the CTRL-d char. On
    Windoze it is often the CTRL-z char. On other systems, read the
    docs. These signals are often only effective immediately after a
    '\n', or end-of-line, condition.
     
    CBFalconer, Nov 15, 2007
    #6
  7. Firstly, EOF is not a character at all. It is a special value
    returned by some input functions to indicate the end-of-file condition
    has been detected on the input stream.

    Secondly, ^Z does not, by itself, cause this condition. It is a
    convention used by some systems to allow a stream to simulate this
    condition. But the convention may have restrictions. It is possible
    your system has the restriction that ^Z will only serve this purpose
    if it immediately follows an ENTER.

    If you really want to know what is happening, you should print c as an
    integer (I prefer hex) rather than a character. This way, you will
    see exactly what characters getchar obtains. You might want to change
    the while to a do-while so you see the EOF value also. (Currently you
    may not be able to tell the difference between entering a ^Z and
    entering a ^C.)
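
    One way the suggestion above could look, as a sketch only;
    dump_values is a hypothetical name, and taking the streams as
    parameters is an assumption made so the loop can be tried on any
    FILE. It prints every value getc() returns in hex, and the do-while
    form means the final EOF value is printed too.

```c
#include <stdio.h>

/* Reads `in` one value at a time and prints each value getc() returns
   as an integer, including the final EOF. Returns the number of data
   bytes seen (the EOF value itself is not counted). */
long dump_values(FILE *in, FILE *out)
{
    long n = 0;
    int c;

    do {
        c = getc(in);
        if (c == EOF) {
            fprintf(out, "EOF (%d)\n", c);   /* out-of-band value */
        } else {
            fprintf(out, "0x%02X\n", (unsigned)c);
            n++;
        }
    } while (c != EOF);
    return n;
}
```

    Calling dump_values(stdin, stdout) from main and repeating the
    session from the first post should show whether a mid-line ^Z
    arrives as an ordinary value (0x1A) or as EOF on a given system.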


    Remove del for email
     
    Barry Schwarz, Nov 15, 2007
    #7
  8. lovecreatesbea...

    Mark Bluemel Guest

    Of course there is in Operating Systems terms, which is what I meant
    here - that should have been clear from my first sentence.

    I'm well aware that there is no EOF character in C.
     
    Mark Bluemel, Nov 19, 2007
    #8
  10. EOF is a condition set on the stream by the OS when no more data is
    available to read.
    I would assume that you're referring to the end-of-file marker that a
    few operating systems use for legacy reasons in text files. This is not
    actually an EOF character, and for reference,

    Unices don't use one. When you PRESS Ctrl-D it sets the flag. If you had
    a char with value 0x04 in the stream, it has no effect and is treated as
    an ordinary character.

    Windows/DOS does use one, but it's Ctrl-Z. This is a hangover from CP/M I
    think. I'm too lazy to fire up my CPM emulator to find out.

    $ cat test.c
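    #include <stdio.h>

    int main(void)
    {
        int c;
        FILE *s = fopen("test.txt", "r");

        if (s == NULL) {            /* fopen fails if test.txt is missing */
            perror("test.txt");
            return 1;
        }
        /* Print every byte as a number and as a character until EOF. */
        while ((c = fgetc(s)) != EOF)
            printf("%d %c\n", c, c);
        fclose(s);
        return 0;
    }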
    $ hexdump test.txt
    ddddd
    64 64 04 64 64 64 10 00

    $ ./a.out
    100 d
    100 d
    4
    100 d
    100 d
    100 d
    10
     
    Mark McIntyre, Nov 20, 2007
    #10
  11. [...]

    No, EOF is neither a character nor a condition.

    The condition set on a stream when no more data is available to read
    (actually, set *after* an attempt to read more data has failed) is
    called the "end-of-file indicator". The feof() function can be used to
    query this indicator.

    EOF is a macro defined in <stdio.h>. It expands to an integer constant
    expression of type int with a negative value. This value matches the
    value returned by several functions to indicate either an end-of-file
    condition or an error condition.

    EOF stands for End Of File, but EOF and end-of-file are two quite
    different things.
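
    The distinction above can be sketched as follows. This is an
    illustration only; why_eof is a hypothetical helper and its
    return codes are made up for the example. Because getc() returns
    the same EOF value for end-of-file and for a read error, the
    stream's indicators are checked afterwards with feof() and ferror().

```c
#include <stdio.h>

/* Drains `in`, then reports why getc() returned EOF:
   1 = end-of-file indicator set, 2 = error indicator set, 0 = neither. */
int why_eof(FILE *in)
{
    int c;

    while ((c = getc(in)) != EOF)
        ;                       /* discard the data */

    if (ferror(in))
        return 2;
    if (feof(in))
        return 1;
    return 0;
}
```

    Note that feof() only becomes true after a read has failed; it
    does not predict that the next read will hit end-of-file.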
     
    Keith Thompson, Nov 20, 2007
    #11
  12. lovecreatesbea...

    Mark Bluemel Guest

    Nope - I'm referring to the special character recognised by an operating
    system to indicate the end of interactive input.
    "POSIX.1 defines 11 special characters that are handled specially on
    input. SVR4 adds another 6 special characters and 4.3+BSD adds 7."
    (W Richard Stevens "Advanced Programming in the Unix Environment")

    $ man stty
    STTY(1)
    User Commands
    STTY(1)

    NAME
    stty - change and print terminal line settings

    [snip]
    Special characters:
    * dsusp CHAR
    CHAR will send a terminal stop signal once input flushed

    eof CHAR
    CHAR will send an end of file (terminate the input)
    [snip]
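
    For POSIX systems only (this will not build on Windows), a short
    sketch of how a program might look up the character that
    "stty eof CHAR" configures. terminal_eof_char is a hypothetical
    helper; tcgetattr(), isatty() and the VEOF index are the standard
    termios interfaces.

```c
#include <termios.h>
#include <unistd.h>

/* Returns the terminal's configured end-of-file character for `fd`
   (the c_cc[VEOF] slot, which `stty eof CHAR` sets), or -1 if `fd`
   is not a terminal or its settings cannot be read. */
int terminal_eof_char(int fd)
{
    struct termios t;

    if (!isatty(fd) || tcgetattr(fd, &t) != 0)
        return -1;
    return t.c_cc[VEOF];
}
```

    On a terminal with default settings, terminal_eof_char(STDIN_FILENO)
    commonly returns 4, the code for ^D. When input is redirected from a
    file or pipe it returns -1, since the special character is a
    property of the terminal and not of the data.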
     
    Mark Bluemel, Nov 20, 2007
    #12
  13. In some operating systems. In others, it appears as a character
    marking the end of the file.
    How do you determine whether it's "actually" an EOF character? It's
    just terminology, there's no fact of the matter.

    It's pointless to argue "EOF doesn't mean so-and-so". The term is used
    outside C, and it is natural to refer to non-C uses of it when talking
    about EOF in C.

    -- Richard
     
    Richard Tobin, Nov 20, 2007
    #13
  14. lovecreatesbea...

    santosh Guest

    Is it "EOF"? It's probably safer to say "end-of-file", since "EOF" is an
    identifier defined only by Standard C.

    <snip>
     
    santosh, Nov 20, 2007
    #14
  15. Except when it is not the first character on an input line, or not
    immediately preceded by the same character. In the latter case also
    the preceding occurrence will be removed.
     
    Dik T. Winter, Nov 20, 2007
    #15

  16. *shrug*.

    I gave a worked example showing that Ctrl-D, 0x04 is NOT an EOF, and
    will not terminate reading from a file.

    I don't dispute you can press Ctrl-D and send an EOF signal to your
    application, but encountering character 0x04 in a stream is not the same
    as sending that stream a signal to say "end of data reached".

    I have a feeling we had this dull discussion about 12 months ago. I was
    right then too.... :)
     
    Mark McIntyre, Nov 20, 2007
    #16
  17. Because the ASCII character set has no character called EOF. It has an
    EOT and ESC, which occupy the positions commonly associated with the
    control sequences many OSen use to send an end-of-data signal from the
    keyboard.
    EBCDIC also has no EOF character.
    On that basis, nothing is fact.
    Agreed.
     
    Mark McIntyre, Nov 20, 2007
    #17
  18. ^D is indeed EOT (End Of Transmission), but ^Z is not ESC but SUB.
     
    Dik T. Winter, Nov 21, 2007
    #18