Why getchar() doesn't quit if EOF isn't the first char

Discussion in 'C Programming' started by lovecreatesbea...@gmail.com, Nov 14, 2007.

  1. Guest

    Thank you for your time.


    #include <stdio.h>

    int main(void)
    {
        int c;

        while ((c = getchar()) != EOF) {
            putchar(c);
            fflush(stdout);
        }
        return 0;
    }


    /* [a console interact session]

    aaa^Zbbb [INPUT surrounding ^Z followed by enter]
    [a single enter here]
    aaa [output]

    ^Z [here, ^Z followed by an enter]

    */
     
    , Nov 14, 2007
    #1

  2. Mark Bluemel Guest

    wrote:

    [His subject line was "Why getchar() doesn't quit if EOF isn't the first
    char"]

    It's a good habit to put your question in the body of the message. Some
    newsreaders, apparently, may not show the header and body together.

    The C language doesn't, as far as I can see, define the behaviour of
    interactive input streams in any detail. It's therefore down to the host
    environment to define under what circumstances an interactive stream
    indicates end-of-file.

    The behaviour documented for POSIX, according to one of the texts I
    have to hand, is that the EOF character (^D by default) only designates
    end of file if it starts a line of input. It looks like Windows follows
    the same convention.
     
    Mark Bluemel, Nov 14, 2007
    #2

  3. Ben Pfaff Guest

    "" <> writes:
    > Subject: Why getchar() doesn't quit if EOF isn't the first char


    The answer to this question is actually Unix-specific. The best
    answer I have seen is in _The Unix Programming Environment_ by
    Kernighan and Pike. I'd encourage you to obtain a copy, because
    it is a good book. I see that amazon.com has used copies for
    under $10.
    --
    Ben Pfaff
    http://benpfaff.org
     
    Ben Pfaff, Nov 14, 2007
    #3
  4. Ben Pfaff <> writes:
    > "" <> writes:
    >> Subject: Why getchar() doesn't quit if EOF isn't the first char

    >
    > The answer to this question is actually Unix-specific. The best
    > answer I have seen is in _The Unix Programming Environment_ by
    > Kernighan and Pike. I'd encourage you to obtain a copy, because
    > it is a good book. I see that amazon.com has used copies for
    > under $10.


    Since his sample session shows ^Z, not ^D, it's probably not Unix.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Nov 14, 2007
    #4
  5. wrote:
    [ Subject: Why getchar() doesn't quit if EOF isn't the first char ]
    > #include <stdio.h>
    >
    > int main(void)
    > {
    > int c;
    >
    > while ((c = getchar()) != EOF){
    > putchar(c);
    > fflush(stdout);
    > }
    > return 0;
    > }
    >
    >
    > /* [a console interact session]
    >
    > aaa^Zbbb [INPUT surrounding ^Z followed by enter]
    > [a single enter here]
    > aaa [output]
    >
    > ^Z [here, ^Z followed by an enter]
    >
    > */


    EOF isn't a character. It's a value returned by getchar() to indicate an
    end-of-file condition; that value is distinct from any character value.

    Apparently when you enter control-Z in the middle of a line, it doesn't trigger
    an end-of-file condition. The manner in which this condition will be triggered
    depends on your operating system and your C implementation.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Nov 14, 2007
    #5
  6. CBFalconer Guest

    Mark Bluemel wrote:
    >

    .... snip ...
    >
    > The behaviour documented for POSIX, according to one of the texts
    > I have to hand, is that the EOF character (^D by default) only
    > designates end of file if it starts a line of input. It looks
    > like Windows follows the same convention.


    There is no such thing as "an EOF char". EOF, as returned by such
    routines as getc(), is an out of band value (which is why getc()
    returns an int, not a char). The only thing you know about it is
    that it is negative. This is also why getc returns the int version
    of the unsigned char input.
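
    The point about int versus char can be sketched as follows. This is an
    illustration only: the function names are hypothetical, and whether the
    second function ever reports a collision depends on whether plain char
    is signed and on the value of EOF, both of which are
    implementation-defined.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if a byte from a stream stays distinct from EOF when it is
   kept in an int, which is how getc() delivers it. */
int byte_is_distinct_from_eof(unsigned char byte)
{
    int as_int = byte;          /* getc() returns 0..UCHAR_MAX for data */

    assert(EOF < 0);            /* the standard guarantees EOF is negative */
    return as_int != EOF;
}

/* Returns 1 if the same byte, narrowed to plain char first, compares
   equal to EOF. The result is implementation-defined (hypothetical
   helper, for illustration only). */
int char_collides_with_eof(unsigned char byte)
{
    char narrowed = (char)byte; /* the classic bug: char c = getchar(); */

    return narrowed == EOF;
}
```

    Where plain char is signed and EOF is -1, char_collides_with_eof(0xFF)
    reports a collision, so a loop testing a char would stop early on that
    byte. Where plain char is unsigned, the narrowed value never equals EOF
    and such a loop would not terminate. Keeping the result in an int avoids
    both problems.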

    On the other hand various systems have ways of persuading a
    terminal to signal EOF. On Unix it is often the CTRL-d char. On
    Windoze it is often the CTRL-z char. On other systems, read the
    docs. These signals are often only effective immediately after a
    '\n', or end-of-line, condition.

    --
    Chuck F (cbfalconer at maineline dot net)
    <http://cbfalconer.home.att.net>
    Try the download section.



    --
    Posted via a free Usenet account from http://www.teranews.com
     
    CBFalconer, Nov 15, 2007
    #6
  7. On Wed, 14 Nov 2007 06:01:22 -0800, ""
    <> wrote:

    >Thank you for your time.
    >
    >
    >#include <stdio.h>
    >
    >int main(void)
    >{
    > int c;
    >
    > while ((c = getchar()) != EOF){
    > putchar(c);
    > fflush(stdout);
    > }
    > return 0;
    >}
    >
    >
    >/* [a console interact session]
    >
    >aaa^Zbbb [INPUT surrounding ^Z followed by enter]
    > [a single enter here]
    >aaa [output]
    >
    >^Z [here, ^Z followed by an enter]
    >
    >*/


    Firstly, EOF is not a character at all. It is a special value
    returned by some input functions to indicate the end-of-file condition
    has been detected on the input stream.

    Secondly, ^Z does not, by itself, cause this condition. It is a
    convention used by some systems to allow a stream to simulate this
    condition. But the convention may have restrictions. It is possible
    your system has the restriction that ^Z will only serve this purpose
    if it immediately follows an ENTER.

    If you really want to know what is happening, you should print c as an
    integer (I prefer hex) rather than a character. This way, you will
    see exactly what characters getchar obtains. You might want to change
    the while to a do-while so you see the EOF value also. (Currently you
    may not be able to tell the difference between entering a ^Z and
    entering a ^C.)
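
    A minimal sketch of that suggestion, written as a function so it can be
    pointed at any stream. The name dump_values and the output format are
    assumptions made for this example, not something from the original
    program.

```c
#include <assert.h>
#include <stdio.h>

/* Reads `in` until getc() returns EOF and writes every returned value to
   `out` as an integer in hex. The do-while form means the final EOF value
   is printed too. Returns the number of values printed. */
int dump_values(FILE *in, FILE *out)
{
    int c;          /* int, not char, so EOF stays distinguishable */
    int count = 0;

    assert(in != NULL && out != NULL);
    do {
        c = getc(in);
        if (c == EOF)
            fprintf(out, "EOF (%d)\n", c);
        else
            fprintf(out, "0x%02X\n", (unsigned)c);
        count++;
    } while (c != EOF);
    fflush(out);
    return count;
}
```

    Calling dump_values(stdin, stdout) from main and repeating the console
    session should show which values actually reach the program when ^Z is
    typed in the middle of a line.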


    Remove del for email
     
    Barry Schwarz, Nov 15, 2007
    #7
  8. Mark Bluemel Guest

    CBFalconer wrote:
    > Mark Bluemel wrote:
    > ... snip ...
    >> The behaviour documented for POSIX, according to one of the texts
    >> I have to hand, is that the EOF character (^D by default) only
    >> designates end of file if it starts a line of input. It looks
    >> like Windows follows the same convention.

    >
    > There is no such thing as "an EOF char".


    Of course there is in Operating Systems terms, which is what I meant
    here - that should have been clear from my first sentence.

    I'm well aware that there is no EOF character in C.
     
    Mark Bluemel, Nov 19, 2007
    #8
  10. Mark Bluemel wrote:
    > CBFalconer wrote:
    >> Mark Bluemel wrote:
    >> ... snip ...
    >>> The behaviour documented for POSIX, according to one of the texts
    >>> I have to hand, is that the EOF character (^D by default) only
    >>> designates end of file if it starts a line of input. It looks
    >>> like Windows follows the same convention.

    >>
    >> There is no such thing as "an EOF char".

    >
    > Of course there is in Operating Systems terms,


    EOF is a condition set on the stream by the OS when no more data is
    available to read.

    > which is what I meant
    > here - that should have been clear from my first sentence.


    I would assume that you're referring to the end-of-file marker that a
    few operating systems use for legacy reasons in text files. This is not
    actually an EOF character, and for reference,

    Unices don't use one. When you PRESS Ctrl-D it sets the flag. If you had
    a char with value 0x04 in the stream, it has no effect and is treated as
    an ordinary character.

    Windows/DOS does use one, but it's Ctrl-Z. This is a hangover from CP/M I
    think. I'm too lazy to fire up my CPM emulator to find out.

    $ cat test.c
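    #include <stdio.h>

    int main(void)
    {
        int c;
        FILE *s = fopen("test.txt", "r");

        if (s == NULL) {        /* fopen fails if test.txt is missing */
            perror("test.txt");
            return 1;
        }
        /* print each byte as a number and as a character until EOF */
        while ((c = fgetc(s)) != EOF)
            printf("%d %c\n", c, c);
        fclose(s);
        return 0;
    }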
    $ hexdump test.txt
    ddddd
    64 64 04 64 64 64 10 00

    $ ./a.out
    100 d
    100 d
    4
    100 d
    100 d
    100 d
    10
     
    Mark McIntyre, Nov 20, 2007
    #10
  11. Mark McIntyre wrote:
    > Mark Bluemel wrote:
    >> CBFalconer wrote:
    >>> Mark Bluemel wrote:
    >>> ... snip ...
    >>>> The behaviour documented for POSIX, according to one of the texts
    >>>> I have to hand, is that the EOF character (^D by default) only
    >>>> designates end of file if it starts a line of input. It looks
    >>>> like Windows follows the same convention.
    >>>
    >>> There is no such thing as "an EOF char".

    >>
    >> Of course there is in Operating Systems terms,

    >
    > EOF is a condition set on the stream by the OS when no more data is
    > available to read.

    [...]

    No, EOF is neither a character nor a condition.

    The condition set on a stream when no more data is available to read
    (actually, set *after* an attempt to read more data has failed) is
    called the "end-of-file indicator". The feof() function can be used to
    query this indicator.

    EOF is a macro defined in <stdio.h>. It expands to an integer constant
    expression of type int with a negative value. This value matches the
    value returned by several functions to indicate either an end-of-file
    condition or an error condition.
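
    Because the EOF return value covers both cases, the stream has to be
    asked afterwards which one occurred. A short sketch, assuming a
    hypothetical helper named drain_and_classify and an enum invented for
    this example:

```c
#include <assert.h>
#include <stdio.h>

enum stop_reason { STOP_END_OF_FILE, STOP_READ_ERROR, STOP_UNKNOWN };

/* Consumes `in` until getc() returns EOF, then queries the stream's
   indicators to find out why reading stopped. If `bytes_read` is not
   NULL, it receives the number of bytes consumed. */
enum stop_reason drain_and_classify(FILE *in, long *bytes_read)
{
    int c;
    long n = 0;

    assert(in != NULL);
    while ((c = getc(in)) != EOF)
        n++;
    if (bytes_read != NULL)
        *bytes_read = n;

    if (ferror(in))             /* error indicator set: a read failed */
        return STOP_READ_ERROR;
    if (feof(in))               /* end-of-file indicator set */
        return STOP_END_OF_FILE;
    return STOP_UNKNOWN;        /* not expected after getc() returned EOF */
}
```

    Note that feof() is consulted only after getc() has already returned
    EOF, which matches the description above: the indicator is set by the
    failed read, not before it.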

    EOF stands for End Of File, but EOF and end-of-file are two quite
    different things.

    --
    Keith Thompson (The_Other_Keith) <>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Nov 20, 2007
    #11
  12. Mark Bluemel Guest

    Mark McIntyre wrote:
    > Mark Bluemel wrote:
    >> CBFalconer wrote:
    >>> Mark Bluemel wrote:
    >>> ... snip ...
    >>>> The behaviour documented for POSIX, according to one of the texts
    >>>> I have to hand, is that the EOF character (^D by default) only
    >>>> designates end of file if it starts a line of input. It looks
    >>>> like Windows follows the same convention.
    >>>
    >>> There is no such thing as "an EOF char".

    >>
    >> Of course there is in Operating Systems terms,

    >
    > EOF is a condition set on the stream by the OS when no more data is
    > available to read.
    >
    >> which is what I meant here - that should have been clear from my first
    >> sentence.

    >
    > I would assume that you're referring to the end-of-file marker that a
    > few operating systems use for legacy reasons in text files.


    Nope - I'm referring to the special character recognised by an operating
    system to indicate the end of interactive input.

    > This is not
    > actually an EOF character, and for reference,
    >
    > Unices don't use one. When you PRESS Ctrl-D it sets the flag. If you had
    > a char with value 0x04 in the stream, it has no effect and is treated as
    > an ordinary character.


    "POSIX.1 defines 11 special characters that are handled specially on
    input. SVR4 adds another 6 special characters and 4.3+BSD adds 7."
    (W Richard Stevens "Advanced Programming in the Unix Environment")

    $ man stty
    STTY(1)                  User Commands                  STTY(1)

    NAME
           stty - change and print terminal line settings

    [snip]
       Special characters:
       * dsusp CHAR
              CHAR will send a terminal stop signal once input flushed

         eof CHAR
              CHAR will send an end of file (terminate the input)
    [snip]
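
    The same setting that stty reports can be read from a program through
    the POSIX termios interface. The sketch below is POSIX-specific and the
    function name terminal_eof_char is made up for this example; the value
    is only meaningful while the terminal is in canonical (line) mode.

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Returns the current EOF special character configured for the terminal
   behind `fd`, or -1 if `fd` is not a terminal or the query fails. */
int terminal_eof_char(int fd)
{
    struct termios settings;

    assert(fd >= 0);
    if (!isatty(fd) || tcgetattr(fd, &settings) != 0)
        return -1;
    return settings.c_cc[VEOF];     /* cc_t is unsigned, so never negative */
}
```

    On many Unix systems terminal_eof_char(STDIN_FILENO) returns 4, that is
    Ctrl-D, when standard input is a terminal, and -1 when input is
    redirected from a file or a pipe.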
     
    Mark Bluemel, Nov 20, 2007
    #12
  13. In article <>,
    Mark McIntyre <> wrote:

    >>> There is no such thing as "an EOF char".


    >> Of course there is in Operating Systems terms,


    >EOF is a condition set on the stream by the OS when no more data is
    >available to read.


    In some operating systems. In others, it appears as a character
    marking the end of the file.

    >I would assume that you're referring to the end-of-file marker that a
    >few operating systems use for legacy reasons in text files. This is not
    >actually an EOF character, and for reference,


    How do you determine whether it's "actually" an EOF character? It's
    just terminology, there's no fact of the matter.

    It's pointless to argue "EOF doesn't mean so-and-so". The term is used
    outside C, and it is natural to refer to non-C uses of it when talking
    about EOF in C.

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
     
    Richard Tobin, Nov 20, 2007
    #13
  14. santosh Guest

    In article <>, Mark McIntyre
    <> wrote on Tuesday 20 Nov 2007 2:44 pm:

    > Mark Bluemel wrote:
    >> CBFalconer wrote:
    >>> Mark Bluemel wrote:
    >>> ... snip ...
    >>>> The behaviour documented for POSIX, according to one of the texts
    >>>> I have to hand, is that the EOF character (^D by default) only
    >>>> designates end of file if it starts a line of input. It looks
    >>>> like Windows follows the same convention.
    >>>
    >>> There is no such thing as "an EOF char".

    >>
    >> Of course there is in Operating Systems terms,

    >
    > EOF is a condition set on the stream by the OS when no more data is
    > available to read.


    Is it "EOF"? It's probably safer to say "end-of-file", since "EOF" is an
    identifier defined only by Standard C.

    <snip>
     
    santosh, Nov 20, 2007
    #14
  15. In article <fhufg2$gce$> Mark Bluemel <> writes:
    > eof CHAR
    > CHAR will send an end of file (terminate the input)


    Except when it is not the first character on an input line, or not
    immediately preceded by the same character. In the latter case also
    the preceding occurrence will be removed.
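
    One way to read this rule: in canonical mode a POSIX terminal treats the
    eof character as "hand over whatever has been typed so far". If bytes
    are pending, read() returns them; only when nothing is pending does
    read() return 0, and a zero-byte read is what the reader takes as
    end-of-file. The sketch below shows only that last part, and uses a pipe
    as a stand-in for the terminal (an assumption made for illustration; a
    pipe has no eof character). The function name is hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/* Calls read() on `fd` until it returns 0 (end-of-file) or fails.
   Returns the total number of bytes seen, or -1 on a read error. */
long bytes_until_zero_read(int fd)
{
    char buf[64];
    long total = 0;
    ssize_t n;

    assert(fd >= 0);
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;                 /* data arrived: not end-of-file yet */
    return n == 0 ? total : -1;     /* n == 0 is the end-of-file signal */
}
```

    With a pipe, the zero-byte read arrives once the write end is closed.
    With a terminal, it arrives when the eof character is typed on an empty
    line, which is why the same character typed after "aaa" merely delivers
    "aaa" to the program.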
    --
    dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
    home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
     
    Dik T. Winter, Nov 20, 2007
    #15
  16. Mark Bluemel wrote:

    > stty - change and print terminal line settings
    >
    > [snip]
    > Special characters:
    > * dsusp CHAR
    > CHAR will send a terminal stop signal once input flushed
    >
    > eof CHAR
    > CHAR will send an end of file (terminate the input)



    *shrug*.

    I gave a worked example showing that Ctrl-D, 0x04 is NOT an EOF, and
    will not terminate reading from a file.

    I don't dispute you can press Ctrl-D and send an EOF signal to your
    application, but encountering character 0x04 in a stream is not the same
    as sending that stream a signal to say "end of data reached".

    I have a feeling we had this dull discussion about 12 months ago. I was
    right then too.... :)
     
    Mark McIntyre, Nov 20, 2007
    #16
  17. Richard Tobin wrote:
    > How do you determine whether it's "actually" an EOF character?


    Because the ASCII character set has no character called EOF. It has an
    EOT and ESC, which occupy the positions commonly associated with the
    control sequences many OSen use to send an end-of-data signal from the
    keyboard.
    EBCDIC also has no EOF character.

    >It's just terminology, there's no fact of the matter.


    On that basis, nothing is fact.

    > It's pointless to argue "EOF doesn't mean so-and-so". The term is used
    > outside C, and it is natural to refer to non-C uses of it when talking
    > about EOF in C.


    Agreed.
     
    Mark McIntyre, Nov 20, 2007
    #17
  18. In article <> Mark McIntyre <> writes:
    > Richard Tobin wrote:
    > > How do you determine whether it's "actually" an EOF character?

    >
    > Because the ASCII character set has no character called EOF. It has an
    > EOT and ESC, which occupy the positions commonly associated with the
    > control sequences many OSen use to send an end-of-data signal from the
    > keyboard.
    > EBCDIC also has no EOF character.


    ^D is indeed EOT (End Of Transmission), but ^Z is not ESC but SUB.
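
    The mapping can be checked with a few lines of C. This sketch assumes an
    ASCII execution character set, and ctrl_code is a name invented for the
    example.

```c
#include <assert.h>

/* Maps a letter to the ASCII control code produced by Ctrl+letter,
   which is the low five bits of the letter's code. Assumes ASCII. */
int ctrl_code(char letter)
{
    assert((letter >= 'A' && letter <= 'Z') ||
           (letter >= 'a' && letter <= 'z'));
    return letter & 0x1F;
}
```

    Under that assumption ctrl_code('D') is 0x04 (EOT) and ctrl_code('Z') is
    0x1A (SUB). ESC is 0x1B, one position further on.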
    --
    dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
    home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
     
    Dik T. Winter, Nov 21, 2007
    #18