What's the guideline for dealing with unwanted chars in input stream?

Discussion in 'C Programming' started by lovecreatesbeauty, Dec 31, 2005.

  1. /*
    When should we worry about unwanted chars in the input stream? Can we
    predict this kind of behavior and prevent it before debugging and
    testing? What's the guideline for dealing with it?

    As shown below at line #21, I should remove the unwanted characters
    from the input stream at that point. Am I missing other possible I/O
    errors that could occasionally occur elsewhere? Comments on the
    following code are welcome, thank you.
    */


    1 #define STRLEN 200
    2
    3 int main(int argc, char *argv[])
    4 {
    5 int ret = 0;
    6 char cust[STRLEN] = {'\0'};
    7 char dest[STRLEN] = {'\0'};
    8 char flight = '\0';
    9 char hotel = '\0';
    10
    11 printf("Customer name: ");
    12 gets(cust);
    13 printf("Destination: ");
    14 gets(dest);
    15
    16 printf("Will flight be available: ");
    17 flight = getchar();
    18 printf("Will hotel be available: ");
    19
    20 /* remove unwanted chars in input stream here now */
    21 while(getchar() != '\n'); //intended null loop body
    22 hotel = getchar();
    23
    24 printf("\n- summary -\n");
    25 printf("Customer name\t: %s \n" ,cust);
    26 printf("Destination\t: %s \n" ,dest);
    27 printf("is flight available\t: %c \n", flight);
    28 printf("is hotel available\t: %c \n", hotel);
    29
    30 return ret;
    31 }
    32
    lovecreatesbeauty, Dec 31, 2005
    #1

  2. lovecreatesbeauty said:

    > /*
    > When should we worry about the unwanted chars in input stream?


    When you know they're unwanted.

    > Can we
    > predicate this kind of behavior and prevent it before debugging and
    > testing? What's the guideline for dealing with it?


    Decide what you wish to keep and what you wish to discard. Devise an
    algorithm for distinguishing between them. Implement the algorithm.

    > As shown below line #21, I should remove the unwanted characters in
    > input stream there at that time. Do I miss some other possible errors
    > in i/o which will happen to occur sometimes in other places? And
    > welcome your kind comments on following the code, thank you.
    > */
    >
    >
    > 1 #define STRLEN 200
    > 2
    > 3 int main(int argc, char *argv[])


    Well done, you got main() right. A good start.

    > 4 {
    > 5 int ret = 0;
    > 6 char cust[STRLEN] = {'\0'};
    > 7 char dest[STRLEN] = {'\0'};
    > 8 char flight = '\0';
    > 9 char hotel = '\0';
    > 10
    > 11 printf("Customer name: ");


    Undefined behaviour. You forgot to #include <stdio.h>

    > 12 gets(cust);


    Never, ever, ever, ever, ever, ever, ever use gets(). Use fgets instead, as
    it allows you to specify how much storage space you have available for the
    input. Any input that won't fit will stay in the stream awaiting collection
    by the next function to read from that stream.

    > 13 printf("Destination: ");
    > 14 gets(dest);


    Never, ever, ever, ever, ever, ever, ever use gets(). Use fgets instead, as
    it allows you to specify how much storage space you have available for the
    input. Any input that won't fit will stay in the stream awaiting collection
    by the next function to read from that stream.

    > 15
    > 16 printf("Will flight be available: ");
    > 17 flight = getchar();


    Why not use fgets here too?

    > 18 printf("Will hotel be available: ");
    > 19
    > 20 /* remove unwanted chars in input stream here now */
    > 21 while(getchar() != '\n'); //intended null loop body


    This won't work, because it will keep reading and discarding characters up
    to AND INCLUDING the first non-newline, which is presumably supposed to be
    data.

    I suggest capturing all your data in string form, using fgets (for now, that
    is - later you'll probably devise your own input routine when you have a
    lot more experience), even if it's single-character data. It's easy enough
    to pick a single character out of a string if that's all you need from it.
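
    A minimal sketch of that suggestion, assuming two hypothetical helpers
    (read_line and read_answer are illustrative names, not part of the
    original program). Every input, even a one-character answer, is read
    as a whole line with fgets, so leftover characters never reach the
    next prompt:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into buf (capacity `size`).
 * Returns 1 on success, 0 on end-of-file or read error.
 * The trailing newline is stripped. If the line was too long for buf,
 * the excess is discarded so it cannot leak into the next read. */
int read_line(FILE *in, char *buf, size_t size)
{
    size_t len;

    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* the whole line fit */
    } else {
        int ch;                        /* int, so EOF is representable */
        while ((ch = getc(in)) != EOF && ch != '\n')
            continue;                  /* drop the rest of the long line */
    }
    return 1;
}

/* Hypothetical helper: read a line and return its first character,
 * or EOF if no line could be read. An empty line yields '\0'. */
int read_answer(FILE *in)
{
    char line[64];

    if (!read_line(in, line, sizeof line))
        return EOF;
    return (unsigned char)line[0];
}
```

    In the posted program this would replace gets(cust) with
    read_line(stdin, cust, sizeof cust) and each bare getchar() with
    read_answer(stdin), which makes the flushing loop on line 21
    unnecessary. Whether to discard over-long input or report it as an
    error is a design choice; this sketch silently discards it.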

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at above domain (but drop the www, obviously)
    Richard Heathfield, Dec 31, 2005
    #2

  3. Chuck F. Guest

    Richard Heathfield wrote:
    > lovecreatesbeauty said:
    >

    .... big snip ...
    >
    >> 19
    >> 20 /* remove unwanted chars in input stream here now */
    >> 21 while(getchar() != '\n'); //intended null loop body

    >
    > This won't work, because it will keep reading and discarding
    > characters up to AND INCLUDING the first non-newline, which is
    > presumably supposed to be data.


    Hunh? It will remove chars from the stream, including the '\n'.
    The fault is that it doesn't check for EOF, which can lead to a
    fairly long wait.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at: <http://cfaj.freeshell.org/google/>
    Chuck F., Dec 31, 2005
    #3
  4. Chuck F. said:

    > Richard Heathfield wrote:
    >> lovecreatesbeauty said:
    >>

    > ... big snip ...
    > >
    >>> 19
    >>> 20 /* remove unwanted chars in input stream here now */
    >>> 21 while(getchar() != '\n'); //intended null loop body

    >>
    >> This won't work, because it will keep reading and discarding
    >> characters up to AND INCLUDING the first non-newline, which is
    >> presumably supposed to be data.

    >
    > Hunh? It will remove chars from the stream, including the '\n'.


    Oops. My apologies. I read it as == for some reason.


    Richard Heathfield, Dec 31, 2005
    #4
  5. Guest

    Richard Heathfield wrote:
    > Chuck F. said:
    >
    > > Richard Heathfield wrote:
    > >> lovecreatesbeauty said:
    > >>

    > > ... big snip ...
    > > >
    > >>> 19
    > >>> 20 /* remove unwanted chars in input stream here now */
    > >>> 21 while(getchar() != '\n'); //intended null loop body
    > >>
    > >> This won't work, because it will keep reading and discarding
    > >> characters up to AND INCLUDING the first non-newline, which is
    > >> presumably supposed to be data.

    > >
    > > Hunh? It will remove chars from the stream, including the '\n'.

    >
    > Oops. My apologies. I read it as == for some reason.
    >


    More accurately, the "while(getchar() != '\n');" part removes every
    other character in the stream so that the following "getchar()" will
    only get 'even' characters.

    If "Hello world\n" is in the stream then you would only be getting "el
    ol\n". In this case, because there are an odd number of chars in the
    stream before the "\n", the while loop will even miss the newline.
    , Dec 31, 2005
    #5
  6. "Chuck F. " <> writes:

    > Richard Heathfield wrote:
    > > lovecreatesbeauty said:

    > ... big snip ...
    > >> 20 /* remove unwanted chars in input stream here now */
    > >> 21 while(getchar() != '\n'); //intended null loop body

    > > This won't work, because it will keep reading and discarding

    >
    > > characters up to AND INCLUDING the first non-newline, which is
    > > presumably supposed to be data.

    >
    > Hunh? It will remove chars from the stream, including the '\n'. The
    > fault is that it doesn't check for EOF, which can lead to a fairly
    > long wait.


    Not so long a wait, as (EOF != '\n') != 0.

    mlp
    Mark L Pappin, Jan 1, 2006
    #6
  7. Mark L Pappin wrote:
    > "Chuck F. " <> writes:
    >
    > > Richard Heathfield wrote:
    > > > lovecreatesbeauty said:

    > > ... big snip ...
    > > >> 20 /* remove unwanted chars in input stream here now */
    > > >> 21 while(getchar() != '\n'); //intended null loop body
    > > > This won't work, because it will keep reading and discarding

    > >
    > > > characters up to AND INCLUDING the first non-newline, which is
    > > > presumably supposed to be data.

    > >
    > > Hunh? It will remove chars from the stream, including the '\n'. The
    > > fault is that it doesn't check for EOF, which can lead to a fairly
    > > long wait.

    >
    > Not so long a wait, as (EOF != '\n') != 0.
    >
    > mlp


    Thank you, then do I also need to treat the EOF character specially?

    > 5.
    > Jan 1, 5:42 am show options
    >
    >More accurately, the "while(getchar() != '\n');" part removes every
    >other character in the stream so that the following "getchar()" will
    >only get 'even' characters.


    >If "Hello world\n" is in the stream then you would only be getting "el
    >ol\n". In this case, because there are an odd number of chars in the
    >stream before the "\n", the while loop will even miss the newline.


    And I don't understand this reply exactly, sorry.
    lovecreatesbeauty, Jan 1, 2006
    #7
  8. said:

    > More accurately, the "while(getchar() != '\n');" part removes every
    > other character in the stream so that the following "getchar()" will
    > only get 'even' characters.


    Nonsense.

    Richard Heathfield, Jan 1, 2006
    #8
  9. lovecreatesbeauty said:

    > Thank you, then do I also need to treat the EOF character specially?


    If getchar() yields EOF, it means you won't be getting any more data from
    this stream, no matter how much you try, so if you haven't got enough data
    to complete your task you might as well emit an error message and quit.
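
    One way to act on this, sketched under the assumption of a
    hypothetical read_char helper and status enum (neither is from the
    original program). getc() returns EOF both at end of input and on a
    read error, and ferror() tells the two apart:

```c
#include <stdio.h>

/* Hypothetical status codes for this sketch. */
enum read_status { READ_OK, READ_EOF, READ_ERROR };

/* Hypothetical helper: read one character from `in`.
 * On success stores it in *out and returns READ_OK.
 * Otherwise leaves *out untouched and reports why no data arrived. */
enum read_status read_char(FILE *in, int *out)
{
    int ch = getc(in);   /* int, not char: EOF must stay distinguishable */

    if (ch == EOF)
        return ferror(in) ? READ_ERROR : READ_EOF;

    *out = ch;
    return READ_OK;
}
```

    A caller that still needs data when it gets READ_EOF or READ_ERROR
    can print a message to stderr and return EXIT_FAILURE from main().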

    >
    >> 5.
    >> Jan 1, 5:42 am show options
    >>
    >>More accurately, the "while(getchar() != '\n');" part removes every
    >>other character in the stream so that the following "getchar()" will
    >>only get 'even' characters.

    >
    >>If "Hello world\n" is in the stream then you would only be getting "el
    >>ol\n". In this case, because there are an odd number of chars in the
    >>stream before the "\n", the while loop will even miss the newline.

    >
    > And I don't understand this reply exactly, sorry.


    It's just nonsense which you can safely ignore.

    Richard Heathfield, Jan 1, 2006
    #9
  10. clayne Guest

    Anyways, I think the original poster was looking for ideas on typical
    methods. comp.lang.c spends way too much time on pedantic bullshit vs.
    getting anything actually answered as originally asked, preferring to
    focus on the absence of <stdio.h>.

    To the original poster:

    switch on the return value of getchar() and add a case for each value
    you actually care about, with a default case that either does what
    would be intuitive or re-issues the prompt to the user.
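
    A sketch of that approach, assuming a hypothetical ask_yes_no helper
    and a y/n answer format (both are illustrative choices, not from the
    original program). The rest of each line is consumed before the
    switch, so stray characters cannot answer the next question:

```c
#include <stdio.h>

/* Hypothetical helper: print `question` on `out` and read answers from `in`.
 * Returns 1 for yes, 0 for no, -1 if input ends before a valid answer. */
int ask_yes_no(FILE *in, FILE *out, const char *question)
{
    for (;;) {
        int ch, first;

        fprintf(out, "%s (y/n): ", question);
        fflush(out);

        first = getc(in);
        /* consume the rest of the line, including the newline */
        for (ch = first; ch != '\n' && ch != EOF; ch = getc(in))
            continue;

        switch (first) {
        case 'y': case 'Y':
            return 1;
        case 'n': case 'N':
            return 0;
        case EOF:
            return -1;
        default:
            fputs("Please answer y or n.\n", out);
            break;   /* leaves the switch; the for loop asks again */
        }
    }
}
```

    Typical use would be ask_yes_no(stdin, stdout, "Will flight be
    available"), with the caller treating -1 as "no more input".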
    clayne, Jan 1, 2006
    #10
  11. clayne said:

    > Anyways, I think the original poster was looking for ideas on typical
    > methods..


    I gave him an excellent idea elsethread.

    > comp.lang.c spends way too much time on pedantic bullshit vs.
    > getting any thing actually answered


    You have drawn an incorrect conclusion based on insufficient data. If you'd
    read my first reply in this thread, you'd have seen the following text:

    "I suggest capturing all your data in string form, using fgets (for now,
    that is - later you'll probably devise your own input routine when you have
    a lot more experience), even if it's single-character data. It's easy
    enough to pick a single character out of a string if that's all you need
    from it."

    Richard Heathfield, Jan 1, 2006
    #11
  12. Chuck F. Guest

    Mark L Pappin wrote:
    > "Chuck F. " <> writes:
    >> Richard Heathfield wrote:
    >>> lovecreatesbeauty said:

    >
    >> ... big snip ...

    >
    >>>> 20 /* remove unwanted chars in input stream here now */
    >>>> 21 while(getchar() != '\n'); //intended null loop body

    >
    >>> This won't work, because it will keep reading and discarding

    >>
    >>> characters up to AND INCLUDING the first non-newline, which is
    >>> presumably supposed to be data.

    >>
    >> Hunh? It will remove chars from the stream, including the
    >> '\n'. The fault is that it doesn't check for EOF, which can
    >> lead to a fairly long wait.

    >
    > Not so long a wait, as (EOF != '\n') != 0.


    Meaning that the loop continues. This sort of confusion is why I
    would write the thing as:

    while (('\n' != (ch = getchar())) && (EOF != ch)) continue;
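
    For that loop to be correct, ch has to be declared as int rather
    than char, because EOF is an int value outside the range of valid
    characters. A sketch of the same loop wrapped in a hypothetical
    discard_line helper (the name and the FILE * parameter are
    assumptions made for illustration):

```c
#include <stdio.h>

/* Hypothetical helper: discard the rest of the current line on `in`.
 * Returns '\n' if a newline was consumed, or EOF if the stream ended
 * (or failed) before a newline was seen. */
int discard_line(FILE *in)
{
    int ch;   /* must be int: a char cannot reliably hold EOF */

    while ((ch = getc(in)) != '\n' && ch != EOF)
        continue;
    return ch;
}
```

    Calling discard_line(stdin) in place of line 21 of the posted
    program removes the leftover characters and also terminates when
    input ends without a newline.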

    Chuck F., Jan 1, 2006
    #12
  13. Guest

    lovecreatesbeauty wrote:
    > Mark L Pappin wrote:
    > > "Chuck F. " <> writes:
    > >
    > > > Richard Heathfield wrote:
    > > > > lovecreatesbeauty said:
    > > > ... big snip ...
    > > > >> 20 /* remove unwanted chars in input stream here now */
    > > > >> 21 while(getchar() != '\n'); //intended null loop body
    > > > > This won't work, because it will keep reading and discarding
    > > >
    > > > > characters up to AND INCLUDING the first non-newline, which is
    > > > > presumably supposed to be data.
    > > >
    > > > Hunh? It will remove chars from the stream, including the '\n'. The
    > > > fault is that it doesn't check for EOF, which can lead to a fairly
    > > > long wait.

    > >
    > > Not so long a wait, as (EOF != '\n') != 0.
    > >
    > > mlp

    >
    > Thank you, then do I also need to treat the EOF character specially?
    >
    > > 5.
    > > Jan 1, 5:42 am show options
    > >
    > >More accurately, the "while(getchar() != '\n');" part removes every
    > >other character in the stream so that the following "getchar()" will
    > >only get 'even' characters.

    >
    > >If "Hello world\n" is in the stream then you would only be getting "el
    > >ol\n". In this case, because there are an odd number of chars in the
    > >stream before the "\n", the while loop will even miss the newline.

    >
    > And I don't understand this reply exactly, sorry.


    Sorry, I thought you wrote:

    while(getchar() != '\n')
    hotel = getchar();

    Misread your code, missed the semicolon at the end of the while.
    , Jan 1, 2006
    #13
  14. Tim Rentsch Guest

    "Chuck F. " <> writes:

    > Mark L Pappin wrote:
    > > "Chuck F. " <> writes:
    > >> Richard Heathfield wrote:
    > >>> lovecreatesbeauty said:

    > >
    > >> ... big snip ...

    > >
    > >>>> 20 /* remove unwanted chars in input stream here now */
    > >>>> 21 while(getchar() != '\n'); //intended null loop body

    > >
    > >>> This won't work, because it will keep reading and discarding
    > >>
    > >>> characters up to AND INCLUDING the first non-newline, which is
    > >>> presumably supposed to be data.
    > >>
    > >> Hunh? It will remove chars from the stream, including the
    > >> '\n'. The fault is that it doesn't check for EOF, which can
    > >> lead to a fairly long wait.

    > >
    > > Not so long a wait, as (EOF != '\n') != 0.

    >
    > Meaning that the loop continues. This sort of confusion is why I
    > would write the thing as:
    >
    > while (('\n' != (ch = getchar())) && (EOF != ch)) continue;


    When there are two tests, as there are here, it seems better (IMO) to
    write the assignment as the first part of a comma expression:

    while( ch = getchar(), ch != '\n' && ch != EOF ) ...

    Personally I find it easier to read expressions that have less visual
    clutter. Also I think it helps to get the assignment right out front
    (left out front? fuggedaboudit) where it's immediately obvious that an
    assignment is happening.
    Tim Rentsch, Jan 1, 2006
    #14
  15. tmp123 Guest

    lovecreatesbeauty wrote:
    > /*
    > When should we worry about the unwanted chars in input stream? Can we
    > predicate this kind of behavior and prevent it before debugging and
    > testing? What's the guideline for dealing with it?
    >
    > As shown below line #21, I should remove the unwanted characters in
    > input stream there at that time. Do I miss some other possible errors
    > in i/o which will happen to occur sometimes in other places? And
    > welcome your kind comments on following the code, thank you.
    > */
    >
    >
    > 1 #define STRLEN 200
    > 2
    > 3 int main(int argc, char *argv[])
    > 4 {
    > 5 int ret = 0;
    > 6 char cust[STRLEN] = {'\0'};
    > 7 char dest[STRLEN] = {'\0'};
    > 8 char flight = '\0';
    > 9 char hotel = '\0';
    > 10
    > 11 printf("Customer name: ");
    > 12 gets(cust);
    > 13 printf("Destination: ");
    > 14 gets(dest);
    > 15
    > 16 printf("Will flight be available: ");
    > 17 flight = getchar();
    > 18 printf("Will hotel be available: ");
    > 19
    > 20 /* remove unwanted chars in input stream here now */
    > 21 while(getchar() != '\n'); //intended null loop body
    > 22 hotel = getchar();
    > 23
    > 24 printf("\n- summary -\n");
    > 25 printf("Customer name\t: %s \n" ,cust);
    > 26 printf("Destination\t: %s \n" ,dest);
    > 27 printf("is flight available\t: %c \n", flight);
    > 28 printf("is hotel available\t: %c \n", hotel);
    > 29
    > 30 return ret;
    > 31 }
    > 32



    Hi,

    If you want to read data character by character, and to do it
    directly in C, you are developing a "parser". It takes a while to
    explain how to implement one, but these are some elements used in
    the most common implementation:

    a) A variable holding the last character read.
    b) "ungetc" calls to push back the last character if you have read
    too far.
    c) An integer to mark the state.

    If you want more details, do not hesitate to ask.

    When this part becomes too complex to code by hand, tools like
    "lex" and "yacc" (= "bison") will do most of the work for you.

    Other, more complex resources are also available. Welcome to the great
    world of grammars.
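
    The three elements listed above can be sketched as a tiny
    hand-written parser. This is only an illustration under assumed
    requirements: a hypothetical read_number helper that skips leading
    whitespace and reads an unsigned decimal number, leaving the first
    character after the number in the stream.

```c
#include <ctype.h>
#include <stdio.h>

enum state { SKIP_SPACE, IN_NUMBER, DONE };   /* (c) the state */

/* Hypothetical helper: read an unsigned decimal number from `in`.
 * Returns 1 and stores the number in *value, or 0 if no digits were
 * found. Overflow is not checked in this sketch. */
int read_number(FILE *in, unsigned long *value)
{
    enum state st = SKIP_SPACE;
    unsigned long n = 0;
    int found = 0;
    int ch;                                   /* (a) last character read */

    while (st != DONE) {
        ch = getc(in);
        switch (st) {
        case SKIP_SPACE:
            if (ch == EOF) {
                st = DONE;
            } else if (!isspace(ch)) {
                ungetc(ch, in);               /* (b) let IN_NUMBER reread it */
                st = IN_NUMBER;
            }
            break;
        case IN_NUMBER:
            if (ch != EOF && isdigit(ch)) {
                n = n * 10 + (unsigned long)(ch - '0');
                found = 1;
            } else {
                if (ch != EOF)
                    ungetc(ch, in);           /* (b) read one too many */
                st = DONE;
            }
            break;
        case DONE:
            break;
        }
    }
    if (found)
        *value = n;
    return found;
}
```

    Only one character is ever pushed back between reads, which is the
    amount of pushback that ungetc guarantees.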

    Kind regards.
    tmp123, Jan 1, 2006
    #15
  16. Malcolm Guest

    "lovecreatesbeauty" <> wrote
    >
    > Thank you, then do I also need to treat the EOF character specially?
    >

    Usually, yes.
    You should always write programs with the assumption that any sequence of
    input is possible. This includes valid input being suddenly truncated, input
    designed by someone with access to your source code deliberately intended to
    make the program malfunction, input produced by machines rather than humans,
    and so on.
    Malcolm, Jan 1, 2006
    #16
  17. Chuck F. Guest

    Tim Rentsch wrote:
    > "Chuck F. " <> writes:
    >

    .... snip ...
    >>
    >> while (('\n' != (ch = getchar())) && (EOF != ch)) continue;

    >
    > When there are two tests, as there are here, it seems better
    > (IMO) to write the assignment as the first part of a comma
    > expression:
    >
    > while( ch = getchar(), ch != '\n' && ch != EOF ) ...
    >
    > Personally I find it easier to read expressions that have less
    > visual clutter. Also I think it helps to get the assignment
    > right out front (left out front? fuggedaboudit) where it's
    > immediately obvious that an assignment is happening.


    A further option, avoiding the confusing (to a neophyte) comma
    operator, and the evaluated assignment, is:

    do {
    ch = getchar();
    } while (('\n' != ch) && (EOF != ch));

    Avoiding all the Cisms that are not found in other languages.

    Chuck F., Jan 1, 2006
    #17
  18. Jordan Abel Guest

    On 2006-01-01, Richard Heathfield <> wrote:
    > lovecreatesbeauty said:
    >
    >> Thank you, then do I also need to treat the EOF character specially?

    >
    > If getchar() yields EOF, it means you won't be getting any more data from
    > this stream, no matter how much you try, so if you haven't got enough data
    > to complete your task you might as well emit an error message and quit.


    That depends. Some implementations have types of streams for which you
    can clearerr() and get more data, even when it was a "genuine" EOF.

    For example, on tty devices on UNIX-derived systems, the user types a
    particular character once at the beginning of the line [or twice
    elsewhere] to trigger end-of-file status. getchar() will continue to
    yield EOF until the end-of-file indicator is cleared, but can yield more
    data after it has been cleared.
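
    A minimal sketch of that idea, using a hypothetical retry_after_eof
    helper (the name is an assumption). Whether the retry yields data
    depends entirely on the kind of stream, so the result still has to
    be checked:

```c
#include <stdio.h>

/* Hypothetical helper: try to read one more character after an
 * end-of-file condition. clearerr() resets the stream's end-of-file
 * and error indicators, so a stream that can deliver more data
 * (such as a terminal) gets the chance to do so.
 * Returns the character read, or EOF if there is still nothing. */
int retry_after_eof(FILE *in)
{
    clearerr(in);
    return getc(in);
}
```

    On an ordinary file that has been read to its end, the retry simply
    returns EOF again and the end-of-file indicator is set once more.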
    Jordan Abel, Jan 1, 2006
    #18
