Why does my simple while loop printf 2x?

Discussion in 'C Programming' started by DFS, May 15, 2014.

  1. DFS

    BartC Guest

    That's true. But it's another argument against using character-at-a-time
    input, as this variation of an off-by-one bug can creep in.
    That's fine, but reads a little odd because the error message code appears
    before the lines that read any input.

    But switch to a line-based version:

    int confirm(void) {
    char str[5];
    int c;

    while (1) {
    printf("Enter Y or N (not case-sensitive): ");
    if (fgets(str,sizeof(str),stdin)) {
    if ((c=='Y' || c=='N') && str[1]=='\n') break;
    return c;

    And the sequencing is more intuitive.

    But notice I've made the line buffer very small, this makes it easy to see
    what goes long with long lines (I printed out the str[] contents, as a byte
    array after each fgets). Effectively, a long line is broken up and presented
    as a series of shorter lines, none of which end with \n, only the last line.

    This needs dealing with, but now however it's easy to change the call to
    fgets() with a call to your own line-reading function, eg. mygets():

    * This keeps the untidy newline and EOF logic outside of your main routine.

    * If there are other functions that need short input, they can use the same
    mygets() function; they don't need to worry about newline and EOF.

    * mygets() can use a variety of ways to deal with long lines; one way is, if
    the first line doesn't end with '\n', to keep requesting more input, and to
    skip that extra data (and if necessary, to signal that to the caller by some
    error character). (And of course it will throw in whatever EOF checks are
    needed. The writer of confirm() doesn't need to care anymore; it's the
    headache of mygets())

    * By skipping the rest of the superfluously long lines, it is now behaving
    in a similar way to the code that repeatedly calls getchar() looking for the
    end of a line (in fact, mygets() could do the same!).

    * Now that confirm() is based on a line buffer, it is easy to modify it to
    also allow, for example, 'yes' and 'no' input. (For that purpose, it would
    be expeditious to remove the untidy '\n' character from the line buffer, and
    just have a normal string.)

    Example (returns 1 for success, 0 for error):

    int mygets(char* dest,int length,FILE* f) {
    int n,c;

    if (fgets(dest,length,f)==NULL)
    return 0;
    n=strlen(dest); /* (will n always be>=1? */
    if (dest[n-1]!='\n') {
    while (c!=EOF && c!='\n');
    return 0;
    dest[n-1]=0; /* change '\n' to normal string ending */
    return 1;
    BartC, May 20, 2014
    1. Advertisements

  2. In this particular case, none. More generally, it's just a bad habit.
    99+% of the time, there is no good reason to write your own declarations
    of library functions rather than using the proper #include directive.

    Writing the declarations yourself risks getting them wrong, with no
    guarantee that the compiler will warn you about your mistake.

    (You'll find declarations of standard library functions in very old code
    that was written before the C standard required all implementations to
    provide proper headers with proper declarations.)
    Keith Thompson, May 20, 2014
    1. Advertisements

  3. Check your metaphors. An independent entity -- a bug -- that seems to
    have agency (it can creep places) has got into your code. That's not
    how I think about code and programming. Some code is hard to write
    without error, but this is not one of those cases.
    To you. To me, it reads oddly because the headline is "never stop doing
    this" and the "this" includes printing error! How odd. Oh no, you
    lied, you do the thing forever except when some conditions are met. It
    also reads oddly because it has a magic number in it. Why 5? Would 6
    do? What about 3 or 2?

    But the biggest problem with the code (as you half note below) is that
    it leaves the input in a peculiar unknown state. If the next input is
    supposed to be another confirmation, what should the caller do? They
    can't just call confirm again because you might get a Y or a N that the
    user did not intend.
    Whew. That's the simpler version, yes? Simpler and less error prone
    than my 9 lines of code?
    Another of those pesky bugs has inserted itself into your code. You are
    making my case for me. (This is a bit of guess -- maybe this function
    is not supposed to handle all the input it might come across in the
    wild, or all the parameter values that callers might think of using.
    You don't say.)
    Ben Bacarisse, May 20, 2014
  4. DFS

    BartC Guest

    That's the specification: keep reading lines of input, and printing an error
    message, until such time as Y or N is entered:

    show prompt
    read a line
    is it Y or N? If so we're done so quit the loop
    if not report error and repeat

    which is pretty much what I've written. (And as written; it will keep
    demanding the input it wants even after EOF is encountered.)
    The small number was to help highlight problems of longer lines. It also
    saves a lot of typing when testing, if the limit is 5 characters rather

    (And why 5? It was supposed to be arbitrary, in fact it was chosen to allow
    ('y','e','s','\n',0) to be entered.)
    Deliberately, to highlight the problem, and also debunk the idea that
    dealing with massive lines of input will use up all the memory unless you
    choose a buffer-less method.

    If you mean the extra work of creating an interface to fgets, then that is
    work then is generally useful and not just for this one confirm function.
    You mean, input which is on the last line of a file but which which doesn't
    end in a newline? Yes, that would need extra logic to allow (and ensure all
    lines are consistently well-formed; the calling function just wants a
    string, not something that sometimes ends with '\n' and sometimes doesn't.).
    Don't know if requiring that a line ends in a newline is an actual bug
    though; more of a requirement.

    But I don't know what other tricks C, its runtime, or the Unix-like systems
    that seem to be inextricably linked to it, have up their sleeves. All the
    more reason to isolate that part, together with any bugs, inside its own

    (I don't know what you mean about parameter values. For this example, I've
    chosen the same ones that fgets() uses (with a different return value).)

    You don't seem to grasp that I'm trying to separate the logic of a function
    such as confirm(), from the intricacies of i/o, nor that that might be a
    useful to do.
    BartC, May 21, 2014
  5. Yes, the specification gives a termination condition. I have to hunt
    for that. It's not even a simple condition because it's guarded by a
    non-null return from fgets and that not trivial to state.
    This is much closer to an implementation than a specification.
    What, if anything, causes you to write a while or do loop with a
    condition other than 1? The specification of every loop can be turned
    into a loop like your pseudo code above, yet I bet you write an actual
    condition in some cases.
    As I think I've said before, I get the sense that you and I are engaged in
    quite different kinds of activity when we are doing the thing we both
    call programming. Anyway, your peculiar and private reason for choosing
    5 does not stop it being mysterious to readers of your code.
    You deliberately wrote unusable code to debunk an claim no one has made?
    If you do that again, please flag it with some comment so I won't
    bother reading it.
    My main worry is null-bytes in the input. It's quite possible that all
    the problems with this code are simply due to unstated requirements.
    Unfortunately, as I said before, that's the sort of code bad people are
    looking to exploit.
    Yes, writing a good line reading function is an excellent idea, but I
    don't think your original advice to the OP was "use your already
    correctly written line-reading function that uses a bounded buffer and
    discards excess characters on the line".
    No, you've used (almost) the same types but they have slightly different
    meanings in corner cases. For example, passing a length of 1 makes
    fgets write a null and return a pointer to the start of the array.
    Pointless, but harmless. Passing a length of 1 to mygets produces
    undefined behaviour.
    I grasp it. I don't think you've managed it, but I know what you are
    trying to do. With a good enough line-reading function, you will
    probably be able to write a line-based "confirm" function that is as
    simple as the obvious one. It will certainly be possible to do better
    tests (say for "yes" rather than just "y") but that was never called
    Ben Bacarisse, May 21, 2014
  6. DFS

    DFS Guest

    I don't get you.

    Isn't it better to reduce the complexity/#includes?

    So isn't it better to reduce the complexity/#includes?

    You got me...
    DFS, May 21, 2014
  7. DFS

    BartC Guest

    I've commented such a possibility in the code.
    I've managed it plenty of times. I nearly always use line-based input for
    this stuff, then use string processing on the result. (When I don't use
    line-based input, it's because I'm using unbuffered character or event-based
    input to create my own command-line handling.)

    As things are now, I can completely change the way I implement the reading
    of lines (or acquire them a different way, from a data-entry line on a
    graphics display for example), but the code that looks for a Y or N response
    will not change (neither will the code that requires a 1, 2 or 3 response,
    or an A, B or C one, or any of a myriad other simple requirements that,
    following your recommendation, would all need to contain expert code in
    dealing with Unix line- and file-ending conventions and who knows what
    It will be simpler in that it doesn't need to understand so much, as I
    mentioned above. It will just be presented with a string (and/or a status
    code, as I've suggested in my example). (Even splitting a function into one
    that requests and checks the input, and one that acquires it, without
    necessarily using line-buffers, would be better.)
    (BTW your confirm() function doesn't check that Y or N isn't followed by
    anything else, a slight deviation from the version I've shown. But you're in
    good company, as pressing Ctrl-Break from a Windows console app shows:

    Terminate batch job (Y/N)?

    And will accept anything starting with Y, N, y or n. No error is shown for
    any other input, it will just prompt again.

    The same for copying over a file when it shows: "Overwrite <filename>?
    (Yes/No/All): ". You're not an advisor to MS are you? It seems very slack
    for an OS purportedly comprising 50 MLoc.)
    BartC, May 21, 2014
  8. DFS

    James Kuyper Guest

    The same issue applies. The code as originally written (modulo the
    author's misunderstanding of how the terminating newline characters
    would be handled) would accept arbitrarily long sequences of invalid
    input, waiting for a valid response. Presumably, that's the desired
    behavior. Therefore, how many consecutive null characters should the
    program read before failing due to invalid input? I don't think it's
    consistent with the original program for there to be any upper limit.
    The only way to justify rejecting "/dev/zero" as input would be based
    upon knowing that the stream will never end - and I don't see any way
    for the program to determine that fact.
    I do - since any upper limit deemed "reasonable" by one person will be
    "unreasonable" for someone else. The only length I would consider
    unambiguously "reasonable" is one that corresponds to maximum amount of
    memory it can allocate, which is precisely what it appears to be doing.

    This would be chosen
    Such balking would NOT be sensible for a general purpose utility like
    "grep". "input which is not intended for it" is pretty much an empty
    set. More specialized utilities could impose more specific limits, but
    it would not be appropriate for grep to do so.
    Too much - it seems - you're compromising on things that don't need to
    be compromised.
    Adding the text editor is unnecessary complication. Using fgets() where
    getchar() is more than sufficient is also unnecessary complication.
    That's what I mean by "same reason".
    James Kuyper, May 21, 2014
  9. My confirm was offered as a correction to yours. I did not want to
    change the specification.

    This dig at my suggestion is misplaced. It contains an error you could
    have pointed out (not a serious one -- nothing undefined for example)
    but an error is an error none the less.

    Here is a corrected version:

    int confirm(void) {
    int reply = 'Y';
    do {
    int c;
    if (reply != 'Y')
    puts("Y or N must be the first character.");
    printf("Enter Y or N (not case-sensitive): ");
    reply = c = toupper(getchar());
    while (c != EOF && c != '\n') c = getchar();
    } while (reply != 'Y' && reply != 'N');
    return reply;

    Reply must be initialised to something that can't possibly be an
    erroneous entry, otherwise the extra message won't get printed when the
    actual input that the erroneous character. Obviously that's the kind of
    detail that it needs a comment, but since I'm explaining it here, I
    won't bother.

    Another method I've taken to using when extra output is needed on the
    second and subsequent executions of a loop is to add it as an explicit
    string, initialised to the empty string. (A good example is printing a
    list with separating commas). Here, it would look like this:

    int confirm(void) {
    int reply;
    const char *extra_prompt = "";
    do {
    int c;
    printf("%sEnter Y or N (not case-sensitive): ", extra_prompt);
    reply = c = toupper(getchar());
    while (c != EOF && c != '\n') c = getchar();
    extra_prompt = "The first character must be Y or N.\n";
    } while (reply != 'Y' && reply != 'N');
    return reply;
    Ben Bacarisse, May 21, 2014
  10. DFS

    BartC Guest

    I don't think that is true. There will be some people with extreme ideas,
    but common sense should apply.

    For example, a limit of 1 character per line is clearly too short, while
    1000000 characters is going to be longer than the longest line most people
    have ever encountered.
    That's completely crazy. So you would risk grep, or any similar app, hanging
    a system (when the input will just about fit into memory) just for the sake
    of a principle?

    (To be fair, I don't think it's clear exactly why grep failed when asked to
    process /dev/zero.)

    I wouldn't place much faith in its results either, if when fed a copy of a
    novel where line-endings have been accidentally removed, and asked on how
    many lines does 'the' occur, it only mentions the one line!
    If it is specially designed for input divided into lines, and not just any
    free-format text, then I don't think it is an empty set (as that might be
    less than useful as I suggested above).
    OK, so it's one of those programs which literally can be fed any old
    garbage. You give it program.exe by mistake instead of program.c, and the
    results might well make enough sense for you not to notice the error!
    BartC, May 21, 2014
  11. DFS

    James Kuyper Guest

    I'm not a big believer in "common sense". It's generally just used as an
    excuse for ignoring the possibility that other people might not think
    about something the same way that you do.
    While still being shorter than the longest line someone might want to
    pass through grep. There's no need to impose a shorter limit (and the
    documentation mentions no line length limit), so it shouldn't have one.

    The message I got when I ran it seemed pretty self-explanatory.
    That sounds like the correct answer to me. I'd be very upset if I
    deliberately removed the line endings, and then had grep either reject
    the file as invalid, or gave the number of matching lines as greater than 1.
    I just reviewed the POSIX specification, which mandates that the input
    be a text file. I'm used to using the Linux version, which is quite
    capable of handling binary files. However, by default, if the input is a
    binary file, it only reports whether there's a match, it doesn't
    actually display the matching lines. I use the -a option on the rare
    occasions when I need it to handle binary files the same way as text files.

    However, even a strictly POSIX version of grep has no limits on the
    number or spacing of newlines in it's input file.
    That seems unlikely; since the contents of program.exe are likely to be
    radically different from those of program.c. In any event, you could
    also give program.exe (to the Linux version, anyway) deliberately, and
    make sense of the output, even though it's not the same sense you would
    get performing the same search on program.c.
    James Kuyper, May 21, 2014
  12. DFS

    Geoff Guest

    True, it exists only to shut up the Microsoft compiler which emits:
    warning C4715: 'GetYesNo' : not all control paths return a value

    .... and is more properly written as:


    // if expecting n and get n or blank, return false
    if (yn.compare(0, 1, "n") == 0 && (str.compare(0, 1, "") == 0)
    || (str.compare(0, 1, "n") == 0))
    return false;

    return false;
    Geoff, May 25, 2014
  13. DFS

    Ike Naar Guest

    You probably misunderstood what I was trying to say, it's not only

    return false;

    that could be simplified to

    return false;

    but the whole fragment

    if ((yn.compare(0, 1, "n") == 0 && str.compare(0, 1, "") == 0) ||
    str.compare(0, 1, "n") == 0)
    return false;
    return false;

    which always returns false.
    also in this case the whole fragment can be simplified to

    return false;

    because, no matter what the conditional expression evaluates to,
    in the end false is returned.
    Ike Naar, May 25, 2014
  14. DFS

    Geoff Guest

    Yes, but the code breaks when the condition is deleted.

    The point is to make the function variadic in that it accepts "Y" or
    "N" as an argument.

    If the function is expecting "Y" and it gets "Y" or blank it returns
    true. If it is expecting "N" and gets "N" or blank it returns false.
    However, if it gets "Y" when it's told to expect "N", it will return
    true because the first "if" expression finds the "Y" before the second
    evaluates it.

    Perhaps you are missing the implied "else if" that begins the second

    How would you write it?
    Geoff, May 25, 2014
  15. DFS

    Stefan Ram Guest

    To make functions variadic, we use a parameter list ending
    return false;

    instead of the second »if«, if the behavior is to be retained.

    However, this make it impossible to distinguish an answer of
    »No« from an erroneous input, such as »A«.
    Stefan Ram, May 25, 2014
  16. How can it? The code has no effect -- no side effects and no
    consequences for the return value.
    That's not what variadic means. A variadic function is one that can be
    called with different numbers of arguments. In C, these have ... at the
    end of the function prototype (it's a bit more complex in C++).
    Your function returns true in only two cases: when

    yn.compare(0, 1, "y") == 0 && str.compare(0, 1, "")

    (that's when yes is the default and the user enters nothing) and when

    str.compare(0, 1, "y") == 0

    (that's when the user enters yes, and is means yes no matter what the
    default is). If neither of these conditions is met, the function must
    return false. No further tests are needed, no matter what value the yn
    parameter has. You could simply have written

    return str.compare(0, 1, "") == 0 && yn.compare(0, 1, "y") == 0 ||
    str.compare(0, 1, "y") == 0
    I know you are not asking me, and I've posted this before, but I like to
    repeat myself!

    bool getYesNo(char deflt)
    int c = std::cin.peek();
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return tolower(c == EOF || c == '\n' ? deflt : c) == 'y';

    The key point being to avoid unbounded storage use.
    Ben Bacarisse, May 26, 2014
  17. DFS

    Geoff Guest

    As it's shown in the function, return false.
    No, failure to input an explicit "y" in the default "n" case is reason
    to return false.
    No. It can only return a bool.
    No. It can only return a bool.
    Geoff, May 26, 2014
  18. DFS

    Martin Shobe Guest

    So the specification could be simplified to

    Return true if 'Y' typed or if ' ' typed and default is 'Y'.
    Return false otherwise.

    Anyway, if you replaced the second condition with "return false;", under
    what conditions would there be a change in the value returned or a
    side-effect of the function?

    Martin Shobe
    Martin Shobe, May 26, 2014
  19. DFS

    Geoff Guest

    No, this doesn't cover the case where the default response is supposed
    to be N or space. The point of the function is to set a bit according
    to query at the user prompt with the expectation that user will hit
    return to indicate the default answer. The program it's derived from
    asks a series of yes/no questions, some default to yes, some default
    to no. A response other than empty string or Y or N returns false.
    Only a Y answer can set it true in the case of a default no
    expectation and any other input results in the default case.
    I take your point and I like your solution even though it doesn't
    completely satisfy the requirements.
    Geoff, May 26, 2014
  20. DFS

    Geoff Guest

    Sorry, wrong word.
    It's not required to distinguish any other responses. Erroneous input
    is erroneous input and the function returns false in that case.
    Geoff, May 26, 2014
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.