[prefers fgets+sscanf over scanf]
> You're right in that there's very little difference between using scanf
> directly and fgets+sscanf (in sscanf you know the size of input, but you
> can still invoke undefined behavior if you're careless with your conversion
> specifiers as the content is not under your control). ...
This is not the only, or perhaps (depending on one's point of view)
even the major, difference. The other significant difference between
using scanf() directly, and using some input-reading function --
whether fgets() or some other function -- followed by sscanf(),
is that the latter separates what one might reasonably consider
fundamentally different tasks.
Specifically, fgets() (or ggets() or any of the variants designed
to avoid fgets()'s limitations) reads "raw", "uninterpreted" data.
(I use double quotes around "raw" and "uninterpreted" because
fgets() and company do in fact interpret data as "lines of text"
separated by newlines, which is one of the two fundamental file
formats required of all hosted C implementations. The other is
the "even raw-er" binary file, which you get with fopen's "b"
modifier -- "rb", "wb", and the like -- in the open-mode parameter.)
Compare this with the scanf() family, which includes directives
like %d -- "interpret an integer" -- and %f and %[ and so on.
It is certainly possible to "read and interpret" more or less
simultaneously. Indeed, one of the features of certain GUI
applications is that they can do this "per input event", and beep
or otherwise gripe at you the very instant you do something
inappropriate, such as attempting to enter the letter 'z' in a
numeric field. But ANSI/ISO C is too primitive for this -- in
portable yet interactive C, we have to just read a line at a time,
then do our best to make sense of it.
This is where scanf() goes wrong.
Suppose, for instance, you do this:
    n = scanf("%d", &intvar);
and the user enters 'zx81'. What does scanf() do with this, and
what would you *like* to have happen? A GUI might
reject the z, reject the x, and then accept the 81. The scanf
engine, on the other hand, sees the 'z' and rejects it but LEAVES
IT IN THE INPUT STREAM. It never gets as far as the 8.
If you put this scanf() call in a loop, the engine keeps rejecting
the same 'z' over and over again, never making any progress. The
program runs forever (or until externally interrupted).
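A minimal sketch of that jam, not taken from any particular program. The
helper name count_failed_scans() is hypothetical, and reading from a
FILE * with a capped number of tries (instead of looping on stdin) is an
assumption made only so that the example terminates:

```c
#include <stdio.h>

/* Hypothetical helper: apply "%d" to `in` up to `tries` times and
 * return how many attempts failed to convert anything.  The cap on
 * `tries` exists only so that this sketch terminates. */
int count_failed_scans(FILE *in, int tries)
{
    int failures = 0;
    int value;

    for (int i = 0; i < tries; i++) {
        /* On input such as "zx81", fscanf() returns 0 and leaves the
         * 'z' unread, so the next pass meets the same character. */
        if (fscanf(in, "%d", &value) != 1)
            failures++;
    }
    return failures;
}
```

Fed a stream that begins with "zx81", every attempt fails, and the 'z' is
still the next character to be read afterwards.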
If, on the other hand, you use fgets() (or ggets() etc.) first, so
as to read a "raw" line, *then* apply sscanf(), you not only can
detect the failure to convert the 'z', you also get the presumably-desired
effect of having entirely consumed the only interactive input item
portable C supports, i.e., the entire line. While this may be less
desirable than interactively catching the 'z' and 'x' as the user
pushes the keys, it does at least keep the program from getting
stuck in an infinite loop.
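The read-then-interpret split might be sketched as follows. The name
read_int_by_line() and the 256-byte buffer are assumptions for
illustration; lines longer than the buffer are not handled here:

```c
#include <stdio.h>

/* Hypothetical helper: read lines from `in` until one begins with an
 * integer.  Returns 1 and stores the value on success, or 0 on
 * end-of-file or read error.  Assumes each line fits in the buffer. */
int read_int_by_line(FILE *in, int *out)
{
    char line[256];

    while (fgets(line, sizeof line, in) != NULL) {
        /* The whole line has already been consumed, so a failed
         * conversion cannot jam the next attempt. */
        if (sscanf(line, "%d", out) == 1)
            return 1;
        /* An interactive program would complain and re-prompt here. */
    }
    return 0;
}
```

Given the lines "zx81" and then "42", the first line is read and
discarded, and the call returns 1 with 42 stored.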
> So, if you believe scanf is dangerous, then so is fgets+sscanf.
(But fgets() followed by sscanf() allows the insertion of limit
checking, and avoids infinite loops. It is possible to do both of
these with scanf(), but it is also clumsy to do so.)
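One form such limit checking could take is noticing when a line did not
fit in the buffer. This is a sketch under assumptions: the names
read_line() and enum line_status are hypothetical, and throwing away the
rest of an overlong line is only one possible policy:

```c
#include <stdio.h>
#include <string.h>

enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF };

/* Hypothetical helper: read one line into buf (size bytes, size >= 2)
 * and strip the newline.  If the line does not fit, the remainder is
 * discarded so that the next call starts on a fresh line. */
enum line_status read_line(FILE *in, char *buf, size_t size)
{
    size_t len;
    int c;

    if (fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return LINE_OK;
    }

    /* No newline was stored: the buffer filled up, or the input ended
     * without a final newline.  Look at the next character to tell. */
    c = getc(in);
    if (c == EOF || c == '\n')
        return LINE_OK;
    while (c != '\n' && c != EOF)
        c = getc(in);
    return LINE_TOO_LONG;
}
```

The caller can then decide whether an overlong line deserves a complaint
before any sscanf() conversion is attempted.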
> However, I believe it is
> considerably difficult to build your own alternatives that are safer
> (depending on the task you want to do of course). Rather, if one devotes
> some of the time spent building their own complex input routines to
> understanding the *scanf family, one could end up with a more robust and
> effective alternative.
Interactive user input is perhaps *the* most difficult thing to
do with computers, because it means the computer must work with
that most unpredictable of I/O devices, the human being.
> To OP: Can you elaborate on this? Specifically, what is the book
> complaining about as far as scanf's handling of end of lines?
Consider the "%d" example again, with a bit more prefix:
    printf("Please enter an integer: ");
    fflush(stdout);
    n = scanf("%d", &intvar);
Not only does this have the "scanf engine jams up on alphabetic
input" problem, it also has another series of problems relating
to whitespace. The "%d" directive does not *just* mean "convert
an int", it *also* means "skip (ignore) leading white space", and
to the scanf engine, all white space -- spaces, tabs, newlines,
formfeeds, vertical tabs: basically anything for which isspace()
returns nonzero -- is equivalent.
If the user simply presses the ENTER (or RETURN) key, the program
sits there impassively. The scanf() code has eaten the newline
and is still awaiting more input -- but the computer does not
issue a new prompt to the human saying "please, no blank lines;
I need an integer, a series of digits". This too is often not
what one wanted.
Worse, suppose the user enters "123,456" (comma and all). The
scanf() engine reads and converts the 123 and leaves the comma and
subsequent digits and newline in the input stream, for the next
input operation to find. If the user really does enter just an
integer, scanf() leaves the newline behind. Since ANSI/ISO C's
interactive input *is* a primitive line-at-a-time based model, this
is quite a disservice. Because most scanf directives (all but %c,
%[ and %n) imply "skip leading white space", it is impossible to
tell a priori just how many input lines (or "input events" as a
human operator might see them) scanf() has consumed.
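Once the line is in hand, the blank-line and "123,456" cases become easy
to tell apart. Here is a minimal sketch, in which parse_int_line() and
enum int_line are hypothetical names and the "%d %c" trick is just one
way to detect leftover text:

```c
#include <stdio.h>

enum int_line { INT_OK, INT_BLANK, INT_NOT_A_NUMBER, INT_TRAILING_JUNK };

/* Hypothetical helper: decide whether `line` holds exactly one
 * integer, optionally surrounded by white space. */
enum int_line parse_int_line(const char *line, int *out)
{
    char extra;
    int n = sscanf(line, "%d %c", out, &extra);

    if (n == EOF)
        return INT_BLANK;          /* nothing but white space */
    if (n == 0)
        return INT_NOT_A_NUMBER;   /* e.g. "zx81" */
    if (n == 2)
        return INT_TRAILING_JUNK;  /* e.g. "123,456": extra is ',' */
    return INT_OK;
}
```

The caller can re-prompt on anything but INT_OK, and whatever followed
the number went away with the line instead of lingering in the stream.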
One can attempt to work around the "trailing newline left in the
input stream", but almost invariably, C programmers' first attempts
to do so read something like this:
    n = scanf("%d\n", &intvar);
(though most leave off the "n =" as well!). This simply does not
work -- because once again, the scanf engine interprets "any white
space" as "ANY white space". The newline in the format directive
does *not* mean "eat the trailing newline", nor even "eat trailing
blanks if any followed by a trailing newline" (which is probably
what the programmer wants), but rather "eat all white space including
as many newlines as possible". This means that even if the user
enters an integer as directed, the computer *still* just sits there
impassively, waiting for more input lines. If the user presses
ENTER again, the computer continues waiting for input. Only when
the user enters something "not white space" -- such as "wake up
you stupid machine" -- does the scanf() call return! And, of
course, it leaves this "bad" input in the input stream, where it
immediately jams up the scanf engine on the next "%d" format.
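The difference between the two formats can be observed on a non-interactive
stream, where nothing blocks. In this sketch the helper
next_char_after_int() is hypothetical, and using a FILE * in place of
the keyboard is an assumption made so the effect can be inspected:

```c
#include <stdio.h>

/* Hypothetical helper: convert one int from `in`, using "%d\n" when
 * newline_in_format is nonzero and plain "%d" otherwise, then return
 * the next unread character (or EOF if the conversion failed). */
int next_char_after_int(FILE *in, int newline_in_format, int *out)
{
    int n = newline_in_format ? fscanf(in, "%d\n", out)
                              : fscanf(in, "%d", out);

    if (n != 1)
        return EOF;
    return getc(in);
}
```

On input consisting of "12", two empty lines, and then "wake up", the
plain "%d" call stops at the newline right after the 12, while the
"%d\n" call swallows all three newlines and stops only at the 'w'. At a
keyboard, that is the point at which the call would finally return.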
As long as the user sees (correctly) that ENTER is the way to push
input through to an interactive ANSI-C program -- that he gets to
edit any input until that point, and the ENTER key, well, *enters*
it, committing it to the program's input -- the scanf() function
remains badly misdesigned for interactive input. This is the
programming-language equivalent of mismatched impedances on radio
antennae, or plugging a 120 volt appliance into a 220 volt socket,
or any number of similar analogies: scanf() works in the wrong
units. The units the interactive program and its user exchange
are "input lines", but the units scanf() handles are "things that
match the next format directive", and there is no single directive
that *ever* means "an input line".
While fgets() is not perfect, it *does* get "an input line", so it
works far, far better for the average programmer and interactive
user. The pieces fit: in the usual case, the part that sticks out
of the user goes smoothly into the computer, and no blood is shed.
