(
http://www.thecoolkids.org/articles/editorials/onlinelang.php )
Each line can be parsed with (your data types may vary):
fscanf(fp, "%s; %i; %c\n", &var1, &var2, &var3);
No, %s is the wrong format; and "&var1" is almost certainly wrong
as well. %i is legal but may or may not be what is desired.
Lastly, the final newline in the scanf directive does not mean what
I suspect you think it means.
The "%s" directive tells the scanf family of functions to:
(a) read and consume any leading whitespace, including no
whitespace, then
(b) read and convert at least one character, and as many
characters as possible until the next whitespace. If
assignment is not suppressed, these will be stored through
the supplied pointer: calling that pointer p, the first character
goes to p[0], the next to p[1], then p[2], and so on; after the
last character written, scanf will add the string-terminator
marker '\0'.
If "var1" has type "array N of char", the value you want to supply
to the scanf family is the address of the *first element* of var1,
not the address of "var1" as a whole. Due to type conversions inside
the call, &var1 is fairly likely to work on most machines, despite
having undefined behavior in the C standard. More importantly,
though, if "var1" has type "array 100 of char" for instance:
char var1[100];
then even if you fix the call to read:
ret = fscanf(fp, "%s; %i; %c\n", &var1[0], &var2, &var3);
the input line "John; 23; a" will write {'J', 'o', 'h', 'n', ';'}
in sequence to var1[0] through var1[4], and set var1[5] to '\0'.
The OP asked to have var1[0] through var1[3] set to "John", without
the semicolon, so var1[4] should hold the '\0'.
(Of course, without a field-width, this fscanf() call is the next
Microsoft security hole waiting to be exploited, as well.)
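To see the stray semicolon for yourself, here is a minimal sketch
that uses sscanf() on a string rather than fscanf() on a file, so
no input file is needed. The function name is my own invention, and
I have added a field width so the demonstration itself cannot
overrun the buffer:

```c
#include <stdio.h>

/* Hypothetical helper: apply the (width-limited) "%s" format to one
   line of text.  The caller must supply room for at least 100 chars
   in name.  Returns whatever sscanf returns. */
int scan_with_percent_s(const char *line, char *name, int *num, char *code)
{
    /* %99s stops only at whitespace, so it swallows the ';' too.
       The literal ';' that follows in the format then fails to
       match the blank that comes next in the input. */
    return sscanf(line, "%99s; %i; %c", name, num, code);
}
```

Given the line "John; 23; a", this returns 1 rather than 3, and name
holds "John;" with the semicolon attached. The other two variables
are never assigned.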
Now, we can fix this by using scanf's %[ directive to scan for
"characters not including semicolon". Since %[ does *not* skip
leading whitespace, we must decide whether to do so ourselves.
If we choose to skip leading whitespace, and use a fieldwidth
that is correct for "char var1[100]", we could write:
ret = fscanf(fp, " %99[^;]; %i; %c\n", var1, &var2, &var3);
This solves the problem of converting the semicolon, but perhaps
adds a new one: malformed input lines that contain *no* semicolon
will be scanned, newline and all, into var1 until var1 fills up.
For instance, if the text in the input stream at the point of the
call begins with:
"  hello there\n\tthis is not proper, is it?\nBob;"
then the " " directive will skip the two initial blanks, but the
subsequent %[ directive will read and convert the next two entire
lines plus the word "Bob" off the third line.
Can this be fixed (assuming it is a problem)? Certainly: just
add newline to the characters excluded from the scanset, giving
" %99[^;\n]". But as you can see, things are getting complicated
already.
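As a rough sketch of where that leaves us, here is the scanset
version applied to a string with sscanf(). The function name is
hypothetical, and I have dropped the trailing "\n" from the format
since it adds nothing when scanning a single string:

```c
#include <stdio.h>

/* Hypothetical helper: scan "name; number; letter" out of text.
   name must have room for at least 100 chars.  Returns the number
   of fields converted, as reported by sscanf. */
int scan_record(const char *text, char *name, int *num, char *code)
{
    /* " "         skips leading whitespace (newlines included)
       %99[^;\n]   takes up to 99 chars, stopping at ';' or newline */
    return sscanf(text, " %99[^;\n]; %i; %c", name, num, code);
}
```

For "John; 23; a" this returns 3 with name set to "John". For
" hello there\nBob; 1; x" it returns 1 with name set to
"hello there": the scanset stops at the newline, and the literal ';'
in the format then fails to match it.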
A well-behaved application will, in my opinion, handle malformed
input files, and if possible, give a hint about fixing such files.
One way to do this is to deliver a message to stderr giving the
input file name and line number. How can you keep track of the
line number? You simply have to take some particular action every
time you cross a newline. The scanf family's whitespace-eating
directives, however, cause a problem: newlines *are* "whitespace",
and that leading " " in " %[..." will happily scan right over
dozens or hundreds of them, without alerting you. We could fix
that by removing the leading blank, of course, and demanding that
names like "John" and "Mary" appear left-aligned. But what about
the rest of the whitespace directives in the format? There are
four more in "; %i; %c\n", even if two of them are hard to see.
Where are they? Well, the two blanks are obviously whitespace
directives. There is also one hidden inside "%i": %i, %d, %f,
and indeed most of the %-directives include one. (The
exceptions are %[, %c, and %n.) Finally, the newline at the end
of the format is also a whitespace directive. To the scanf family,
there is NO DIFFERENCE AT ALL between " ", "\t", "\n", "\f", and
so on! (In directives, that is.)
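If that seems hard to believe, a small experiment may help. The
sketch below (function names are mine) scans two characters with
%c, which does no whitespace skipping of its own, so whatever gets
skipped between them is the work of the whitespace directive alone:

```c
#include <stdio.h>

/* Hypothetical helper: a newline in the format between two %c's. */
int scan_pair_newline(const char *text, char *a, char *b)
{
    return sscanf(text, "%c\n%c", a, b);
}

/* Hypothetical helper: the same thing with a blank in the format. */
int scan_pair_blank(const char *text, char *a, char *b)
{
    return sscanf(text, "%c %c", a, b);
}
```

Fed the text "x \t\n\n y", both versions return 2 and set a to 'x'
and b to 'y'. Fed "xy", with no whitespace at all, both still
return 2: a whitespace directive matches any amount of whitespace,
including none.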
Can all this also be fixed? Most of it can, at the expense of
complicating the code enormously:
ret = fscanf(fp, "%99[^;\n];%c%i;%c%c%c", var1, &blank1,
&var2, &blank2, &var3, &newline1);
/* now inspect "ret", and if *it* seems OK, then check up on
blank1, blank2, and newline1 to make sure they are the
expected two blanks and a newline */
The "%i" directive is still troublesome: it skips any leading
whitespace, newlines included, before converting. Also, note that %i
conversions are done as if via strtol() with 0 as a base, so that
numeric fields that begin with 0 or 0x are treated as octal or
hexadecimal, as in C. This is often not what one wants with
user input files (although sometimes it is).
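The difference is easy to demonstrate. This is only a sketch, with
function names of my own choosing, comparing %i against %d on the
same text:

```c
#include <stdio.h>

/* Hypothetical helper: convert with %i (base chosen by prefix,
   as strtol() does with base 0).  Returns -1 if nothing converts. */
int convert_with_i(const char *text)
{
    int n;
    return sscanf(text, "%i", &n) == 1 ? n : -1;
}

/* Hypothetical helper: convert with %d (always decimal). */
int convert_with_d(const char *text)
{
    int n;
    return sscanf(text, "%d", &n) == 1 ? n : -1;
}
```

With the text "010", %i yields 8 while %d yields 10. With "0x1F",
%i yields 31, while %d converts just the leading "0" and stops at
the 'x'.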
There is a much better way to do this job, and that is to read
a complete input line with fgets() or some substitute (such as
CBFalconer's ggets()), then pick it apart, perhaps even with
sscanf(). Each time you read one complete input line, you can
be sure you have read one complete input line -- no more, and
no less. That makes it much easier to print error messages like:
input.file, line 47: wrong format -- I can't understand the
text "this line has no semicolons, oops!"
with something like:
fprintf(stderr, "%s, line %d: wrong format -- I can't "
"understand the text \"%s\"\n", filename, linenumber, inputline);
(note that this assumes the terminating newline has been stripped
from the input line).
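Putting the pieces together, one possible shape for the whole job
is sketched below. Everything named here (struct record,
parse_line, read_records, the 256-byte line buffer) is my own
assumption rather than anything from the OP's code. I have used %d
instead of %i, and the sketch makes no attempt to cope with input
lines longer than the buffer:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout: name; number; single letter */
struct record {
    char name[100];
    int  number;
    char code;
};

/* Pick apart one line that has already been read.
   Returns 1 if the line is well-formed, 0 otherwise. */
int parse_line(const char *line, struct record *r)
{
    char extra;

    /* The final " %c" converts only if junk follows the last
       field, so a good line gives exactly 3 conversions. */
    return sscanf(line, " %99[^;\n]; %d; %c %c",
                  r->name, &r->number, &r->code, &extra) == 3;
}

/* Read fp one line at a time, complaining about each bad line.
   Returns the number of malformed lines seen. */
int read_records(FILE *fp, const char *filename)
{
    char inputline[256];   /* assumed long enough for any line */
    struct record r;
    int linenumber = 0, bad = 0;

    while (fgets(inputline, sizeof inputline, fp) != NULL) {
        linenumber++;
        inputline[strcspn(inputline, "\n")] = '\0';  /* strip newline */
        if (!parse_line(inputline, &r)) {
            fprintf(stderr, "%s, line %d: wrong format -- I can't "
                    "understand the text \"%s\"\n",
                    filename, linenumber, inputline);
            bad++;
        }
        /* a real program would do something with r here */
    }
    return bad;
}
```

Because each fgets() call consumes exactly one line (given a big
enough buffer), counting lines is just a matter of incrementing
linenumber, and sscanf()'s whitespace directives can no longer
wander off into the next line of the file.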