Why doesn't fscanf fail when it doesn't find ordinary characters (specified in format) in the input?



Disc Magnet

I wrote a simple C program to read the following CSV file:

a.csv:

1,2,3 4,5
10,20,30,40,50
100,200,300,400,500

Code:

#include <stdio.h>
#include <stdlib.h>

#define ROWS 3
#define COLS 5

int main(void)
{
    FILE *fp;
    int csv[ROWS][COLS];
    int i, j;

    if ((fp = fopen("a.csv", "r")) == NULL) {
        fprintf(stderr, "Error opening file.\n");
        exit(1);
    }

    /* Read ROWS x COLS integers; the format expects a comma after each. */
    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            fscanf(fp, "%d,", &csv[i][j]);
    fclose(fp);

    /* Print the table, one row per line. */
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++)
            printf("%d ", csv[i][j]);
        printf("\n");
    }
    return 0;
}

Note that the fscanf call expects every integer to be followed by a
comma, but the last integer on each line of the CSV file is not
followed by one. Even so, I get the correct output for all integers.

Could someone please explain why the fscanf() function doesn't fail
when it doesn't find a comma after the last integer of every line? I
would appreciate it if someone could explain, or point me to, the
precise rules involved in this.
 


Eric Sosman

Disc Magnet said:
Note, that the fscanf call is expecting an integer to be followed by a
comma everywhere. But the last integer in every line of the CSV file
is not followed by a comma.

Neither is the third number in the first line.

Disc Magnet said:
But, I still get the output perfectly well
for all integers.

Could someone please explain why the fscanf() function doesn't fail
when it doesn't find a comma after the last integer of every line? I
would appreciate if someone can explain or point me to the precise
rules involved in this.

Since you never, not even once, checked the value returned
by fscanf(), why are you so sure it didn't fail? Taking the
batteries out of your smoke detectors doesn't mean nothing's
burning.

However, in this case fscanf() would not have reported a
failure anyhow. The value it returns tells you how many values
it converted and assigned before it stopped, for whatever reason.
Since it sees and converts the integer successfully, and then
stops when it doesn't find a comma, it returns 1 to indicate that
it assigned one value. There's no way to find out whether it
then stopped after matching a comma, or because it failed to
match a comma. The reporting channel is "too narrow," if you
like.

Here's the specification from the Standard (7.19.6.2p16):

The fscanf function returns the value of the macro EOF if
an input failure occurs before any conversion. Otherwise,
the function returns the number of input items assigned,
which can be fewer than provided for, or even zero, in the
event of an early matching failure.

With this in mind, there's no way to detect a matching failure
that occurs *after* the last assignment. That's one of the
reasons scanf() and fscanf() are seldom used for "serious" input
where finer control and greater transparency are needed. More
often, the program reads an entire uninterpreted line into an
array of char, and then picks the line apart with sscanf() or
with things like strtol().

Note that if one of your input lines had one extra number,
the plain fscanf() loop would get "out of step" with the line
breaks thereafter ...
 

Ben Bacarisse

It does fail. It tries to match a ',' and stops scanning when it cannot,
but because the ',' comes at the end of the format, after the integer
has already been read, the failure is not visible to you. fscanf returns
1 whether or not the ',' matched.

You would have seen a problem if you tried to read in pairs:

scanf("%d,%d", &i1, &i2)

will stop after the first int and return 1 (rather than 2) if it sees
"42 43" in the input.

If you want to force fscanf to fail visibly (i.e. return a count of
matching items less than you'd expect) you can ask it to read and
return a comma:

char dummy;
if (fscanf(fp, "%d%1[,]", &csv[i][j], &dummy) != 2) break;

but, of course, that is less useful to you since it won't read your
input file.

If you need to check the CSV format as you read it, you will be forced
to read the file line by line and break it up using sscanf or maybe
strtok and strtod.

As for the precise rules, they should be in any good C text. If you
want it from the horse's mouth you can find a free PDF of the C
standard here:

http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
 


David Thompson

Ben Bacarisse wrote:
If you want to force fscanf to fail visibly (i.e. return a count of
matching items less than you'd expect) you can ask it to read and
return a comma:

char dummy;
if (fscanf(fp, "%d%1[,]", &csv[i][j], &dummy) != 2) break;

Nit: %N[chars] can store N chars (here 1) PLUS a NUL, so officially
you need char dummy[2]; and &dummy[0] or just dummy.

<snip rest>
 
