Storing values that are too large for that data type

Albert

Hello,

If I have some integer x written to a file, "file.txt", that exceeds
INT_MAX on the system, and my first call to fscanf (or any 'input'
function) reads:

fscanf("file.txt", "%d", &number);

where I've declared

int number;

above the fscanf call, will fscanf return an error?

TIA
Albert
 
CBFalconer

Albert said:
If I have some integer x written to a file, "file.txt", that exceeds
INT_MAX on the system, and my first call to fscanf (or any 'input'
function) reads:

fscanf("file.txt", "%d", &number);

where I've declared

int number;

above the fscanf call, will fscanf return an error?

It returns the number of conversions completed. If successful, the
above code will return 1. If not, it will return 0. However you
have to acquire the returned value in order to investigate it.
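
As a minimal sketch of what acquiring the returned value can look like, assuming the stream has already been opened with fopen (fscanf takes a FILE *, not a file name): the helper name read_one_int and the enum are made up for illustration. It only distinguishes success, a matching failure, and end of input; it does nothing about values too large for an int.

```c
#include <stdio.h>

/* Hypothetical status codes, not part of any library. */
enum read_status { READ_OK, READ_NO_MATCH, READ_EOF };

/* Hypothetical helper: tries to read one int from an open stream and
   reports what fscanf's return value says about the attempt. */
enum read_status read_one_int(FILE *fp, int *out)
{
    int rc = fscanf(fp, "%d", out);   /* rc is the number of conversions, or EOF */

    if (rc == 1)
        return READ_OK;               /* one item converted and stored */
    if (rc == EOF)
        return READ_EOF;              /* input ended or failed before any conversion */
    return READ_NO_MATCH;             /* rc == 0: the next input was not a number */
}
```

A caller would typically fopen the file, check the result against NULL, and then branch on the status returned here.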

Just read the standard's description of the function. The standard
is the item marked 'C99' below. Note that .bz2 files are
compressed with bzip2.

Some useful references about C:
<http://www.ungerhu.com/jxh/clc.welcome.txt>
<http://c-faq.com/> (C-faq)
<http://benpfaff.org/writings/clc/off-topic.html>
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> (C99)
<http://cbfalconer.home.att.net/download/n869_txt.bz2> (pre-C99)
<http://www.dinkumware.com/c99.aspx> (C-library)
<http://gcc.gnu.org/onlinedocs/> (GNU docs)
<http://clc-wiki.net/wiki/C_community:comp.lang.c:Introduction>
<http://clc-wiki.net/wiki/Introduction_to_comp.lang.c>
 
Albert

CBFalconer said:
It returns the number of conversions completed. If successful, the
above code will return 1. If not, it will return 0. However you
have to acquire the returned value in order to investigate it.

So there's no way of predicting the return value of fscanf in this
specific situation?
 
Ben Bacarisse

Albert said:
So there's no way of predicting the return value of fscanf in this
specific situation?

No, despite CBFalconer's first sentence, too large a number causes
undefined behaviour and all bets are off. Many versions of the scanf
family read the number, set the argument to INT_MAX and return 1. You
can't rely on this on all systems, and INT_MAX might have been the
actual input, so you can't conclude that there was an error just
because the argument is set to INT_MAX.

You can limit the input length and check for a non-digit afterwards:

fscanf(fp, "%9d%1[^0-9]", &num, after)

but to be portable you need macro magic to insert the right digit
after the first % or you can use %9ld since long ints are guaranteed
to hold all 9-digit integers. There is a can of worms here with
negative numbers.
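
A rough sketch of the %9ld variant, under some assumptions: the helper name read_small_int is invented, the stream is already open, and the caller is happy to have the character after the number consumed. The field width counts a leading sign, so at most 9 characters are converted and the result always fits in a long; the value is then range-checked against int.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper. Returns 1 and stores the value if the stream holds
   an integer of at most 9 characters (sign included) that fits in an int
   and is followed by a non-digit; returns 0 otherwise. */
int read_small_int(FILE *fp, int *out)
{
    long value;
    char after[2];   /* one character plus the terminating null */
    int rc = fscanf(fp, "%9ld%1[^0-9]", &value, after);

    if (rc != 2)
        return 0;    /* no number, a 10th digit followed, or input ended */
    if (value < INT_MIN || value > INT_MAX)
        return 0;    /* fits in long but not in int (e.g. 16-bit int) */
    *out = (int)value;
    return 1;
}
```

Note that a number sitting at the very end of the input with nothing after it is rejected by this sketch, since the second conversion has nothing to match.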

These are some of the reasons why many people prefer to read a line
and deal with it using the strto* functions.
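
For illustration, a sketch of the line-plus-strtol approach. The names parse_int and read_int_line and the 64-byte buffer are assumptions made for this example. strtol has defined behaviour on overflow: it returns LONG_MAX or LONG_MIN and sets errno to ERANGE, so an out-of-range value can be detected rather than guessed at.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text in line to an int.
   Returns 1 on success, 0 if there are no digits, the value does not
   fit in an int, or anything other than whitespace follows the number. */
int parse_int(const char *line, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line)
        return 0;                     /* no digits at all */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                     /* too large or too small for int */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;
    if (*end != '\0')
        return 0;                     /* trailing junk after the number */
    *out = (int)value;
    return 1;
}

/* Hypothetical helper: reads one line from an open stream and parses it.
   The buffer size is arbitrary; longer lines are simply rejected or split. */
int read_int_line(FILE *fp, int *out)
{
    char line[64];

    if (fgets(line, sizeof line, fp) == NULL)
        return 0;                     /* end of input or read error */
    return parse_int(line, out);
}
```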
 
Keith Thompson

Ben Bacarisse said:
You can limit the input length and check for a non-digit afterwards:

fscanf(fp, "%9d%1[^0-9]", &num, after)

but to be portable you need macro magic to insert the right digit
after the first % or you can use %9ld since long ints are guaranteed
to hold all 9-digit integers. There is a can of worms here with
negative numbers.

This is still tricky if you want to handle large numbers. For
example, suppose INT_MAX==32767, and let's ignore negative numbers for
now. Limiting the input to 5 digits allows inputs from 10000 (which
is ok) to 99999 (which causes undefined behavior).

You can use this method if the range of numbers you're willing to
accept is much smaller than the range of the type you're storing them
in.

It would have been nice if the *scanf() functions were required to
deal cleanly with numeric overflow.
 
CBFalconer

Jack said:
Actually, the language places no requirements at all on the return
value of fscanf(), or anything else that happens, once the program
invokes undefined behavior. Which it does.

Yes. I didn't notice that he wasn't giving it an opened FILE* to
read from. Again, read the function description.
 
Ben Bacarisse

CBFalconer said:
Yes. I didn't notice that he wasn't giving it an opened FILE* to
read from. Again, read the function description.

I don't know to which UB Jack Klein was referring, but I view the
problem with the first parameter as a side issue to the OP's question.
Reading an unrepresentable value causes UB, so your comment that "it
returns the number of conversions completed" is, shall we say,
unhelpful in answering this question.
 
CBFalconer

Keith said:
.... snip ...

This is still tricky if you want to handle large numbers. For
example, suppose INT_MAX==32767, and let's ignore negative numbers
for now. Limiting the input to 5 digits allows inputs from 10000
(which is ok) to 99999 (which causes undefined behavior).

You can use this method if the range of numbers you're willing to
accept is much smaller than the range of the type you're storing
them in.

It would have been nice if the *scanf() functions were required to
deal cleanly with numeric overflow.

That is one of the reasons I developed the readxwd and readxint
routines some time ago. They should have input to a long, but that
is easily changed. They read values directly from streams, detect
all errors, and preserve the termination char (via ungetc). They
also detect unsigned wrap-around. This eliminates any need for
buffers. They are restricted to decimal input.

I have published that code here previously.
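
To give a flavour of the approach without the original at hand, here is a rough sketch of the same idea. It is not the published readxwd/readxint code; the name read_unsigned and its return convention are invented for this example. It reads decimal digits straight from the stream, checks for wrap-around before each multiply-and-add, and pushes the terminating character back with ungetc.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical sketch, not the published routines.
   Reads an unsigned decimal value from fp into *out.
   Returns 1 on success, 0 if no digit was found or the value would
   exceed UINT_MAX. The first unconsumed character is pushed back. */
int read_unsigned(FILE *fp, unsigned *out)
{
    unsigned value = 0;
    int ch, ndigits = 0, overflow = 0;

    do {                                /* skip leading blanks and newlines */
        ch = getc(fp);
    } while (ch == ' ' || ch == '\t' || ch == '\n');

    while (ch != EOF && isdigit(ch)) {
        unsigned digit = (unsigned)(ch - '0');

        if (value > (UINT_MAX - digit) / 10)
            overflow = 1;               /* value * 10 + digit would wrap */
        else
            value = value * 10 + digit;
        ndigits++;
        ch = getc(fp);                  /* keep consuming the whole digit run */
    }
    if (ch != EOF)
        ungetc(ch, fp);                 /* preserve the terminating character */

    if (ndigits == 0 || overflow)
        return 0;
    *out = value;
    return 1;
}
```

No line buffer is needed, and on overflow the whole run of digits is still consumed, so the stream is left at a predictable position.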
 
CBFalconer

Keith said:
.... snip ...

[Second attempt to send this; sorry if it appears twice.]

You seem to also be using motzarella. It was especially evil
around 4pm EST today.
 
