strto[u]l and ERANGE

  • Thread starter Dave Vandervies

Dave Vandervies

strtol and strtoul are defined as setting errno to ERANGE if they get
input representing a number too large or too small to represent in a
long or unsigned long (n869 7.20.1.4#8).

Does this mean that this (from n869 7.5):
--------
[#3] The value of errno is zero at program startup, but is
never set to zero by any library function.159) The value of
errno may be set to nonzero by a library function call
whether or not there is an error, provided the use of errno
^^^^^^^^^^^^^^^^^^^^^^^^^^
is not documented in the description of the function in this
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
International Standard.
--------
prohibits strtol and friends from setting errno to ERANGE if their input
represents a value within the allowed range?

If they're allowed to set errno to ERANGE for any input, are there any
reasonable library implementations that actually do this, or would this
code still be safe to use?
--------
errno=0;
value=strtol(str,&endptr,10);
if(errno==ERANGE)
{
whine_and_handle_error("strtol set ERANGE");
}
if(endptr==str)
{
whine_and_handle_error("strtol couldn't find value");
}
/*Carry on...*/
--------
Or would I also need to check for value being one of the values that fgets
returned on out-of-range input? (In that case, is there any guaranteed
way to tell the difference between maximum-value and too-large?)


dave
 
Peter Nilsson

Dave said:
strtol and strtoul are defined as setting errno to ERANGE if they get
input representing a number too large or too small to represent in a
long or unsigned long (n869 7.20.1.4#8).

Does this mean that this (from n869 7.5):
--------
[#3] The value of errno is zero at program startup, but is
never set to zero by any library function.159) The value of
errno may be set to nonzero by a library function call
whether or not there is an error, provided the use of errno
^^^^^^^^^^^^^^^^^^^^^^^^^^
is not documented in the description of the function in this
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
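International Standard.
--------
prohibits strtol and friends from setting errno to ERANGE if their input
represents a value within the allowed range?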

Yes. Although there is a caveat with strtoul, in that most implementations
read the standard as accepting negative values. So input of "-1" would
produce ULONG_MAX without setting errno.

% type strtoul.c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* strtoul returns unsigned long, so store and print it as one */
    unsigned long u = strtoul("-1", 0, 10);
    printf("%lu [%d]\n", u, errno);
    return 0;
}

% gcc -ansi -pedantic strtoul.c

% a.exe
4294967295 [0]

%
 
Jack Klein

strtol and strtoul are defined as setting errno to ERANGE if they get
input representing a number too large or too small to represent in a
long or unsigned long (n869 7.20.1.4#8).

Does this mean that this (from n869 7.5):
--------
[#3] The value of errno is zero at program startup, but is
never set to zero by any library function.159) The value of
errno may be set to nonzero by a library function call
whether or not there is an error, provided the use of errno
^^^^^^^^^^^^^^^^^^^^^^^^^^
is not documented in the description of the function in this
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
International Standard.

Since the description of the function specifies under what conditions
it sets errno, it may not set it to any nonzero value under any other
condition. Since you read and highlighted the wording, and it is not
terribly complex or convoluted, why are you questioning it?
If they're allowed to set errno to ERANGE for any input, are there any
reasonable library implementations that actually do this, or would this
code still be safe to use?
--------
errno=0;
value=strtol(str,&endptr,10);
if(errno==ERANGE)
{
whine_and_handle_error("strtol set ERANGE");
}
if(endptr==str)
{
whine_and_handle_error("strtol couldn't find value");
}
/*Carry on...*/

I have not got the faintest idea of what you mean about values fgets()
returns on out-of-range input, since fgets() has no concept of in
range or out of range.

Perhaps you should post again and reread your post a few times before
sending.
 
S.Tobias

Peter Nilsson said:
Yes. Although there is a caveat with strtoul, in that most implementations
read the standard as accepting negative values. So input of "-1" would
produce ULONG_MAX without setting errno.

Is that correct? According to my reading of 7.20.1.4 (too long to quote)
in `strtoul("-1", 0, 10)', first "1" is converted
into a (mathematical) value `1'. Then, since there's
a "-", the value is negated (value `-1'). Then a check is made
if the value is representable (0..ULONG_MAX); it is clearly not,
so strtoul should return ULONG_MAX and set `errno' to ERANGE.
Have I got something wrong?
 
S.Tobias

Jack Klein said:
I have not got the faintest idea of what you mean about values fgets()
returns on out-of-range input, since fgets() has no concept of in
range or out of range.

I think he meant to ask whether one has to check strtol() for a return
value indicating an error, and then check `errno' (as with fgets(), where
you first have to compare the result against NULL, and *then* call
`ferror()' or `feof()'), or whether it's enough to check `errno'.
I think in the case of strtol() it's both correct and sufficient to look
at `errno' (for there aren't many possibilities for an error).
Is that the case for all errno-setting functions, i.e. do they *have*
*to* set `errno' to some non-zero value if an error occurred?
 
P.J. Plauger

Is that correct? According to my reading of 7.20.1.4 (too long to quote)
in `strtoul("-1", 0, 10)', first "1" is converted
into a (mathematical) value `1'. Then, since there's
a "-", the value is negated (value `-1'). Then a check is made
if the value is representable (0..ULONG_MAX); it is clearly not,
so strtoul should return ULONG_MAX and set `errno' to ERANGE.
Have I got something wrong?

Yep, the "it is clearly not" part. Negating an unsigned value
is well defined, since "unsigned" really means modulus arithmetic.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
Eric Sosman

P.J. Plauger said:
S.Tobias said:
[...] According to my reading of 7.20.1.4 (too long to quote)
in `strtoul("-1", 0, 10)', first "1" is converted
into a (mathematical) value `1'. Then, since there's
a "-", the value is negated (value `-1'). Then a check is made
if the value is representable (0..ULONG_MAX); it is clearly not,
so strtoul should return ULONG_MAX and set `errno' to ERANGE.
Have I got something wrong?

Yep, the "it is clearly not" part. Negating an unsigned value
is well defined, since "unsigned" really means modulus arithmetic.

Can we conclude that

strtoul("99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999",
NULL, 0)

... is similarly well-defined, and should not report an out-of-
range error?

(I'm not being facetious; I'm looking for information.
The Standard's text seems foggy about such matters.)
 
P.J. Plauger

P.J. Plauger said:
S.Tobias said:
[...] According to my reading of 7.20.1.4 (too long to quote)
in `strtoul("-1", 0, 10)', first "1" is converted
into a (mathematical) value `1'. Then, since there's
a "-", the value is negated (value `-1'). Then a check is made
if the value is representable (0..ULONG_MAX); it is clearly not,
so strtoul should return ULONG_MAX and set `errno' to ERANGE.
Have I got something wrong?

Yep, the "it is clearly not" part. Negating an unsigned value
is well defined, since "unsigned" really means modulus arithmetic.

Can we conclude that

strtoul("99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999",
NULL, 0)

... is similarly well-defined, and should not report an out-of-
range error?

(I'm not being facetious; I'm looking for information.
The Standard's text seems foggy about such matters.)

No. The converted value must be representable as a value
of the specified unsigned type. Negating a representable
value always yields a representable value; that was the
narrow point I was addressing.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
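
The distinction drawn here can be sketched in a few lines. This is only an illustration, not anything from the Standard or from the posts above: `strtoul_overflowed` is a hypothetical helper name, and base 10 is assumed. With "-1" the call yields ULONG_MAX and leaves errno alone; with a digit string whose magnitude exceeds ULONG_MAX it also yields ULONG_MAX but sets errno to ERANGE, so errno is the only thing that tells the two apart.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): converts s in base 10,
 * stores the result in *out, and returns 1 if strtoul reported a range
 * error, 0 otherwise. */
static int strtoul_overflowed(const char *s, unsigned long *out)
{
    errno = 0;                       /* library functions never clear errno */
    *out = strtoul(s, NULL, 10);
    return errno == ERANGE;
}
```

Calling it with "-1" and with a forty-digit string of nines should give the same value in `*out` (ULONG_MAX) but different return values.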
 
lawrence.jones

S.Tobias said:
Is that correct? According to my reading of 7.20.1.4 (too long to quote)
in `strtoul("-1", 0, 10)', first "1" is converted
into a (mathematical) value `1'. Then, since there's
a "-", the value is negated (value `-1'). Then a check is made
if the value is representable (0..ULONG_MAX); it is clearly not,
so strtoul should return ULONG_MAX and set `errno' to ERANGE.
Have I got something wrong?

Yes -- the negation is in the return type, so for strtoul it's unsigned
negation. Thus, -1 becomes ULONG_MAX, which is in range.

-Larry Jones

You're going to be pretty lonely in the nursing home. -- Calvin
 
CBFalconer

P.J. Plauger said:
Yep, the "it is clearly not" part. Negating an unsigned value
is well defined, since "unsigned" really means modulus arithmetic.

I have had long bitter arguments about this (and been outshouted).
If we are going to perform modular transformations on the value
returned by strtoul, we should do it for all values, which includes
values beyond ULONG_MAX. I still consider the only sane mechanism
is to return ERANGE for anything beyond the _principal_ value
range. After that I would be willing to go along with the modular
conversion.

My attitude is that I want to know if the user is getting something
he doesn't expect.
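
For callers who share that attitude, one possible approach is to screen the input before strtoul ever sees it. The sketch below is an assumption-laden illustration, not a standard facility: `parse_ulong_strict` is a hypothetical name, base 10 is assumed, and rejecting trailing characters is a policy choice made here, not something strtoul requires.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical strict wrapper around strtoul.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
static int parse_ulong_strict(const char *s, unsigned long *out)
{
    const char *p = s;
    char *end;
    unsigned long v;

    while (isspace((unsigned char)*p))  /* the whitespace strtoul would skip */
        p++;
    if (*p == '-')                      /* refuse the modular negation */
        return -1;

    errno = 0;
    v = strtoul(p, &end, 10);
    if (end == p)                       /* no digits were converted */
        return -1;
    if (errno == ERANGE)                /* magnitude above ULONG_MAX */
        return -1;
    if (*end != '\0')                   /* trailing junk: rejected by choice */
        return -1;

    *out = v;
    return 0;
}
```

With this wrapper "-1" is reported as an error instead of quietly becoming ULONG_MAX, while "+7" and " 7" are still accepted.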
 
Anonymous 7843

My attitude is that I want to know if the user is getting something
he doesn't expect.

An approach I used (a long time ago) was to do a second conversion
back into a string, then do a strcmp-like thing to see if the value
made the round-trip intact. This is pretty simple for integers,
but floating point is a bit more difficult.
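
A rough sketch of that round-trip idea for unsigned long follows. It is a guess at the technique, not the poster's actual code: `roundtrips_ulong` is a hypothetical name, and the comparison is a plain strcmp, so it only accepts canonical input (no sign, no leading zeros, no surrounding whitespace).

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s, converted with strtoul and
 * printed back with %lu, reproduces s exactly; 0 otherwise. */
static int roundtrips_ulong(const char *s)
{
    /* One decimal digit needs at least 3 bits, so bits/3 + 2 chars
     * is enough for every digit plus the terminating '\0'. */
    char buf[sizeof(unsigned long) * CHAR_BIT / 3 + 2];
    unsigned long v = strtoul(s, NULL, 10);

    sprintf(buf, "%lu", v);
    return strcmp(buf, s) == 0;
}
```

Both "-1" and an over-long digit string fail the comparison, because the text printed back is the decimal form of ULONG_MAX, not the original input.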
 
Robert Gamble

Anonymous said:
An approach I used (a long time ago) was to do a second conversion
back into a string, then do a strcmp-like thing to see if the value
made the round-trip intact.

Or you could just use sscanf:

(Range checking omitted)

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
    unsigned u = 0;

    if (argc < 2) {
        fprintf(stderr, "no value provided\n");
        exit(EXIT_FAILURE);
    }

    char *ptr = argv[1];

    if (sscanf(ptr, "%u", &u) == 1) {
        if (sscanf(ptr, "-%u", &u) == 1 && u)
            puts("value is negative");
        else
            printf("value is %u\n", u);
    } else
        puts("invalid input");
    return 0;
}

$ gcc -Wall -W -std=c99 -pedantic test_sscanf3.c -o test_sscanf3
$ ./test_sscanf3 "1"
value is 1
$ ./test_sscanf3 "-1"
value is negative
$ ./test_sscanf3 "-0"
value is 0

Anonymous said:
This is pretty simple for integers,
but floating point is a bit more difficult.

But since C doesn't have unsigned floating point types this isn't an
issue.

Robert Gamble
 
Peter Nilsson

Robert said:
Or you could just use sscanf:

The behaviour of the scanf family is undefined if the text being
converted is outside the range of the target type, even for unsigned
types. The strtoxxx functions on the other hand have a well defined
(if not always intuitively obvious) behaviour.
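
As an illustration of leaning on that well-defined behaviour, here is one way an unsigned int could be read through strtoul instead of sscanf. It is a sketch under stated assumptions: `parse_uint` is a hypothetical name, base 10 is assumed, and the narrowing check against UINT_MAX is added here because strtoul only knows about unsigned long.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 and stores the value in *out when s
 * starts with a number that fits in unsigned int, -1 otherwise. */
static int parse_uint(const char *s, unsigned int *out)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(s, &end, 10);
    if (end == s)                        /* nothing was converted */
        return -1;
    if (errno == ERANGE || v > UINT_MAX) /* too big for the target type */
        return -1;

    *out = (unsigned int)v;
    return 0;
}
```

Note that a leading minus sign is still handled by strtoul's modular negation here, so whether "-1" is accepted depends on whether unsigned long is wider than unsigned int.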
 
Jack Klein

Robert Gamble said:
Or you could just use sscanf:

[...]

Except that scanf() and its siblings, unlike strtoul() and its siblings, do
produce undefined behavior on appropriately invalid input.
 
CBFalconer

Anonymous said:
An approach I used (a long time ago) was to do a second conversion
back into a string, then do a strcmp-like thing to see if the value
made the round-trip intact. This is pretty simple for integers,
but floating point is a bit more difficult.

Here is code I wrote some time ago to handle input from streams,
and catch all overflows.

/* ------------------------------------------------- *
* File txtinput.c *
* ------------------------------------------------- */

#include <limits.h> /* xxxx_MAX, xxxx_MIN */
#include <ctype.h> /* isdigit, isblank, isspace */
#include <stdio.h> /* FILE, getc, ungetc */
#include "txtinput.h"

#define UCHAR unsigned char

/* These stream input routines are written so that simple
* conditionals can be used:
*
* if (readxint(&myint, stdin)) {
* do_error_recovery; normally_abort_to_somewhere;
* }
* else {
* do_normal_things; usually_much_longer_than_bad_case;
* }
*
* They allow overflow detection, and permit other routines to
* detect the character that terminated a numerical field. No
* string storage is required, thus there is no limitation on
* the length of input fields. For example, a number entered
* with a string of 1000 leading zeroes will not annoy these.
*
* The numerical input routines *NEVER* absorb a terminal '\n'.
* Thus a sequence such as:
*
* err = readxint(&myint, stdin);
* flushln(stdin);
*
* will always consume complete lines.
*
* They are also re-entrant, subject to the limitations of file
* systems. e.g interrupting readxint(v, stdin) operation with
* a call to readxwd(wd, stdin) would not be well defined, if
* the same stdin is being used for both calls. If ungetc is
* interruptible the run-time system is broken.
*/

/*--------------------------------------------------------------
* Skip all blanks on f. At completion getc(f) will return
* a non-blank character, which may be \n or EOF
*
* Skipblks returns the char that getc will next return, or EOF.
*/
int skipblks(FILE *f)
{
    int ch;

    do {
        ch = getc(f);
    } while ((' ' == ch) || ('\t' == ch));
    /* while (isblank((UCHAR)ch)); */ /* for C99 */
    return ungetc(ch, f);
} /* skipblks */

/*--------------------------------------------------------------
* Skip all whitespace on f, including \n, \f, \v, \r. At
* completion getc(f) will return a non-blank character, which
* may be EOF
*
* Skipwhite returns the char that getc will next return, or EOF.
*/
int skipwhite(FILE *f)
{
    int ch;

    do {
        ch = getc(f);
    } while (isspace((UCHAR)ch));
    return ungetc(ch, f);
} /* skipwhite */

/*--------------------------------------------------------------
* Read an unsigned value. Signal error for overflow or no
* valid number found. Returns true for error, false for no error.
*
* Skip all leading blanks on f. At completion getc(f) will
* return the character terminating the number, which may be \n
* or EOF among others. Barring EOF it will NOT be a digit. The
* combination of error and the following getc returning \n
* indicates that no numerical value was found on the line.
*
* If the user wants to skip all leading white space including
* \n, \f, \v, \r, he should first call "skipwhite(f);"
*
* Peculiarity: This specifically forbids a leading '+' or '-'.
* Peculiarity: This forbids overflow, unlike C unsigned usage.
* on overflow, UINT_MAX is returned.
*/
int readxwd(unsigned int *wd, FILE *f)
{
    unsigned int value, digit;
    int status;
    int ch;

#define UWARNLVL (UINT_MAX / 10U)
#define UWARNDIG (UINT_MAX - UWARNLVL * 10U)

    value = 0;  /* default */
    status = 1; /* default error */

    do {
        ch = getc(f);
    } while ((' ' == ch) || ('\t' == ch)); /* skipblanks */
    /* while (isblank((UCHAR)ch)); */ /* for C99 */

    if (!(EOF == ch)) {
        if (isdigit((UCHAR)ch)) /* digit, no error */
            status = 0;
        while (isdigit((UCHAR)ch)) {
            digit = (unsigned) (ch - '0');
            if ((value < UWARNLVL) ||
                ((UWARNLVL == value) && (UWARNDIG >= digit)))
                value = 10 * value + digit;
            else { /* overflow */
                status = 1;
                value = UINT_MAX;
            }
            ch = getc(f);
        } /* while (ch is a digit) */
    }
    *wd = value;
    ungetc(ch, f);
    return status;
} /* readxwd */

/*--------------------------------------------------------------
* Read a signed value. Signal error for overflow or no valid
* number found. Returns true for error, false for no error. On
* overflow either INT_MAX or INT_MIN is returned in *val.
*
* Skip all leading blanks on f. At completion getc(f) will
* return the character terminating the number, which may be \n
* or EOF among others. Barring EOF it will NOT be a digit. The
* combination of error and the following getc returning \n
* indicates that no numerical value was found on the line.
*
* If the user wants to skip all leading white space including
* \n, \f, \v, \r, he should first call "skipwhite(f);"
*
* Peculiarity: an isolated leading '+' or '-' NOT immediately
* followed by a digit will return error and a value of 0, when
* the next getc will return that following non-digit. This is
* caused by the single level ungetc available.
*/
int readxint(int *val, FILE *f)
{
    unsigned int value;
    int status, negative;
    int ch;

    *val = value = 0; /* default */
    status = 1;       /* default error */
    negative = 0;

    do {
        ch = getc(f);
    } while ((' ' == ch) || ('\t' == ch)); /* skipblanks */
    /* while (isblank((UCHAR)ch)); */ /* for C99 */

    if (!(EOF == ch)) {
        if (('+' == ch) || ('-' == ch)) {
            negative = ('-' == ch);
            ch = getc(f); /* absorb any sign */
        }

        if (isdigit((UCHAR)ch)) { /* digit, no error */
            ungetc(ch, f);
            status = readxwd(&value, f);
            ch = getc(f); /* This terminated readxwd */
        }

        if (negative && (value < UINT_MAX) &&
            ((value - 1) <= -(1 + INT_MIN))) *val = -value;
        else if (value <= INT_MAX) *val = value;
        else { /* overflow */
            status = 1;
            if (value) { /* braces make the else pairing explicit */
                if (negative) *val = INT_MIN;
                else *val = INT_MAX;
            }
        }
    }
    ungetc(ch, f);
    return status;
} /* readxint */

/*-----------------------------------------------------
* Flush input through an end-of-line marker inclusive.
*/
int flushln(FILE *f)
{
    int ch;

    do {
        ch = getc(f);
    } while (('\n' != ch) && (EOF != ch));
    return ch;
} /* flushln */

/* End of txtinput.c */
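
The overflow test in readxwd (compare against UINT_MAX / 10 before multiplying, and against the last digit of UINT_MAX when the two are equal) does not depend on streams. The following is a sketch of the same check applied to a string; `digits_to_uint` is a hypothetical name and is not part of txtinput.c, and unlike readxwd it does not skip leading blanks.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical string analogue of readxwd's inner loop.
 * Returns 0 on success, 1 on overflow or when s does not start with a
 * digit. On overflow *out is set to UINT_MAX. */
static int digits_to_uint(const char *s, unsigned int *out)
{
    const unsigned int warnlvl = UINT_MAX / 10U; /* largest safe prefix */
    const unsigned int warndig = UINT_MAX % 10U; /* last digit of UINT_MAX */
    unsigned int value = 0;
    int status = 1;                              /* default: error */

    if (isdigit((unsigned char)*s))
        status = 0;
    for (; isdigit((unsigned char)*s); s++) {
        unsigned int digit = (unsigned int)(*s - '0');

        /* value * 10 + digit fits only if value is below the prefix
         * limit, or equal to it with a small enough final digit. */
        if (value < warnlvl || (value == warnlvl && digit <= warndig))
            value = 10U * value + digit;
        else {                                   /* overflow: saturate */
            status = 1;
            value = UINT_MAX;
        }
    }
    *out = value;
    return status;
}
```

Because the check happens before the multiplication, the arithmetic never wraps, which is what lets the routine report overflow instead of returning a reduced value.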
 
S.Tobias

P.J. Plauger said:
Eric Sosman said:
P.J. Plauger said:
[...] According to my reading of 7.20.1.4 (too long to quote)
in `strtoul("-1", 0, 10)', first "1" is converted
into a (mathematical) value `1'. Then, since there's
a "-", the value is negated (value `-1'). Then a check is made
if the value is representable (0..ULONG_MAX); it is clearly not,
so strtoul should return ULONG_MAX and set `errno' to ERANGE.
Have I got something wrong?

Yep, the "it is clearly not" part. Negating an unsigned value
is well defined, since "unsigned" really means modulus arithmetic.

Can we conclude that

strtoul("99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999",
NULL, 0)

... is similarly well-defined, and should not report an out-of-
range error?

(I'm not being facetious; I'm looking for information.
The Standard's text seems foggy about such matters.)

No. The converted value must be representable as a value
of the specified unsigned type. Negating a representable
value always yields a representable value; that was the
narrow point I was addressing.

Although I agree this is a reasonable interpretation, I think
the Standard is not quite clear about it. It doesn't actually
say that the intermediate value is range-checked *before* negation
(or that it has a type; it only says "the value [...] is negated
(in the return type)").

I think one could argue that this:

strtoul("-99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999"
"99999999999999999999999999999999999999999",
NULL, 10)

should return some valid, non-error result.
 
Peter Nilsson

S.Tobias said:
I think he meant to ask whether one has to check strtol() for a return
value indicating an error, and then check `errno' (as with fgets(), where
you first have to compare the result against NULL, and *then* call
`ferror()' or `feof()'), or whether it's enough to check `errno'.
I think in the case of strtol() it's both correct and sufficient to look
at `errno' (for there aren't many possibilities for an error).

You still need to be careful with input like, e.g. "", "-", "x" where
errno will not be set.
Is that the case for all errno-setting functions, i.e. do they *have*
*to* set `errno' to some non-zero value if an error occurred?

Yes, but only for the documented conditions [error is a bit of a
misnomer, as the above samples should show.]

Only non 'errno-setting functions' have freedom to do otherwise...

7.5p3
The value of errno may be set to nonzero by a library function call
whether or not there is an error, provided the use of errno is not
documented in the description of the function in this International
Standard.
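
Putting the thread's points together, a check along the following lines seems to cover both the range case and inputs such as "", "-" and "x". It is a minimal sketch, not a definitive implementation: `parse_long` and the `parse_status` enumerators are hypothetical names, and base 10 is assumed.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result codes for the sketch below. */
enum parse_status {
    PARSE_OK,
    PARSE_NO_DIGITS,   /* "", "-", "x": nothing converted, errno untouched */
    PARSE_TOO_LARGE,   /* ERANGE with LONG_MAX returned */
    PARSE_TOO_SMALL    /* ERANGE with LONG_MIN returned */
};

/* Hypothetical wrapper around strtol; stores the value in *out on PARSE_OK. */
static enum parse_status parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;                 /* must be cleared by the caller of strtol */
    v = strtol(s, &end, 10);

    if (end == s)              /* no conversion was performed */
        return PARSE_NO_DIGITS;
    if (errno == ERANGE)       /* only set when the value was out of range */
        return v == LONG_MAX ? PARSE_TOO_LARGE : PARSE_TOO_SMALL;

    *out = v;
    return PARSE_OK;
}
```

Since strtol's use of errno is documented, input that spells out LONG_MAX exactly returns LONG_MAX with errno still zero, so checking errno is enough to tell maximum-value from too-large.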
 
