atoi return

Keith Thompson

Richard Heathfield said:
Keith Thompson said:
Richard Heathfield said:
CBFalconer said:

<snip>

atoi is always safe if you limit the input string to
length 4,

Rubbish. Even a length of 1 isn't guaranteed to be safe. The behaviour
of atoi("X") is undefined.
[...]

I'm not quite sure about that. Here's what the C99 standard says:

The functions atof, atoi, atol, and atoll need not affect the
value of the integer expression errno on an error. If the value
of the result cannot be represented, the behavior is undefined.

The value of the result of converting "X" to int representation cannot be
represented. Therefore, atoi("X") is undefined.

Perhaps. Or there is no "value of the result", and so the statement
that the behavior is undefined doesn't apply.

Yes. And atoi("X") is an error.

How so, if strtol("X", NULL, 10) isn't an error, but has well defined
behavior?

[...]

At the very least, I think it's reasonable to say that the standard's
wording is ambiguous, and allows for the interpretation that atoi("X")
must yield 0.
 
CBFalconer

Richard said:
CBFalconer said:



Rubbish. Even a length of 1 isn't guaranteed to be safe. The
behaviour of atoi("X") is undefined.

You are slow today. The "X" contains no digits, so atoi accurately
returns zero.

INT_MAX has nothing to do with it.

Still slow. Since INT_MAX must be at least 32767, a string with
length 4 cannot contain a value exceeding it. Thus no overflow can
occur.
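
A minimal sketch of that length-limit argument in C, for illustration only. The helper name small_atoi is hypothetical and not from this thread. It also rejects non-digit input, so the disputed atoi("X") case never reaches atoi at all; as a side effect, signed input such as "-12" is rejected too.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns 1 and stores the
   value in *out only if s consists of 1 to 4 decimal digits; returns 0
   and leaves *out untouched otherwise. */
int small_atoi(const char *s, int *out)
{
    size_t len = strlen(s);

    if (len == 0 || len > 4)
        return 0;

    for (size_t i = 0; i < len; i++) {
        /* Cast avoids undefined behaviour for negative char values. */
        if (!isdigit((unsigned char)s[i]))
            return 0;
    }

    /* The value is at most 9999, which fits in int because the standard
       requires INT_MAX to be at least 32767. */
    *out = atoi(s);
    return 1;
}
```

With this guard, "1234" converts to 1234, while "X", "12345", and the empty string are all rejected before atoi is called.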
 
Keith Thompson

Richard Heathfield said:
Keith Thompson said:

It sure looks like an error to me, since 'X' isn't a digit in base ten.

The question isn't whether it *looks* like an error, it's whether it
*is* an error within the meaning of the standard. strtol() indicates
an error by setting errno to ERANGE. strtol("X", NULL, 10) doesn't do
so.

The following program is strictly conforming:

#include <stdlib.h>
int main(void)
{
    return strtol("X", NULL, 10);
}
[...]

At the very least, I think it's reasonable to say that the standard's
wording is ambiguous, and allows for the interpretation that atoi("X")
must yield 0.

Would you bet the farm on that interpretation? I wouldn't.

Huh? I'm not betting the farm on anything. I accept that both
interpretations are reasonable.
 
Kenny McCormack

Richard Heathfield said:
Keith Thompson said:

It sure looks like an error to me, since 'X' isn't a digit in base ten.

The question isn't whether it *looks* like an error, it's whether it
*is* an error within the meaning of the standard. strtol() indicates
an error by setting errno to ERANGE. strtol("X", NULL, 10) doesn't do
so.

The following program is strictly conforming:

#include <stdlib.h>
int main(void)
{
    return strtol("X", NULL, 10);
}
[...]

At the very least, I think it's reasonable to say that the standard's
wording is ambiguous, and allows for the interpretation that atoi("X")
must yield 0.

Would you bet the farm on that interpretation? I wouldn't.

Huh? I'm not betting the farm on anything. I accept that both
interpretations are reasonable.

I think RH's point is that the above is not pedantically, religiously,
dogmatically CLC-correct *unless* the strtol() does, in fact, return 0
(and can be proven to always return 0 under all the silly CLC-required
conditions).
 
Tim Rentsch

Richard Heathfield said:
CBFalconer said:


Wrong. It would be accurate to return 0 for "0", and indeed for "0X", since
the "initial portion of the string" (as the Standard has it) can be
represented as an int. But for "X", the 'X' marks a non-convertible
character, so any "initial portion of the string" must precede it, but
there isn't any string portion preceding it. Since the behaviour is
undefined, atoi *may* return 0, but it is not required to do that.

This issue can be resolved by a reading of 7.20.1.4, viz.,

7.20.1.4 p 2 -

The strtol, strtoll, strtoul, and strtoull functions convert
the initial portion of the string pointed to by nptr to long
int, long long int, unsigned long int, and unsigned long long
int representation, respectively. First, they decompose the
input string into three parts: an initial, possibly empty,
sequence of white-space characters (as specified by the isspace
function), a subject sequence resembling an integer represented
in some radix determined by the value of base, and a final
string of one or more unrecognized characters, including the
terminating null character of the input string. Then, they
attempt to convert the subject sequence to an integer, and
return the result.

7.20.1.4 p 4 -

The subject sequence is defined as the longest initial subsequence
of the input string, starting with the first non-white-space
character, that is of the expected form. The subject sequence
contains no characters if the input string is empty or consists
entirely of white space, or if the first non-white-space character
is other than a sign or permissible letter or digit.

7.20.1.4 p 7 -

If the subject sequence is empty or does not have the expected form,
no conversion is performed; the value of nptr is stored in the object
pointed to by endptr, provided that endptr is not a null pointer.

7.20.1.4 p 8 -

The strtol, strtoll, strtoul, and strtoull functions return the
converted value, if any. If no conversion could be performed,
zero is returned. If the correct value is outside the range of
representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX,
ULONG_MAX, or ULLONG_MAX is returned (according to the return
type and sign of the value, if any), and the value of the macro
ERANGE is stored in errno.


7.20.1.4 p 2,4 make clear that the phrase "the initial portion"
includes the possibility that this portion could be of zero length.

By 7.20.1.4 p 4, the subject sequence is empty.

By 7.20.1.4 p 7, no conversion is performed.

By 7.20.1.4 p 8, the return value is zero.
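
The chain of reasoning above can be observed through endptr. This is a small sketch, not a definitive test of conformance; the helper name no_conversion is hypothetical and only wraps a standard strtol call.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the thread): stores the strtol result in
   *value and returns 1 if no characters of s were consumed, that is, if
   the subject sequence was empty and no conversion was performed. */
int no_conversion(const char *s, long *value)
{
    char *end;

    *value = strtol(s, &end, 10);

    /* Per 7.20.1.4 p 7, when no conversion is performed the original
       pointer is stored through endptr, so end compares equal to s. */
    return end == s;
}
```

For "X" the helper returns 1 and the stored value is 0. For "0X" it returns 0, because the leading "0" is a non-empty subject sequence, and the stored value is again 0.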
 
Keith Thompson

Richard Heathfield said:
Keith Thompson said:
The question isn't whether it *looks* like an error, it's whether it
*is* an error within the meaning of the standard.

Since the Standard does not define the term "error", we must look to the
normal English meaning of the word (unless a definition can be found
within a normative reference[1]). The Chambers dictionary defines the word
"error" as: "1 a mistake, inaccuracy, or misapprehension. 2 the state of
being mistaken. 3 the possible discrepancy between an estimate and an
actual value or amount."

Taking "X" to represent an integer value in base ten certainly qualifies
under the first two senses cited above.
[...]

strtol() isn't just used to determine the integer value of a string.
It can also be used to determine *whether* a given string represents
an integer value. If you give it a string representing a value
outside the range of long, that's treated as an error (errno is set to
ERANGE), because there's a mathematically correct value but it's not
able to return it. If you give it the string "X", it *correctly*
returns 0 (and sets *endptr appropriately if endptr!=NULL).

If strtol("X", NULL, 10) is an "error", why doesn't it set errno?

I'm not saying that my interpretation is the only possible one, just
that it's a reasonable interpretation. (I'm also saying that I think
it's the correct interpretation, but I'm less certain of that.) Would
you bet the farm that your interpretation is the *only* possible one?
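
One way to use strtol as a validator along those lines is sketched below. The enum and function names (parse_status, parse_long) are hypothetical, and treating trailing characters as a failure is an assumption of this sketch rather than anything the standard mandates.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum parse_status { PARSE_OK, PARSE_NO_DIGITS, PARSE_TRAILING, PARSE_RANGE };

/* Hypothetical helper: converts s to long and reports why it failed.
   *out is written only when PARSE_OK is returned. */
enum parse_status parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;                 /* strtol never clears errno itself */
    v = strtol(s, &end, 10);

    if (end == s)
        return PARSE_NO_DIGITS; /* empty subject sequence, e.g. "X" */
    if (errno == ERANGE)
        return PARSE_RANGE;     /* value outside the range of long */
    if (*end != '\0')
        return PARSE_TRAILING;  /* assumption: "12abc" is rejected */

    *out = v;
    return PARSE_OK;
}
```

Here "X" yields PARSE_NO_DIGITS without errno being consulted, which matches the point that strtol reports a range problem through errno but reports "nothing to convert" only through the return value and endptr.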
 
CBFalconer

Richard said:
CBFalconer said:
.... snip ...


The discussion is not about overflow.

But it is. An overflow is the only case in which atoi gets into
UB. Thus safe use requires only avoidance of such.
 
Keith Thompson

CBFalconer said:
But it is. An overflow is the only case in which atoi gets into
UB. Thus safe use requires only avoidance of such.

I think you're talking past each other. Richard's claim is that
atoi("X") also invokes undefined behavior. The standard says that
atoi(ptr) behaves like strtol(ptr, NULL, 10) "except for the behavior
on error". If passing "X" to strtol is considered an error, then
atoi("X") probably invokes undefined behavior. I don't agree with
him, but I think it's a plausible interpretation.
 
Tim Rentsch

Keith Thompson said:
I think you're talking past each other. Richard's claim is that
atoi("X") also invokes undefined behavior. The standard says that
atoi(ptr) behaves like strtol(ptr, NULL, 10) "except for the behavior
on error". If passing "X" to strtol is considered an error, then
atoi("X") probably invokes undefined behavior. I don't agree with
him, but I think it's a plausible interpretation.

Can you suggest any reasonable reading of 7.20.1.4 under which
strtol("X", NULL, 10) is an error? I understand that there are
captious arguments related either to the vagueness of the word
error or the wide latitude given to setting errno, but are there
any that pass the laugh test? The wording in 7.20.1.4 p 8 seems
pretty clear that the only errors are values outside the range
of the corresponding type (which doesn't hold in this case).
 
CBFalconer

Keith said:
I think you're talking past each other. Richard's claim is that
atoi("X") also invokes undefined behavior. The standard says that
atoi(ptr) behaves like strtol(ptr, NULL, 10) "except for the behavior
on error". If passing "X" to strtol is considered an error, then
atoi("X") probably invokes undefined behavior. I don't agree with
him, but I think it's a plausible interpretation.

However Tim Rentsch has recently published some standard
extractions (7.20.1.4) which show that that case is covered, and
returns the value 0 without causing an error.
 
Keith Thompson

Richard Heathfield said:
Keith Thompson said:



If fopen(NULL, NULL) is an "error", why doesn't it set errno? (And yes, I
know that it does, on some implementations. But it isn't required to, by
the Standard.)

Which means that setting or not setting errno isn't useful in
determining the authors' intent regarding what is or isn't an error.

The description of strtol(), on the other hand, specifies that it sets
errno in some circumstances. The fact that it doesn't do so in other
cases does, I think, imply that the authors didn't think those other
cases are errors.

No, *but* if I were about to bet the farm, I'd make sure I backed the most
conservative of all reasonable interpretations, to minimise the risk of
farm loss. The interpretation I've backed is the one that concludes "avoid
atoi like the plague and use the better-designed strtol instead". Would
you disagree with such a choice? Do *you* use atoi?

Under either interpretation, some uses of atoi invoke undefined
behavior. My opinion about what the standard means is based on my
best reading of the standard, not on any risk analysis. My opinion
about whether to use atoi (answer: no) *is* based on a risk analysis
that's largely unchanged by this discussion.

I see that I've used atoi in at least a couple of small programs from
several years ago; they were test programs that apply atoi to their
command-line arguments, and that I don't expect anyone else ever to
run. At the time, I wasn't aware of the UB issue; I think I assumed
atoi would always return either a correct result or 0, and I didn't
need to distinguish between them. If I were going to release them,
I'd use strtol.
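
For throwaway programs like those, a drop-in replacement might look like the sketch below. The name atoi_checked and its ok flag are hypothetical. It keeps the "correct result or 0" behaviour described above but reports out-of-range input instead of leaving it undefined.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical atoi-like wrapper (not from the thread). Returns the
   converted value, or 0 when nothing converts. Sets *ok to 0 only when
   the value does not fit in int, and to 1 otherwise. */
int atoi_checked(const char *s, int *ok)
{
    long v;

    errno = 0;
    v = strtol(s, NULL, 10);

    /* ERANGE covers overflow of long; the explicit comparison covers
       values that fit in long but not in int. */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
        *ok = 0;
        return 0;
    }

    *ok = 1;
    return (int)v;
}
```

A caller that needs to tell "0" apart from "X" would still want the endptr form of strtol; this wrapper deliberately does not distinguish them.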
 
Richard

CBFalconer said:
No, gets is never safe. Read the literature, or scan the c.l.c
archives. atoi is always safe if you limit the input string to
length 4, or even more according to the value of INT_MAX.

Total and utter nonsense as usual. gets is totally safe in a controlled
system where YOU marshal the data fed to it. And if there is a bug in
that there could just as easily be a bug in one of a million places
where you stuff malloc'ed memory for example.
 
Richard Bos

Richard said:
Total and utter nonsense as usual. gets is totally safe in a controlled
system where YOU marshal the data fed to it.

Please demonstrate how to do so _reliably_. Note: you may not assume
bondage gear, a lack of bouncy kittens, or perfect typing fingwers.

Richard
 
