string validation for int and long (strisint, strislong)

jake1138

Maybe this is a newbie thing and everyone already knows how to do this,
but I figured I'd post these functions anyway in case someone finds
them useful. I used Jack Klein's example (see link below) and made
functions out of it.

http://home.att.net/~jackklein/c/code/strtol.html

Here are functions that will validate a string to see if it represents
an integer or long, respectively:

#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * checks to see if string is an integer
 *
 * returns 1 if true, 0 if false
 */
int strisint(const char *str, size_t size)
{
    char *endptr;
    long longint;

    errno = 0;
    longint = strtol(str, &endptr, 10);
    if (errno == ERANGE || longint < INT_MIN || longint > INT_MAX
        || endptr == str || *endptr != '\0') {
        return 0;
    } else {
        return 1;
    }
}

/*
 * checks to see if string is a long integer
 *
 * returns 1 if true, 0 if false
 */
int strislong(const char *str, size_t size)
{
    char *endptr;
    long longint;

    errno = 0;
    longint = strtol(str, &endptr, 10);
    if (errno == ERANGE || endptr == str || *endptr != '\0') {
        return 0;
    } else {
        return 1;
    }
}
 
Peter Nilsson

jake1138 said:
Maybe this is a newbie thing and everyone already knows how to
do this, but I figured I'd post these functions anyway in case
someone finds them useful. ...

int strisint(const char *str, size_t size)
int strislong(const char *str, size_t size)

Both function identifiers are reserved for use as external
identifiers, since they begin with str and are followed
by a lowercase letter. [Capitalising one or more letters
won't help under C89's potential case insensitive linking.]

{
char *endptr;
long longint;

errno = 0;

You should allow the caller to preserve the prior errno if
strtol succeeds...

int errno_save = errno;
errno = 0;
...
if (errno) return 0;
errno_save = errno;
return endptr == str || *(endptr + strspn(endptr, " \t"));

longint = strtol(str, &endptr, 10);
if (errno == ERANGE || endptr == str || *endptr != '\0') {
return 0;
} else {
return 1;
}
}

One problem with these wrappers is that they don't return the
converted value, so the caller is likely going to have to call a
conversion routine like strtol _anyway_!

Testing for a decimal number, without conversion, can be done
with something like...

#include <ctype.h>

int is_number(const char *p)
{
int d = 0;
const unsigned char *up = (const unsigned char *) p;
while (isspace(*up)) up++;
if (*up == '+' || *up == '-') up++;
while (isdigit(*up++)) d = 1;
return d && *up == 0;
}
 
jake1138

Thanks for your comments. Read on...

Peter said:
jake1138 said:
Maybe this is a newbie thing and everyone already knows how to
do this, but I figured I'd post these functions anyway in case
someone finds them useful. ...

int strisint(const char *str, size_t size)
int strislong(const char *str, size_t size)

Both function identifiers are reserved for use as external
identifiers, since they begin with str and are followed
by a lowercase letter. [Capitalising one or more letters
won't help under C89's potential case insensitive linking.]

I've never heard that before (but I'm fairly new at C). In fact, I've
never read much at all about the rules of usage with the library
functions. Perhaps you could point me to documentation?

You should allow the caller to preserve the prior errno if
strtol succeeds...

int errno_save = errno;
errno = 0;
...
if (errno) return 0;
errno_save = errno;
return endptr == str || *(endptr + strspn(endptr, " \t"));

Did you mean "errno = errno_save;" on line 5? I'm not sure what that
buys you since I believe any function can potentially change the value
of errno and thus you should not rely on it being preserved between
function calls. It seems to me if the caller wants the value of errno,
they would be expected to store it before calling any given function.
Am I missing something?

One problem with these wrappers is that they don't return the
converted value, so the caller is likely going to have to call a
conversion routine like strtol _anyway_!

That is by design. I have these validation functions for several
reasons:
1) I can log an error and exit when I detect invalid input data.
2) I can use atoi (which doesn't detect errors).
3) I can write simple code: read -> validate -> convert -> store

Testing for a decimal number, without conversion, can be done
with something like...

#include <ctype.h>

int is_number(const char *p)
{
int d = 0;
const unsigned char *up = (const unsigned char *) p;
while (isspace(*up)) up++;
if (*up == '+' || *up == '-') up++;
while (isdigit(*up++)) d = 1;
return d && *up == 0;
}

I don't see how this handles the decimal in a decimal number. If you
pass in "3.14", it will fail. If you meant an integer number, then
this works except it doesn't handle invalid sizes. I guess the caller
could check against INT_MIN and INT_MAX, but I'd rather that be in the
validation routine.
 
Peter Nilsson

jake1138 said:
Peter said:
jake1138 said:
Maybe this is a newbie thing and everyone already knows how to
do this, but I figured I'd post these functions anyway in case
someone finds them useful. ...

int strisint(const char *str, size_t size)
int strislong(const char *str, size_t size)

Both function identifiers are reserved for use as external
identifiers, since they begin with str and are followed
by a lowercase letter. [Capitalising one or more letters
won't help under C89's potential case insensitive linking.]

I've never heard that before (but I'm fairly new at C). In fact,
I've never read much at all about the rules of usage with the
library functions. Perhaps you could point me to documentation?

The standards, or even just the public drafts. N869 is available
for public reading. The relevant section is 7.26 Future library
directions.

Did you mean "errno = errno_save;" on line 5?

Yes. Thanks.

I'm not sure what that buys you since I believe any function can
potentially change the value of errno and thus you should not rely
on it being preserved between function calls. It seems to me if
the caller wants the value of errno, they would be expected to
store it before calling any given function. Am I missing
something?

Note that I said "if strtol succeeds..."

No standard library function is allowed to set errno to 0. If you
think about this, you'll realise that this allows the caller to
delay error detection. If the caller sets errno to zero, then rather
than having to check errno after every function, it can delay the
test until later, possibly culling multiple tests in the process.

Apart from general efficiency, it helps to make programs more
readable, since they are not cluttered with repeated tests.

I don't see how this handles the decimal in a decimal number. If
you pass in "3.14", it will fail.

As will strtol. Decimal is a number base name, not necessarily a
distinction between integer and floating point (which has decimal
_points_.) But then I could be wrong, according to whichever
literature you read. No matter...

If you meant an integer number, then
this works except it doesn't handle invalid sizes.

True.

I guess the caller could check against INT_MIN and INT_MAX, but
I'd rather that be in the validation routine.

I understand what you're doing, but realise that robustness would see
your processing routines checking for the same errors that your
validation suite is supposed to detect.

Reviewing the design of your (hypothetical) program, I would ask:
Why aren't you processing, or at least storing, the converted data
as you validate?
 
Chris Torek

No standard library function is allowed to set errno to 0. If you
think about this, you'll realise that this allows the caller to
delay error detection. If the caller sets errno to zero, then rather
than having to check errno after every function, it can delay the
test until later, possibly culling multiple tests in the process.

The premise here is correct (no standard library function can clear
errno), but the conclusion is not. Successful operations are
allowed to set errno to some nonzero value. Hence:

errno = 0;
do_some_work();
if (errno) ...

may misfire, thinking something went wrong when all went well. I
think this was a bad design decision (not that errno itself is
exactly wonderful :) ), but it is in the C standards, so we must
live with it.

As a practical matter, many (far too many) Unix-derived systems
actually do set errno to ENOTTY on the first successful I/O from
or to a (non-device) file. If you have ever had email returned
with something like:

Subject: cannot send mail to joe.typo@host: Not a typewriter

this is the reason. Someone did an "errno = 0; do_some_work();
if (errno)". In this case, the work involved looking up the
user (whose name has a typo and hence does not exist); along
the way, something did some I/O; this set errno to ENOTTY, and
strerror(ENOTTY) is "Not a typewriter". Well, of course Joe
is not a typewriter, but what has that to do with anything? :)
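
One way to avoid the misfire described above is to let the function's own return value decide whether anything went wrong, and to look at errno only after a failure has been reported. The sketch below is an illustration, not code from this thread: write_line is a hypothetical helper, and the use of errno in the failure branch assumes a POSIX-like library, since ISO C itself does not promise that fputs sets errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes s followed by a newline to fp.
 *
 * returns 1 on success, 0 on failure
 */
int write_line(FILE *fp, const char *s)
{
    if (fputs(s, fp) == EOF || fputc('\n', fp) == EOF) {
        /* The return value reported the failure; errno is only
         * consulted now, to describe it (assumes the library
         * sets errno here, as POSIX systems do). */
        fprintf(stderr, "write failed: %s\n", strerror(errno));
        return 0;
    }
    /* Success: errno is deliberately not examined, so a stray
     * ENOTTY left behind by the I/O layer cannot mislead us. */
    return 1;
}
```

With this shape, a successful write that happens to leave errno at ENOTTY is still treated as a success.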
 
Eric Sosman

jake1138 said:
Peter said:
jake1138 said:
[...]
int strisint(const char *str, size_t size)
int strislong(const char *str, size_t size)

Both function identifiers are reserved for use as external
identifiers, since they begin with str and are followed
by a lowercase letter. [Capitalising one or more letters
won't help under C89's potential case insensitive linking.]

I've never heard that before (but I'm fairly new at C). In fact, I've
never read much at all about the rules of usage with the library
functions. Perhaps you could point me to documentation?

A useful compilation of names to avoid can be found at

http://www.oakroadsystems.com/tech/c-predef.htm
 
Peter Nilsson

Chris said:
The premise here is correct (no standard library function can clear
errno), but the conclusion is not. Successful operations are
allowed to set errno to some nonzero value. Hence:

errno = 0;
do_some_work();
if (errno) ...

may misfire, thinking something went wrong when all went well. I
think this was a bad design decision (not that errno itself is
exactly wonderful :) ), but it is in the C standards, so we must
live with it.

Quite right, but where standard functions _are_ required to set
errno on certain conditions, the standards preclude such functions
from setting errno to other values, outside of those precise
conditions.

So it is certainly possible to perform bulk strtol calculations,
deferring error detection till later.
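
A minimal sketch of that deferred check, under some assumptions: convert_all is a hypothetical name, the inputs are taken to be syntactically valid already (only range errors are detected here), and the caller's previous errno is restored when nothing overflowed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: converts n base-10 strings into out[0..n-1].
 *
 * returns 1 if every value fit in a long, 0 if any was out of range
 */
int convert_all(const char *const *strs, size_t n, long *out)
{
    int saved_errno = errno;    /* remember the caller's errno */
    size_t i;

    errno = 0;
    for (i = 0; i < n; i++) {
        /* no errno test per call; strtol never resets errno to 0 */
        out[i] = strtol(strs[i], NULL, 10);
    }
    if (errno == ERANGE) {
        return 0;               /* at least one conversion overflowed */
    }
    errno = saved_errno;        /* no range error: restore prior errno */
    return 1;
}
```

The single test after the loop works only because strtol's use of errno is documented; it would not be safe around arbitrary library calls.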
 
jake1138

Peter said:
Note that I said "if strtol succeeds..."

No standard library function is allowed to set errno to 0. If you
think about this, you'll realise that this allows the caller to
delay error detection. If the caller sets errno to zero, then rather
than having to check errno after every function, it can delay the
test until later, possibly culling multiple tests in the process.

Apart from general efficiency, it helps to make programs more
readable, since they are not cluttered with repeated tests.

I see.

I understand what you're doing, but realise that robustness would see
your processing routines checking for the same errors that your
validation suite is supposed to detect.

Reviewing the design of your (hypothetical) program, I would ask:
Why aren't you processing, or at least storing, the converted data
as you validate?

Because I'm stupid. :) No, I just didn't think about it that way at
first. I realize now it makes more sense to do both with one routine.
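
For what it's worth, a combined routine might look something like the sketch below. It is only an illustration of the idea discussed in this thread: parse_int is a hypothetical name (chosen to stay clear of the reserved str prefix), and restoring the caller's errno except on a range error is one possible policy, not the only reasonable one.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: validates str and, if it is a base-10
 * integer that fits in an int, stores the value in *out.
 *
 * returns 1 on success, 0 on failure (*out is left untouched)
 */
int parse_int(const char *str, int *out)
{
    char *endptr;
    long value;
    int saved_errno = errno;    /* remember the caller's errno */

    errno = 0;
    value = strtol(str, &endptr, 10);
    if (errno == ERANGE) {
        return 0;               /* leave ERANGE visible to the caller */
    }
    errno = saved_errno;        /* otherwise restore the prior errno */

    if (endptr == str || *endptr != '\0'
        || value < INT_MIN || value > INT_MAX) {
        return 0;               /* no digits, trailing junk, or too big */
    }
    *out = (int)value;
    return 1;
}
```

The caller then gets validation and conversion from a single call, for example: if (!parse_int(field, &n)) { log the error and exit }.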
 
Dave Thompson

jake1138 said:
Maybe this is a newbie thing and everyone already knows how to
do this, but I figured I'd post these functions anyway in case
someone finds them useful. ...

int strisint(const char *str, size_t size)
int strislong(const char *str, size_t size)

Both function identifiers are reserved for use as external
identifiers, since they begin with str and are followed
by a lowercase letter. [Capitalising one or more letters
won't help under C89's potential case insensitive linking.]

And so are names matching is[a-z]*, unfortunately.

That said, C89 is officially obsolete, and I haven't heard of any
linkers with the case-insensitive or 6-character problems for a long
time now.

Testing for a decimal number, without conversion, can be done
with something like...

#include <ctype.h>

int is_number(const char *p)
{
int d = 0;
const unsigned char *up = (const unsigned char *) p;
while (isspace(*up)) up++;
if (*up == '+' || *up == '-') up++;
while (isdigit(*up++)) d = 1;
return d && *up == 0;

Needs to be *--up or up[-1].

But this doesn't check that the value is _in range_ for int, or long,
or whatever, as the OP's versions did, and as you often need or want.
And it can't be fixed to do so without doing at least most of the
conversion.
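
For reference, here is a sketch of the syntax-only check with the off-by-one addressed by advancing the pointer only past actual digits, which is an alternative to the *--up or up[-1] fix. The name is_number_fixed is hypothetical, and as noted above this still performs no range check.

```c
#include <ctype.h>

/*
 * Hypothetical corrected variant: syntax check only, no range check.
 *
 * returns 1 if p is optional whitespace, an optional sign, and one
 * or more decimal digits with nothing after them; 0 otherwise
 */
int is_number_fixed(const char *p)
{
    int d = 0;
    const unsigned char *up = (const unsigned char *) p;

    while (isspace(*up)) up++;
    if (*up == '+' || *up == '-') up++;
    while (isdigit(*up)) {      /* step only past actual digits */
        up++;
        d = 1;
    }
    return d && *up == '\0';    /* up now points at the first non-digit */
}
```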


- David.Thompson1 at worldnet.att.net
 
