Converting strings to int

Discussion in 'C Programming' started by allthecoolkidshaveone@gmail.com, May 24, 2007.

  1. Guest

    I want to convert a string representation of a number ("1234") to an
    int, with overflow and underflow checking. Essentially, I'm looking
    for a strtol() that converts int instead of long. The problem with
    strtol() is that a number that fits into a long might be too big for
    an int. sscanf() doesn't seem to do the over/underflow checking.
    atoi(), of course, doesn't do any checking. I've long thought it odd
    that there aren't strtoi() and friends for int and short types in the
    standard.

    Any suggestions?
    , May 24, 2007
    #1

  2. Ian Collins Guest

    wrote:
    > I want to convert a string representation of a number ("1234") to an
    > int, with overflow and underflow checking. Essentially, I'm looking
    > for a strtol() that converts int instead of long. The problem with
    > strtol() is that a number that fits into a long might be too big for
    > an int. sscanf() doesn't seem to do the over/underflow checking.
    > atoi(), of course, doesn't do any checking. I've long thought it odd
    > that there aren't strtoi() and friends for int and short types in the
    > standard.
    >
    > Any suggestions?
    >

    Use strtol() and check the result to see if it fits in an int.

    --
    Ian Collins.
    Ian Collins, May 24, 2007
    #2

  3. Guest

    On May 24, 12:22 am, wrote:
    > I want to convert a string representation of a number ("1234") to an
    > int, with overflow and underflow checking.


    Of course, 30 seconds later, I think to myself "Why not convert to a
    long and see if it's between INT_MIN and INT_MAX and if so return
    that value casted to an int?"
    , May 24, 2007
    #3
  4. wrote:
    > I want to convert a string representation of a number ("1234") to an
    > int, with overflow and underflow checking. Essentially, I'm looking
    > for a strtol() that converts int instead of long. The problem with
    > strtol() is that a number that fits into a long might be too big for
    > an int. sscanf() doesn't seem to do the over/underflow checking.
    > atoi(), of course, doesn't do any checking. I've long thought it odd
    > that there aren't strtoi() and friends for int and short types in the
    > standard.


    Check the long value against INT_MAX and INT_MIN.
    Then you have the value (if the conversion to long worked), even if out
    of range for an int, and the error checking you want.
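    A minimal sketch of that approach, assuming base 10. The name str_to_int and its 0/-1 return convention are hypothetical, chosen only for illustration; nothing like it exists in the standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: stores the converted value in *out and returns 0,
   or returns -1 if nothing was converted or the value does not fit in an int. */
int str_to_int(const char *s, int *out)
{
    char *end;
    long value;

    errno = 0;                      /* strtol reports range errors through errno */
    value = strtol(s, &end, 10);

    if (end == s)                   /* no digits were consumed */
        return -1;
    if (errno == ERANGE)            /* out of range even for a long */
        return -1;
    if (value < INT_MIN || value > INT_MAX)
        return -1;                  /* fits in a long but not in an int */

    *out = (int)value;              /* value is known to fit; the cast is optional */
    return 0;
}
```

    This sketch does not look at what follows the number, so "12abc" is accepted as 12. On a platform with a 32-bit int, "3000000000" should be rejected either by the ERANGE test or by the INT_MAX test, depending on the width of long.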
    Martin Ambuhl, May 24, 2007
    #4
  5. said:

    > On May 24, 12:22 am, wrote:
    >> I want to convert a string representation of a number ("1234") to an
    >> int, with overflow and underflow checking.

    >
    > Of course, 30 seconds later, I think to myself "Why not convert to a
    > long and see if it's between INT_MIN and INT_MAX and if so return
    > that value casted to an int?"


    If it is between those values, you don't need a cast. And if it isn't, a
    cast won't do any good anyway.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, May 24, 2007
    #5
  6. Richard Heathfield <> writes:
    > said:
    >> On May 24, 12:22 am, wrote:
    >>> I want to convert a string representation of a number ("1234") to an
    >>> int, with overflow and underflow checking.

    >>
    >> Of course, 30 seconds later, I think to myself "Why not convert to a
    >> long and see if it's between INT_MIN and INT_MAX and if so return
    >> that value casted to an int?"

    >
    > If it is between those values, you don't need a cast. And if it isn't, a
    > cast won't do any good anyway.


    But if you want to store the result in an int, you *will* need a
    conversion. This conversion will be done implicitly when you assign
    the value.

    A lot of people aren't aware that the term "cast" refers *only* to the
    explicit cast operator, using a type name in parentheses.
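    The distinction can be illustrated with two small functions. The function names are made up for this example. Both perform the same long-to-int conversion; only the second one uses a cast.

```c
/* Implicit conversion: the long is converted to int by the initialization
   itself. No cast operator appears anywhere. */
int store_implicitly(long value)
{
    int result = value;
    return result;
}

/* Explicit conversion: the cast operator (int) requests the same conversion. */
int store_with_cast(long value)
{
    return (int)value;
}
```

    For any value that fits in an int, both functions return the same result. The cast only makes the conversion visible in the source.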

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, May 24, 2007
    #6
  7. Keith Thompson said:

    > Richard Heathfield <> writes:
    >> said:
    >>>
    >>> [...] "Why not convert to a
    >>> long and see if it's between INT_MIN and INT_MAX and if so return
    >>> that value casted to an int?"

    >>
    >> If it is between those values, you don't need a cast. And if it
    >> isn't, a cast won't do any good anyway.

    >
    > But if you want to store the result in an int, you *will* need a
    > conversion. This conversion will be done implicitly when you assign
    > the value.


    Or you can simply return it:

    int foo(const char *s)
    {
        long int x = whatever(s);
        validate_or_die(x);
        return x;
    }

    >
    > A lot of people aren't aware that the term "cast" refers *only* to the
    > explicit cast operator, using a type name in parentheses.


    Yes, sure, but do we really need to include a full chapter of
    explanation in every single reply we post?

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, May 24, 2007
    #7
  8. Richard Heathfield <> writes:
    > Keith Thompson said:
    >> Richard Heathfield <> writes:
    >>> said:
    >>>> [...] "Why not convert to a
    >>>> long and see if it's between INT_MIN and INT_MAX and if so return
    >>>> that value casted to an int?"
    >>>
    >>> If it is between those values, you don't need a cast. And if it
    >>> isn't, a cast won't do any good anyway.

    >>
    >> But if you want to store the result in an int, you *will* need a
    >> conversion. This conversion will be done implicitly when you assign
    >> the value.

    [...]
    >> A lot of people aren't aware that the term "cast" refers *only* to the
    >> explicit cast operator, using a type name in parentheses.

    >
    > Yes, sure, but do we really need to include a full chapter of
    > explanation in every single reply we post?


    No, but it seemed reasonable in this case. The OP incorrectly thought
    he needed a cast; the common confusion between "cast" and "conversion"
    is a likely explanation of his confusion.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, May 24, 2007
    #8
  9. David Tiktin Guest

    On 24 May 2007, wrote:

    > I want to convert a string representation of a number ("1234") to an
    > int, with overflow and underflow checking. Essentially, I'm looking
    > for a strtol() that converts int instead of long. The problem with
    > strtol() is that a number that fits into a long might be too big for
    > an int. sscanf() doesn't seem to do the over/underflow checking.
    > atoi(), of course, doesn't do any checking. I've long thought it odd
    > that there aren't strtoi() and friends for int and short types in the
    > standard.
    >
    > Any suggestions?


    It's actually harder than it looks to use strtol() properly. Here's
    the guts of a wrapper function I wrote for ints. The wrapper returns 1
    if the conversion was OK, 0 otherwise, and outputs the value through a
    parameter:

    Code:
    
    char *  end = NULL;
    long    value;
    
    errno = 0;
    value = strtol(str, &end, base);
    
    /*
         end == NULL if the base is invalid.
         end == str  if no conversion was done.
        *end == '\0' or *end is whitespace if the number was
                whitespace delimited (a reasonable assumption).
        errno is 0 if no overflow or underflow occurred.
    */
    if (end != NULL && end != str && errno == 0 &&
         (*end == '\0' || isspace(*end)))
    {
        if (INT_MIN <= value && value <= INT_MAX)
        {
            *integer = (int) value;
    
            return 1;
        }
    }
    
    return 0;
    
    
    I wonder if anyone would care to comment on whether this method is
    adequate.

    Dave

    --
    D.a.v.i.d T.i.k.t.i.n
    t.i.k.t.i.n [at] a.d.v.a.n.c.e.d.r.e.l.a.y [dot] c.o.m
    David Tiktin, May 24, 2007
    #9
  10. David Tiktin said:

    <snip>

    > if (end != NULL && end != str && errno == 0 &&
    > (*end == '\0' || isspace(*end)))


    <snip>

    > I wonder if anyone would care to comment on whether this method is
    > adequate.


    A cursory glance reveals to me only that you are perhaps a little
    optimistic in passing *end to isspace(), which requires that its
    parameter be representable as an unsigned char. If, for example, *end
    were -1, this would not qualify, and the behaviour would be undefined.

    This is one of those very rare and bizarre cases where it is actually a
    *good* idea to use a cast - isspace((unsigned char)*end) - and the
    normal promotion rules will of course take care of the conversion to
    int for you.
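    A small sketch of that check in isolation. The helper name ends_cleanly is hypothetical; it tests the character that strtol() left its end pointer on.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if the character at end terminates a number
   cleanly (end of string or whitespace), 0 otherwise. */
int ends_cleanly(const char *end)
{
    /* Convert to unsigned char first: a negative plain char other than EOF
       is not a valid argument for isspace(). */
    return *end == '\0' || isspace((unsigned char)*end);
}
```

    With the cast, a byte such as '\xff' is passed as a value in the range of unsigned char instead of as a negative int on implementations where plain char is signed.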


    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, May 25, 2007
    #10
  11. David Tiktin Guest

    On 24 May 2007, Richard Heathfield <> wrote:

    > David Tiktin said:
    >
    > <snip>
    >
    >> if (end != NULL && end != str && errno == 0 &&
    >> (*end == '\0' || isspace(*end)))

    >
    > <snip>
    >
    >> I wonder if anyone would care to comment on whether this method
    >> is adequate.

    >
    > A cursory glance reveals to me only that you are perhaps a little
    > optimistic in passing *end to isspace(), which requires that its
    > parameter be representable as an unsigned char. If, for example,
    > *end were -1, this would not qualify, and the behaviour would be
    > undefined.
    >
    > This is one of those very rare and bizarre cases where it is
    > actually a *good* idea to use a cast - isspace((unsigned
    > char)*end) - and the normal promotion rules will of course take
    > care of the conversion to int for you.


    Good catch! I actually "knew" that ;-) I have a bunch of macros
    like:

    #define TO_LOWER(c) ((char) tolower((unsigned char)(c)))

    But not for isspace(). I can't figure out why. Fixed now, though.

    Thanks!

    Dave

    --
    D.a.v.i.d T.i.k.t.i.n
    t.i.k.t.i.n [at] a.d.v.a.n.c.e.d.r.e.l.a.y [dot] c.o.m
    David Tiktin, May 25, 2007
    #11
  12. David Tiktin <-bogus.com> wrote:
    > Richard Heathfield <> wrote:
    > > ...
    > > This is one of those very rare and bizarre cases where
    > > it is actually a *good* idea to use a cast - isspace(
    > > (unsigned char)*end) - and the normal promotion rules
    > > will of course take care of the conversion to int for
    > > you.

    >
    > Good catch! I actually "knew" that ;-) I have a bunch
    > of macros like:
    >
    > #define TO_LOWER(c) ((char) tolower((unsigned char)(c)))


    How is the (char) cast useful?

    P.S. I find the (unsigned char) application above
    contentious in that it assumes that 1c and sm
    implementations will make plain char unsigned.

    --
    Peter
    Peter Nilsson, May 25, 2007
    #12
  13. David Tiktin Guest

    On 24 May 2007, Peter Nilsson <> wrote:

    > David Tiktin <-bogus.com> wrote:
    >> Richard Heathfield <> wrote:
    >> > ...
    >> > This is one of those very rare and bizarre cases where
    >> > it is actually a *good* idea to use a cast - isspace(
    >> > (unsigned char)*end) - and the normal promotion rules
    >> > will of course take care of the conversion to int for
    >> > you.

    >>
    >> Good catch! I actually "knew" that ;-) I have a bunch
    >> of macros like:
    >>
    >> #define TO_LOWER(c) ((char) tolower((unsigned char)(c)))

    >
    > How is the (char) cast useful?


    In its typical use:

    char * ptr = str;

    while (*ptr)
    {
        *ptr = TO_LOWER(*ptr);
        ptr++;
    }

    some compilers I've used over the years complain about the assignment
    of an int to a char due to loss of precision. I generally run with
    the highest warning levels I can get, so the cast silences a warning
    I've investigated and found not to be a problem in this situation.

    > P.S. I find the (unsigned char) application above
    > contentious in that it assumes that 1c and sm
    > implementations will make plain char unsigned.


    Sorry, I don't understand your point here or where that assumption is
    made.

    Is there a problem that the code should be:

    *ptr = tolower((int)(*ptr) & 0xFF);

    to assure the passed value and result are in the range 0-255 even if
    CHAR_BIT is greater than 8?

    Dave

    --
    D.a.v.i.d T.i.k.t.i.n
    t.i.k.t.i.n [at] a.d.v.a.n.c.e.d.r.e.l.a.y [dot] c.o.m
    David Tiktin, May 25, 2007
    #13
  14. David Tiktin <-bogus.com> wrote:
    > Peter Nilsson <> wrote:
    > > David Tiktin <-bogus.com> wrote:
    > > > ... I have a bunch of macros like:
    > > >
    > > > #define TO_LOWER(c) ((char) tolower((unsigned char)(c)))

    >
    > > How is the (char) cast useful?

    >
    > In its typical use:
    >
    > char * ptr = str;
    >
    > while (*ptr)
    > {
    > *ptr = TO_LOWER(*ptr);
    > ptr++;
    > }


    There is no semantic difference.

    > some compilers I've used over the years complain about the
    > assignment of an int to a char due to loss of precision.


    Assignment of int values to a char is probably the most
    fundamental of useful constructs that C has. Putting a
    warning on that is to me like putting a warning on every
    #include asking if that's the file you actually meant to
    include.

    > I generally run with the highest warning levels I can get,


    A good move, but you shouldn't change code to silence one
    compiler's warnings unnecessarily. Different compilers will
    issue warnings for different reasons and two different
    compilers can even issue warnings for opposing reasons.

    > so the cast silences a warning I've investigated and found
    > not to be a problem in this situation.


    The simpler option is to acknowledge that no action is
    required as a consequence of the warning.

    It's easy to fall into the belief that the absence of
    warnings is a strong measure of correctness. 'Clean'
    compiles give a sense of confidence. But it's a small
    step away from introducing bugs, just to silence a
    compiler.

    > > P.S. I find the (unsigned char) application above
    > > contentious in that it assumes that 1c and sm
    > > implementations will make plain char unsigned.

    >
    > Sorry, I don't understand your point here or where that
    > assumption is made.


    Depending on how you use them, input routines often read
    and store bytes, not (plain) chars. On an sm machine
    interpreting an input byte as a char representation and
    converting it to an unsigned char can potentially yield
    a different character code to the original for some
    characters outside the basic character set.

    It's a highly unlikely scenario, and it's dismissed with
    a little handwaving about QoI guaranteeing that 1c and sm
    machines will always make plain char unsigned.

    > Is there a problem that the code should be:
    >
    > *ptr = tolower((int)(*ptr) & 0xFF);
    >
    > to assure the passed value and result are in the range
    > 0-255 even if CHAR_BIT is greater than 8?


    No. I'm suggesting, in some cases, it should be...

    *ptr = tolower(* (unsigned char *) ptr);

    Obviously that's not as aesthetic as the direct conversion
    (unsigned char) *ptr, but it does have the advantage of
    working on the hypothetical machines (contrived if you
    like) as well as the vanilla ones.
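    A sketch of a whole-string version of that idea, reading each byte through an unsigned char pointer instead of converting a plain char value. The function name lower_in_place is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper: lowercases a string in place, accessing every byte
   as unsigned char so tolower() always receives a valid argument. */
void lower_in_place(char *s)
{
    unsigned char *p = (unsigned char *)s;

    while (*p != '\0') {
        *p = (unsigned char)tolower(*p);
        p++;
    }
}
```

    On ordinary two's complement implementations this should behave the same as tolower((unsigned char)*ptr). The difference only matters on the hypothetical machines discussed above.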

    --
    Peter
    Peter Nilsson, May 26, 2007
    #14
  15. David Tiktin Guest

    On 25 May 2007, Peter Nilsson <> wrote:

    > David Tiktin <-bogus.com> wrote:
    >> Peter Nilsson <> wrote:
    >> > David Tiktin <-bogus.com> wrote:
    >> > > ... I have a bunch of macros like:
    >> > >
    >> > > #define TO_LOWER(c) ((char) tolower((unsigned char)(c)))

    >>
    >> > How is the (char) cast useful?

    >>
    >> In its typical use:
    >>
    >> char * ptr = str;
    >>
    >> while (*ptr)
    >> {
    >> *ptr = TO_LOWER(*ptr);
    >> ptr++;
    >> }

    >
    > There is no semantic difference.


    Semantic difference between what:

    *ptr = (char) tolower(c);

    and

    *ptr = tolower(c);

    ?

    >> some compilers I've used over the years complain about the
    >> assignment of an int to a char due to loss of precision.

    >
    > Assignment of int values to a char is probably the most
    > fundamental of useful constructs that C has. Putting a
    > warning on that is to me like putting a warning on every
    > #include asking if that's the file you actually meant to
    > include.


    Sorry, I just don't agree. How many times have we seen code in this
    group that goes:

    [bad code]

    char c;

    while ((c = getchar()) != EOF)
    {
        /* infinite loop if plain char is unsigned */
    }

    [/bad code]

    I suspect that the int -> char warnings are there to prevent things
    like this.
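    For contrast, a sketch of the conventional form of such a loop, which keeps the return value of getc() in an int so that EOF stays distinguishable from every byte value. The function name count_bytes is made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining on a stream. */
long count_bytes(FILE *in)
{
    int c;              /* int, not char: must hold every byte value plus EOF */
    long count = 0;

    while ((c = getc(in)) != EOF)
        count++;

    return count;
}
```

    Because c is an int, a byte with the value 0xFF is returned as 255 and does not compare equal to EOF, so the loop neither stops early nor runs forever.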

    >> I generally run with the highest warning levels I can get,

    >
    > A good move, but you shouldn't change code to silence one
    > compiler's warnings unnecessarily. Different compilers will
    > issue warnings for different reasons and two different
    > compilers can even issue warnings for opposing reasons.
    >
    >> so the cast silences a warning I've investigated and found
    >> not to be a problem in this situation.

    >
    > The simpler option is to acknowledge that no action is
    > required as a consequence of the warning.
    >
    > It's easy to fall into the belief that the absence of
    > warnings is a strong measure of correctness. 'Clean'
    > compiles give a sense of confidence. But it's a small
    > step away from introducing bugs, just to silence a
    > compiler.


    I *never* fall into that belief ;-) What I do assume is that code
    that compiles *with* warnings is likely *not* correct. I routinely
    compile with at least 4 different compilers on 4 different platforms
    (2 of them big-endian). I expect my code to compile without warnings
    on all of them (and to be correct on all of them ;-) Yes, that
    sometimes means adding a cast for a "picky" compiler. It also
    sometimes means changing the code to something simpler, clearer and
    better. But if I don't fix the code to silence the warnings, even if
    they don't signal a real problem, I'll continue to get the warnings
    and waste time looking at things I've already thought about, tested
    and fixed. I don't do this in a cavalier manner, but when I rebuild
    a 50 file project, I need to be able to *see* it builds warning free.

    >> > P.S. I find the (unsigned char) application above
    >> > contentious in that it assumes that 1c and sm
    >> > implementations will make plain char unsigned.

    >>
    >> Sorry, I don't understand your point here or where that
    >> assumption is made.

    >
    > Depending on how you use them, input routines often read
    > and store bytes, not (plain) chars. On an sm machine
    > interpreting an input byte as a char representation and
    > converting it to an unsigned char can potentially yield
    > a different character code to the original for some
    > characters outside the basic character set.
    >
    > It's a highly unlikely scenario, and it's dismissed with
    > a little handwaving about QoI guaranteeing that 1c and sm
    > machines will always make plain char unsigned.


    OK, thanks for the warning ;-) I've never had to code for a platform
    that's 1s complement or sign-magnitude, but if I did, I imagine I'd
    have more to worry about than int -> char casts. I know of at least
    one piece of code I have that explicitly assumes 2s complement, and
    I'm sure *none* of my networking code would work!

    Dave

    --
    D.a.v.i.d T.i.k.t.i.n
    t.i.k.t.i.n [at] a.d.v.a.n.c.e.d.r.e.l.a.y [dot] c.o.m
    David Tiktin, May 29, 2007
    #15