Requesting advice how to clean up C code for validating string represents integer

Discussion in 'C Programming' started by robert maas, see http://tinyurl.com/uh3t, Feb 11, 2007.

  1. I'm working on examples of programming in several languages, all
    (except PHP) running under CGI so that I can show both the source
    files and the actual running of the examples online. The first
    set of examples, after decoding the HTML FORM contents, merely
    verifies the text within a field to make sure it is a valid
    representation of an integer, without any junk thrown in, i.e. it
    must satisfy the regular expression: ^ *[-+]?[0-9]+ *$

    If the contents of the field are wrong I want to diagnose as much
    as reasonable what's wrong, not just say "syntax error".

    Because perl and PHP include support for regular expressions, it
    was obvious how to do it, and easy to accomplish:
    http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intperl
    http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intphp

    Because Common Lisp has good utilities for scanning strings, mostly
    using position, position-if, and position-if-not, it was equally
    easy, and equally obvious, how to do it:
    http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intlisp

    The Java API is missing some of the functions available in Common
    Lisp, so I had to augment the API, but then it was as easy as in
    Common Lisp, with nearly the same algorithm:
    http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intjava

    Now we come to C: I presently have a horrible mess:
    http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intc
    I'm thinking of pulling out all the character-case testing into a
    function that converts a character into a class-number (such as 1
    for space, 2 for digit, 3 for sign, etc.), calling that all over
    the place, and then using a SELECT statement on the result, which
    won't change the logic of the code but might make it tidier.
    Alternately I might hand-code replacements for the Lisp/Java
    utilities for scanning strings, or find something in one of the C
    libraries that would help, and then translate the Lisp or Java code
    to C. Do any of you have any other ideas what I might do to clean
    up the C code? Don't write my code for me, but just give hints what
    library routines might do 90% of the work for me, or suggest
    re-design of the algorithm? One thing I don't want to do is
    download a REGEX package for C. I'm trying to give examples of how
    to do things from scratch in C, not how to simply use somebody
    else's program, even if the source for the REGEX module is
    available. If something isn't in a standard library for C, then
    it doesn't exist for the purpose of this project. (The only
    exception I made is the module for collecting and decoding HTML
    FORM contents, which is a prerequisite for this whole project.)
    robert maas, see http://tinyurl.com/uh3t, Feb 11, 2007
    #1

  2. robert maas, see http://tinyurl.com/uh3t

    Bill Pursell Guest

    On Feb 11, 6:57 am, (robert maas, see http://tinyurl.com/uh3t)
    was asking about code that:
    > verifies the text within a field to make sure it is a valid
    > representation of an integer, without any junk thrown in, i.e. it
    > must satisfy the regular expression: ^ *[-+]?[0-9]+ *$
    >
    > If the contents of the field are wrong I want to diagnose as much
    > as reasonable what's wrong, not just say "syntax error".
    >

    [snip]
    > Do any of you have any other ideas what I might do to clean
    > up the C code? Don't write my code for me, but just give hints what
    > library routines might do 90% of the work for me, or suggest
    > re-design of the algorithm?



    You could try something as simple as this:

    strtol( string, &end, BASE );
    if( *end != '\0' )
        fprintf( stderr, "syntax error starting at '%c'\n", *end);

    I'm not sure that this gives you as much syntax error
    as you want, but it tells you where it occurs. (Also,
    this doesn't exactly match your specification, since
    this doesn't allow trailing whitespace, but that's
    a trivial fix.)
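
    A minimal sketch of that "trivial fix", assuming base 10 and treating
    anything isspace() accepts as trailing whitespace. The helper name
    first_bad_char is hypothetical and is not taken from the thread.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns NULL if s holds a decimal integer that is
 * optionally followed by whitespace; otherwise returns a pointer to the
 * first offending character (s itself when nothing could be converted).
 * Range errors are not checked here. */
const char *first_bad_char(const char *s, long *out)
{
    char *end;

    *out = strtol(s, &end, 10);
    if (end == s)
        return s;                      /* no digits were converted */
    while (isspace((unsigned char)*end))
        end++;                         /* tolerate trailing whitespace */
    return *end == '\0' ? NULL : end;
}
```

    A caller can print the "syntax error starting at" message whenever the
    result is not NULL. Note that strtol itself also skips leading
    whitespace, so that part of the specification needs no extra code.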

    --
    Bill Pursell
    Bill Pursell, Feb 11, 2007
    #2

  3. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    robert maas, see http://tinyurl.com/uh3t wrote, On 11/02/07 06:57:
    > I'm working on examples of programming in several languages, all
    > (except PHP) running under CGI so that I can show both the source
    > files and the actually running of the examples online. The first
    > set of examples, after decoding the HTML FORM contents, merely
    > verifies the text within a field to make sure it is a valid
    > representation of an integer, without any junk thrown in, i.e. it
    > must satisfy the regular expression: ^ *[-+]?[0-9]+ *$
    >
    > If the contents of the field are wrong I want to diagnose as much
    > as reasonable what's wrong, not just say "syntax error".


    <snip>

    > Now we come to C: I presently have a horrible mess:
    > http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intc


    <snip>

    > to C. Do any of you have any other ideas what I might do to clean
    > up the C code? Don't write my code for me, but just give hints what
    > library routines might do 90% of the work for me, or suggest
    > re-design of the algorithm? One thing I don't want to do is


    <snip>

    Good, we generally prefer to give help rather than do people's work for
    them :)

    I would suggest you look at the strto* functions which are part of
    standard C taking specific note of the second and third parameters,
    since you want to use both. The second parameter is used to tell you the
    first invalid character (or the end of the string if completely valid)
    and the last parameter to specify base 10 which is what the user will
    expect. These functions will even tell you if the number in the string
    is out of range for the type it is converted to. Finally, since there is
    no strtoi you will probably have to use strtol and then check if it is
    in the range of an int before assigning it to an int.
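
    The checks described above might be combined roughly as follows. This
    is only a sketch: the enum and the function name parse_int_strtol are
    made up for illustration, and the order in which errors are reported
    is one choice among several.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum parse_status {
    PARSE_OK,
    PARSE_NO_DIGITS,
    PARSE_TRAILING_JUNK,
    PARSE_OUT_OF_RANGE
};

enum parse_status parse_int_strtol(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;                       /* strtol sets errno only on failure */
    v = strtol(s, &end, 10);         /* third parameter: base 10 */

    if (end == s)
        return PARSE_NO_DIGITS;      /* second parameter did not advance */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return PARSE_OUT_OF_RANGE;   /* too big for long, or for int */
    if (*end != '\0')
        return PARSE_TRAILING_JUNK;  /* end points at first invalid char */

    *out = (int)v;
    return PARSE_OK;
}
```

    Clearing errno before the call matters because a successful strtol
    leaves any earlier value of errno untouched.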
    --
    Flash Gordon
    Flash Gordon, Feb 11, 2007
    #3
  4. robert maas, see http://tinyurl.com/uh3t

    Random832 Guest

    2007-02-11 <>,
    robert maas, see http://tinyurl.com/uh3t wrote:
    > I'm working on examples of programming in several languages, all
    > (except PHP) running under CGI so that I can show both the source
    > files and the actually running of the examples online. The first
    > set of examples, after decoding the HTML FORM contents, merely
    > verifies the text within a field to make sure it is a valid
    > representation of an integer, without any junk thrown in, i.e. it
    > must satisfy the regular expression: ^ *[-+]?[0-9]+ *$


    I'd use strtol with a base of 10.

    Things to consider:
    1. It doesn't care if there's junk after the numbers, but why do you?
    You can always examine *endptr.
    2. Won't work for converting integers greater than eleventy billion or
    however much your system supports. But how do you intend to convert
    them otherwise?
    Random832, Feb 11, 2007
    #4
  5. robert maas, see http://tinyurl.com/uh3t

    CBFalconer Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    "robert maas, see http://tinyurl.com/uh3t" wrote:
    >
    > I'm working on examples of programming in several languages, all
    > (except PHP) running under CGI so that I can show both the source
    > files and the actually running of the examples online. The first
    > set of examples, after decoding the HTML FORM contents, merely
    > verifies the text within a field to make sure it is a valid
    > representation of an integer, without any junk thrown in, i.e. it
    > must satisfy the regular expression: ^ *[-+]?[0-9]+ *$
    >
    > If the contents of the field are wrong I want to diagnose as much
    > as reasonable what's wrong, not just say "syntax error".
    >
    > Because perl and PHP include support for regular expressions, it
    > was obvious how to do it, and easy to accomplish:


    Perl and PHP are off-topic here. Regular expressions are only
    topical in reference to code to implement them. In addition, your
    RE is wrong. A numeric field ends when the next character cannot
    be used, not on a blank. This is easily done in C; see the
    following example. Note that it leaves detection and use of +- to
    the calling function, and similarly the decision about the
    termination char. Note that this parses a stream.

    /*--------------------------------------------------------------
     * Read an unsigned value. Signal error for overflow or no
     * valid number found. Returns 1 for error, 0 for noerror, EOF
     * for EOF encountered before parsing a value.
     *
     * Skip all leading blanks on f. At completion getc(f) will
     * return the character terminating the number, which may be \n
     * or EOF among others. Barring EOF it will NOT be a digit. The
     * combination of error, 0 result, and the next getc returning
     * \n indicates that no numerical value was found on the line.
     *
     * If the user wants to skip all leading white space including
     * \n, \f, \v, \r, he should first call "skipwhite(f);"
     *
     * Peculiarity: This specifically forbids a leading '+' or '-'.
     */
    int readxwd(unsigned int *wd, FILE *f)
    {
        unsigned int value, digit;
        int status;
        int ch;

    #define UWARNLVL (UINT_MAX / 10U)
    #define UWARNDIG (UINT_MAX - UWARNLVL * 10U)

        value = 0; /* default */
        status = 1; /* default error */

        ch = ignoreblks(f);

        if (EOF == ch) status = EOF;
        else if (isdigit(ch)) status = 0; /* digit, no error */

        while (isdigit(ch)) {
            digit = ch - '0';
            if ((value > UWARNLVL) ||
                ((UWARNLVL == value) && (digit > UWARNDIG))) {
                status = 1; /* overflow */
                value -= UWARNLVL;
            }
            value = 10 * value + digit;
            ch = getc(f);
        } /* while (ch is a digit) */

        *wd = value;
        ungetc(ch, f);
        return status;
    } /* readxwd */

    The #includes, skipwhite, and ignoreblks functions are omitted.

    --
    <http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
    <http://www.securityfocus.com/columnists/423>

    "A man who is right every time is not likely to do very much."
    -- Francis Crick, co-discover of DNA
    "There is nothing more amazing than stupidity in action."
    -- Thomas Matthews
    CBFalconer, Feb 11, 2007
    #5
  6. "robert maas, see http://tinyurl.com/uh3t" <> wrote
    > Now we come to C: I presently have a horrible mess:
    > http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intc
    > I'm thinking of pulling out all the character-case testing into a
    > function that converts a character into a class-number (such as 1
    > for space, 2 for digit, 3 for sign, etc.), calling that all over
    > the place, and the using a SELECT statement on the result, which
    > won't change the logic of the code but might make it tidier.
    > Alternately I might hand-code replacements for the Lisp/Java
    > utilities for scanning strings, or find something in one of the C
    > libraries that would help, and then translate the Lisp or Java code
    > to C. Do any of you have any other ideas what I might do to clean
    > up the C code? Don't write my code for me, but just give hints what
    > library routines might do 90% of the work for me, or suggest
    > re-design of the algorithm? One thing I don't want to do is
    > download a REGEX package for C. I'm trying to give examples of how
    > to do things from scratch in C, not how to simply use somebody
    > else's program, even if the source for the REGEX module is
    > available. If something isn't in the a standard library for C, then
    > it doesn't exist for the purpose of this project. (The only
    > exception I made is the module for collecting and decoding HTML
    > FORM contents, which is a prerequisite for this whole project.)
    >

    The first thing is to make your interface clean.

    If you want to parse from a string, take a leaf out of strtol's book.

    int parseint(char *str, char **end).

    Return the integer you read, and the end of the input you parsed up to. If
    you cannot read an integer successfully, make *end equal str and return
    INT_MIN. INT_MIN is much less likely than 0 or -1 to be confused with a real
    integer if you have a lazy caller who doesn't check his end pointer
    properly.

    Skip leading whitespace.
    Read the optional +/- character and make sure there aren't two of them.
    Skip whitespace?
    Read digits one by one into an unsigned integer, and multiply by ten if there
    are more digits to come. Terminate if the unsigned overflows.
    Check against INT_MAX, or against the magnitude of INT_MIN if the negative
    flag is set. Terminate on overflow.
    Convert to a signed integer.
    Your spec now says to skip trailing whitespace. Probably a bad idea, but if
    the instructions say do it we must do it.
    Set the end pointer to end of input on success, input on fail.
    Return answer on success, INT_MIN on fail.
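
    One way the steps above could be written out, as a sketch rather than a
    definitive implementation. Two choices here are assumptions of this
    example and not part of the description: no whitespace is allowed
    between the sign and the digits, and the magnitude is limited to
    INT_MAX for both signs, so INT_MIN itself is rejected and stays
    unambiguous as the failure value.

```c
#include <ctype.h>
#include <limits.h>

/* Returns the parsed value and sets *end just past the consumed input
 * (including trailing whitespace). On failure sets *end = str and
 * returns INT_MIN. */
int parseint(char *str, char **end)
{
    char *p = str;
    unsigned int value = 0;
    int negative = 0;

    while (isspace((unsigned char)*p))       /* skip leading whitespace */
        p++;
    if (*p == '+' || *p == '-') {            /* optional single sign */
        negative = (*p == '-');
        p++;
    }
    if (!isdigit((unsigned char)*p)) {       /* second sign, gap, or junk */
        *end = str;
        return INT_MIN;
    }
    while (isdigit((unsigned char)*p)) {
        unsigned int digit = (unsigned int)(*p - '0');
        /* Would value * 10 + digit exceed INT_MAX? */
        if (value > ((unsigned int)INT_MAX - digit) / 10u) {
            *end = str;
            return INT_MIN;                  /* overflow */
        }
        value = value * 10u + digit;
        p++;
    }
    while (isspace((unsigned char)*p))       /* skip trailing whitespace */
        p++;
    *end = p;
    return negative ? -(int)value : (int)value;
}
```

    A caller that wants to reject trailing junk can test whether **end is
    '\0' after a successful call.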
    Malcolm McLean, Feb 11, 2007
    #6
  7. > From: Random832 <>
    > I'd use strtol with a base of 10.


    Several people suggested that, but you made some additional
    comments I want to reply to, so I'm responding here.

    > Things to consider:
    > 1. It doesn't care if there's junk after the numbers, but why do you?


    This is for processing a HTML FORM filled out by a user, a typical
    user who is a total novice at computers yet is trying all sorts of
    things found on the Web. If the form asks for an integer to be
    entered, but the user enters something else, like two integers, or
    an algebraic formula which just happens to start with an integer,
    or a floating-point value or decimal fraction, or a fraction, I
    don't want to just gobble the first part and ignore all the rest,
    because obviously the luser didn't understand/follow instructions.
    If I just process the first part and ignore the rest, the luser
    will be totally confused why he/she didn't get the intended effect.
    Better that I complain about the slightest mess in the input field.

    > You can always examine *endptr.


    Per a nice example I found on the Web:
    Linkname: Bullet Proof Integer Input Using strtol()
    URL: http://home.att.net/~jackklein/c/code/strtol.html
    I'm indeed now checking for any diagnostics that can be obtained
    just from the results returned by strtol (the actual return value,
    the global error flag, and the reference pointer endptr). See end
    of this message for the code as I have it now.

    > 2. Won't work for converting integers greater than eleventy billion or
    > however much your system supports. But how do you intend to convert
    > them otherwise?


    Good point. My previous idea was for the user to get just the
    syntax correct for integers, and then if the result is mangled it
    obviously means this particular programming language (c, c++, java)
    is using fixed-length binary integers, whereas if the result is
    always correct no matter how many digits are given, then the
    language (lisp) is using unlimited-precision integers. But if it's
    easy to diagnose explicitly, such as provided by strtol, then
    perhaps I can actually tell the user when overflow happens, to make
    the lesson a bit less obscure.

    Anyway, using strtol, with all the possible tests on the result:

    All of these produce the correct diagnosis (note 15-char buffer for input):

    Type a number:555555555555555555
    You typed: [55555555555555]
    Number out of range.

    Type a number:2147000000
    Dropping EOL char from end of string.
    You typed: [2147000000]
    Looks good? N=2147000000

    Type a number:2148000000
    Dropping EOL char from end of string.
    You typed: [2148000000]
    Number out of range.

    Type a number:
    Dropping EOL char from end of string.
    You typed: []
    No number given.

    Type a number:5x
    Dropping EOL char from end of string.
    You typed: [5x]
    After number, extra characters on input line.

    But these are not the effects I want:

    Type a number:x5
    Dropping EOL char from end of string.
    You typed: [x5]
    No number given.

    Type a number:- 5
    Dropping EOL char from end of string.
    You typed: [- 5]
    No number given.

    There *is* a number given in each case, just that there's junk
    before the number in the first case, and gap between sign and
    number in second case. It seems I'll need to manually scan from the
    start of the field to the start of the number to distinguish these
    patterns (brackets indicate optional):
    [white] junk [white] sign [white] digits -- junk before start of number
    [white] sign white digits -- gap between sign and number
    [white] sign junk digits -- junk (or gap) between sign and number
    [white] sign digits -- good
    [white] digits -- good
    strtol doesn't seem to be helping me diagnose the cruft before the number.


    Listing of source code used for the above tests:

    #include <stdio.h>
    #include <errno.h>

    #define MAXCH 15
    /* Deliberately small buffer to test buffer-full condition */

    main() {
        char chars[MAXCH]; char* inres; /* Set by fgets */
        size_t len; /* Set by strlen */
        char onech;
        char* endptr; long long_var; /* Set by strtol */
        while (1) {
            fpurge(stdin);
            printf("\nType a number:");
            inres = fgets(chars, MAXCH, stdin);
            if (NULL==inres) {
                printf("*** Got NULL back, which maybe means end-of-stream?\n");
                break;
            }
            len = strlen(chars);
            /* printf("Length of string = %d\n", len); */
            if (0 >= len) {
                printf("Horrible: Input was 0 chars, not even EOL char, how??\n");
                break;
            }
            onech = chars[len-1];
            /* printf("The last character is [%c]\n", onech); */
            if ('\n' == onech) {
                printf("Dropping EOL char from end of string.\n");
                chars[len-1] = '\0';
            }
            printf("You typed: [%s]\n", inres, NULL, inres);
            errno = 0;
            long_var = strtol(chars, &endptr, 10);
            if (ERANGE == errno) {
                printf("Number out of range.\n");
            } else if (endptr==chars) {
                printf("No number given.\n");
            } else if ('\0' != *endptr) {
                printf("After number, extra characters on input line.\n");
            } else {
                printf("Looks good? N=%ld\n", long_var);
            }
            sleep(1);
        }
    }
    robert maas, see http://tinyurl.com/uh3t, Feb 11, 2007
    #7
  8. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    robert maas, see http://tinyurl.com/uh3t wrote, On 11/02/07 19:13:
    >> From: Random832 <>
    >> I'd use strtol with a base of 10.

    >
    > Several people suggested that, but you made some additional
    > comments I want to reply to, so I'm responding here.
    >
    >> Things to consider:
    >> 1. It doesn't care if there's junk after the numbers, but why do you?


    <snip comments about detecting bad input that happens to also contain a
    number>

    > Better that I complain about the slightest mess in the input field.


    That is the correct attitude for handling user input.

    >> You can always examine *endptr.

    >
    > Per a nice example I found on the Web:
    > Linkname: Bullet Proof Integer Input Using strtol()
    > URL: http://home.att.net/~jackklein/c/code/strtol.html
    > I'm indeed now checking for any diagnostics that can be obtained
    > just from the results returned by strtol (the actual return value,
    > the global error flag, and the reference pointer endptr). See end
    > of this message for the code as I have it now.


    Jack Klein knows his stuff. You have found a good reference.

    >> 2. Won't work for converting integers greater than eleventy billion or
    >> however much your system supports. But how do you intend to convert
    >> them otherwise?

    >
    > Good point. My previous idea was for the user to get just the
    > syntax correct for integers, and then if the result is mangled it
    > obviously means this particular programming language (c, c++, java)
    > is using fixed-length binary integers, whereas if the result is
    > always correct no matter how many digits are given, then the
    > language (lisp) is using unlimited-precision integers.


    With C and C++ assuming that bad input will lead to obviously bad output
    is not in general a good idea since in far too many situations it will
    produce something that is not obviously bad.

    > But if it's
    > easy to diagnose explicitly, such as provided by strtol, then
    > perhaps I can actually tell the user when overflow happens, to make
    > the lesson a bit less obscure.


    OK, that's good.

    <snip>

    > But these are not the effects I want:
    >
    > Type a number:x5
    > Dropping EOL char from end of string.
    > You typed: [x5]
    > No number given.
    >
    > Type a number:- 5
    > Dropping EOL char from end of string.
    > You typed: [- 5]
    > No number given.
    >
    > There *is* a number given in each case, just that there's junk
    > before the number in the first case, and gap between sign and
    > number in second case. It seems I'll need to manually scan from the
    > start of the field to the start of the number to distinguish these
    > patterns (brackets indicate optional):
    > [white] junk [white] sign [white] digits -- junk before start of number
    > [white] sign white digits -- gap between sign and number
    > [white] sign junk digits -- junk (or gap) between sign and number


    Yes, you need to check for the above yourself if you want to report
    them. strtol will only indicate that the first non-space character
    was invalid, not whether there was something valid further in.

    > [white] sign digits -- good
    > [white] digits -- good


    The above, of course, are indicated by strtol succeeding ;-)

    > strtol doesn't seem to be helping me diagnose the cruft before the number.
    >
    >
    > Listing of source code used for the above tests:
    >
    > #include <stdio.h>
    > #include <errno.h>


    #include <stdlib.h> /* For strtol. Very important since otherwise the
    compiler is *required* to assume it returns an int not a long. */

    > #define MAXCH 15
    > /* Deliberately small buffer to test buffer-full condition */
    >
    > main() {


    Since no one has mentioned it yet I will. The above, whilst legal in the
    original C standard, is bad style and no longer supported in the
    new(ish) C standard that might one day become commonly implemented.
    Don't use implicit int, and if you don't want parameters, be explicit
    about it.

    int main(void) {

    > char chars[MAXCH]; char* inres; /* Set by fgets */
    > size_t len; /* Set by strlen */
    > char onech;
    > char* endptr; long long_var; /* Set by strtol */
    > while (1) {


    I prefer 'for (;;)' but that is purely a matter of style.

    > fpurge(stdin);


    Standard C does not have an "fpurge" function or anything similar to
    what I am guessing it does.

    > printf("\nType a number:");


    As per Jack's example you need to flush stdout (or have a \n at the end
    of the above line). There is also an argument that using puts (which
    outputs a newline after the specified text) or fputs would be better
    since they do not scan the string for format specifiers.

    > inres = fgets(chars, MAXCH, stdin);


    Since chars is an array rather than a pointer you could use:
    inres = fgets(chars, sizeof chars, stdin);

    > if (NULL==inres) {
    > printf("*** Got NULL back, which maybe means end-of-stream?\n");


    It is end of stream or an error.

    > break;
    > }
    > len = strlen(chars);
    > /* printf("Length of string = %d\n", len); */
    > if (0 >= len) {


    len cannot be negative or even 0 here for at least three reasons. It is
    of type size_t which is unsigned and also strlen returns a size_t. The
    third reason is that fgets reads until it either has enough to fill the
    buffer (allowing space for the nul termination), until error or end of
    stream, or up to and including the newline, which ever comes first. So
    given a buffer length of 2 or more it will *always* either return NULL
    or it will have written a string with a strlen of at least 1. So this if
    cannot be taken.

    > printf("Horrible: Input was 0 chars, not even EOL char, how??\n");
    > break;
    > }
    > onech = chars[len-1];
    > /* printf("The last character is [%c]\n", onech); */
    > if ('\n' == onech) {
    > printf("Dropping EOL char from end of string.\n");
    > chars[len-1] = '\0';
    > }


    else {
    report that the line entered was too long and then probably read the
    rest of the line up to and including the next newline.
    }

    > printf("You typed: [%s]\n", inres, NULL, inres);
    > errno = 0;
    > long_var = strtol(chars, &endptr, 10);
    > if (ERANGE == errno) {
    > printf("Number out of range.\n");
    > } else if (endptr==chars) {


    At this point you could scan from the start of the string for the first
    character that is not white space and report a different error depending
    on what it is using the is* functions from ctype.h. Alternatively you
    could look at using strspn or strcspn from string.h

    > printf("No number given.\n");
    > } else if ('\0' != *endptr) {
    > printf("After number, extra characters on input line.\n");
    > } else {
    > printf("Looks good? N=%ld\n", long_var);
    > }
    > sleep(1);


    sleep is not a standard function and seems rather pointless in this program.

    > }
    > }
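
    The else branch suggested above for over-long lines could be sketched
    along these lines. The helper name read_line and its return
    convention are hypothetical, chosen only for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (size bytes, size >= 2).
 * Returns 1 for a complete line (newline removed), 0 if the line was too
 * long (the remainder is read and discarded), and EOF on end of stream
 * or error before any character was read. */
int read_line(char *buf, size_t size, FILE *f)
{
    size_t len;
    int ch;

    if (fgets(buf, (int)size, f) == NULL)
        return EOF;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* drop the EOL character */
        return 1;
    }
    if (len + 1 < size)
        return 1;                      /* final line without a newline */

    /* The buffer is full. The line fits exactly if a newline or the end
     * of the stream comes next; otherwise discard the rest of the line. */
    ch = getc(f);
    if (ch == '\n' || ch == EOF)
        return 1;
    while ((ch = getc(f)) != '\n' && ch != EOF)
        ;
    return 0;
}
```

    With this in place the main loop needs neither the non-standard
    fpurge call nor the unreachable zero-length test.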

    --
    Flash Gordon
    Flash Gordon, Feb 11, 2007
    #8
  9. > From: Flash Gordon <>
    > > sleep(1);

    > sleep is not a standard function and seems rather pointless in this program.


    It's absolutely essential for peace of mind when dialed into a Unix
    shell with VT100 emulator at 19200 baud. The first time I ran this
    program, without the sleep call, and pressed ctrl-D to generate
    end-of-stream on stdin, the program went into infinite read-EOS
    spew-text loop, which filled up all modem buffers. I immediately
    pressed ctrl-C to abort C program, and held it down for about ten
    seconds, but it was too late, modem buffers were grossly full. I
    then pressed ctrl-Z and held that down for several minutes, but
    modem buffers were still spewing to the VT100 emulator. I then
    scrolled to the top of the past-screens buffer to see if I could
    save anything, but it was already too late, all the past-screens
    buffer (appx. 30-40 full VT100 screensfull) had already been
    overwritten by the spew. I then waited about ten minutes, watching
    spew spew spew incessantly, with no way to know whether the program
    had even seen my ctrl-C interrupt. Finally after ten minutes or so
    I finally saw a shell prompt. I immediately put in the sleep before
    any further work on the program. Now if it gets into an infinite
    loop, I press ctrl-C and get instant response because there's no
    ten minutes of spew already in the modem buffer.

    I copied a few cleanup suggestions from your message and will be
    responding about them later.
    robert maas, see http://tinyurl.com/uh3t, Feb 12, 2007
    #9
  10. On 11 Feb, 06:57, (robert maas, see http://tinyurl.com/uh3t) wrote:

    <snip>

    [the program]
    > verifies the text within a field to make sure it is a valid
    > representation of an integer, without any junk thrown in, i.e. it
    > must satisfy the regular expression: ^ *[-+]?[0-9]+ *$
    >
    > If the contents of the field are wrong I want to diagnose as much
    > as reasonable what's wrong, not just say "syntax error".


    <snip>

    > Alternately I might hand-code replacements for the Lisp/Java
    > utilities for scanning strings, or find something in one of the C
    > libraries that would help,


    if it was anything other than a number then sscanf() might
    be worth a look.

    <snip>


    --
    Nick Keighley
    Nick Keighley, Feb 12, 2007
    #10
  11. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    robert maas, see http://tinyurl.com/uh3t wrote, On 12/02/07 04:48:
    >> From: Flash Gordon <>
    >>> sleep(1);

    >> sleep is not a standard function and seems rather pointless in this program.

    >
    > It's absolutely essential for peace of mind when dialed into a Unix
    > shell with VT100 emulator at 19200 baud. The first time I ran this
    > program, without the sleep call, and pressed ctrl-D to generate
    > end-of-stream on stdin, the program went into infinite read-EOS
    > spew-text loop, which filled up all modem buffers. I immediately


    <snip>

    I can only suggest that you had some other bug in your program at that
    point or a bug in your modem. As presented, your program would not do
    that: whether it detected an error or EOF, it would break out of the
    loop and terminate.

    Having said that, I can see that if you are hitting that sort of problem
    that a delay could be useful.

    > I copied a few cleanup suggestions from your message and will be
    > responding about them later.


    OK.
    --
    Flash Gordon
    Flash Gordon, Feb 12, 2007
    #11
  12. > From: Flash Gordon <>
    > > ... The first time I ran this
    > > program, without the sleep call, and pressed ctrl-D to generate
    > > end-of-stream on stdin, the program went into infinite read-EOS
    > > spew-text loop, which filled up all modem buffers. ...

    > I can only suggest that you had some other bug in your program at that
    > point or a bug in your modem. As presented your program would not do
    > that whether it detected an error or EOF it would break out of the loop
    > and terminate.


    Not a bug. It's just that the part of the program to detect EOF wasn't yet
    written, and that's the very part I was trying to develop.
    Step 1: Put in a printf to see what value comes back when I press ctrl-D.
    Step 2: Write code to detect that value and break out of loop.
    Step 3: Test that to see whether it works.
    Step 4: Remove the printf.
    Unfortunately step 1 blew me out for ten minutes or so without the sleep.

    Unfortunately c doesn't allow any sleep times except integers. I
    looked at nanosecond sleep but it requires loading a special module
    and building a special nanosecond object and then loading a number
    into that object before you can then pass that object to some OO
    method that does the actual sleep, a royal pain if it's just to
    prevent spew from filling up modem buffers on dialups. The amount
    of time I'd waste learning how to do all that would be worse than
    the amount of time I waste having a full one-second sleep at each
    interactive I/O transaction in the loop during the development of
    this code destined for CGI where there's a completely different
    logic for interactive transactions and no chance for spew hence no
    need for the sleep.

    Anyway, here's the latest news on my task:

    While searching various clues the kind folks here sent me, I
    discovered some library functions (strspn, strcspn) which are
    useful for skipping across whole classes of characters or
    complements of such classes, similar to the functions I implemented
    in Java (explicitly) and in Common Lisp (via anonymous-function
    parameters). That made it possible to translate my lisp/java
    algorithms directly to c.

    I decided to completely separate the code for checking general
    integer syntax [white]* [sign]? [digit]+ [white]* (pseudo-regex
    notation), which is independent of the programming language (except
    Java where plus sign isn't allowed in integer literals or string to
    parseInt), from the petty code to check whether the resultant value
    is within the allowed range for this or that fixed-precision data
    type in this or that programming language as implemented by this or
    that vendor.
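A minimal sketch of that kind of syntax check, built on strspn. This is not the code from the linked page. The names check_integer_syntax and enum int_syntax are invented for the example, and "white" is taken to mean the space character only, as in the regular expression at the top of the thread.

```c
#include <string.h>

/* Hypothetical result codes: which part of the pattern failed, if any. */
enum int_syntax {
    INT_OK = 0,
    INT_EMPTY,          /* nothing but spaces */
    INT_NO_DIGITS,      /* no digit where the number should start */
    INT_TRAILING_JUNK   /* something other than spaces after the digits */
};

/* Checks s against  ^ *[-+]?[0-9]+ *$  without converting it. */
enum int_syntax check_integer_syntax(const char *s)
{
    size_t n;

    s += strspn(s, " ");                 /* skip leading spaces */
    if (*s == '\0')
        return INT_EMPTY;
    if (*s == '+' || *s == '-')          /* optional sign */
        s++;
    n = strspn(s, "0123456789");         /* length of the run of digits */
    if (n == 0)
        return INT_NO_DIGITS;
    s += n;
    s += strspn(s, " ");                 /* skip trailing spaces */
    return *s == '\0' ? INT_OK : INT_TRAILING_JUNK;
}
```

Because each failure has its own code, the caller can print a specific diagnostic instead of a bare "syntax error".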

    So I have one function, stringCheckInteger, which checks whether
    the string is of the appropriate general format, making liberal use
    of strspn and strcspn, and another function, stringIntegerTellRange,
    which checks whether the string-number can be converted to an
    actual number by strtoll, and if so then also checks whether it's
    within ranges of the successively smaller integer data types. I
    think this is my final c version for the time being.
    If anyone is curious, see:
    <http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intc>
    go to the second form ("re-write").
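A rough sketch of the range-reporting idea, assuming the string has already passed a syntax check. It is not the stringIntegerTellRange code from the linked page. The names classify_range and enum int_fit are invented here, and strtoll requires C99 or later.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result codes: the smallest signed type that holds the value. */
enum int_fit { FITS_SCHAR, FITS_SHORT, FITS_INT, FITS_LONG, FITS_LLONG, FITS_NONE };

/* Converts s with strtoll and reports the smallest signed type whose range
 * contains the value, or FITS_NONE if even long long overflows. */
enum int_fit classify_range(const char *s)
{
    long long v;

    errno = 0;                       /* strtoll only sets errno on failure */
    v = strtoll(s, NULL, 10);
    if (errno == ERANGE)
        return FITS_NONE;
    if (v >= SCHAR_MIN && v <= SCHAR_MAX) return FITS_SCHAR;
    if (v >= SHRT_MIN && v <= SHRT_MAX)   return FITS_SHORT;
    if (v >= INT_MIN && v <= INT_MAX)     return FITS_INT;
    if (v >= LONG_MIN && v <= LONG_MAX)   return FITS_LONG;
    return FITS_LLONG;
}
```

The limits come from <limits.h>, so the answer for a given string depends on the implementation, which is exactly the vendor-specific part being separated out.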
    robert maas, see http://tinyurl.com/uh3t, Feb 13, 2007
    #12
  13. robert maas, see http://tinyurl.com/uh3t

    CBFalconer Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    "robert maas, see http://tinyurl.com/uh3t" wrote:
    >

    .... snip ...
    >
    > Not a bug. It's just that the part of the program to detect EOF
    > wasn't yet written, and that's the very part I was trying to
    > develop.
    > Step 1: Put in a printf to see what value comes back when I press
    > ctrl-D.


    What for? You have a macro called EOF available. Use it.
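A minimal sketch of a read loop that compares against EOF directly. The original program is not shown in this part of the thread, so the function name count_chars and the choice of getc are assumptions made only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining on a stream.
 * Returns the count, or -1 if the loop ended on a read error
 * rather than on end-of-file. */
long count_chars(FILE *fp)
{
    int c;            /* int, not char, so EOF stays distinguishable */
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return ferror(fp) ? -1 : n;
}
```

The loop stops the first time getc returns EOF, so pressing ctrl-D on a terminal ends it without knowing the numeric value behind the macro.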

    --
    <http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
    <http://www.securityfocus.com/columnists/423>

    "A man who is right every time is not likely to do very much."
    -- Francis Crick, co-discoverer of DNA
    "There is nothing more amazing than stupidity in action."
    -- Thomas Matthews
    CBFalconer, Feb 13, 2007
    #13
  14. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    robert maas, see http://tinyurl.com/uh3t wrote, On 13/02/07 04:24:
    >> From: Flash Gordon <>
    >>> ... The first time I ran this
    >>> program, without the sleep call, and pressed ctrl-D to generate
    >>> end-of-stream on stdin, the program went into infinite read-EOS
    >>> spew-text loop, which filled up all modem buffers. ...

    >> I can only suggest that you had some other bug in your program at that
    >> point or a bug in your modem. As presented your program would not do
    >> that whether it detected an error or EOF it would break out of the loop
    >> and terminate.

    >
    > Not a bug. It's just that the part of the program to detect EOF wasn't yet
    > written, and that's the very part I was trying to develop.


    So it still was not needed in the program you posted.

    > Step 1: Put in a printf to see what value comes back when I press ctrl-D.
    > Step 2: Write code to detect that value and break out of loop.
    > Step 3: Test that to see whether it works.
    > Step 4: Remove the printf.
    > Unfortunately step 1 blew me out for ten minutes or so without the sleep.


    That is because it is the wrong approach
    1) read the documentation to see what the correct way to do it is
    2) write the code
    3) test it

    Fewer steps and more likely to give you a reliable result.

    If you used your method with "isspace" it might lead you to think it
    returns 1 to indicate a space, then due to a library upgrade your code
    could break because actually it returns any non-zero value for a space.
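To illustrate the point, a small sketch that treats the result of isspace as a truth value instead of comparing it with 1. The wrapper name is_space_char is invented for the example.

```c
#include <ctype.h>

/* Hypothetical wrapper: returns exactly 1 for white space, 0 otherwise. */
int is_space_char(char c)
{
    /* isspace() returns zero or some non-zero value, not necessarily 1,
     * so test against zero. The cast avoids undefined behaviour when
     * plain char is signed and c is negative. */
    return isspace((unsigned char)c) != 0;
}
```

Writing `isspace(c) == 1` might happen to work on one library and fail silently on another.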

    > Unfortunately c doesn't allow any sleep times except integers. I


    Wrong. C does not allow *any* sleeping. The sleep function is *not* part
    of C it is part of something else your system provides and makes
    accessible from C as an extension.

    <snip>

    > think this is my final c version for the time being.
    > If anyone is curious, see:
    > <http://www.rawbw.com/~rem/HelloPlus/CookBook/h4s.html#h4intc>
    > go to the second form ("re-write").


    I may or may not look later.
    --
    Flash Gordon
    Flash Gordon, Feb 13, 2007
    #14
  15. Flash Gordon said:

    > robert maas, see http://tinyurl.com/uh3t wrote, On 13/02/07 04:24:

    <snip>
    >
    >> Unfortunately c doesn't allow any sleep times except integers. I

    >
    > Wrong. C does not allow *any* sleeping.


    Wrong. C does *allow* sleeping. It just doesn't *support* it.

    > The [sleep] function is *not* part of C


    Arguable. It's not defined by the Standard, I agree. But what is a
    language, if not the set of all sentences that can be formed according
    to the rules of that language? It is certainly possible to call a
    function named sleep(), within the rules of C.

    Incidentally, I am not arguing that sleep() is topical.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Feb 13, 2007
    #15
  16. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    Richard Heathfield wrote, On 13/02/07 10:36:
    > Flash Gordon said:
    >
    >> robert maas, see http://tinyurl.com/uh3t wrote, On 13/02/07 04:24:

    > <snip>
    >>> Unfortunately c doesn't allow any sleep times except integers. I

    >> Wrong. C does not allow *any* sleeping.

    >
    > Wrong. C does *allow* sleeping. It just doesn't *support* it.


    If you want to argue it that way the OP is still wrong. Since if C
    allows it then it certainly does not prevent the sleep times from being
    double or anything else.

    >> The [sleep] function is *not* part of C

    >
    > Arguable. It's not defined by the Standard, I agree. But what is a
    > language, if not the set of all sentences that can be formed according
    > to the rules of that language? It is certainly possible to call a
    > function named sleep(), within the rules of C.


    Yes, and the rules of C allow the sleep function to take a double.

    > Incidentally, I am not arguing that sleep() is topical.


    Indeed. You are arguing terminology and I don't have any problem with
    yours. I was just continuing using the terminology the OP used which was
    possibly wrong of me. However, my original comment about the use of
    sleep was simply that it was not a standard function and seemed
    pointless in the code presented, the OP appeared not to have understood
    that point based on talking about C only allowing integer sleep times.

    It is important for the OP to realise that the sleep function s/he is
    using is not one provided by the C language but one provided by his
    specific implementation (and a number of other implementations, but not
    even all implementations for common desktops).
    --
    Flash Gordon
    Flash Gordon, Feb 13, 2007
    #16
  17. On Sun, 11 Feb 2007 11:13:34 -0800, robert maas, wrote:
    >Per a nice example I found on the Web:
    > Linkname: Bullet Proof Integer Input Using strtol()
    > URL: http://home.att.net/~jackklein/c/code/strtol.html


    The linked code does not reflect the current C Standard:
    "If the correct value is outside the range of representable values,
    LONG_MIN, LONG_MAX ... is returned ... and the value of the macro
    ERANGE is stored in errno."

    Best regards,
    Roland Pibinger
    Roland Pibinger, Feb 13, 2007
    #17
  18. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    Roland Pibinger wrote, On 13/02/07 14:32:
    > On Sun, 11 Feb 2007 11:13:34 -0800, robert maas, wrote:
    >> Per a nice example I found on the Web:
    >> Linkname: Bullet Proof Integer Input Using strtol()
    >> URL: http://home.att.net/~jackklein/c/code/strtol.html

    >
    > The linked code does not reflect the current C Standard:
    > "If the correct value is outside the range of representable values,
    > LONG_MIN, LONG_MAX ... is returned ... and the value of the macro
    > ERANGE is stored in errno."


    Looks like it allows for that to me. It includes:
    if (ERANGE == errno)
    {
        puts("number out of range\n");
    }

    Admittedly it does not separate out positive and negative out of range,
    but that information is mentioned in the text.
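Separating the two cases takes only one more comparison, because on overflow strtol returns LONG_MAX or LONG_MIN according to the sign. A minimal sketch follows. The function name strtol_overflow is invented and is not part of the linked code.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: +1 if the value in s is above LONG_MAX,
 * -1 if it is below LONG_MIN, 0 if strtol reported no range error. */
int strtol_overflow(const char *s)
{
    long v;

    errno = 0;                    /* must be cleared before the call */
    v = strtol(s, NULL, 10);
    if (errno != ERANGE)
        return 0;
    return v == LONG_MAX ? 1 : -1;
}
```

Checking errno as well as the return value matters, since LONG_MAX and LONG_MIN are also legitimate in-range results.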
    --
    Flash Gordon
    Flash Gordon, Feb 13, 2007
    #18
  19. On Sun, 11 Feb 2007 22:26:27 +0000, Flash Gordon wrote:
    >robert maas, see http://tinyurl.com/uh3t wrote, On 11/02/07 19:13:


    >> errno = 0;
    >> long_var = strtol(chars, &endptr, 10);
    >> if (ERANGE == errno) {
    >> printf("Number out of range.\n");
    >> } else if (endptr==chars) {

    >
    >At this point you could scan from the start of the string for the first
    >character that is not white space and report a different error depending
    >on what it is using the is* functions from ctype.h. Alternatively you
    >could look at using strspn or strcspn from string.h


    You consider leading whitespace an error?

    >
    >> printf("No number given.\n");
    >> } else if ('\0' != *endptr) {
    >> printf("After number, extra characters on input line.\n");
    >> } else {
    >> printf("Looks good? N=%ld\n", long_var);
    >> }
    >> sleep(1);


    IMO, the last part of the function should look like the following:

    errno = 0;
    long_var = strtol(chars, &endptr, 0);
    if (ERANGE == errno) {
        printf("Number out of range.\n");
    } else if (endptr==chars) {
        printf("No number or not parsable number given.\n");
    } else if ('\0' == *endptr) {
        printf("Looks good? N=%ld\n", long_var);
    } else if (endptr != chars) {
        printf("After number, extra characters on input line.\n");
    } else {
        printf("Unknown error, should never happen.\n");
    }

    Best regards,
    Roland Pibinger
    Roland Pibinger, Feb 13, 2007
    #19
  20. robert maas, see http://tinyurl.com/uh3t

    Flash Gordon Guest

    Re: Requesting advice how to clean up C code for validating string represents integer

    Roland Pibinger wrote, On 13/02/07 15:45:
    > On Sun, 11 Feb 2007 22:26:27 +0000, Flash Gordon wrote:
    >> robert maas, see http://tinyurl.com/uh3t wrote, On 11/02/07 19:13:

    >
    >>> errno = 0;
    >>> long_var = strtol(chars, &endptr, 10);
    >>> if (ERANGE == errno) {
    >>> printf("Number out of range.\n");
    >>> } else if (endptr==chars) {

    >> At this point you could scan from the start of the string for the first
    >> character that is not white space and report a different error depending
    >> on what it is using the is* functions from ctype.h. Alternatively you
    >> could look at using strspn or strcspn from string.h

    >
    > You consider leading whitespace an error?


    Not in this case. Since the OP wanted more specific errors I suggested
    scanning for the first non-whitespace character to allow identification
    of the character that caused the failure.
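A sketch of that scan for the endptr == chars case, using the is* functions from ctype.h. The function and enum names are made up for illustration, and it assumes base 10 was passed to strtol.

```c
#include <ctype.h>

/* Hypothetical reason codes for "strtol consumed nothing". */
enum no_number_reason { NN_EMPTY, NN_BARE_SIGN, NN_LETTER, NN_OTHER };

/* Skips leading white space, then classifies the first character
 * that stopped strtol from finding a number. */
enum no_number_reason why_no_number(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '\0')
        return NN_EMPTY;          /* empty or all white space */
    if (*s == '+' || *s == '-')
        return NN_BARE_SIGN;      /* sign with no digit after it */
    if (isalpha((unsigned char)*s))
        return NN_LETTER;
    return NN_OTHER;              /* punctuation or anything else */
}
```

Each code can then be mapped to its own message in place of a single "No number given."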

    >>> printf("No number given.\n");
    >>> } else if ('\0' != *endptr) {
    >>> printf("After number, extra characters on input line.\n");
    >>> } else {
    >>> printf("Looks good? N=%ld\n", long_var);
    >>> }
    >>> sleep(1);

    >
    > IMO, the last part of the function should look like the following:
    >
    > errno = 0;
    > long_var = strtol(chars, &endptr, 0);
    > if (ERANGE == errno) {
    > printf("Number out of range.\n");
    > } else if (endptr==chars) {
    > printf("No number or not parsable number given.\n");


    The OP wanted to be more specific in error reporting hence my suggesting
    ways of analysing this further.

    > } else if ('\0' == *endptr) {
    > printf("Looks good? N=%ld\n", long_var);
    > } else if (endptr != chars) {


    You have already trapped the case when endptr==chars above, so you know
    that endptr!=chars if you reach here so I would consider the above test
    to be a sign of the coder having not understood what s/he was writing.

    > printf("After number, extra characters on input line.\n");
    > } else {
    > printf("Unknown error, should never happen.\n");


    It is guaranteed not to happen!

    > }
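With the redundant test and the unreachable branch dropped, the chain reduces to four outcomes. A minimal sketch, returning a code instead of printing so it can be tested; the names parse_long and enum parse_result are invented here, and base 10 is assumed as in the OP's version.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical result codes, one per branch of the chain. */
enum parse_result { PARSE_OK, PARSE_RANGE, PARSE_NO_NUMBER, PARSE_TRAILING };

/* Converts chars to a long. *out is written only on PARSE_OK. */
enum parse_result parse_long(const char *chars, long *out)
{
    char *endptr;
    long v;

    errno = 0;
    v = strtol(chars, &endptr, 10);
    if (errno == ERANGE)
        return PARSE_RANGE;        /* value does not fit in a long */
    if (endptr == chars)
        return PARSE_NO_NUMBER;    /* nothing was converted */
    if (*endptr != '\0')
        return PARSE_TRAILING;     /* extra characters after the number */
    *out = v;
    return PARSE_OK;
}
```

Once endptr == chars has been ruled out, the only remaining question is whether *endptr is the terminator, so no fifth case exists.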

    --
    Flash Gordon
    Flash Gordon, Feb 13, 2007
    #20