sscanf up to 127 characters into a buffer?

Discussion in 'C Programming' started by mathog, Mar 14, 2013.

  1. mathog

    mathog Guest

    Is this sscanf adequate to prevent a buffer overflow if
    the input text field is >127 characters in length? The goal
    is to put everything from the first character of the text
    following the 3 floats to the end of the line into fontname.
    But only if it will all fit into the buffer.

    int elements;
    float f1,f2,f3;
    char fontname[128];
    char *string;
    while(1){
    /* load the string from a file, or exit loop on EOF
    Input lines look like:
    1.23 2.34 -1.2323 some text here
    */
    elements = sscanf(string,
    "%lf %lf %lf %127[^\n]",
    &f1,&f2,&f3, &fontname[0]);
    /* do something with the values */
    }

    The code at present lacks the 127, and I assume
    that it would generate a memory access error on an excessively long
    input line.

    Thanks,

    David Mathog
     
    mathog, Mar 14, 2013
    #1
    1. Advertising

  2. mathog

    Joe Pfeiffer Guest

    mathog <> writes:

    > Is this sscanf adequate to prevent a buffer overflow if
    > the input text field is >127 characters in length? The goal
    > is to put everything from the first character of the text
    > following the 3 floats to the end of the line into fontname.
    > But only if it will all fit into the buffer.
    >
    > int elements;
    > float f1,f2,f3;
    > char fontname[128];
    > char *string;
    > while(1){
    > /* load the string from a file, or exit loop on EOF
    > Input lines look like:
    > 1.23 2.34 -1.2323 some text here
    > */
    > elements = sscanf(string,
    > "%lf %lf %lf %127[^\n]",
    > &f1,&f2,&f3, &fontname[0]);
    > /* do something with the values */
    > }
    >
    > The code at present lacks the 127, and I assume
    > that it would generate a memory access error on an excessively long
    > input line.


    Almost...

    %lf looks for a double, not a float. You need to either declare
    double f1, f2, f3;
    or use %f

    sscanf() doesn't know about regular expressions. %127[^\n] is
    meaningless. %127s would read a string of up to 127 characters, but it
    couldn't contain any white space.

    If I were doing this, I'd probably use three calls to strtok() to pull
    off the floats (and then use atof to convert them), and strncpy() to
    pull the name into the buffer (note that you have to make sure that if
    the font is too long to fit in the buffer you get properly
    null-terminated yourself).
     
    Joe Pfeiffer, Mar 14, 2013
    #2
    1. Advertising

  3. mathog

    mathog Guest

    Joe Pfeiffer wrote:
    > Almost...
    >
    > %lf looks for a double, not a float. You need to either declare
    > double f1, f2, f3;


    Sorry brain fart typing in the small example, the original, much larger,
    code uses double.

    > or use %f
    >
    > sscanf() doesn't know about regular expressions. %127[^\n] is
    > meaningless.


    Hmm, not meaningless, but perhaps a gcc extension?
    The man page for scanf says:

    [ Matches a nonempty sequence of characters from the
    specified set of accepted characters; the next pointer must be a pointer
    to char, and there must be enough room for all the characters in the
    string, plus a terminating null byte. ...

    Regards,

    David Mathog
     
    mathog, Mar 14, 2013
    #3
  4. mathog

    mathog Guest

    Joe Pfeiffer wrote:

    > sscanf() doesn't know about regular expressions. %127[^\n] is
    > meaningless.


    After some searching, it is not a gcc extension, but part of C99,
    section 7.19.6.2. Information from here:

    http://stackoverflow.com/questions/5999164/is-scanfs-regex-support-a-standard

    > %127s would read a string of up to 127 characters, but it
    > couldn't contain any white space.


    %127[^\n] does move white space characters into the buffer.

    Regards,

    David Mathog
     
    mathog, Mar 14, 2013
    #4
  5. mathog

    Joe Pfeiffer Guest

    mathog <> writes:

    > Joe Pfeiffer wrote:
    >>
    >> sscanf() doesn't know about regular expressions. %127[^\n] is
    >> meaningless.

    >
    > Hmm, not meaningless, but perhaps a gcc extension?
    > The man page for scanf says:
    >
    > [ Matches a nonempty sequence of characters from the
    > specified set of accepted characters; the next pointer must be a
    > pointer to char, and there must be enough room for all the characters
    > in the string, plus a terminating null byte. ...


    I apologize! checking, it's even in the C99 standard. I don't know
    when it was added.
     
    Joe Pfeiffer, Mar 14, 2013
    #5
  6. mathog

    Joe Pfeiffer Guest

    mathog <> writes:

    > Joe Pfeiffer wrote:
    >
    >> sscanf() doesn't know about regular expressions. %127[^\n] is
    >> meaningless.

    >
    > After some searching, it is not a gcc extension, but part of C99,
    > section 7.19.6.2. Information from here:
    >
    > http://stackoverflow.com/questions/5999164/is-scanfs-regex-support-a-standard
    >
    >> %127s would read a string of up to 127 characters, but it
    >> couldn't contain any white space.

    >
    > %127[^\n] does move white space characters into the buffer.


    I stand corrected.
     
    Joe Pfeiffer, Mar 14, 2013
    #6
  7. mathog

    Lew Pitcher Guest

    On Thursday 14 March 2013 15:56, in comp.lang.c, wrote:

    > Joe Pfeiffer wrote:
    >> Almost...
    >>
    >> %lf looks for a double, not a float. You need to either declare
    >> double f1, f2, f3;

    >
    > Sorry brain fart typing in the small example, the original, much larger,
    > code uses double.
    >
    >> or use %f
    >>
    >> sscanf() doesn't know about regular expressions. %127[^\n] is
    >> meaningless.

    >
    > Hmm, not meaningless, but perhaps a gcc extension?


    Nope. It's there in the C 1999 standard

    ISO/IEC C 9899-1999 7.19.6.2 ("The fscanf function") paragraph 12 states:

    12 The conversion speciï¬ers and their meanings are:
    ...
    [ Matches a nonempty sequence of characters from a set of expected
    characters (the scanset)) If no l length modiï¬er is present, the
    corresponding argument shall be a pointer to the initial element of a
    character array large enough to accept the sequence and a terminating
    null character, which will be added automatically.

    If an l length modiï¬er is present, the input shall be a sequence of
    multibyte characters that begins in the initial shift state. Each
    multibyte character is converted to a wide character as if by a call to
    the mbrtowc function, with the conversion state described by an
    mbstate_t object initialized to zero before the ï¬rst multibyte
    character is converted. The corresponding argument shall be a pointer
    to the initial element of an array of wchar_t large enough to accept
    the sequence and the terminating null wide character, which will be
    added automatically.

    The conversion speciï¬er includes all subsequent charactersin the
    format string, up to and including the matching right bracket (]).The
    characters between the brackets (the scanlist) compose the scanset,
    unless the character after the left bracket is a circumflex(ˆ), in
    which case the scanset contains all characters that do not appear in
    the scanlist between the circumflex and the right bracket. If the
    conversion speciï¬er begins with [] or [ˆ], the right bracket character
    is in the scanlist and the next following right bracket character is
    the matching right bracket that ends the speciï¬cation; otherwise the
    ï¬rst following right bracket character is the one that ends the
    speciï¬cation. If a - character is in the scanlist and is not the ï¬rst,
    nor the second where the ï¬rst character is a ˆ, nor the last character,
    the behavior is implementation-deï¬ned.


    --
    Lew Pitcher
    "In Skills, We Trust"
     
    Lew Pitcher, Mar 14, 2013
    #7
  8. Joe Pfeiffer <> writes:
    > mathog <> writes:
    >> Is this sscanf adequate to prevent a buffer overflow if
    >> the input text field is >127 characters in length? The goal
    >> is to put everything from the first character of the text
    >> following the 3 floats to the end of the line into fontname.
    >> But only if it will all fit into the buffer.
    >>
    >> int elements;
    >> float f1,f2,f3;
    >> char fontname[128];
    >> char *string;
    >> while(1){
    >> /* load the string from a file, or exit loop on EOF
    >> Input lines look like:
    >> 1.23 2.34 -1.2323 some text here
    >> */
    >> elements = sscanf(string,
    >> "%lf %lf %lf %127[^\n]",
    >> &f1,&f2,&f3, &fontname[0]);
    >> /* do something with the values */
    >> }
    >>
    >> The code at present lacks the 127, and I assume
    >> that it would generate a memory access error on an excessively long
    >> input line.

    >
    > Almost...
    >
    > %lf looks for a double, not a float. You need to either declare
    > double f1, f2, f3;
    > or use %f
    >

    [...]
    >
    > If I were doing this, I'd probably use three calls to strtok() to pull
    > off the floats (and then use atof to convert them), and strncpy() to
    > pull the name into the buffer (note that you have to make sure that if
    > the font is too long to fit in the buffer you get properly
    > null-terminated yourself).


    atof() doesn't detect errors; strtod() does.

    strncpy() is rarely the right answer. It usually either fails to
    null-terminate the target array, or pads it to the end with unnecessary
    null characters. I ranted about it here:

    http://the-flat-trantor-society.blogspot.com/2012/03/no-strncpy-is-not-safer-strcpy.html

    If you can confirm that there's enough room in the target, just use
    strcpy(). If there isn't enough room, you'll have to decide how to
    handle that; strncpy()'s behavior in that case is probably not what you
    want.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 14, 2013
    #8
  9. mathog

    Eric Sosman Guest

    On 3/14/2013 4:13 PM, Joe Pfeiffer wrote:
    > mathog <> writes:
    >
    >> Joe Pfeiffer wrote:
    >>>
    >>> sscanf() doesn't know about regular expressions. %127[^\n] is
    >>> meaningless.

    >>
    >> Hmm, not meaningless, but perhaps a gcc extension?
    >> The man page for scanf says:
    >>
    >> [ Matches a nonempty sequence of characters from the
    >> specified set of accepted characters; the next pointer must be a
    >> pointer to char, and there must be enough room for all the characters
    >> in the string, plus a terminating null byte. ...

    >
    > I apologize! checking, it's even in the C99 standard. I don't know
    > when it was added.


    It was added in 1989 by ANSI, but not until 1990 by ISO.

    --
    Eric Sosman
    d
     
    Eric Sosman, Mar 14, 2013
    #9
  10. mathog

    Eric Sosman Guest

    On 3/14/2013 4:04 PM, mathog wrote:
    > Joe Pfeiffer wrote:
    >
    >> sscanf() doesn't know about regular expressions. %127[^\n] is
    >> meaningless.

    >
    > After some searching, it is not a gcc extension, but part of C99,
    > section 7.19.6.2. Information from here:
    >
    > http://stackoverflow.com/questions/5999164/is-scanfs-regex-support-a-standard
    >
    >
    >> %127s would read a string of up to 127 characters, but it
    >> couldn't contain any white space.

    >
    > %127[^\n] does move white space characters into the buffer.


    Yes. With your format string, though:

    elements = sscanf(string,
    "%lf %lf %lf %127[^\n]",
    &f1,&f2,&f3, &fontname[0]);

    .... the "%[" directive will move no leading white space, because
    it's already been eaten by the " " immediately before. (It will,
    however, move embedded and trailing white space, up to but not
    including the '\n'.) If you want the leading spaces, try

    elements = sscanf(string,
    "%lf%lf%lf%127[^\n]",
    &f1,&f2,&f3, &fontname[0]);

    --
    Eric Sosman
    d
     
    Eric Sosman, Mar 14, 2013
    #10
  11. mathog

    Alan Curry Guest

    In article <khtgth$il5$>,
    Eric Sosman <> wrote:
    >On 3/14/2013 4:13 PM, Joe Pfeiffer wrote:
    >> mathog <> writes:
    >>
    >>> Joe Pfeiffer wrote:
    >>>>
    >>>> sscanf() doesn't know about regular expressions. %127[^\n] is
    >>>> meaningless.
    >>>
    >>> Hmm, not meaningless, but perhaps a gcc extension?
    >>> The man page for scanf says:
    >>>
    >>> [ Matches a nonempty sequence of characters from the
    >>> specified set of accepted characters; the next pointer must be a
    >>> pointer to char, and there must be enough room for all the characters
    >>> in the string, plus a terminating null byte. ...

    >>
    >> I apologize! checking, it's even in the C99 standard. I don't know
    >> when it was added.

    >
    > It was added in 1989 by ANSI, but not until 1990 by ISO.


    Another opportunity grep the ancient source!

    V6 UNIX (1975) had scanf %[...] support. ANSI "added" it when they took on
    the C language I guess, but not in the same way they added things like
    prototypes which were actually changes from K&R1.

    V5 doesn't appear to have any scanf-like function at all, so it looks like
    there was never a scanf without this feature.

    --
    Alan Curry
     
    Alan Curry, Mar 14, 2013
    #11
  12. mathog

    Eric Sosman Guest

    On 3/14/2013 7:38 PM, Alan Curry wrote:
    > In article <khtgth$il5$>,
    > Eric Sosman <> wrote:
    >> On 3/14/2013 4:13 PM, Joe Pfeiffer wrote:
    >>>[...]
    >>> I apologize! checking, it's even in the C99 standard. I don't know
    >>> when it was added.

    >>
    >> It was added in 1989 by ANSI, but not until 1990 by ISO.

    >
    > Another opportunity grep the ancient source!
    >
    > V6 UNIX (1975) had scanf %[...] support. ANSI "added" it when they took on
    > the C language I guess, but not in the same way they added things like
    > prototypes which were actually changes from K&R1.


    Hmmm. Looks like that protrusion in my cheek caused by
    the pressure of my tongue is not as prominent as I'd thought.

    --
    Eric Sosman
    d
     
    Eric Sosman, Mar 14, 2013
    #12
  13. mathog

    Shao Miller Guest

    On 3/14/2013 15:03, mathog wrote:
    > Is this sscanf adequate to prevent a buffer overflow if
    > the input text field is >127 characters in length?


    I don't think so, but it seems close.

    > The goal
    > is to put everything from the first character of the text
    > following the 3 floats to the end of the line into fontname.
    > But only if it will all fit into the buffer.
    >


    A minimal test-case might have excluded the 'float' business in order
    not to distract from a question about using 'sscanf' to populate a
    buffer without overflow. Having said that, it's good that you included
    the 'float's, since at least one person pointed out the "%lf". :)

    > int elements;
    > float f1,f2,f3;
    > char fontname[128];
    > char *string;
    > while(1){
    > /* load the string from a file, or exit loop on EOF
    > Input lines look like:
    > 1.23 2.34 -1.2323 some text here
    > */
    > elements = sscanf(string,
    > "%lf %lf %lf %127[^\n]",
    > &f1,&f2,&f3, &fontname[0]);
    > /* do something with the values */
    > }
    >
    > The code at present lacks the 127, and I assume
    > that it would generate a memory access error on an excessively long
    > input line.
    >


    I don't know if the following code might help, or not. It might have bugs.

    #include <stdio.h>
    #include <stdlib.h>

    #define Stringify_(x) # x
    #define Stringify(x) Stringify_(x)
    #define CvBufSize 128

    int main(void) {
    char src[] = "Hello, world!";
    char buf[CvBufSize];
    int sr;
    int chars;

    sr = sscanf(src, "%" Stringify(CvBufSize) "[^\n]%n", buf, &chars);
    if (sr != 1) {
    printf("Failed\n");
    return EXIT_FAILURE;
    }
    printf("%s\n", buf);
    return EXIT_SUCCESS;
    }

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
     
    Shao Miller, Mar 15, 2013
    #13
  14. On Fri, 15 Mar 2013 03:33:02 -0400, Shao Miller wrote:

    > I don't know if the following code might help, or not. It might have
    > bugs.


    It does have at least a glaring off-by-one bug.
    For string-input with a maximum field width, the buffer you provide must
    be at least one byte larger than the specified maximum to accommodate the
    '\0' that sscanf will write to it after reading the maximum length string.

    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > #define Stringify_(x) # x
    > #define Stringify(x) Stringify_(x)

    #define CvMaxStringLength 127
    > #define CvBufSize 128

    #define CbBufSize CvMaxStringLength+1
    >
    > int main(void) {
    > char src[] = "Hello, world!";
    > char buf[CvBufSize];
    > int sr;
    > int chars;
    >
    > sr = sscanf(src, "%" Stringify(CvBufSize) "[^\n]%n", buf,

    sr = sscanf(src, "%" Stringify(CvMaxStringLength) "[^\n]%n", buf,
    > &chars);
    > if (sr != 1) {
    > printf("Failed\n");
    > return EXIT_FAILURE;
    > }
    > printf("%s\n", buf);
    > return EXIT_SUCCESS;
    > }
    >
    > --
    > - Shao Miller


    Bart v Ingen Schenau
     
    Bart van Ingen Schenau, Mar 15, 2013
    #14
  15. mathog

    Ike Naar Guest

    On 2013-03-15, Bart van Ingen Schenau <> wrote:
    > On Fri, 15 Mar 2013 03:33:02 -0400, Shao Miller wrote:
    >> #define Stringify_(x) # x
    >> #define Stringify(x) Stringify_(x)

    > #define CvMaxStringLength 127
    >> #define CvBufSize 128

    > #define CbBufSize CvMaxStringLength+1


    This definition may lead to an unpleasant surprise
    when it is used in an expression such as

    CbBufSize * 2

    A safer definition would be

    #define CbBufSize (CvMaxStringLength+1)
     
    Ike Naar, Mar 15, 2013
    #15
  16. mathog

    Shao Miller Guest

    On 3/15/2013 07:24, Bart van Ingen Schenau wrote:
    > On Fri, 15 Mar 2013 03:33:02 -0400, Shao Miller wrote:
    >
    >> I don't know if the following code might help, or not. It might have
    >> bugs.

    >
    > It does have at least a glaring off-by-one bug.
    > For string-input with a maximum field width, the buffer you provide must
    > be at least one byte larger than the specified maximum to accommodate the
    > '\0' that sscanf will write to it after reading the maximum length string.
    >


    Dwoops. The original post specified 127 and somehow + 1 happened before
    writing the code, instead of inside the array bounds. Thanks for making
    the code better than useless. :)

    One point I was trying to make with the code was that the %n can be
    useful for a few things, such as not needing to 'strlen' the destination
    buffer. Hopefully the error didn't distract too much from that.

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
     
    Shao Miller, Mar 15, 2013
    #16
  17. Shao Miller <> writes:
    [...]
    > #define Stringify_(x) # x
    > #define Stringify(x) Stringify_(x)


    These would be better defined as STRINGIFY_ and STRINGIFY, so on a call
    it's obvious that they're macros.

    > #define CvBufSize 128


    Ditto, but not quite as important in this case.

    [...]

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 15, 2013
    #17
  18. mathog

    Shao Miller Guest

    On 3/15/2013 13:34, Keith Thompson wrote:
    > Shao Miller <> writes:
    > [...]
    >> #define Stringify_(x) # x
    >> #define Stringify(x) Stringify_(x)

    >
    > These would be better defined as STRINGIFY_ and STRINGIFY, so on a call
    > it's obvious that they're macros.
    >
    >> #define CvBufSize 128

    >
    > Ditto, but not quite as important in this case.
    >
    > [...]
    >


    If I was in a Rome where this was the trend, most certainly. In the
    last couple of years, I've kind of wondered why anyone should care about
    distinguishing macros versus otherwise. I can think of some reasons,
    but then there seem to be counter-reasons for them. I'd enjoy reading
    what other people have to say about it.

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
     
    Shao Miller, Mar 15, 2013
    #18
  19. Shao Miller <> writes:
    > On 3/15/2013 13:34, Keith Thompson wrote:
    >> Shao Miller <> writes:
    >> [...]
    >>> #define Stringify_(x) # x
    >>> #define Stringify(x) Stringify_(x)

    >>
    >> These would be better defined as STRINGIFY_ and STRINGIFY, so on a call
    >> it's obvious that they're macros.
    >>
    >>> #define CvBufSize 128

    >>
    >> Ditto, but not quite as important in this case.
    >>
    >> [...]

    >
    > If I was in a Rome where this was the trend, most certainly. In the
    > last couple of years, I've kind of wondered why anyone should care about
    > distinguishing macros versus otherwise. I can think of some reasons,
    > but then there seem to be counter-reasons for them. I'd enjoy reading
    > what other people have to say about it.


    The line invoking that macro was:

    sr = sscanf(src, "%" Stringify(CvBufSize) "[^\n]%n", buf, &chars);

    The problem is that that would be illegal if it were a function call.

    Other examples:

    int x;
    foo(x);

    That looks like x is being used before it's initialized -- but
    if foo is a macro, it's not necessarily an error. And if x *is*
    initialized, I'm going to assume that it has the same value after
    foo(x) that it had before -- but a macro can break that assumption.

    Basically when I'm reading code, knowing that a particular identifier
    is a macro name helps me to understand the code. More precisely,
    if there's a macro invocation, I can't be sure what the code does
    unless (a) I know it's a macro, and (b) I know how it's defined.

    If a macro is sufficiently well-behaved that it's practically a
    drop-in replacement for a function (like getc(), for example),
    then giving it a lowercase name is probably ok. Otherwise,
    it's best to follow the convention of all-caps for macro names.
    (Admittedly, the standard itself isn't very good about following it.)

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 15, 2013
    #19
  20. On Fri, 15 Mar 2013 13:01:19 +0000, Ike Naar wrote:

    > On 2013-03-15, Bart van Ingen Schenau <>
    > wrote:
    >> On Fri, 15 Mar 2013 03:33:02 -0400, Shao Miller wrote:
    >>> #define Stringify_(x) # x
    >>> #define Stringify(x) Stringify_(x)

    >> #define CvMaxStringLength 127
    >>> #define CvBufSize 128

    >> #define CbBufSize CvMaxStringLength+1

    >
    > This definition may lead to an unpleasant surprise when it is used in an
    > expression such as
    >
    > CbBufSize * 2
    >
    > A safer definition would be
    >
    > #define CbBufSize (CvMaxStringLength+1)


    Oops.
    What was it again about spelling flames that must contain a spelling
    error themselves?

    Bart v Ingen Schenau
     
    Bart van Ingen Schenau, Mar 17, 2013
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Alf P. Steinbach

    isdigit() for characters greater than 127

    Alf P. Steinbach, Oct 9, 2004, in forum: C++
    Replies:
    4
    Views:
    3,231
    James Gregory
    Oct 9, 2004
  2. Bernard Liang

    sscanf'ing floats into structure members

    Bernard Liang, Jun 21, 2006, in forum: C Programming
    Replies:
    3
    Views:
    709
    Keith Thompson
    Jun 21, 2006
  3. Felipe Navas
    Replies:
    2
    Views:
    83
    Joel VanderWerf
    May 17, 2007
  4. kike
    Replies:
    0
    Views:
    139
  5. Alextophi
    Replies:
    8
    Views:
    519
    Alan J. Flavell
    Dec 30, 2005
Loading...

Share This Page