sscanf up to 127 characters into a buffer?

M

mathog

Is this sscanf adequate to prevent a buffer overflow if
the input text field is >127 characters in length? The goal
is to put everything from the first character of the text
following the 3 floats to the end of the line into fontname.
But only if it will all fit into the buffer.

int elements;
float f1,f2,f3;
char fontname[128];
char *string;
while(1){
/* load the string from a file, or exit loop on EOF
Input lines look like:
1.23 2.34 -1.2323 some text here
*/
elements = sscanf(string,
"%lf %lf %lf %127[^\n]",
&f1,&f2,&f3, &fontname[0]);
/* do something with the values */
}

The code at present lacks the 127, and I assume
that it would generate a memory access error on an excessively long
input line.

Thanks,

David Mathog
 
J

Joe Pfeiffer

mathog said:
Is this sscanf adequate to prevent a buffer overflow if
the input text field is >127 characters in length? The goal
is to put everything from the first character of the text
following the 3 floats to the end of the line into fontname.
But only if it will all fit into the buffer.

int elements;
float f1,f2,f3;
char fontname[128];
char *string;
while(1){
/* load the string from a file, or exit loop on EOF
Input lines look like:
1.23 2.34 -1.2323 some text here
*/
elements = sscanf(string,
"%lf %lf %lf %127[^\n]",
&f1,&f2,&f3, &fontname[0]);
/* do something with the values */
}

The code at present lacks the 127, and I assume
that it would generate a memory access error on an excessively long
input line.

Almost...

%lf looks for a double, not a float. You need to either declare
double f1, f2, f3;
or use %f

sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless. %127s would read a string of up to 127 characters, but it
couldn't contain any white space.

If I were doing this, I'd probably use three calls to strtok() to pull
off the floats (and then use atof to convert them), and strncpy() to
pull the name into the buffer (note that you have to make sure that if
the font is too long to fit in the buffer you get properly
null-terminated yourself).
 
M

mathog

Joe said:
Almost...

%lf looks for a double, not a float. You need to either declare
double f1, f2, f3;

Sorry brain fart typing in the small example, the original, much larger,
code uses double.
or use %f

sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

Hmm, not meaningless, but perhaps a gcc extension?
The man page for scanf says:

[ Matches a nonempty sequence of characters from the
specified set of accepted characters; the next pointer must be a pointer
to char, and there must be enough room for all the characters in the
string, plus a terminating null byte. ...

Regards,

David Mathog
 
M

mathog

J

Joe Pfeiffer

mathog said:
Joe said:
sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

Hmm, not meaningless, but perhaps a gcc extension?
The man page for scanf says:

[ Matches a nonempty sequence of characters from the
specified set of accepted characters; the next pointer must be a
pointer to char, and there must be enough room for all the characters
in the string, plus a terminating null byte. ...

I apologize! checking, it's even in the C99 standard. I don't know
when it was added.
 
J

Joe Pfeiffer

mathog said:
Joe said:
sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

After some searching, it is not a gcc extension, but part of C99,
section 7.19.6.2. Information from here:

http://stackoverflow.com/questions/5999164/is-scanfs-regex-support-a-standard
%127s would read a string of up to 127 characters, but it
couldn't contain any white space.

%127[^\n] does move white space characters into the buffer.

I stand corrected.
 
L

Lew Pitcher

Joe said:
Almost...

%lf looks for a double, not a float. You need to either declare
double f1, f2, f3;

Sorry brain fart typing in the small example, the original, much larger,
code uses double.
or use %f

sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

Hmm, not meaningless, but perhaps a gcc extension?

Nope. It's there in the C 1999 standard

ISO/IEC C 9899-1999 7.19.6.2 ("The fscanf function") paragraph 12 states:

12 The conversion speciï¬ers and their meanings are:
...
[ Matches a nonempty sequence of characters from a set of expected
characters (the scanset)) If no l length modiï¬er is present, the
corresponding argument shall be a pointer to the initial element of a
character array large enough to accept the sequence and a terminating
null character, which will be added automatically.

If an l length modiï¬er is present, the input shall be a sequence of
multibyte characters that begins in the initial shift state. Each
multibyte character is converted to a wide character as if by a call to
the mbrtowc function, with the conversion state described by an
mbstate_t object initialized to zero before the ï¬rst multibyte
character is converted. The corresponding argument shall be a pointer
to the initial element of an array of wchar_t large enough to accept
the sequence and the terminating null wide character, which will be
added automatically.

The conversion speciï¬er includes all subsequent charactersin the
format string, up to and including the matching right bracket (]).The
characters between the brackets (the scanlist) compose the scanset,
unless the character after the left bracket is a circumflex(ˆ), in
which case the scanset contains all characters that do not appear in
the scanlist between the circumflex and the right bracket. If the
conversion speciï¬er begins with [] or [ˆ], the right bracket character
is in the scanlist and the next following right bracket character is
the matching right bracket that ends the speciï¬cation; otherwise the
ï¬rst following right bracket character is the one that ends the
speciï¬cation. If a - character is in the scanlist and is not the ï¬rst,
nor the second where the ï¬rst character is a ˆ, nor the last character,
the behavior is implementation-deï¬ned.
 
K

Keith Thompson

Joe Pfeiffer said:
mathog said:
Is this sscanf adequate to prevent a buffer overflow if
the input text field is >127 characters in length? The goal
is to put everything from the first character of the text
following the 3 floats to the end of the line into fontname.
But only if it will all fit into the buffer.

int elements;
float f1,f2,f3;
char fontname[128];
char *string;
while(1){
/* load the string from a file, or exit loop on EOF
Input lines look like:
1.23 2.34 -1.2323 some text here
*/
elements = sscanf(string,
"%lf %lf %lf %127[^\n]",
&f1,&f2,&f3, &fontname[0]);
/* do something with the values */
}

The code at present lacks the 127, and I assume
that it would generate a memory access error on an excessively long
input line.

Almost...

%lf looks for a double, not a float. You need to either declare
double f1, f2, f3;
or use %f
[...]

If I were doing this, I'd probably use three calls to strtok() to pull
off the floats (and then use atof to convert them), and strncpy() to
pull the name into the buffer (note that you have to make sure that if
the font is too long to fit in the buffer you get properly
null-terminated yourself).

atof() doesn't detect errors; strtod() does.

strncpy() is rarely the right answer. It usually either fails to
null-terminate the target array, or pads it to the end with unnecessary
null characters. I ranted about it here:

http://the-flat-trantor-society.blogspot.com/2012/03/no-strncpy-is-not-safer-strcpy.html

If you can confirm that there's enough room in the target, just use
strcpy(). If there isn't enough room, you'll have to decide how to
handle that; strncpy()'s behavior in that case is probably not what you
want.
 
E

Eric Sosman

mathog said:
Joe said:
sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

Hmm, not meaningless, but perhaps a gcc extension?
The man page for scanf says:

[ Matches a nonempty sequence of characters from the
specified set of accepted characters; the next pointer must be a
pointer to char, and there must be enough room for all the characters
in the string, plus a terminating null byte. ...

I apologize! checking, it's even in the C99 standard. I don't know
when it was added.

It was added in 1989 by ANSI, but not until 1990 by ISO.
 
E

Eric Sosman

Joe said:
sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

After some searching, it is not a gcc extension, but part of C99,
section 7.19.6.2. Information from here:

http://stackoverflow.com/questions/5999164/is-scanfs-regex-support-a-standard

%127s would read a string of up to 127 characters, but it
couldn't contain any white space.

%127[^\n] does move white space characters into the buffer.

Yes. With your format string, though:

elements = sscanf(string,
"%lf %lf %lf %127[^\n]",
&f1,&f2,&f3, &fontname[0]);

.... the "%[" directive will move no leading white space, because
it's already been eaten by the " " immediately before. (It will,
however, move embedded and trailing white space, up to but not
including the '\n'.) If you want the leading spaces, try

elements = sscanf(string,
"%lf%lf%lf%127[^\n]",
&f1,&f2,&f3, &fontname[0]);
 
A

Alan Curry

mathog said:
Joe Pfeiffer wrote:

sscanf() doesn't know about regular expressions. %127[^\n] is
meaningless.

Hmm, not meaningless, but perhaps a gcc extension?
The man page for scanf says:

[ Matches a nonempty sequence of characters from the
specified set of accepted characters; the next pointer must be a
pointer to char, and there must be enough room for all the characters
in the string, plus a terminating null byte. ...

I apologize! checking, it's even in the C99 standard. I don't know
when it was added.

It was added in 1989 by ANSI, but not until 1990 by ISO.

Another opportunity grep the ancient source!

V6 UNIX (1975) had scanf %[...] support. ANSI "added" it when they took on
the C language I guess, but not in the same way they added things like
prototypes which were actually changes from K&R1.

V5 doesn't appear to have any scanf-like function at all, so it looks like
there was never a scanf without this feature.
 
E

Eric Sosman

[...]
I apologize! checking, it's even in the C99 standard. I don't know
when it was added.

It was added in 1989 by ANSI, but not until 1990 by ISO.

Another opportunity grep the ancient source!

V6 UNIX (1975) had scanf %[...] support. ANSI "added" it when they took on
the C language I guess, but not in the same way they added things like
prototypes which were actually changes from K&R1.

Hmmm. Looks like that protrusion in my cheek caused by
the pressure of my tongue is not as prominent as I'd thought.
 
S

Shao Miller

Is this sscanf adequate to prevent a buffer overflow if
the input text field is >127 characters in length?

I don't think so, but it seems close.
The goal
is to put everything from the first character of the text
following the 3 floats to the end of the line into fontname.
But only if it will all fit into the buffer.

A minimal test-case might have excluded the 'float' business in order
not to distract from a question about using 'sscanf' to populate a
buffer without overflow. Having said that, it's good that you included
the 'float's, since at least one person pointed out the "%lf". :)
int elements;
float f1,f2,f3;
char fontname[128];
char *string;
while(1){
/* load the string from a file, or exit loop on EOF
Input lines look like:
1.23 2.34 -1.2323 some text here
*/
elements = sscanf(string,
"%lf %lf %lf %127[^\n]",
&f1,&f2,&f3, &fontname[0]);
/* do something with the values */
}

The code at present lacks the 127, and I assume
that it would generate a memory access error on an excessively long
input line.

I don't know if the following code might help, or not. It might have bugs.

#include <stdio.h>
#include <stdlib.h>

#define Stringify_(x) # x
#define Stringify(x) Stringify_(x)
#define CvBufSize 128

int main(void) {
char src[] = "Hello, world!";
char buf[CvBufSize];
int sr;
int chars;

sr = sscanf(src, "%" Stringify(CvBufSize) "[^\n]%n", buf, &chars);
if (sr != 1) {
printf("Failed\n");
return EXIT_FAILURE;
}
printf("%s\n", buf);
return EXIT_SUCCESS;
}
 
B

Bart van Ingen Schenau

I don't know if the following code might help, or not. It might have
bugs.

It does have at least a glaring off-by-one bug.
For string-input with a maximum field width, the buffer you provide must
be at least one byte larger than the specified maximum to accommodate the
'\0' that sscanf will write to it after reading the maximum length string.
#include <stdio.h>
#include <stdlib.h>

#define Stringify_(x) # x
#define Stringify(x) Stringify_(x) #define CvMaxStringLength 127
#define CvBufSize 128
#define CbBufSize CvMaxStringLength+1
int main(void) {
char src[] = "Hello, world!";
char buf[CvBufSize];
int sr;
int chars;

sr = sscanf(src, "%" Stringify(CvBufSize) "[^\n]%n", buf,
sr = sscanf(src, "%" Stringify(CvMaxStringLength) "[^\n]%n", buf,
&chars);
if (sr != 1) {
printf("Failed\n");
return EXIT_FAILURE;
}
printf("%s\n", buf);
return EXIT_SUCCESS;
}

Bart v Ingen Schenau
 
I

Ike Naar

#define CbBufSize CvMaxStringLength+1

This definition may lead to an unpleasant surprise
when it is used in an expression such as

CbBufSize * 2

A safer definition would be

#define CbBufSize (CvMaxStringLength+1)
 
S

Shao Miller

It does have at least a glaring off-by-one bug.
For string-input with a maximum field width, the buffer you provide must
be at least one byte larger than the specified maximum to accommodate the
'\0' that sscanf will write to it after reading the maximum length string.

Dwoops. The original post specified 127 and somehow + 1 happened before
writing the code, instead of inside the array bounds. Thanks for making
the code better than useless. :)

One point I was trying to make with the code was that the %n can be
useful for a few things, such as not needing to 'strlen' the destination
buffer. Hopefully the error didn't distract too much from that.
 
K

Keith Thompson

Shao Miller said:
#define Stringify_(x) # x
#define Stringify(x) Stringify_(x)

These would be better defined as STRINGIFY_ and STRINGIFY, so on a call
it's obvious that they're macros.
#define CvBufSize 128

Ditto, but not quite as important in this case.

[...]
 
S

Shao Miller

Shao Miller said:
#define Stringify_(x) # x
#define Stringify(x) Stringify_(x)

These would be better defined as STRINGIFY_ and STRINGIFY, so on a call
it's obvious that they're macros.
#define CvBufSize 128

Ditto, but not quite as important in this case.

[...]

If I was in a Rome where this was the trend, most certainly. In the
last couple of years, I've kind of wondered why anyone should care about
distinguishing macros versus otherwise. I can think of some reasons,
but then there seem to be counter-reasons for them. I'd enjoy reading
what other people have to say about it.
 
K

Keith Thompson

Shao Miller said:
Shao Miller said:
#define Stringify_(x) # x
#define Stringify(x) Stringify_(x)

These would be better defined as STRINGIFY_ and STRINGIFY, so on a call
it's obvious that they're macros.
#define CvBufSize 128

Ditto, but not quite as important in this case.

[...]

If I was in a Rome where this was the trend, most certainly. In the
last couple of years, I've kind of wondered why anyone should care about
distinguishing macros versus otherwise. I can think of some reasons,
but then there seem to be counter-reasons for them. I'd enjoy reading
what other people have to say about it.

The line invoking that macro was:

sr = sscanf(src, "%" Stringify(CvBufSize) "[^\n]%n", buf, &chars);

The problem is that that would be illegal if it were a function call.

Other examples:

int x;
foo(x);

That looks like x is being used before it's initialized -- but
if foo is a macro, it's not necessarily an error. And if x *is*
initialized, I'm going to assume that it has the same value after
foo(x) that it had before -- but a macro can break that assumption.

Basically when I'm reading code, knowing that a particular identifier
is a macro name helps me to understand the code. More precisely,
if there's a macro invocation, I can't be sure what the code does
unless (a) I know it's a macro, and (b) I know how it's defined.

If a macro is sufficiently well-behaved that it's practically a
drop-in replacement for a function (like getc(), for example),
then giving it a lowercase name is probably ok. Otherwise,
it's best to follow the convention of all-caps for macro names.
(Admittedly, the standard itself isn't very good about following it.)
 
B

Bart van Ingen Schenau

This definition may lead to an unpleasant surprise when it is used in an
expression such as

CbBufSize * 2

A safer definition would be

#define CbBufSize (CvMaxStringLength+1)

Oops.
What was it again about spelling flames that must contain a spelling
error themselves?

Bart v Ingen Schenau
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top