snprintf rationale?

Michael B Allen

What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

What is the name and address of the person responsible for this?

Mike
 
Chris Torek

What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This lets you allocate a buffer that is big enough, without having
to do many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);

It is also consistent with fprintf(), which returns the number of
characters printed.
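
A minimal sketch of the same two-pass idea wrapped up as a reusable helper. It assumes C99 and swaps in vsnprintf() plus va_copy() so the helper can take any format; the name format_alloc is hypothetical, not something from the standard library.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format into a freshly malloc'ed string.
 * Returns NULL on encoding error or allocation failure; caller frees. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    int needed;
    char *mem;

    va_start(ap, fmt);
    va_copy(ap2, ap);   /* each pass consumes its own copy of the list */

    /* Pass 1: n == 0, so nothing is written; only the length comes back. */
    needed = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (needed < 0) {
        va_end(ap2);
        return NULL;
    }

    mem = malloc((size_t)needed + 1);   /* +1 for the terminating '\0' */
    if (mem != NULL) {
        /* Pass 2: same format and arguments, now with room to write. */
        if (vsnprintf(mem, (size_t)needed + 1, fmt, ap2) != needed) {
            free(mem);
            mem = NULL;
        }
    }
    va_end(ap2);
    return mem;
}
```

A call such as format_alloc("%s-%d", "abc", 42) should hand back a heap string holding "abc-42", which the caller releases with free().
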
This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

Given a buffer "buf" of size "size":

result = snprintf(buf, size, fmt, arg);

if (result >= 0 && result < size)
    all_is_well();
else
    needed_more_space();
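
For illustration, here is a small self-contained version of that test. The function name format_fits and the "%s=%d" format are made up for the example; the cast is only there because result is an int while size is a size_t.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the whole string fit in buf,
 * 0 if it was truncated or an encoding error occurred. */
int format_fits(char *buf, size_t size, const char *name, int value)
{
    int result = snprintf(buf, size, "%s=%d", name, value);

    /* The sign is checked first, so the cast to size_t is only ever
     * applied to a nonnegative value. */
    return result >= 0 && (size_t)result < size;
}
```

With a 4-byte buffer and the arguments "x" and 12345, the output is cut down to "x=1" plus the terminator and the function reports 0; with a 32-byte buffer it reports 1.
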
What is the name and address of the person responsible for this?

That would be me. :)
 
Keith Thompson

Michael B Allen said:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

The declaration of snprintf() is:

int snprintf(char * restrict s, size_t n,
             const char * restrict format, ...);

The number of characters available is passed in as the second argument.
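To test for overflow, check whether the result is greater than or equal
to n (a result equal to n means the last character was dropped to make
room for the terminating '\0').

If you call snprintf() with n==0, it won't write any characters, but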
it will return the number of characters that would have been written.
You can then allocate the appropriate space and call snprintf() again
with the same arguments, but with a non-zero n.
 
Erik de Castro Lopo

Michael said:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

Snprintf is guaranteed not to overflow.

This works:

if (snprintf (buf, buflen, "...", ....) < buflen)
    puts ("No overflow occurred");
else
    puts ("Overflow might have occurred");

This also works:

buflen = snprintf (NULL, 0, "...", ....);
buf = malloc (buflen + 1) ;
snprintf (buf, buflen + 1, "...", ....);



Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo (e-mail address removed) (Yes it's valid)
+-----------------------------------------------------------+
Seen on comp.lang.python:
Q : If someone has the code in python for a buffer overflow,
please post it.
A : Python does not support buffer overflows, sorry.
 
Michael Mair

Michael said:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

From the C99 standard:
"7.19.6.5 The snprintf function
Synopsis
1 #include <stdio.h>
  int snprintf(char * restrict s, size_t n,
      const char * restrict format, ...);

Description
2 The snprintf function is equivalent to fprintf, except that the output
is written into an array (specified by argument s) rather than to a
stream. If n is zero, nothing is written, and s may be a null pointer.
Otherwise, output characters beyond the n-1st are discarded rather than
being written to the array, and a null character is written at the end
of the characters actually written into the array. If copying takes
place between objects that overlap, the behavior is undefined.

Returns
3 The snprintf function returns the number of characters that would have
been written had n been sufficiently large, not counting the terminating
null character, or a negative value if an encoding error occurred.
Thus, the null-terminated output has been completely written if and only
if the returned value is nonnegative and less than n.
"

So, I would use allocated buffers and do something along the lines

char *buf;
size_t buf_size;
int retval;

buf = NULL;
buf_size = 0;

while (1) {
    retval = snprintf(buf, buf_size, "Test with buf of size %zu\n",
                      buf_size);
    if (retval < 0) {
        /* Treat encoding error or die */
    }
    else if (retval<buf_size) {
        break; /* We finally made it */
    }
    else {
        char *tmp;
        if ( (tmp=realloc(buf, (size_t) retval + 1)) == NULL ) {
            /* Give up trying to write this string or die */
        }
        buf = tmp;
        buf_size = (size_t) retval + 1;
    }
}

I did not test it but you see that it deals with the problem
that, depending on buf_size, the length of the output varies
so we need to adjust the size a second time.

What is the name and address of the person responsible for this?

I think this is slightly OT here. Try comp.std.c but I guess
they won't tell you either.


-Michael
 
pete

Michael said:
It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this
is that snprintf takes a size_t parameter and returns an int,
which is broken by design.

There's also an environmental limit, which is the minimum value for
the maximum number of characters produced by any single conversion:
509 in C89,
4095 in C99.
 
Michael Mair

pete said:
There's also an environmental limit, which is the minimum value for
the maximum number of characters produced by any single conversion:
509 in C89,
4095 in C99.

Thank you :)
I was completely unaware of this.
However, this does not really affect that this switching of types
in between is ugly.

Cheers
Michael
 
pete

Michael said:
Thank you :)
I was completely unaware of this.
However, this does not really affect that this switching of types
in between is ugly.

I think it has to do with snprintf being based on the
functionality of fprintf and with fprintf being older than size_t.
 
Richard Bos

Michael B Allen said:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

So what else would you have it return? The number of characters it
actually did write? That's almost always useless information, since it's
easily found using strlen(). The number of characters it would've
written had it had the space, however, is very useful.

Richard
 
Keith Thompson

Erik de Castro Lopo said:
Snprintf is guaranteed not to overflow.

Well, sort of; it will overflow if you tell it to.

For example,

char buf[5];
snprintf(buf, 30, "%s", "This string is too big");

But assuming the arguments are consistent, yes, it's guaranteed not to
overflow.
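
One common way to keep the arguments consistent is to take the size from the array itself with sizeof. A minimal sketch, where struct label and set_label are hypothetical names invented for the example:

```c
#include <stdio.h>

struct label { char text[5]; };

/* Hypothetical helper: the size argument is derived from the array,
 * so it cannot drift out of sync with the declaration. */
int set_label(struct label *lab, const char *s)
{
    return snprintf(lab->text, sizeof lab->text, "%s", s);
}
```

Passing "This string is too big" stores only "This" plus the terminator, and the return value (22) tells the caller how much room the full string would have needed.
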
 
Keith Thompson

Michael Mair said:
It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.
[...]

The following:

snprintf(buf, buf_size, "");

is a legitimate call to snprintf; it returns 0 but doesn't indicate an
error.

If ISO C had a "ssize_t" type (a signed equivalent of size_t), this
would be a good place to use it. (POSIX defines ssize_t; ISO C
doesn't.)

An alternative might be to have the return value just indicate success
or failure, and return the number of bytes via a separate size_t*
argument, but that would make the function more difficult to use.

In practice, returning int is only going to be a problem if the length
of the string would exceed INT_MAX characters. This is unlikely on
systems with 16-bit int, and even more unlikely on systems with 32-bit
or larger int. I agree that it's a wart, but I'm not sure there's a
good way to fix it.
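
To make the "separate size_t* argument" alternative concrete, here is a rough sketch of what such an interface could look like. The name snprintf_len is hypothetical, and because it is built on top of vsnprintf() it still inherits the INT_MAX limit; it only illustrates the calling convention.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical wrapper: returns 0 on success, -1 on error.
 * On success, *len receives the length the full output needs,
 * not counting the terminating '\0'. */
int snprintf_len(char *buf, size_t size, size_t *len, const char *fmt, ...)
{
    va_list ap;
    int r;

    va_start(ap, fmt);
    r = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (r < 0)
        return -1;
    *len = (size_t)r;
    return 0;
}
```

The caller now has to test the return value and then compare *len against the buffer size, which shows why this form is more awkward to use than a single return value.
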
 
Michael Mair

Keith said:
Michael Mair said:
It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.

[...]

The following:

snprintf(buf, buf_size, "");

is a legitimate call to snprintf; it returns 0 but doesn't indicate an
error.

With my suggestion, this would have returned 1 ('\0') which is distinct
from 0 :)

If ISO C had a "ssize_t" type (a signed equivalent of size_t), this
would be a good place to use it. (POSIX defines ssize_t; ISO C
doesn't.)

Yep, I really do not understand why we were not given that toy by
C99... especially since in other places the standard goes to some length
to avoid saying ssize_t (for example when describing the *printf/*scanf
length modifier z, referring to size_t or the corresponding signed
type...).
Losing half the positive range of size_t is certainly better than
a potential int/size_t problem.

An alternative might be to have the return value just indicate success
or failure, and return the number of bytes via a separate size_t*
argument, but that would make the function more difficult to use.
Indeed.


In practice, returning int is only going to be a problem if the length
of the string would exceed INT_MAX characters. This is unlikely on
systems with 16-bit int, and even more unlikely on systems with 32-bit
or larger int. I agree that it's a wart, but I'm not sure there's a
good way to fix it.

Well, apart from the differences to fprintf() which will lead to
problems with people too lazy to look up snprintf(), I still hold
that returning -- as size_t value -- the numbers of characters to
be written _including_ the string terminator or zero on error would
have been the easiest and probably best way.
However, this is purely academic as we already have the wart.


Cheers
Michael
 
Michael B Allen

This lets you allocate a buffer that is big enough, without having to do
many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);

I see. This is reasonable. I was wondering why it didn't just return -1
but I prefer this behavior. If I want something dumber I can wrap it.
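
A sketch of what such a "dumber" wrapper might look like, assuming C99; the name snprintf_strict is hypothetical. It folds truncation and encoding errors into a single -1 result.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical wrapper: returns the length written, or -1 if the
 * output was truncated or an encoding error occurred. */
int snprintf_strict(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int r;

    va_start(ap, fmt);
    r = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (r < 0 || (size_t)r >= size)
        return -1;
    return r;
}
```

The trade-off is that the caller no longer learns how much space would have been enough, which is exactly the information the standard return value is designed to provide.
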

Thanks,
Mike
 
Keith Thompson

Michael Mair said:
Keith said:
Michael Mair said:
Michael B Allen wrote:

What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.
[...]
The following:
snprintf(buf, buf_size, "");
is a legitimate call to snprintf; it returns 0 but doesn't indicate an
error.

With my suggestion, this would have returned 1 ('\0') which is distinct
from 0 :)

Right, I missed the "including the string terminator" clause. I think
that would be counterintuitive, since most similar functions return
the length (strlen()) of the string excluding the terminator. But in
any case we're stuck with the current behavior.
 
Dan Pop

Chris Torek said:
This lets you allocate a buffer that is big enough, without having
to do many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);

No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).

Dan
 
Charlie Gordon

Dan Pop said:
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).

Bad advice: if the code for assessing the length is duplicated for the actual
formatting, chances are these copies will get out of sync!
How is it better to call sprintf() instead of snprintf()? What is there to
gain?

If you are really paranoid, check again that result == needed indeed. But by
all means call snprintf().

Chqrlie.
 
Keith Thompson

Charlie Gordon said:
Bad advice : if the code for assessing the length is duplicated for
the actual formatting, chances are these copies will get out of
sync! how is it better to call sprintf() instead of snprintf() ?
What is there to gain ?

I suspect any actual implementation is going to use much of the same
underlying code for all the *printf functions. And even if the code
is duplicated, any discrepancy between the length determined by
snprintf() and the length determined by sprintf() would be a bug in
the implementation. If you're unwilling to assume that the
implementation gets things right, you probably shouldn't be using it
in the first place. If you happen to know of a specific bug in some
implementation, it can make sense to guard against it in your code
(*and* submit a bug report to the implementer), but guarding against
implementation bugs in general is a waste of time. Or did I miss your
point?
 
Michael Mair

Keith said:
Charlie Gordon said:
[...]
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).

Bad advice : if the code for assessing the length is duplicated for
the actual formatting, chances are these copies will get out of
sync! how is it better to call sprintf() instead of snprintf() ?
What is there to gain ?

I suspect any actual implementation is going to use much of the same
underlying code for all the *printf functions. And even if the code
is duplicated, any discrepancy between the length determined by
snprintf() and the length determined by sprintf() would be a bug in
the implementation. If you're unwilling to assume that the
implementation gets things right, you probably shouldn't be using it
in the first place. If you happen to know of a specific bug in some
implementation, it can make sense to guard against it in your code
(*and* submit a bug report to the implementer), but guarding against
implementation bugs in general is a waste of time. Or did I miss your
point?

The only thing which makes it _necessary_ to check again the return
value I can think of is a dependence of arg1/arg2 on needed or mem
(as I illustrated elsethread).

Charlie's point, AFAICS, is that you easily can make a slip of the kind
"Oh, I will always copy over the format string" and then forget it
once. This may lead to exactly the buffer overrun which was to be
avoided in the first place.
As Dan referred to the const char * fmt in both cases, this point is
moot here; but in general, I agree that it certainly does not hurt to
always use snprintf...


Cheers
Michael
 
Chris Torek

No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).

Call it paranoia if you like, but I recommend the two-snprintf()
method anyway. For instance, suppose we have:

char *some_subroutine(const char *str, int n) {
    ...
    needed = snprintf(NULL, 0, "%s%d", str, n);
    ...
    mem = malloc(needed + 1);
    ...
    return mem;
}

and suppose the caller passes, as the "str" argument, a pointer to
freed space (or similarly invalid pointer) that gets overwritten
with some new value by the malloc() call. In particular, if
strlen(str) increases, the second snprintf() will "want to" write
more characters than we just allocated.

Obviously such a call is faulty -- but using snprintf() twice will
limit any additional damage, and the fact that the number of
characters printed has changed may help find the bug. The *cost*
of using snprintf() twice (instead of snprintf() followed by plain
sprintf()) is likely negligible; it may even save CPU time and/or
(code) space.

As someone else noted, there is also the "parallel construction"
bonus, which I admit is kind of a soft squidgy human-factor thing
-- but as I see it, that makes two or three advantages (however
small they may be), and no disadvantages, so one might as well do
it this way.
 
