snprintf without null-termination?

Lauri Alanko

I would like to format some output into a buffer that is exactly of
the correct size to hold the formatted output _without_ a terminating
NUL. However, the snprintf function always ensures that the resulting
string is null-terminated, even if this means truncating the output.

This behavior is inconsistent with how strncpy works, and it seems to
be the wrong default: if snprintf did not ensure null-termination, the
caller could easily provide it if required. But there seems to be no
efficient way in the other direction.

Currently, I have to allocate a temporary buffer that is one byte
longer, then snprintf to the temporary buffer, then memcpy everything
but the terminating '\0' to the final destination buffer. Is there a
better way?
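
For reference, the workaround described above might look roughly like the following. This is only a sketch: the helper name `format_int_exact` and the fixed `"%d"` format are made up for illustration and are not from the original post.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: formats an int into a freshly allocated buffer that
 * holds exactly the formatted characters, with no terminating '\0'.
 * On success stores the length in *out_len and returns the buffer, which the
 * caller must free(). Returns NULL on failure. */
char *format_int_exact(int value, size_t *out_len)
{
    assert(out_len != NULL);

    /* C99: with a null buffer and size 0, snprintf only reports the length. */
    int n = snprintf(NULL, 0, "%d", value);
    if (n < 0)
        return NULL;

    size_t len = (size_t)n;
    char *tmp = malloc(len + 1);   /* temporary: room for the '\0' */
    char *dst = malloc(len);       /* final: exactly len bytes (len >= 1 for %d) */
    if (tmp == NULL || dst == NULL) {
        free(tmp);
        free(dst);
        return NULL;
    }

    snprintf(tmp, len + 1, "%d", value);
    memcpy(dst, tmp, len);         /* copy everything but the '\0' */
    free(tmp);

    *out_len = len;
    return dst;
}
```

The extra allocation and copy are the cost being asked about.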

Thanks,


Lauri
 
Mark Bluemel

> I would like to format some output into a buffer that is exactly of
> the correct size to hold the formatted output _without_ a terminating
> NUL.

So you don't want your output to be a valid C string? Can we ask why
this is?

> However, the snprintf function always ensures that the resulting
> string is null-terminated, even if this means truncating the output.

That is to say the result is a valid C string...

> This behavior is inconsistent with how strncpy works,

Which suggests to me that strncpy is badly designed, rather than
snprintf - strncpy returns a pointer to the destination which may not be
a valid C string.

> and it seems to
> be the wrong default: if snprintf did not ensure null-termination, the
> caller could easily provide it if required. But there seems to be no
> efficient way in the other direction.
>
> Currently, I have to allocate a temporary buffer that is one byte
> longer, then snprintf to the temporary buffer, then memcpy everything
> but the terminating '\0' to the final destination buffer. Is there a
> better way?

I can't think of one immediately. What are you going to do with the
destination buffer? If it's going to be passed to a function which takes
an address and a length, the issue's trivial, of course.
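
One way to read the address-and-length remark: if the consumer takes a pointer and a byte count, the terminating '\0' can simply stay in the buffer and never be passed along. A minimal sketch, assuming a hypothetical helper `emit_int` and using `fwrite` as the address-plus-length consumer:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats value into a local buffer and sends only the
 * formatted characters to stream. The terminating '\0' written by snprintf
 * stays behind in buf. Returns the number of bytes written, or 0 on error. */
size_t emit_int(FILE *stream, int value)
{
    assert(stream != NULL);

    char buf[32];  /* ample for any int, including room for the '\0' */
    int n = snprintf(buf, sizeof buf, "%d", value);
    if (n < 0 || (size_t)n >= sizeof buf)
        return 0;

    /* Pass the address and the length, not the length plus one. */
    return fwrite(buf, 1, (size_t)n, stream);
}
```

This does not help when the destination itself must be exactly the formatted length, which is the case raised in the follow-up.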
 
Lauri Alanko

> So you don't want your output to be a valid C string? Can we ask why
> this is?

C strings are horrible, and there's no compelling reason to use them
except when interfacing with existing libraries. In my own code, I
prefer to have a different representation for strings, one where the
length is stored separately.

When implementing operations for my strings, I'd like to leverage the
standard library as much as possible, and formatting in particular is
non-trivial so I'd like to avoid reimplementing it from scratch.

> What are you going to do with the
> destination buffer? If it's going to be passed to a function which takes
> an address and a length, the issue's trivial, of course.

Would you care to elaborate? Certainly I could simply make the
destination buffer one byte longer, but that is one byte of memory
wasted, and that is significant when dealing with a large number of
very short strings, as I happen to be.


Lauri
 
Tom St Denis

> C strings are horrible, and there's no compelling reason to use them
> except when interfacing with existing libraries. In my own code, I
> prefer to have a different representation for strings, one where the
> length is stored separately.
>
> When implementing operations for my strings, I'd like to leverage the
> standard library as much as possible, and formatting in particular is
> non-trivial so I'd like to avoid reimplementing it from scratch.
>
> Would you care to elaborate? Certainly I could simply make the
> destination buffer one byte longer, but that is one byte of memory
> wasted, and that is significant when dealing with a large number of
> very short strings, as I happen to be.

Why not just wrap snprintf with a function that doesn't copy the NUL
byte to the final destination buffer?

vsnprintf exists for a reason...
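
A wrapper along these lines could be sketched as follows. The name `format_no_nul` and the 64-byte stack buffer are assumptions made for illustration. Note that it still formats into a temporary internally; it only hides the copy behind one call and avoids a heap allocation for short results.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: formats like snprintf, but copies at most dst_size
 * bytes into dst and never writes a terminating '\0' there.
 * Returns the full formatted length (as snprintf would), or a negative
 * value on error. */
int format_no_nul(char *dst, size_t dst_size, const char *fmt, ...)
{
    assert(fmt != NULL);

    char stack_buf[64];          /* short results avoid malloc entirely */
    char *tmp = stack_buf;
    va_list ap, ap2;

    va_start(ap, fmt);
    va_copy(ap2, ap);            /* a va_list may only be consumed once */
    int n = vsnprintf(stack_buf, sizeof stack_buf, fmt, ap);
    va_end(ap);

    if (n < 0) {
        va_end(ap2);
        return n;
    }

    if ((size_t)n >= sizeof stack_buf) {
        /* Output was truncated: format again into a large enough buffer. */
        tmp = malloc((size_t)n + 1);
        if (tmp == NULL) {
            va_end(ap2);
            return -1;
        }
        vsnprintf(tmp, (size_t)n + 1, fmt, ap2);
    }
    va_end(ap2);

    size_t copy = (size_t)n < dst_size ? (size_t)n : dst_size;
    if (copy > 0)
        memcpy(dst, tmp, copy);  /* everything but the '\0' */

    if (tmp != stack_buf)
        free(tmp);
    return n;
}
```

A caller would typically call it once with `dst_size` 0 to learn the length, allocate exactly that many bytes, and call it again.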

Tom
 
Mark Bluemel

>> So you don't want your output to be a valid C string? Can we ask why
>> this is?
>
> C strings are horrible, and there's no compelling reason to use them
> except when interfacing with existing libraries. [Snip]
>
> ... I'd like to leverage the
> standard library as much as possible

It seems to me that you want to have your cake and eat it too.
On the one hand you want to avoid standard C strings, and on the other
you want to use library routines which are predicated on them...

Have you considered grabbing a public domain snprintf implementation and
hacking it to your requirements?
 
Keith Thompson

Lauri Alanko said:
> I would like to format some output into a buffer that is exactly of
> the correct size to hold the formatted output _without_ a terminating
> NUL. However, the snprintf function always ensures that the resulting
> string is null-terminated, even if this means truncating the output.
>
> This behavior is inconsistent with how strncpy works, and it seems to
> be the wrong default: if snprintf did not ensure null-termination, the
> caller could easily provide it if required. But there seems to be no
> efficient way in the other direction.
>
> Currently, I have to allocate a temporary buffer that is one byte
> longer, then snprintf to the temporary buffer, then memcpy everything
> but the terminating '\0' to the final destination buffer. Is there a
> better way?

It's strncpy that's inconsistent with the other string functions.
It's designed to work with a very specialized data structure,
consisting of an array containing some number of non-null characters
followed by zero or more null bytes. In particular, strncpy can
pad its destination with *multiple* null bytes. (I think it was
primarily used to store file names in early versions of Unix.)
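
The two behaviours described here (padding with multiple null bytes, and leaving no terminator when the source fills the field) can be seen in a small sketch. The field size of 8 and the helper name `fill_field` are arbitrary choices for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { FIELD_SIZE = 8 };  /* arbitrary size chosen for this example */

/* Copies src into a fixed-size field with strncpy and reports whether the
 * field ended up containing a '\0' anywhere. */
bool fill_field(char field[FIELD_SIZE], const char *src)
{
    assert(src != NULL);

    /* Short source: the remainder of the field is filled with '\0' bytes.
     * Source of FIELD_SIZE or more characters: no '\0' is written at all. */
    strncpy(field, src, FIELD_SIZE);
    return memchr(field, '\0', FIELD_SIZE) != NULL;
}
```

With `"ab"` the field holds `a`, `b` and six null bytes; with `"12345678"` it holds the eight digits and nothing else, so it is not a valid C string.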

Almost all the other string functions (such as strncat) deal with
C-style null-terminated strings, and attempt to avoid creating
character arrays that don't have the null terminator.

As you're seeing, this particular data format is *very* deeply
entwined into the standard C library, and to a lesser extent into
the language itself. Working with a different format while using
the C standard library is just going to be difficult.
 
Ben Bacarisse

Tom St Denis said:
> Why not just wrap snprintf with a function that doesn't copy the NUL
> byte to the final destination buffer?

That sounds like what the OP is doing. From the first post:

| Currently, I have to allocate a temporary buffer that is one byte
| longer, then snprintf to the temporary buffer, then memcpy everything
| but the terminating '\0' to the final destination buffer. Is there a
| better way?

or have you some other way to wrap snprintf in mind?

<snip>
 
Keith Thompson

> In actual practice, it seemed to work like "the caller could easily
> forget to provide it if required".
>
> Correct. A UNIX V6 (and I believe V7, System III, and some System
> V) directory entry was a fixed 16 bytes long: 2 bytes of inode
> number (limiting an inode number to 16 bits, which wasn't too bad
> given the disk sizes available at the time) and 14 bytes of file
> name. It was not that uncommon to run into application software
> that would do funny things with 14-character file names, like print
> a few garbage characters after the file name, or fail to open the
> file, because the application didn't allow for the possibility of
> the file name being non-NUL-terminated.
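
The failure mode described above can be avoided by always bounding the read. The sketch below models the 16-byte entry with a hypothetical `struct old_dirent` (the field names and the use of `unsigned short` for the 2-byte inode number are assumptions) and uses a precision with `%s` so that a full 14-character name is handled safely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { NAME_MAX_LEN = 14 };

/* Hypothetical model of the directory entry described above. */
struct old_dirent {
    unsigned short inode;      /* 2 bytes on the original systems; assumed here */
    char name[NAME_MAX_LEN];   /* padded with '\0', NOT terminated when full */
};

/* Copies the entry's name into out as a proper C string and returns its
 * length. out must hold at least NAME_MAX_LEN + 1 bytes. */
size_t dirent_name(const struct old_dirent *d, char out[NAME_MAX_LEN + 1])
{
    assert(d != NULL);

    /* The precision caps how many bytes %s may read, so the array does not
     * need to contain a '\0'. */
    int n = snprintf(out, NAME_MAX_LEN + 1, "%.*s", NAME_MAX_LEN, d->name);
    return n < 0 ? 0 : (size_t)n;
}
```

Printing `d->name` with a plain `%s` instead would read past the array whenever the name is exactly 14 characters long.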

On one hand, you have quoted my words (starting "It's strncpy that's
inconsistent ...") without giving me credit. I consider this quite
rude, as I've made abundantly clear in the past.

On the other hand, this is interesting information; thanks for
posting it.

[snip]
 
Edward A. Falk

> Would you care to elaborate? Certainly I could simply make the
> destination buffer one byte longer, but that is one byte of memory
> wasted, and that is significant when dealing with a large number of
> very short strings, as I happen to be.

Memory is cheap, spend the byte.

Keep in mind that if you're using any kind of standard library to allocate
the string buffers, there's going to be some wasted space anyway.
I just ran a quick test, and on my Linux system, malloc() seems to
allocate space in 16-byte quanta, so if your string is not exactly a
multiple of 16 bytes, you're wasting space anyway.
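
To put numbers on this, here is a deliberately simplified model. It assumes an allocator that rounds every request up to a multiple of a fixed quantum (16 bytes, per the measurement above) and ignores any per-block bookkeeping; real allocators differ in the details. The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds request up to the next multiple of quantum (simplified model of an
 * allocator's granularity; per-block headers are ignored). */
size_t rounded_size(size_t request, size_t quantum)
{
    assert(quantum > 0);
    return (request + quantum - 1) / quantum * quantum;
}

/* Bytes handed out beyond what was asked for, under the same model. */
size_t wasted_bytes(size_t request, size_t quantum)
{
    return rounded_size(request, quantum) - request;
}
```

Under this model a 5-character string occupies 16 bytes whether it is stored in 5 bytes or in 6 with a terminator; the extra byte only costs anything when the length is an exact multiple of the quantum.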
 
Joe Pfeiffer

Lauri Alanko said:
> C strings are horrible, and there's no compelling reason to use them
> except when interfacing with existing libraries. In my own code, I
> prefer to have a different representation for strings, one where the
> length is stored separately.
>
> When implementing operations for my strings, I'd like to leverage the
> standard library as much as possible, and formatting in particular is
> non-trivial so I'd like to avoid reimplementing it from scratch.
>
> Would you care to elaborate? Certainly I could simply make the
> destination buffer one byte longer, but that is one byte of memory
> wasted, and that is significant when dealing with a large number of
> very short strings, as I happen to be.

How are you going to store the length of the string without using at
least one byte?
 
Keith Thompson

Joe Pfeiffer said:
> How are you going to store the length of the string without using at
> least one byte?

If you're *already* storing the length as an integer, additionally
storing a terminating '\0' is a waste. (Though not enough of one to
worry about, IMHO.)

If I were doing this kind of thing, I might consider either (a) storing
both the length and the terminating '\0', so the strings are usable with
the standard C string functions (and being very careful to keep the
information consistent), or (b) providing a function that converts
a stored-length string to a C-style string (and using it only when
necessary because of the overhead).
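
Option (b) might be sketched like this. The `struct lstring` layout and the function name `lstring_to_cstr` are hypothetical; the thread does not show the actual string representation in use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stored-length string: len bytes at data, no terminator. */
struct lstring {
    size_t len;
    const char *data;
};

/* Returns a newly allocated, null-terminated copy of s, or NULL if the
 * allocation fails. The caller must free() the result. */
char *lstring_to_cstr(const struct lstring *s)
{
    assert(s != NULL);

    char *out = malloc(s->len + 1);
    if (out == NULL)
        return NULL;

    if (s->len > 0)
        memcpy(out, s->data, s->len);
    out[s->len] = '\0';
    return out;
}
```

The allocation and copy on every conversion are the overhead mentioned above. A stored-length string may also contain embedded null bytes, in which case the C string functions will see only the part before the first one.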
 
Joe Pfeiffer

Keith Thompson said:
> If you're *already* storing the length as an integer, additionally
> storing a terminating '\0' is a waste. (Though not enough of one to
> worry about, IMHO.)

True -- I was comparing her preferred implementation (which would have no
terminating nulls) to the existing C convention.

> If I were doing this kind of thing, I might consider either (a) storing
> both the length and the terminating '\0', so the strings are usable with
> the standard C string functions (and being very careful to keep the
> information consistent), or (b) providing a function that converts
> a stored-length string to a C-style string (and using it only when
> necessary because of the overhead).

Much like C++.
 
