strndup: RFC

jacob navia

Hi
Reading comp.std.c I noticed that there was a message like this:


The WG14 Post Portland mailing is now available from the WG14 web site
at http://www.open-std.org/jtc1/sc22/wg14/

Best regards
Keld Simonsen

I went there and found that there is a report called
ISO/IEC JTC1 SC22 WG14 WG14/N1193
Specification for Safer C Library Functions —
Part II: Dynamic Allocation Functions

It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.

It proposes strdup and strndup too. Lcc-win32 already provides strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

Thanks in advance

jacob
---------------------------------------------------------cut here
#include <string.h>
#include <stdlib.h>
/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
char *strndup(const char *string,size_t s)
{
    const char *p;
    char *r;
    if (string == NULL)
        return NULL;
    p = string;
    /* Advance to the terminator or until s characters have been seen. */
    while (s > 0) {
        if (*p == 0)
            break;
        p++;
        s--;
    }
    s = (p - string);
    r = malloc(1+s);
    if (r) {
        strncpy(r,string,s);
        r[s] = 0;
    }
    return r;
}

#ifdef TEST
#include <stdio.h>
#define MAXTEST 60
int main(void)
{
    char *table[MAXTEST];
    char *str = "The quick brown fox jumps over the lazy dog";

    for (int i=0; i<MAXTEST;i++) {
        table[i] = strndup(str,i);
    }
    for (int i=0; i<MAXTEST;i++) {
        printf("[%4d] %s\n",i,table[i]);
    }
    return 0;
}
#endif
 
Richard Heathfield

jacob navia said:

I would like
to see if your sharp eyes see any bug or serious problem with it.

I don't see any serious problem with the code in terms of meeting its
specification, although I would consider replacing
strncpy(r,string,s);

with:

memcpy(r, string, s);

since you've already detected the terminator and know precisely where it is.

The thing that does concern me is the spec itself, which seems to me to
suffer from the same flaw as strncpy - i.e. it gives no indication of
whether truncation occurred. But of course that's a design issue, not a C
issue.
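
A minimal sketch of the memcpy suggestion, assuming the length is measured first exactly as in the posted code. The name my_strndup is hypothetical, chosen only so the sketch does not clash with a strndup the implementation may already provide:

```c
#include <stdlib.h>
#include <string.h>

/* my_strndup is a hypothetical name, not part of any standard. */
char *my_strndup(const char *string, size_t n)
{
    size_t len = 0;
    char *r;

    if (string == NULL)
        return NULL;

    /* Find the copy length: stop at the terminator or after n characters. */
    while (len < n && string[len] != '\0')
        len++;

    r = malloc(len + 1);
    if (r != NULL) {
        memcpy(r, string, len);  /* length is already known, so no strncpy */
        r[len] = '\0';
    }
    return r;
}
```

Because the length is known before the copy, memcpy never has to look for a terminator, and the result is sized to what was actually copied.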
 
jacob navia

Richard Heathfield wrote:
jacob navia said:




I don't see any serious problem with the code in terms of meeting its
specification, although I would consider replacing
strncpy(r,string,s);

with:

memcpy(r, string, s);

since you've already detected the terminator and know precisely where it is.

The thing that does concern me is the spec itself, which seems to me to
suffer from the same flaw as strncpy - i.e. it gives no indication of
whether truncation occurred. But of course that's a design issue, not a C
issue.

Interesting. I did not think about that.

You know of a version that gives that information back to the user?
It is sometimes important to know.

< off topic>
Since lcc-win32 supports optional arguments I could do:
char *strndup(char *str,size_t siz,bool *pTruncated=NULL);
strndup(str,30) would be strndup(str,30,NULL)...
< / off topic >
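
In standard C, without optional arguments, the same idea could be sketched with an explicit out-parameter. This is only an illustration: the name strndup_report and the truncated parameter are hypothetical, and the sketch assumes str is a null-terminated string, because it reads str[n] to decide whether anything was left behind.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: copies at most n characters and, if truncated
   is not NULL, reports whether part of str was left uncopied.
   Assumes str is null terminated (it may read str[n]). */
char *strndup_report(const char *str, size_t n, bool *truncated)
{
    size_t len = 0;
    char *r;

    if (str == NULL)
        return NULL;

    while (len < n && str[len] != '\0')
        len++;

    /* If we stopped before the terminator, characters were dropped. */
    if (truncated != NULL)
        *truncated = (str[len] != '\0');

    r = malloc(len + 1);
    if (r != NULL) {
        memcpy(r, str, len);
        r[len] = '\0';
    }
    return r;
}
```

A caller that does not care passes NULL for the last argument, which is what the default value would have supplied.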
 
Richard Tobin

Richard Heathfield said:
The thing that does concern me is the spec itself, which seems to me to
suffer from the same flaw as strncpy - i.e. it gives no indication of
whether truncation occurred. But of course that's a design issue, not a C
issue.

When I've used my own version of strndup, it's always been to make an
ordinary string from a "counted" string, so there is no question of
truncation. I suspect this is the more common use of it, rather than
copying a string to a buffer that might not be big enough.
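
A rough sketch of that "counted string" use, under stated assumptions: struct counted and counted_to_cstr are hypothetical names, and the data is assumed to contain no embedded null characters.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: len characters at data, with no
   terminator required. */
struct counted {
    const char *data;
    size_t len;
};

/* Make an ordinary null-terminated string from a counted one.
   Returns NULL only if the allocation fails. */
char *counted_to_cstr(struct counted c)
{
    char *r = malloc(c.len + 1);

    if (r != NULL) {
        if (c.len > 0)
            memcpy(r, c.data, c.len);
        r[c.len] = '\0';
    }
    return r;
}
```

For example, with the line "key=value" and a counted string covering its first three characters, the result is the ordinary string "key". Nothing is truncated, because the count is the real length.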

-- Richard
 
Tom St Denis

jacob said:
It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.

It proposes strdup and strndup too. Lcc-win32 already provides strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

What would the point of strndup be? It allocates the memory so there
really isn't a problem of an overflow. If you're low on memory ... why
are you duplicating a string?

Methinks this is a solution looking for a problem [not directed at you
specifically Jacob, just the whole question of whether we should care
about strndup at all].

Tom
 
Richard Heathfield

Richard Tobin said:
When I've used my own version of strndup, it's always been to make an
ordinary string from a "counted" string, so there is no question of
truncation. I suspect this is the more common use of it, rather than
copying a string to a buffer that might not be big enough.

No, that's not more common - it's just a more *intelligent* use of strncpy.
By far the most common usage of strncpy is from the cargo cult bunch: "I'm
smart, I know about buffer overruns, I know I should use strncpy instead of
strcpy, oops, oh look, I just threw away data, ohdearhowsadnevermind."
 
Richard Heathfield

jacob navia said:
You know of a version that gives that information back to the user?

No, but that doesn't mean much because I know of no version of strndup
whatsoever, apart from the one you just posted. I don't see any particular
use for it, so I've never gone looking for it. (That is not the same as
saying it's useless - one man's useless is another man's essential.)
 
CBFalconer

Tom said:
jacob said:
It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few
weeks ago.

It proposes strdup and strndup too. Lcc-win32 already provides
strdup, but strndup was missing. Here is a proposed implementation.
I would like to see if your sharp eyes see any bug or serious
problem with it.

What would the point of strndup be? It allocates the memory so
there really isn't a problem of an overflow. If you're low on
memory ... why are you duplicating a string?

Methinks this is a solution looking for a problem [not directed at
you specifically Jacob, just the whole question of whether we should
care about strndup at all].

I can see a possible use for it - to duplicate the initial portion
of a string only, i.e. to truncate it on the right.

#include <stdlib.h>
#include <stddef.h>

/* The strndup function copies not more than n characters
(characters that follow a null character are not copied)
from string to a dynamically allocated buffer. The copied
string shall always be null terminated.
*/
char *strndup(const char *_string, size_t _len) {
char *s, *p;

if ((p = s = malloc(_len + 1))) {
if (_string) /* else interpret NULL as empty string */
while (_len-- && (*p++ = *_string++)) continue;
*p = '\0';
}
return s;
} /* strndup, untested */

However N1193 is, in general, a Microsoft proposal, not a
standard. It is their lame attempt to catch their own foolish
clueless programmers. It is ugly, ugly, ugly.

BTW, Jacob's code carelessly fails to define size_t. It also returns
NULL for reasons other than lack of memory, which can only cause
confusion.
 
Thad Smith

jacob said:
It proposes strdup and strndup too. Lcc-win32 already provides strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it. ....
---------------------------------------------------------cut here
#include <string.h>
#include <stdlib.h>
/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
char *strndup(const char *string,size_t s)

I haven't looked at the code, but I am a big believer in accurate
documentation.

Your comment references "n characters", but defines a parameter s. Is
that n? The comment doesn't say how many characters are copied, only
that it is not more than n, so I can't rely on the function to copy the
full strlen(string), even if s exceeds this. I suggest calling the
result a dynamically allocated char array, rather than a buffer, since
it isn't necessarily buffering anything. As a matter of fact, since it
is presumably sized to the input string, there probably isn't room for
additional characters to be added later, thus not used as a buffer.

The description doesn't say what the results are if insufficient memory
exists for the new array.
 
jacob navia

CBFalconer wrote:
BTW, Jacob's code carelessly fails to define size_t.

???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

It also returns
NULL for reasons other than lack of memory, which can only cause
confusion.
??? Why confusion?

Null means failure
 
jacob navia

Thad Smith wrote:
I haven't looked at the code, but I am a big believer in accurate
documentation.

Your comment references "n characters", but defines a parameter s. Is
that n? The comment doesn't say how many characters are copied, only
that it is not more than n, so I can't rely on the function to copy the
full strlen(string), even if s exceeds this. I suggest calling the
result a dynamically allocated char array, rather than a buffer, since
it isn't necessarily buffering anything. As a matter of fact, since it
is presumably sized to the input string, there probably isn't room for
additional characters to be added later, thus not used as a buffer.

The description doesn't say what the results are if insufficient memory
exists for the new array.

Excuse me, I just cut and pasted the specification from the
standards document. It is not my documentation.
 
jacob navia

CBFalconer wrote:
Tom St Denis wrote:
char *strndup(const char *_string, size_t _len) {
char *s, *p;

if ((p = s = malloc(_len + 1)))

Here you allocate _len characters even if the string could be
considerably shorter... This wastes space.
 
Roland Pibinger

Reading comp.std.c ....
It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.
It proposes strdup and strndup too. Lcc-win32 already provides strdup,
but strndup was missing.

strdup, getline and other functions require deallocation by the user.
They were deliberately excluded from the C Standards. Not by
oversight, not because they were difficult to implement, not because
they wouldn't have been 'useful'.

Best regards,
Roland Pibinger
 
jacob navia

Roland Pibinger wrote:
strdup, getline and other functions require deallocation by the user.
They were deliberately excluded from the C Standards. Not by
oversight, not because they were difficult to implement, not because
they wouldn't have been 'useful'.

Best regards,
Roland Pibinger

1) What's wrong with the user deallocating?
2) Maybe this view is changing since that technical report is there...
 
Richard Heathfield

jacob navia said:
CBFalconer wrote:

(Wrong, Chuck!)
???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

On your particular implementation, possibly, but the Standard doesn't
require this as far as I know. What it *does* require is that size_t is
available as a type to those translation units that have included
<stdlib.h>. So yes, your code was correct in that regard.
 
jacob navia

Richard Heathfield wrote:
jacob navia said:




(Wrong, Chuck!)




On your particular implementation, possibly, but the Standard doesn't
require this as far as I know. What it *does* require is that size_t is
available as a type to those translation units that have included
<stdlib.h>. So yes, your code was correct in that regard.

stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(void *,size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

Obviously in other implementations they could have defined size_t
several times in several files.
 
Richard Heathfield

jacob navia said:

stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(void *,size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

True enough - or, better still, we can simply quote chapter and verse at
Chuck:

"4.10 GENERAL UTILITIES <stdlib.h>

The header <stdlib.h> declares four types and several functions of
general utility, and defines several macros./113/

The types declared are size_t and wchar_t (both described in $4.1.5),..."

<snip>
 
CBFalconer

jacob said:
CBFalconer wrote:


Here you allocate _len characters even if the string could be
considerably shorter... This wastes space.

The only possible use for the function is to truncate input
strings, as I pointed out. In that case there is no space wasted.
 
CBFalconer

Roland said:
strdup, getline and other functions require deallocation by the user.
They were deliberately excluded from the C Standards. Not by
oversight, not because they were difficult to implement, not because
they wouldn't have been 'useful'.

By that reasoning malloc, calloc, and realloc should also be
omitted. Not to mention fopen.
 
CBFalconer

jacob said:
CBFalconer wrote:


???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

Sorry, you are right there. However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).
??? Why confusion?

Null means failure

Because the caller can't tell whether or not the system is out of
memory. If you are going to define the 'undefined behaviour' from
calling with an invalid parameter, you might as well make it
something innocuous. Otherwise the only way the user can tell the
cause of the error is to save the input parameter, and test it for
NULL himself after the routine returns a NULL. If he does that he
might as well test first, and not call the routine.

The other basic choice is to let a bad parameter blow up in the
function. Both will work properly if the programmer tests and
avoids passing that bad parameter in the first place. If he
doesn't the 'innocuous behavior' has a much better chance of
producing a user friendly end application. The unsophisticated end
user has little use for a segfault message.
 
