strndup: RFC


pete

Richard said:
jacob navia said:



True enough - or, better still,
we can simply quote chapter and verse at Chuck:

"4.10 GENERAL UTILITIES <stdlib.h>

The header <stdlib.h> declares four types and several functions of
general utility, and defines several macros./113/

The types declared are size_t and wchar_t
(both described in $4.1.5),..."

<snip>

We can, but I also use jacob navia's logic
to deduce what's defined in a header,
except for stddef.h which is small enough to learn.

I tend to be more familiar with standard function descriptions
from looking them up repeatedly,
than I am with the standard header descriptions in their entirety.

I know that the ctype functions are described as being
able to take the value of EOF as an argument,
so I know that EOF is defined in ctype.h.

I know that putchar is described as being
able to return the value of EOF,
so I know that EOF is defined in stdio.h.
 

Joe Wright

jacob navia wrote:
[ snip ]
stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

Obviously in other implementation they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

realloc's prototype..

void *realloc(void *_ptr, size_t _size);
 

jacob navia

Joe Wright a écrit :
jacob navia wrote:
[ snip ]
stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

Obviously in other implementation they could have defined size_t
several times in several files.


Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

realloc's prototype..

void *realloc(void *_ptr, size_t _size);

Please, Joe, do not use my mail messages as header files :)

I was just copying, and I wrote (etc...) sometimes to
speed it up. Maybe it is laziness, but I just wanted to make a point,
not to write a header file.
 

Chris Torek

[comp.compilers.lcc snipped as there is nothing lcc-specific here]

... However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

Indeed, in fact including <stdlib.h> must *not* include <stddef.h>,
because stddef.h will (e.g.) define "offsetof", and it must not be
defined if you have included only stdlib.h:

% cat x.c
#include <stdlib.h>
#ifdef offsetof
# error offsetof defined when it should not be
#endif
void f(void){}
%

This x.c translation unit must translate without error, i.e., the
#error must not fire.

(Now that it has been officially announced I can mention this: One
rather painful aspect of obtaining POSIX PSE52 certification for
vxWorks 6.4 was cleaning up our header-file organization so that
"inappropriate" symbols were not defined by including various POSIX
headers. The certification process includes code that reads the
actual headers, searching for all things that look like identifiers,
and emits a test module to make sure they are not "#define"d when
compiling in "POSIX mode". Only names reserved to the implementor
-- things like __users_must_not_use_this_identifier -- are omitted
from this test.

Making header files "do the right thing" when writing them from
scratch is not that bad, but retroactively enforcing such rules in
a system with many years of history is more difficult.)
 

Roland Pibinger

1) What's wrong with the user deallocating?

It's bad style. The responsibility for deallocation becomes unclear.
In your program you have some functions that return a char* that must
be freed and other functions returning char* without that requirement:
a perfect recipe for a leaking program.
Moreover, functions like getline foster an inefficient style. They
dynamically allocate memory for each line even when most of the lines
would fit into a char[80] buffer and only exceptional cases needed
dynamic allocation.
2) Maybe this view is changing since that technical report is there...

Why should they abandon good style?

Best regards,
Roland Pibinger
 

Roland Pibinger

By that reasoning malloc, calloc, and realloc should also be
omitted.

Why? Do those functions force you to free something you haven't
allocated?

Not to mention fopen.

Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

Best regards,
Roland Pibinger
 

jacob navia

Roland Pibinger a écrit :
1) What's wrong with the user deallocating?


It's bad style. The responsibility for deallocation becomes unclear.
In your program you have some functions that return a char* that must
be freed and other functions returning char* without that requirement:
a perfect recipe for a leaking program.
Moreover, functions like getline foster an inefficient style. They
dynamically allocate memory for each line even when most of the lines
would fit into a char[80] buffer and only exceptional cases needed
dynamic allocation.

2) Maybe this view is changing since that technical report is there...


Why should they abandon good style?

Best regards,
Roland Pibinger

The responsibility for deallocation is perfectly clear: the specs
specify that the user should deallocate the new space.

And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!
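
To make the buffer-reuse claim concrete, here is a minimal sketch of a getline-like routine written against that contract. The name read_line, its return type, and the growth policy are assumptions made for illustration; this is not the TR's text or any library's implementation. It calls realloc only when the incoming line does not fit, which also means *lineptr must be NULL or a pointer obtained from malloc. Handing it an automatic array is undefined behavior as soon as growth is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical getline-like helper (illustration only).
 * Precondition: *lineptr is NULL or came from malloc/calloc/realloc,
 * and *n is the size of that object (0 when *lineptr is NULL).
 * Returns the number of characters stored, or -1 if end-of-file is hit
 * before any character is read or if the allocation fails. */
long read_line(char **lineptr, size_t *n, FILE *fp)
{
    size_t len = 0;
    int c;

    assert(lineptr != NULL && n != NULL && fp != NULL);

    while ((c = getc(fp)) != EOF) {
        /* Grow only when this character plus the terminator will not fit. */
        if (len + 2 > *n) {
            size_t new_size = (*n < 16) ? 32 : *n * 2;
            char *tmp = realloc(*lineptr, new_size);
            if (tmp == NULL)
                return -1;      /* the old buffer is still valid */
            *lineptr = tmp;
            *n = new_size;
        }
        (*lineptr)[len++] = (char)c;
        if (c == '\n')
            break;
    }
    if (len == 0)
        return -1;              /* nothing was read */
    (*lineptr)[len] = '\0';
    return (long)len;
}
```

With a malloc'd buffer of 80 and lines shorter than that, *lineptr and *n come back unchanged after every call, so a loop over a typical text file performs one allocation in total.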
 

Stephen Sprunk

Roland Pibinger said:
Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

I've written code like that, yes, though more often with <OT>open() /
close()</OT> than fopen() / fclose(). It's also pretty standard when
working with <OT>sockets</OT>; because they require so much f'ing work
to create, people tend to put all that in another function to avoid
clutter.

This whole "who deallocates the returned string" argument is one of the
largest problems I have with C; yes, you can always find the correct
answer if you look at the function specs (assuming they exist), but it's
not obvious and is therefore prone to errors. <OT> I'm often tempted to
use the C-like subset of C++ just so I'll have string objects that
deallocate (er, destruct) themselves when appropriate rather than having
to read function specs to figure things out. The hassle of requiring a
working C++ environment isn't yet worth the gain, though it's getting
closer. </OT>

S
 

CBFalconer

jacob said:
.... snip ...

And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.
 

CBFalconer

Roland said:
Why? Do those functions force you to free something you haven't
allocated?

Yes. Which function does the allocating? If you answer 'all
three' then what is wrong with making it 'all four'?

Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

Certainly would. Although the passed parameters would probably
have a bit more to say. For example, you want to open a file with
a specified name, or prefix, in some system dependent directory,
subject to some condition or other. Or you want to conditionally
override a default file name. Consider logging files.
 

Keith Thompson

Joe Wright said:
jacob navia wrote:
[ snip ]
stdlib.h defines
calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);
and many others, so I do not see how size_t could be unknown after
including stdlib.h...
Obviously in other implementation they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

How so? A prototype is a function declaration that declares the types
(not necessarily the names) of its parameters. Only the definition
needs the parameter names.
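
A small illustration of that point, with a made-up function (count_char is not from the thread): the first declaration below names no parameters and is still a prototype, because it specifies their types.

```c
#include <stddef.h>

/* Prototype with parameter types only; the names are optional here. */
size_t count_char(const char *, char);

/* An equivalent prototype with names. They serve as documentation and
 * need not match the names used in the definition. */
size_t count_char(const char *text, char wanted);

/* Definition: the names are required because the body refers to them. */
size_t count_char(const char *text, char wanted)
{
    size_t hits = 0;

    for (; *text != '\0'; text++)
        if (*text == wanted)
            hits++;
    return hits;
}
```

Both declarations are compatible with each other and with the definition, so a translation unit may contain all three.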
 

Keith Thompson

Chris Torek said:
[comp.compilers.lcc snipped as there is nothing lcc-specific here]

... However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

Indeed, in fact including <stdlib.h> must *not* include <stddef.h>,
because stddef.h will (e.g.) define "offsetof", and it must not be
defined if you have included only stdlib.h:

% cat x.c
#include <stdlib.h>
#ifdef offsetof
# error offsetof defined when it should not be
#endif
void f(void){}
%

This x.c translation unit must translate without error, i.e., the
#error must not fire.
[...]

<stdlib.h> and <stddef.h> could have some kind of logic that causes
<stddef.h> to define offsetof normally, but not to define it if it's
#include'd from <stdlib.h>. The restriction isn't that <stdlib.h> may
not include <stddef.h>; it's that, as your test program shows,
including just <stdlib.h> may not define offsetof.
 

jacob navia

CBFalconer a écrit :
jacob navia wrote:

... snip ...



Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

You have a good point here.

The specs are:

< quote >

4 The application shall ensure that *lineptr is a valid argument that
could be passed to the free function. If *n is nonzero, the application
shall ensure that *lineptr points to an object containing at least *n
characters.
5 The size of the object pointed to by *lineptr shall be increased to
fit the incoming line, if it isn’t already large enough. The characters
read shall be stored in the string pointed to by the argument lineptr

< end quote >

That error would be absolutely fatal, since it would mean that the
function would pass a wrong pointer to realloc, with disastrous
consequences!

Since there isn't in C a portable way to determine if a memory
block is the result of malloc() we are screwed...

jacob
 

Keith Thompson

CBFalconer said:
Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

It's worse than that. Attempting to realloc() something that wasn't
allocated by one of the *alloc() functions invokes undefined behavior.
You're as likely to silently corrupt the heap (if the implementation
uses a "heap" for dynamic memory allocation) as to make the program
"blow up".
 

Thad Smith

jacob said:
Thad Smith a écrit :
[suggestions to improve function description deleted]
Excuse me, I just cut and pasted the specification from the
standards document. It is not my documentation.

Feel free to forward my comments to the author. Still, if you
distribute such a function, I recommend you use a better description.
 

Richard Tobin

When I've used my own version of strndup, it's always been to make an
                                 ^^^^^^^
ordinary string from a "counted" string, so there is no question of
truncation. I suspect this is the more common use of it, rather than
copying a string to a buffer that might not be big enough.

No, that's not more common - it's just a more *intelligent* use of strncpy.
                                                                   ^^^^^^^
By far the most common usage of strncpy is from the cargo cult bunch: "I'm
smart, I know about buffer overruns, I know I should use strncpy instead of
strcpy, oops, oh look, I just threw away data, ohdearhowsadnevermind."

Um, you seem to have mixed up strndup and strncpy here. But looking
at my own posting, I seem to have done the same thing in the last
sentence. If you know the string is null-terminated but don't know
how big it is, just use strdup. I can't imagine why you'd use strndup
unless you had a counted string or wanted to just copy a prefix of the
string.

-- Richard
 

Richard Tobin

Tom St Denis said:
What would the point of strndup be?

As I said in another article, it's useful for converting a counted
string to a null-terminated one.
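
A minimal sketch of that use, assuming a stand-in named my_strndup (the name and the exact behavior on allocation failure are choices made here, not taken from any specification): it never reads past the first n characters, so the source does not have to be null-terminated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strndup (illustration only).
 * Copies at most n characters of s, stopping early at a null character,
 * and always terminates the result. Returns NULL if malloc fails.
 * The caller is responsible for freeing the result. */
char *my_strndup(const char *s, size_t n)
{
    /* memchr looks at no more than n characters of s. */
    const char *end = memchr(s, '\0', n);
    size_t len = (end != NULL) ? (size_t)(end - s) : n;
    char *copy = malloc(len + 1);

    if (copy != NULL) {
        memcpy(copy, s, len);
        copy[len] = '\0';
    }
    return copy;
}
```

Given a counted string such as the five characters h, e, l, l, o with no terminator, my_strndup(p, 5) yields an ordinary string; the same call on a longer null-terminated string copies a prefix.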

-- Richard
 

Stephen Sprunk

CBFalconer said:
....
I can see a possible use for it - to duplicate the initial portion
of a string only, i.e. to truncate it on the right.

Ah, but it's in a "Specification for Safer C Library Functions"; that
implies that strndup() is somehow a safer version of strdup(), like
strncat() and strncpy() are safer versions of strcat() and strcpy().

There is nothing unsafe about strdup() per se, though. The problem with
strdup() is how it's normally used, which is to copy strings that are
returned from functions as pointers to static buffers. This is not
thread-safe/re-entrant, and strndup() doesn't solve that problem at all,
because the problem is needing strdup() in the first place.

If one needs to truncate a string on the right, it's easy enough to do
that with strdup() and some extra code, with malloc() and strncpy(), or
by creating a new function which truncates any string (such as one
returned by strdup()). But such a function is not a "safer" version of
strdup() -- it's entirely new functionality.

S
 

Stan Milam

CBFalconer said:
Sorry, you are right there. However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).


Because the caller can't tell whether or not the system is out of
memory. If you are going to define the 'undefined behaviour' from
calling with an invalid parameter, you might as well make it
something innocuous. Otherwise the only way the user can tell the
cause of the error is to save the input parameter, and test it for
NULL himself after the routine returns a NULL. If he does that he
might as well test first, and not call the routine.

The other basic choice is to let a bad parameter blow up in the
function. Both will work properly if the programmer tests and
avoids passing that bad parameter in the first place. If he
doesn't the 'innocuous behavior' has a much better chance of
producing a user friendly end application. The unsophisticated end
user has little use for a segfault message.

Why can't you test the arguments for valid conditions, set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails, errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

#include <errno.h>
#include <stdlib.h>

char *
dupnstr( const char *source, size_t size )
{
    size_t length;
    const char *scan;
    char *wrk, *rv = NULL;

    /* Reject a null pointer, an empty string, or a zero count. */
    if ( source == NULL || *source == 0 || size == 0 )
        errno = EINVAL;
    else {
        /* Find the source length, then clamp size to it. */
        for ( length = 0, scan = source; *scan; length++ ) scan++;
        if ( length < size ) size = length;
        /* Whether errno is set on failure is up to malloc. */
        if ( ( rv = malloc( size + 1 ) ) != NULL ) {
            for ( wrk = rv; size; size-- ) *wrk++ = *source++;
            *wrk = 0;
        }
    }
    return rv;
}


Regards,
Stan Milam.
 

Richard Heathfield

Stan Milam said:

Why can't you test the arguments for valid conditions, set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails, errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

Except that neither EINVAL nor ENOMEM is defined by the Standard.
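
C99 guarantees only EDOM, EILSEQ and ERANGE in <errno.h>, so one portable way to still tell the caller why the call failed is to return the reason through a parameter. The sketch below is an assumption-laden variation on the function above: the enum, its constants, and the name dupnstr_status are invented for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; these names are not from the thread. */
enum dup_status { DUP_OK, DUP_BAD_ARG, DUP_NO_MEMORY };

/* Copies at most size characters of source into freshly allocated storage.
 * The outcome is stored in *status when status is not a null pointer.
 * Returns NULL on a bad argument or when malloc fails. */
char *dupnstr_status(const char *source, size_t size, enum dup_status *status)
{
    enum dup_status result = DUP_OK;
    char *copy = NULL;

    if (source == NULL || *source == '\0' || size == 0) {
        result = DUP_BAD_ARG;
    } else {
        size_t length = strlen(source);

        if (length < size)
            size = length;
        copy = malloc(size + 1);
        if (copy == NULL) {
            result = DUP_NO_MEMORY;
        } else {
            memcpy(copy, source, size);
            copy[size] = '\0';
        }
    }
    if (status != NULL)
        *status = result;
    return copy;
}
```

The caller checks the status instead of errno, which keeps the code within what the Standard itself defines; the argument checks mirror the ones in the posted function.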
 
