getdelim: wrong specs

jacob navia

In the documents presented in the post-Portland meeting
of the C standards committee
http://www.open-std.org/jtc1/sc22/wg14/

there is a document called ISO/IEC WDTR 24731-2,
Specification for Safer C Library Functions —
Part II: Dynamic Allocation Functions

In that document we have:
ssize_t getdelim(char **restrict lineptr,
size_t *restrict n, int delimiter, FILE *stream);

We read:
< quote >

Upon successful completion the getdelim function shall return the number
of characters written into the buffer, including the delimiter character
if one was encountered before EOF. Otherwise it shall return −1.

< end quote >

We come back here to the error analysis problem, a problem that I
have been mentioning for ages and that comes up again and again. Returning
-1 to indicate "some error occurred but I will not tell you which"
is BAD DESIGN!

This function can have several errors that should be distinguished in
the return value of the function, i.e. there should be other return
values for signaling errors to the user. Lcc-win32 implements this
function and returns:

-5 No memory left for allocating the line
-4 Line pointer points to a buffer but n is <= 0. This is an error in
the incoming arguments.
-3 The n parameter is NULL
-2 The LinePointer parameter is NULL
-1 End of file without any characters read.

This is much more detailed error reporting, which allows the
user of the function to discriminate between the different error
conditions. It is sad that a document that proposes "safer" functions
doesn't do the error analysis that is an essential part of safer
programming.
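
For illustration, the five codes listed above could be given names and mapped to messages roughly as follows. Only the numeric values and their meanings come from the list; the identifiers (GD_EOF, getdelim_error_text and so on) are hypothetical and are not taken from Lcc-win32.

```c
#include <stddef.h>

/* Hypothetical names for the return codes listed above. */
enum getdelim_error {
    GD_EOF          = -1, /* end of file without any characters read */
    GD_NULL_LINEPTR = -2, /* the line pointer parameter is NULL */
    GD_NULL_N       = -3, /* the n parameter is NULL */
    GD_BAD_SIZE     = -4, /* a buffer was supplied but its size is not positive */
    GD_NO_MEMORY    = -5  /* no memory left for allocating the line */
};

/*
 * Maps one of the codes above to a short message.
 * Returns NULL for any value that is not in the list,
 * which includes every non-negative (successful) count.
 */
const char *getdelim_error_text(long rc)
{
    switch (rc) {
    case GD_EOF:          return "end of file without any characters read";
    case GD_NULL_LINEPTR: return "line pointer parameter is NULL";
    case GD_NULL_N:       return "n parameter is NULL";
    case GD_BAD_SIZE:     return "buffer supplied but size is not positive";
    case GD_NO_MEMORY:    return "no memory left for allocating the line";
    default:              return NULL;
    }
}
```

A caller would test the return value for being negative first and only then ask for the message.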
 
pete

jacob said:
In the documents presented in the post-Portland meeting
of the C standards committee
http://www.open-std.org/jtc1/sc22/wg14/

there is a document called ISO/IEC WDTR 24731-2,
Specification for Safer C Library Functions —
Part II: Dynamic Allocation Functions

In that document we have:
ssize_t getdelim(char **restrict lineptr,
size_t *restrict n, int delimiter, FILE *stream);

We read:
< quote >

Upon successful completion the getdelim function shall return the number
of characters written into the buffer, including the delimiter character
if one was encountered before EOF. Otherwise it shall return −1.

< end quote >

We come back here to the error analysis problem, a problem that I
have been mentioning for ages and that comes up again and again. Returning
-1 to indicate "some error occurred but I will not tell you which"
is BAD DESIGN!

This function can have several errors that should be distinguished in
the return value of the function, i.e. there should be other return
values for signaling errors to the user. Lcc-win32 implements this
function and returns:

-5 No memory left for allocating the line
-4 Line pointer points to a buffer but n is <= 0. This is an error in
the incoming arguments.

Not if the buffer is allocated this way:
n = 0;
buffer = malloc(0);

ITYM "... but n equals 0"

-3 The n parameter is NULL

ITYM null, and that's not a problem.
-2 The LinePointer parameter is NULL

Not a problem either, if n equals 0.
-1 End of file without any characters read.

This is much more detailed error reporting, which allows the
user of the function to discriminate between the different error
conditions. It is sad that a document that proposes "safer" functions
doesn't do the error analysis that is an essential part of safer
programming.

I've noticed that the next function listed, getline,
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
has exactly the same functionality as my line_to_string function,
(which I've posted enough times to be annoying recently
http://groups-beta.google.com/group/comp.lang.c/msg/c3694880f515e317)
but the return values are different.

line_to_string returns EOF upon an end of file condition
before reading a newline, or input error.
feof and ferror can be used to determine which.
"End of file without any characters read"
is a specific case of "before reading a newline".

line_to_string returns 0 if there is not enough memory.
if *size is less than 2 within the function,
then there is also a character pushback.

Otherwise, line_to_string returns
"the number of characters written into the buffer"
just like getline.

line_to_string always leaves a string in the buffer,
unless *size is zero within the function.

Both the pushback case and the no string in the buffer case,
can be definitely avoided,
by supplying a buffer with more than one byte,
in the call to line_to_string.
 
jacob navia

pete wrote:
Not if the buffer is allocated this way:
n = 0;
buffer = malloc(0);

This is ridiculous... Now what is the point of getting a line of zero
length???? This is surely an error!

ITYM "... but n equals 0"





ITYM null, and that's not a problem.

The n parameter should never be NULL according to the specs...
Not a problem either, if n equals 0.

No, because it can't return the result into *n!!! It will read
chars but it can't return them. This is nonsense...
 
pete

jacob said:
pete wrote:

This is ridiculous... Now what is the point of getting a line of zero
length???? This is surely an error!


The point of getting a line of zero length,
is that that's not how the function works.

Recall the title of the document:
Part II: Dynamic Allocation Functions

A very simple way to call the function is:

ssize_t rc;
char *buffer = NULL;
size_t size = 0;

rc = getdelim(&buffer, &size, '\n', stdin);
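
As a fuller sketch of the same calling pattern, the loop below starts with a null buffer and a size of zero and lets getdelim do all the allocation. It assumes the POSIX.1-2008 getdelim, which has the same signature as the one in the draft; the function name count_records is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Counts the records in fp, where a record ends at the delimiter
 * or at end of file. Returns -1 if the stream reports a read error.
 */
long count_records(FILE *fp, int delimiter)
{
    char *buffer = NULL;  /* null pointer plus size 0: getdelim allocates */
    size_t size = 0;
    long count = 0;

    /* The same buffer is reused and grown as needed on every call. */
    while (getdelim(&buffer, &size, delimiter, fp) != -1)
        count++;

    free(buffer);         /* the caller owns whatever getdelim allocated */
    return ferror(fp) ? -1 : count;
}
```

Called as count_records(stdin, '\n'), this counts input lines, including a final line that has no trailing newline.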

In this post:
http://groups-beta.google.com/group/comp.lang.c/msg/4da8ef6a9d40e578
line_to_string is called in just exactly that way.

/*
** INITIAL_BUFFER_SIZE can be any number.
** Lower numbers are more likely
** to get a non-NULL return value from malloc.
** Higher numbers are more likely to prevent
** any further allocation from being needed.
*/
#define INITIAL_BUFFER_SIZE 0

buff_size = INITIAL_BUFFER_SIZE;
buff_ptr = malloc(buff_size);
if (buff_ptr == NULL && buff_size != 0) {
    printf("malloc(%lu) == NULL\n", (long unsigned)buff_size);
    exit(EXIT_FAILURE);
}

rc = line_to_string(stdin, &buff_ptr, &buff_size);
The n parameter should never be NULL according to the specs...

I misread that as *n being equal to NULL.
From your usage of "n is <= 0"
I assumed you meant n to be the buffer size
rather than the pointer to the size.
No, because it can't return the result into *n!!! It will read
chars but it can't return them. This is nonsense...

Likewise, I meant if *n equals zero
and I also misread LinePointer to mean *LinePointer.
 
Hallvard B Furuseth

jacob said:
We come back here to the error analysis problem, a problem that I
have been mentioning for ages and that comes up again and again. Returning
-1 to indicate "some error occurred but I will not tell you which"
is BAD DESIGN!

Indeed. Why don't they just set errno though? It's not my favorite
way of reporting errors, but it _is_ the Standard C way. I don't
remember offhand any standard C functions which have several failure
return values instead.
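
To make the errno approach concrete: with a getdelim that behaves as POSIX.1-2008 describes (errno set to EINVAL for null arguments, ENOMEM when allocation fails), a caller could sort out the -1 cases roughly like this. This is a sketch under that assumption; the enum and the wrapper name read_record are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical classification of what a single -1 can mean. */
enum read_status {
    READ_OK,
    READ_EOF,
    READ_IO_ERROR,
    READ_NO_MEMORY,
    READ_BAD_ARGS
};

/*
 * Calls getdelim once and classifies the outcome.
 * On READ_OK, *len holds the number of characters stored.
 */
enum read_status read_record(FILE *fp, int delimiter,
                             char **buf, size_t *cap, ssize_t *len)
{
    ssize_t rc;

    errno = 0;                      /* so a stale value is not misread */
    rc = getdelim(buf, cap, delimiter, fp);
    if (rc != -1) {
        *len = rc;
        return READ_OK;
    }
    if (errno == ENOMEM)
        return READ_NO_MEMORY;
    if (errno == EINVAL)
        return READ_BAD_ARGS;
    if (ferror(fp))
        return READ_IO_ERROR;
    return READ_EOF;                /* nothing left to read */
}
```

The function's own return value stays a plain count or -1; the detail travels through errno and the stream's error and end-of-file indicators.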
 
CBFalconer

jacob said:
In the documents presented in the post-Portland meeting
of the C standards committee
http://www.open-std.org/jtc1/sc22/wg14/

there is a document called ISO/IEC WDTR 24731-2,
Specification for Safer C Library Functions —
Part II: Dynamic Allocation Functions

In that document we have:
ssize_t getdelim(char **restrict lineptr,
size_t *restrict n, int delimiter, FILE *stream);

We read:
< quote >

Upon successful completion the getdelim function shall return the number
of characters written into the buffer, including the delimiter character
if one was encountered before EOF. Otherwise it shall return −1.

< end quote >

We come back here to the error analysis problem, a problem that I
have been mentioning for ages and that comes up again and again. Returning
-1 to indicate "some error occurred but I will not tell you which"
is BAD DESIGN!

But which you ignored in your proposed strndup function, and then
objected when I pointed it out. As I said then that document
originated with Microsoft, and is subject to the usual Microsoft
failings, i.e. it should be ignored.
 
boa sema

Hallvard said:
Indeed. Why don't they just set errno though? It's not my favorite
way of reporting errors, but it _is_ the Standard C way. I don't
remember offhand any standard C functions which have several failure
return values instead.

strtol() returns either 0, LONG_MAX or LONG_MIN, which may or may not be
interpreted as error codes ;-)

Boa
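
The strtol case can be spelled out. Because 0, LONG_MAX and LONG_MIN are also perfectly valid results, the caller has to consult the end pointer and errno to tell a failure from a value. A minimal sketch follows; the wrapper name parse_long and its return codes are invented for the example.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parses a base-10 long from text.
 * Returns  0 on success and stores the value in *out,
 *         -1 if there are no digits or there is trailing junk,
 *         -2 if the value does not fit in a long.
 */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;                          /* strtol only sets errno on failure */
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                      /* strtol returned 0 but parsed nothing,
                                           or stopped before the end */
    if (errno == ERANGE)
        return -2;                      /* LONG_MAX or LONG_MIN from clamping */
    *out = value;
    return 0;
}
```

With this, the input "0" is a success with the value 0, while "abc" is a failure, even though strtol returns 0 for both.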
 
Niklas Matthies

pete wrote:

This is ridiculous... Now what is the point of getting a line of zero
length???? This is surely an error!

Following your logic,

int n = 0;
...
int k = m + n;

is surely an error as well (better raise a signal or something),
and one should write

int k = m;
if (n != 0)
{
    k += n;
}

instead...

-- Niklas Matthies
 
jacob navia

Niklas Matthies wrote:
Following your logic,

int n = 0;
...
int k = m + n;

is surely an error as well (better raise a signal or something),
and one should write

int k = m;
if (n != 0)
{
    k += n;
}

instead...

-- Niklas Matthies

The straw man argument consists of making others believe that your
adversary has said something ridiculous, and absolutely absurd.

You take also a sentence where I say:

"What's the point of reading a line of zero length?"

to mean that I am against additions... How nice Mr Matthies,
you have a very good way of discussing things.

Note that the only consequence of passing a zero length is that you
receive a negative result. Nothing else happens, and you are
free to ignore the result...
 
Niklas Matthies

The straw man argument consists of making others believe that your
adversary has said something ridiculous, and absolutely absurd.

You take also a sentence where I say:

"What's the point of reading a line of zero length?"

So, what's the point of adding zero to an integer?

-- Niklas Matthies
 
Mark McIntyre

The straw man argument consists of making others believe that your
adversary has said something ridiculous, and absolutely absurd.

Indeed, and it's something you seem quite fond of.
You take also a sentence where I say:

"What's the point of reading a line of zero length?"

to mean that I am against additions... How nice Mr Matthies,
you have a very good way of discussing things.

Case in point. Niklas did not say that.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
Keith Thompson

pete said:
jacob navia wrote: [...]
-4 Line pointer points to a buffer but n is <= 0. This is an error in
the incoming arguments.

Not if the buffer is allocated this way:
n = 0;
buffer = malloc(0);
[...]

The result of malloc(0) is implementation-defined; it can either
return a null pointer or a pointer to some allocated but inaccessible
memory. C doesn't support zero-sized objects.

If it did, though, it would make sense to support them in all possible
contexts, just for the sake of consistency. It's dangerous to assume
that nobody would ever use a function in some particular way.
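
Since either outcome of malloc(0) is allowed, code that may request zero bytes has to treat a null result as failure only when the requested size was nonzero. A small sketch of that check follows; the helper name alloc_bytes is hypothetical.

```c
#include <stdlib.h>

/*
 * Allocates size bytes and stores the pointer in *out.
 * Returns 0 on success and -1 on allocation failure.
 * For size 0, a null pointer from malloc counts as success,
 * because the implementation is allowed to return one.
 */
int alloc_bytes(size_t size, void **out)
{
    void *p = malloc(size);

    if (p == NULL && size != 0)
        return -1;
    *out = p;       /* may be null when size is 0; free(NULL) is harmless */
    return 0;
}
```

Whatever comes back for size 0 can be passed to free or realloc, but must not be dereferenced.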
 
Niklas Matthies

pete said:
jacob navia wrote: [...]
-4 Line pointer points to a buffer but n is <= 0. This is an error in
the incoming arguments.

Not if the buffer is allocated this way:
n = 0;
buffer = malloc(0);
[...]

The result of malloc(0) is implementation-defined; it can either
return a null pointer or a pointer to some allocated but
inaccessible memory.

Which means you can use it portably as if it were a pointer to a
zero-sized object (as long as you don't expect it to have a unique
address, that is).
C doesn't support zero-sized objects. If it did, though, it would
make sense to support them in all possible contexts, just for the
sake of consistency.

Given the above, it already does make sense.

-- Niklas Matthies
 
Keith Thompson

Niklas Matthies said:
pete said:
jacob navia wrote: [...]
-4 Line pointer points to a buffer but n is <= 0. This is an error in
the incoming arguments.

Not if the buffer is allocated this way:
n = 0;
buffer = malloc(0);
[...]

The result of malloc(0) is implementation-defined; it can either
return a null pointer or a pointer to some allocated but
inaccessible memory.

Which means you can use it portably as if it were a pointer to a
zero-sized object (as long as you don't expect it to have a unique
address, that is).

Any standard library function with a char* parameter that points to a
string invokes undefined behavior if the argument is a null pointer.

char *s = malloc(0);
size_t len = strlen(s);
/*
* Returns 0 if s != NULL
* Invokes UB if s == NULL
*/

A valid string has to have a size of at least 1, but consider also the
mem*() functions.
Given the above, it already does make sense.

I disagree.
 
Eric Sosman

Keith said:
Any standard library function with a char* parameter that points to a
string invokes undefined behavior if the argument is a null pointer.

Well, "almost any." Two counterexamples are strtok()
and system().
 
Niklas Matthies

["Followup-To:" header set to comp.std.c.]
Niklas Matthies said:
Any standard library function with a char* parameter that points to a
string invokes undefined behavior if the argument is a null pointer. :
A valid string has to have a size of at least 1, but consider also
the mem*() functions.

Strings are never zero-sized objects for the reason you note, hence
pretty irrelevant to the discussion, but for functions that work on
raw memory, like the mem* functions or fread/fwrite, it would make
perfect sense to allow whatever malloc(0) returns when the buffer
size is zero.

-- Niklas Matthies
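
Until the library functions themselves allow it, the behavior asked for here can be had with a thin wrapper that never calls memcpy when the size is zero. A sketch, with the hypothetical name copy_bytes:

```c
#include <stddef.h>
#include <string.h>

/*
 * Like memcpy, but safe to call with n == 0 even when dst or src is
 * a null pointer or the result of malloc(0). memcpy itself requires
 * valid pointers even for a zero-length copy.
 */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n != 0)
        memcpy(dst, src, n);
    return dst;
}
```

Algorithms that naturally produce empty ranges can then call copy_bytes without special-casing them.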
 
Arthur J. O'Dwyer

Any standard library function with a char* parameter that points to a
string invokes undefined behavior if the argument is a null pointer.

char *s = malloc(0);
size_t len = strlen(s);
/*
* Returns 0 if s != NULL
* Invokes UB if s == NULL
*/

ITYM "Invokes UB if s == malloc(0)". Whether s is a null pointer, or
simply points to zero bytes of memory, trying to access s[0] will invoke
undefined behavior. So there's no difference between NULL and (unnamed
zero-sized object) /in this case/.
A valid string has to have a size of at least 1, but consider also the
mem*() functions.

Yes. This is the example you should have used. (7.1.4 says that NULL
is not a "valid value" for pointer arguments to functions, and 7.21.1#2
says that string functions require valid values as arguments.)

-Arthur
 
Skarmander

jacob said:
Niklas Matthies wrote:

The straw man argument consists of making others believe that your
adversary has said something ridiculous, and absolutely absurd.
You do not seem to have expended enough effort to interpret what Mr.
Matthies was saying. He was attempting to discredit the logic of your
decision by using an analogous situation which you would obviously not be
tempted to declare erroneous.
You take also a sentence where I say:

"What's the point of reading a line of zero length?"

to mean that I am against additions... How nice Mr Matthies,
you have a very good way of discussing things.
He was trying to point out that it makes sense to allow operations that have
no effect, despite the fact that they have no effect. It should not be
considered an error because it may arise naturally in an algorithm, and
treating it as an error complicates algorithm design by requiring the
programmer to distinguish a special case. In short, that you do not see the
"point" of something is not sufficient reason to declare it an error.

S.
 
Richard Bos

jacob navia said:
ssize_t getdelim(char **restrict lineptr,
size_t *restrict n, int delimiter, FILE *stream);
< quote >

Upon successful completion the getdelim function shall return the number
of characters written into the buffer, including the delimiter character
if one was encountered before EOF. Otherwise it shall return −1.

< end quote >

We come back here to the error analysis problem, a problem that I
have been mentioning for ages and that comes up again and again. Returning
-1 to indicate "some error occurred but I will not tell you which"
is BAD DESIGN!

True. But so is...
Lcc-win32 implements this function and returns:

-5 No memory left for allocating the line
-4 Line pointer points to a buffer but n is <= 0. This is an error in
the incoming arguments.
-3 The n parameter is NULL
-2 The LinePointer parameter is NULL
-1 End of file without any characters read.

....this.

You have been given errno by the Standard. Use it.

Richard
 
