strtok and strtok_r

CBFalconer

Ben said:
Nope. The third and fourth correctly implement the specification
given for the second (which simply re-implements the behaviour of
the first).

Well, I was only considering the standard strlen specification, and
returning zero for a NULL input (i.e. treating NULL as a zero
length string). I think. I totally missed the urge to return an
error indication for NULL input (which I don't consider a good
idea, since using it requires non-std accepting code).
 
Charlie Gordon

CBFalconer said:
Mr Gordon seems to be unaware of the fact that this newsgroup deals
_strictly_ with standard C, as defined in the various ISO C
standards, and K&R for pre-standardization versions. Material not
included in those standards is off-topic, unless full std C code to
implement them is included. For system dependent material, go to a
newsgroup that deals with that system.

Dear Mr Falconer,

Let me sum up the sequence of events leading to my losing my temper.

The OP asked about the inner workings of strtok, starting his post with: "As
I know strtok_r is re-entrant version of strtok." and further enquiring
about the difference between the two.

Jacob Navia immediately posted a GNU implementation of strtok_r with
attribution of source, but his explanation of the inner workings was quite
terse.

Ben Pfaff hinted that strtok_r is POSIX specific, along with a slightly more
informative answer.

I, Charlie Gordon, posted a very short explanation, along with my own
implementation of the POSIX function strtok_r, implemented concisely in
terms of Standard library functions.

You, Charles B Falconer, posted a response dismissing strtok_r as
non-standard, along with the source of your own function tknsplit, presented
as a "suitable replacement function". You did not address the OP's question
at all. Your post was largely irrelevant. The OP's question is at least
partly on topic since it discusses an undocumented shortcoming of a
Standard library function.

I then contended that better answers had already been posted, that strtok_r
implementations had been posted, and that strtok_r was indeed standardized by a
different body, POSIX. The OP's question was unambiguous and your answer was
not helpful... I also posted comments on your code, as I always do,
pointing at some inconsistencies, semantic differences, a potentially
misleading API, and last but not least, a semantic advantage over strtok.

Martien Verbruggen insisted that POSIX functions be only discussed on
comp.unix.programmer, but did not set a followup.

I then insisted that discussing strtok's shortcomings and a widely
available solution available in POSIX, with source code already posted
upstream was definitely on topic. I started to lose it, and expressed my
opinion of the Standards Committee for not fixing known issues in the
library, and forum regulars for not answering simple OP questions.

Then, there was an amazing digression about strdup, thanks to a proud
Fortune 500 coder.

Jacob Navia sided with me on the issue, opening other digressions and the
usual display of collective grief.

Then you responded to my code review, insisted strtok_r could not be
discussed here without source (which had been posted at least twice
already), and finally concluded with a definitive "I have no idea what
strtok_r is, except that it invades user namespace".

I responded to this bold statement in an ironic tone, given that nobody can
believe you never came across Unix in your 50+ years of programming
experience.

Richard Heathfield added to this, confirming the comp.lang.c hard line. Yet
there is a difference between his "there is *no such function* as strtok_r
in Standard C" and your "I have no idea what strtok_r is".

I then lost my temper and called you names. I regret that.

I at least tried to answer the OP who was inquiring about strtok and a
decent alternative. I am quite disappointed that strtok_r was not made part
of C99. I know the proper place for discussing inclusion in the Standard is
comp.std.c, but the proper place to help C programmers is here.

The strtok function specified in the Standard is error prone because it is not
reentrant, causing surprising if not undefined behaviour when used
improperly, even in single-threaded programs. I was referring to nested
use, which is incorrect, but not even mentioned in the non-normative
examples or footnotes of the Standard.
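
To make the nested-use problem concrete: strtok keeps its position in hidden static state, so starting an inner tokenization inside an outer loop overwrites the outer position. The following is only a minimal sketch, written with standard library calls, of a tokenizer that takes the saved position as an explicit argument. The names tok_r and count_fields are hypothetical, chosen to stay out of the implementation's namespace, and this is not the code posted upthread.

```c
#include <stdio.h>
#include <string.h>

/* tok_r: hypothetical name for a strtok_r-like function built only on
   standard library calls. The caller owns the saved position (*save). */
char *tok_r(char *str, const char *delim, char **save)
{
    char *end;

    if (str == NULL)
        str = *save;                  /* continue from the saved position */
    if (str == NULL)
        return NULL;                  /* previous call consumed everything */

    str += strspn(str, delim);        /* skip leading delimiters */
    if (*str == '\0') {
        *save = NULL;
        return NULL;                  /* only delimiters remained */
    }

    end = str + strcspn(str, delim);  /* find the end of the token */
    if (*end != '\0') {
        *end = '\0';                  /* terminate the token in place */
        *save = end + 1;
    } else {
        *save = NULL;                 /* this was the last token */
    }
    return str;
}

/* Nested use: split on ';', then split each piece on ','.
   Each loop has its own state variable, so they do not interfere.
   Returns the total number of inner fields found. */
int count_fields(char *text)
{
    char *outer_state, *inner_state;
    char *record, *field;
    int n = 0;

    for (record = tok_r(text, ";", &outer_state); record != NULL;
         record = tok_r(NULL, ";", &outer_state)) {
        for (field = tok_r(record, ",", &inner_state); field != NULL;
             field = tok_r(NULL, ",", &inner_state)) {
            printf("field: %s\n", field);
            n++;
        }
    }
    return n;
}
```

Like strtok, this modifies the buffer it is given, so it must be called on a writable array and not on a string literal.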

If advocating people to use a simple documented alternative, with source
code, is not on topic here, where is it ? comp.lang.real.life.c.programming
?

Thanks for your time, and don't use strtok anymore ;-)
 
Charlie Gordon

CBFalconer said:
... snip about tknsplit and strtok ...

This newsgroup is comp.lang.c. C is defined by the various C
standards, present or past, and includes K&R for times previous to
1989. None of these define, or even mention, strtok_r. Thus,
without standard C code, published in the same message, discussion
of it is off-topic here. The name is still reserved for the
implementor. As I said, it doesn't exist. Unix, Linux, Microsoft
have no influence whatsoever.

Yes, we know the hard line you are trying to enforce.
strtok_r is not part of Standard C, but inclusion in it was and is still
discussed:
http://www.google.com/search?source...en-std.org/jtc1/sc22/wg14/&btnG=Google+Search

Source code was published very early on this thread, there was no ambiguity
on what the OP was asking about, his question was answered. The rest of
this discussion, while typical of this forum, is also quite vain.
 
Joachim Schmitz

Yes, that was my intention
Well, I was only considering the standard strlen specification, and
returning zero for a NULL input (i.e. treating NULL as a zero
length string). I think. I totally missed the urge to return an
error indication for NULL input (which I don't consider a good
idea, since using it requires non-std accepting code).
Hmm, but how is returning 0 for input NULL any better than returning
(size_t)-1? What else should be returned on a NULL input? What else should
happen instead?

assert(s != NULL);
is the only other sensible option I see, as it ensures that the program
aborts right there with a somewhat useful error message.

Bye, Jojo
 
Joachim Schmitz

Ben Bacarisse said:
Nope. The third and fourth correctly implement the specification
given for the second (which simply re-implements the behaviour of
the first).
I was quite surprised that my mistake went undetected for a day...
 
Ben Bacarisse

CBFalconer said:
Just for fun, since you have the test code available, try comparing
it against the following small piece. Use the same optimization
levels etc.

char *dupstr(const char *s) {
    char *p, *q;

    if (p = q = malloc(1 + strlen(s)))
        while (*p++ = *s++) continue;
    return q;
} /* dupstr, untested */

which needs "#include <string.h>" and "#include <stdlib.h>".

Without optimisation, worse than strcpy (as one would imagine). With
-O2 almost exactly the same as strcpy.
 
CBFalconer

Joachim said:
.... snip ...


Hmm, but how is returning 0 for input NULL any better than
returning (size_t)-1? What else should be returned on a NULL
input? What else should happen instead?

assert(s != NULL);
is the only other sensible option I see, as it ensures that the
program aborts right there with a somewhat useful error
message.

Remember, we are discussing an implementation of strlen. The
point, to me, is that an empty string (normally represented by a
pointer to '\0' char) can be equally well represented by NULL,
saving some memory. This is usable as long as the purpose of the
accessing code is to make or use a copy of the string. It fails
miserably if the idea is to modify the string, but in that case the
modifying code will be what blows up.

However, returning a value of "(size_t)-1" is hopeless. This
requires that every call to strlen be immediately followed by a
test of the result, and doesn't allow the (possible) minor
advantage of representing an empty string by NULL.
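
As a rough illustration of the behaviour being argued for here, a length function that treats a null pointer as an empty string might look like the sketch below. The name len_or_zero is hypothetical; the standard strlen makes no such promise, and passing it a null pointer is undefined behaviour.

```c
#include <stddef.h>

/* len_or_zero: hypothetical name. Returns the length of s, treating a
   null pointer as if it were an empty string. */
size_t len_or_zero(const char *s)
{
    const char *p = s;

    if (s == NULL)
        return 0;              /* NULL counts as zero length */
    while (*p != '\0')
        p++;                   /* walk to the terminating null character */
    return (size_t)(p - s);
}
```

With this convention a caller cannot tell an empty string from a missing one by looking at the result alone.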
 
CBFalconer

Charlie said:
"CBFalconer" wrote:
.... snip ...
.... snip thorough recapitulation ...
If advocating people to use a simple documented alternative, with
source code, is not on topic here, where is it ?
comp.lang.real.life.c.programming ?

Thanks for your time, and don't use strtok anymore ;-)

No war intended, and you have spent much time on the recap. I
haven't. Please note that I qualified everything with 'seems'.

I don't think I have ever used strtok. :)
 
Keith Thompson

CBFalconer said:
Remember, we are discussing an implementation of strlen. The
point, to me, is that an empty string (normally represented by a
pointer to '\0' char) can be equally well represented by NULL,
saving some memory. This is usable as long as the purpose of the
accessing code is to make or use a copy of the string. It fails
miserably if the idea is to modify the string, but in that case the
modifying code will be what blows up.
[...]

No, an empty string cannot *equally* well be represented by a null
pointer, at least not in standard C. You can do that if you like, as
long as all the code that uses these strings specifically handles that
special case. But you can't use any of the standard functions that
take string arguments unless you wrap each call in a check for a null
pointer.

#include <stdio.h>
int main(void)
{
    char *s = NULL;
    printf("s = \"%s\"\n", s);
    puts(s);
    return 0;
}

On one implementation this prints:

s = "(null)"
Segmentation fault (core dumped)

Such an approach also makes it difficult to distinguish between an
empty string and a nonexistent string.
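
The per-call wrapping described above could be sketched roughly as follows. The helper names str_or_empty and put_checked are hypothetical, and every standard function that takes a string would need the same treatment.

```c
#include <stdio.h>
#include <string.h>

/* str_or_empty: hypothetical helper that maps a null pointer to "" so the
   result is always safe to hand to a standard string function. */
const char *str_or_empty(const char *s)
{
    return s != NULL ? s : "";
}

/* put_checked: hypothetical wrapper around puts. Prints the string (or an
   empty line for NULL) and returns the number of characters in it. */
size_t put_checked(const char *s)
{
    s = str_or_empty(s);
    puts(s);
    return strlen(s);
}
```

After the mapping, a null pointer and a genuinely empty string are indistinguishable to the callee.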
 
CBFalconer

Keith said:
CBFalconer said:
Remember, we are discussing an implementation of strlen. The
point, to me, is that an empty string (normally represented by a
pointer to '\0' char) can be equally well represented by NULL,
saving some memory. This is usable as long as the purpose of the
accessing code is to make or use a copy of the string. It fails
miserably if the idea is to modify the string, but in that case the
modifying code will be what blows up.
[...]

No, an empty string cannot *equally* well be represented by a null
pointer, at least not in standard C. You can do that if you like,
as long as all the code that uses these strings specifically
handles that special case. But you can't use any of the standard
functions that take string arguments unless you wrap each call in
a check for a null pointer.

Fair enough. I am guilty of unrestrained use of 'equally well', in
that I was only considering the program's actual use of such, and
ignoring the effect on other std procs.
 
Tor Rustad

[...]
Kind of you, James, and I must admit I find it hard to imagine arguing that
way as well. Unfortunately, I remember that I had some pretty strange
misconceptions about C a decade or so ago, so it's not utterly impossible.
But no, I don't remember this particular debate. Your mod seems plausible,
however. (I can't think of any reason why I'd want to call memcpy in that
way, however.)

I found the thread; it was more than 7 years ago, and you gave me that
lesson. Posting the link to the thread would be rather humiliating
for me, so I prefer not to! :)

Here is the function R.H. wrote back then:

#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strlen, strcpy */
#include <assert.h>


char *dupstr(char *s)
{
    char *t = NULL;
    assert(s != NULL);
    if(s != NULL) /* assert won't fire if NDEBUG is defined, so check anyway */
    {
        /* allocate enough bytes to make a copy of the string pointed to by s */
        t = malloc(strlen(s) + 1);
        if(t != NULL) /* did it work? */
        {
            strcpy(t, s); /* perform the copy */
        }
    }
    return t; /* returns NULL if the string could not be created */
}

and Ben Pfaff instantly suggested the "char *dupstr(const char *s)"
improvement.
 
