Keith said:
(e-mail address removed) writes:
Jonathan Leffler wrote:
(e-mail address removed) wrote:
[...] (so strcat(p,p) leads
to UB even though it has a compelling intuitive meaning).
What's the compelling intuitive meaning? To me, it means copy
characters from the start of p over the null that used to mark the end
of p and keep going until you crash.
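The behaviour described above can be sketched with a hand-written copy loop. This is only an illustration of one plausible naive implementation, not what any particular libc does; the name naive_cat_bounded and the cap parameter are hypothetical, and the bound exists purely so the demonstration stays inside the buffer instead of crashing.

```c
#include <stddef.h>

/* Hypothetical illustration: a naive strcat-style loop, limited to the
 * first `cap` bytes of dst so that it cannot run off the buffer.
 * Returns 1 if a terminating '\0' was written, 0 if the loop ran out
 * of room first. */
static int naive_cat_bounded(char *dst, const char *src, size_t cap)
{
    size_t i = 0;

    /* Find the current end of dst. */
    while (i < cap && dst[i] != '\0')
        i++;
    if (i == cap)
        return 0;

    /* Copy src byte by byte, terminator included. */
    for (;;) {
        char c;
        if (i >= cap)
            return 0;           /* an unbounded loop would keep going */
        c = *src++;
        dst[i++] = c;
        if (c == '\0')
            return 1;
    }
}
```

When dst and src are the same pointer, the first byte copied overwrites the only terminator, so the loop never sees a '\0' and stops only because of the bound. With two distinct strings it terminates normally.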
That's not an intuitive meaning. It's just an understanding of an
implementation anomaly. Perhaps for you, implementation details
change your intuition.
[...]
In this case, intuition is not necessary.
Just *SAYING* this is the ultimate indictment of the C language. If
the language doesn't match your intuition, then it just takes that
much more effort to program in it.
News flash: C is not the most intuitive and beginner-friendly language
ever invented. Does this come as a surprise to you?
Huh? No, but apparently it's a surprise to Jonathan Leffler and Andrew
Poelstra. They seem to be arguing the case that it *does* match their
intuition.
My *original* contention is that any proposal such as Richard
Seacord's managed string library should go ahead and pay the basically
0 penalty of actually *matching* intuition (as compared to the enormous
penalty he seems willing to pay for automatic character set filtering).
Anything else so far is just people's false projections about what I
said *beyond* this, and my responses to them.
In a language where strings are first-class objects, and you can pass
them around as values, use them as operands in expressions, and so
forth, I'd expect something called "strcat" to behave in some
reasonable intuitive manner.
Yeah, that's nice. C is basically the only example of a language of
its kind, yet you feel not the slightest problem with making sweeping
generalizations about it based on its properties. That's kind of like
saying that Joseph Lieberman can't be elected president of the US
because he's Jewish. I mean, that's nonsense (he's a right-wing
Democrat, which means he has no serious base of support outside his
state) in the same way your idea here is nonsense.
First-class or not, strcat *CAN* be implemented as aliasing safe, at
very little cost. The fact that it isn't is a choice that was made;
it's nothing more than that. It's certainly not a *property* of
low-level languages (in assembly language, for example, there is no
assertion or expectation of being unable to deal with aliased "objects"),
or a property of the fact that it's not a first-class value (bstrlib is
the obvious counter-example of this.)
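To make the "very little cost" claim concrete, here is a minimal sketch of a concatenation routine that tolerates overlapping arguments. The name strcat_alias_safe is hypothetical; it is not a standard function and is not how bstrlib does it. It assumes the caller has made sure the destination buffer is large enough for the result, exactly as with strcat.

```c
#include <string.h>

/* Hypothetical alias-safe concatenation. Both lengths are measured
 * before anything is written, and memmove (unlike strcpy or memcpy)
 * is defined for overlapping regions. The caller must guarantee that
 * dst has room for strlen(dst) + strlen(src) + 1 bytes. */
static char *strcat_alias_safe(char *dst, const char *src)
{
    size_t dlen = strlen(dst);
    size_t slen = strlen(src);      /* taken before the terminator moves */

    memmove(dst + dlen, src, slen); /* overlap is permitted here */
    dst[dlen + slen] = '\0';
    return dst;
}
```

With this version, strcat_alias_safe(p, p) doubles the string, which is arguably the intuitive reading of strcat(p,p). The extra work over a typical strcat is one additional strlen pass over src.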
[...] I'd still need to see the declaration to
know how to use it, but it would probably be safe to assume that
something like
s1 = strcat(s2, s3)
would do the obvious thing.
You mean if it were a first class value? But you are making a false
association here. There is no reason you cannot perform in-place
mutation of first class values. So the API could still have the same
basic functionality as the current strcat (i.e., two operands, and
modifying the destination.)
C is not like that. Strings are not a data type, they're a data
format, "a contiguous sequence of characters terminated by and
including the first null character", subject to all of C's
complications regarding arrays and pointers. If you think you can
guess, with 99% certainty, how strcat() is going to behave based on
that, you're likely to be disappointed.
These *complications*, as you suggest, have nothing to do with it. It's
all down to pure choice at the specification level. The guesses are
only wrong because the standard chooses that they should be wrong.
[...] If you read the standard's
description of strcat(), you'll see:
... If copying takes place between objects that overlap, the
behavior is undefined.
Any decent description of strcat() (in a man page
The latest cygwin man page makes no mention of this and WATCOM C/C++'s
documentation omits this.
The Cygwin man page doesn't mention this, but it's not intended to be
complete:
strcat is part of the libc library. The full documentation for
libc is maintained as a Texinfo manual. If info and libc are
properly installed at your site, the command
info libc
will give you access to the complete manual.
I'm not convinced that's a good idea, but it's explicitly acknowledged
with a reference to the complete documentation.
"info libc" doesn't work for me under Cygwin (I don't know why, but
the reason is clearly irrelevant),
It works on my system. info libc does nothing more than document the
standard include contents. info strcat just re-echoes the man page.
[...] but on another system the section
on strcat clearly says:
This function has undefined results if the strings overlap.
I don't know about Watcom.
Well, I just told you about Watcom, so now you do (it reads
substantially similar to the man pages). It's all downloadable from the
Open Watcom site if you care.
[...] or text book, for
example) should have similar wording; if it doesn't, that's the fault
of the author of the documentation.
Here's the first hit on Google:
http://www.cplusplus.com/ref/cstring/strcat.html
and the second:
http://www.mkssoftware.com/docs/man3/strcat.3.asp
Here's the Wikipedia entry as of 07/28/2006:
http://en.wikipedia.org/wiki/Strcat
and here's the OpenBSD documentation that it links to:
http://www.openbsd.org/cgi-bin/man.cgi?query=strcat
So I guess none of that counts as "decent documentation".
I agree. I don't know what cplusplus.com is, and I'm not too
surprised by an error like this in Wikipedia (possibly someone here
will correct it soon). I am surprised that the OpenBSD documentation
doesn't mention this.
So there we have it. The standard for "decent" documentation as you
suggest appears to be quite high, and is certainly different from what
is commonly available.
[...] That's a problem -- but not a problem with C itself.
If you could just *stop* with the false projection for one second. You
know there is a reason why I quote other text when I post responses.
Then by all means feel free to go and use those languages. Nobody
here will stop you.
It's always this false choice with you. I have to completely throw out
my investment in learning this language because it makes a number of
idiotic decisions through nothing other than poor choices.
We're not even talking about what language I *USE* for whatever I am
doing. Remember, this thread started as a discussion about improvements
to the standard, and as I understand it, has reached the level of a
serious official proposal. Citing other languages ought to be a
standard part of such a discussion without you pulling out this tired
old canard all the time.