William said:
But how many of those are for MIX and other errors in his books? I meant to
refer to things like TeX, parts of which are written in C.
Are they? I don't think he wrote any of TeX at all in C. He wrote it all
in Pascal (or, to be more accurate, he wrote it in WEB, which compiles
to Pascal).
The resulting Pascal, these days, is generally fed to a program that
compiles _that_ to C. Parts of teTeX may have been written in CWEB,
but not by him.
(I didn't mean any of that to imply that it's safer _because_ he wrote
it in Pascal rather than C; I was just disputing whether it was the case.)
Writing a "hello world" program is harder in C than in Bourne shell, and
harder still in an assembly language.
On the flip side, in a simple "hello world" program string handling doesn't
predominate, and you may very well have persuasive reasons for writing it in
C rather than in Bourne shell.
I don't think anyone was talking about how much work it might be to code
in C. What we were talking about was how hard it is to code _safely_ in
C. It's an entirely different question. I don't think a "Hello world"
program's safety is appreciably harder to achieve in C than it is in sh.
More complex programs are a different issue.
And of course there are reasons for choosing C over other implementation
platforms (if that weren't the case, would I be a C programmer?).
I agree. strlcpy(), though, fills in inevitable gaps between the standard
interfaces, traditional string handling, and whatever design or manner of
approaching the issue one takes. Seems to me that's as good a reason as any
to include strlcpy(). On top of the fact, and more to the point, that it
encapsulates the _minimal_ exact code one would normally and rightly employ
in these situations.
Well, no, it doesn't. strcpy() plus a buffer check does. strlcpy() adds
one more thing: copying what it can of src to dst, regardless of whether
there was enough space for all of it, or whether that's what was wanted.
This has _never_ been what I want (usually, like Yevgen, I want to
allocate more space). I can't say it will never _be_ what I want, and I
know it's sometimes what others (apparently, including yourself) have
needed. Being constrained by output limits is a legitimate case. Being
constrained by input limits, IMO, isn't a good one ("be liberal in what
you accept").
Even with your example of RFC limits, most such limits are within the
context of mechanisms that provide ways to represent entities that do
not match those constraints. For instance, if I need to force arbitrary
text files to meet the constraints of RFC 2822, I may be using a fixed
line-buffer size, but I'm sure as hell not using strlcpy() to meet that
constraint. I'd be using quoted-printable or somesuch, instead.
And even if I'm writing an old-style tarfile with fixed block sizes and
a maximum filename length, I'd _still_ probably want to ensure I
generate a unique filename, rather than blindly truncating it.
In short, I rarely want to truncate, and when I _do_, I rarely want to
do it naively (as strlcat() will do).
I'm not against its inclusion, I just think its utility has been _way_
overblown.
And none of this has anything to do with the OP's actual question, which
was whether he'd been misled when people told him to always use
strlcpy() in preference to strcpy(). To which the answer, hopefully
obvious by now, is _yes_, he was misled. The utility of strcpy() is
_far_ more general than that of strlcpy().
And, while strlcpy() may be better than strcpy() for those limited
situations where you want a naive truncation (and don't mind its limited
portability), I don't see any basis for the claim that strlcpy() is
_safer_ than strcpy() (which, after all, is the basis for the claim that
you should always use it in preference to strcpy()). It is precisely as
easy to remember to use strlcpy() instead of strcpy(), as it is to
remember to check the buffer size before you strcpy() (the latter,
though, still gives you more options about what to do after the check
fails).
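To make "check the buffer size before you strcpy()" concrete, here is a minimal sketch of one thing the check lets you do that a truncating copy does not: allocate more space. The helper name copy_or_grow() and its contract are assumptions made for illustration, not anything from this thread or from a standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy src into *dst, growing *dst when it is too
 * small. *dst must be NULL or a malloc'd buffer of *cap bytes.
 * Returns 0 on success, -1 if more memory could not be obtained (in
 * which case *dst and *cap are left untouched). */
static int copy_or_grow(char **dst, size_t *cap, const char *src)
{
    size_t need = strlen(src) + 1;      /* bytes required, including '\0' */

    if (need > *cap) {                  /* the buffer check */
        char *p = realloc(*dst, need);  /* this caller's choice: grow */
        if (p == NULL)
            return -1;
        *dst = p;
        *cap = need;
    }
    strcpy(*dst, src);                  /* safe: need <= *cap holds here */
    return 0;
}
```

The branch taken when the check fails is the caller's to choose: grow, reject the input, or truncate deliberately. A truncating copy makes that choice for you.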
That's an impossible criterion. No C library, IMO, can hide the details of
buffer (aka memory, aka resource) management in C
    struct allocator {
        void *(*a_malloc)(void *, size_t);
        void *(*a_realloc)(void *, void *, size_t);
        void  (*a_free)(void *, void *);
        void  *data;    /* passed as the first argument to each hook */
    };

    struct str *str_new(struct allocator *);
    struct str *str_cat(struct allocator *, struct str *, struct str *);
    void str_del(struct allocator *, struct str **);
... etc., etc. I imagine there'd also be versions of these same
functions that don't take the initial allocator, and just use a default one.
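For what it's worth, here is a rough sketch of how such an interface might be filled in. Everything beyond the declarations themselves is an assumption made for illustration: the layout of struct str, the str_from() helper, the default_allocator object, and the choice that str_cat() appends its third argument to its second in place and returns it.

```c
#include <stdlib.h>
#include <string.h>

struct allocator {
    void *(*a_malloc)(void *, size_t);
    void *(*a_realloc)(void *, void *, size_t);
    void  (*a_free)(void *, void *);
    void  *data;
};

/* Assumed layout; the interface itself leaves struct str opaque. */
struct str {
    size_t len;   /* bytes in use, not counting the terminator */
    char  *buf;   /* always len + 1 bytes, NUL-terminated */
};

/* Default allocator: ignores its context pointer and uses the C heap. */
static void *def_malloc(void *ctx, size_t n) { (void)ctx; return malloc(n); }
static void *def_realloc(void *ctx, void *p, size_t n) { (void)ctx; return realloc(p, n); }
static void  def_free(void *ctx, void *p) { (void)ctx; free(p); }

static struct allocator default_allocator = {
    def_malloc, def_realloc, def_free, NULL
};

/* Hypothetical helper: build a str from a C string. */
struct str *str_from(struct allocator *a, const char *text)
{
    size_t n = strlen(text);
    struct str *s = a->a_malloc(a->data, sizeof *s);
    if (s == NULL)
        return NULL;
    s->buf = a->a_malloc(a->data, n + 1);
    if (s->buf == NULL) {
        a->a_free(a->data, s);
        return NULL;
    }
    memcpy(s->buf, text, n + 1);   /* copies the terminator too */
    s->len = n;
    return s;
}

struct str *str_new(struct allocator *a)
{
    return str_from(a, "");
}

/* Append src to dst, growing dst as needed. Returns dst, or NULL on
 * allocation failure (dst is then unchanged). The caller never passes
 * or computes a buffer size. */
struct str *str_cat(struct allocator *a, struct str *dst, struct str *src)
{
    size_t add = src->len;         /* read before realloc: src may be dst */
    size_t total = dst->len + add;
    char *p = a->a_realloc(a->data, dst->buf, total + 1);
    if (p == NULL)
        return NULL;
    /* If src is dst, its old buffer may have moved; read from p instead. */
    const char *from = (src == dst) ? p : src->buf;
    dst->buf = p;
    memcpy(p + dst->len, from, add);
    p[total] = '\0';
    dst->len = total;
    return dst;
}

/* Free the string and clear the caller's pointer so it cannot dangle. */
void str_del(struct allocator *a, struct str **sp)
{
    if (sp == NULL || *sp == NULL)
        return;
    a->a_free(a->data, (*sp)->buf);
    a->a_free(a->data, *sp);
    *sp = NULL;
}
```

With something along these lines, str_cat(&default_allocator, x, y) cannot be handed the wrong size variable, because there is no size variable to hand it, and str_del() leaves the caller holding NULL rather than a stale pointer.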
IMO, C++'s string classes (and many others in the standard C++ library)
handle the allocation problem in a quite general and elegant manner.
Surely a C library could emulate something similar, even if the syntax
were somewhat clunkier?
and it's not clear to me
that off-by-ones are substantially more of an issue than NULL or dangling
pointers.
Both of which can be solved fairly gracefully (to the degree they can be
solved in C) by a library with an interface such as the one I've
outlined. And off-by-ones are a pretty small subset of buffer-size
violations. Forgetting to check, using the size variable for the wrong
buffer, forgetting to initialize the size variable, are all common
mistakes. Most of these can also be solved by a general library; none of
them are solved by using strlcpy() (except "forgetting to check", but as
already mentioned, this isn't a solution, it's an indirection: instead
of forgetting to check buffer size, it becomes forgetting to use strlcpy()).
They can only grease the wheels, so to speak. That is, better
weave the patterns into your code, encapsulation being one important way to
accomplish that. But there are many levels of encapsulation, and many/most
string libraries force you to too high a level of encapsulation for what it's
worth in many instances; rather than encapsulate, they obfuscate.
No argument there.
And I'm not saying that such a library should ever be part of the C
standard (though it might not be terrible, if done as carefully as C++
has done); what I _am_ saying, is that it would go a long way towards
solving the general issue with bounds checking, whereas strlcpy() is
only claimed to do so.