Bug/Gross InEfficiency in HeathField's fgetline program

Richard · Nov 6, 2007

Except that he got it backwards. The hyperventilating style assumed by
Dickie H is to refer to the people you don't like as "Mr." and the ones
you do by their full names (or, occasionally, by their first names only).
This is a fairly common hack (inter alia, in the biz world); it was by
no means invented by our own Dickie H.

The punter to which rgrdev responded couldn't even get that right, and
dropped down into schoolboy/playground-speak. On the playground, you
refer to the people in power as "Mr.", and, of course, refer to the
people you're beating up on by their first names.

Slight correction: their surnames.

As in "You 'ad enuff yet McCormack?!?!" (sound of boot going in).

Charlie Gordon · Nov 6, 2007

James Kuyper said:
Well, I'm one of the many people who was unwilling to contradict Mr.
Heathfield, and there's a very simple reason for that: I didn't think he
was wrong. Jacob might want to consider the possibility that many of the
other silent ones also agreed with Mr. Heathfield, even if Jacob finds
that very difficult to imagine.

There's also the fact that it usually either does not copy the entire
string, not even bothering to properly null-terminate it, or writes a lot
of stuff in addition. Only rarely, in normal usage, does it write an exact
copy of the string, and nothing else.

The fact that it does not produce a 'string' is not a necessary condition to
qualify as a 'string function', take strlen for instance.

There are other plausible ways of defining what is meant by a "string copy
function", but requiring that it always copy the entire string, including
the terminating null character, and write nothing more than the string,
seems like a reasonable requirement to me.

Nobody questions that. The reasons strncpy is labelled as a string function
are simple:
- it is defined in <string.h>
- it is documented in the standard in chapter 7.21 String handling
- its behaviour is somewhat specific when it is passed a string shorter
than its numeric argument.

The other issue debated about strncpy() is whether it was actually any
use. Well, in my programs I routinely face the following situation:

The output file spec allows a fixed amount of space for an array of
characters. The input data that is to be written into that space will
usually be small enough to fit, but cannot be guaranteed to fit. When it
does not fit, it's not an error condition - I'm supposed to put as many
characters from the beginning of the input source into the output array as
possible. Losing the terminating characters is regrettable but
permissible. The character array can be null-terminated, but is not
required to be. However, to make binary comparison of output files easier,
every character after the terminating null should also be null.

The above description applies to many of the "strings" that my programs
have to write, and also to many of the strings that they must read. It
seems to me that strncpy() is tailor-made for such use.
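
A minimal sketch of the fixed-width field case described above. The 8-byte width and the name fill_field are assumptions made for illustration; neither comes from the original post. A short input is padded with null characters, and a long one is truncated with no terminator:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical field width, chosen only for illustration. */
#define FIELD_WIDTH 8

/* Fill a fixed-width field from src. Short inputs are padded with
 * null characters up to FIELD_WIDTH; long inputs are truncated and
 * the field is then NOT null-terminated. */
void fill_field(char field[FIELD_WIDTH], const char *src)
{
    assert(field != NULL && src != NULL);
    strncpy(field, src, FIELD_WIDTH);
}
```

Because the padding is deterministic, two fields filled from the same input compare equal byte for byte, which is the property wanted for binary comparison of output files.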

You are making this up ;-)
If you are dealing with binary files, strncpy is not the final answer to
your problem, you have to perform I/O as well and the intermediary buffer is
wasteful. A 'tailor-made' solution to your problem is this:

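#include <stdio.h>

/* Write exactly count bytes to fp: the leading characters of str,
   followed by null padding if str is shorter than count. */
int write_string_field(FILE *fp, size_t count, const char *str) {
    while (count > 0 && *str) {
        putc(*str++, fp);
        count--;
    }
    while (count > 0) {
        putc(0, fp);
        count--;
    }
    return ferror(fp);  /* nonzero if the stream's error indicator is set */
}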

Charlie Gordon · Nov 6, 2007

Richard Heathfield said:
Richard Bos said:

Oh my dear chap, you're beginning to make a bit of a habit of being wrong,
aren't you? If it's a theorem, it *has* been proved. If it has not been
proved, it is not a theorem. (Fermat's Last Theorem was for centuries a
misnomer, because it hadn't been proved. Now it's a misnomer because the
proof was supplied by Wiles, not Fermat.)

Fermat claimed to have proved his conjecture, and wrote his friends about
his 'marvelous' discovery. His proof was never found, and thus the
'theorem' remained unproved for more than three centuries. The greatest
mathematicians have spent immense efforts trying to prove it, and eventually
conjectured that Fermat himself did not have a proof for it, or that his proof
was incorrect. But they did not reformulate the famous problem as 'the
Fermat conjecture'.

Finally came Andrew Wiles, who discovered a very indirect and complex way to
prove it and bridge different fields of mathematics. That is definitely
'Wiles's proof', but it does not make Fermat's 'marvelous' discovery less his.

Charlie Gordon · Nov 6, 2007

Richard said:
It is nonsense to decry strncpy because some programmers can't be arsed
to read the manual properly IMO.

Just reflect on this bold statement next time you are seated in an aircraft.
Wouldn't you want software to be written with library functions that are
less easy to misuse, *even* if 'some programmers can't be arsed to read the
manual properly'?

Richard Heathfield · Nov 6, 2007

Charlie Gordon said:

The fact that it does not produce a 'string' is not a necessary condition
to qualify as a 'string function', take strlen for instance.

The fact that it neither requires a string as input nor yields a
string as output might, however, give one pause for thought.

The reasons strncpy is labelled as a string
function

....by some people...

are simple:
- it is defined in <string.h>

....like mem*, and unlike strtol, strtoul, strtod...

- it is documented in the standard in chapter 7.21 String handling

....like mem*, and unlike strtol, strtoul, strtod...

- its behaviour is somewhat specific when it is passed a string shorter
than its numeric argument.

....like printf("%*s\n", 20, "Hello");

None of these reasons seems to me to be particularly compelling.

You are making this up ;-)

Despite the smiley, that's a rather offensive suggestion, isn't it?

If you are dealing with binary files, strncpy is not the final answer to
your problem, you have to perform I/O as well and the intermediary buffer
is wasteful.

Has it occurred to you that he might have other processing to do on that
buffer before writing it to file?

jacob navia · Nov 6, 2007

Richard Heathfield wrote:
[snip]

1) The standard says:
7.21 String Handling
Later:
7.21.2: Copying functions
Later:
7.21.2.3: The strncpy function.

But this is no proof for Mr Heathfield. He will
insist forever his nonsense. strncpy is not
for string copying.

There is no blinder man as the one that doesn't want to see.

Richard Heathfield · Nov 6, 2007

Charlie Gordon said:

Fermat claimed to have proved his conjecture, and wrote his friends about
his 'marvelous' discovery. His proof was never found,

....if it ever existed...

and thus the
'theorem' remained unproved for more than three centuries.

Right. An unpublished proof, as far as the mathematics community is
concerned, might as well not exist. I have an interesting demonstration of
this fact, but this Usenet article is too large to contain it.

Flash Gordon · Nov 6, 2007

Richard Heathfield wrote, On 06/11/07 13:34:

Flash Gordon said:

Interesting philosophical point there, Flash!

As a matter of fact, I *do* think I'm correct, for the obvious reason that,
if I thought I were /in/correct, naturally I would modify my position.
(Doesn't everyone think that way?)

<snip>

You have misinterpreted my statement. I didn't say that you don't always
believe you are correct, I said that you don't believe you are always
correct. I.e. I stated that I believe you know that you are not perfect
and are sometimes incorrect, not that you state things that you believe
are incorrect at the time you state them.

Richard · Nov 6, 2007

jacob navia said:
Richard Heathfield wrote:
[snip]

1) The standard says:
7.21 String Handling
Later:
7.21.2: Copying functions
Later:
7.21.2.3: The strncpy function.

But this is no proof for Mr Heathfield. He will
insist forever his nonsense. strncpy is not
for string copying.

There is no blinder man as the one that doesn't want to see.

In the valley of the blind, the one eyed man is king.

user923005 · Nov 6, 2007

user923005 said:

[...] the introduction of a discussion of
multiple currencies blurred the issue in my mind for a little while.
Taking a step back and thinking about it, I realised it represented a
completely separate issue.
I still stand by my statement that you can do double-entry book-keeping
*exactly*.


I disagree. If there are any rational or exponential calculations,
then it is not possible.
Examples:
Depreciation calculations
Interest calculations
Investments (Future value, Present value, Annuities...)


Yes, but it's the same blurring. The "how much interest should be added?"
question cannot be answered exactly (except by chance, of course) - it
must be rounded. But double-entry book-keeping is about putting one
*monetary* amount into two ledgers, once on the debit side and once on the
credit side. This can be done exactly.

I agree that the credit and the debit will agree. But the amount
stored is not exact. There have been schemes based on stealing these
fractional pennies that have netted huge sums over time.

The point I was making is that the calculations of the above entries
are definitely not exact. An 'agreed to method' such as banker's
rounding is used to truncate the decimals at some point and it is
stored. But any operation that uses something as simple as a division
is necessarily (by its very nature) not perfectly precise. The only
operations that can be performed with perfect precision in fixed point
arithmetic are addition, subtraction and multiplication. Any
calculations that include division, exponentiation, quadrature, etc.
are by their very nature not exact. All financial systems of
sufficient complexity include these types of calculations. Truncation
of the total does not vaporize the inexactness of the result. It only
masks it.

If you examine QuantLib, you will find that it is full of floating
point operations. You cannot perform this sort of analysis
correctly without them.

user923005 · Nov 6, 2007

pete said:

Yes, but there is a difference between a "theory" (which is a science
concept) and a "theorem" (which is a formal systems concept).

True, but they are closely related:
http://www.etymonline.com/index.php?search=theorem&searchmode=none

theorem
1551, from M.Fr. théorème, from L.L. theorema, from Gk. theorema
"spectacle, speculation," in Euclid "proposition to be proved," from
theorein "to consider" (see theory).

santosh · Nov 6, 2007

Richard Heathfield wrote, On 06/11/07 13:34:

<snip>

You have misinterpreted my statement. I didn't say that you don't
always believe you are correct, I said that you don't believe you are
always correct. I.e. I stated that I believe you know that you are not
perfect and are sometimes incorrect, not that you state things that
you believe are incorrect at the time you state them.

This gets my vote for the most extreme hair-splitting post of the
year.

jameskuyper · Nov 6, 2007

Charlie said:
"James Kuyper" <[email protected]> wrote in message news: ....

The fact that it does not produce a 'string' is not a necessary condition to
qualify as a 'string function', take strlen for instance.

True, but I was specifically responding to a message referring to
"string copy" functions, not just "string" functions. A "string copy"
function that doesn't always copy the entire string, and sometimes
creates an output that is a lot larger than the string that is to be
copied is arguably mislabeled.

You are making this up ;-)
No.

If you are dealing with binary files, strncpy is not the final answer to
your problem, you have to perform I/O as well and the intermediary buffer is
wasteful. A 'tailor-made' solution to your problem is this:

The third-party library function that performs the actual writing
requires a buffer. What actually ends up in the output file is not
just the array itself, but also the name, dimensions, and XDR datatype
of the array.

Keith Thompson · Nov 6, 2007

Richard said:
Yes it is reasonable. As is

,----
| The strncpy() function is similar, except that at most n bytes of src
| are copied. Warning: If there is no null byte among the first n
| bytes of src, the string placed in dest will not be null
| terminated.
`----

The "string placed in dest" is the clue.

It's a clue that the person who wrote that description of strncpy()
got it wrong. A "string" in C is null terminated by definition; if
it's not null terminated, it's not a string. C99 7.1.1p1: "A string
is a contiguous sequence of characters terminated by and including the
first null character."
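
The definition can be turned into a simple check. This is only a sketch, and holds_string is a made-up name: it reports whether the first n bytes of a buffer contain a null character, which is what the buffer needs in order to hold a string in the C99 7.1.1p1 sense.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the first n bytes of buf contain a null character,
 * i.e. buf holds a string that fits within n bytes; otherwise 0. */
int holds_string(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}
```

After strncpy(dest, "abcdef", 4) on a 4-byte dest, holds_string(dest, 4) returns 0, which is the case the quoted warning is trying to describe.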

The portion you quoted also doesn't say that null characters are
appended if the source array is a string shorter than n characters;
perhaps you just didn't include that part.

The existence of documentation that incorrectly describes the
strncpy() function doesn't prove that it's a "string function". Out
of curiosity, where did you get that description? If it's from a
current system, you might consider submitting a bug report.

You see what makes it so childishly simple is the "n" bit in the
name. It doesn't take a genius to figure out that n means
something. Possibly the number of characters to copy? Surely not!

[...]

If I didn't know about the strncpy() function, and didn't have access
to the documentation, I'd probably assume that it's a safer version of
strcpy() that lets you specify the size of the target array. A call
like
strncpy(dest, source, n);
would (hypothetically) behave like strcpy(dest, source), except that
it wouldn't copy more than n characters into the dest array. If
strlen(source)+1 <= n, it would behave just like strcpy(dest, source).
Otherwise, it would copy the first n-1 characters of source into dest,
and set dest[n-1] to '\0'. It would not copy additional '\0'
characters into the dest array. In any case, as long as source points
to a valid string and dest points to an array of at least n
characters, dest would point to a valid string after the call.

In other words, this:
    strncpy(dest, source, n);
would be equivalent (for n > 0) to this:
    dest[0] = '\0';
    strncat(dest, source, n - 1);

There are several other functions with an added 'n' in their names
that behave in this manner: strncat, snprintf, vsnprintf.
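
A sketch of the hypothetical function described above, written out by hand. The name bounded_strcpy is invented for this example and is not a standard function. It copies at most n-1 characters, always terminates the result, and does not pad:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "safer strcpy": dest must have room for n chars and
 * src must point to a string. Not part of the standard library. */
char *bounded_strcpy(char *dest, const char *src, size_t n)
{
    size_t i = 0;

    assert(dest != NULL && src != NULL);
    if (n == 0)
        return dest;            /* no room even for a terminator */
    while (i < n - 1 && src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';             /* result is always a valid string */
    return dest;
}
```

With an 8-byte dest, copying "abcdefghij" yields "abcdefg", and copying "abc" leaves the bytes after the terminator untouched.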

The behavior of strncpy() is *not obvious*. I completely agree that
programmers should not attempt to use it (or any other function)
without understanding how it works (though a beginning programmer
doesn't need to understand the details of format strings to write
``printf("hello, world\n");''). I don't mind having strncpy() in the
standard library, but I wish it had a different name, and I wish that
there were a strncpy() function that behaves as I've described above.
But it's far too late to make such a change.

Some questions:

What *exactly* do you mean by the phrase "string function"?

Does strncpy() meet your definition?

Do you use strncpy() in your own code?

Keith Thompson · Nov 6, 2007

jacob navia said:
Richard Heathfield wrote:
[snip]

1) The standard says:
7.21 String Handling
Later:
7.21.2: Copying functions
Later:
7.21.2.3: The strncpy function.

But this is no proof for Mr Heathfield. He will
insist forever his nonsense. strncpy is not
for string copying.

And he's right; strncpy is not for string copying. In text that you
snipped, he presented multiple examples of string functions that are
not described in the "String handling" section of the standard (such
as strtol), as well as multiple examples of functions that are
described in that section that clearly are not string functions (such
as memcpy). So the fact that strncpy is described in the "String
handling" section proves nothing.

There is no blinder man as the one that doesn't want to see.

Indeed.

jacob, you are aware, aren't you, that the source (s2) argument of
strncpy(), unlike the corresponding argument of strcpy(), is not
required to point to a string? And that after a call to strncpy(),
the destination (s1) argument does not necessarily point to a string?
And that it typically appends multiple '\0' characters to the
destination, something that has nothing to do with string handling?

But the real question is this:

*What difference does it make?*

As long as you understand how strncpy() works (or, if you don't
understand it, as long as you avoid using it), it *doesn't matter*
whether you call it a "string function" or not. (That's a generic
"you"; I'm not accusing anyone of not understanding how strncpy()
works.) I'm taking the time to refute your arguments because you're
mistaken, not because the question itself is of any great importance.

I suspect that you (and a few others) are far more interested in
demonstrating that Richard Heathfield is wrong than in discussing C.
The fact that you've chosen to fight this particular battle over a
point on which he happens to be right is probably less significant
than your insistence on fighting it in the first place.

But if you must argue this point, try presenting some valid arguments.
The standard uses the phrase "string function" a few times, but it
doesn't define it. Present a definition of the phrase "string
function", so we can have some basis for determining whether strncpy()
is or isn't one. I suspect that for any definition I could come up
with, either it would be too contrived to be useful, or it would be so
broad that it would include fopen() (which IMHO should not be
considered a string function), or it would be narrow enough to exclude
both strncpy() and memcpy(). Show us your definition.

Richard Heathfield · Nov 6, 2007

Flash Gordon said:

Richard Heathfield wrote, On 06/11/07 13:34:

<snip>

You have misinterpreted my statement. I didn't say that you don't always
believe you are correct, I said that you don't believe you are always
correct. I.e. I stated that I believe you know that you are not perfect
and are sometimes incorrect, not that you state things that you believe
are incorrect at the time you state them.

I sit corrected. And now that I understand what you meant, I agree.

I must take issue with santosh's response, though - your response was not
hair-splitting. It clarified an important distinction between two very
different ideas.

William Hughes · Nov 6, 2007

Richard Heathfield wrote:

[snip]

1) The standard says:
7.21 String Handling
Later:
7.21.2: Copying functions
Later:
7.21.2.3: The strncpy function.

But this is no proof for Mr Heathfield. He will
insist forever his nonsense. strncpy is not
for string copying.

There is no blinder man as the one that doesn't want to see.

A bit silly. The question is whether strncpy is a "string function".

For: the name starts with str and it is described in 7.21 String
Handling

Against: the fact that it does not take or produce a string

Clearly we cannot decide the question without a definition for "string
function".

We could try

A: a string function starts with str and is defined in 7.21

or
B: a string function either requires or always produces a string

I do not find either compelling. As to A I note that there are many
functions defined in 7.21 that are not string functions (e.g.
the mem* family). It seems as reasonable to say that strncpy is
misnamed as it is to say that strncpy is a string function.
Also there are functions we would like to call string
functions that are not defined in 7.21, so this is at best a
sufficient condition. As to B, I would want to characterize the function
by what is normally given and/or produced, not by the edge conditions.
Still, if forced at gunpoint to choose, I would probably pick B.

I would prefer an operational definition:

C: a string function is a function that is more
commonly used to handle strings, than for other purposes.

Like many operational definitions this is a little
fuzzy, but I would claim that using definition C, strncpy is not a
string function.

When discussing definitions we are getting close to
"De Gustibus ...". However, in matters of taste there is no
obvious.

- William Hughes

Tor Rustad · Nov 6, 2007

Flash said:
Tor Rustad wrote, On 06/11/07 11:38:
[...]

This means that the only way to convert money from one currency
to another is to move it from one account to another, and

We call this a transaction, which typically is done with double-entry
bookkeeping.

at that point the rounding occurs as part of the conversion using
defined rules and a specified conversion rate,

"I'm struggling to imagine any real-world double-entry-relevant
calculation that /cannot/ be done exactly."
-RH

and those rules specify *exactly* what will be credited to one
account and debited from the other.

The whole point is that this calculation uses rounding. Do UK accounts
have two decimal places?

Incorrect, it is easy as long as you follow the requirements above.

Nothing you said invalidated my statement here...

If anyone is interested enough I could ask one of my brothers who was
doing application support & maintenance for one of the large
organisations in the City of London that works with lots of currencies.

I have been trying to leave this thread for some time...

Anyway, some details from an insider would be very interesting, in
particular on a related problem: how UK banks can implement SEPA w.r.t.
double-entry bookkeeping and risk management.

In this case, the transactions will be in euro, while the EU payment
scheme rulebook says nothing about possible currency conversion, or the
related risks for the banks.

I don't think that Richard believes he is always correct.

Neither did I. ;-)

Charlie Gordon · Nov 6, 2007

True, but I was specifically responding to a message referring to
"string copy" functions, not just "string" functions. A "string copy"
function that doesn't always copy the entire string, and sometimes
creates an output that is a lot larger than the string that is to be
copied is arguably mislabeled.

I agree: strncpy is *definitely* mislabeled.
Furthermore, I contend that there is no justification for its inclusion in
the Standard.
It was a historical mistake, just like forgetting or deliberately excluding
the much more useful strdup function (as defined in Posix).

The usual argument that strdup is easy to write in terms of standard
functions does not hold: atoi and friends are easy to write with strtoxxx,
and for that matter, strncpy is quite easy to write too.
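
To illustrate how short both are, here is a sketch using only standard functions. The names dup_string and copy_padded are made up for this example so that they do not collide with strdup or strncpy. The second is meant to follow the behaviour the standard specifies for strncpy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of a strdup-like function: returns a malloc'ed copy of s,
 * or NULL if allocation fails. The caller frees the result. */
char *dup_string(const char *s)
{
    size_t size;
    char *copy;

    assert(s != NULL);
    size = strlen(s) + 1;       /* include the terminating null */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}

/* Sketch of a strncpy-like function: copies at most n characters
 * from src, then pads with null characters up to n if src was shorter. */
char *copy_padded(char *dest, const char *src, size_t n)
{
    size_t i = 0;

    while (i < n && src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    while (i < n)
        dest[i++] = '\0';
    return dest;
}
```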

The third-party library function that performs the actual writing
requires a buffer. What actually ends up in the output file is not
just the array itself, but also the name, dimensions, and XDR datatype
of the array.

OK, you have one of the very few examples where strncpy is the right tool
for the job, but it would not have been much of a problem if it had not been
part of the standard and you had to write a 'tailor-made' utility for this
need.

Charlie Gordon · Nov 6, 2007

Richard Heathfield said:
Charlie Gordon said:

...if it ever existed...

Correct, as I mentioned in the text that you snipped.

Right. An unpublished proof, as far as the mathematics community is
concerned, might as well not exist. I have an interesting demonstration of
this fact, but this Usenet article is too large to contain it.

Mathematics was like a game of twits then... making Fermat's Last Theorem
one of the longest running jokes of all time.
