Bug/Gross InEfficiency in HeathField's fgetline program

Mark McIntyre · Oct 9, 2007

But exactly the opposite is true - clarity is lost in *your* version,

You're wrong, but I don't expect you to believe me since you're
apparently a troll.

And if you make simple things over-complicated, we might not
unreasonably suspect that you might make complicated things into a
complete mess.

Only if he changes his name to Antoninus Twink.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Joachim Schmitz · Oct 9, 2007

¬a\/b said:
In data Mon, 8 Oct 2007 00:16:07 +0200 (CEST), Antoninus Twink
scrisse:

this is a big waste of vertical spaces

Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';
++u;
}
}
return t;
}

Bye, Jojo

Richard Heathfield · Oct 9, 2007

Antoninus Twink said:

On 9 Oct 2007 at 9:34, Richard Heathfield wrote:

But exactly the opposite is true - clarity is lost in *your* version, by
taking something simple and making a meal of it.

Clearly, you are entitled to your opinion. Equally clearly, many here do
not share it, including me.

Yes, it's terribly, terribly simple. Almost babyish even. I did not mean
that your *code* is complex, but that the *algorithm* is complex.

You're kidding, right? If you think that's a complex algorithm, steer well
clear of Miller-Rabin or Boyer-Moore (or, come to that, gcd).

And
you will say, "But it's a simple algorithm!" Right, it isn't complex by
any objective standard of complexity, but it's *more complex than it
needs to be* - why swap a simple single-pass algorithm for a 2-pass
algorithm?

I didn't swap anything. I just followed my programming instincts. It was a
long time ago, but I expect I reasoned roughly along these lines: (a) the
original might not be writeable, so make a safe copy; (b) hack the copy,
leaving the original alone.

During development, I would certainly have noticed if the performance were
unacceptable (because I tend to develop incrementally most of the time, so
sudden performance drops will normally stand out), and done something
about it. Since that never happened, I didn't bother. This is in full
accordance with the Two Rules of Optimisation.

And if you make simple things over-complicated, we might not
unreasonably suspect that you might make complicated things into a
complete mess.

You might think that my way is more complicated, and that's entirely up to
you. But John Bode's optimiser clearly disagrees with you.

By the way: if, for the sake of argument, we accept that my code /is/
grossly inefficient, how should we describe your code, which - on John's
system - is between 20% and 85% *slower* than mine?

Chris Dollin · Oct 9, 2007

Joachim said:
Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';
++u;
}
}
return t;
}

Save another line by declaring `char *u = t;`, which also makes it
utterly clear that `u` is initialised to a sensible value.

Richard Heathfield · Oct 9, 2007

Chris Dollin said:

Save another line by declaring `char *u = t;`, which also makes it
utterly clear that `u` is initialised to a sensible value.

<hell temperature="frozen over">
Or even char *u = strcpy(t, s); saving /two/ lines.

</hell>

Chris Dollin · Oct 9, 2007

Richard said:
Chris Dollin said:

<hell temperature="frozen over">
Or even char *u = strcpy(t, s); saving /two/ lines.
</hell>

I was just about to post that adjustment, you mind-reader you.

Richard · Oct 9, 2007

Mark McIntyre said:
You're wrong, but I don't expect you to believe me since you're
apparently a troll.

He most certainly is not.

The shorter more concise version would have got through any code review
long before the long winded one on any project I have worked on.

Only if he changes his name to Antoninus Twink.

This is getting ridiculous.

Richard · Oct 9, 2007

Joachim Schmitz said:
Scratch the 'big' and I agree. Saving 6 lines:

char *dot_to_underscore(const char *s)
{
char *t = malloc(strlen(s) + 1);
if(t != NULL){
char *u;
strcpy(t, s);
u = t;
while(*u){
if(*u == '.') *u = '_';

failed.

you cant easily see in a debugger if the assignment is done.

Richard Bos · Oct 9, 2007

Ian Collins said:
You must have a very small screen.

I don't think it's his _screen_ that's very small.

Richard

Joachim Schmitz · Oct 9, 2007

Richard said:
failed.

you cant easily see in a debugger if the assignment is done.

Well, OK, I saved one line too much then here. Make it a 2-liner and save
that line by char *u = t;

Bye, Jojo

Richard · Oct 9, 2007

I don't think it's his _screen_ that's very small.

Richard

It is perfectly normal in many cases to debug on smaller displays over
remote lines or to debug in a small window or gdb buffer. The point
remains - smaller footprint is easier to read and follow.

3 clear lines is better than 20 clear lines any day of the week. Long
winded overly whitespaced code is bad.

kuyper · Oct 9, 2007

Richard wrote:
....

I am astonished that people claiming to be professional programmers
could be in any way "confused" or "unsure" about your concise
replacement for Heathfield's version.

,----
| while(*u++=(*s=='.' ? '_' : *s))
| s++;
`----

I simply can not see the complication or the need to "analyse" this. Yes
if we were training in lesson 2 we might expand it out a little. but
only to the extent of removing the ?: usage to an if then else.

I have interviewed several dozen people for C programming positions
over the past 15 years. I've given every single one of them a simple
program to understand and explain and suggest improvements for. The
heart of that program was the following loop:

while(*p++ = *q++);

Only about half of them could even tell me correctly what that loop
does; not a single one has ever correctly explained how it does it.
The only person I've found who ever passed this test wasn't an
applicant but a co-worker, and she required a lot of hints before she
figured it out. The stumbling block for most of them is explaining how
the loop terminates. I've had to hire people who couldn't answer that
question, because better candidates weren't available. One of the most
recent applicants had 15 years(!) of recent C programming experience.
He didn't get the position, but not because of his lack of programming
skills, but only because he lacked several of the other important
qualifications for the position.

You may have had the privilege of working with more competent C
programmers than I have; but I would recommend not overestimating the
competence of the average C programmer. The code you cite has every
feature that made my test difficult for applicants, and adds to that a
use of the ?: operator.

While I can understand your code, even for a use-once-and-throw-away
program, I consider it excessively terse. For improved debugging, I'd
have used an if() statement rather than a ?: operator, so the if-
clause and the else-clause would occur on different lines that could
be selected by the debugger. Any decent compiler should produce the
same code, whether using if() or ?:. Yes, I do write my even use-once
programs anticipating that they may need to be debugged; they tend to
be written in a hurry without review, which means they're much more
likely to be buggy than my longer-lived code.

J. J. Farrell · Oct 9, 2007

He most certainly is not.

The shorter more concise version would have got through any code review
long before the long winded one on any project I have worked on.

I can only conclude that you've worked exclusively on very strange
projects. The opposite of what you say is true in my experience. The
original code would always have got through, after questions about the
multiple passes; in the given circumstances the code would be allowed
to stand. Antoninus Twink's version would never have got through a
review. Your version would probably have got through after a lot of
arguing about its being needlessly obscure.

I find your position very surprising. There's no way to know but I'd
bet that, given the choice between the two in the given circumstances,
the vast majority of experienced professional C programmers would
choose the original version over Antoninus Twink's. They might not be
overly fond of it, but it would be the preferred choice of the two.

This is getting ridiculous.

Indeed.

santosh · Oct 9, 2007

kuyper said:
Richard wrote:
...

I have interviewed several dozen people for C programming positions
over the past 15 years. I've given every single one of them a simple
program to understand and explain and suggest improvements for. The
heart of that program was the following loop:

while(*p++ = *q++);

Only about half of them could even tell me correctly what that loop
does; not a single one has ever correctly explained how it does it.
The only person I've found who ever passed this test wasn't an
applicant but a co-worker, and she required a lot of hints before she
figured it out. The stumbling block for most of them is explaining how
the loop terminates. I've had to hire people who couldn't answer that
question, because better candidates weren't available.

This is incredible!

You may have had the privilege of working with more competent C
programmers than I have; but I would recommend not overestimating the
competence of the average C programmer. The code you cite has every
feature that made my test difficult for applicants, and adds to that a
use of the ?: operator.

I agree here. All other factors being equal clearer code is always
better. The issue though is, what's clear to one is obfuscation to
another.

<snip>

Richard Heathfield · Oct 9, 2007

santosh said:

kuyper wrote:

[classic K&R strcpy loop snipped]
The stumbling block for most of them is explaining how
the loop terminates. I've had to hire people who couldn't answer that
question, because better candidates weren't available.

Click to expand...

This is incredible!

No, he's right. Most C programmers are very, very bad at C. That's why it's
so important to write clear, simple code. Or at least, as clear and simple
as possible. No clearer, no simpler.

Kenny McCormack · Oct 9, 2007

santosh said:

kuyper wrote:

[classic K&R strcpy loop snipped]
The stumbling block for most of them is explaining how
the loop terminates. I've had to hire people who couldn't answer that
question, because better candidates weren't available.

Click to expand...

This is incredible!

Click to expand...

No, he's right. Most C programmers are very, very bad at C. That's why it's
so important to write clear, simple code. Or at least, as clear and simple
as possible. No clearer, no simpler.

I think Sturgeon's law applies here (as it does everywhere).

Malcolm McLean · Oct 9, 2007

kuyper said:
The heart of that program was the following loop:

while(*p++ = *q++);

Only about half of them could even tell me correctly what that loop
does; not a single one has ever correctly explained how it does it.

It's what I call compileable gibberish. That's the sort of thing that gives
C a bad name.
If C were the only lanauge in existence there would be a case for developing
that sort of terse convention. But the fact is that there are many languages
that can copy zero-terminated arrays one to the other, but most require a
slightly more intuitive syntax to do it. Whilst you've got to assume that
the reader knows C, don't assume that he uses the language more than
occasionally. mke is as easy as possible for the reader, and remember that
that trivial inconveniences are cumulative.

Flash Gordon · Oct 9, 2007

Malcolm McLean wrote, On 09/10/07 21:40:

It's what I call compileable gibberish. That's the sort of thing that
gives C a bad name.

If you think that then C is not the language for you since it is
idiomatic. Personally I would write it as
while (*p++ = *q++) continue;
I would even suggest the changes at a code review (more spacing would be
mandated by any coding standard I had a say in). Other than that it is fine.

If C were the only lanauge in existence there would be a case for
developing that sort of terse convention.

It was developed when C was young (before it was standardised) by the
creators. There were already other languages at the time.

But the fact is that there are
many languages that can copy zero-terminated arrays one to the other,
but most require a slightly more intuitive syntax to do it.

So has C, you will find details in the documentation for the standard
library. That does not mean the code is not idiomatic.

Whilst
you've got to assume that the reader knows C, don't assume that he uses
the language more than occasionally.

Completely wrong. Kuyper was talking about interviewing people for C
programming jobs. Anyone in such a job will be using C a *lot*. Either
that or they will be sacked for not doing the job they are paid for.

mke is as easy as possible for the
reader, and remember that that trivial inconveniences are cumulative.

You makes it as easy as possible for the correct level of reader. Anyone
with a C programming job should be able to work out what that loop does
during an interview.

Antoninus Twink · Oct 9, 2007

It's what I call compileable gibberish. That's the sort of thing that gives
C a bad name.

It's the sort of thing that attracted me to C in the first place. I
suspect I'm not alone.

If C were the only lanauge in existence there would be a case for developing
that sort of terse convention. But the fact is that there are many languages
that can copy zero-terminated arrays one to the other, but most require a
slightly more intuitive syntax to do it. Whilst you've got to assume that
the reader knows C, don't assume that he uses the language more than
occasionally. mke is as easy as possible for the reader, and remember that
that trivial inconveniences are cumulative.

If a maintainance programmer doesn't have a basic level of competence in
the language of the code he's working with, then slight befuddlement
over how string copying works is likely to be the least of his problems
(and the least of the damage to the code that results). I mean,
honestly, this sort of idiom is like Chapter 2 of K&R or something.

kuyper · Oct 9, 2007

Malcolm said:
It's what I call compileable gibberish. That's the sort of thing that gives
C a bad name.

I agree that it's an obscure idiom, but I wouldn't go so far as to
call it gibberish.

When I first used this test on an applicant, I expected that anyone
adequately qualified for an entry-level full-time C programming
position would be able to figure out how that idiom works, if given a
reasonable amount of time to do so. Furthermore, since the idiom is
explained in full detail in K&R, I expected that many of them would
actually recognize it. My expectations have lowered since then. I
still think that entry-level programmers should be able to figure this
out, even though I no longer expect them to be able to do so.

....

slightly more intuitive syntax to do it. Whilst you've got to assume that
the reader knows C, don't assume that he uses the language more than
occasionally.

The code I'm responsible for is intended to be read and modified only
by full-time professional C programmers. While the code is publicly
available, making it easily modifiable by people who aren't very
familiar with C is not a priority. It's not just a low priority; it's
completely missing from our list of priorities.

Making it easily modifiable by people who are very familiar with C is
one of our priorities. The depressing thing is how low a limit that
places on the complexity of the code.

Fibonacci	0	May 13, 2023
Adding adressing of IPv6 to program	1	Feb 16, 2023
C language. work with text	3	Dec 10, 2021
code review	26	Feb 6, 2004
Can't solve problems! please Help	0	Sep 26, 2022
compressing charatcers	35	Apr 2, 2014
Strange bug	65	Nov 19, 2010
K&R exercise 5-5	10	Feb 20, 2007

Bug/Gross InEfficiency in HeathField's fgetline program

Mark McIntyre

Joachim Schmitz

Richard Heathfield

Chris Dollin

Richard Heathfield

Chris Dollin

Richard

Richard

Richard Bos

Joachim Schmitz

Richard

kuyper

J. J. Farrell

santosh

Richard Heathfield

Kenny McCormack

Malcolm McLean

Flash Gordon

Antoninus Twink

kuyper

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads