embarrassing spaghetti code needs stylistic advice

B

Ben Bacarisse

Richard said:
Flash Gordon <[email protected]> writes:

How horrendous. The first thing I looked for seeing that was a higher
level loop.

It is almost impossible code so as to guard against all possible
misunderstandings.
Hint : "while(--x);" executes no consitional statement so include no
conditional statement.

That is not right -- the ; is not part of the while. The above /does/
execute a statement conditionally, the question is simply about how
best to write that it. C offers two obvious choices: an expression
statement with no expression, and a "continue" statement. My
preference is, like you, for the empty expression statement but I have
used the other form in the past.
If a C programmer can not see what

while(x--);

does then he has no business in the code

I would say the same about while(x--) continue;
 
L

luser-ex-troll

It is almost impossible code so as to guard against all possible
misunderstandings.


That is not right -- the ; is not part of the while.  The above /does/
execute a statement conditionally, the question is simply about how
best to write that it.  C offers two obvious choices: an expression
statement with no expression, and a "continue" statement.  My
preference is, like you, for the empty expression statement but I have
used the other form in the past.


I would say the same about while(x--) continue;

I both agree and disagree with both of you. Although while(cond); is
syntactically valid, and any C programmer worth their salt should
understand this if they look at it. But one indulges in speculation
about how best to offer that minimum bit of help which may be needed
by some future reader without being obnoxious to those who can grok it
without such aid.

For more fuel to the fire, I've investigated the hodge-podge of int-
bool interpretations in this little program and am surprised at the
results. Is it normal for different functions in the same file to have
such wildly differing notions about good and evil?


#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* character classes */
#define eq(a,b) (int)a==b
#define within(a,b) strchr(a,b)!=NULL
int israd (int c){return eq('#',c);}
int isdot (int c){return eq('.',c);}
int ise (int c){return within("eE",c);}
int issign (int c){return within("+-",c);}
int isdelim (int c){return within("()<>{}[]%/",c);}
int isregular(int c)
{return c!=EOF && !isspace(c) && !isdelim(c);}
#undef within
#undef eq

typedef struct { int (*fp)(int); int y, n; } test;

/* n.b. this machine has no x+, use xx* */
/* ^[+-]?\d+$ */
test fsm_dec[] = {
/* 0*/ { issign, 1, 1 },
/* 1*/ { isdigit, 2, -1 }, /* [+-]?! ??(\d)?? */
/* 2*/ { isdigit, 2, -1 }, /* [+-]?\d\d* yes! */
};
int acc_dec(int i){ return i==2; } /*acceptable decimal?*/

/* ^\d+[#][a-Z0-9]+$ */
test fsm_rad[] = {
/* 0*/ { isdigit, 1, -1 },
/* 1*/ { isdigit, 1, 2 }, /* \d\d* */
/* 2*/ { israd, 3, -1 }, /* \d\d*[^\d] */
/* 3*/ { isalnum, 4, -1 }, /* \d\d*# */
/* 4*/ { isalnum, 4, -1 }, /* \d\d*#\x\x* yes! */
};
int acc_rad(int i){ return i==4; } /*acceptable radix?*/

/* ^[+-]?(\d+(\.\d*)?)|(\d*\.\d+)([eE][+-]?\d+)?$ */
test fsm_real[] = {
/* 0*/ { issign, 1, 1 },
/* 1*/ { isdigit, 2, 4 }, /* [+-]? */
/* 2*/ { isdigit, 2, 3 }, /* [+-]?\d\d* yes! */
/* 3*/ { isdot, 6, 7 }, /* [+-]?\d\d*[^\d] */
/* 4*/ { isdot, 5, -1 }, /* [+-]?[^\d] */
/* 5*/ { isdigit, 6, -1 }, /* s?\. where s is [+-] */
/* 6*/ { isdigit, 6, 7 }, /* s?(\d\d*)?\.\d* yes! */
/* 7*/ { ise, 8, -1 }, /* s?(\d\d*)?(\.\d*)? */
/* 8*/ { issign, 9, 9 }, /* s?\d*(\.\d*)?[eE] */
/* 9*/ { isdigit, 10, -1 }, /* s?\d*(\.\d*)?[eE][+-]? */
/*10*/ { isdigit, 10, -1 }, /* s?\d*(\.\d*)?[eE]s?\d\d* yes! */
};
/*acceptable real*/
int acc_real(int i){
switch(i) { case 2: case 6: case 10: return 1; }
return 0; }

/*1/0*/
/*what's the reverse of reverse polish?*/
int czek(char *s, test *fsm, int(*yes)(int)){
int sta = 0;
while(sta!=-1 && *s) {
if (fsm[sta].fp((int)*s)) { sta=fsm[sta].y; s++; }
else { sta=fsm[sta].n; } }
return yes(sta); }

/*0/-1*/
int grok(char *s) { /*dig?*/
if (czek(s, fsm_dec, acc_dec )) {
printf( "dec: %s\n", s); return 0; }
else if (czek(s, fsm_rad, acc_rad )) {
printf( "rad: %s\n", s); return 0; }
else if (czek(s, fsm_real,acc_real)) {
printf("real: %s\n", s); return 0; }
/*else*/
printf("grok? %s\n", s);
return -1; }

/*0/-1*/
int puff(char *buf, int nbuf) {
int c; char *s = buf;
while ( (c=getchar()), isregular(c) ) {
if(s-buf >= nbuf-1) return -1;
*s++ = (char)c; }
*s = (char)0; (void)ungetc(c,stdin);
return 0; }

/*1/-1*/
int toke(char *buf, int nbuf) {
int sta = 0; char *s=buf;
while (isspace(*s=(char)getchar())) /*jump down, turn around*/;
if ((sta=puff(++s,nbuf-1))!=-1)
sta=grok(buf);
return sta; }

/*0/1*/
#define NBUF 10
int main(void) { char buf[NBUF] = "";
while (-1 != toke(buf,NBUF)) /*pick a bale a day*/;
return 0; }

/*eof*/
 
J

jameskuyper

Ben said:
That is not right -- the ; is not part of the while.

I'm not quite sure what you mean by that objection. It's certainly not
a part of the keyword "while", but it is part of a while statement.

The relevant grammatical production is described in section 6.8.5:
iteration-statement:
while ( expression ) statement

The ';' in "while(--x);" is part of the "statement" portion of an
iteration statement. It is no less a part of the while statement than
the 'x' is.
 
D

Default User

Richard said:
CBFalconer said:

Putting the 'continue' on a separate line makes it clearer exactly
when the condition is met (when stepping through the code with a
debugger - not something I do a lot nowadays, but I used to, and
some people still do). And using the (optional) braces to mark off
the loop body is a good habit to get into, as it can save all kinds
of embarrassment later when the loop is maintained.

I tend to prefer using the same structure for the empty block as I
would if it had something in it. I would not normally put in a continue
statement, but just leave it empty.

while (condition)
{
}

A comment in the compound statement would be a good idea as well. That
makes it extra clear that it wasn't some sort of editing error.





Brian
 
F

Flash Gordon

Richard said:
How horrendous. The first thing I looked for seeing that was a higher
level loop.

I don't.
Hint : "while(--x);" executes no consitional statement so include no
conditional statement.

If a C programmer can not see what

while(x--);

does then he has no business in the code and will certainly screw up at

while(*d++=*s++);

time.

I agree that a programmer should be able to see and understand what all
of the other alternatives mean. To me it is more a matter of which will
make it clearest that I intended that there is no body.

I'm not really bothered about which method people use for an empty body
and would happily switch to whatever method was being used by anyone
else I happened to be working with.
 
B

Ben Bacarisse

jameskuyper said:
I'm not quite sure what you mean by that objection. It's certainly not
a part of the keyword "while", but it is part of a while statement.

Yes, it was not a good way to put it. I thought it became clear later
with the text you've snipped. By saying '"while(--x);" executes no
consitional[sic] statement so include no conditional statement'
Richard seems to be suggesting that there is no sub-statement
conditionally controlled by the (outer) while. If there were true,
the ; would, most likely, come from the production for "while" but it
does not as you illustrate:
 
C

CBFalconer

Richard said:
Keith Thompson said:
(e-mail address removed) (Richard Harter) writes:
.... snip ...
I dare say you have; people have all sorts of notions. Perhaps
I am wrong but I would have supposed that good coders indent
only when a new block is started or when a single statement is
being written on more than one line. (Or in special situations
not covered by these two cases. :))

[...]

I've seen too many cases where the indentation is inconsistent
because the author had a different tabstop setting than I do
(e.g., 4 columns vs. 8). I've even seen inconsistencies within
the same file, where apparently two maintainers had different
tabstop settings.

Coding standards should ban the use of tab characters in C
source code.

I've seen the same thing and I agree about banning tab
characters. One of the conveniences of the vslick editor that
I use is that by default it automatically converts tabs into
spaces when text is entered. I imagine that other editors have
something similar.

And then there are the Vedit users, who can list a whole set of tab
stops. Very handy, but not too useful for indented code. The wise
user includes a comment in his original which specifies how to set
up the tabs.
 
C

CBFalconer

Richard said:
pete said:



Ouch. And that's why I always use {}

Then how do you code printing:

"Hello Hello Hello Hello Hello World\n"

with a user controllable count of "Hello"s? :)
 
B

Barry Schwarz

snip


Yes, irritating, but acknowledged.

snip


I know. But it's so pretty with 85 columns. I'll split such things for
future postings, but I'm keeping it this way on disk.

Bad idea. If you are having problems with code, the only way we can
help is if you post the exact code (preferably using cut and paste).
While adding new lines is probably low risk, it is not risk free.

snip
Thanks a bunch. I'll keep the lines much shorter to guard against ugly
splits. Is there any way to defend against those extra newlines?

Decide where a visual break will introduce the least problem for the
reader. For example, the above if could be

if (check(buf,decimal,dec_accept))
{printf( "dec: %s\n", buf);
return 0;
}
 
B

Barry Schwarz

Traced, debugged, sieved, and splinted; is it stylish yet?
snip

/* ^[+-]?(\d+(\.\d*)?)|(\d*\.\d+)([eE][+-]?\d+)?$ */
test fsm_real[] = {
/* 0*/ { issign, 1, 1 },
/* 1*/ { isdigit, 2, 4 }, /* [+-]? */
/* 2*/ { isdigit, 2, 3 }, /* [+-]?\d\d* yes! */
/* 3*/ { isdot, 6, 7 }, /* [+-]?\d\d*[^\d] */
/* 4*/ { isdot, 5, -1 }, /* [+-]?[^\d] */
/* 5*/ { isdigit, 6, -1 }, /* [+-]?\. */
/* 6*/ { isdigit, 6, 7 }, /* [+-]?(\d\d*)?\.\d* yes! */
/* 7*/ { ise, 8, -1 }, /* [+-]?(\d\d*)?(\.\d*)? */
/* 8*/ { issign, 9, 9 }, /* [+-]?(\d\d*)?(\.\d*)?[eE] */
/* 9*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]? */
/*10*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]?\d\d*
yes! */

Just out of curiosity, I wonder why 2e7 is not valid?
 
B

Barry Schwarz

snip


I don't understand: the program should reject an EOF and demand the
line be finished?! A newline separator is handled by isspace on the
next line.

Most user input is terminated with the Enter key which results in
getchar returning '\n'. Forcing the user to figure out how to enter
EOF (it gets asked fairly frequently and is system specific) is just
unfriendly. I don't think I know how to force EOF on my IBM mainframe
except at the start of a line with a "/*" which means I have to give
you the '\n' anyway.

If you like, check for both. (Some here even insist on it.)
 
B

Ben Bacarisse

Barry Schwarz said:
Traced, debugged, sieved, and splinted; is it stylish yet?
snip

/* ^[+-]?(\d+(\.\d*)?)|(\d*\.\d+)([eE][+-]?\d+)?$ */
test fsm_real[] = {
/* 0*/ { issign, 1, 1 },
/* 1*/ { isdigit, 2, 4 }, /* [+-]? */
/* 2*/ { isdigit, 2, 3 }, /* [+-]?\d\d* yes! */
/* 3*/ { isdot, 6, 7 }, /* [+-]?\d\d*[^\d] */
/* 4*/ { isdot, 5, -1 }, /* [+-]?[^\d] */
/* 5*/ { isdigit, 6, -1 }, /* [+-]?\. */
/* 6*/ { isdigit, 6, 7 }, /* [+-]?(\d\d*)?\.\d* yes! */
/* 7*/ { ise, 8, -1 }, /* [+-]?(\d\d*)?(\.\d*)? */
/* 8*/ { issign, 9, 9 }, /* [+-]?(\d\d*)?(\.\d*)?[eE] */
/* 9*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]? */
/*10*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]?\d\d*
yes! */

Just out of curiosity, I wonder why 2e7 is not valid?

It is valid according to the PostScript reference manual I have, and it
seems to be according to my reading of the comments. Does the code
not honour the comments?

The comments seem to suggest that the code will take e, E and E45 as
floating point numbers and this is not what my reference says. My
reading of the PS text is that floating point number has the form

[+-]?(\d+.\d*|\d*.\d+|\d+)([eE][+-]?\d+)?
 
L

luser-ex-troll

Traced, debugged, sieved, and splinted; is it stylish yet?
snip

/* ^[+-]?(\d+(\.\d*)?)|(\d*\.\d+)([eE][+-]?\d+)?$ */
test fsm_real[] = {
/* 0*/ { issign,   1,  1 },
/* 1*/ { isdigit,  2,  4 }, /* [+-]? */
/* 2*/ { isdigit,  2,  3 }, /* [+-]?\d\d* yes! */
/* 3*/ { isdot,    6,  7 }, /* [+-]?\d\d*[^\d] */
/* 4*/ { isdot,    5, -1 }, /* [+-]?[^\d] */
/* 5*/ { isdigit,  6, -1 }, /* [+-]?\. */
/* 6*/ { isdigit,  6,  7 }, /* [+-]?(\d\d*)?\.\d* yes! */
/* 7*/ { ise,      8, -1 }, /* [+-]?(\d\d*)?(\.\d*)? */
/* 8*/ { issign,   9,  9 }, /* [+-]?(\d\d*)?(\.\d*)?[eE] */
/* 9*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]? */
/*10*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]?\d\d*
yes! */

Just out of curiosity, I wonder why 2e7 is not valid?

I wonder why you wonder.
I'm getting correct recognition of 2e7. Perhaps there was a transient
error in one of the intermediate versions, but the current one accepts
2e7 just fine; and the machine quoted here appears to be the correct
one. Let's trace it to make sure.

in state 0, issign('2') returns false, transit to 1.
in state 1, isdigit('2') returns true, transit to 2, increment s.
in state 2, isdigit('e') returns false, transit to 3.
in state 3, isdot('e') returns false, transit to 7.
in state 7, ise('e') returns true, transit to 8, increment s.
in state 8, issign('7') returns false, transit to 9.
in state 9, isdigit('7') returns true, transit to 10, increment s.
s now points to a nul character and so the loop ends,
and acc_real(10) returns true indicating the machine
terminated in an acceptable final state.

I cannot reproduce the error. It appears correct to me.
 
L

luser-ex-troll

Most user input is terminated with the Enter key which results in
getchar returning '\n'.  Forcing the user to figure out how to enter
EOF (it gets asked fairly frequently and is system specific) is just
unfriendly.  I don't think I know how to force EOF on my IBM mainframe
except at the start of a line with a "/*" which means I have to give
you the '\n' anyway.

If you like, check for both.  (Some here even insist on it.)

I'm fairly certain that I am.
Right here, in fact|
V
 
K

Keith Thompson

CBFalconer said:
And then there are the Vedit users, who can list a whole set of tab
stops. Very handy, but not too useful for indented code. The wise
user includes a comment in his original which specifies how to set
up the tabs.

No, the wise user uses only spaces, never tabs, for indentation.

Which is better, a comment telling every person viewing the file how
to set up the tabs (possibly using different methods depending on
which editor or other tool you're using), or using spaces and thereby
cleanly eliminating the need for either the comment or the setup?

Yes, this is my personal opinion, but it's a very strongly held one.
 
C

CBFalconer

Keith said:
.... snip ...


No, the wise user uses only spaces, never tabs, for indentation.

Which is better, a comment telling every person viewing the file
how to set up the tabs (possibly using different methods
depending on which editor or other tool you're using), or using
spaces and thereby cleanly eliminating the need for either the
comment or the setup?

Well, I was thinking of generic file editing, not C source. I
wasn't specific about it.
 
C

CBFalconer

pete said:
Not with the code statements which are shown above.

The number of "Hello "'s that the shown code statements can
output, is limited to one.

Not with the unshown lines, which included:

int i = 5;
and
#define condition i--

Besides, I think your assumptions cause infinite Hellos or zero.
:)
 
B

Ben Bacarisse

CBFalconer said:
Not with the unshown lines, which included:

int i = 5;
and
#define condition i--

These would make no difference.
Besides, I think your assumptions cause infinite Hellos or zero.
:)

No, you have missed the point of the example. It was to show how
confusing bad indentation can be, particularly when there is a tiny,
but significant, ';' on the while line.
 
B

Ben Bacarisse

luser-ex-troll said:
Traced, debugged, sieved, and splinted; is it stylish yet?
snip

/* ^[+-]?(\d+(\.\d*)?)|(\d*\.\d+)([eE][+-]?\d+)?$ */
test fsm_real[] = {
/* 0*/ { issign,   1,  1 },
/* 1*/ { isdigit,  2,  4 }, /* [+-]? */
/* 2*/ { isdigit,  2,  3 }, /* [+-]?\d\d* yes! */
/* 3*/ { isdot,    6,  7 }, /* [+-]?\d\d*[^\d] */
/* 4*/ { isdot,    5, -1 }, /* [+-]?[^\d] */
/* 5*/ { isdigit,  6, -1 }, /* [+-]?\. */
/* 6*/ { isdigit,  6,  7 }, /* [+-]?(\d\d*)?\.\d* yes! */
/* 7*/ { ise,      8, -1 }, /* [+-]?(\d\d*)?(\.\d*)? */
/* 8*/ { issign,   9,  9 }, /* [+-]?(\d\d*)?(\.\d*)?[eE] */
/* 9*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]? */
/*10*/ { isdigit, 10, -1 }, /* [+-]?(\d\d*)?(\.\d*)?[eE][+-]?\d\d*
yes! */

Just out of curiosity, I wonder why 2e7 is not valid?

I wonder why you wonder.
I'm getting correct recognition of 2e7. Perhaps there was a transient
error in one of the intermediate versions, but the current one accepts
2e7 just fine; and the machine quoted here appears to be the correct
one. Let's trace it to make sure.

I see you have correctly ignored my remark about string such as e2,
..e2 etc! I thought you might be getting these wrong based on the
later comments. I would correct these to reflect what has been seen
by the time the machine gets into that particular state. The last
line suggest that .e2 will be taken as a real but, presumably, it
won't be because of previous matches to get into state 10.

Also, I don't think your top comment is correct or at least it is a
little misleading:

/* ^[+-]?(\d+(\.\d*)?)|(\d*\.\d+)([eE][+-]?\d+)?$ */

suggests that '+2' will be taken as a real when it is not one (at not
least formally).
 
K

Keith Thompson

Richard Heathfield said:
Keith Thompson said:

Given the context, I can understand your response, Keith.
Nevertheless, I cannot agree with your claim, because it is
equivalent to "the user who sometimes or always uses tabs for
indentation is unwise". Personally, I thought about this a long
time ago and decided to use spaces - but I know that some people,
just as wise as me if not wiser, have also thought about this and
reached the opposite conclusion. This isn't a matter of "my way
wise, his way unwise", but a matter of "what works for you". It's
one of those (very few) issues that can indeed be resolved by
postmodernist fudging. You use what you want, I'll use what I want,
and I'll run your code through indent before I try to read it (and
doubtless you will do the same to mine). As it happens, you and I
both chose spaces, so indent won't have much to do (in that
respect) - but just because we agree on this, it does not mean that
we are wiser than someone who chose tabs.

I don't want to pass all code through indent before reading it. For
one thing, depending on the environment, that's not always even an
option; for another, indent can change the layout in ways that
shouldn't affect what the compiler sees, but can adversely affect what
a human reader sees.

Different programmers use different brace placement, among other
things. I don't want to change that; for example, I might need to
conform to the existing style when I make changes and check them in.

And if there's a stray semicolon:
while (condition);
do_something;
indent will quietly change it to:
while (condition);
do_something;
making it harder to figure out what was actually intended. (That's
admittedly an unusual case.)

What I frequently do is run code through expand (a Unix tool that
replaces each tab with the right number of spaces) -- but figuring out
what options to pass to expand can be non-trivial. For many years, I
only saw code written with the assumption that tabstops are set every
8 columns, because that's the default setting on the Unix systems I
used. Now I often see code that looks incorrect unless I set tabstops
to 4 columns, or sometimes 2. There's no explicit indication of what
the tabstop setting should be; I just have to play with it until the
code looks right.
Fine - but if you now ask a tabs advocate to write a "which is
better", you may well find that he makes out an equally strong
case.

I'd be interested in seeing such an argument.
Perhaps just a touch /too/ strongly held?

Perhaps -- except that I'm right and everyone who disagrees with me is
wrong. :cool:}

Back in the old days, I did use tabs for indentation -- but not
necessarily one tab per level. My usual style was to indent 3 columns
per level (I now use 4), but with tabstops set to 8 columns. So the
beginning of a deeply indented line might have one or more tabs
followed by one or more spaces. As long as everyone reading or
editing the code had their tabstops set to 8 columns, that was ok (and
it saved a little disk space). Later people seem to have gotten the
idea that each indentation level must be represented by a single tab
character, with the tabstop settings adjusted as necessary. But until
there's universal agreement on tabstop settings, there's always the
risk that code will be formatted inconsistently. I've found that the
best way to avoid inconsistency is to use spaces exclusively.

(Except in Makefiles, which require tab characters -- sigh.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top