Stylistic questions on UNIX C coding.

Keith Thompson · Mar 5, 2010

The != operator is symmetric; ``NULL != pointer'' and ``pointer != NULL''
mean exactly the same thing, and both expressions are easy to read.

Yes, the != operator is symmetric. Yes, the two expressions mean
exactly the same thing. No, ``NULL != pointer'' *isn't* easy to
read *for a lot of people*. When I see a constant on the LHS and a
variable on the RHS of an "==" or "!=" operator, I have to mentally
swap the operands to understand what it means.

If you find them equally easy to read, I envy you. Seriously.
It probably means that you have a better intuitive understanding
than I do.

Consider that the following expressions all have the same meaning:

pointer != NULL
NULL != pointer
!(pointer == NULL)
!!pointer
pointer == NULL ? 0 : 1
pointer == NULL == '/'/'/'

Most sane people would prefer either of the first two forms to the
others, simply because they're more readable. Please understand, or
at least accept, that many of us prefer the first form to the
second for exactly the same (probably not entirely rational) reason.

[...]

Seebs · Mar 5, 2010

I suggest that in cases where "is" is used to denote equality,
usage is more symmetric than you might think. "My brother is Fred
Thompson" and "Fred Thompson is my brother" seem equally clear to me
(and equally false, since that's not my brother's name).

They're both equally clear, but they are communicating different things.
One of them is telling us something we might not already know, about an
entity already known to be your brother. The other is telling us something
we might not already know, about an entity already known to be Fred Thompson.

Similarly, there's a big difference between "the person who committed the
murder was the butler" and "the butler committed the murder". The former
is giving you information about a murder you already know about, the other
is giving you information about a butler you already know about.

And someone who finds "x == 7" and "7 == x" equally clear is probably
thinking about it more clearly than I'm able to, and possibly having
trouble figuring out what my problem is. (I'm not entirely sure
about that myself.)

See above. We use sentence structure to communicate additional information
that is not purely a function of the semantics of the words used.

-s

Keith Thompson · Mar 5, 2010

Seebs said:
They're both equally clear, but they are communicating different things.
One of them is telling us something we might not already know, about an
entity already known to be your brother. The other is telling us something
we might not already know, about an entity already known to be Fred Thompson.

And the first sentence is less valid if I have more than one brother.

Perhaps this is a better example: "The ratio of a circle's
circumference to its diameter is pi" vs. "pi is the ratio of a
circle's circumference to its diameter". But even there, the second
sentence looks more like a definition of "pi" than the first one does.

[snip]

Seebs · Mar 5, 2010

Perhaps this is a better example: "The ratio of a circle's
circumference to its diameter is pi" vs. "pi is the ratio of a
circle's circumference to its diameter". But even there, the second
sentence looks more like a definition of "pi" than the first one does.

Right. They're answers to different questions. One answers "I have here a
circumference and a diameter; what is the relationship between them?" The
other answers "I have this number, what does it mean?"

Similarly, there's a reason that tax forms (famed, of course, for their
excellent language, right?) say "If your income is over $N" rather than "if
$N is less than your income". (Ignoring the precise equality case, because
that's another kind of complexity.) Likewise, it's "you must be at least 48
inches tall to go on this ride", not "48 inches must be less than your
height for you to go on this ride".

Idiomatic writing matters in both English and C.

-s

Ersek, Laszlo · Mar 5, 2010

pointer == NULL == '/'/'/'

Fascinating

Cheers,
lacos

Ike Naar · Mar 5, 2010

To the compiler, yes. To the reader, no.

To the mathematically inclined reader, yes.

You sort of do, actually.
In general, while everyone knows that addition is commutative, people will
tend (slightly) to see "x + 3" as "basically x, but with 3 more", and "3 + x"
as "basically 3, but with x more". The lefthand operand has primacy, and
this *does* matter.

It doesn't matter, but anybody can fool themselves that it does.
And then Alice convinces herself that 3+x is ugly and unreadable,
Bob opts for x+3 being error-prone and unreadable, and now what
should Carol write?

[snip]

It's like indentation. We don't indent in C because the compiler cares, but
because it helps readers understand. Bad indentation can result in readers
misunderstanding code, because they trust the indentation to be a cue.

Fully agreed.
But there are several (conflicting) styles of indentation that are
considered 'good', and a programmer should be able to understand
code that uses a reasonable indentation style. It makes no sense
to convince oneself that a reasonable style is "unreadable" simply
because it's not one's preferred style.

Similarly, reversing the order of the comparands in an equality or inequality
comparison, even though in theory it changes nothing semantically, can cause
readers to misunderstand code.
I write for humans, not compilers. Compilers aren't subject to assumptions,
or to difficulty keeping track of the code.

Humans, you can't please them all ;-)

Vladimir Jovic · Mar 5, 2010

Casper said:
And for other types the compiler or lint will also create a
diagnostic.

(4) warning: assignment operator "=" found where "==" was expected
(4) warning: constant operand to op: "!"

Cool. I didn't know I would get a warning.

Seebs · Mar 5, 2010

To the mathematically inclined reader, yes.

I disagree. I was raised by mathematicians, but I view statements and
expressions as often being written to communicate additional meaning.

It doesn't matter, but anybody can fool themselves that it does.

The existence of even a small set of people to whom it matters means
it matters, because code is written for programmers first, and compilers
second.

Fully agreed.
But there are several (conflicting) styles of indentation that are
considered 'good', and a programmer should be able to understand
code that uses a reasonable indentation style. It makes no sense
to convince oneself that a reasonable style is "unreadable" simply
because it's not one's preferred style.

"convince oneself" implies a volitional act taken contrary to evidence
or experience. I don't think that's involved here. I wouldn't quite
call it "unreadable", but it certainly reduces my chances of following
code correctly on the first try. When I see "if (x != y)" in C, I
unconsciously perceive it to be the case that x could vary and y couldn't.

Consider:
for (i = 0; i < 10; ++i)

Why do we write "i < 10" rather than "10 >= i"? Because i's the one that
varies, so "i is less than ten" is more idiomatic than "ten is greater than
or equal to i".

Now consider:
for (i = 0; i < max; ++i)

even though "max" may vary over time, the assumption is that, for this loop,
i changes and max doesn't. If someone wrote this loop, then altered max
within the loop while modifying i to keep it constant, it would be completely
incoherent.

So, now...
for (l = head; l != NULL; l = l->next)

Clearly, this follows the same idiom. If we flip the components of the
middle expression, we've suddenly gone off the standard idiom for the
condition in a for loop, and the reader is justifiably surprised.

And if the for loop should have "l != NULL" rather than "NULL != l" (and
it should), then so should an if statement, for consistency.

The time when that technique caught something compilers wouldn't catch
is long gone. I don't think it's needed anymore.

-s

Nick · Mar 5, 2010

James Harris said:
Or, more Yoda-esque: If wrong you are, eat my hat I will.

But actually, thinking about it later (sad that I am) to make the 7==a
analogy better it should be "If wrong are you".

Keith Thompson · Mar 5, 2010

William Ahern said:
But it does mean equality. In fact, it commands it.

Not for volatile objects or NaNs.

Stefan Ram · Mar 5, 2010

Keith Thompson said:
Not for volatile objects or NaNs.

Not even for some other usages:

#include <stdio.h>
#include <limits.h>

int main( void )
{ char c = INT_MAX;
printf( "%d\n", c == INT_MAX ); }

»=« is a write operation.

ASCII 1963 had a left arrow symbol »<--« with code 101 1111.
But this does not exist in ASCII 1968.

Willem · Mar 5, 2010

Richard Heathfield wrote:
) In English, there is a distinction, albeit a subtle one, between "is
) Fred the man who shot my chicken?" and "is the man who shot my chicken
) Fred?" It's not a semantic distinction, however; rather, it's a
) distinction of tone.

Disagree. The semantic difference is that the first object is the item
that we're interested in, and the second is something we wish to know of
the first.

) In C, however, I can see no distinction.

In C, I see the very same distinction as I mentioned above.
(Okay, so I'm not sure it's a semantic difference, but it is a
difference that carries over to C)

) If I could animate C source, I'd have the operands orbiting the ==
) operator like little (okay, big) electrons. Position is of no
) consequence. But the operands of = would be fixed, because order matters
) for =.

How about: x <- 2 and 2 -> x as assignment operators ?
Then they can orbit as well.

After all, the two operands can be evaluated in any order, it's just
that the operator takes two different arguments (an lvalue and a value).

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Ike Naar · Mar 5, 2010

I disagree. I was raised by mathematicians, but I view statements and
expressions as often being written to communicate additional meaning.

But you must be careful not to assign additional meaning if it was
unintended by the author.

Suppose I want to add a and b, where a and b can have any value.
I don't know (or care) which one of a and b is larger.
Still, when adding them together, I must make a (in this case, arbitrary)
choice between writing (a+b) or (b+a).
Suppose I choose (a+b).
If you read my code, it would be unfortunate if you would conclude, from the
bare fact that I wrote the addition as (a+b), that a is the larger of
the two.

"convince oneself" implies a volitional act taken contrary to evidence
or experience. I don't think that's involved here. I wouldn't quite
call it "unreadable", but it certainly reduces my chances of following
code correctly on the first try. When I see "if (x != y)" in C, I
unconsciously perceive it to be the case that x could vary and y couldn't.

But that perception could be misleading.
Consider a binary search: in this case, both operands can vary:

    left = lowerbound;
    right = upperbound;
    while (left < right)
    {
        inbetween = (left + right) / 2;
        if (some_condition)
            left = inbetween;
        else
            right = inbetween;
    }

Consider:
for (i = 0; i < 10; ++i)

Why do we write "i < 10" rather than "10 >= i"? Because i's the one that
varies, so "i is less than ten" is more idiomatic than "ten is greater than
or equal to i".

Another reason might be that "i < 10" and "10 >= i" are not the same thing.

Seebs · Mar 5, 2010

Why?

Because it's idiomatic, and most of the time, code follows that idiom.

Why?

Idioms don't have to have any reason other than "that's how it's been done
before". It's a communications tool; given a general pattern that it's the
varying part on the left, and the invariant part on the right, that's what
I expect whenever I see a comparison operator.

Why?

Because that's how the majority of code has been written. Why is that? I
don't know. It's probably some combination of the pronunciation ("while
i is less than max" is more idiomatic than "while max is greater than or
equal to i") and the first few C books using it.

Why? Consider: an ordinary reversal loop:

    while(f < h)
    {
        e = a[f];
        a[f++] = a[h];
        a[h--] = e;
    }

Which is the constant now? Should it be f<h, or h>f? (Strictly, they're
not equivalent, but in this case either will do.)

Indeed, in that case, either will do.

But in many cases, there's a clear preference, and even if you don't share
it, you will understand most code better and/or more quickly if you keep
that pattern in mind.

C&V please.

K&R. I don't think you'll find a single test in there which goes the
other way.

Why?

Again, it's an idiom. It doesn't need a reason beyond the observation that
people tend to follow it. There's no objective reason for most social
norms, or linguistic conventions, but once we have them, it's useful to
use them to communicate -- both to be aware that other people may be using
them, and to use them ourselves to make communication easier.

Even though it may not seem like much, in a complicated loop or set of
nested loops, having all the conditions follow a consistent idiom makes
it much easier to follow and comprehend code. I'm not sure that which
idiom was picked would have mattered -- at this point, though, I've
seen thousands of loops with "p != NULL" as a condition, and extremely
few with "NULL != p", and similarly, thousands of "i < limit" and very
few "limit >= i", so when I see a condition, I read it that way first,
and only try something else if that works badly.

90% of the time, the heuristic is right, so I stick with it, and I
encourage other people to use it, because it's a very valuable tool.

It's the same reason I advocate "char *x" rather than "char* x" or
"char * x" or "char\n*\nx" as a declaration -- it's a convention and it
seems to generally help me and other readers understand the code. Maybe
it's not helpful for everyone, but I simply haven't seen it cause any
problems in living memory.

-s

Keith Thompson · Mar 5, 2010

Vladimir Jovic said:
Cool. I didn't know I would get a warning.

It depends on the compiler and on how you invoke it. The language
doesn't require warnings in these cases.

Nick Keighley · Mar 5, 2010

Rubbish. You don't say "if that chicken is an animal then I'll eat it",
you say "If that animal is a chicken then I'll eat it".

"is a" isn't the same as equality. It's more like "is a subset of"

Rainer Weikusat · Mar 5, 2010

Not even for some other usages:

#include <stdio.h>
#include <limits.h>

int main( void )
{ char c = INT_MAX;
printf( "%d\n", c == INT_MAX ); }

»=« is a write operation.

Of course it does. You are just playing games with automatic
conversions in the hope of confusing less knowledgeable readers.
The line

char c = INT_MAX

has implementation defined behaviour (although the gcc warning
'overflow in implicit constant conversion' suggests that the
people-with-an-axe-to-grind who already managed to convert their
pointless political statement about "proper integers" to code in
various other places are still seeking for a way around the
requirement to support the traditional, sane and useful behaviour
....). If the result is not that 'an implementation defined signal is
raised', an 'implementation defined automatic conversion' is supposed
to happen. Assuming that no such signal was raised and that
sizeof(int) > 1, INT_MAX cannot be converted to char without 'loss of
information'. The expression

c == INT_MAX

causes the value stored in c to be converted back to int. Of course,
after

char c = (char)INT_MAX

was performed,

(int)c == INT_MAX

cannot be true if sizeof(int) > 1. The correct comparison would thus be

c == (char)INT_MAX

Ersek, Laszlo · Mar 5, 2010

Richard Heathfield said:
Seebs wrote:

Why?

Because he pronounces it as "x is not equal to y", and the subject of
that sentence is "x". "x" is the actor, the variable that is acting. "y"
is part of the prepositional phrase, it is static.

Good question.

.... It's either me, or now two of you not noticing (or ignoring) that
"10 == i" satisfies the second but not the first.

Why?

Because programmatically, "i" changes over time, 10 does not, and when
one reads out loud the controlling expression in English, the subject of
that sentence ("the actor") should be the entity that is acting.

Why?

Because when read out loud, "i" is the subject.

You'd be amazed at the antiquity of some compilers. At one recent site,
I was somewhat surprised to find an entire project team still using
MSC5.00a (and they seemed perfectly contented, too).

The following version of gcc:

gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

displays no warning for the following:

$ gcc -Wall -Wextra -ansi -pedantic -fsyntax-only try.c

----v----
static int
f(int a)
{
    return a = 3;
}

static int
g(int a)
{
    return (a = 3);
}
----^----

(Yes, we could fix that by taking "a" as const.)

Another example, showing that extra parentheses can shut up gcc:

----v----
int f(void);

static void
g(void)
{
    int a;

    a = f();
    if ((a = 2) || (a = 3)) {
        /* ... */
    }
}
----^----

This produces no warnings either. Although the parentheses are not
needed for the intended meaning, that is, for

a == 2 || a == 3

some people might find the parenthesized version more readable, and when
they combine that with typing assignment instead of equality, then (this
version of) gcc won't warn them.

Cheers,
lacos

Tim Rentsch · Mar 5, 2010

Keith Thompson said:
Ian Collins said:

Tim said:

Anand Hariharan wrote:
<snip>
Haven't seen anyone point this out:

Rather than -

#define MAXNUMFILES 1024

- prefer -

const int MaxNumFiles = 1024;

That way your preprocessor won't do as much damage.
Fine in C99, I think, but an issue in C90 if he's using it to define
an array size.
It's a problem in C99 too, if the array is defined at file scope or it
has internal linkage. There are other reasons why it's not a great
idea in C99. They stem from the fact that MaxNumFiles is not
permitted as part of a constant expression. [snip elaboration]

Minor clarification -- MaxNumFiles is _permitted_ as part of a constant
expression, albeit an implementation-specific constant expression;
it just isn't _required_ to be a portable constant expression.

What? You could say just about any nonsense is permitted as part of
an implementation-specific expression.

That doesn't alter the fact that in C90 or C99, MaxNumFiles is not
permitted as part of a constant expression.

I think Tim is referring to C99 6.6p10:

An implementation may accept other forms of constant expressions.

Quite so.

(I just noticed that this doesn't use the term
"implementation-defined" implying, I think, that an implementation
can accept other forms of constant expressions but isn't required
to document them.)

My belief is that such forms of constant expressions still count
as language extensions. If they are they must be documented,
because extensions are required to be documented.

My understanding is that code that uses extensions in general (as
permitted by C99 4p6) still requires diagnostics if the code violates a
constraint, but code that uses "other forms of constant expressions"
does not.

Yes, diagnostics are still required for using any extension that
involves a syntax error or a constraint violation, but not for those that
don't, and that also includes "other forms of constant expressions".
(In other words my assessment here agrees with Keith's.)

I'd say Tim's statement is correct, but I'd place the emphasis very
differently: MaxNumFiles is not permitted as part of a constant
expression (unless the implementation permits it under C99 6.6p10).

What I was trying to express is that this form of CE is allowed
in the sense of not being forbidden. Some forms of CE are
"forbidden" in the sense that they would be constraint violations
even if they were language extensions, although of course they
could be accepted if a diagnostic were produced. So in saying
MaxNumFiles is permitted as a CE, what I meant was it can be
accepted by an implementation without requiring any diagnostic
(with the additional qualifier, already explained, that there is
no portable requirement that it be accepted).

Are there any real-world C implementations that take advantage of this
permission? Specifically, is there a C compiler that accepts
additional forms of constant expressions in (what it claims to be)
conforming mode?

Yes, I believe (some versions of?) gcc to be in this category.

And why is that permission there in the first place? What benefit
does it really provide beyond the existing permission to provide
extensions?

It seems clear that the point is to allow additional forms of
constant expression without absolutely insisting on generating a
diagnostic; in other words to leave the question of diagnostics
up to the discretion of the implementation. Without 6.6p10 any
other forms of constant expression wouldn't meet the Standard's
definition, and if used in places that need constant expressions
would cause constraint violations.

Ben Bacarisse · Mar 5, 2010

... It's either me, or now two of you not noticing (or ignoring) that
"10 == i" satisfies the second but not the first.

I assumed that the "good question" reply was intended to draw Seebs's
attention to that fact; otherwise I'd have pointed it out. Looking
at it now, with all of Richard's "why" questions, it seems possible
that the reply was not meant the way I assumed it was.

Either way, it's not really at the heart of the question unless you
think the switch *caused* the typo.

<snip>
