You're appointed as Portability Advisor

Tomás Ó hÉilidhe · Feb 15, 2008

We frequently discuss portability here. I think most of us would agree
that for a program to be portable, the following two criteria must be
met:
1) None of its behaviour is undefined by the Standard.
2) Any behaviour which is unspecified or implementation-defined does not
interfere with the intended behaviour of the program.

When laid out in black and white like that, these rules are quite clear.

However, let's consider this: Let's say you're appointed as the
Portability Advisor for a multi-national company that makes billions of
dollars each year. They pay you $500,000 a year, they have you working 30
hours a week and they give you 60 days paid holiday leave per year. They
don't block newsgroups, and their firewall only blocks the most
offensive of sites. They even get a Santa for the kids at the Christmas
party.

Your job is to screen the code that other programmers in the company
write. Every couple of days there's a fresh upload of code to the
network drive, and your job is to scan thru the code and point out and
alter anything that's not portable. Of course tho, you're given a
context in which to judge the code, for instance:
a) This code must run on everything from a hedge-trimmer to an iPod, to
a Playstation 3,
b) This code must run on all the well-known Desktop PC's

Depending on the context, you judge some code harsher than others. For
instance, in context B, you might allow assumptions that there's an even
number of bits in a byte, also that integer types don't contain padding.
While in context A, you might fire that code right back if you see such
assumptions.

So... it's Thursday morning, you sit down to your desk with a hot cup of
tea and a fig-roll bar, you check your mail. You surf the web for a
couple of minutes, perhaps check the latest scores in the election, or
look up where you can get a new electric-window switch for your car
since it mysteriously stopped working this morning.

You get down to it. You open up the network drive and navigate to James
Weir's source file. Its context is "run on anything". You're looking
thru it and you come to the following section of code:

typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually
*/

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
return cf.bytes[3] & 0x3u;
}

You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure". You have a second suspicion that
perhaps James might have made assumptions about the size of "unsigned",
but inspecting the code you find that he hasn't.

Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard, are
you really going to reject this code?

You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?

What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world? Are we
really going to reject some code for a reason that we see as stupidly
restrictive in the language's definition?

Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule. In both these cases I've mentioned,
I don't think anything can go wrong, not naturally anyway. What _can_
cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation
2) Deliberately putting in checks (such as terminating the program when
it thinks you're going to access memory out-of-bounds).

The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended. But, of course, it's C we're talking about, where
the current standard is from 1989 and where still not too many people
are paying attention to the 1999 standard that came out nine years ago.

So, I wonder, what can we do? If there was a consensus between many of
the world's most skilled and experienced C programmers that a certain
rule in the Standard were unnecessarily rigid, would it not be worth the
compiler vendors' while to listen? Here at comp.lang.c, there are,
without exaggeration, some of the world's best C programmers. Instead of
contacting each and every compiler vendor to let them know that we'd
prefer to optimise-away assignments to union members, would it be
convenient, both for the programmers and the compiler vendors, to have a
single place to go to to read what the world's best programmers think?
Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?

Two such rules I myself would put on the list are:
1) Accessing different union members
2) De-referencing a pointer to an array

Tomás Ó hÉilidhe · Feb 15, 2008

Tomás Ó hÉilidhe:

typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually

I meant for that to come out as:

typedef union ConfigFlags {
unsigned entire;
char unsigned bytes[sizeof(unsigned)];
} ConfigFlags;

Malcolm McLean · Feb 15, 2008

Tomás Ó hÉilidhe said:
We frequently discuss portability here. I think most of us would agree
that for a program to be portable, the following two criteria must be
met:
1) None of its behaviour is undefined by the Standard.
2) Any behaviour which is unspecified or implementation-defined does not
interfere with the intended behaviour of the program.

There's such a thing as code which is "reasonably portable". The standard
has to guess about what sort of hardware will be available in the future, as
well as what relics of the past will still be around in a few years' time
and which can be ignored.
Then C cannot go the Java route of mandating portability at the cost of
runtime. Even Java broke its own rules because floating point arithmetic was
too slow to standardise.

However as portability expert your job is to be strict. For instance I
thought that, surely, slash slash comments were standard by now. No, my MPI
compiler won't accept them.

Keith Thompson · Feb 15, 2008

Tomás Ó hÉilidhe said:
Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?

[...]

If I haven't been reading comp.lang.c in the last few days, I spend a
few moments wondering what the heck that code is trying to do. Then I
step through it, and once I figure out what it does, I wonder why the
author wrote it that way, especially when there's a clearer and
unambiguously legal way to do the same thing:

double const *const pend = tangents + 5;

Of course 5 is a magic number, so either a named constant should be
used both for the array length and for the offset, or a macro should
be used to compute the length. For example:

#include <stdio.h>

#define ARRAY_LENGTH(a) (sizeof (a) / sizeof *(a))

int main(void)
{
    double tangents[] = { 1.2, 2.3, 3.4, 4.5, 5.6 };
    double const *const pbegin = tangents; /* just for symmetry */
    double const *const pend = tangents + ARRAY_LENGTH(tangents);
    double const *iter;

    for (iter = pbegin; iter < pend; iter++) {
        printf("%g\n", *iter);
    }
    return 0;
}

Note that a very similar approach can be used when we don't have a
declared array object, but just a pointer to its first element and its
length. There's no good way to apply the ``*(&tangents + 1)''
approach if you don't have the array object itself.

#include <stdio.h>

#define ARRAY_LENGTH(a) (sizeof (a) / sizeof *(a))

void show_elements(double *array, size_t count)
{
    double const *const pbegin = array;
    double const *const pend = array + count;
    double const *iter;

    for (iter = pbegin; iter < pend; iter++) {
        printf("%g\n", *iter);
    }
}

int main(void)
{
    double tangents[] = { 1.2, 2.3, 3.4, 4.5, 5.6 };
    show_elements(tangents, ARRAY_LENGTH(tangents));
    return 0;
}

Even leaving portability concerns aside, I find the latter approach
easier to read, easier to use, and easier to think about.

Richard Heathfield · Feb 15, 2008

Tomás Ó hÉilidhe said:

You get down to it. You open up the network drive and navigate to James
Weir's source file. Its context is "run on anything". You're looking
thru it and you come to the following section of code:

typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually
*/

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
return cf.bytes[3] & 0x3u;
}

This won't even run on MS-DOS, let alone "anything".

You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure".

No, I look at this code and I think, "someone has assumed that unsigned
ints are at least four bytes wide", which isn't true on typical MS-DOS
systems (yes, they're still used, believe it or not), isn't true on
various DSPs, and came within a gnat's whisker of being true on at least
one Cray.

You have a second suspicion that
perhaps James might have made assumptions about the size of "unsigned",
but inspecting the code you find that he hasn't.

Yes, he has.

Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard, are
you really going to reject this code?

Absolutely, yes.

You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?

No, I'm sitting there 100% aware that unsigned ints need not be four bytes
wide.

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants.

No, I sit here thinking "why is he being so dumb as to dereference an
object that does not exist?".

Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?

Yes, of course. That's what they pay me for, right? "your job is to scan
thru the code and point out and alter anything that's not portable" - and
my yardstick for portability is the C Standard.

What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world?

No, we're really going to be so acute as to reject this code in the real
world.
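
For illustration only, here is a minimal sketch of how the width assumption could at least be made explicit, so that the code refuses to compile where `unsigned` has fewer than four bytes. The typedef name `unsigned_has_four_bytes` and the lower-case function name are hypothetical and are not from James's file.

```c
typedef union ConfigFlags {
    unsigned entire;
    unsigned char bytes[sizeof(unsigned)];
} ConfigFlags;

/* C89-style compile-time check: the array size becomes -1, and the
   translation unit is rejected, when unsigned has fewer than 4 bytes. */
typedef char unsigned_has_four_bytes[(sizeof(unsigned) >= 4) ? 1 : -1];

/* Same test as the original function; bytes[3] is now known to exist.
   Which value bits of 'entire' it overlays still depends on byte order. */
int is_remote_admin_enabled(ConfigFlags cf)
{
    return cf.bytes[3] & 0x3u;
}
```

This does not make the code portable to "anything"; it only turns a silent out-of-bounds read into a build failure on the platforms where the assumption is false.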

Richard Heathfield · Feb 15, 2008

Richard Heathfield said:

No, I look at this code and I think, "someone has assumed that unsigned
ints are at least four bytes wide", which isn't true on typical MS-DOS
systems (yes, they're still used, believe it or not), isn't true on
various DSPs, and came within a gnat's whisker of *not*
being true on at least one Cray.

(The Cray implementation in question very nearly defined CHAR_BIT as 64,
but they changed their mind, apparently quite late in the game. Had they
not changed their mind, sizeof(unsigned) would have been 1 on that
implementation.)
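
A rough sketch of how a program can ask the implementation about these quantities instead of assuming them. The function names are made up for the example; `CHAR_BIT` and `UINT_MAX` come from `<limits.h>`.

```c
#include <limits.h>

/* Count the value bits of unsigned int by shifting UINT_MAX down to zero.
   Unlike sizeof(unsigned) * CHAR_BIT, this ignores any padding bits. */
unsigned unsigned_value_bits(void)
{
    unsigned max = UINT_MAX;
    unsigned bits = 0;

    while (max != 0) {
        max >>= 1;
        bits++;
    }
    return bits;
}

/* Storage size of unsigned int in bits, padding included. */
unsigned unsigned_storage_bits(void)
{
    return (unsigned)(sizeof(unsigned) * CHAR_BIT);
}
```

On an implementation with CHAR_BIT equal to 64 and a one-byte `unsigned`, the second function would still report 64 bits even though `sizeof(unsigned)` is 1, which is why indexing `bytes[3]` is the wrong thing to rely on.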

Kaz Kylheku · Feb 15, 2008

We frequently discuss portability here. I think most of us would agree
that for a program to be portable, the following two criteria must be
met:
1) None of its behaviour is undefined by the Standard.

Depends on which standard you choose. According to the C one,

#include <fcntl.h>

is undefined behavior. As a portability advisor to real people working
in real trenches, you have to allow for extensions.

What we want to aim for is not avoiding extensions, but rather
/knowing/ what the documented extensions are and using them
deliberately.

You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?

You can write to member A and read through member B, if member B is a
character type that treats the thing as an array of bytes.

I would simply remove the member "entire" from the union, and change
it to a struct. Then see what breaks when you recompile the program,
and fix those occurrences. The struct itself gives you the entire
thing. You can define objects of that struct type, pass them to
functions, return them, assign them, etc.
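
The character-type rule mentioned above can be sketched without a union at all. This is only an illustration, with hypothetical helper names: `memcpy` copies the object representation of an `unsigned` into an array of `unsigned char` and back.

```c
#include <string.h>

/* Copy the object representation of an unsigned into a byte array. */
void unsigned_to_bytes(unsigned value, unsigned char out[sizeof(unsigned)])
{
    memcpy(out, &value, sizeof value);
}

/* Rebuild the unsigned from bytes produced by unsigned_to_bytes(). */
unsigned bytes_to_unsigned(const unsigned char in[sizeof(unsigned)])
{
    unsigned value;

    memcpy(&value, in, sizeof value);
    return value;
}
```

The round trip is safe on the same implementation; what each individual byte contains is still a matter of representation, so the bytes are not something to test flags against portably.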

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants.

This can simply be edited to:

double *pend = tangents + sizeof tangents / sizeof tangents[0];

Or maybe the size should be a manifest constant somewhere:

double tangents[vector_size];
const double *const pend = &tangents[vector_size];

*(&tangents + 1) stops being neat once you get out of programming
puberty. Stuff like that looked neat when I was learning C for the
first time. It's not neat; it's just cryptic B. S.

99% of the C programmers out there have probably never seen an array
type manipulated as an array type---addresses being taken to make
pointer-to-array types, etc. Their heads will do a double, triple or
even quadruple ``take'' when they see that expression.

Are you, as the Portability Advisor, going to reject this code?

Of course! There is no need for it to be doing what it's trying to do
by that means. There is no need to rely on an undocumented extension
of behavior to get the desired effect.

What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world?

Absolutely. At /least/ that obtuse, if not way more.

And don't forget that we have a $500K salary as portability advisor,
and so we must scramble for every little thing that can add a line or
two to our weekly status report, and that can make us look very sharp
and justify our job position in the eyes of senior management.

There is always that!

Are we
really going to reject some code for a reason that we see as stupidly
restrictive in the language's definition?

It's not stupidly restrictive when you can rewrite the expression in a
way that will not surprise most of the programmers out there, and that
doesn't break any rules.

A restriction is something that actually gets in your way; it makes
something impossible to do at all, or maybe only with an
unsatisfactory workaround.

Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule. In both these cases I've mentioned,
I don't think anything can go wrong, not naturally anyway. What _can_
cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation
2) Deliberately putting in checks (such as terminating the program when
it thinks you're going to access memory out-of-bounds).

There is

3) Wasting people's time when they have to scratch their heads about
what *(&array + 1) actually means and whether or not it's right.

Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?

Two such rules I myself would put on the list are:
1) Accessing different union members

Definitely not; count me out from your webpage.

The purpose of a union is to save space in implementing polymorphism.
If you store member X, you read member X.

Type punning is inherently nonportable. It's not enough to say that
type punning is allowed through members of a union, but undefined
elsewhere. To define its behavior, it's not enough to simply permit
some action. The outcome of the action must be specified. And you
cannot do that because it's totally nonportable.

At best you could say that if a member Y is accessed after member X is
stored, then there shall be no aliasing problem: Y will be
reconstituted out of the bits that were actually stored through X.
However, that's not anywhere nearly complete a definition of behavior
to be practically useful. That kind of thing belongs in the
architecture-specific pages of a compiler reference manual, not in the
language.

2) De-referencing a pointer to an array

But that is allowed. I think you mean, dereferencing a pointer to one
element past the end of an array-of-array object, where it's not
pointing to any array.

I tend to agree with this.

That is to say, the address-of operator could have some additional
semantic rules, along these lines:

When the operand of the address-of operator is a pointer-dereferencing
expression based on the unary * operator, the two operators effectively
cancel each other out, so that &*(E) is equivalent to (E), provided
that (E) is a valid pointer: either a pointer to an object, a null
pointer, or a pointer one element past the end of an array object.

I can't think of a way of allowing (E)->member or (*(E)).member to be
defined when E is null, or otherwise valid but not pointing to an
object. Is there a use for this other than implementing offsetof,
which is already done for you?
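
For reference, a small sketch of the "already done for you" route: the standard `offsetof` macro from `<stddef.h>` gives the member offset without anyone dereferencing a null pointer by hand. The struct and function names here are invented for the example.

```c
#include <stddef.h>

struct config_record {
    unsigned flags;
    double tangents[5];
};

/* Byte offset of the tangents member, computed by the standard macro
   rather than by writing &((struct config_record *)0)->tangents. */
size_t tangents_offset(void)
{
    return offsetof(struct config_record, tangents);
}
```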

Ark Khasin · Feb 15, 2008

Tomás Ó hÉilidhe said:
We frequently discuss portability here. I think most of us would agree
that for a program to be portable, the following two criteria must be
met:
1) None of its behaviour is undefined by the Standard.
2) Any behaviour which is unspecified or implementation-defined does not
interfere with the intended behaviour of the program.

When laid out in black and white like that, these rules are quite clear.

However, let's consider this: Let's say you're appointed as the
Portability Advisor for a multi-national company that makes billions of
dollars each year. They pay you $500,000 a year, they have you working 30
hours a week and they give you 60 days paid holiday leave per year. They
don't block newsgroups, and their firewall only blocks the most
offensive of sites. They even get a Santa for the kids at the Christmas
party.

Your job is to screen the code that other programmers in the company
write. Every couple of days there's a fresh upload of code to the
network drive, and your job is to scan thru the code and point out and
alter anything that's not portable. Of course tho, you're given a
context in which to judge the code, for instance:
a) This code must run on everything from a hedge-trimmer to an iPod, to
a Playstation 3,
b) This code must run on all the well-known Desktop PC's

Depending on the context, you judge some code harsher than others. For
instance, in context B, you might allow assumptions that there's an even
number of bits in a byte, also that integer types don't contain padding.
While in context A, you might fire that code right back if you see such
assumptions.

So... it's Thursday morning, you sit down to your desk with a hot cup of
tea and a fig-roll bar, you check your mail. You surf the web for a
couple of minutes, perhaps check the latest scores in the election, or
look up where you can get a new electric-window switch for your car
since it mysteriously stopped working this morning.

You get down to it. You open up the network drive and navigate to James
Weir's source file. Its context is "run on anything". You're looking
thru it and you come to the following section of code:

typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually
*/

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
return cf.bytes[3] & 0x3u;
}

You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure". You have a second suspicion that
perhaps James might have made assumptions about the size of "unsigned",
but inspecting the code you find that he hasn't.

Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard, are
you really going to reject this code?

You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?

IMHO, the only reason to write portable code is to count on it being
used on various platforms and maintained either centrally or
collaboratively. Such code must be optimized for clarity and avoid any
discernible window of misunderstanding.

I've seen passages e.g. like this:
typedef struct foo {.......} foo;
foo *foo; ........; sizeof(foo);
I don't even bother to learn whether or how the (a?) standard (or a
dialect) resolves foo for the purpose of sizeof. Like in a spoken
language, there are many more correct constructs that make no or
ambiguous sense to us mortals.

To that end, MISRA is a great effort to define what intelligent C coders
may say in a polite society. IMHO, it's an overkill in many respects but
it's a great starting point.

In your union example, the result obviously depends on machine
endianness. Depending on the exact access patterns this dependency may
cancel itself out, but it's an unreasonable burden on the maintainer to
verify it throughout. Thus the code like this should be banned from the
portable club.
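
A sketch of what an endianness-independent version might look like, assuming the flag is defined in terms of value bits rather than memory bytes. The bit position (24) and the names are assumptions made for the example, not something taken from the original code.

```c
/* Flag held in value bits 24 and 25. unsigned long is used because it is
   guaranteed to have at least 32 value bits, which unsigned int is not. */
#define REMOTE_ADMIN_SHIFT 24
#define REMOTE_ADMIN_MASK  0x3UL

/* Extract the flag by shifting the value; byte order never enters into it. */
int remote_admin_level(unsigned long flags)
{
    return (int)((flags >> REMOTE_ADMIN_SHIFT) & REMOTE_ADMIN_MASK);
}

/* For comparison: report whether the lowest-addressed byte of an
   unsigned long holds its least significant bits on this machine. */
int low_byte_comes_first(void)
{
    unsigned long probe = 1UL;

    return *(unsigned char *)&probe == 1;
}
```

The first function gives the same answer everywhere for the same value; the result of the second one is exactly the kind of thing the maintainer would otherwise have to keep in mind at every access through the union.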

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?

What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world? Are we
really going to reject some code for a reason that we see as stupidly
restrictive in the language's definition?

Yes! Respectable people here in a yesterday's thread read the language
rules on this very subject differently. It means that the behavior is
not crystal clear. [And it might not be crystal clear to the compiler
writers either.]

Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule. In both these cases I've mentioned,
I don't think anything can go wrong, not naturally anyway. What _can_
cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation
2) Deliberately putting in checks (such as terminating the program when
it thinks you're going to access memory out-of-bounds).

The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended. But, of course, it's C we're talking about, where
the current standard is from 1989 and where still not too many people
are paying attention to the 1999 standard that came out nine years ago.

So, I wonder, what can we do? If there was a consensus between many of
the world's most skilled and experienced C programmers that a certain
rule in the Standard were unnecessarily rigid, would it not be worth the
compiler vendors' while to listen? Here at comp.lang.c, there are,
without exaggeration, some of the world's best C programmers. Instead of
contacting each and every compiler vendor to let them know that we'd
prefer to optimise-away assignments to union members, would it be
convenient, both for the programmers and the compiler vendors, to have a
single place to go to to read what the world's best programmers think?
Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?

Two such rules I myself would put on the list are:
1) Accessing different union members
2) De-referencing a pointer to an array

Some people cobble together a compiler for, I'd say, a C-inspired
language just to sell their chip. If your portability requirements
include trimmers and such, you may step into this swamp. The standard
may not apply very well there.

If we favor simple constructs over "look what I can do!" we may get not
only more portable but also more efficient code, because the compiler's
optimizer may recognize more idioms. E.g. a dumb rotation of an unsigned
32-bit `a' left by `n' bits, (a<<n)|(a>>(32-n)) is translated by ARM ADS
into one rotate instruction; any attempt to get clever produces worse code.
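
One caveat with the plain expression: when n is 0 it shifts right by 32, which is undefined for a 32-bit operand. A minimal sketch that keeps the simple shape while reducing both shift counts modulo 32 follows; it assumes the implementation provides `uint32_t`, and the name `rotl32` is just for illustration. What code any given compiler generates for it is not something this sketch claims.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. Both shift counts are reduced
   modulo 32, so n == 0 (and n == 32) never produce a shift by 32. */
uint32_t rotl32(uint32_t a, unsigned n)
{
    n &= 31u;
    return (uint32_t)((a << n) | (a >> ((32u - n) & 31u)));
}
```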

OTOH, if you have a choice, you shop for a good compiler first; that
depends on how many "look what I can do!" you need to keep in the code.

Just my $0.02F

Ben Bacarisse · Feb 15, 2008

Tomás Ó hÉilidhe said:
Your job is to screen the code that other programmers in the company
write.

You're looking
thru it and you come to the following section of code:

typedef union ConfigFlags {
unsigned entire; /* Write to all bytes
char unsigned bytes[sizeof(unsigned)]; at once or access
} ConfigFlags; them individually
*/

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
return cf.bytes[3] & 0x3u;
}

You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the structure". You have a second suspicion that
perhaps James might have made assumptions about the size of "unsigned",
but inspecting the code you find that he hasn't.

Already pointed out that the code does assume that sizeof(unsigned) >= 4.

Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard, are
you really going to reject this code?

You're sitting there 100% aware that the Standard explicitly forbids you
to write to member A of a union and then read from member B, but how
much do you care?

It does not. Not as far as I can see, anyway. It says the result is
"unspecified" with an informative footnote to tell us what range of
unspecified behaviour to expect. Basically we are referred to the
Representation of Types section so, as Portability Tsar, I am quite
happy with the type punning aspect of the code...

*But* I'd reject it, even if we could assume that sizeof(unsigned) >=
4 because one part of that unspecified behaviour is that the
resulting configuration file won't move between systems. If the
config file is on an NFS server, the bits in cf.bytes[3] will depend
on the target architecture the program was compiled for.

This is a penalty that /might/ be worth paying, but not if the
alternative is as simple as writing a 4-byte array.
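
Roughly what the 4-byte-array alternative could look like, as a sketch under the assumption that the flags fit in 32 bits and that the file format fixes the octet order (most significant first here). The function names are hypothetical.

```c
/* Write the low 32 bits of value into out[0..3], most significant octet
   first, so the stored form does not depend on the host's byte order. */
void store_flags(unsigned long value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Read the four octets back into a value, in the same fixed order. */
unsigned long load_flags(const unsigned char in[4])
{
    return ((unsigned long)in[0] << 24)
         | ((unsigned long)in[1] << 16)
         | ((unsigned long)in[2] << 8)
         |  (unsigned long)in[3];
}
```

A config file written this way reads back the same on whatever architecture mounts the NFS share.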

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants. Again, you know that the
Standard in all its marvelous rigidity doesn't want you to dereference
that pointer, but are you bothered? Are you, as the Portability Advisor,
going to reject this code?

Yes. This is a clear-cut case. The behaviour is undefined by the
standard. I have posted opinions that suggest I'd like it not to be,
and I am not 100% persuaded that there was any practical reason for
making it so -- but it is. As we speak, compiler writers are tuning
their optimisers, safe in the knowledge that they can do anything they
like with this code. I would not want my product to be in their
hands.

Again, if the payoff is huge, and the alternatives costly, then a case
could be made, but there are too many alternatives here. At the very
least, (void *)(&tangents + 1) has a clearer meaning than above and is
well-defined.
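
To make the comparison concrete, a small sketch with the array at file scope; the names and the `TANGENT_COUNT` constant are invented for the example. The second function forms `&tangents + 1` but only converts it and never applies `*` to it; that it yields the same address as the first is what one would expect in practice, offered here as an illustration rather than a ruling on the Standard.

```c
#define TANGENT_COUNT 5

static double tangents[TANGENT_COUNT];

/* One-past-the-end pointer computed from the element count. */
const double *tangents_end_by_count(void)
{
    return tangents + TANGENT_COUNT;
}

/* The same address reached through pointer-to-array arithmetic.
   &tangents + 1 is formed and converted, but never dereferenced. */
const double *tangents_end_by_array_pointer(void)
{
    return (const double *)(void *)(&tangents + 1);
}
```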

What I'm trying to get across is, that, while we may discuss in black
and white what the Standard permits and what it forbids... are we really
going to be so obtuse as to reject this code in the real world?

How is it obtuse to stick to the standard where practical? Both your
examples have potential risks attached and few benefits. There is
nothing obtuse about avoiding these risks.

Are we
really going to reject some code for a reason that we see as stupidly
restrictive in the language's definition?

Perhaps it might be useful to point out what exactly can go wrong when
we're treading on a particular rule. In both these cases I've mentioned,
I don't think anything can go wrong, not naturally anyway. What _can_
cause problems tho is aspects of the compiler:
1) Over-zealous with its optimisation

I don't think it can be over-zealous if it does not break a correct
program. This is the whole point. If you stick to the letter of the
law you can't be banged up!

2) Deliberately putting in checks (such as terminating the program when
it thinks you're going to access memory out-of-bounds).

The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended. But, of course, it's C we're talking about, where
the current standard is from 1989 and where still not too many people
are paying attention to the 1999 standard that came out nine years ago.

So, I wonder, what can we do? If there was a consensus between many of
the world's most skilled and experienced C programmers that a certain
rule in the Standard were unnecessarily rigid, would it not be worth the
compiler vendors' while to listen? Here at comp.lang.c, there are,
without exaggeration, some of the world's best C programmers. Instead of
contacting each and every compiler vendor to let them know that we'd
prefer to optimise-away assignments to union members, would it be
convenient, both for the programmers and the compiler vendors, to have a
single place to go to to read what the world's best programmers think?
Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?

Two such rules I myself would put on the list are:
1) Accessing different union members

OK as it stands, I think, but often a portability nightmare for
practical reasons due to differing representations.

2) De-referencing a pointer to an array

You mean de-referencing a "one past the end" array pointer (for want
of more felicitous wording). I'd be happy if this was allowed in C0x,
but I'd live with any of the alternatives if it were not.

The best example of how this has happened in the past is the now
sanctioned struct array hack.

Keith Thompson · Feb 15, 2008

Ark Khasin said:
IMHO, the only reason to write portable code is to count on it being
used on various platforms and maintained either centrally or
collaboratively. Such code must be optimized for clarity and avoid any
discernible window of misunderstanding.

IMHO, that's hardly the *only* reason to write portable code. There
are benefits even if the code will never be run on more than one
platform. Most of the time, portable code is simpler and clearer than
code that depends on implementation-specific features. (Not all the
time, just most of the time.)

I've seen passages e.g. like this:
typedef struct foo {.......} foo;
foo *foo; ........; sizeof(foo);
I don't even bother to learn whether or how the (a?) standard (or a
dialect) resolves foo for the purpose of sizeof. Like in a spoken
language, there are many more correct constructs that make no or
ambiguous sense to us mortals.

That particular construct is illegal, unless the typedef is declared
in an outer scope and the object in an inner one. In that case, the
declaration is legal, and the "foo" in sizeof(foo) refers to the
innermost declaration (the object) -- but the object declaration hides
the typedef, which is a lousy idea. For example:

typedef struct foo { struct foo *foo; } foo;
{
    foo *foo; /* legal, alas */
    foo *bar; /* illegal, since the typedef name is hidden */
}

That's just a minor quibble, though; I agree that there are plenty of
things you can legally do that you nevertheless shouldn't. The
correct answer to "What does this code do?" is often "It gets rejected
at the code review.".
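
For anyone who wants to see the hiding in action, here is a minimal sketch. The function names are made up for illustration and are not from the thread. Inside the function the object named foo hides the typedef, so sizeof(foo) measures a pointer rather than the struct.

```c
#include <stddef.h>

typedef struct foo { struct foo *foo; } foo;   /* file-scope typedef */

/* Hypothetical helper: returns what sizeof(foo) yields in a block where
   an object named foo hides the typedef name. */
size_t size_seen_in_inner_scope(void)
{
    foo *foo = NULL;      /* legal: the typedef is still visible in the
                             type specifier; the object name takes over
                             once its declarator is complete */
    return sizeof(foo);   /* the object (a pointer), not the struct type */
}

/* For comparison: the same size spelled without the hidden typedef. */
size_t size_of_struct_pointer(void)
{
    return sizeof(struct foo *);
}
```

Both functions return the same value, which is the size of a pointer to struct foo. Legal, but it would still get bounced at review.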

[snip]

Kenneth Brody · Feb 15, 2008

Richard said:
Tomás Ó hÉilidhe said:

You get down to it. You open up the network drive and navigate to James
Weir's source file. Its context is "run on anything". You're looking
thru it and you come to the following section of code:

typedef union ConfigFlags {
    unsigned entire;                          /* Write to all bytes
    char unsigned bytes[sizeof(unsigned)];       at once or access
} ConfigFlags;                                   them individually
                                              */

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
    return cf.bytes[3] & 0x3u;
}

This won't even run on MS-DOS, let alone "anything".

And we're assuming that the issue about the comment is just a typo
by the OP, and not in "James Weir's source file".

No, I look at this code and I think, "someone has assumed that unsigned
ints are at least four bytes wide", which isn't true on typical MS-DOS
systems (yes, they're still used, believe it or not), isn't true on
various DSPs, and came within a gnat's whisker of being true on at least
one Cray.

And perhaps unsigned ints are 8 bytes on some new 64-bit systems? (At
least the above then doesn't invoke UB, however.)

Yes, he has.

He's also made an assumption about endianness.

Absolutely, yes.

Ditto. Even though every system I happen to work on at the moment
uses 4-byte unsigned ints, some are big-endian and others are little.
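
Both assumptions can be probed at run time. What follows is only a rough sketch with invented function names; it assumes that the value 1 is stored with exactly one non-zero byte, which holds on ordinary implementations without padding bits.

```c
#include <stddef.h>
#include <string.h>

/* Non-zero if an unsigned int has at least four bytes, i.e. if index 3
   lies inside its object representation. */
int unsigned_has_byte3(void)
{
    return sizeof(unsigned) >= 4;
}

/* Index of the byte holding the least significant bits of an unsigned
   int on this implementation: 0 on a little-endian machine,
   sizeof(unsigned) - 1 on a big-endian one. */
size_t index_of_low_byte(void)
{
    unsigned probe = 1u;
    unsigned char bytes[sizeof probe];
    size_t i;

    memcpy(bytes, &probe, sizeof probe);   /* copy out the representation */
    for (i = 0; i < sizeof probe; ++i)
        if (bytes[i] != 0)
            return i;
    return sizeof probe;                   /* not expected for the value 1 */
}
```

Code that only behaves when these functions return particular values is code that depends on the platform, which is exactly the objection to the snippet.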

[...]

No, we're really going to be so acute as to reject this code in the real
world.

CBFalconer · Feb 15, 2008

Tomás Ó hÉilidhe said:
.... snip ...

The first thought I think comes to everyone's mind when we're talking
about these unnecessarily rigid rules, is that the Standard just needs
to be neatly amended. But, of course, it's C we're talking about, where
the current standard is from 1989 and where still not too many people
are paying attention to the 1999 standard that came out nine years ago.

So, I wonder, what can we do? If there was a consensus among many of
the world's most skilled and experienced C programmers that a certain
rule in the Standard was unnecessarily rigid, would it not be worth the
compiler vendors' while to listen? Here at comp.lang.c, there are,
without exaggeration, some of the world's best C programmers. Instead of
contacting each and every compiler vendor to let them know that we'd
prefer to optimise-away assignments to union members, would it be
convenient, both for the programmers and the compiler vendors, to have a
single place to go to to read what the world's best programmers think?
Should we have a webpage that lists the common coding techniques that
skilled programmers use, but which are officially forbidden or "a grey
area" in the Standard?

You're off-topic on c.l.c. This sort of thing belongs on
comp.std.c.

Old Wolf · Feb 15, 2008

Your job is to screen the code that other programmers in the company
write.
a) This code must run on everything from a hedge-trimmer to an iPod, to

typedef union ConfigFlags {
    unsigned entire;                          /* Write to all bytes
    char unsigned bytes[sizeof(unsigned)];       at once or access
} ConfigFlags;                                   them individually

int IsRemoteAdminEnabled(ConfigFlags const cf)
{
    return cf.bytes[3] & 0x3u;
}

You look at this code and you think, "Hmm, this chap plans to write to
'entire' and then subsequently read individual bytes by using the
'bytes' member of the union".

I would think "Hmm, this guy got Tomas/JKop from c.l.c to
write his code". (I've never seen anybody else write 'char
unsigned').

Now, the question is, in the real world, at 10:13am on a sunny Thursday
morning, sitting at your desk with a hot cup of tea, munching away on a
fig-roll bar getting small crumbs between the keys on the keyboard,

What does that have to do with it?

are you really going to reject this code?

Of course you are. On some systems the flag is cf.entire & 3;
on others it's cf.entire & (3 << 24); and on others it's
something else entirely. You'd have to dig through lots more
code to check that this code snippet is correct.
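
One way to sidestep the question is to test the value instead of the representation. The sketch below assumes, purely for illustration, that the flag is meant to occupy bits 24 and 25 of the value; the thread never says where it lives, and the macro and function names are invented. It uses unsigned long because that type is guaranteed to have at least 32 bits, whereas unsigned int is not.

```c
/* Hypothetical position of the flag within the value. */
#define REMOTE_ADMIN_SHIFT 24
#define REMOTE_ADMIN_MASK  0x3UL

/* Returns the two flag bits (0 to 3). Shifts operate on values, so the
   result is the same whatever the byte order or the size of the type. */
int is_remote_admin_enabled(unsigned long flags)
{
    return (int)((flags >> REMOTE_ADMIN_SHIFT) & REMOTE_ADMIN_MASK);
}
```

With this form, is_remote_admin_enabled(0x03000000UL) gives 3 on every conforming implementation, and there is no union to argue about.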

You're sitting there 100% aware that the Standard explicitly
forbids you to write to member A of a union and then read
from member B

Perhaps you are looking for comp.lang.c++

but how much do you care?

If your job is to make the code portable, then you care
about undefined behaviour.

Later on in the code, you come to:

double tangents[5];
...
double *p = tangents;
double const *const pend = *(&tangents + 1);

Again, you look at this code and you think to yourself this really is
quite a neat way of achieving what he wants.

Come on, you wrote this whole post just to keep trying
to rationalize your ridiculous "clever" idea?

Any sane person would think: What the hell is that?,
fire the guy, and fix the code to be readable and correct.
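
For comparison, the conventional spelling of the end pointer might look like the sketch below. ARRAY_LEN and sum_range are names invented for this example. Forming array + N and comparing against it is explicitly permitted, as long as the result is never dereferenced.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* Sums the doubles in the half-open range [p, pend). The end pointer is
   only compared against, never dereferenced. */
double sum_range(double const *p, double const *const pend)
{
    double sum = 0.0;

    while (p != pend)
        sum += *p++;
    return sum;
}
```

Given double tangents[5], the call would be sum_range(tangents, tangents + ARRAY_LEN(tangents)), which nobody at a code review has to look up.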

Richard Heathfield · Feb 15, 2008

Old Wolf said:

I would think "Hmm, this guy got Tomas/JKop from c.l.c to
write his code". (I've never seen anybody else write 'char
unsigned').

What about that chap of Irish descent from a year or two back? Graham
somebody, wasn't it? The one who kept using "domestic" instead of
"canonical"? Hmmm, "domestic", grummage grummage grummage, oh yes, it was
indeed Graham somebody, for certain values of Graham: Frederick Gotham, in
fact. He was a great one for saving up adjectives, and "char unsigned" was
indeed amongst his specialites[1] du 2006.

[1] Insert accents to taste.

Tomás Ó hÉilidhe · Feb 15, 2008

Richard Heathfield:

This won't even run on MS-DOS, let alone "anything".

Wups, that's what I get for rushing an example.

The moral of the original post tho:

Do you think that comp.lang.c should have a page where it lists its
"amendments" to the Standard?

Richard Heathfield · Feb 15, 2008

Tomás Ó hÉilidhe said:

Richard Heathfield:

Wups, that's what I get for rushing an example.

The moral of the original post tho:

Do you think that comp.lang.c should have a page where it lists its
"amendments" to the Standard?

Personally, no, I don't think that's necessary.

The comp.lang.c newsgroup discusses the C language as it is (de jure, de
facto, and de historio(?)), not as clc would like the C language to be.
That's comp.std.c's job, I guess.

Tomás Ó hÉilidhe · Feb 15, 2008

Old Wolf:

(I've never seen anybody else write 'char
unsigned').

Are you somehow arguing that your ignorance of certain programming
styles and techniques renders them deprecated? 99% of C
programmers have never seen anyone use sizeof without it immediately
being followed by parentheses, so are you saying that we should always
put parentheses after sizeof lest we get bullied by the Style Police?

Of course you are. On some systems the flag is cf.entire & 3;
on others it's cf.entire & (3 << 24); and on others it's
something else entirely. You'd have to dig through lots more
code to check that this code snippet is correct.

As has already been pointed out, my snippet was erroneous. The point of
my post tho wasn't solely to do with that snippet.

If your job is to make the code portable, then you care
about undefined behaviour.

It's the first thing you'd care about, I'd imagine.

Come on, you wrote this whole post just to keep trying
to rationalize your ridiculous "clever" idea?

That's one reason, yes. I liked the idea I had for getting the end
pointer of an array, and I could see no reason not to use it other than
politics. More specifically, I couldn't use it just because the
Standard, maybe, might, perhaps, says that I can't.

Tomás Ó hÉilidhe · Feb 15, 2008

Richard Heathfield:

<snip tripe>

Forgetting ad hominem attacks for the moment, (or whatever it is you
(plural) are actually trying to do to cast a derisory eye on my proposal),
what do you think of the prospect of comp.lang.c having a page where it
lists its amendments to the Standard, amendments that are agreed upon by
some of the world's best C programmers? And before someone goes on to post
more tripe suggesting arrogance or pretentiousness on my part by implying
that I expressed, either explicitly or implicitly, that I'm a part of this
group, well I have made and make no such expression.

Tomás Ó hÉilidhe · Feb 15, 2008

Richard Heathfield:

The comp.lang.c newsgroup discusses the C language as it is (de jure, de
facto, and de historio(?)), not as clc would like the C language to be.
That's comp.std.c's job, I guess.

Because some of the world's most skilled and most experienced C programmers
hang around comp.lang.c, do you think it would be reasonable to say that it
might be a bit of an authority on these matters?

I agree with you entirely that it is comp.std.c's job, but I think we both
know that lobbying for changes to the C standard is a lost cause. (Even if
the changes were to come to fruition, nobody would pay attention).

The page I'm proposing would be something like "Stupid stuff in the
Standard that shouldn't be there, and stuff that they left out that should
be there".

Malcolm McLean · Feb 15, 2008

Richard Heathfield said:
(The Cray implementation in question very nearly defined CHAR_BIT as 64,
but they changed their mind, apparently quite late in the game. Had they
not changed their mind, sizeof(unsigned) would have been 1 on that
implementation.)

Come to my arms.
