bitfield confusion

Ian Collins

Joe said:
Exactly. I don't remember whether it was moving code from 68K to
Sparc, or from Sun Sparc Solaris to i386 Linux that did it... but
virtually all my bitfields were broken. I don't think I've used one
since that particular burning.

Which is odd, considering both Solaris and Linux provide bit-field
ordering macros (_BIT_FIELDS_LTOH and __BYTE_ORDER respectively) to
support big and little endian targets.
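
As a rough illustration of the macro approach, here is a minimal sketch of a header choosing the declaration order of two 4-bit fields from a byte-order macro. It assumes GCC or Clang, which predefine `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__` (glibc's `__BYTE_ORDER` from <endian.h> is used the same way), and it assumes the compiler allocates bit-fields low-order-first on little-endian targets, which the standard does not promise. The names `ip_first_octet` and `first_octet_of` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* First octet of an IPv4 header: version in the high nibble, header
 * length (in 32-bit words) in the low nibble.  The declaration order is
 * switched on byte order, on the ASSUMPTION that the compiler allocates
 * bit-fields low-order-first on little-endian targets and
 * high-order-first on big-endian ones.  Using uint8_t as a bit-field
 * type is itself an implementation extension. */
struct ip_first_octet {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    uint8_t ihl : 4;
    uint8_t version : 4;
#else
    uint8_t version : 4;
    uint8_t ihl : 4;
#endif
};

/* Fail the build if the two fields did not pack into a single octet. */
static_assert(sizeof(struct ip_first_octet) == 1,
              "bit-fields did not pack into one octet on this compiler");

/* Return the octet as it is laid out in memory for the given values. */
static uint8_t first_octet_of(unsigned version, unsigned ihl)
{
    struct ip_first_octet f;
    uint8_t raw;

    f.version = (uint8_t)(version & 0x0Fu);
    f.ihl = (uint8_t)(ihl & 0x0Fu);
    memcpy(&raw, &f, sizeof raw);
    return raw;
}
```

Where those assumptions hold, `first_octet_of(4, 5)` should give 0x45, the usual first octet of an IPv4 header. On a compiler that allocates bit-fields the other way round, the same declaration would silently produce 0x54, which is the failure mode the thread is arguing about.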
 
Eric Sosman

Eric said:
On 7/12/2013 5:50 PM, Ian Collins wrote:
[...]
I really don't understand why people get so hung up about bit fields, or
why they'd want to muck about with shifts and masks. That low level
stuff is the compiler's job. It isn't rocket science to determine the
order of bit fields (my day to day platform has preprocessor macros for
this) and to use them correctly and portably.

Bit-fields are useless as a means of mapping an externally-
defined format portably.

Tell that to the writers of most platform's IP headers.
This is just a special case of "structs are useless as a
means of mapping an externally-defined format portably," except
that when the struct has bit-fields it's even worse.

"portability" has many meanings, ranging from "theoretically portable"
to "works on windows"...
For a specified compiler and target you may be privy to
extra information that allows you to define a struct (with or
without bit-fields) that matches a particular externally-defined
format. But don't kid yourself by imagining that the recipe
for one compiler/target pair will work with the next.

Some do (like IP related headers) and some don't. How you tackle
marshaling them depends on the detail. Would you use masks to extract
data from and insert data into an IP header, or would you use the
structs provided by the system?
As for macros -- Well, let's just start and end with the
observation that the nature and size of the "addressable storage
unit" that holds bit-fields is entirely the implementation's
prerogative, and since the implementation is not even obliged
to document it (it is "unspecified," not "implementation-defined")
the only way you can define your macros is by hoping the compiler
tells you more than is required, or by resorting to guesswork
and hope.

As I said in another response, in all of the real wold situations I have
seen, common sense prevails here.
Structs (with or without bit-fields) tempt you, they seduce
you, they lead you on and go nudge-nudge-wink-wink to entice
you into using them to map external formats. But you'll hate
yourself in the morning, even if you don't find yourself in the
gutter minus your wallet and watch and plus a venereal disease.

:)

I hope for your sake it's not penicillin-resistant.

Hmmm. Okay, maybe that's a bit too telegraphic. Let's go
into a little more detail, shall we?

First: "Tell that to the writers of most platform's [sic]
IP headers." Sure, I'll tell them their headers are not portable.
I'll say the same about the platform's <setjmp.h>, <stdint.h>, and
<float.h>: All are non-portable, in the sense that the file that
works with one implementation need not work with the next.

Second: "Portability has many meanings, ..." Yeah. One of
those meanings, is "Bit-fields ain't a portable way to map an
externally-defined format." The recipe that works on one platform
may not work with the next.

Third: "Some do, some don't." Or in other words, it's not
portable. Would I use what the system provided? Yes, of course,
just as I would use <stdio.h> and FILE*: fopen() and fprintf()
and so on are portable; the nature of FILE is not.

Fourth: "In all of the real wold [sic] situations I have
seen, common sense prevails here." The lack of any definition
of "common sense" leaves enough wiggle room for a septillion
slippery slimy eels. Even so, "I've never seen it, ergo it
merits no consideration" is an argument unworthy of debate.[*]

[*] Despite the politicians, who will also say "I've never
seen it, ergo it's a CRISIS requiring MORE FUNDS" and will
maintain both positions simultaneously.
 
Roberto Waltman

Ian said:
... Nine times out of ten an embedded project
uses a single compiler, so even if portability was an issue (which it
seldom is) it is irrelevant.

That doesn't apply to my current (embedded) environment, where we need
to write and support portable libraries that are used without change
on several different architectures, under several operating systems,
and compiled with several different compilers.
(I am talking about down-to-the-metal device control libraries &
drivers)
 
Ian Collins

Eric said:
Hmmm. Okay, maybe that's a bit too telegraphic. Let's go
into a little more detail, shall we?

First: "Tell that to the writers of most platform's [sic]
IP headers."

The apostrophe is portable to all known English dialects...
Sure, I'll tell them their headers are not portable.
I'll say the same about the platform's <setjmp.h>, <stdint.h>, and
<float.h>: All are non-portable, in the sense that the file that
works with one implementation need not work with the next.

So you would avoid setjmp because it isn't portable? Obviously not. If
a platform provides the means to use a feature, use it. This case may
be constrained to those platforms which support networking, but you
wouldn't be manipulating IP packets anywhere else, would you?
Second: "Portability has many meanings, ..." Yeah. One of
those meanings, is "Bit-fields ain't a portable way to map an
externally-defined format." The recipe that works on one platform
may not work with the next.

True, but if it does, use it. Probably most code that works with IP
packets isn't 100% portable C, it will contain platform specific code.
The same applies to the other common use for bit-fields, drivers. These
are even more platform specific.
Third: "Some do, some don't." Or in other words, it's not
portable. Would I use what the system provided? Yes, of course,
just as I would use <stdio.h> and FILE*: fopen() and fprintf()
and so on are portable; the nature of FILE is not.

Fourth: "In all of the real wold [sic] situations I have
seen, common sense prevails here."

I'll give you that typo.
The lack of any definition
of "common sense" leaves enough wiggle room for a septillion
slippery slimy eels. Even so, "I've never seen it, ergo it
merits no consideration" is an argument unworthy of debate.[*]

If there is more than one compiler for a given platform, they are very
unlikely to order bit-fields differently. Yes I haven't seen them all,
but I have seen a lot over the past 30 years or so. If you know of a
counterexample, I'm interested.
 
Ian Collins

Roberto said:
That doesn't apply to my current (embedded) environment, where we need
to write and support portable libraries that are used without change
on several different architectures, under several operating systems,
and compiled with several different compilers.
(I am talking about down-to-the-metal device control libraries &
drivers)

Those are compelling reasons not to use bit-fields!
 
Les Cargill

Joe said:
Exactly. I don't remember whether it was moving code from 68K to
Sparc, or from Sun Sparc Solaris to i386 Linux that did it... but
virtually all my bitfields were broken. I don't think I've used one
since that particular burning.


I'd say fixing struct definitions which implement bit fields has
to be easier than some of the other strategies I've seen. I'd
be a bit (heh) nervous not hiding these in a header
file.
 
Eric Sosman

Eric said:
Hmmm. Okay, maybe that's a bit too telegraphic. Let's go
into a little more detail, shall we?

First: "Tell that to the writers of most platform's [sic]
IP headers."

The apostrophe is portable to all known English dialects...

Amend your misuse of the plural possessive apostrophe by
consulting any standard reference for English. A few that are
easily accessible offer:

"two cats' toys
three friends' letters
the countries' laws"
-- http://owl.english.purdue.edu/owl/resource/621/01/

"Singers' voices
The cousins' favorite uncle"
-- http://www.meredith.edu/grammar/plural.htm#apostrophe

"two boys' hats two women's hats
two actresses' hats
two children's hats
the Changs' house
the Joneses' golf clubs
the Strauses' daughter
the Sanchezes' artwork
the Hastingses' appointment
the Leeses' books"
-- http://www.grammarbook.com/punctuation/apostro.asp

In all these examples (and others you may find for yourself),
observe the position of the possessive apostrophe when applied to
plurals that end and that do not end in s. Contrast and compare
with "most platform's." Turn over your test papers and begin
writing ... NOW. Five minutes, neatness counts.
 
Eric Sosman

Eric said:
Sure, I'll tell them their [platform-specific] headers are not portable.
I'll say the same about the platform's <setjmp.h>, <stdint.h>, and
<float.h>: All are non-portable, in the sense that the file that
works with one implementation need not work with the next.

So you would avoid setjmp because it isn't portable? Obviously not. If
a platform provides the means to use a feature, use it. This case may
be constrained to those platforms which support networking, but you
wouldn't be manipulating IP packets anywhere else, would you?

The point you're missing is that one platform's headers are not
portable to another platform, even if the interfaces they describe
are. The bit-fields in the headers' structs (if they use them) are
specific to the platform for which they were written, and will not
necessarily work on other platforms, not even on *any* other platform.
Offering bit-fields as a portable means to achieve an externally-
defined format is folly: It may be possible to match the format on
Platform P with Declaration D, but Declaration D could yield an
entirely different format when compiled on Platform Q.
The lack of any definition
of "common sense" leaves enough wiggle room for a septillion
slippery slimy eels. Even so, "I've never seen it, ergo it
merits no consideration" is an argument unworthy of debate.[*]

If there is more than one compiler for a given platform, they are very
unlikely to order bit-fields differently. Yes I haven't seen them all,
but I have seen a lot over the past 30 years or so. If you know of a
counterexample, I'm interested.

Portability between multiple compilers for a single platform
is occasionally of interest, typically when you're compiling your
own source with Compiler A but using somebody else's binary library
built with Compiler B. Such situations are far from unknown -- I've
dealt with them often enough. (I recall once finding a SIGSEGV in
a binary-only library whose supplier was not interested in fixing it,
and using hex-mode Emacs to turn it into a slow memory leak instead.
Not ideal, but better to let the balloon hiss slowly than go POP!)

However, when people speak of "portability" it seems to me they
are more usually concerned with portability between different systems,
possibly running on different hardware. Development being expensive,
one wants to produce code that can move with minimal effort from
Windows to Linux to Solaris to OS/400, from x86 to x64 to ARM to
Itanium to SPARC to PowerPC to ... If you want to match a dictated
format on all these platforms and more, you should avoid bit-fields;
in fact, you should avoid structs altogether for this purpose.
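
A minimal sketch of the struct-free alternative, assuming the first four octets of an IPv4 header as the external format. The struct here only holds already-decoded values and is never overlaid on the wire bytes; `ip_prefix`, `ip_prefix_decode` and `ip_prefix_encode` are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoded form of the first four octets of an IPv4 header. */
struct ip_prefix {
    unsigned version;      /* high nibble of octet 0 */
    unsigned ihl;          /* low nibble of octet 0 */
    unsigned tos;          /* octet 1 */
    unsigned total_length; /* octets 2-3, most significant octet first */
};

/* Decode from raw octets.  Nothing here depends on host byte order,
 * struct padding, or the order in which bit-fields are allocated. */
static struct ip_prefix ip_prefix_decode(const unsigned char *p)
{
    struct ip_prefix h;

    assert(p != NULL);
    h.version = (p[0] >> 4) & 0x0Fu;
    h.ihl = p[0] & 0x0Fu;
    h.tos = p[1] & 0xFFu;
    h.total_length = ((p[2] & 0xFFu) << 8) | (p[3] & 0xFFu);
    return h;
}

/* Encode the same four octets back into a caller-supplied buffer. */
static void ip_prefix_encode(const struct ip_prefix *h, unsigned char *p)
{
    assert(h != NULL && p != NULL);
    p[0] = (unsigned char)(((h->version & 0x0Fu) << 4) | (h->ihl & 0x0Fu));
    p[1] = (unsigned char)(h->tos & 0xFFu);
    p[2] = (unsigned char)((h->total_length >> 8) & 0xFFu);
    p[3] = (unsigned char)(h->total_length & 0xFFu);
}
```

The cost is a line of shifting and masking per field; the benefit is that the same source gives the same octets on every conforming implementation.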
 
James Kuyper

JohnF wrote: ....
You can type the structs in such that they are
packed properly. You may have to explicitly declare pad/filler.

You can then build an appliance to "walk a one" through a
pad o' memory overlaying a struct, then see which struct fields change
and how. The output from such a tool can be a controlled
file , against which you can diff. So you can do this as part of the
make, and flag the build when the diff fails.

I don't know of a more methodical ... approach than that.
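
The "walk a one" appliance described above could look roughly like the following sketch. It sets one bit at a time in a zeroed image of the struct and records which field becomes non-zero. The struct `ctrl_reg` and its field names are hypothetical stand-ins for whatever layout is being checked.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical register layout under test. */
struct ctrl_reg {
    unsigned mode : 3;
    unsigned channel : 5;
};

/* Set exactly one bit of an otherwise zeroed image of the struct and
 * report which field became non-zero: 'm' for mode, 'c' for channel,
 * '-' for a padding bit that no field reads. */
static char field_hit_by_bit(size_t bit)
{
    unsigned char image[sizeof(struct ctrl_reg)] = {0};
    struct ctrl_reg r;

    assert(bit < sizeof image * CHAR_BIT);
    image[bit / CHAR_BIT] = (unsigned char)(1u << (bit % CHAR_BIT));
    memcpy(&r, image, sizeof r);
    if (r.mode != 0)
        return 'm';
    if (r.channel != 0)
        return 'c';
    return '-';
}

/* Write the bit-to-field map, one character per bit in memory order,
 * into 'out', which must hold sizeof(struct ctrl_reg) * CHAR_BIT + 1
 * characters. */
static void layout_map(char *out)
{
    size_t nbits = sizeof(struct ctrl_reg) * CHAR_BIT;
    size_t i;

    for (i = 0; i < nbits; i++)
        out[i] = field_hit_by_bit(i);
    out[nbits] = '\0';
}
```

Printing the string produced by `layout_map` to a file during the build and diffing it against a checked-in copy gives the controlled file mentioned above. The map itself will differ between compilers and targets, which is exactly what the diff is meant to catch.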


Depends on what you mean by "portable".

What I mean by "portable" is: works on systems that, because the
standard doesn't mandate such a pragma, provide one with a syntax
incompatible with the one you're used to using, or possibly don't even
provide one at all. Also, "portable" means that it works on systems
that, because the standard doesn't prohibit such padding, insert some
in locations where you mistakenly thought such padding could not be
inserted.

....
BTW, I think the 'C' people missed the opportunity to make
this portable and that mistake was unnecessary.

I find the standard's failure to more tightly specify struct layouts a
big annoyance, and it's one of the first things I'd fix if I had the
power to do so - but you have to ignore history to declare it
unnecessary. There's a reason the standard leaves those things
unspecified: at the time it was first written, existing implementations
differed radically in how such things were handled, and if the standard
had specified those details, many implementations would have been able
to conform only at the cost of breaking backward compatibility with
existing binaries and existing data files. That's still true today. If
and when this ever gets fixed, the costs are going to be huge.
I think there's a baby in that bathwater.

There's a small town in that bathwater - but it's still true, as far as
truly portable code is concerned. If the variety of systems your
software works on is small enough that you can get away with using
structs to parse or create externally-specified data structures, then it
wouldn't qualify as "truly portable" in my books.
 
Ian Collins

James said:
My code has to read data stored in formats not too dissimilar from the
ones described below, and it should do so correctly when ported to many
wildly different platforms. Same-platform compatibility would buy me
almost nothing of value.

OK, time for me to sum up and shut up...

Yes I generally agree with Eric's comment "structs are useless as a
means of mapping an externally-defined format portably". I had a career
changing moment many years ago when I had to get some 68K code running
on SPARC. The code used packed structs to map protocol message packets.
These had a single byte initial type field that caused all of the
following data to be misaligned. Not a problem on 68K, but calamitous
on SPARC! Getting that code to run on the Sun box without a rewrite was
one of the most interesting challenges I've faced.

While inappropriate for portable code, bit-fields do have a place in
code that is inherently not portable, such as drivers or the platform
specific layers in otherwise portable code. One example of the latter
would be the lower layer packet manipulation in a socket or other
network protocol library.
So? I'm talking about the code that performs that data parsing.

On a platform where I am happy to use bit-fields I would use them.
Otherwise mask and shift.
For purely internal data structures, I wouldn't bother using bit fields
unless my program will be storing a great many copies of the data
structure. The slow access times for bit fields will generally hurt my
program's performance more than the wasted memory - people working in
more memory-constrained environments will obviously have different
preferences.

Agreed.
 
Keith Thompson

JohnF said:
The macros I posted in preceding followup are from a portable,
ansi standard, C program that generates gif images, according to
the well-documented gif89 standard. Besides your kind of problem,
C doesn't even guarantee structs are packed. So you can't just
memcpy(block,struct,sizeof(struct)). Many compilers have a pragma
or other mechanism to get packed structs, but there's no portable
way to guarantee it.
[...]

And even the non-portable mechanisms can lead to unsafe code.

For example, if you pack a structure so that an int member doesn't meet
the CPU's alignment requirement, referring to the member directly will
generally work, as the compiler generates whatever code is needed to
make that work, but referring to it indirectly via an int* pointer can
cause your program to crash.

See this question and answer on Stack Overflow (both mine) for details
about gcc's implementation of this:

http://stackoverflow.com/q/8568432/827263
http://stackoverflow.com/a/8574291/827263
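
A minimal sketch of the hazard and one way around it, assuming GCC or Clang and their non-standard `__attribute__((packed))`. The struct `msg` and both function names are invented for this example; the unsafe pointer access appears only in a comment so that the code itself stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension, not standard C: remove padding so that 'value'
 * sits at offset 1, where it is probably misaligned for a uint32_t. */
struct __attribute__((packed)) msg {
    uint8_t type;
    uint32_t value;
};

static_assert(sizeof(struct msg) == 5, "packed attribute not honoured");

/* Direct member access: the compiler knows the struct is packed and
 * emits whatever loads the target needs. */
static uint32_t value_direct(const struct msg *m)
{
    assert(m != NULL);
    return m->value;
}

/* Access by address: copy the bytes out with memcpy rather than
 * dereferencing a uint32_t pointer to the member.  The tempting version,
 *
 *     const uint32_t *p = &m->value;   (compiles, perhaps with a warning)
 *     return *p;                       (undefined; may fault on
 *                                       strict-alignment CPUs)
 *
 * is the indirect access described above. */
static uint32_t value_via_memcpy(const struct msg *m)
{
    uint32_t v;

    assert(m != NULL);
    memcpy(&v, (const unsigned char *)m + offsetof(struct msg, value),
           sizeof v);
    return v;
}
```

Both functions return the stored value; the difference is that neither ever forms a `uint32_t *` to a location that may not be suitably aligned.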
 
Eric Sosman

Calling code that works on only one platform "portable", just because it
will work on multiple compilers targeting that platform, is setting a
very low standard for portability - I suppose that compared to that
standard, the standard described above must seem very "high" indeed.

There's nothing wrong with platform-specific code - while our client
prohibits my company from producing such code, for other kinds of
software it can make a lot of sense to do things like using bit-fields
to parse data formats. I just find it funny to see the word "portable"
used in the context of such techniques.

After a lengthy (and sometimes bordering on the rancorous)
debate, I think a large part of my disagreement with Ian Collins
comes down to a different slant on what "portable" means. Given
some externally-defined layout involving items that aren't easily
mapped to discrete bytes:

- I maintain that you can't write a struct declaration using
bit-fields that will match the desired layout on all platforms.
I'm right, of course.

- Ian holds that on any given platform odds are that there's a
way to write a struct declaration using bit-fields that will
yield the desired layout. He's right, of course.

That is, I'm considering a specific declaration D and pointing
out that it won't work on all platforms; "Therefore, bit-fields as
a match for external formats are not portable." Meanwhile, Ian
argues that there is a family of declarations D*, and that on any
platform of interest some member of that family satisfies the
requirements; "Therefore, bit-fields as a match for external
formats are (almost certainly) portable." The whole thing comes
down to point of view: The technique is very probably portable,
even though specific instances are not so.

Method versus instance -- that's the root of my misunderstanding.
 
Tim Rentsch

Eric Sosman said:
... [snip]

- Ian holds that on any given platform odds are that there's a
way to write a struct declaration using bit-fields that will
yield the desired layout. He's right, of course.

I don't agree. There are plenty of cases where you can, but
also plenty of cases where you can't, even if we stipulate
that the layouts are "natural", and that the platforms and
compilers are mainstream (eg, x86 and gcc).

To be fair, there are also plenty of cases where such fields
can't be extracted by simple shifting and masking either.
These things tend to depend on the impedance match between
the layout and the underlying architecture -- some layouts
just don't work very well on some architectures, and other
layouts on others.
 
