Bit-field union bug

Quentin Pope

On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.


struct s_skip_ind {
    unsigned skip_ind:4;
};

struct s_trans_id {
    unsigned trans_val:3;
    unsigned trans_id:1;
};

struct foo {
    unsigned proto:4;
    union {
        s_skip_ind skip_ind;
        s_trans_id trans_id;
    };
};
 
Keith Thompson

Quentin Pope said:
On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.


struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;
s_trans_id trans_id;
};
};

Whether it's a bug or not, it's not a violation of the C standard.
Compilers can legally insert arbitrary padding between any two members
of a struct, or after the last one.

I think I've seen some compilers that use the declared type of a bit
field to determine its alignment; for example, given
    unsigned char x:1;
    unsigned short y:1;
    unsigned int z:1;
x, y, and z might be aligned on 1-byte, 2-byte, and 4-byte boundaries,
respectively. I don't think there's any support for this in the
standard, which only requires support for bit fields of types int,
signed int, unsigned int, and (new in C99) _Bool.
 
Ian Collins

On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.

I don't. Each struct member has to be correctly aligned. There is
nothing in the standard that says adjacent bit field structs should be
coalesced.
struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;

struct keyword missing.
s_trans_id trans_id;
};

Here you have an unnamed structure member, which isn't standard, so
the compiler could do what it likes, including ignoring it.
 
jacob navia

On 31/08/11 23:23, Quentin Pope wrote:
On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.


struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;
s_trans_id trans_id;
};
};

Most compilers have some option to control alignment.

#pragma pack(1) (for lcc-win or Microsoft compilers)

__attribute__((packed)) (for gcc)

Read your compiler documentation on how to control the alignment.
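
As a rough sketch of what such a control looks like, the following compares a default layout with one declared under `#pragma pack(push, 1)`. It assumes a compiler that accepts that pragma (GCC, Clang and Microsoft compilers do). The names `foo_plain`, `foo_packed` and `report_sizes` are invented for illustration, and the union is given the name `u` so that the declaration is valid C. The sizes printed are implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Default layout: no packing requested. */
struct plain_skip_ind { unsigned skip_ind:4; };
struct plain_trans_id { unsigned trans_val:3; unsigned trans_id:1; };
struct foo_plain {
    unsigned proto:4;
    union {
        struct plain_skip_ind skip_ind;
        struct plain_trans_id trans_id;
    } u;
};

/* Same shape, declared while byte packing is in effect. */
#pragma pack(push, 1)
struct packed_skip_ind { unsigned skip_ind:4; };
struct packed_trans_id { unsigned trans_val:3; unsigned trans_id:1; };
struct foo_packed {
    unsigned proto:4;
    union {
        struct packed_skip_ind skip_ind;
        struct packed_trans_id trans_id;
    } u;
};
#pragma pack(pop)

/* Print both layouts; packing should never make the struct larger. */
void report_sizes(void)
{
    assert(sizeof(struct foo_packed) <= sizeof(struct foo_plain));
    printf("plain:  sizeof=%zu offsetof(u)=%zu\n",
           sizeof(struct foo_plain), offsetof(struct foo_plain, u));
    printf("packed: sizeof=%zu offsetof(u)=%zu\n",
           sizeof(struct foo_packed), offsetof(struct foo_packed, u));
}
```

Even with packing, the union still begins on its own byte, so packing can shrink the struct but does not merge the union's bit-fields into the byte that holds proto.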
 
Joe Pfeiffer

Quentin Pope said:
On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.


struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;
s_trans_id trans_id;
};
};

While I would agree with you, the standard disagrees with both of us.
I'm not quite sure what the point of a bitfield is when the compiler is
free to insert arbitrary padding between elements, but that is indeed
the case.
 
James Kuyper

On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.


struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;
s_trans_id trans_id;
};

Ian has already pointed out to you the syntax errors in that declaration.

"An implementation may allocate any addressable storage unit large
enough to hold a bit-field. If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit." (6.7.2.1p10)

However, the thing that follows proto is not another bit field, but a
union. The union contains two structs, both structs contain bit fields,
but those bit fields are not members of the same struct that proto is.
Therefore, 6.7.2.1p10 does not apply.
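
To make the distinction concrete, here is a small sketch that puts the two shapes side by side. The struct names `nested` and `flat` and the function `compare_layouts` are invented for this illustration, and the union is named `u` so the code is valid C. The printed sizes depend on the implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct s_skip_ind { unsigned skip_ind:4; };
struct s_trans_id { unsigned trans_val:3; unsigned trans_id:1; };

/* proto is followed by a union, not by another bit-field. */
struct nested {
    unsigned proto:4;
    union {
        struct s_skip_ind skip_ind;
        struct s_trans_id trans_id;
    } u;
};

/* Both bit-fields are direct members of one struct, so 6.7.2.1p10 applies. */
struct flat {
    unsigned proto:4;
    unsigned skip_ind:4;
};

void compare_layouts(void)
{
    /* u is an addressable member, so it cannot begin inside proto's byte. */
    assert(offsetof(struct nested, u) >= 1);
    printf("nested: sizeof=%zu offsetof(u)=%zu\n",
           sizeof(struct nested), offsetof(struct nested, u));
    printf("flat:   sizeof=%zu\n", sizeof(struct flat));
}
```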

Here's an alternative approach that should cause the fields to be merged
into the same storage unit. However, there are no guarantees about what
size the "storage unit" is. It could have a size of 8.

struct s_skip_ind {
    unsigned proto:4;
    unsigned skip_ind:4;
};

struct s_trans_id {
    unsigned proto:4;
    unsigned trans_val:3;
    unsigned trans_id:1;
};

union foo {
    struct s_skip_ind skip_ind;
    struct s_trans_id trans_id;
};


Note that, with this definition, both 'proto's are required to occupy
the same bits (6.5.2.3p5).
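
A minimal sketch of what that guarantee buys you: proto is stored through one struct and read back through the other. The helper name `proto_round_trip` is made up for this example. It relies on the common-initial-sequence rule cited above, which applies because the union declaration is visible where the members are accessed.

```c
#include <assert.h>

struct s_skip_ind {
    unsigned proto:4;
    unsigned skip_ind:4;
};

struct s_trans_id {
    unsigned proto:4;
    unsigned trans_val:3;
    unsigned trans_id:1;
};

union foo {
    struct s_skip_ind skip_ind;
    struct s_trans_id trans_id;
};

/* Store proto via skip_ind, then read it back via trans_id. */
unsigned proto_round_trip(unsigned value)
{
    union foo f;
    assert(value <= 0xFu);          /* proto is only 4 bits wide */
    f.skip_ind.proto = value;
    f.skip_ind.skip_ind = 0;
    return f.trans_id.proto;        /* same bits as f.skip_ind.proto */
}
```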
 
Seebs

While I would agree with you, the standard disagrees with both of us.
I'm not quite sure what the point of a bitfield is when the compiler is
free to insert arbitrary padding between elements, but that is indeed
the case.

The point is that the compiler is *permitted* to coalesce bitfields. But
not required to, because some systems really don't provide meaningful support
for them.

Basically, bitfields aren't expected to provide an absolutely reliable
mapping onto a series of bits; they're expected to provide a way that you
can hint that you don't need a lot of storage, such that compilers can in
some cases save a few bytes.

-s
 
James Kuyper

The point is that the compiler is *permitted* to coalesce bitfields. But
not required to, because some systems really don't provide meaningful support
for them.

Actually, it is required to do so, when adjacent bit-fields in the same
struct fit in a single storage unit (6.7.2.1p10). And since CHAR_BIT is
required to be >=8, these bit-fields, had they been in the same struct,
would have been required to be stored in adjacent bits of the same
storage unit.
Basically, bitfields aren't expected to provide an absolutely reliable
mapping onto a series of bits; they're expected to provide a way that you
can hint that you don't need a lot of storage, such that compilers can in
some cases save a few bytes.

An implementation has a lot of freedom to allocate bit-fields; far too
much to make any portable use of them for communications outside of a
single program. However, there's not quite as much freedom as you
thought there was.
 
Peter Nilsson

Quentin Pope said:
On my compiler, the output of "sizeof(foo)" of the following
is 8, instead of 4.

Did your compiler issue the required diagnostic?
Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with
"proto".

I think this is a bug.

Depends whether you're using a conforming C compiler or not.
struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;
s_trans_id trans_id;

C doesn't act like C++ when it comes to struct tags.

C doesn't allow unnamed members in structs and unions, with the
exception of zero width bit-fields.

If we replace this with...

struct foo {
    unsigned proto:4;
    union foo_u {
        struct s_skip_ind skip_ind;
        struct s_trans_id trans_id;
    } u;
};

...then you should realise that a conforming C compiler must be
able to support the following...

struct foo f;
union foo_u *up = &f.u;

It would make the compiler's life a tad difficult if it had to
support that _and_ coalesce bit-fields in the way you would like.

See James Kuyper's response for an alternative method.
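
Here is a sketch of why that matters. A function that only sees a pointer to the union may overwrite the whole union object, and proto must survive. The tag `foo_u` and the helpers `reset_union` and `proto_after_reset` are names assumed for this example. The tag is needed so that a pointer to the very same union type can be declared elsewhere.

```c
#include <assert.h>

struct s_skip_ind { unsigned skip_ind:4; };
struct s_trans_id { unsigned trans_val:3; unsigned trans_id:1; };

/* Tagged so other code can name the type (tag name is hypothetical). */
union foo_u {
    struct s_skip_ind skip_ind;
    struct s_trans_id trans_id;
};

struct foo {
    unsigned proto:4;
    union foo_u u;
};

/* Knows nothing about struct foo; replaces an entire union object. */
void reset_union(union foo_u *up)
{
    union foo_u fresh = { { 0 } };
    *up = fresh;
}

/* Returns proto after the union next to it has been overwritten. */
unsigned proto_after_reset(void)
{
    struct foo f;
    f.proto = 9;
    f.u.skip_ind.skip_ind = 7;
    reset_union(&f.u);
    assert(f.u.skip_ind.skip_ind == 0);
    return f.proto;
}
```

If the union shared a byte with proto, the whole-object assignment in `reset_union` could not be done without knowing about the enclosing struct.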
 
Seebs

Actually, it is required to do so, when adjacent bit-fields in the same
struct fit in a single storage unit (6.7.2.1p10).

Thanks for the correction. I had a vague sense that there were circumstances
where it was mandatory, and I hadn't thought through that the separate
struct declarations were the problem.

In fact, so far as I can tell, it's *not* permissible to coalesce adjacent
structs.

Assume CHAR_BIT is 8.

struct a { int x:4; };
struct b { struct a a1, a2; } example_b;

If struct b coalesced the bit fields, what would the address of example_b.a2
be? The "struct a" object MUST have a size of at least 1, and it has to be
an integer size.

So you have to calculate the padding of the struct a independently, and then
use that padded object as the sub-member in other structs.
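
A short sketch of that argument, using the same two declarations. The function name `distance_between_members` is invented here. It measures how far apart a1 and a2 sit, which can never be less than the padded size of struct a.

```c
#include <assert.h>
#include <stddef.h>

struct a { int x:4; };
struct b { struct a a1, a2; };

/* Byte distance between the two struct a sub-objects of one struct b. */
size_t distance_between_members(void)
{
    struct b example_b;
    const char *p1 = (const char *)&example_b.a1;
    const char *p2 = (const char *)&example_b.a2;
    assert(p2 > p1);                /* a2 has its own, later address */
    return (size_t)(p2 - p1);
}
```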

(I wonder whether that's true in Plan 9 C of anonymous structure members...)

-s
 
James Kuyper

....
In fact, so far as I can tell, it's *not* permissible to coalesce adjacent
structs.

Of course. That was apparently not as obvious to the OP as it seems to me.
 
Phil Carmody

Seebs said:
The point is that the compiler is *permitted* to coalesce bitfields.

The second thing wasn't a bitfield, it was a union.

I would have been very surprised if the size hadn't been at least 8.
On something like a Cray, I'd not have blinked if it had been 16 bytes.
(Keith - you've used a Cray most recently, I think - can you shed any
light on that?)

Phil
 
Keith Thompson

Phil Carmody said:
The second thing wasn't a bitfield, it was a union.

I would have been very surprised if the size hadn't been at least 8.
On something like a Cray, I'd not have blinked if it had been 16 bytes.
(Keith - you've used a Cray most recently, I think - can you shed any
light on that?)

Not on bit fields, sorry.
 
James Kuyper

But, this is not two bit-fields. Here's the original code:

I already raised that point in an earlier message. In this message I was
dealing with a different point.

In the case described by the OP, a compiler is prohibited from
coalescing "proto" with any of the other bit-fields, because if the
unnamed union member had been given a name (as it should have been),
that member would have been required to have a distinct address.

However, in the case which I was referring to above, "adjacent
bit-fields in the same struct", it's not merely permitted to coalesce
such bit-fields, it's required to do so, if it can.

Either way, "permitted" was an incorrect description.
 
Andrey Tarasevich

On my compiler, the output of "sizeof(foo)" of the following is 8,
instead of 4.

Which indicates that the compiler is aligning the union at
the next int boundary, rather than coalescing it with "proto".

I think this is a bug.

struct s_skip_ind {
unsigned skip_ind:4;
};

struct s_trans_id {
unsigned trans_val:3;
unsigned trans_id:1;
};

struct foo {
unsigned proto:4;
union {
s_skip_ind skip_ind;
s_trans_id trans_id;
};
};

No. You are making invalid conclusions from translating invalid code.

C language does not allow anonymous union members in the struct
declaration. Your declaration of `struct foo` is invalid C. If your
compiler accepted this declaration, it is a mere compiler-specific
extension, which has nothing to do with standard C language.

In C language you are required to give that union member an explicit
name, as in

struct foo {
    unsigned proto:4;
    union {
        struct s_skip_ind skip_ind;
        struct s_trans_id trans_id;
    } u; /* <- the name is required */
};

Now note that the above member `u` should have its own offset and its
own address within `struct foo`, since at any place in your code you can
do something like

struct foo f;
void *p = &f.u;

For this reason, the compiler simply cannot merge this union member with
the previous bit-field. It has to align `u` at the beginning of an
addressable memory unit.

Apparently your compiler was forgiving enough to let you (illegally)
omit the member name, but wasn't brave enough to take the next step and
do the merging.

In any case, if there is any compiler bug here, it is the failure of the
compiler to report the illegal anonymous union member, not the failure
to merge the bit-fields.
 
Quentin Pope

Whether it's a bug or not, it's not a violation of the C standard.
Compilers can legally insert arbitrary padding between any two members
of a struct, or after the last one.

I think I've seen some compilers that use the declared type of a bit
field to determine its alignment; for example, given
    unsigned char x:1;
    unsigned short y:1;
    unsigned int z:1;
x, y, and z might be aligned on 1-byte, 2-byte, and 4-byte boundaries,
respectively. I don't think there's any support for this in the
standard, which only requires support for bit fields of types int,
signed int, unsigned int, and (new in C99) _Bool.

I see.

In this case, is there an equivalent of the offsetof() macro for
bitfield members, to allow programmers to determine whether or not
bitfields have been packed?

Cheers
QP
 
Ian Collins

I see.

In this case, is there an equivalent of the offsetof() macro for
bitfield members, to allow programmers to determine whether or not
bitfields have been packed?

Won't the sizeof operator tell you?
 
Quentin Pope

Won't the sizeof operator tell you?

Not necessarily, I think.

For example, imagine
struct foo {
unsigned short a:2;
unsigned short b:2;
unsigned short c:2;
unsigned short d:2;
};

If sizeof(struct foo) is 3, then there is no way of telling whether a and b are
packed into 1 byte, or if it's b and c, or if it's c and d.

Cheers,
QP
 
Ben Bacarisse

Quentin Pope said:
Not necessarily, I think.

For example, imagine
struct foo {
unsigned short a:2;
unsigned short b:2;
unsigned short c:2;
unsigned short d:2;
};

If sizeof(struct foo) is 3, then there is no way of telling whether a and b are
packed into 1 byte, or if it's b and c, or if it's c and d.

Your point is valid but the example is not a good one. First, by using
unsigned short you've gone outside of standard C -- anything is
possible. If you were to replace short with int, then you would get
much more than you might think:

In the "unsigned int" version, the C standard says that a, b, c and d
must be packed together, consecutively, in the first byte of the struct.
The "addressable storage unit" into which they get packed can't be less
than 1 byte in size and a byte can't be less than 8 bits wide. Which
one goes in the high-significance position is not specified, but it
must be either a or d and the others must follow in order.
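
There is no offsetof for bit-fields, but one rough way to see where a compiler put them is to zero the struct, set one field to 1, and look for the set bit in the object representation. The sketch below does that for the "unsigned int" version. The names `quad`, `first_set_bit` and `probe_position` are invented for illustration, and it assumes that storing to a bit-field leaves the padding bits alone, which is typical but not promised by the standard.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

struct quad {
    unsigned int a:2;
    unsigned int b:2;
    unsigned int c:2;
    unsigned int d:2;
};

/* Index of the lowest set bit in the object representation, counting
   bit 0 of byte 0 as index 0; returns -1 if no bit is set. */
static int first_set_bit(const struct quad *q)
{
    unsigned char bytes[sizeof *q];
    memcpy(bytes, q, sizeof *q);
    for (size_t i = 0; i < sizeof bytes; i++)
        for (int bit = 0; bit < CHAR_BIT; bit++)
            if (bytes[i] & (1u << bit))
                return (int)(i * CHAR_BIT) + bit;
    return -1;
}

/* which: 0 for a, 1 for b, 2 for c, anything else for d. */
int probe_position(int which)
{
    struct quad q;
    memset(&q, 0, sizeof q);
    switch (which) {
    case 0:  q.a = 1; break;
    case 1:  q.b = 1; break;
    case 2:  q.c = 1; break;
    default: q.d = 1; break;
    }
    return first_set_bit(&q);
}
```

Calling `probe_position` for each field and comparing the results shows whether neighbouring fields sit two bits apart and which end of the byte a starts from.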

However, I think it is often the case that code that needs to know how
bit fields are packed is using them beyond their natural purpose.
Shifting and masking gives you far more control over how things pack and
is usually the way to go when it matters.
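
For comparison, a minimal shift-and-mask sketch for the two 4-bit fields from the original post. The layout (proto in the high nibble, skip_ind in the low nibble) and the names `pack_header`, `header_proto` and `header_skip_ind` are assumptions made for this example, not something taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 7..4 hold proto, bits 3..0 hold skip_ind. */
#define PROTO_SHIFT 4u
#define NIBBLE_MASK 0x0Fu

/* Combine two 4-bit values into one octet. */
uint8_t pack_header(unsigned proto, unsigned skip_ind)
{
    assert(proto <= NIBBLE_MASK && skip_ind <= NIBBLE_MASK);
    return (uint8_t)((proto << PROTO_SHIFT) | skip_ind);
}

/* Extract the high nibble. */
unsigned header_proto(uint8_t octet)
{
    return (octet >> PROTO_SHIFT) & NIBBLE_MASK;
}

/* Extract the low nibble. */
unsigned header_skip_ind(uint8_t octet)
{
    return octet & NIBBLE_MASK;
}
```

Because the positions are spelled out in the code, the octet looks the same on every compiler, which is the control that bit-fields do not give you.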
 
Ian Collins

However, I think it is often the case that code that needs to know how
bit fields are packed is using them beyond their natural purpose.
Shifting and masking gives you far more control over how things pack and
is usually the way to go when it matters.

Or the code is written based on knowledge of the compiler being used. I
have often used bit fields to map hardware registers or bits in messages
headers.
 
