Bit-fields in unsigned short

dspfun

Hi,

I want to use bit-fields in unsigned short types since it makes code a
lot easier to read and I don't have to do bit-masks and bit-shifts to
access the individual bit-fields.

However, it seems that C99 only allows bit-fields for unsigned char and
unsigned int, and that bit-fields in unsigned short are a GCC extension.

I get the following compilation warning
gcc -Wall -pedantic bit_field_unsigned_short_test.c
bit_field_unsigned_short_test.c:6: warning: type of bit-field 'field1'
is a GCC extension
bit_field_unsigned_short_test.c:7: warning: type of bit-field 'field2'
is a GCC extension
bit_field_unsigned_short_test.c:8: warning: type of bit-field 'field3'
is a GCC extension

When I compile the following code:

gcc -Wall -pedantic bit_field_unsigned_short_test.c
bit_field_unsigned_short_test.c: In function 'main':
bit_field_unsigned_short_test.c:8: warning: type of bit-field 'field1'
is a GCC extension
bit_field_unsigned_short_test.c:9: warning: type of bit-field 'field2'
is a GCC extension
bit_field_unsigned_short_test.c:10: warning: type of bit-field
'field3' is a GCC extension

How can I use bitfields in unsigned short without getting the warnings
about bit-fields and still use -pedantic flag?

Brs,
Markus
 
Ben Bacarisse

dspfun said:
I want to use bit-fields in unsigned short types since it makes code a
lot easier to read and I don't have to do bit-masks and bit-shifts to
access the individual bit-fields.

It's worth noting that the semantics are different. Using shifts and/or
masks gives you sequences of bits with known value or significance
whereas how bit-fields are packed into a struct is up to the compiler.
This does not matter so much if your code is aimed at one compiler only,
but you still need to take care that compiler options (or versions)
don't affect the packing.
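
A minimal sketch of the shift-and-mask alternative described above. The field names follow the warnings in the original post, but the widths and bit positions are assumptions chosen only for illustration, as are the helper names get_field and set_field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a 16-bit word (widths are assumptions):
 *   bits 15..12  field1 (4 bits)
 *   bits 11..4   field2 (8 bits)
 *   bits  3..0   field3 (4 bits)
 */
#define FIELD1_SHIFT 12u
#define FIELD1_MASK  0xFu
#define FIELD2_SHIFT 4u
#define FIELD2_MASK  0xFFu
#define FIELD3_SHIFT 0u
#define FIELD3_MASK  0xFu

/* Extract the field that sits `shift` bits up and is `mask` wide. */
static inline unsigned get_field(uint16_t word, unsigned shift, unsigned mask)
{
    return ((unsigned)word >> shift) & mask;
}

/* Return `word` with one field replaced by `value`; other bits are kept. */
static inline uint16_t set_field(uint16_t word, unsigned shift, unsigned mask,
                                 unsigned value)
{
    assert(value <= mask);  /* the value must fit in the field */
    unsigned cleared = (unsigned)word & ~(mask << shift);
    return (uint16_t)(cleared | (value << shift));
}
```

Here the position of every bit is fixed by the source code rather than by the compiler, which is the difference in semantics being pointed out.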
However, it seems that C99 only allows bit-fields for unsigned char and
unsigned int, and that bit-fields in unsigned short are a GCC extension.

C99 allows lots of types, but it only insists that an implementation
support _Bool along with signed and unsigned int.
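
As a sketch of what that means in practice: declaring the fields with unsigned int, one of the types every C99 implementation must accept, should avoid the -pedantic warning. The struct name, field widths and the make_flags helper are hypothetical.

```c
/* Bit-fields declared with unsigned int, a type C99 requires every
 * implementation to support. Names and widths are assumptions. */
struct flags16 {
    unsigned int field1 : 4;
    unsigned int field2 : 8;
    unsigned int field3 : 4;
};

/* Build a value from three numbers. Storing into an unsigned bit-field
 * reduces the value modulo 2^width, so out-of-range input wraps. */
static struct flags16 make_flags(unsigned a, unsigned b, unsigned c)
{
    struct flags16 f;
    f.field1 = a;
    f.field2 = b;
    f.field3 = c;
    return f;
}
```

Note that the struct will often occupy sizeof(unsigned int) bytes rather than two, and where each field lands inside it is still up to the implementation.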
I get the following compilation warning

bit_field_unsigned_short_test.c:6: warning: type of bit-field 'field1'
is a GCC extension
bit_field_unsigned_short_test.c:7: warning: type of bit-field 'field2'
is a GCC extension
bit_field_unsigned_short_test.c:8: warning: type of bit-field 'field3'
is a GCC extension

That's probably not compiling in C99 mode, by the way.
When I compile the following code:

bit_field_unsigned_short_test.c: In function 'main':
bit_field_unsigned_short_test.c:8: warning: type of bit-field 'field1'
is a GCC extension
bit_field_unsigned_short_test.c:9: warning: type of bit-field 'field2'
is a GCC extension
bit_field_unsigned_short_test.c:10: warning: type of bit-field
'field3' is a GCC extension

How can I use bitfields in unsigned short without getting the warnings
about bit-fields and still use -pedantic flag?

What advantage do you hope to get from using unsigned short rather than
unsigned int for bit-fields? I can't see why you don't want to use
unsigned int as the type.
 
dspfun

What advantage do you hope to get from using unsigned short rather than
unsigned int for bit-fields?  I can't see why you don't want to use
unsigned int as the type.

The reason is that two bytes are sent from a big endian cpu and "we"
receive it on a little endian cpu. To access the information in the
two bytes I declare a struct with bit-fields.

Brs,
Markus
 
Jens

The reason is that two bytes are sent from a big endian cpu and "we"
receive it on a little endian cpu. To access the information in the
two bytes I declare a struct with bit-fields.

Don't do that. There are appropriate hton*/ntoh* functions (htons, htonl,
ntohs, ntohl) for that purpose already.
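
A minimal sketch of that suggestion, assuming a POSIX system where htons and ntohs are declared in <arpa/inet.h>. The helper names read_u16_net and write_u16_net are hypothetical. These calls only fix the byte order of a whole 16-bit value; they say nothing about where a compiler places bit-fields.

```c
#include <arpa/inet.h>  /* POSIX: htons, ntohs */
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value into a buffer in network (big-endian) byte order. */
static void write_u16_net(unsigned char *buf, uint16_t value)
{
    uint16_t raw = htons(value);
    memcpy(buf, &raw, sizeof raw);
}

/* Read a 16-bit network-order value from a buffer into host order. */
static uint16_t read_u16_net(const unsigned char *buf)
{
    uint16_t raw;
    memcpy(&raw, buf, sizeof raw);
    return ntohs(raw);
}
```

Once the value is in host order, individual fields can be picked out with shifts and masks.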

Jens
 
James Kuyper

The reason is that two bytes are sent from a big endian cpu and "we"
receive it on a little endian cpu. To access the information in the
two bytes I declare a struct with bit-fields.

The standard deliberately under-specifies the layout of structs, and
this is particularly true for bit fields. As a result, you cannot
portably rely upon the bits from the data you're transferring being in
the same location as the bits corresponding to any particular bit field.
If portability isn't an issue, then there's no problem; but for
maximally portable code, your best option is to use an unsigned char
buffer, and extract the bits by using shift and mask operations.

This still isn't perfect: such code usually relies upon CHAR_BIT==8,
which is also not guaranteed. However, you can at least check at compile
time whether CHAR_BIT==8, and choose an appropriate alternative if it is
not. You cannot determine at compile time which bits of a struct
correspond to a given bit field, though there are ways to do so at run time.
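
Both points can be sketched roughly as follows: a preprocessor check on CHAR_BIT, and one possible run-time probe that sets a single bit-field and inspects the object representation. The struct probe and both helper functions are hypothetical names used only for this illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Compile-time check of the 8-bit-byte assumption. */
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

struct probe {              /* hypothetical example struct */
    unsigned int a : 3;
    unsigned int b : 5;
};

/* Zero a struct probe, set every bit of field `a`, and copy the raw bytes
 * to `out`, which must have room for sizeof(struct probe) bytes. The bits
 * that end up set show where this implementation put `a`. */
static void probe_field_a(unsigned char *out)
{
    struct probe p;
    memset(&p, 0, sizeof p);
    p.a = 7u;
    memcpy(out, &p, sizeof p);
}

/* Count the 1 bits in a byte array. */
static int count_set_bits(const unsigned char *bytes, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        for (unsigned b = bytes[i]; b != 0; b >>= 1)
            count += (int)(b & 1u);
    return count;
}
```

Which bytes and bits are set after probe_field_a is exactly the implementation-defined part, so the result should only be used for diagnostics, not baked into a wire format.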
 
Ben Bacarisse

dspfun said:
The reason is that two bytes are sent from a big endian cpu and "we"
receive it on a little endian cpu. To access the information in the
two bytes I declare a struct with bit-fields.

That does not help me to see why you think unsigned short will be better
than unsigned int for the bit-field. Can you show a fragment of code,
for example, or show what you used to do until you started to consider
using unsigned short bit-fields?
 
Ben Bacarisse

Kenneth Brody said:
It may be worth noting that which bits are used for bit-fields is
implementation-defined.

6.7.2.1p10:

The order of allocation of bit-fields within a unit (high-order
to low-order or low-order to high-order) is implementation-defined.

I found this out the hard way some 25+ years ago when porting a
program which stored flags in bit-fields, and saved the value to a
file. It worked fine until the program was ported to a system which
allocated bits the other way.

A nit-pick if I may... The problem is that the program saved the
representation rather than the value -- i.e. that it wrote out the bytes
used by the implementation rather than what they denoted.

It is possible to store the value of a structure that uses bit-fields in
a portable manner (by using printf, for example) though the very use of
bit-fields often suggests an interest in speed and compactness at odds
with doing so. I suspect that a high proportion of the time, that
interest is misplaced!
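
A rough sketch of saving the value rather than the representation, using decimal text. The struct eg, the "{n1,n2}" format and the function names are assumptions for illustration only.

```c
#include <stdio.h>

struct eg {                 /* hypothetical two-field example */
    unsigned int n1 : 4;
    unsigned int n2 : 4;
};

/* Write the field values as decimal text, e.g. "{9,14}".
 * Returns what snprintf returns: the length needed, or a negative value. */
static int eg_to_text(const struct eg *s, char *out, size_t outsz)
{
    return snprintf(out, outsz, "{%u,%u}", (unsigned)s->n1, (unsigned)s->n2);
}

/* Parse text produced by eg_to_text. Returns 0 on success, or -1 if the
 * text is malformed or a value does not fit in its 4-bit field. */
static int eg_from_text(const char *text, struct eg *s)
{
    unsigned a, b;
    if (sscanf(text, "{%u,%u}", &a, &b) != 2)
        return -1;
    if (a > 15u || b > 15u)
        return -1;
    s->n1 = a;
    s->n2 = b;
    return 0;
}
```

The text form does not depend on how either implementation lays out the struct, at the cost of size and speed.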

<snip>
 
dspfun

That does not help me to see why you think unsigned short will be better
than unsigned int for the bit-field.  Can you show a fragment of code,
for example, or show what you used to do until you started to consider
using unsigned short bit-fields?

The data sent from the first CPU is actually 32 bits (unsigned int) and
on the receiving CPU these 32 bits are cast to a struct with two
unsigned char (8 bit) and one unsigned short (16 bit).

I agree with you that it would be better to use a struct with bit-
fields of an unsigned int (32 bit).

Thanks for your help!

Brs,
Markus
 
dspfun

A nit-pick if I may...  The problem is that the program saved the
representation rather than the value -- i.e. that it wrote out the bytes
used by the implementation rather than what they denoted.

It is possible to store the value of a structure that uses bit-fields in
a portable manner (by using printf, for example) though the very use of
bit-fields often suggests an interest in speed and compactness at odds
with doing so.  I suspect that a high proportion of the time, that
interest is misplaced!

How do you store the value of a structure that uses bit-fields in a
portable manner?

For example, the following file uses bit-fields and not masks and
shifts. Is this according to the C99-standard or implementation
defined?

/usr/src/linux-2.6.16.60-0.69.1/include/linux/ip.h

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
__u8 ihl:4,
version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
__u8 version:4,
ihl:4;
#else
#error "Please fix <asm/byteorder.h>"
#endif
__u8 tos;
__be16 tot_len;
__be16 id;
__be16 frag_off;
__u8 ttl;
__u8 protocol;
__u16 check;
__be32 saddr;
__be32 daddr;
/*The options start here. */
};
 
Ben Bacarisse

dspfun said:
The data sent from the first CPU is actually 32 bits (unsigned int) and
on the receiving CPU these 32 bits are cast to a struct with two
unsigned char (8 bit) and one unsigned short (16 bit).

I know what you mean, but you can't cast to a struct type.
I agree with you that it would be better to use a struct with bit-
fields of an unsigned int (32 bit).

A bit-field of type unsigned int is better than a bit-field of type
unsigned short but I have not said (and I would not say) that you should
use a struct with bit-fields over the one you describe first! You seem
to be taking a step backwards by using bit-fields of any sort.

If you need the code to be portable and to deal with different machine
representations, then using a struct at all is not a good way to impose
a meaning on a block of data. Since you are talking about sending and
receiving and have mentioned endianness already, I think you may have
started by asking the wrong question.
 
Ben Bacarisse

dspfun said:
On 9 Mar, 17:57, Ben Bacarisse wrote:
How do you store the value of a structure that uses bit-fields in a
portable manner?

I am sure you know how; it is just that you are currently thinking in
terms of bits and bytes and have lost sight of the fact that these can
mean something. The simplest way is to print the decimal representation of
the numerical value of each of the structure's members along with some
sort of separator:

struct eg { unsigned int n1 : 4; unsigned int n2 : 4; } s;
/* ... */
printf("{%u,%u}\n", (unsigned)s.n1, (unsigned)s.n2);
For example, the following file uses bit-fields and not masks and
shifts. Is this according to the C99-standard or implementation
defined?

I'm sorry but I am not really sure what that means. If you are asking
if C99 specifies everything about the layout, packing and padding of the
structure then the answer is no, it does not.
 
James Kuyper

....
How do you store the value of a structure that uses bit-fields in a
portable manner?

In a world where CHAR_BIT is not required to be 8, and there's no
universal standard for character encoding, perfect portability is not
possible. In his response, Ben has described one approach. Another is to
write the fields to an array of unsigned char, assuming that
CHAR_BIT==8 (and testing that assumption with #if), forcing precisely
the endianness required by your application for multi-byte fields, and
forcing the bit fields into precisely the desired portion of the desired
bytes.
For example, the following file uses bit-fields and not masks and
shifts. Is this according to the C99-standard or implementation
defined?

/usr/src/linux-2.6.16.60-0.69.1/include/linux/ip.h

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
__u8 ihl:4,

Every identifier in this struct that starts with "__" violates the name
space reserved to the implementation. That's OK, if ip.h is one of that
implementation's extensions to standard C; but keep in mind that it's
not permitted for user code.
version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
__u8 version:4,
ihl:4;
#else
#error "Please fix <asm/byteorder.h>"

asm/byteorder.h is also not part of the C standard.
#endif
__u8 tos;
__be16 tot_len;
__be16 id;
__be16 frag_off;
__u8 ttl;
__u8 protocol;
__u16 check;
__be32 saddr;
__be32 daddr;
/*The options start here. */
};

unsigned char buffer[20]; // assuming I counted right.

buffer[0] = version*16+ihl; // on the wire, version is the high nibble
buffer[1] = tos;
buffer[2] = tot_len/256;
buffer[3] = tot_len%256;
// etc.

It's slow and nasty, but it's more portable than relying upon things
about the layout of structures that are not guaranteed by the C
standard, and more compact than the text-oriented approach described by
Ben. His approach also allows the output to be read by human beings,
whereas this does not.
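
The receiving side can be sketched in the same spirit. This assumes 8-bit bytes and the IPv4 wire layout, in which version is the high nibble of the first byte and multi-byte fields are big-endian. The struct ip_prefix and decode_ip_prefix are hypothetical names; only the first four header bytes are decoded.

```c
/* Decoded values of the first four bytes of an IPv4 header. This struct
 * holds plain numbers; it is not meant to mirror the wire layout. */
struct ip_prefix {
    unsigned version;
    unsigned ihl;
    unsigned tos;
    unsigned tot_len;
};

/* Decode from a byte buffer using only shifts and masks, so the result
 * does not depend on struct layout or host byte order. */
static struct ip_prefix decode_ip_prefix(const unsigned char *buffer)
{
    struct ip_prefix h;
    h.version = (buffer[0] >> 4) & 0x0Fu;
    h.ihl     = buffer[0] & 0x0Fu;
    h.tos     = buffer[1] & 0xFFu;
    h.tot_len = ((buffer[2] & 0xFFu) << 8) | (buffer[3] & 0xFFu);
    return h;
}
```

For the bytes 0x45 0x00 0x00 0x54 this yields version 4, header length 5 words, and a total length of 84.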
 
Keith Thompson

dspfun said:
The data sent from the first CPU is actually 32 bits (unsigned int) and
on the receiving CPU these 32 bits are cast to a struct with two
unsigned char (8 bit) and one unsigned short (16 bit).

I agree with you that it would be better to use a struct with bit-
fields of an unsigned int (32 bit).

If the members of the struct on the receiving end are two 8-bit unsigned
chars and one unsigned 16-bit short, why are you considering using
bit-fields at all? Do you mean that you're extracting individual
bits from the unsigned short member?

As Ben said, it would be *extremely* helpful if you could show us some
actual code.
 
dspfun

As Ben said, it would be *extremely* helpful if you could show us some
actual code.

Consider an example with the following 32-bits containing 6 different
bit-fields that I want to send from a big-endian cpu to a little
endian cpu:

-----------------------------------
typedef struct my_struct {
unsigned int f1:1;
unsigned int f2:4;
unsigned int f3:7;
unsigned int f4:3;
unsigned int f5:12;
unsigned int f6:5;
}my_struct_t;

my_struct_t test_struct;
test_struct.f1 = 0x1;
test_struct.f2 = 0x3;
test_struct.f3 = 0x55;
test_struct.f4 = 0x2;
test_struct.f5 = 0x66;
test_struct.f6 = 0x4;
-----------------------------------

What is the best/most "maintainable" way to send and access this data
between the cpus? I guess using bit-fields provides for clearer code
compared to using a 32-bit unsigned long and doing bit-masks and
bit-shifts.

Brs,
Markus
 
Keith Thompson

dspfun said:
Consider an example with the following 32-bits containing 6 different
bit-fields that I want to send from a big-endian cpu to a little
endian cpu:

-----------------------------------
typedef struct my_struct {
unsigned int f1:1;
unsigned int f2:4;
unsigned int f3:7;
unsigned int f4:3;
unsigned int f5:12;
unsigned int f6:5;
}my_struct_t;

my_struct_t test_struct;
test_struct.f1 = 0x1;
test_struct.f2 = 0x3;
test_struct.f3 = 0x55;
test_struct.f4 = 0x2;
test_struct.f5 = 0x66;
test_struct.f6 = 0x4;
-----------------------------------

What is the best/most "maintainable" way to send and access this data
between the cpus? I guess using bit-fields provides for clearer code
compared to using a 32-bit unsigned long and doing bit-masks and
bit-shifts.

Your original question was about unsigned short bit fields.
There's nothing suggesting that in your example.

Certainly bit fields make for more easily readable code than shifting
and masking; it lets you assign names to the individual components.
(For shifts and masks, macros can help, but the code will still
look a little ugly.) But the question is, how can you be sure that
the layout machine A is sending matches the layout that machine B
is expecting?

As several people have mentioned, the standard does not completely
specify how bit fields are laid out. You can't be sure that the
above declaration will result in the same bit-level layout on two
different systems.

If your two machines are identical, with the same C implementation,
you can use whatever data structure you like and transmit it as
raw binary (as long as there are no pointers). But you're going
to have a lot of work to do when you later expand the system to
include some other machine.

Even sending 32-bit integers is non-portable. You can probably
rely on 2's-complement, but byte ordering can easily differ.

The most portable way to transmit data is probably as plain text
using a restricted character set. A sequence of 8-bit bytes is
nearly as good, assuming CHAR_BIT==8 on both ends.

My suggestion: first define the layout of the data you want to
transmit in terms of sequences of 8-bit bytes, then figure out how
to represent it in C.
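
As a sketch of that suggestion applied to the six fields in the question: the wire format is defined here as four 8-bit bytes, most significant first, with f1 in the top bit followed by f2 through f6. That ordering is an assumption made for this example, as are the names struct my_fields, my_fields_pack and my_fields_unpack.

```c
#include <assert.h>
#include <stdint.h>

/* Plain values in memory; the wire layout is defined by the code below,
 * not by this struct. Assumed bit positions in the 32-bit word:
 *   f1: 31      f2: 30..27   f3: 26..20
 *   f4: 19..17  f5: 16..5    f6: 4..0
 */
struct my_fields {
    unsigned f1, f2, f3, f4, f5, f6;
};

/* Pack the six fields into four bytes, most significant byte first. */
static void my_fields_pack(const struct my_fields *s, unsigned char out[4])
{
    uint32_t w = 0;
    assert(s->f1 <= 0x1u && s->f2 <= 0xFu && s->f3 <= 0x7Fu);
    assert(s->f4 <= 0x7u && s->f5 <= 0xFFFu && s->f6 <= 0x1Fu);
    w |= (uint32_t)s->f1 << 31;
    w |= (uint32_t)s->f2 << 27;
    w |= (uint32_t)s->f3 << 20;
    w |= (uint32_t)s->f4 << 17;
    w |= (uint32_t)s->f5 << 5;
    w |= (uint32_t)s->f6;
    out[0] = (unsigned char)((w >> 24) & 0xFFu);
    out[1] = (unsigned char)((w >> 16) & 0xFFu);
    out[2] = (unsigned char)((w >> 8) & 0xFFu);
    out[3] = (unsigned char)(w & 0xFFu);
}

/* Rebuild the six fields from four bytes received in the same order. */
static void my_fields_unpack(const unsigned char in[4], struct my_fields *s)
{
    uint32_t w = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
               | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    s->f1 = (unsigned)((w >> 31) & 0x1u);
    s->f2 = (unsigned)((w >> 27) & 0xFu);
    s->f3 = (unsigned)((w >> 20) & 0x7Fu);
    s->f4 = (unsigned)((w >> 17) & 0x7u);
    s->f5 = (unsigned)((w >> 5) & 0xFFFu);
    s->f6 = (unsigned)(w & 0x1Fu);
}
```

With the values from the question (0x1, 0x3, 0x55, 0x2, 0x66, 0x4) the four bytes come out as 0x9D 0x54 0x0C 0xC4 on either CPU, whatever its byte order or bit-field layout.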
 
Herbert Rosenau

How do you store the value of a structure that uses bit-fields in a
portable manner?

Impossible mission. You simply have no chance of being sure that 2
different implementations are using the same layout of bitfields.
For example, the following file uses bit-fields and not masks and
shifts. Is this according to the C99-standard or implementation
defined?

No. C (any version) gives an implementation any freedom it needs to
define the layout of bitfields as it likes. This may even change from
one version to the next one. It may change from one compiler to
another.

When you have to exchange data between different computers with
different CPUs, or even different OSes or different compilers, you can't
exchange bitfields. You may fail even when exchanging values wider than
a byte. The best chance of getting it done well is either to convert the
binary data to text and back (though even that can fail when the two ends
use different encodings for text) or to use the functions your network
APIs give you on both sides.



--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
eComStation 2.0 is here!
 
dspfun

As several people have mentioned, the standard does not completely
specify how bit fields are laid out.  You can't be sure that the
above declaration will result in the same bit-level layout on two
different systems.

So how come the Internet Protocol suite works so well? When looking
briefly at the Linux IP-stack it appears to be using bit-fields for,
for example, the IP-header.

Brs,
Markus
 
Ben Pfaff

dspfun said:
So how come the Internet Protocol suite works so well? When looking
briefly at the Linux IP-stack it appears to be using bit-fields for,
for example, the IP-header.

First, Linux is not written in portable C. It targets specific
architectures and C implementations.

Second, even with those assumptions Linux has to deal with
multiple bit-field orders, e.g. from include/linux/ip.h:

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
__u8 ihl:4,
version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
__u8 version:4,
ihl:4;
#else
#error "Please fix <asm/byteorder.h>"
#endif
__u8 tos;
__be16 tot_len;
__be16 id;
__be16 frag_off;
__u8 ttl;
__u8 protocol;
__sum16 check;
__be32 saddr;
__be32 daddr;
/*The options start here. */
};
 
