Bit-fields and integral promotion


Lawrence Kirby


It IS the case that bit-fields can be loaded and stored without affecting
anything else, as far as C is concerned, even if the implementation has
to jump through hoops to achieve it.

The statement is also true of ints. The main thing about bitfields is that
they don't necessarily occupy a whole number of bytes, i.e. a
particular byte may be used by more than one distinct object.
I'd agree that there's a potential difference in the implementation
(although I'm sure there are implementations that load & store chars
like a bitfield, and others that can load aligned bitfields like x.b as
a char). But is that sufficient reason to cause such a significant
difference in the semantics?

IMO implementation details are not a big issue here, they don't make much
difference one way or the other. The issue is how you interpret the C
standard. It is just one case of the overall value-preserving nastiness.
The very decision to adopt "value-preserving" promotions was to minimise
unexpected behaviour.

In which case it singularly failed. Having unsigned types switch to signed
types through non-obvious rules is what most people find "unexpected". :)
Having bitfields alone be "unsigned-preserving" would
be rather unexpected, surely?

Not if you viewed the type of the bit-field as unsigned int to start with.
In that case having a value of type unsigned int "promote" to int would be
odd indeed. However as per my other post it appears that shorter unsigned
int bitfields do promote to int.
I suppose it's just unexpected if you have my view of bitfields though -
I automatically think of "unsigned :8" as just being a custom
sub-integer type, akin to unsigned char. Maybe you think of it as an
unsigned int which just happens to be not allocated all its bits.

Bit-fieldness does seem to be a second class property of types.
I think 6.7.2.1p9 disagrees with you. The bitfield's type may not have a
name, but it is a type.

Whatever is meant by "is interpreted as". Is it a type (modifier) or isn't
it? If bit-fields are part of the type system why aren't they mentioned in
6.2.5?

Lawrence
 

Mark Piffer

Jack Klein wrote:

....lot of snippage...
Admittedly it is unfortunate that the standard does not specifically
mention bit fields in describing the usual integer conversions, and
hopefully that can be rectified in a TC or later version.

But since the standard selected what they call "value preserving" over
"sign preserving" operation, it would be seriously inconsistent and a
real source of problems if an 8-bit unsigned char promoted to a signed
int but an 8-bit unsigned bit field promoted to an unsigned int. That
would be rather absurd, don't you think?

One could claim that it is equally absurd to have an unsigned bit field
of width 15 promoted to int and one of width 16 to unsigned int
(following value-preservation rules on an example 16-bit architecture).
By your argument, the type used to declare the bitfield is ignored and
the field is promoted to the smallest value-preserving integer type;
this would lead to e.g. signed, unsigned, signed for bitfields of
width 15, 16 and 17, would it not?
Ah, you snipped your particular statement that my comment addressed,
so I am putting it back in:


I misinterpreted your meaning, so my comment doesn't apply. I was
thrown off by what I think is some incompleteness in your wording. I
think what you meant to say by "make a bit field unsigned" would be
better conveyed by the words "make an unsigned bit field promote to
unsigned int".

But despite the omission from the standard, it seems silly to think
that the compiler designer is given a choice here. Since all other
promotions and conversions are rather scrupulously defined, I find it
hard to believe that the intent was to leave the results of using an
unsigned bit field in an expression implementation defined. In fact,
nothing is implementation-defined unless the standard specifically
states that it is implementation-defined.

In fact, given the lack of mention, using the value of a bit field in
an expression technically produces undefined behavior based on the
wording of the standard today.

[IMHO]
The committee avoided (consciously or not) explicitly saying "look, we
defined value preservation in most cases but with bitfields we must
revert to sign-preservation. Please don't feel fooled because in most
cases this is what you as programmers actually want". Value
preservation as you propose it burdens programmers with another
cascade of promotions with weird architecture-dependent implications.
OTOH it would be consistent with the rest of C: it does the non-obvious
;)
[/IMHO]

Mark
 

CBFalconer

Alex said:
[big snip]

What are you trying to say?

That bit-fields declared with plain int should be unsigned?
That bit-fields should be "unsigned preserving"?
Both? Something else?

Promoting a signed bit-field to (signed) int requires extra effort
to copy or move the sign bit, but promoting an unsigned bit-field
to (signed) int does not.

True. A priori we don't know whether the sign bit of a bit
field is set. Therefore the initial expansion into a
register should prefer to be into an unsigned representation,
barring explicit information otherwise. That information can only
come from the designation of the bit field as being signed.

The point is "what type do we have after extracting the field into
a register". I claim that we have either a signed or unsigned
integer, depending solely on the type designation of the bit field,
and not on its size. Once that is settled we have clear rules for
any further promotions, operations, and potential overflows.
 

CBFalconer

Kevin said:
.... snip ...


I think 6.7.2.1p9 disagrees with you. The bitfield's type may not
have a name, but it is a type.

Are you thinking of this (from N869)?

[#8] A bit-field shall have a type that is a qualified or
unqualified version of _Bool, signed int, or unsigned int.
A bit-field is interpreted as a signed or unsigned integer
type consisting of the specified number of bits.95) If the
value 0 or 1 is stored into a nonzero-width bit-field of
type _Bool, the value of the bit-field shall compare equal
to the value stored.

If so, I think it bolsters my view. (_Bool being a 1 bit type and
new to C99).
 

CBFalconer

Mark said:
.... snip ...

[IMHO]
The committee avoided (consciously or not) explicitly saying "look,
we defined value preservation in most cases but with bitfields we
must revert to sign-preservation. Please don't feel fooled because
in most cases this is what you as programmers actually want". Value
preservation as you propose it burdens programmers with
another cascade of promotions with weird architecture-dependent
implications. OTOH it would be consistent with the rest of C: it
does the non-obvious ;)
[/IMHO]

I think that, together with my description of a sane implementor,
pretty well settles it. Now the thing is to get some verbiage into
the standard saying so.

It would suffice to say that any bitfield is treated as having the
specified type for any further operations.
 

Thad Smith

CBFalconer said:
Because a char, of any flavor, occupies a complete addressable
unit, and can be loaded and stored without affecting anything else
(at least as far as C is concerned). That does not apply to
bitfields, which may spread over byte demarcations (but not over
int demarcations).

It's implementation-defined whether bitfields are spread over integer
"units" or not. The Borland 16-bit C compiler did indeed spread
bitfields across byte and word boundaries as long as they were fully
contained in two bytes or fewer, taking advantage of the 80x86's ability
to access unaligned memory. A struct containing seven 9-bit bitfields
was allocated eight 8-bit bytes.

Thad
 

CBFalconer

Thad said:
It's implementation-defined whether bitfields are spread over
integer "units" or not. The Borland 16-bit C compiler did indeed
spread bitfields across byte and word boundaries as long as they
were fully contained in two bytes or fewer, taking advantage of the
80x86's ability to access unaligned memory. A struct containing
seven 9-bit bitfields was allocated eight 8-bit bytes.

I believe there is something in the standard forbidding that.
First, bitfields larger than an int are not allowed, and they are
not allowed to cross int boundaries, IIRC.
 

Alex Fraser

CBFalconer said:
Alex said:
[big snip]

What are you trying to say?

That bit-fields declared with plain int should be unsigned?
That bit-fields should be "unsigned preserving"?
Both? Something else?

Promoting a signed bit-field to (signed) int requires extra effort
to copy or move the sign bit, but promoting an unsigned bit-field
to (signed) int does not.

True. A priori we don't know whether the sign bit of a bit
field is set. Therefore the initial expansion into a
register should prefer to be into an unsigned representation,
barring explicit information otherwise. [...]

So bit-fields declared with plain int should be unsigned, because otherwise
all use will need the extra effort I mentioned (and you described before)?
The point is "what type do we have after extracting the field into
a register". I claim that we have either a signed or unsigned
integer, depending solely on the type designation of the bit field,
and not on its size.

Does that mean you think the standard says promotion of bit-fields is
"unsigned preserving"? That appears to be a minority view.

Alex
 

James Kuyper

CBFalconer wrote:
....
The point is "what type do we have after extracting the field into
a register". I claim that we have either a signed or unsigned
integer, depending solely on the type designation of the bit field,
and not on its size. Once that is settled we have clear rules for
any further promotions, operations, and potential overflows.

The standard specifies that what you have is an int, not an unsigned
int, if all the values of the original type can be represented in an
int, regardless of whether the original type was signed or unsigned.
That's certainly true for unsigned int:8, and on many implementations
it's also true for unsigned int:30. On most implementations it applies
to unsigned short, and on almost every implementation it applies to
unsigned char.

Are you claiming that for "unsigned int i:8", the original type is
"unsigned int"? I'll agree that 6.3.1.1p2 is unclear about that issue.
However, do you really want "unsigned int i:8" to be handled by
different rules than "unsigned char c", even on a machine where
CHAR_BIT==8?

Counter arguments:
6.7.2.1p3 refers to "... the type that is specified if the colon and
expression are omitted." This implies that it would be a different type
than is specified when the colon and expression are present. If that
weren't the case, it could have simply said "... the specified type".

6.7.2.1p8 says "A bit-field is interpreted as a signed or unsigned
integer type consisting of the specified number of bits." It might not
be a named type, and therefore can't be the subject of an explicit cast,
but it is its own type, and that type has a different number of bits
than it would have if the colon and the expression were absent.
 

Keith Thompson

CBFalconer said:
I believe there is something in the standard forbidding that.
First, bitfields larger than an int are not allowed, and they are
not allowed to cross int boundaries, IIRC.

I think you're mistaken.

C99 6.7.2.1p10 says:

An implementation may allocate any addressable storage unit
large enough to hold a bit-field. If enough space remains,
a bit-field that immediately follows another bit-field in
a structure shall be packed into adjacent bits of the same
unit. If insufficient space remains, whether a bit-field that
does not fit is put into the next unit or overlaps adjacent
units is implementation-defined. The order of allocation of
bit-fields within a unit (high-order to low-order or low-order
to high-order) is implementation-defined. The alignment of the
addressable storage unit is unspecified.
 

CBFalconer

Alex said:
.... snip ...


Does that mean you think the standard says promotion of bit-fields is
"unsigned preserving"? That appears to be a minority view.

What's this jargon about unsigned preserving, value preserving,
etc. We extract the bits. If it is an unsigned field we are done
(other bits being zeroed). If it is a signed field we use the left
most field bit to set the final sign, doing the appropriate things
to the other, formerly unspecified, bits. Now we have either a
signed or an unsigned int. Which depends only on the declaration.
No guessing involved. KISS.
 

Alex Fraser

CBFalconer said:
What's this jargon about unsigned preserving, value preserving,
etc.

What's this problem you have with answering my questions? :)

Unsigned preserving: a "small" signed type is promoted to int, and a "small"
unsigned type is promoted to unsigned int.

Value preserving: if int can represent all values of a given "small" type,
it is promoted to int; otherwise it is promoted to unsigned int.

("Small" here covers un/signed bit-fields, un/signed char and un/signed
short.)

Alex
 

CBFalconer

Alex said:
What's this problem you have with answering my questions? :)

Unsigned preserving: a "small" signed type is promoted to int,
and a "small" unsigned type is promoted to unsigned int.

Value preserving: if int can represent all values of a given
"small" type, it is promoted to int; otherwise it is promoted to
unsigned int.

("Small" here covers un/signed bit-fields, un/signed char and
un/signed short.)

No, small means bit-fields. chars are covered in the standard. In
your terms I claim only unsigned-preserving makes sense.
 

Richard Bos

It's a non-Standard view.
What's this jargon about unsigned preserving, value preserving,
etc.

It's the same jargon that the ISO C Standard Committee uses in their
Rationale:

# The unsigned preserving approach calls for promoting the two smaller
# unsigned types to unsigned int.

# The value preserving approach calls for promoting those types to
# signed int if that type can properly represent all the values of the
# original type, and otherwise for promoting those types to unsigned
# int.

# After much discussion, the C89 Committee decided in favor of
# value preserving rules,

Oh, and

# The Standard clarifies that the integer promotion rules also apply to
# bit fields.

Richard
 

Antoine Leca

Joe Wright wrote:
#include <stdio.h>
int main(void)
{
    unsigned char c = 0;
    /* c is promoted before the subtraction is performed */
    puts(c - 5 < 0 ? "foo" : "bar");
    return 0;
}

My output is..

foo

..and it must be so with any C compiler. It's called Integral
Promotion.

A pedantic reaction, but I am not 100% sure.
The reason is that I am not sure char is required to be "smaller" than int
under _any_ conforming implementation, even one that has a <stdio.h> header.

I know EOF should be negative. But I cannot find the words that require it
to be different from any unsigned char for a _freestanding_ implementation
(which may lack the <ctype.h> header).


Note: I am reading this from comp.std.c; this may simply be ignored by
comp.lang.c readers: for all practical purposes, it is probably sane to
assume that char is not as wide as int.

Antoine
 

kuyper

CBFalconer wrote:
....
No, small means bit-fields. chars are covered in the standard. In
your terms I claim only unsigned-preserving makes sense.

Well, bit-fields are also covered in the standard, by precisely the
same wording. The standard doesn't use the term "small" to describe
them; its use here was purely a way to simplify the wording. You
can propose your own alternative definition of "small". However, your
re-defined meaning of "small" can't be used in the above definitions.
Those definitions are correct only when they refer to all types for
which the range of possible values is a subset of the range of 'int'
values. That includes all signed bit-field types, and all unsigned
bit-field types with a width smaller than the width of 'int'.
 

Kevin Bracey

Mark Piffer said:
One could claim that it is equally absurd to have an unsigned bit field
of width 15 promoted to int and one of width 16 to unsigned int
(following value-preservation rules on an example 16-bit architecture).

No more absurd than

unsigned char uc;
unsigned short us;

uc + 1 -> int
us + 1 -> unsigned int

is it? Unsigned short promotes to unsigned int only because it is 16 bits
like int. Unsigned char does not. On another, 32-bit, implementation unsigned
short would promote to int. This "absurdity" already exists for the main
types.
By your argument, the type used to declare the bitfield is ignored and
the field is promoted to the smallest value-preserving integer type;
this would lead to e.g. signed, unsigned, signed for bitfields of
width 15, 16 and 17, would it not?

17-bit bitfields are not permitted on 16-bit systems, so the only issue is
that 16-bit ones promote differently to smaller ones. Just like unsigned
short promoting differently to unsigned char.
 

Keith Thompson

Antoine Leca said:
I know EOF should be negative. But I cannot find the words that require it
to be different from any unsigned char for a _freestanding_ implementation
(which may lack the <ctype.h> header).

EOF is defined in <stdio.h>, which freestanding implementations aren't
required to provide.
 

Antoine Leca

Keith Thompson wrote:
EOF is defined in <stdio.h>, which freestanding implementations
aren't required to provide.

Yes. But I do not see what conclusion you draw from this.

The example code (said to be strictly conforming) assumed that int was
strictly wider than unsigned char, while at the same time it #included
<stdio.h>. My point is that #including <stdio.h> does not force the use
of a hosted implementation (that is, a freestanding one can provide a
<stdio.h> header, and an otherwise conforming program could perfectly well
use it). On the other hand, I cannot spot in the Standard a requirement for a
freestanding implementation to have int strictly wider than [unsigned] char.

Sorry if I was unclear, and I hope you got me this time (but not sure :O) )


Antoine
 

Keith Thompson

Antoine Leca said:
Keith Thompson wrote:

Yes. But I do not see what conclusion you draw from this.

The example code (said to be strictly conforming) assumed that int was
strictly wider than unsigned char, while at the same time it #included
<stdio.h>. My point is that #including <stdio.h> does not force the use
of a hosted implementation (that is, a freestanding one can provide a
<stdio.h> header, and an otherwise conforming program could perfectly well
use it).

Hmm. A conforming freestanding implementation doesn't have to provide
<stdio.h>, and doesn't have to accept a strictly conforming program
that uses <stdio.h>. If a freestanding implementation does provide
<stdio.h>, I don't see any requirement for it to do so in a manner
that would be conforming for a hosted implementation; for example,
<stdio.h> might not even define EOF, or it might define EOF as a
positive value.

Certainly a freestanding implementation that provides some of the
non-required headers *should* do so in a consistent manner, but as far
as I can tell the standard doesn't say so.
On the other hand, I cannot spot in the Standard a requirement for a
freestanding implementation to have int strictly wider than [unsigned] char.

I don't see any such (explicit) requirement even for hosted
implementations.
 
