Bitwise Operator Effects on Padding Bits

Guest

Do you have a machine and C compiler that supports _Bool?  Is it
mythical?

The type _Bool needn't necessarily have padding bits. In fact, in
every implementation I inspected it has none.
 
Ben Bacarisse

The type _Bool needn't necessarily have padding bits. In fact, in
every implementation I inspected it has none.

How do you know? I ask because it is easy to confuse value bits with
the undefined behaviour of padding bits (I've made that mistake myself
in this very newsgroup).

BTW you are quite right about the first part -- I don't think _Bool
*has* to have padding bits either -- but having them is one of the
easiest ways to ensure that _Bool behaves as it should so I'd guess it is
not uncommon. That really is a guess though -- it could be that _Bool
has no padding bits more often than it does.

The problem is a subtle one because you can't use the normal means to
determine the maximum value that an unsigned type can hold. For any
unsigned type other than _Bool, T t = -1 initialises t to be the type's
maximum value, but a conversion to _Bool is specifically defined to
produce either 0 or 1.

If you set bits directly in the representation of a _Bool, how can you
tell if those bits are value or padding bits? They might look like
value bits (in that you get a value > 1 from the _Bool object) but that
could just be the result of undefined behaviour from a representation
that does not "represent a value of the object type" (6.2.6.1 p5).

I think you can get close to an answer by defining a bit-field of type
_Bool. The bit-field width can be no larger than the width of the field
type (the width is the number of value bits in an unsigned type). If an
implementation has sizeof (_Bool) == 1 and it permits

struct bt { _Bool b : CHAR_BIT; } s;

then, yes, that implementation has _Bool with no padding. In such cases
I'd want to check that s.b = -1 gave s.b == 1 as specified. gcc 4.4.3
rejects any _Bool bit-field with a size > 1.

Unfortunately, the paragraph in question was modified by TC2 and I don't
have the base C99 document to know what it said before. It's possible
that C99 (without TC2) does not even provide this method of detecting
the width of a _Bool.
 
Keith Thompson

Ben Bacarisse said:
If you set bits directly in the representation of a _Bool, how can you
tell if those bits are value or padding bits? They might look like
value bits (in that you get a value > 1 from the _Bool object) but that
could just be the result of undefined behaviour from a representation
that does not "represent a value of the object type" (6.2.6.1 p5).
[...]

Unfortunately, the paragraph in question was modified by TC2 and I don't
have the base C99 document to know what it said before. It's possible
that C99 (without TC2) does not even provide this method of detecting
the width of a _Bool.

TC2 didn't touch 6.2.6.1p5, which says:

Certain object representations need not represent a value of
the object type. If the stored value of an object has such a
representation and is read by an lvalue expression that does
not have character type, the behavior is undefined. If such
a representation is produced by a side effect that modifies
all or any part of the object by an lvalue expression that
does not have character type, the behavior is undefined.
Such a representation is called a _trap representation_.

It did affect 6.2.6.1p6, which changed from C99:

When a value is stored in an object of structure or union
type, including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values. The values of padding bytes shall
not affect whether the value of such an object is a trap
representation. Those bits of a structure or union object that
are in the same byte as a bit-field member, but are not part
of that member, shall similarly not affect whether the value
of such an object is a trap representation.

to N1256:

When a value is stored in an object of structure or union
type, including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values. The value of a structure or union object is
never a trap representation, even though the value of a member
of the structure or union object may be a trap representation.

There was also a change in the footnote, from:

Thus, for example, structure assignment may be implemented
element-at-a-time or via memcpy.

to:

Thus, for example, structure assignment need not copy any
padding bits.
 
Ben Bacarisse

Keith Thompson said:
Ben Bacarisse said:
If you set bits directly in the representation of a _Bool, how can you
tell if those bits are value or padding bits? They might look like
value bits (in that you get a value > 1 from the _Bool object) but that
could just be the result of undefined behaviour from a representation
that does not "represent a value of the object type" (6.2.6.1 p5).
[...]

Unfortunately, the paragraph in question was modified by TC2 and I don't
have the base C99 document to know what it said before. It's possible
that C99 (without TC2) does not even provide this method of detecting
the width of a _Bool.

TC2 didn't touch 6.2.6.1p5, which says:

No indeed. I was not clear. After talking about 6.2.6.1 p5 I moved on
to the only way I could think of for actually testing the number of
value bits in the type _Bool. The ambiguous "paragraph in question" is
the one that imposes a constraint on the size of a bit-field: 6.7.2.1
p3. This is the one whose pre-TC2 contents are a mystery to me.

Sorry to have misled you into a wild corrigenda chase.

<snip>
 
Keith Thompson

Ben Bacarisse said:
Keith Thompson said:
Ben Bacarisse said:
If you set bits directly in the representation of a _Bool, how can you
tell if those bits are value or padding bits? They might look like
value bits (in that you get a value > 1 from the _Bool object) but that
could just be the result of undefined behaviour from a representation
that does not "represent a value of the object type" (6.2.6.1 p5).
[...]

Unfortunately, the paragraph in question was modified by TC2 and I don't
have the base C99 document to know what it said before. It's possible
that C99 (without TC2) does not even provide this method of detecting
the width of a _Bool.

TC2 didn't touch 6.2.6.1p5, which says:

No indeed. I was not clear. After talking about 6.2.6.1 p5 I moved on
to the only way I could think of for actually testing the number of
value bits in the type _Bool. The ambiguous "paragraph in question" is
the one that imposes a constraint on the size of a bit-field: 6.7.2.1
p3. This is the one whose pre-TC2 contents are a mystery to me.

Sorry to have misled you into a wild corrigenda chase.

<snip>

In C99, 6.7.2.1p3 says:

The expression that specifies the width of a bit-field shall be
an integer constant expression that has nonnegative value that
shall not exceed the number of bits in an object of the type
that is specified if the colon and expression are omitted. If
the value is zero, the declaration shall have no declarator.

In N1256, it says:

The expression that specifies the width of a bit-field shall
be an integer constant expression with a nonnegative value
that does not exceed the width of an object of the type that
would be specified were the colon and expression omitted. If
the value is zero, the declaration shall have no declarator.

The change is in response to Defect Report #262,
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_262.htm>.
Apart from some cleanup of the wording, the effect is to refer to
the "width" of the type rather than the ambiguous "number of bits".

It's not clear (to me, anyway) what the width of _Bool is supposed to
be. It seems to me that the Standard doesn't preclude either of the
following (assume CHAR_BIT==8 and sizeof(_Bool)==1):

_Bool has 1 value bit and 7 padding bits and can only represent
the values 0 and 1; its width is 1.

_Bool has 8 value bits and can represent values from 0 to
255 inclusive (but storing a value other than 0 or 1 requires
tricks); its width is 8.

More concretely, what is the required behavior of this program?

#include <stdio.h>
int main(void)
{
    if (sizeof(_Bool) == 1) {
        _Bool b;
        *(unsigned char*)&b = 2;
        printf("b = %d\n", b);
    }
    else {
        puts("sizeof(_Bool) != 1");
    }
    return 0;
}

Assuming sizeof(_Bool)==1, must it print "b = 2", or is the behavior
undefined?
 
Ben Bacarisse

Keith Thompson said:
Ben Bacarisse said:
Keith Thompson said:
[...]
If you set bits directly in the representation of a _Bool, how can you
tell if those bits are value or padding bits? They might look like
value bits (in that you get a value > 1 from the _Bool object) but that
could just be the result of undefined behaviour from a representation
that does not "represent a value of the object type" (6.2.6.1 p5).

[...]

Unfortunately, the paragraph in question was modified by TC2 and I don't
have the base C99 document to know what it said before. It's possible
that C99 (without TC2) does not even provide this method of detecting
the width of a _Bool.

TC2 didn't touch 6.2.6.1p5, which says:

No indeed. I was not clear. After talking about 6.2.6.1 p5 I moved on
to the only way I could think of for actually testing the number of
value bits in the type _Bool. The ambiguous "paragraph in question" is
the one that imposes a constraint on the size of a bit-field: 6.7.2.1
p3. This is the one whose pre-TC2 contents are a mystery to me.

Sorry to have misled you into a wild corrigenda chase.

<snip>

In C99, 6.7.2.1p3 says:

The expression that specifies the width of a bit-field shall be
an integer constant expression that has nonnegative value that
shall not exceed the number of bits in an object of the type
that is specified if the colon and expression are omitted. If
the value is zero, the declaration shall have no declarator.

In N1256, it says:

The expression that specifies the width of a bit-field shall
be an integer constant expression with a nonnegative value
that does not exceed the width of an object of the type that
would be specified were the colon and expression omitted. If
the value is zero, the declaration shall have no declarator.

The change is in response to Defect Report #262,
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_262.htm>.
Apart from some cleanup of the wording, the effect is to refer to
the "width" of the type rather than the ambiguous "number of bits".

Ah, very interesting. Thank you. Is there an index that shows what
changes are due to which defect reports? The last time I wanted to know
I downloaded a whole bunch of them and used grep.

It's not clear (to me, anyway) what the width of _Bool is supposed to
be. It seems to me that the Standard doesn't preclude either of the
following (assume CHAR_BIT==8 and sizeof(_Bool)==1):

_Bool has 1 value bit and 7 padding bits and can only represent
the values 0 and 1; its width is 1.

_Bool has 8 value bits and can represent values from 0 to
255 inclusive (but storing a value other than 0 or 1 requires
tricks); its width is 8.

I think both are permitted along with all points in between (i.e. any
width from 1 to CHAR_BIT seems to me to be permitted). I can't find any
text that imposes any tighter constraints on the implementation.

More concretely, what is the required behavior of this program?

#include <stdio.h>
int main(void)
{
    if (sizeof(_Bool) == 1) {
        _Bool b;
        *(unsigned char*)&b = 2;
        printf("b = %d\n", b);
    }
    else {
        puts("sizeof(_Bool) != 1");
    }
    return 0;
}

Assuming sizeof(_Bool)==1, must it print "b = 2", or is the behavior
undefined?

I think it is up to the implementation. I say this simply because I
can't see any reason why either option is prohibited. If the
implementation chooses to give _Bool at least two value bits then it
must print "b = 2" but if there is only one value bit the result is
undefined. Of course, you may well get "b = 2" even when the behaviour
is undefined.

The current definition requires special treatment of _Bool in several
places whereas defining _Bool to have only one value bit would have
almost exactly the same effect with slightly fewer special cases.
Therefore I'd say that the permission to have more than 1 value bit is
deliberate.
 
lawrence.jones

Keith Thompson said:
More concretely, what is the required behavior of this program?

#include <stdio.h>
int main(void)
{
    if (sizeof(_Bool) == 1) {
        _Bool b;
        *(unsigned char*)&b = 2;
        printf("b = %d\n", b);
    }
    else {
        puts("sizeof(_Bool) != 1");
    }
    return 0;
}

Assuming sizeof(_Bool)==1, must it print "b = 2", or is the behavior
undefined?

I think it's unspecified. It's certainly not required to print "b = 2"
since the width of _Bool is permitted to be 1. It could result in
undefined behavior if the value 2 happens to be a trap representation,
but I don't think there's any undefined behavior otherwise.
 
lawrence.jones

Keith Thompson said:
_Bool is also the only predefined integer type that doesn't have _MIN
and _MAX macros in <limits.h> or <stdint.h>.

Since they are, by definition, 0 and 1, there didn't seem to be much
point.
 
Ben Bacarisse

Keith Thompson said:
Ben Bacarisse said:
Keith Thompson said: [...]
It's not clear (to me, anyway) what the width of _Bool is supposed to
be. It seems to me that the Standard doesn't preclude either of the
following (assume CHAR_BIT==8 and sizeof(_Bool)==1):

_Bool has 1 value bit and 7 padding bits and can only represent
the values 0 and 1; its width is 1.

_Bool has 8 value bits and can represent values from 0 to
255 inclusive (but storing a value other than 0 or 1 requires
tricks); its width is 8.

I think both are permitted along with all points in between (i.e. any
width from 1 to CHAR_BIT seems to me to be permitted). I can't find any
text that imposes any tighter constraints on the implementation.

Are widths exceeding CHAR_BIT forbidden? I know that _Bool is required
to have a lower rank than any other standard integer type, but I don't
think that implies it can't have a wider range than unsigned char
(though it could cause some interesting effects if it does).

I think you are right, and I'd come to the same conclusion while writing
another post but forgot it by the time I replied here! I see no reason
why _Bool can't be wider than unsigned char.

<snip>
 
Seebs

Since they are, by definition, 0 and 1, there didn't seem to be much
point.

Are they, though? If you can write in a value other than zero or one
which isn't a trap representation (using the unsigned-char interface),
maybe the largest value the type can hold should be BOOL_MAX, even
though it might not be 1.

I think the simplest solution, though, is to just declare that any
value other than 0 or 1 is actually a trap representation, it's just that
the undefined behavior you get from it may include "inexplicably
evaluating to a value outside the range of the type, such as 2, but
not raising any exceptions".

-s
 
Peter Nilsson

Seebs said:
Are they, though?

What other values can you assign to it?

If you can write in a value other than zero or one which isn't a
trap representation (using the unsigned-char interface), maybe
the largest value the type can hold should be BOOL_MAX, even
though it might not be 1.

Hmm... Playing advocate... if you had said...

If you can write a value other than 0..UINT_MAX which isn't a
trap representation (using the unsigned-char interface), maybe the
largest value the type can hold should be REAL_UINT_MAX, even
though it might not be UINT_MAX.

How much sense would that make?
 
Seebs

What other values can you assign to it?

You can't assign other values to it, but it's not obvious
that a value outside that range can't be stored in it. We
know for sure that it can't possibly be a one-bit physical
type -- it has to have at least CHAR_BIT bits of physical
storage.

Hmm... Playing advocate... if you had said...

If you can write a value other than 0..UINT_MAX which isn't a
trap representation (using the unsigned-char interface), maybe the
largest value the type can hold should be REAL_UINT_MAX, even
though it might not be UINT_MAX.

How much sense would that make?

None, because UINT_MAX is defined in terms of the values that
are represented in the type -- but _Bool never really says that
you can't represent other values, just that anything assigned
into it turns into a zero or one.

-s
 
Shao Miller

Seebs said:
You can't assign other values to it, but it's not obvious
that a value outside that range can't be stored in it. We
know for sure that it can't possibly be a one-bit physical
type -- it has to have at least CHAR_BIT bits of physical
storage.

I was under the impression that if we do not take the address of
non-bit-field objects with type '_Bool', that the implementation could
pack them together and reads/stores of their values could be translated
as bitwise operations, perhaps on any old register.

Since only 0 or 1 can be assigned and it's a "standard unsigned integer
type", things are pretty simple. If we take an address, maybe not.
 
Shao Miller

Shao said:
I was under the impression that if we do not take the address of
non-bit-field objects with type '_Bool', that the implementation could
pack them together and reads/stores of their values could be translated
as bitwise operations, perhaps on any old register.

Since only 0 or 1 can be assigned and it's a "standard unsigned integer
type", things are pretty simple. If we take an address, maybe not.

Then again, 6.3.1.2 does say "Boolean". :)
 
Nick Keighley

its overcast and looks like rain
Why don't you write a little program and let the computer figure it
out?

because that only tells you what a particular machine (and its
implementation) does, not what the C standard says.
 
Peter Nilsson

Seebs said:
Peter Nilsson said:
What other values can you assign to [_Bool]?

You can't assign other values to it, but it's not obvious
that a value outside that range can't be stored in it.

It's not obvious that a value outside the range of unsigned
int can't be stored in an unsigned int with padding bits.

We know for sure that it can't possibly be a one-bit
physical type -- it has to have at least CHAR_BIT bits
of physical storage.

We can know for sure whether a given type has padding bits.
We know it has to have a multiple of CHAR_BIT bits of
physical storage.

None, because UINT_MAX is defined in terms of the values
that are represented in the type

And _Bool isn't?

-- but _Bool never really says that you can't represent
other values, just that anything assigned into it turns
into a zero or one.

With the exception of unsigned char, no padded unsigned
integer type ever really says that you can't represent
other values, just that anything assigned (or converted)
into it will fit into its range.

The fact that _Bool's width is narrower than a character is
no different to any other padded unsigned type being narrower
than its byte size * CHAR_BIT bits.
 
Seebs

Seebs said:
Peter Nilsson said:
What other values can you assign to [_Bool]?
You can't assign other values to it, but it's not obvious
that a value outside that range can't be stored in it.
It's not obvious that a value outside the range of unsigned
int can't be stored in an unsigned int with padding bits.

Hmm. Interesting point. By definition, padding bits don't contribute
to the value.

The fact that _Bool's width is narrower than a character is
no different to any other padded unsigned type being narrower
than its byte size * CHAR_BIT bits.

In which case, I'm pretty sure that the value has to end up being always
exactly zero or one, regardless of the contents of the padding bits.

At least, that would seem to be the intent. In practice, _Bool is a
special case because on most targets it's probably the only integer type
with any padding.

-s
 
Guest

How do you know?  I ask because it is easy to confuse value bits with
the undefined behaviour of padding bits (I've made that mistake myself
in this very newsgroup).

I know by code inspection. The assignment of a _Bool object to an int
object was compiled to a plain move of a 32 bit word to a 32 bit word.
If the _Bool object had padding bits, this move wouldn't be
sufficient for the conversion to int.
(By the way, I don't see how one can confuse bits with behaviour.)
 
Ben Bacarisse

I know by code inspection. The assignment of a _Bool object to an int
object was compiled to a plain move of a 32 bit word to a 32 bit word.
If the _Bool object had padding bits, this move wouldn't be
sufficient for the conversion to int.

What part of the semantics of _Bool and/or int does this plain move
violate?

If, for example, all non-zero settings of 31 padding bits correspond to
trap representations then the conversion to int is undefined if anything
other than the single value bit is set (6.2.6.1 p5). Hence a plain copy
is as good as anything else.

(By the way, I don't see how one can confuse bits with behaviour.)

It was shorthand for "it is easy to confuse the behaviour you'd expect
from extra value bits with the undefined behaviour that is permitted for
various settings of padding bits".
 
Guest

What part of the semantics of _Bool and/or int does this plain move
violate?

If, for example, all non-zero settings of 31 padding bits correspond to
trap representations then the conversion to int is undefined if anything
other than the single value bit is set (6.2.6.1 p5).  Hence a plain copy
is as good as anything else.

I appreciate the possibility of your explanation, that the inspected
translation could be a utilization of the undefinedness of trap
representation access behaviour. Let's note that padding bits are not
necessary for the existence of trap representations; combinations of
value bits which do not represent a value of the object type would
also do.
Now I even begin to think that whether _Bool has padding bits is
undecidable by any means other than the implementation's
documentation.
 
