Bit mask

Als

What's an efficient way to mask the last 3 bits of an 8-bit char and make them
all zero?

Bit-shifting is possible, but I'm not sure whether it is efficient enough.

Example:

01011[010] --> 01011[000]

Thanks!
 
Joona I Palaste

Als said:
What's an efficient way to mask the last 3 bits of an 8-bit char and make them
all zero?
Bit-shifting is possible, but I'm not sure whether it is efficient enough.

01011[010] --> 01011[000]

Well, a really simple way would be ANDing with ~7.
For example:
unsigned char x = 0x5A; /* the same as binary 01011010 */
x = x & ~7; /* clear last 3 bits */
Is it efficient? It depends on your implementation.
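
A minimal, self-contained sketch of the masking step above. The helper name clear_low3 is made up for illustration and does not come from the thread.

```c
/* Hypothetical helper: clear the three least significant bits. */
unsigned char clear_low3(unsigned char x)
{
    /* ~7 has every bit set except the lowest three, so the AND
       keeps the upper bits and zeroes the last three. */
    return x & ~7;
}
```

With the value from the example, clear_low3(0x5A) yields 0x58, i.e. 01011000.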
 
Kevin Goodsell

Mark said:
I meant nothing WRT efficiency, just that I thought that &= is the more
common idiom and that you needed to cast ~0x07 to unsigned char.

Why would the cast be needed? It seems useless to me, since the result
will be promoted back to int (probably - unsigned int is theoretically
possible also) anyway.

-Kevin
 
Als

Eric Sosman said:
Als said:
What's an efficient way to mask the last 3 bits of an 8-bit char and make them
all zero?

Bit-shifting is possible, but I'm not sure whether it is efficient enough.

Example:

01011[010] --> 01011[000]

Many or perhaps even most C implementations use an
eight-bit `char', but that is not actually guaranteed
by the language, and implementations using wider `char'
are known to exist. Still:

unsigned char byte = 0x5A; /* 00...01011010 */

Is there any reason that you use "unsigned char" instead of "char" above?
Thanks!
 
pete

Als said:
Eric Sosman said:
Als said:
What's an efficient way to mask the last 3 bits of an 8-bit char and make them
all zero?

Bit-shifting is possible, but I'm not sure whether it is efficient enough.

Example:

01011[010] --> 01011[000]

Many or perhaps even most C implementations use an
eight-bit `char', but that is not actually guaranteed
by the language, and implementations using wider `char'
are known to exist. Still:

unsigned char byte = 0x5A; /* 00...01011010 */

Is there any reason that you use "unsigned char"
instead of "char" above?

The results of bitwise operations are implementation-defined
if the sign bit is set prior to or during the operation.
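
To make that concrete, here is a sketch that converts a plain char to unsigned char before masking, so that no sign bit takes part in the bitwise operation. The function name is hypothetical, and the values quoted afterwards assume CHAR_BIT is 8 and an ASCII character set.

```c
/* Hypothetical helper: mask a plain char without involving a sign bit. */
unsigned char clear_low3_of_char(char c)
{
    /* Plain char may be signed. Converting to unsigned char first
       gives a non-negative value, so the AND works on value bits only. */
    unsigned char u = (unsigned char)c;
    return u & ~7u;
}
```

Under those assumptions, clear_low3_of_char('Z') gives 0x58, and a char holding the bit pattern 11011010 gives 0xD8 whether plain char is signed or not.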
 
Jack Klein

01011[010] --> 01011[000]

Well, a really simple way would be ANDing with ~7.
For example:
unsigned char x = 0x5A; /* the same as binary 01011010 */
x = x & ~7; /* clear last 3 bits */
Is it efficient? It depends on your implementation.

Why not do:

unsigned char x = 0x5A;

x &= (unsigned char) ~0x07;

No gain, really. x will be promoted to either int or unsigned int,
and the numeric literal 0x5A, which has type int, will either be
unchanged or also promoted to unsigned int, before the binary operation is
performed. Then the result will be converted back to unsigned char
for assignment back into x.

So casting the constant to unsigned char merely adds noise to the
source without changing a thing. If the constant were large enough
that it might actually be negative, a cast to (unsigned) could be
beneficial.
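
A small sketch comparing the two spellings discussed here. The function names are invented for the example, and the claim that they agree assumes an ordinary two's complement implementation with 8-bit chars.

```c
/* Hypothetical helpers comparing the two spellings from the thread. */
unsigned char mask_with_cast(unsigned char x)
{
    x &= (unsigned char)~0x07;  /* the cast result is promoted again before the AND */
    return x;
}

unsigned char mask_without_cast(unsigned char x)
{
    x &= ~0x07;                 /* ~0x07 already has type int */
    return x;
}
```

Both functions return the same value for every unsigned char input on such a platform, which is why the cast adds nothing.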
 
Anders Mikkelsen

Eric said:
unsigned char x = 90;

x = x & ~7;

I'm curious about using the ~ operator. Will the compiler recognise ~7
as a constant value, or will the compiled code include one's complement
instruction(s)?

Wouldn't this be better:

x &= 0xf8;

For me this is more readable than the above...


Regards,
Anders
 
pete

Anders said:
I'm curious about using the ~ operator. Will the compiler recognise ~7
as a constant value,

It *is* a constant value.

or will the compiled code include one's complement
instruction(s)?

It can do that too, but I would expect that it wouldn't.

Wouldn't this be better:

x &= 0xf8;

For me this is more readable than the above...

That will mask an 8 bit char as OP specified,
but (x &= ~7) will work on a char of any width.
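
The difference shows up once the value is wider than eight bits. In this sketch an unsigned int stands in for a wider char (an assumption made only so the example runs on ordinary hardware), and the function names are illustrative.

```c
/* Uses unsigned int as a stand-in for a char wider than 8 bits. */
unsigned int mask_with_f8(unsigned int x)
{
    return x & 0xf8;   /* also clears every bit above bit 7 */
}

unsigned int mask_with_not7(unsigned int x)
{
    return x & ~7u;    /* clears only the low three bits */
}
```

For the 12-bit value 0xABF, mask_with_f8 returns 0x0B8 and loses the upper bits, while mask_with_not7 returns 0xAB8.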
 
Morris Dovey

Anders said:
I'm curious about using the ~ operator. Will the compiler recognise ~7
as a constant value, or will the compiled code include one's complement
instruction(s)?

Hmm. Very dangerous to predict compiler behaviors - or did you
have a particular compiler in mind? It's as much a constant value
as -7 (which still says nothing about the compiler's behavior).

Wouldn't this be better:

x &= 0xf8;

For me this is more readable than the above...

Perhaps more readable for you; but what if CHAR_BIT isn't less
than nine bits? Imagine that this program is run on my DS9K
configured for 12-bit chars - what then?
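
One way to keep a named mask without assuming 8-bit chars is to derive it from UCHAR_MAX in <limits.h>. This is only a sketch, and the names LOW3_CLEAR_MASK and clear_low3_portable are invented here.

```c
#include <limits.h>

/* Hypothetical mask: all bits of unsigned char set except the low three.
   With 8-bit chars this equals 0xf8; with 12-bit chars it would be 0xff8. */
#define LOW3_CLEAR_MASK (UCHAR_MAX & ~7u)

unsigned char clear_low3_portable(unsigned char x)
{
    return x & LOW3_CLEAR_MASK;
}
```

Because UCHAR_MAX has every value bit set, the mask always equals UCHAR_MAX - 7, whatever CHAR_BIT happens to be.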
 
Mark A. Odell

No gain, really. x will be promoted to either int or unsigned int,
and the numeric literal 0x5A, which has type int, will either be
unchanged or also promoted to unsigned int, before the binary operation is
performed. Then the result will be converted back to unsigned char
for assignment back into x.

So casting the constant to unsigned char merely adds noise to the
source without changing a thing. If the constant were large enough
that it might actually be negative, a cast to (unsigned) could be
beneficial.

I'm so used to 32-bit machines now that maybe I'm tainted. Thanks, Jack. So I
only need to cast when 'x' or the mask constant exceeds the largest value
representable in a signed int?
 
pete

No gain, really. x will be promoted to either int or unsigned int,
and the numeric literal 0x5A, which has type int, will either be
unchanged or also promoted to unsigned int, before the binary operation is
performed. Then the result will be converted back to unsigned char
for assignment back into x.

There are two things wrong with that:
1 The left operand of the assignment operator doesn't get promoted.
2 Constants of type int are not subject to integer promotions.

N869

6.5.16.1 Simple assignment
Semantics
[#2] In simple assignment (=), the value of the right
operand is converted to the type of the assignment
expression and replaces the value stored in the object
designated by the left operand.

6.5.16 Assignment operators
Semantics
[#3]
The type of an assignment expression is
the type of the left operand unless the left operand has
qualified type, in which case it is the unqualified version
of the type of the left operand.


6.3.1.1 Boolean, characters, and integers
[#2]
If an int can represent all values of the original type, the
value is converted to an int; otherwise, it is converted to
an unsigned int. These are called the integer
promotions.
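
The quoted rules can be checked with C11's _Generic, which selects a branch by the type of an expression without evaluating it. A sketch, assuming a C11 compiler and a platform where int can represent every unsigned char value; the function names are made up for the example.

```c
/* Returns 1 if (x & ~7) has type int, i.e. x was promoted. */
int and_result_is_int(void)
{
    unsigned char x = 0x5A;
    return _Generic((x & ~7), int: 1, default: 0);
}

/* Returns 1 if the assignment expression has the type of its
   left operand, unsigned char, as 6.5.16 [#3] says. */
int assignment_is_unsigned_char(void)
{
    unsigned char x = 0x5A;
    return _Generic((x = x & ~7), unsigned char: 1, default: 0);
}
```

On such a platform both functions return 1: the operand of & is promoted to int, and the result is converted to unsigned char only when it is assigned.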
 
Eric Sosman

Christopher said:
Is ~7 guaranteed to be ...11111111111000 on all systems?

Yes. There's this pettifogging possibility, though,
that ...1111000 could be a trap representation for `int'
(as far as I know, the only platform for which this is
true is the Deathstation 9000, and then only on alternate
Thursdays when the moon is full). For 100% safety, you
could write `~7u' instead.

... and if that's the worst thing you need to worry
about, you are to be envied.
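
A short sketch of the `~7u' spelling. With the u suffix the complement is computed in unsigned int, so no negative value or sign bit is involved. The function name is illustrative only.

```c
#include <limits.h>

/* Hypothetical helper using the unsigned constant suggested above. */
unsigned char clear_low3_u(unsigned char x)
{
    /* ~7u equals UINT_MAX - 7: an unsigned value, never negative. */
    return x & ~7u;
}
```

By contrast, plain ~7 is a negative int, which is where the trap-representation worry comes from.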
 
Alexander Bartolich

begin followup to Christopher Benson-Manica:
Is ~7 guaranteed to be ...11111111111000 on all systems?

Well, on a machine running trinary logic ... perhaps using three
charge states of an atom ... where the decimal number '7' is
represented by the digits '21' ... well, how the **** is bitwise
negation meant to work there?

Perhaps like multiplication with -1, i.e. the zero 'trit' is left
unchanged while the outer two trit values are flipped. In that case
negating '21' results in '12' since leading zeros are not modified.

Yeah.
 
pete

Eric said:
Yes. There's this pettifogging possibility, though,
that ...1111000 could be a trap representation for `int'
(as far as I know, the only platform for which this is
true is the Deathstation 9000, and then only on alternate
Thursdays when the moon is full). For 100% safety, you
could write `~7u' instead.

I think that's best. As a matter of policy,
I prefer to avoid bitwise operations on signed types,
unless there is a special reason.
 
Peter Nilsson

Eric Sosman said:
Yes. There's this pettifogging possibility, though,
that ...1111000 could be a trap representation for `int'
(as far as I know, the only platform for which this is
true is the Deathstation 9000, and then only on alternate
Thursdays when the moon is full). For 100% safety, you
could write `~7u' instead.

Well, if you think that ~ can change padding bits, then ~7u isn't safe
either, since even unsigned types can have padding, and hence,
potential trap representations (apart from the uintN_t types of
course).

A case to hypothetically consider would be if INT_MIN did not have a
magnitude of 2^N or 2^N-1. In other words, if you consider that an int
can have a bizarre range like (-78269..65318).
 
pete

Peter said:
Well, if you think that ~ can change padding bits,
then ~7u isn't safe either, since even unsigned types can have
padding, and hence, potential trap representations
(apart from the uintN_t types of course).

He may have been referring to negative zero, instead of padding bits.

In C99 there are only three formats for representing negative integers,
but in C89, the representation of negative integer values
is only specified in broad terms relating to sign and value bits,
which would allow an implementation to define any particular
negative integer value representation as negative zero.
 
