bit vector question: why shift (>>) 5?

gokkog · Sep 16, 2006

Hello,

Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

From http://netlib.bell-labs.com/cm/cs/pearls/bitsort.c

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Some references to bit vector tutorials are also welcome.

Thanks and best regards,
Wenjie

Bill Pursell · Sep 16, 2006

Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

From http://netlib.bell-labs.com/cm/cs/pearls/bitsort.c

Click to expand...

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Are you sure this is "Programming Pearls" and not "Introduction
to Obfuscation?" This would be much more clearly written as:
void set(int i) { a [ i / BITSPERWORD ] |= (1 << ( i % BITSPERWORD));}
etc. That should answer your questions.

There are arguments that using the << operater will lead
to better code, but it's highly questionable. (Most compilers
will generate exactly the same code for the two.)

And BITSPERWORD would be better defined as
#define BITSPERWORD ( sizeof (int) * CHAR_BIT)

Jean-Marc Bourguet · Sep 16, 2006

Bill Pursell said:
Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

From http://netlib.bell-labs.com/cm/cs/pearls/bitsort.c

Click to expand...

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Click to expand...

Are you sure this is "Programming Pearls" and not "Introduction
to Obfuscation?"

Note that the book was first published in the mid 80's and a collection of
columns published before. I don't know how much it has been revised in the
recent printings/editions.

This would be much more clearly written as: void set(int i) { a [ i /
BITSPERWORD ] |= (1 << ( i % BITSPERWORD));} etc. That should answer
your questions.

There are arguments that using the << operater will lead
to better code, but it's highly questionable. (Most compilers
will generate exactly the same code for the two.)

I don't think so (note: i is an *signed* int).

Yours,

gokkog · Sep 17, 2006

Jean-Marc Bourguet said:
Bill Pursell said:

Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

From http://netlib.bell-labs.com/cm/cs/pearls/bitsort.c

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Click to expand...

Are you sure this is "Programming Pearls" and not "Introduction
to Obfuscation?"

Click to expand...

Note that the book was first published in the mid 80's and a collection of
columns published before. I don't know how much it has been revised in the
recent printings/editions.

I am reading the second edition (1999), according to the author, all
the example codes have been re-written.

This would be much more clearly written as: void set(int i) { a [ i /
BITSPERWORD ] |= (1 << ( i % BITSPERWORD));} etc. That should answer
your questions.

There are arguments that using the << operater will lead
to better code, but it's highly questionable. (Most compilers
will generate exactly the same code for the two.)

Click to expand...

I don't think so (note: i is an *signed* int).

You are talking about specifically "<< as in (1<<...)" not ">> as in
a[i>>SHIFT]"? I tried both versions (Bill Pursell & Jon Bentley),
neither of them handles minus numbers correctly. So what is behind your
hint?

My thanks to all of you.
Wenjie

CBFalconer · Sep 17, 2006

.... snip ...

You are talking about specifically "<< as in (1<<...)" not ">> as
in a[i>>SHIFT]"? I tried both versions (Bill Pursell & Jon Bentley),
neither of them handles minus numbers correctly. So what is behind
your hint?

From N869:

6.5.7 Bitwise shift operators

Syntax

[#1]
shift-expr:
additive-expr
shift-expr << additive-expr
shift-expr >> additive-expr

Constraints

[#2] Each of the operands shall have integer type.

Semantics

.... snip ...

[#4] The result of E1 << E2 is E1 left-shifted E2 bit
positions; vacated bits are filled with zeros. If E1 has an
unsigned type, the value of the result is E1+2E2, reduced
modulo one more than the maximum value representable in the
result type. If E1 has a signed type and nonnegative value,
and E1+2E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined.

[#5] The result of E1 >> E2 is E1 right-shifted E2 bit
positions. If E1 has an unsigned type or if E1 has a signed
type and a nonnegative value, the value of the result is the
integral part of the quotient of E1 divided by the quantity,
2 raised to the power E2. If E1 has a signed type and a
negative value, the resulting value is implementation-
defined.

i.e. for negative values the results are not portable. So don't do
that. The phrase "2E2" above is a result of the conversion to
text, and means 2 to the E2 power.

Jean-Marc Bourguet · Sep 17, 2006

I am reading the second edition (1999), according to the author, all the
example codes have been re-written.

That don't say how important the rewriting has been. It could have been
just adding the return type of main and prototypes. It could have been
more important. Even if the rewriting has been more important, when you
have something under your eyes, you are influenced by it.

This would be much more clearly written as: void set(int i) { a [ i /
BITSPERWORD ] |= (1 << ( i % BITSPERWORD));} etc. That should answer
your questions.

There are arguments that using the << operater will lead to better
code, but it's highly questionable. (Most compilers will generate
exactly the same code for the two.)

Click to expand...

I don't think so (note: i is an *signed* int).

Click to expand...

You are talking about specifically "<< as in (1<<...)" not ">> as in
a[i>>SHIFT]"? I tried both versions (Bill Pursell & Jon Bentley),
neither of them handles minus numbers correctly. So what is behind your
hint?

Shifting a negative value is undefined or implementation defined behaviour
(depending on the direction, see CBFalconer post), so most implementations
will not do something special to handle those cases while still trying to
meet the expectations of people knowing the target architecture.

Most two's complement machines have an arithmetic right shift which keep
the sign and I expect most C compiler for those machines will use it when
shifting signed int. The result is that, on those machine right shifting
is a division truncated towards minus infinity while C99 standard demand
that division is truncated towards zero. So few compiler will generate the
same code for a right shift of a signed int and for a division.

To do the right thing for bit vector, the first thing to do is to modify
the declaration and allocation of a so that negative indexes are possible.
That done, it seems to me that Jon Bentley's version will do the right
thing on most two's complement machines but that is not demanded by the
standard.

Yours,

Ben Bacarisse · Sep 17, 2006

CBFalconer said:
6.5.7 Bitwise shift operators

[#4] The result of E1 << E2 is E1 left-shifted E2 bit
positions; vacated bits are filled with zeros. If E1 has an
unsigned type, the value of the result is E1+2E2, reduced
modulo one more than the maximum value representable in the

The phrase "2E2" above is a result of the conversion to
text, and means 2 to the E2 power.

The phrase "E1+" is also, I hope, a result of the conversion since in
my copy of a later draft the plus is, of course, a multiply operation.

T.M. Sommers · Sep 17, 2006

Jean-Marc Bourguet said:
That don't say how important the rewriting has been. It could have been
just adding the return type of main and prototypes. It could have been
more important. Even if the rewriting has been more important, when you
have something under your eyes, you are influenced by it.

The code in question does not exist in the first edition (1986)
of the book.

coder123 · Jul 5, 2022

2^5 = 32, each integer in the array represents a 32 bit integer. It becomes clear if you work it out with pen and paper.

vector	4	Nov 28, 2013
Qsort() messing with my entire Code	0	Apr 25, 2022
Finding a bit string in another bitstring	9	Oct 30, 2009
strange problem of sorting	7	Jul 6, 2007
shift question	3	Apr 16, 2006
shift question	0	Apr 16, 2006
Optimize a discovery algorithm	15	Jan 1, 2014
left shift operator behaves like left rotate when the operand is a variable.	2	Jun 8, 2011

bit vector question: why shift (>>) 5?

gokkog

Bill Pursell

Jean-Marc Bourguet

gokkog

CBFalconer

Jean-Marc Bourguet

Ben Bacarisse

T.M. Sommers

coder123

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads