bit vector question: why shift (>>) 5?

G

gokkog

Hello,


Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Some references to bit vector tutorials are also welcome.


Thanks and best regards,
Wenjie
 
B

Bill Pursell

Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Are you sure this is "Programming Pearls" and not "Introduction
to Obfuscation?" This would be much more clearly written as:
void set(int i) { a [ i / BITSPERWORD ] |= (1 << ( i % BITSPERWORD));}
etc. That should answer your questions.

There are arguments that using the << operater will lead
to better code, but it's highly questionable. (Most compilers
will generate exactly the same code for the two.)

And BITSPERWORD would be better defined as
#define BITSPERWORD ( sizeof (int) * CHAR_BIT)
 
J

Jean-Marc Bourguet

Bill Pursell said:
Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Are you sure this is "Programming Pearls" and not "Introduction
to Obfuscation?"

Note that the book was first published in the mid 80's and a collection of
columns published before. I don't know how much it has been revised in the
recent printings/editions.
This would be much more clearly written as: void set(int i) { a [ i /
BITSPERWORD ] |= (1 << ( i % BITSPERWORD));} etc. That should answer
your questions.

There are arguments that using the << operater will lead
to better code, but it's highly questionable. (Most compilers
will generate exactly the same code for the two.)

I don't think so (note: i is an *signed* int).

Yours,
 
G

gokkog

Jean-Marc Bourguet said:
Bill Pursell said:
Recently I have the book Programming Pearls. Nice read! Perhaps it is
weekend, I cannot understand the following codes well:
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }

From http://netlib.bell-labs.com/cm/cs/pearls/bitsort.c

Why Jon #define SHIFT 5? (and #define MASK 0x1F)? Are there any
differences if I define SHIFT as 3? How about 8?

Are you sure this is "Programming Pearls" and not "Introduction
to Obfuscation?"

Note that the book was first published in the mid 80's and a collection of
columns published before. I don't know how much it has been revised in the
recent printings/editions.

I am reading the second edition (1999), according to the author, all
the example codes have been re-written.
This would be much more clearly written as: void set(int i) { a [ i /
BITSPERWORD ] |= (1 << ( i % BITSPERWORD));} etc. That should answer
your questions.

There are arguments that using the << operater will lead
to better code, but it's highly questionable. (Most compilers
will generate exactly the same code for the two.)

I don't think so (note: i is an *signed* int).

You are talking about specifically "<< as in (1<<...)" not ">> as in
a[i>>SHIFT]"? I tried both versions (Bill Pursell & Jon Bentley),
neither of them handles minus numbers correctly. So what is behind your
hint?


My thanks to all of you.
Wenjie
 
C

CBFalconer

.... snip ...

You are talking about specifically "<< as in (1<<...)" not ">> as
in a[i>>SHIFT]"? I tried both versions (Bill Pursell & Jon Bentley),
neither of them handles minus numbers correctly. So what is behind
your hint?
From N869:

6.5.7 Bitwise shift operators

Syntax

[#1]
shift-expr:
additive-expr
shift-expr << additive-expr
shift-expr >> additive-expr

Constraints

[#2] Each of the operands shall have integer type.

Semantics

.... snip ...

[#4] The result of E1 << E2 is E1 left-shifted E2 bit
positions; vacated bits are filled with zeros. If E1 has an
unsigned type, the value of the result is E1+2E2, reduced
modulo one more than the maximum value representable in the
result type. If E1 has a signed type and nonnegative value,
and E1+2E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined.

[#5] The result of E1 >> E2 is E1 right-shifted E2 bit
positions. If E1 has an unsigned type or if E1 has a signed
type and a nonnegative value, the value of the result is the
integral part of the quotient of E1 divided by the quantity,
2 raised to the power E2. If E1 has a signed type and a
negative value, the resulting value is implementation-
defined.

i.e. for negative values the results are not portable. So don't do
that. The phrase "2E2" above is a result of the conversion to
text, and means 2 to the E2 power.
 
J

Jean-Marc Bourguet

I am reading the second edition (1999), according to the author, all the
example codes have been re-written.

That don't say how important the rewriting has been. It could have been
just adding the return type of main and prototypes. It could have been
more important. Even if the rewriting has been more important, when you
have something under your eyes, you are influenced by it.
This would be much more clearly written as: void set(int i) { a [ i /
BITSPERWORD ] |= (1 << ( i % BITSPERWORD));} etc. That should answer
your questions.

There are arguments that using the << operater will lead to better
code, but it's highly questionable. (Most compilers will generate
exactly the same code for the two.)

I don't think so (note: i is an *signed* int).

You are talking about specifically "<< as in (1<<...)" not ">> as in
a[i>>SHIFT]"? I tried both versions (Bill Pursell & Jon Bentley),
neither of them handles minus numbers correctly. So what is behind your
hint?

Shifting a negative value is undefined or implementation defined behaviour
(depending on the direction, see CBFalconer post), so most implementations
will not do something special to handle those cases while still trying to
meet the expectations of people knowing the target architecture.

Most two's complement machines have an arithmetic right shift which keep
the sign and I expect most C compiler for those machines will use it when
shifting signed int. The result is that, on those machine right shifting
is a division truncated towards minus infinity while C99 standard demand
that division is truncated towards zero. So few compiler will generate the
same code for a right shift of a signed int and for a division.


To do the right thing for bit vector, the first thing to do is to modify
the declaration and allocation of a so that negative indexes are possible.
That done, it seems to me that Jon Bentley's version will do the right
thing on most two's complement machines but that is not demanded by the
standard.

Yours,
 
B

Ben Bacarisse

CBFalconer said:
6.5.7 Bitwise shift operators
[#4] The result of E1 << E2 is E1 left-shifted E2 bit
positions; vacated bits are filled with zeros. If E1 has an
unsigned type, the value of the result is E1+2E2, reduced
modulo one more than the maximum value representable in the
The phrase "2E2" above is a result of the conversion to
text, and means 2 to the E2 power.

The phrase "E1+" is also, I hope, a result of the conversion since in
my copy of a later draft the plus is, of course, a multiply operation.
 
T

T.M. Sommers

Jean-Marc Bourguet said:
That don't say how important the rewriting has been. It could have been
just adding the return type of main and prototypes. It could have been
more important. Even if the rewriting has been more important, when you
have something under your eyes, you are influenced by it.

The code in question does not exist in the first edition (1986)
of the book.
 
Joined
Jul 5, 2022
Messages
1
Reaction score
0
2^5 = 32, each integer in the array represents a 32 bit integer. It becomes clear if you work it out with pen and paper.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,009
Latest member
GidgetGamb

Latest Threads

Top