Bit enumeration algorithms

adam.ierymenko

I have an algorithm question for you all:

I am familiar with table-free bit counting algorithms such as the MIT
HAKMEM algorithm and others. However, I have an application where I
need to enumerate the indexes of every bit in an integer (64 bit
integer in this case). For example, if I had the following 8-bit
integer:

01001000

I want an array to contain the numbers [ 3,6 ] (counting from bit 0 at
the least significant end) along with an integer '2' telling me that
two bits were set... or I want the same information in some other form.

So right now what I'm doing is looping, testing, and shifting. It's
not bad, but I'm wondering if there's anything faster.

I tried inline assembly language using the 'bsf' instruction (bit scan
forward), but found that using this instruction seems to be *slower*
(yeah, I was surprised too!) than just the naive loop-test-shift even
though far fewer instructions were executed in the 'bsf'
implementation. I think this instruction uses a lot of clock cycles.
I know it's a rarely used instruction, so it's possible that modern
processors choose to implement it inefficiently to make room for more
efficient implementations of commonly used instructions. (This was an
Athlon-MP machine.)
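
For reference, a bit-scan version of the enumeration can be sketched without inline assembly. This is only an illustration, not the code that was benchmarked above: it assumes GCC or Clang, whose `__builtin_ctzll` builtin typically compiles to a bit-scan or trailing-zero-count instruction on x86, and the function name `enumerate_bits_ctz` is made up for the example.

```c
#include <stdint.h>

/* Writes the index of every set bit of x into out[] (room for 64 entries)
   and returns how many were written. Bit 0 is the least significant bit.
   Assumes a GCC/Clang-style compiler that provides __builtin_ctzll. */
static int enumerate_bits_ctz(uint64_t x, int out[64])
{
    int count = 0;
    while (x != 0) {
        out[count++] = __builtin_ctzll(x); /* index of the lowest set bit */
        x &= x - 1;                        /* clear that bit */
    }
    return count;
}
```

The loop runs once per set bit rather than once per bit position, which is where the lower instruction count comes from. Whether it is actually faster depends on how the processor implements the scan instruction.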

So right now loop-test-shift with no special inline assembly language
voodoo is the fastest thing I've come up with. I do of course test for
the integer being zero as it loops, so it will abort the loop if no
more bits are left.
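
The loop-test-shift approach described here might look roughly like the following. It is a minimal sketch rather than the poster's actual code; the function name `enumerate_bits_shift` and the layout of the output array are assumptions.

```c
#include <stdint.h>

/* Writes the index of every set bit of x into out[] (room for 64 entries)
   and returns how many were written. Bit 0 is the least significant bit. */
static int enumerate_bits_shift(uint64_t x, int out[64])
{
    int count = 0;
    /* The x != 0 test ends the loop as soon as no set bits remain. */
    for (int index = 0; x != 0; x >>= 1, index++) {
        if (x & 1)
            out[count++] = index;
    }
    return count;
}
```

For the 8-bit example 01001000 (0x48), this fills the array with 3 and 6 and returns 2.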

By the way, what I end up doing at the end of this is to pick a set bit
at random. If there's any fast shortcut to doing this and skipping the
whole bit enumeration, that would work too. Note that in my
application the bits are usually going to be sparse, so just testing
random bits turns out to be no better than enumerating bits and then
picking one.
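
One possible way to pick a set bit with equal weight, without storing the enumeration, is to count the set bits, choose a rank k below that count, and then walk to the k-th set bit. The sketch below assumes the caller supplies a random number `r`; the name `select_set_bit` is hypothetical, and this is one illustration rather than a known fastest method.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index of one set bit of x, chosen by the caller-supplied
   random number r. x must be nonzero. Bit 0 is the least significant bit. */
static int select_set_bit(uint64_t x, unsigned r)
{
    assert(x != 0);

    /* Count the set bits; each pass clears the lowest one. */
    int n = 0;
    for (uint64_t t = x; t != 0; t &= t - 1)
        n++;

    /* Rank of the bit to return. r % n is very slightly biased unless
       the range of r is a multiple of n. */
    unsigned k = r % (unsigned)n;

    /* Drop the k lowest set bits, so the wanted bit becomes the lowest. */
    while (k--)
        x &= x - 1;

    /* Find the position of the lowest remaining set bit. */
    int index = 0;
    while (!(x & 1)) {
        x >>= 1;
        index++;
    }
    return index;
}
```

With sparse bits, the counting and clearing loops each run only a few times; the final shift loop is the part that still depends on bit position.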

So any wizards want to take this one up?

-Adam
 
Robert Mabee

adam.ierymenko said:
I am familiar with table-free bit counting algorithms such as the MIT
HAKMEM algorithm and others. However, I have an application where I
need to enumerate the indexes of every bit in an integer (64 bit
integer in this case).

Why not use a table? You can use a binary search to reduce the range
from 64 bits to something appropriate, like 16 or 8, and then a direct
table operation to get the last bits of the index, to either the MSB
or LSB as you like. Turn off that bit and repeat to get another index.
Need more speed? You don't need to repeat the binary search until the
byte it found is used up; after that you only need to search the
remaining bytes.
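
The table idea above could be sketched as follows: a binary search narrows the 64 bits down to one byte, and a 256-entry table gives the lowest set bit within that byte. The names `lowest_bit_in_byte`, `init_table`, `lowest_set_bit` and `enumerate_bits_table` are assumptions made for this example, and the sketch repeats the binary search for every bit instead of applying the suggested speed-up of finishing a byte first.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t lowest_bit_in_byte[256]; /* entry 0 is never consulted */

/* Must be called once before the other functions. */
static void init_table(void)
{
    for (int v = 1; v < 256; v++) {
        int b = 0;
        while (!((v >> b) & 1))
            b++;
        lowest_bit_in_byte[v] = (uint8_t)b;
    }
}

/* Index of the lowest set bit of x; x must be nonzero. */
static int lowest_set_bit(uint64_t x)
{
    assert(x != 0);
    int pos = 0;
    /* Binary search: skip the low half whenever it is empty. */
    if ((x & 0xFFFFFFFFu) == 0) { x >>= 32; pos += 32; }
    if ((x & 0xFFFFu) == 0)     { x >>= 16; pos += 16; }
    if ((x & 0xFFu) == 0)       { x >>= 8;  pos += 8;  }
    /* Direct table lookup for the last three bits of the index. */
    return pos + lowest_bit_in_byte[x & 0xFFu];
}

/* Turn off each bit once it has been reported, then repeat. */
static int enumerate_bits_table(uint64_t x, int out[64])
{
    int count = 0;
    while (x != 0) {
        out[count++] = lowest_set_bit(x);
        x &= x - 1;
    }
    return count;
}
```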
 
adam.ierymenko

Robert said:
Why not use a table? You can use a binary search to reduce the range
from 64 bits to something appropriate, like 16 or 8, and then a direct
table operation to get the last bits of the index, to either the MSB
or LSB as you like. Turn off that bit and repeat to get another index.
Need more speed? You don't need to repeat the binary search until the
byte it found is used up; after that you only need to search the
remaining bytes.

That's a good idea that for some reason I hadn't thought of.

One way to do it would be to have two tables. The first table would be
a table containing all possible search orders to search the 8 bytes of
a 64-bit integer. The second table would contain bit enumerations for
all possible values from 0x0 to 0xff (hex).

Step 1: pick a search order at random from the first table

Step 2: iterate according to the search order through the bytes in the
64-bit integer until a nonzero byte is found

Step 3: look up its bit enumeration from the second table

Step 4: pick a random set bit and return
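
Steps 2 to 4 could be sketched like this. The search-order table from step 1 is not built here; the caller is assumed to pass one of its rows as `order`, along with a random number `r`. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Second table: for each byte value, how many bits are set and where. */
struct byte_bits {
    uint8_t count;
    uint8_t index[8];
};
static struct byte_bits byte_table[256];

/* Must be called once before pick_bit_two_tables. */
static void init_byte_table(void)
{
    for (int v = 0; v < 256; v++) {
        byte_table[v].count = 0;
        for (int b = 0; b < 8; b++)
            if ((v >> b) & 1)
                byte_table[v].index[byte_table[v].count++] = (uint8_t)b;
    }
}

/* order: a permutation of 0..7 giving the byte search order (step 1).
   r: caller-supplied random number.
   Returns the index of a set bit of x, or -1 if x is zero. */
static int pick_bit_two_tables(uint64_t x, const uint8_t order[8], unsigned r)
{
    for (int i = 0; i < 8; i++) {
        int byte_no = order[i];
        assert(byte_no < 8);
        unsigned v = (unsigned)(x >> (8 * byte_no)) & 0xFFu;
        if (v != 0) {                                     /* step 2 */
            const struct byte_bits *e = &byte_table[v];   /* step 3 */
            return 8 * byte_no + e->index[r % e->count];  /* step 4 */
        }
    }
    return -1;
}
```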

That's pretty good. When I get a chance I'll implement it and see how
it does vs. the naive method.

Any other ideas out there?

-Adam
 
Robert Mabee

adam.ierymenko said:
One way to do it would be to have two tables. The first table would be
a table containing all possible search orders to search the 8 bytes of
a 64-bit integer. The second table would contain bit enumerations for
all possible values from 0x0 to 0xff (hex).

Step 1: pick a search order at random from the first table

Step 2: iterate according to the search order through the bytes in the
64-bit integer until a nonzero byte is found

Step 3: look up its bit enumeration from the second table

Step 4: pick a random set bit and return

If I understand what you're proposing, it will not give uniform weight
to the possible outcomes, which probably is required. Suppose one byte
has one set bit and another has 8. The two bytes have equal probability
of being chosen but one divides that probability among 8 outcomes.
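
To put numbers on that, the sketch below computes the chance that the two-table scheme returns a given bit. It assumes the search order is uniformly random, so each nonzero byte is equally likely to be found first, and that the bit within the byte is then chosen uniformly. The function name `two_table_probability` is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Probability that the two-table scheme returns bit number `bit` of x,
   assuming a uniformly random byte order and a uniform pick in the byte. */
static double two_table_probability(uint64_t x, int bit)
{
    assert(bit >= 0 && bit < 64);
    if (!((x >> bit) & 1))
        return 0.0; /* a clear bit is never returned */

    int nonzero_bytes = 0;
    int bits_in_byte = 0; /* set bits sharing a byte with `bit` */
    for (int i = 0; i < 8; i++) {
        unsigned v = (unsigned)(x >> (8 * i)) & 0xFFu;
        if (v != 0)
            nonzero_bytes++;
        if (i == bit / 8)
            for (; v != 0; v &= v - 1)
                bits_in_byte++;
    }
    return 1.0 / (nonzero_bytes * bits_in_byte);
}
```

For x = 0xFF01 (one byte with a single set bit, one byte with all 8 set), the lone bit 0 comes out with probability 1/2, while each of bits 8 to 15 comes out with probability 1/16. A uniform choice would give each of the 9 set bits 1/9.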

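How about a binary search using a random preference for which half of
the range to check? Something like
int pos = 0;  /* ends as the index of a set bit; arg_bits must be nonzero */
for (int step = 32; step >= 1; step >>= 1)
{   /* lo and hi are the two halves of the window [pos, pos + 2*step) */
    unsigned long long mask = (1ULL << step) - 1;
    unsigned long long lo = (arg_bits >> pos) & mask;
    unsigned long long hi = (arg_bits >> (pos + step)) & mask;
    if (random_bits & step)
    {   if (hi) pos += step;    /* prefer the upper half if it has a bit */
    }
    else if (lo == 0)           /* prefer the lower half unless it is empty */
        pos += step;
}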
 
