a bit of a puzzle

  • Thread starter Steven G. Johnson

Steven G. Johnson

Here is a little algorithm I came across whose implementation is
amusingly obscure: what simple function does the following C code
compute, and why?

#include <stdint.h>
unsigned foo(uint32_t n)
{
    const uint32_t a = 0x05f66a47;
    static const unsigned bar[32] =
        {0,1,2,26,23,3,15,27,24,21,19,4,12,16,28,6,31,25,22,14,20,18,11,5,30,13,17,10,29,9,8,7};
    n = ~n;
    return bar[(a * (n & (-n))) >> 27];
}

To save you the trouble of compiling and running it yourself, here is
what it produces for n = 0,1,2,...,31:

0 -> 0, 1 -> 1, 2 -> 0, 3 -> 2, 4 -> 0, 5 -> 1, 6 -> 0, 7 -> 3,
8 -> 0, 9 -> 1, 10 -> 0, 11 -> 2, 12 -> 0, 13 -> 1, 14 -> 0, 15 -> 4,
16 -> 0, 17 -> 1, 18 -> 0, 19 -> 2, 20 -> 0, 21 -> 1, 22 -> 0, 23 -> 3,
24 -> 0, 25 -> 1, 26 -> 0, 27 -> 2, 28 -> 0, 29 -> 1, 30 -> 0, 31 -> 5
 

Richard Heathfield

Obviously I can't tell what the algorithm is actually used for, but it is
isomorphic to the following usage: the Nth number represents the number of
levels beneath the Nth item discovered in a perfectly balanced binary tree
during an in-order traversal (left, centre, right).
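
A quick way to check this reading against the posted values is sketched below. It assumes the traversal visits the left subtree, then the node, then the right subtree, and that "levels beneath" means the node's height (0 for a leaf). The names inorder_heights and heights_match_posted_output are made up for this illustration.

```c
#include <stddef.h>

/* Append the height (levels beneath the node) of every node of a perfect
   binary tree, visiting left subtree, node, right subtree. */
void inorder_heights(unsigned height, unsigned *out, size_t *pos)
{
    if (height > 0)
        inorder_heights(height - 1, out, pos);
    out[(*pos)++] = height;
    if (height > 0)
        inorder_heights(height - 1, out, pos);
}

/* Returns 1 if the first 24 heights equal the values posted for n = 0..23. */
int heights_match_posted_output(void)
{
    static const unsigned posted[24] = {
        0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2,
        0, 1, 0, 4, 0, 1, 0, 2, 0, 1, 0, 3
    };
    unsigned heights[31];   /* a tree of height 4 has 2^5 - 1 = 31 nodes */
    size_t count = 0;
    size_t n;

    inorder_heights(4, heights, &count);
    if (count != 31)
        return 0;
    for (n = 0; n < 24; n++)
        if (heights[n] != posted[n])
            return 0;
    return 1;
}
```

Calling heights_match_posted_output() should return 1 if the tree reading agrees with the listed outputs; it says nothing about why the original function produces them.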
 

Steven G. Johnson

Richard Heathfield said:
Obviously I can't tell what the algorithm is actually used for, but it is
isomorphic to the following usage: the Nth number represents the number of
levels beneath the Nth item discovered in a perfectly balanced binary tree
during an in-order traversal (left, centre, right).

There is a simpler, purely arithmetic definition of what it computes.

By "why?" I don't mean "why was this particular code
written" (obviously unanswerable) but "why does it compute what it
does for all n (once you figure out what it does)?" i.e., how does it
work?
 

Harald van Dijk

Steven G. Johnson said:
n = ~n;
return bar[(a * (n & (-n))) >> 27];

You're assuming that the multiplication (a * (n & (-n))) will take place
in 32 bits. This is not the case when int or unsigned int has more than
32 bits. Store the result in n to be sure:

n = ~n;
n = a * (n & -n);
return bar[n >> 27];

This looks like the number of consecutive bits set, starting from the
least significant.

(~n) & (-~n) can be written as (n + 1) & ~n. It is the value of the
lowest bit that is not set. The multiplication by 0x05f66a47 happens to
produce unique values in bits 27-31 for the first 32 powers of two (I'm
not seeing how this value was obtained), so after extracting those bits,
you can use a lookup table to find the result.
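
The two claims above can be checked with a small sketch. It is only an illustration: the function names are hypothetical, and every intermediate result is stored in a uint32_t so that the arithmetic is reduced to 32 bits even where int is wider.

```c
#include <stdint.h>

/* Isolate the lowest clear bit of n; yields 0 when every bit of n is set. */
uint32_t lowest_clear_bit(uint32_t n)
{
    uint32_t next = n + 1u;
    uint32_t inverted = ~n;
    return next & inverted;
}

/* The form used in the original code: (~n) & (-~n), step by step. */
uint32_t lowest_clear_bit_original(uint32_t n)
{
    uint32_t m = ~n;
    uint32_t negated = 0u - m;
    return m & negated;
}

/* Returns 1 if multiplying each of the 32 powers of two by the constant
   leaves a distinct value in bits 27-31 of the 32-bit product. */
int top_bits_are_unique(void)
{
    const uint32_t a = 0x05f66a47;
    uint32_t seen = 0;   /* bit k is set once index k has appeared */
    unsigned i;

    for (i = 0; i < 32; i++) {
        uint32_t product = a * ((uint32_t)1 << i);   /* stored, so 32 bits */
        uint32_t index = product >> 27;
        if (seen & ((uint32_t)1 << index))
            return 0;
        seen |= (uint32_t)1 << index;
    }
    return seen == 0xffffffffu;
}
```

Note that for n = 0xffffffff there is no clear bit, both forms give 0, and the original function then returns bar[0], which is 0 rather than 32.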
 

Steven G. Johnson

Just to be clear, I know what it does and how it works; I'm not asking
for programming help. I'm posing it as a puzzle for amusement and
edification.

Background:

It's a C implementation of an algorithm I found in a rather famous
book, for a rather simple and important arithmetic problem (which
shows up as a subproblem in Gray codes, low-discrepancy sequences, and
many other problems), and is one of the fastest and most compact
solutions to this problem that I could find (without using assembly)
(although there is one slightly faster method on my machine that is
not so compact).

What I found amusing is that, if you don't comment the code, it seems
quite nonobvious what this algorithm computes, and even if you deduce
that from the function outputs it seems very nonobvious why/how it
works.

Regards,
Steven G. Johnson
 

CBFalconer

Doesn't seem to work very well on a machine with 16 bit integers.
 

Steven G. Johnson

Harald van Dijk said:
(~n) & (-~n) can be written as (n + 1) & ~n. It is the value of the
lowest bit that is not set. The multiplication by 0x05f66a47 happens to
produce unique values in bits 27-31 for the first 32 powers of two (I'm
not seeing how this value was obtained), so after extracting those bits,
you can use a lookup table to find the result.

Yes, the algorithm is described in Knuth, volume 4A (draft fascicle),
section 7.1.3 ("Bitwise tricks and techniques"), where it is attributed
to Lauter and others in 1997. The main trick is the magic constant
"0x05f66a47"; Knuth describes the properties it requires (related to
de Bruijn cycles) and notes that many such constants will work; this
particular one was found by a brute-force search.
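
As a rough illustration of the property involved, the lookup table can be rebuilt from the constant alone: every 5-bit window of the constant, read from the top, must be different. The sketch below assumes that reading; build_table and table_matches_posted are hypothetical names, not anything from Knuth's text.

```c
#include <stdint.h>

/* Fill table[] so that table[(a << i) >> 27] == i for i = 0..31.
   Returns 1 on success, 0 if two shifts share the same top five bits
   (in which case a cannot be used as the magic constant). */
int build_table(uint32_t a, unsigned table[32])
{
    int used[32] = {0};
    unsigned i;

    for (i = 0; i < 32; i++) {
        uint32_t shifted = a << i;        /* a * 2^i, reduced to 32 bits */
        unsigned index = shifted >> 27;   /* top five bits */
        if (used[index])
            return 0;
        used[index] = 1;
        table[index] = i;
    }
    return 1;
}

/* Returns 1 if the table rebuilt from 0x05f66a47 equals the posted bar[]. */
int table_matches_posted(void)
{
    static const unsigned bar[32] = {
        0, 1, 2, 26, 23, 3, 15, 27, 24, 21, 19, 4, 12, 16, 28, 6,
        31, 25, 22, 14, 20, 18, 11, 5, 30, 13, 17, 10, 29, 9, 8, 7
    };
    unsigned table[32];
    unsigned i;

    if (!build_table(0x05f66a47, table))
        return 0;
    for (i = 0; i < 32; i++)
        if (table[i] != bar[i])
            return 0;
    return 1;
}
```

A brute-force search of the kind mentioned could, under the same assumption, simply try candidate constants until build_table returns 1.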

It's a cute and compact way of counting the consecutive 1 bits at the
least-significant end (or the consecutive 0 bits, without the ~n). A
slightly (~15%) faster but less compact way, at least on my machine,
is to search the bytes of ~n, starting from the least-significant
byte, until a nonzero byte is found, and then use a 256-element lookup
table. This has slower worst-case performance (when the rightmost zero
bit is in the most significant byte) but slightly better average-case
performance (YMMV), since a random number has a 255/256 chance of
having its rightmost zero in the least-significant byte. Of course,
on processors like x86 you can go faster in assembly, or with gcc's
__builtin_ctz function, but it's a fun little puzzle to do this as
fast and/or as compactly as possible in plain C.
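
The byte-scanning alternative was not posted, so the following is only a guess at its shape. It assumes the 256-entry table maps each nonzero byte to the index of its lowest 1 bit, and it fills that table on first use (not thread-safe) instead of spelling out 256 initializers. The name trailing_ones_bytewise is made up.

```c
#include <stdint.h>

/* Count the consecutive 1 bits at the low end of n by scanning the bytes
   of ~n from the least significant byte upward. Unlike the table-and-
   multiply version, this returns 32 when n is all ones. */
unsigned trailing_ones_bytewise(uint32_t n)
{
    static unsigned char lowest_one[256];   /* index of lowest 1 bit per byte */
    static int ready = 0;
    uint32_t m = ~n;
    unsigned shift;

    if (!ready) {
        unsigned b;
        for (b = 1; b < 256; b++) {
            unsigned k = 0;
            while (((b >> k) & 1u) == 0)
                k++;
            lowest_one[b] = (unsigned char)k;
        }
        ready = 1;
    }
    for (shift = 0; shift < 32; shift += 8) {
        unsigned byte = (unsigned)((m >> shift) & 0xffu);
        if (byte != 0)
            return shift + lowest_one[byte];
    }
    return 32;   /* ~n is zero: every bit of n is set */
}
```

In the common case the loop exits on its first iteration, which is where the better average-case behaviour described above would come from; any actual speed difference depends on the machine and compiler.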

Steven
 

Keith Thompson

CBFalconer said:
Doesn't seem to work very well on a machine with 16 bit integers.

Most machines have 16-bit integers; often the 16-bit integer type is
called "short".

If you mean 16-bit ints, I don't see how that would cause a problem,
since the function's argument is of type uint32_t. (The unsigned
result shouldn't be a problem unless you're worried about integers
with more than 32767 bits.)

Or am I missing something?
 

CBFalconer

Keith Thompson said:
If you mean 16-bit ints, I don't see how that would cause a problem,
since the function's argument is of type uint32_t. (The unsigned
result shouldn't be a problem unless you're worried about integers
with more than 32767 bits.)

Or am I missing something?

Yeah, I meant int, and was sloppy. To me, uint32_t doesn't exist,
since it is not guaranteed. Let's make the complaint about a
machine with an 18 bit int.
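
For what it's worth, the same computation can be sketched without uint32_t. The version below relies only on unsigned long being at least 32 bits wide, which the C standard does guarantee, and masks explicitly to 32 bits; foo_portable and mask32 are hypothetical names, and only the low 32 bits of the argument are examined.

```c
/* Variant of foo() that does not need an exact-width 32-bit type. */
unsigned foo_portable(unsigned long n)
{
    static const unsigned char bar[32] = {
        0, 1, 2, 26, 23, 3, 15, 27, 24, 21, 19, 4, 12, 16, 28, 6,
        31, 25, 22, 14, 20, 18, 11, 5, 30, 13, 17, 10, 29, 9, 8, 7
    };
    const unsigned long mask32 = 0xffffffffUL;
    unsigned long low;

    n = ~n & mask32;       /* complement, kept to 32 bits */
    low = n & (0UL - n);   /* lowest set bit of the complement, or 0 */

    /* Unsigned multiplication wraps modulo a power of two that is at
       least 2^32, so masking gives the product modulo 2^32. */
    return bar[((0x05f66a47UL * low) & mask32) >> 27];
}
```

The mask costs a little on machines where unsigned long is wider than 32 bits, but the result no longer depends on the width of int or on uint32_t being available.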
 

Joachim Schmitz

CBFalconer said:
Doesn't seem to work very well on a machine with 16 bit integers.

Would you mind enlightening me as to why?

Bye, Jojo
 

Gerry Ford

I'll admit to the non-obviousness. Would you state its relevance to
X.690-0207.pdf ?

--
Gerry Ford



"Anybody who says, that a high-speed collision with water is the same as
with concrete, likely has more experience with the former than the latter."
 

user923005

Steven G. Johnson said:
To save you the trouble of compiling and running it yourself, here is
what it produces for n = 0,1,2,...,31:

0 -> 0, 1 -> 1, 2 -> 0, 3 -> 2, 4 -> 0, 5 -> 1, 6 -> 0, 7 -> 3,
8 -> 0, 9 -> 1, 10 -> 0, 11 -> 2, 12 -> 0, 13 -> 1, 14 -> 0, 15 -> 4,
16 -> 0, 17 -> 1, 18 -> 0, 19 -> 2, 20 -> 0, 21 -> 1, 22 -> 0, 23 -> 3,
24 -> 0, 25 -> 1, 26 -> 0, 27 -> 2, 28 -> 0, 29 -> 1, 30 -> 0, 31 -> 5

Here's a 64 bit version of something very similar:

const int lsz64_tbl[64] =
{
    0, 31, 4, 33, 60, 15, 12, 34,
    61, 25, 51, 10, 56, 20, 22, 35,
    62, 30, 3, 54, 52, 24, 42, 19,
    57, 29, 2, 44, 47, 28, 1, 36,
    63, 32, 59, 5, 6, 50, 55, 7,
    16, 53, 13, 41, 8, 43, 46, 17,
    26, 58, 49, 14, 11, 40, 9, 45,
    21, 48, 39, 23, 18, 38, 37, 27,
};

// Gerd Isenberg's implementation of bitscan:
int GerdBitScan(Bitboard bb)
{
    const Bitboard lsb = (bb & -(long long) bb) - 1;
    const unsigned int foldedLSB = ((int) lsb) ^ ((int) (lsb >> 32));
    return lsz64_tbl[foldedLSB * 0x78291ACF >> 26];
}

// Gerd Isenberg's implementation of bitscan with clear:
int GerdBitScanReset(Bitboard *bb)
{
    const Bitboard lsb = (bb[0] & -(long long) bb[0]) - 1;
    const unsigned int foldedLSB = ((int) lsb) ^ ((int) (lsb >> 32));
    bb[0] &= (bb[0] - 1);
    return lsz64_tbl[foldedLSB * 0x78291ACF >> 26];
}

Chess programmers will recognize de Bruijn sequences. It was
popularized by Matthew Henry, IIRC.
See, for instance:
http://chessprogramming.wikispaces.com/BitScan
 

user923005

Oops... Left out the typedef necessary to grok this code:
typedef unsigned long long Bitboard;
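
For comparison, here is a sketch of the unfolded 64-bit form of the same idea: one 64-bit multiplication and a 64-entry table, with no folding to 32 bits. This is not Gerd Isenberg's code. The constant 0x03f79d71b4ca8b09 is a 64-bit de Bruijn constant widely used for this purpose, the table is built on first use (not thread-safe) instead of being written out, and the name bitscan_debruijn64 is made up.

```c
#include <stdint.h>

/* Index (0..63) of the lowest set bit of bb; bb must be nonzero. */
int bitscan_debruijn64(uint64_t bb)
{
    static const uint64_t debruijn = UINT64_C(0x03f79d71b4ca8b09);
    static int table[64];
    static int ready = 0;

    if (!ready) {
        int i;
        /* Each 6-bit window of the constant is distinct, so every
           slot of the table is written exactly once. */
        for (i = 0; i < 64; i++)
            table[(debruijn << i) >> 58] = i;
        ready = 1;
    }
    /* bb & (0 - bb) isolates the lowest set bit using unsigned arithmetic. */
    return table[((bb & (0 - bb)) * debruijn) >> 58];
}
```

The folded version above trades the 64-bit multiplication for a 32-bit one, which may matter on 32-bit hardware; on a 64-bit machine the unfolded form is the simpler of the two.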
 

Richard

CBFalconer said:
Doesn't seem to work very well on a machine with 16 bit integers.

Why? Please explain.
 
