how to get the thighest bit position in big integers?


mmgarvey

Hi,

I'm using python to develop some proof-of-concept code for a
cryptographic application. My code makes extended use of python's
native bignum capabilities.

In many cryptographic applications there is the need for a function
'get_highest_bit_num' that returns the position number of the highest
set bit of a given integer. For example:

get_highest_bit_num( (1 << 159)) == 159
get_highest_bit_num( (1 << 160) - 1) == 159
get_highest_bit_num( (1 << 160)) == 160

I implemented this the following way:

def get_highest_bit_num(r):
    i = -1
    while r > 0:
        r >>= 1
        i = i + 1
    return i

This works, but it is a very unsatisfying solution, because it is so
slow.

My second try was using the math.log function:

import math
r = (1 << 160) - 1
print get_highest_bit_num(r) # prints out 159
print math.floor(math.log(r, 2)) # prints out 160.0

We see that math.log can only serve as a heuristic for the highest bit
position. For small r, for example r = (1 << 16) - 1, the result from
math.log(r, 2) is correct; for big r it no longer is.

My question to the group: Does anyone know of a non-hackish way to
determine the required bit position in python? I know that my two
ideas
can be combined to get something working. But is there a *better* way,
that isn't that hackish?

cheers,
mmg
 

Gary Herron

Hi,

I'm using python to develop some proof-of-concept code for a
cryptographic application. My code makes extended use of python's
native bignum capabilities.

In many cryptographic applications there is the need for a function
'get_highest_bit_num' that returns the position number of the highest
set bit of a given integer. For example:

get_highest_bit_num( (1 << 159)) == 159
get_highest_bit_num( (1 << 160) - 1) == 159
get_highest_bit_num( (1 << 160)) == 160


How about a binary search?
>>> from bisect import bisect
>>> BITS = [1<<n for n in range(1,1000)]  # Use max number of bits here.
>>> bisect(BITS, 1<<159)
159
>>> bisect(BITS, (1<<160)-1)
159
>>> bisect(BITS, 1<<160)
160

I have no clue if this is faster or not. The comparison function used
in the search is (potentially) slow, but the search is O(log n) on the
number of bits rather than O(n), so it's worth a test.


If you run timing tests, let us know the results.
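
A minimal timing harness for such a comparison might look like this (a
sketch only; the shift loop is the one from the original post, and the
operand size and repeat counts are arbitrary):

# Rough timing sketch: shift loop from the original post vs. the bisect idea.
import timeit
from bisect import bisect

BITS = [1 << n for n in range(1, 1000)]

def shift_loop(r):
    i = -1
    while r > 0:
        r >>= 1
        i += 1
    return i

def bisect_version(r):
    return bisect(BITS, r)

# Both should agree on the 160-bit example from the original post.
assert shift_loop((1 << 160) - 1) == bisect_version((1 << 160) - 1) == 159

setup = "from __main__ import shift_loop, bisect_version; r = (1 << 160) - 1"
for name, stmt in [("shift loop", "shift_loop(r)"), ("bisect", "bisect_version(r)")]:
    print name, min(timeit.Timer(stmt, setup=setup).repeat(3, 10000))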

Gary Herron
 

Terry Reedy

Hi,

I'm using python to develop some proof-of-concept code for a
cryptographic application. My code makes extended use of python's
native bignum capabilities.

In many cryptographic applications there is the need for a function
'get_highest_bit_num' that returns the position number of the highest
set bit of a given integer. For example:

get_highest_bit_num( (1 << 159)) == 159
get_highest_bit_num( (1 << 160) - 1) == 159
get_highest_bit_num( (1 << 160)) == 160

I implemented this the following way:

def get_highest_bit_num(r):
    i = -1
    while r > 0:
        r >>= 1
        i = i + 1
    return i

This works, but it is a very unsatisfying solution, because it is so
slow.

One alternative is to compare r against successively larger powers of 2,
starting with 2**0 == 1. Given that you have p = 2**(n-1), you could test
whether generating 2**n for large n is faster as 1<<n or p<<1. A
refinement would be to raise the test power by k bits at a time instead
of 1 at a time, tuning k for the implementation, and then do a binary
search in the interval [n, n+k]. I once read that longs store 15 bits in
pairs of bytes. If that is still true today, k = 15 might be a good
choice, since <<15 would mean tacking on a pair of zero bytes.
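
A rough sketch of that refinement (untested for speed; the function
below and the choice k = 15 are only illustrative):

# Sketch of the stepped search: raise the candidate exponent k bits at a time,
# then binary-search the last k-bit interval.
def get_highest_bit_num(r, k=15):
    assert r > 0
    n, p = 0, 1              # invariant: p == 1 << n
    while p <= r:            # step up k bits at a time until p exceeds r
        n += k
        p <<= k
    lo, hi = n - k, n        # now 1 << lo <= r < 1 << hi
    while lo + 1 < hi:       # binary search inside the k-bit interval
        mid = (lo + hi) // 2
        if (1 << mid) <= r:
            lo = mid
        else:
            hi = mid
    return lo
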
My second try was using the math.log function:

import math
r = (1 << 160) - 1
print get_highest_bit_num(r) # prints out 159
print math.floor(math.log(r, 2)) # prints out 160.0

We see that math.log can only serve as a heuristic for the highest bit
position. For small r, for example r = (1 << 16) - 1, the result from
math.log(r, 2) is correct; for big r it no longer is.

I tested and floor(log(2**n, 2)) == n for n in range(400), so your only
worry is that the answer is 1 too high. Easy to check and fix:

res = int(math.floor(math.log(r, 2)))
if (1<<res) > r: res -= 1

This works for 2**n-1 over the same range (all tests with 3.0rc1).
My question to the group: Does anyone know of a non-hackish way to
determine the required bit position in python?

If you call the standard methods hackish, I have no idea what would
satisfy you ;-)

Terry Jan Reedy
 

Aaron "Castironpi" Brady

Duncan said:
My question to the group: Does anyone know of a non-hackish way to
determine the required bit position in python? I know that my two
ideas
can be combined to get something working. But is there a *better* way,
that isn't that hackish?
How about using the hex representation of the value?

OFFSET = dict(("%x"%i, int(c)) for i,c in enumerate("5433222211111111"))
def get_highest_bit_num(r):
    s = "%x"%r
    return len(s) * 4 - OFFSET[s[0]]

You can replace the dict if it's faster.

OFFSET = tuple(int(x) for x in "5433222211111111")
def get_highest_bit_num(r):
    s = "%x"%r
    return len(s) * 4 - OFFSET[int(s[0], 16)]

P.S. Back home, this sort of 'nitpicking' would be judged
unconstructive. Worth pointing out, or not worth saying?

P.S.S. 'Thighest' bit? I thought the spam filters would catch that.
 

Terry Reedy

Scott said:
Since floating point has to identify the position of the highest bit,
you can use that hardware to "quickly" get to the highest bit. IEEE
has the mantissa .5 <= mantissa < 1., but some other floating point
formats treated the mantissa in different ranges. This should work
for anything where the exponent is truly a "binary point." In an older
IBM floating point format, for example, the exponent was a "hexadecimal
point," so the exponent only went up 1 when you multiplied by 16.

import math

EXP_OF_ONE = math.frexp(1.0)[1]

def high_bit(value):
    '''Find the highest order bit of a positive integer <= maxfloat'''
    assert value > 0
    return math.frexp(value)[1] - EXP_OF_ONE

Your point, that taking floor(log2(x)) is redundant, is a good catch.
However, you should have added 'untested' ;-). When value has more
significant bits than the fp mantissa can hold, this expression can be 1
off (but no more, I assume). The following correction should be
sufficient:

res = math.frexp(value)[1] - EXP_OF_ONE
test = 1<<res
if test > value: res -= 1
elif 2*test < value: res += 1

For value = 2**n -1, n > 53, it is always 1 too high on my Intel
machine, so the first correction is sometimes needed. I do not know if
the second is or not.
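
A quick way to check that on a given machine (a sketch; it just prints
how far the raw frexp answer is from the true position n - 1 when
value = 2**n - 1):

# For value = 2**n - 1 with n above the 53-bit mantissa, the float rounds up
# to 2**n, so the raw frexp answer lands one above the true position n - 1.
import math
EXP_OF_ONE = math.frexp(1.0)[1]
for n in (50, 53, 54, 100, 160, 200):
    value = 2**n - 1
    raw = math.frexp(value)[1] - EXP_OF_ONE
    print n, raw - (n - 1)   # 0 while the value fits the mantissa, 1 after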

Terry Jan Reedy
 

Rich Healey

Duncan said:
My question to the group: Does anyone know of a non-hackish way to
determine the required bit position in python? I know that my two
ideas
can be combined to get something working. But is there a *better* way,
that isn't that hackish?
How about using the hex representation of the value?

OFFSET = dict(("%x"%i, int(c)) for i,c in enumerate("5433222211111111"))
def get_highest_bit_num(r):
    s = "%x"%r
    return len(s) * 4 - OFFSET[s[0]]

You can replace the dict if it's faster.

OFFSET = tuple(int(x) for x in "5433222211111111")
def get_highest_bit_num(r):
    s = "%x"%r
    return len(s) * 4 - OFFSET[int(s[0], 16)]

P.S. Back home, this sort of 'nitpicking' would be judged
unconstructive. Worth pointing out, or not worth saying?

P.S.S. 'Thighest' bit? I thought the spam filters would catch that.
That should be P.P.S.

PS: This is also unconstructive nitpicking.

Kind regards


Rich ;)

 

Aaron "Castironpi" Brady

Duncan said:
(e-mail address removed) wrote:
OFFSET = dict(("%x"%i, int(c)) for i,c in enumerate("5433222211111111"))
def get_highest_bit_num(r):
    s = "%x"%r
    return len(s) * 4 - OFFSET[s[0]]

OFFSET = tuple(int(x) for x in "5433222211111111")
def get_highest_bit_num(r):
    s = "%x"%r
    return len(s) * 4 - OFFSET[int(s[0], 16)]

That's really counterintuitive. (That's the word, yes.) They're both
function calls and both global variables. Maybe you could use 'len(s)
* 4' to mask out the rest and lookup that index, rather than
converting -back- to integer. Or, better yet, take 'ord' of s[0].
(Ha ha, -you- time it.)
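
For the ord() idea, something like this might work (a sketch only, not
timed; OFFSET_BY_ORD and the reuse of get_highest_bit_num are just
illustrative names):

# Variant of the hex trick that indexes a table by ord() of the leading
# lowercase hex digit instead of converting it back to an integer.
OFFSET_BY_ORD = [0] * 128
for i, c in enumerate("5433222211111111"):
    OFFSET_BY_ORD[ord("%x" % i)] = int(c)

def get_highest_bit_num(r):
    s = "%x" % r
    return len(s) * 4 - OFFSET_BY_ORD[ord(s[0])]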
 

Mark Dickinson

Your point, that taking floor(log2(x)) is redundant, is a good catch.
However, you should have added 'untested' ;-).  When value has more
significant bits than the fp mantissa can hold, this expression can be 1
off (but no more, I assume).   The following correction should be
sufficient:

res = math.frexp(value)[1] - EXP_OF_ONE
test = 1<<res
if test > value:     res -= 1
elif 2*test < value: res += 1

For value = 2**n -1, n > 53, it is always 1 too high on my Intel
machine, so the first correction is sometimes needed.  I do not know if
the second is or not.

See also http://bugs.python.org/issue3439
where there's a proposal to expose the _PyLong_NumBits method. This
would give an O(1) solution.

Mark
 

bearophileHUGS

Mark Dickinson:
See also http://bugs.python.org/issue3439
where there's a proposal to expose the _PyLong_NumBits method. This
would give an O(1) solution.

Sounds useful; I've used gmpy.numdigits(x [,base]) a few times.
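
For example (this assumes the gmpy extension module is installed; base 2
gives the number of binary digits, so the highest bit position is one
less):

# gmpy.numdigits(x, 2) counts binary digits, so for positive x the
# highest set bit sits at numdigits(x, 2) - 1.
import gmpy
x = (1 << 160) - 1
print gmpy.numdigits(x, 2)       # 160
print gmpy.numdigits(x, 2) - 1   # 159, the highest bit position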

Bye,
bearophile
 

Holger

See also http://bugs.python.org/issue3439
where there's a proposal to expose the _PyLong_NumBits method.  This
would give an O(1) solution.
Doesn't that depend on the underlying implementation?

Anyway, here's a pretty one (I think):
def highest_bit(n, maxbits = 256):
    bit = 0
    while maxbits > 1:
        maxbits = maxbits >> 1
        mask_b = (1<<maxbits) - 1
        mask_a = mask_b << maxbits
        a = n & mask_a
        b = n & mask_b
        if a:
            bit = bit + maxbits
            n = a >> maxbits
        else:
            n = b
    return bit

I suspect you would call that an O(log n) solution.

Holger
 

Holger

def highest_bit(n, maxbits = 256):
    bit = 0
    while maxbits > 1:
        maxbits >>= 1
        a = n >> maxbits
        if a:
            bit += maxbits
            n = a
    return bit

is slightly better
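
A quick spot-check against the examples from the original post (assuming
the default maxbits of 256 is enough for the inputs):

# Inputs must stay below 1 << 256 for the default maxbits to cover them.
assert highest_bit(1 << 159) == 159
assert highest_bit((1 << 160) - 1) == 159
assert highest_bit(1 << 160) == 160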
 

M.-A. Lemburg

Hi,

I'm using python to develop some proof-of-concept code for a
cryptographic application. My code makes extended use of python's
native bignum capabilities.

In many cryptographic applications there is the need for a function
'get_highest_bit_num' that returns the position number of the highest
set bit of a given integer. For example:

get_highest_bit_num( (1 << 159)) == 159
get_highest_bit_num( (1 << 160) - 1) == 159
get_highest_bit_num( (1 << 160)) == 160

I implemented this the following way:

def get_highest_bit_num(r):
    i = -1
    while r > 0:
        r >>= 1
        i = i + 1
    return i

This works, but it is a very unsatisfying solution, because it is so
slow.

My second try was using the math.log function:

import math
r = (1 << 160) - 1
print get_highest_bit_num(r) # prints out 159
print math.floor(math.log(r, 2)) # prints out 160.0

We see that math.log can only serve as a heuristic for the highest bit
position. For small r, for example r = (1 << 16) - 1, the result from
math.log(r, 2) is correct; for big r it no longer is.

My question to the group: Does anyone know of a non-hackish way to
determine the required bit position in python? I know that my two
ideas
can be combined to get something working. But is there a *better* way,
that isn't that hackish?

In Python 2.6, you can use the bin() builtin: for the r = (1 << 160) - 1
from above, the result is 159.
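
A sketch of that approach (bin() returns a string with a '0b' prefix, so
for positive r the highest bit position is the string length minus 3):

# Python 2.6: bin(r) returns e.g. '0b111...'; drop the two-character '0b'
# prefix, and the number of remaining digits minus one is the highest bit.
def get_highest_bit_num(r):
    assert r > 0
    return len(bin(r)) - 3

r = (1 << 160) - 1
print get_highest_bit_num(r)   # 159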

--
Marc-Andre Lemburg
eGenix.com

 

Fredrik Johansson

Hi,

My question to the group: Does anyone know of a non-hackish way to
determine the required bit position in python? I know that my two
ideas
can be combined to get something working. But is there a *better* way,
that isn't that hackish?

No non-hackish way.

I've myself needed a version of this function for a speed-critical
application. For my purposes, it needs to be fast both for small and
extremely large numbers; the following is the fastest version I've
come up with. The idea is to use bisect with a precomputed list when
possible, and math.log (with a correction) for larger numbers. There's
a tradeoff: a longer bisect table makes it fast for larger numbers, but
the bisection also becomes slower for small numbers. For my
application, 300 turned out to be a reasonable compromise.

from bisect import bisect
from math import log

powers = [1<<_ for _ in range(300)]

def bitcount(n):
    bc = bisect(powers, n)
    if bc != 300:
        return bc
    bc = int(log(n, 2)) - 4
    return bc + bctable[n>>bc]

bctable = map(bitcount, range(1024))

Calling this function is still slow, so when possible, I track the
approximate number of bits in a number across operations, and do
(bc + bctable[n>>bc]) inline (where bc is a low estimate of the number
of bits) to obtain the exact bit count.
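
As a rough illustration of that inline pattern (the multiplication
example and the variable names here are mine, not from the real
application; it assumes the bitcount and bctable definitions above):

# If bits_a and bits_b are exact bit counts, their product has either
# bits_a + bits_b or bits_a + bits_b - 1 bits, so the lower value is a
# valid low estimate bc and n >> bc stays small enough for bctable.
a, b = (1 << 100) - 1, (1 << 60) + 12345
bits_a, bits_b = bitcount(a), bitcount(b)

n = a * b
bc = bits_a + bits_b - 1          # low estimate, off by at most one
exact = bc + bctable[n >> bc]     # the inline expression from above

assert exact == bitcount(n)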

Fredrik
 

Aaron "Castironpi" Brady

Your point, that taking floor(log2(x)) is redundant, is a good catch.
However, you should have added 'untested' ;-).  When value has more
significant bits than the fp mantissa can hold, this expression can be 1
off (but no more, I assume).   The following correction should be
sufficient:
res = math.frexp(value)[1] - EXP_OF_ONE
test = 1<<res
if test > value:     res -= 1
elif 2*test < value: res += 1
For value = 2**n -1, n > 53, it is always 1 too high on my Intel
machine, so the first correction is sometimes needed.  I do not know if
the second is or not.

See also http://bugs.python.org/issue3439
where there's a proposal to expose the _PyLong_NumBits method.  This
would give an O(1) solution.

Mark

That generates an odd error with ctypes.
Traceback (most recent call last):
File "_ctypes/callbacks.c", line 295, in 'calling callback function'
ctypes.ArgumentError: argument 1: <type 'exceptions.OverflowError'>:
long int too long to convert
2227205
Seems 'ctypes' tries to call PyLong_AsUnsignedLong for you. Anyone
know a way around it?
 

Thomas Heller

Mark said:
Your point, that taking floor(log2(x)) is redundant, is a good catch.
However, you should have added 'untested' ;-). When value has more
significant bits than the fp mantissa can hold, this expression can be 1
off (but no more, I assume). The following correction should be
sufficient:

res = math.frexp(value)[1] - EXP_OF_ONE
test = 1<<res
if test > value: res -= 1
elif 2*test < value: res += 1

For value = 2**n -1, n > 53, it is always 1 too high on my Intel
machine, so the first correction is sometimes needed. I do not know if
the second is or not.

See also http://bugs.python.org/issue3439
where there's a proposal to expose the _PyLong_NumBits method. This
would give an O(1) solution.

Mark

Here is ctypes code that works (tested with Python 2.5):

from ctypes import *

_PyLong_NumBits = pythonapi._PyLong_NumBits
_PyLong_NumBits.argtypes = py_object,
_PyLong_NumBits.restype = c_int

for i in range(64):
    x = 1 << i
    print i, _PyLong_NumBits(long(x))


Thomas
 

mmgarvey

See also http://bugs.python.org/issue3439
where there's a proposal to expose the _PyLong_NumBits method.  This
would give an O(1) solution.

Here is surely the wrong place to respond to
http://bugs.python.org/msg71384
but I want to make clear that I think that (0).numbits()==-1
is the natural solution. At least for all square-and-multiply-like
algorithms needed in cryptography (modular exponentiation, point
multiplication on elliptic curves, generation of Lucas numbers, etc.)
this property of .numbits() would be needed.
 

Mark Dickinson

but I want to make clear that I think that (0).numbits()==-1
is the natural solution. At least for all square-and-multiply-like
algorithms needed in [...]

Can you clarify this? Why is -1 the natural solution? I
can see a case for 0, for -infinity (whatever that means),
or for raising an exception, but I can't see why -1 would
be at all useful or natural.

Here's a straightforward (mildly inefficient)
implementation of the left-to-right square-and-multiply algorithm
for powering. It works just fine with nbits(0) == 0.

def nbits(n):
    return 0 if n == 0 else len(bin(abs(n))) - 2

def LtoRpow(a, n):
    acc = 1
    for i in reversed(xrange(nbits(n))):
        acc *= acc
        if n & (1<<i):
            acc *= a
    return acc
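
A quick sanity check against the built-in power operator (the sample
values are arbitrary; note the n == 0 case works because nbits(0) == 0):

# Compare LtoRpow with ** on a few bases and exponents, including 0.
for a in (2, 3, 10):
    for n in (0, 1, 53, 160):
        assert LtoRpow(a, n) == a ** n
print "LtoRpow agrees with ** on the samples above"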
 

MRAB

but I want to make clear that I think that (0).numbits()==-1
is the natural solution. At least for all square-and-multiply-like
algorithms needed in [...]

Can you clarify this?  Why is -1 the natural solution? I
can see a case for 0, for -infinity (whatever that means),
or for raising an exception, but I can't see why -1 would
be at all useful or natural.
[snip]
Compare it with, say, str.find, which returns the position if found
(>=0) or -1 if not found.

int.numbits should return the position of the highest set bit (>=0) or
-1 if none found.

On the other hand, if you compare it with str.index then int.numbits
should raise ValueError if none found.

Actually, the name "numbits" suggests that it returns the number of
bits, so (1).numbits() should return 1 and (0).numbits() should return
0. If it's called "highbit" then it suggests that it returns the
position of the highest (set) bit, so (1).highbit() should return 0
and (0).highbit() should return -1 (or raise an exception).
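
To make the naming difference concrete, here is a sketch of the two
conventions as plain functions (both names are hypothetical; bin()
needs Python 2.6):

# Hypothetical spellings of the two conventions described above.
def numbits(n):
    # number-of-bits convention: numbits(0) == 0, numbits(1) == 1
    return 0 if n == 0 else len(bin(abs(n))) - 2

def highbit(n):
    # highest-bit-position convention: highbit(1) == 0; for 0 either
    # raise, or return -1 if following the str.find style
    if n == 0:
        raise ValueError("highbit() of zero")
    return numbits(n) - 1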
 

Terry Reedy

MRAB said:
but I want to make clear that I think that (0).numbits()==-1
is the natural solution. At least for all square-and-multiply-like
algorithms needed in [...]
Can you clarify this? Why is -1 the natural solution? I
can see a case for 0, for -infinity (whatever that means),
or for raising an exception, but I can't see why -1 would
be at all useful or natural.
[snip]
Compare it with, say, str.find, which returns the position if found
(>=0) or -1 if not found.

str.find is an historical anomaly that should not be copied. It
was (is?) a wrapper for C's string find function. C routinely uses -1 to
mean None for functions statically typed to return ints. The Python
version logically should return None and usually does for other functions.
 
