Dictionary, integer compression

D

dineshv

If you store a large number of integers (keys and values) in a
dictionary, do the Python internals perform integer compression to
save memory and enhance performance? Thanks

Dinesh
 
B

bearophileHUGS

Dinesh:
If you store a large number of integers (keys and values) in a
dictionary, do the Python internals perform integer compression to
save memory and enhance performance?  Thanks

Define what you mean with "integer compression" please. It has several
meanings according to the context. For example this one:
http://en.wikipedia.org/wiki/Golomb_coding
(The answer is probably negative).

Bye,
bearophile
 
D

dineshv

Yes, "integer compression" as in Unary, Golomb, and there are a few
other schemes.

It is known that for large (integer) data sets, encoding and decoding
the integers will save space (memory and/or storage) and doesn't
impact performance.

As the Python dictionary is a built-in (and an important data
structure), I wondered if the Python internals used integer
compression for the dictionary (especially as the size of the
dictionary grew)?

Dinesh
 
D

Diez B. Roggisch

dineshv said:
Yes, "integer compression" as in Unary, Golomb, and there are a few
other schemes.

It is known that for large (integer) data sets, encoding and decoding
the integers will save space (memory and/or storage) and doesn't
impact performance.

As the Python dictionary is a built-in (and an important data
structure), I wondered if the Python internals used integer
compression for the dictionary (especially as the size of the
dictionary grew)?

I doubt that, as even integers are python-objects that might be reused. Thus
every integer-entry costs at least sizeof(void*) bytes.

Diez
 
B

bearophileHUGS

dineshv:
Yes, "integer compression" as in Unary, Golomb, and there are a few
other schemes.

OK. Currently Python doesn't uses Golomb and similar compression
schemes.
But in Python3 all integers are multi-precision ones (I don't know yet
what's bad with the design of Python2.6 integers), and a multi-
precision integer takes a space that is proportional to its number of
bits, so this is a compression scheme (even if not designed for small
integers as you ask for).

Bye,
bearophile
 
D

dineshv

Hi bearophile

Thanks for that about Python3. My integers range from 0 to 9,999,999
and I have loads of them. Do you think Python3 will help?

I want to do testing on my local machine with the large numbers of
integers and was wondering if I can get away with an existing Python
data structure or will I have to code a compression scheme.

Dinesh
 
B

bearophileHUGS

dineshv:
Thanks for that about Python3.  My integers range from 0 to 9,999,999
and I have loads of them.  Do you think Python3 will help?

Nope.
Bye,
bearophile
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,280
Latest member
BGBBrock56

Latest Threads

Top