Any way to use a range as a key in a dictionary?

M

Mudcat

I would like to use a dictionary to store byte table information to
decode some binary data. The actual number of entries won't be that
large, at most 10. That leaves the other 65525 entries as 'reserved'
or 'other' but still need to be somehow accounted for when
referenced.

So there are a couple of ways to do this that I've seen. I can loop
that many times and create a huge dictionary. This isn't a good idea.
I can just assume if a key isn't there that it's not relevant. That's
a better idea.

However I wondered if there was a way to simply use a range as a key
reference somehow. I played around with some options with no success.
Or maybe there was another way to do this with another data type that
I haven't thought about.

Thanks
 
D

Daniel Fetchinson

I would like to use a dictionary to store byte table information to
decode some binary data. The actual number of entries won't be that
large, at most 10. That leaves the other 65525 entries as 'reserved'
or 'other' but still need to be somehow accounted for when
referenced.

So there are a couple of ways to do this that I've seen. I can loop
that many times and create a huge dictionary. This isn't a good idea.
I can just assume if a key isn't there that it's not relevant. That's
a better idea.

However I wondered if there was a way to simply use a range as a key
reference somehow. I played around with some options with no success.
Or maybe there was another way to do this with another data type that
I haven't thought about.

Ranges are lists and lists are unhashable. But tuples are hashable so
if you convert your lists into tuples, that will work:
x = dict( )
x[range(10)]='hello'
Traceback (most recent call last):
File said:
x[tuple(range(10))]='hello'
x[tuple(range(10))] 'hello'
 
P

Paul Rubin

Mudcat said:
However I wondered if there was a way to simply use a range as a key
reference somehow.

Dictionaries aren't really the right structure for that. You want a
tree or binary search structure of some sort. The bisect module might
be helpful, but you will have to write some code by hand if you use
it.
 
A

Aaron Brady

Mudcat said:
I would like to use a dictionary to store byte table information to
decode some binary data. The actual number of entries won't be that
large, at most 10. That leaves the other 65525 entries as 'reserved'
or 'other' but still need to be somehow accounted for when
referenced.

[...]

i don't completely follow what you are doing, but i currently use the
following to find a transition in a finite automaton for a regular
expression, and i suspect it's similar to what you want.

my problem is that i want to associate certain ranges of characters with a
particular value.  for example, a-c might be 1, p-x might be 2.  i use
intervals that are (a, b) tuples, where a and b are the inclusive bounds
(so (a, b) means a <= x <= b).
snip

and i just realised that i deleted the code that lets me also associate
values with intervals!  but hopefully that gives you the idea.  bisect can
be used to find values via a sorted index.  so once you find "index"
above, you can use that to return a value from an array.

i doubt this is super-fast (i think the bisect library is pure python),
but it was good enough for my initial attempt.

andrew

I wonder if there is a faster method than a binary search, or what the
proof that there isn't is.
 
C

Carl Banks

I would like to use a dictionary to store byte table information to
decode some binary data. The actual number of entries won't be that
large, at most 10. That leaves the other 65525 entries as 'reserved'
or 'other' but still need to be somehow accounted for when
referenced.


I seems like all you really need it to use the get method. If the
item doesn't exist in the dictionary it returns None (or whatever you
pass in as the optional second argument). For instance:


d = { 1: "s", 56: "w", 4363: "n", 8953: "k" }

d.get(1) -> "s"
d.get(56) -> "w"
d.get(10) -> None
d.get(10,"x") -> "x"


Carl Banks
 
D

Dennis Lee Bieber

I would like to use a dictionary to store byte table information to
decode some binary data. The actual number of entries won't be that
large, at most 10. That leaves the other 65525 entries as 'reserved'
or 'other' but still need to be somehow accounted for when
referenced.

So there are a couple of ways to do this that I've seen. I can loop
that many times and create a huge dictionary. This isn't a good idea.
I can just assume if a key isn't there that it's not relevant. That's
a better idea.
Easiest done by NOT using the
dictionary[key]
syntax to retrieve values. Instead use
dictionary.get(key, "NOT_RELEVANT_VALUE")
.... v = theDict.get(x, NOT_RELEVANT)
.... if v is not NOT_RELEVANT:
.... print x, v
....
1 I'm here
5 So am I

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
P

Paul Rubin

Dennis Lee Bieber said:
... v = theDict.get(x, NOT_RELEVANT)
... if v is not NOT_RELEVANT:
... print x, v

I think you'd normally do this with

if x in theDict:
print x, v

but the OP was asking about a different problem, involving looking up
numeric ranges, which could conceivably be very large, thus the
suggestions about interval trees and so forth.
 
C

Carl Banks

I think you'd normally do this with

     if x in theDict:
          print x, v

Where does v come from?
but the OP was asking about a different problem, involving looking up
numeric ranges, which could conceivably be very large, thus the
suggestions about interval trees and so forth.

The fact that the OP asked about what to do with the the other 65526
values suggests that he only wanted to use intervals to fill in the
gaps between meaningful values (thus having complete coverage over the
16-bit integer range). An interval tree would not be a good approach
for that situation.


Carl Banks
 
C

Carl Banks

Where does v come from?

Oops, pasted from original.  Meant of course "print x, theDict[x]".

You have look up x twice with that code, whereas you wouldn't have to
with this:

v = theDict.get(x)
if v is not None:
print x, v

or with this:

try:
v = theDict[x]
except KeyError:
pass
else:
print x,v

Also, these minimize the number of times you have to type x. Thus I
recommend either of these two, or a defaultdict, for this.


Carl Banks
 
P

Peter Otten

Mudcat said:
I would like to use a dictionary to store byte table information to
decode some binary data. The actual number of entries won't be that
large, at most 10. That leaves the other 65525 entries as 'reserved'
or 'other' but still need to be somehow accounted for when
referenced.

So there are a couple of ways to do this that I've seen. I can loop
that many times and create a huge dictionary. This isn't a good idea.
I can just assume if a key isn't there that it's not relevant. That's
a better idea.

However I wondered if there was a way to simply use a range as a key
reference somehow. I played around with some options with no success.
Or maybe there was another way to do this with another data type that
I haven't thought about.

You can use a defaultdict
from collections import defaultdict
d = defaultdict(lambda: "reserved", [(1,2), (2,3)])
d[1] 2
d[2] 3
d[42]
'reserved'

If the keys are successive integers starting at 0 a list is also an option.
It makes setting ranges to a particular value easy:
d = ["reserved"]*2**16
d[10:20] = [42]*10
d[5], d[10], d[15], d[20]
('reserved', 42, 42, 'reserved')

Peter
 
C

Carl Banks

Carl Banks said:
     if x in theDict:
          print x, v
Where does v come from?
Oops, pasted from original.  Meant of course "print x, theDict[x]".
You have look up x twice with that code, whereas you wouldn't have to
with this:
v = theDict.get(x)
if v is not None:
    print x, v

Note that while you only lookup x in the dict once your code does still
involve two dict lookups: once to lookup the get method and once to lookup
x. It also involves creating a new bound method object so if performance is
a concern you'll find that either the 'if x in theDict: ...' or the
try...except are likely to be faster.

Not necessarily: if the hash calculation for x is expensive enough the
get version would still be faster. If all you're doing is looking up
strings or ints (as the OP is doing) it's hardly going to matter
either way.


Carl Banks
 
P

Paul Rubin

Carl Banks said:
Not necessarily: if the hash calculation for x is expensive enough the
get version would still be faster.

Yeah, the get version with the special marker value is just ugly IMO,
as is the version with exceptions. Maybe there should be a two-value
version:

found, value = d.xget(key [,default])

xget returns a 2-tuple, the first element of which is a boolean
indicating whether the key was present. The second is the value (if
present) or the default (if supplied), or None.

Another possible interface:

values = d.vget(key)

This returns a list containing the values found for the key, or an
empty list if none are found. In the case where d is a dict,
the list is therefore always either 1-element or empty, but one
could imagine other mapping types where longer lists could come back.
 
D

Dennis Lee Bieber

Yeah, the get version with the special marker value is just ugly IMO,
as is the version with exceptions. Maybe there should be a two-value
version:

found, value = d.xget(key [,default])

xget returns a 2-tuple, the first element of which is a boolean
indicating whether the key was present. The second is the value (if
present) or the default (if supplied), or None.
And what real difference is that going to gain versus the existing
..get() method where the default is a sentinel? You still end up with the
need for some subsequent test:

vl = d.get(key, SENTINEL)
if vl is not SENTINEL:
#do stuff

vs

f, vl = d.xget(key)
if f:
#do stuff

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
P

Paul Rubin

Dennis Lee Bieber said:
And what real difference is that going to gain versus the existing
.get() method where the default is a sentinel?

It's just less ugly. I don't know a way to get a unique sentinel
other than "sentinel = object()" or something like that, creating more
clutter. Maybe in a specific application, I can figure out a value
that won't occur in the dictionary and use that, but it's cleaner to
use a generic construction.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,869
Messages
2,569,911
Members
46,168
Latest member
wql4450989

Latest Threads

Top