how are dictionary literals handled by the interpreter?

A

akameswaran

I wrote up a quick little set of tests, I was acutally comparing ways
of doing "case" behavior just to get some performance information. Now
two of my test cases had almost identical results which was not at all
what I expected. Ultimately I realized I don't really know how
literals are treated within the interpreter.

The two implementations I was looking at were:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}

def doIt(self,a):
exec(self.caseDict.get(a))
return retval



def caseFunc3(a):
exec( {'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}.get(a))
return retval


I had expected caseFunc3 to be slower. I had thought the interpreter
would be recreating the dictionary each time, but that doesn't seem to
be the case since performance of the class version and the function
version are nearly identical on most runs. If i rewrite caseFunc3 as:

def caseFunc4(a):
exec( dict({'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}).get(a))
return retval

now with the explicit use of dict, i see the performace of the
functional version decline as I initially expected.

So what is happeneing in caseFunc3. It seems as though the literal is
"cached". The other hypothesis I came up with is the name lookup for
self.caseDict takes the same amount of time as creating the dictionary
literal - but that doesn't make sense to me.

Thanks
 
J

Jason

I wrote up a quick little set of tests, I was acutally comparing ways
of doing "case" behavior just to get some performance information. Now
two of my test cases had almost identical results which was not at all
what I expected. Ultimately I realized I don't really know how
literals are treated within the interpreter.

The two implementations I was looking at were:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}

def doIt(self,a):
exec(self.caseDict.get(a))
return retval



def caseFunc3(a):
exec( {'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}.get(a))
return retval


I had expected caseFunc3 to be slower. I had thought the interpreter
would be recreating the dictionary each time, but that doesn't seem to
be the case since performance of the class version and the function
version are nearly identical on most runs. If i rewrite caseFunc3 as:

def caseFunc4(a):
exec( dict({'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}).get(a))
return retval

now with the explicit use of dict, i see the performace of the
functional version decline as I initially expected.

So what is happeneing in caseFunc3. It seems as though the literal is
"cached". The other hypothesis I came up with is the name lookup for
self.caseDict takes the same amount of time as creating the dictionary
literal - but that doesn't make sense to me.

Thanks

Why not check to see what the interpreter is doing? Rather than
dealing with your overly complicated dictionaries, I've made a simple,
one case dictionary. I've also done a similar bit to replicate your
doIt method.
.... exec( {'a': "retval = 'a'"}.get(a) )
.... return retval
........ exec( dict({'a': "retval = 'a'"}).get(a) )
.... return retval
........ def doIt(self, a):
.... exec(self.caseDict.get(a))
.... return retval
....

Then, use the dis module to disassemble the function objects: 2 0 BUILD_MAP 0
3 DUP_TOP
4 LOAD_CONST 1 ('a')
7 LOAD_CONST 2 ("retval = 'a'")
10 ROT_THREE
11 STORE_SUBSCR
12 LOAD_ATTR 0 (get)
15 LOAD_FAST 0 (a)
18 CALL_FUNCTION 1
21 LOAD_CONST 0 (None)
24 DUP_TOP
25 EXEC_STMT

3 26 LOAD_NAME 2 (retval)
29 RETURN_VALUE 2 0 LOAD_NAME 0 (dict)
3 BUILD_MAP 0
6 DUP_TOP
7 LOAD_CONST 1 ('a')
10 LOAD_CONST 2 ("retval = 'a'")
13 ROT_THREE
14 STORE_SUBSCR
15 CALL_FUNCTION 1
18 LOAD_ATTR 1 (get)
21 LOAD_FAST 0 (a)
24 CALL_FUNCTION 1
27 LOAD_CONST 0 (None)
30 DUP_TOP
31 EXEC_STMT

3 32 LOAD_NAME 3 (retval)
35 RETURN_VALUE 3 0 LOAD_FAST 0 (self)
3 LOAD_ATTR 1 (caseDict)
6 LOAD_ATTR 2 (get)
9 LOAD_FAST 1 (a)
12 CALL_FUNCTION 1
15 LOAD_CONST 0 (None)
18 DUP_TOP
19 EXEC_STMT

4 20 LOAD_NAME 4 (retval)
23 RETURN_VALUE
Take a look at what happens before the 'get' attribute is loaded in
each case. In case 3, you've simply created a dictionary literal,
which is a very fast operation under Python. In case 4, you've created
a dictionary literal, then you call the dict() function. The dict
function will create a dictionary from the supplied dictionary, and
return the shallow copy.

Case 3 is slower, but the Python developers have worked to make
dictionary creation and look-up very fast. Did you use the timeit
module to test your functions?

--Jason
 
A

akameswaran

the lengthy dictionaries were due to my initial testing comparing
if/ifelse constructs to dictionary cases.. btw if/ifelse seems a lot
faster till you get to at least 12 cases, and for most applications,
the number of cases would be even higher before dictionaries beat a
bunch of ifs
Why not check to see what the interpreter is doing? Rather than
dealing with your overly complicated dictionaries, I've made a simple,
one case dictionary. I've also done a similar bit to replicate your
doIt method.
Thanks, i didn't know about dis. So this is very useful.
... exec( {'a': "retval = 'a'"}.get(a) )
... return retval
...
... exec( dict({'a': "retval = 'a'"}).get(a) )
... return retval
...
... def doIt(self, a):
... exec(self.caseDict.get(a))
... return retval
...

Then, use the dis module to disassemble the function objects:
2 0 BUILD_MAP 0
3 DUP_TOP
4 LOAD_CONST 1 ('a')
7 LOAD_CONST 2 ("retval = 'a'")
10 ROT_THREE
11 STORE_SUBSCR
12 LOAD_ATTR 0 (get)
15 LOAD_FAST 0 (a)
18 CALL_FUNCTION 1
21 LOAD_CONST 0 (None)
24 DUP_TOP
25 EXEC_STMT

3 26 LOAD_NAME 2 (retval)
29 RETURN_VALUE
2 0 LOAD_NAME 0 (dict)
3 BUILD_MAP 0
6 DUP_TOP
7 LOAD_CONST 1 ('a')
10 LOAD_CONST 2 ("retval = 'a'")
13 ROT_THREE
14 STORE_SUBSCR
15 CALL_FUNCTION 1
18 LOAD_ATTR 1 (get)
21 LOAD_FAST 0 (a)
24 CALL_FUNCTION 1
27 LOAD_CONST 0 (None)
30 DUP_TOP
31 EXEC_STMT

3 32 LOAD_NAME 3 (retval)
35 RETURN_VALUE
3 0 LOAD_FAST 0 (self)
3 LOAD_ATTR 1 (caseDict)
6 LOAD_ATTR 2 (get)
9 LOAD_FAST 1 (a)
12 CALL_FUNCTION 1
15 LOAD_CONST 0 (None)
18 DUP_TOP
19 EXEC_STMT

4 20 LOAD_NAME 4 (retval)
23 RETURN_VALUE

Take a look at what happens before the 'get' attribute is loaded in
each case. In case 3, you've simply created a dictionary literal,
which is a very fast operation under Python. In case 4, you've created
a dictionary literal, then you call the dict() function. The dict
function will create a dictionary from the supplied dictionary, and
return the shallow copy.
I knew case4 would be slower, it did what it intended, but didn't
answer my question. How's that for confused thinking!!
Case 3 is slower, but the Python developers have worked to make
dictionary creation and look-up very fast. Did you use the timeit
module to test your functions?
I had a simple timer, not timeit.

def timeIt(func,vals):
start = time.time()
for item in vals:
func(item)
return time.time()-start

#reinvented the wheel too, not a good day to program

The output provided was a good starting point, and unless I am really
misreading things, python is not caching my dictionary literal. It
would appear as though it gets loaded every iteration - which is what I
expected.

So here's the odd part - and one I still find remarkably
counter-intuitive. and i'll set up some of the params here

1) I did not count object instantiation in the time taken - so in the
class based version the dictionary was already created before I start
timing

2) Once the dictionary is either created (function) or retreived(class)
the execution is identical. That can be seen from the output above you
provided.

which lead me to hypothesize:
3) loading a dictionary literal seems faster than accessing that
dictionary as a class member.

I really find 3 to be surprising although I guess I shouldn't - lookups
are not cheap... I repeat that to myself all the time. apparently they
really really aren't cheap.

Now that begs the question... and I'm writing something up to test it.
Will a larger dictionary change this dynamic. My suspicion yes, how
much bigger I have no idea.


and mostly thanks for the intro to a new module. With all the esoteric
junk I play with and experiment with I can't believe I didn't stumble
on it before.
 
S

sjdevnull

I wrote up a quick little set of tests, I was acutally comparing ways
of doing "case" behavior just to get some performance information. Now
two of my test cases had almost identical results which was not at all
what I expected. Ultimately I realized I don't really know how
literals are treated within the interpreter.

The two implementations I was looking at were:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}

def doIt(self,a):
exec(self.caseDict.get(a))
return retval

Have you considered eliminating the exec() wart just doing:

self.caseDict = {'a'='a', 'b'='b'}
.....
def doIt(self, a):
return self.caseDict[a]

(or the get call)
Or for more complex cases going with something like:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':self.do_a, 'b':self.do_b}
def do_a(self):
return 'a'
def do_b(self):
return 'b'
def doIt(self, a):
return self.caseDict[a]()

or eliminating the init and
def doIt(self, a):
return getattr(self, "do_"+a)()

? Obviously with these two you might want a bit more complexity for a
default if the attribute/dict entry is missing, but the simple case
shows the idea.
 
A

akameswaran

Have you considered eliminating the exec() wart just doing:

thr dict wart was added - needlessly simply to see if there was a
difference. case4 was a reaction to the odd timings. I wouldn't use
it ever in the real world as it is totally unnecessary.
self.caseDict = {'a'='a', 'b'='b'}
....
def doIt(self, a):
return self.caseDict[a]

(or the get call)
Or for more complex cases going with something like:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':self.do_a, 'b':self.do_b}
def do_a(self):
return 'a'
def do_b(self):
return 'b'
def doIt(self, a):
return self.caseDict[a]()

or eliminating the init and
def doIt(self, a):
return getattr(self, "do_"+a)()

This is clever, but really only works for this contrived example. If I
wanted to base my cases on an object.... Still I like it's elimination
of seemingly redundant dictionary. The original purpose of all of this
was to get a feel for performance differences between endless if/else
statements and a dictionary for performing switch/case type stuff.

The upshot of all of this, is I will probably stick to the if/else
construct except for fairly complex cases for two reasons

1) only in the most braindead version of an if/else construction is
dictionary faster for small (ie less than 20) different cases.
2) dictionary method won't work if I want an switch could match
multiple case conditions
 
A

akameswaran

Have you considered eliminating the exec() wart just doing:
Sorry I misread exec as dict in my earlier reply. I did have another
class based implementation that used functions in the class. That was
the only one that performed well against if/else. And only for large
number of cases. When I compared the two implementations I stated here
- that's where I wondered about my understanding of dictionary literals
with regards to performance. I could also rewrite the function version
to store functions in the dictionary instead of statements for exec -
but here I was just wanting to compare retrieving a dictionary class
member, vs reading a dictionary literal. I think the code did a good
job of isolating that as the only change - and that is validated by
using the dissassembler.
self.caseDict = {'a'='a', 'b'='b'}
....
def doIt(self, a):
return self.caseDict[a]

(or the get call)
Or for more complex cases going with something like:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':self.do_a, 'b':self.do_b}
def do_a(self):
return 'a'
def do_b(self):
return 'b'
def doIt(self, a):
return self.caseDict[a]()

or eliminating the init and
def doIt(self, a):
return getattr(self, "do_"+a)()

? Obviously with these two you might want a bit more complexity for a
default if the attribute/dict entry is missing, but the simple case
shows the idea.
 
B

Bruno Desthuilliers

I wrote up a quick little set of tests, I was acutally comparing ways
of doing "case" behavior just to get some performance information. Now
two of my test cases had almost identical results which was not at all
what I expected. Ultimately I realized I don't really know how
literals are treated within the interpreter.

The two implementations I was looking at were:

class caseFunction(object):
def __init__(self):
self.caseDict = {'a':"retval = 'a'",
'b':"retval='b'","c":"retval='c'","d":"retval='d'",

"e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
"i":"retval='i'"}

def doIt(self,a):
exec(self.caseDict.get(a))
return retval

Err... Why would you want to exec anything here ? Remember that
Python's functions are objects too:

def funcA(*args, **kw):
return "funcA called with %s %s" % (str(args), kw)

def funcB(*args, **kw):
return "funcB called with %s %s" % (str(args), kw)

def funcC(*args, **kw):
return "funcC called with %s %s" % (str(args), kw)

def defaultFunc(*args, **kw):
return "defaultFunc called with %s %s" % (str(args), kw)

class SwitchFunc(object):
def __init__(self, default, **kw):
self._default = default
self._switch = kw

# makes the object callable.
def __call__(self, case, *args, **kw):
func = self._switch.get(case, self._default)
return func(*args, **kw)

switch = SwitchFunc(defaultFunc, a=funcA, b=funcB, c=funcC)

for case in "abcX":
print switch(case, "foo", q=42)

HTH
 
A

akameswaran

Bruno said:
Err... Why would you want to exec anything here ? Remember that
Python's functions are objects too:
Largely because it was immaterial to what I am asking about here, which
is dictionary literals. I was also curious about how much overhead
exec had - even on simple statements, and it's about as bad as I would
have guessed. Finally - it's a little quicker to write out the exec
dictionary than all of the functions ;)
def funcA(*args, **kw):
return "funcA called with %s %s" % (str(args), kw)

def funcB(*args, **kw):
return "funcB called with %s %s" % (str(args), kw)

def funcC(*args, **kw):
return "funcC called with %s %s" % (str(args), kw)

def defaultFunc(*args, **kw):
return "defaultFunc called with %s %s" % (str(args), kw)

class SwitchFunc(object):
def __init__(self, default, **kw):
self._default = default
self._switch = kw

# makes the object callable.
def __call__(self, case, *args, **kw):
func = self._switch.get(case, self._default)
return func(*args, **kw)

switch = SwitchFunc(defaultFunc, a=funcA, b=funcB, c=funcC)

for case in "abcX":
print switch(case, "foo", q=42)
Now I'm not sure what the value of "semi" abstraction is here. If we
are going to make the class semi generic, might as well have it use
exec, then the function determined by the switch statement could be
dynamically changed at runtime(and run like a pig). Otherwise, we need
to have those functions implemented so what do we gain? I was also
going to say, why not make the functions instance methods, but --
surprise surprise - when I timed it out, the lookup was ever so
slightly slower for the instance methods. I had to do about 100,000
loops of it for the speed difference to be definitive, but oddly enough
lookup to the module function was faster than looking up an instance
method.
 
B

Bruno Desthuilliers

Largely because it was immaterial to what I am asking about here, which
is dictionary literals.

Indeed. But it could as well have been related to not knowing about
first-class functions (I always suspect something wrong when I see and
exec or eval...)
I was also curious about how much overhead
exec had - even on simple statements, and it's about as bad as I would
have guessed. Finally - it's a little quicker to write out the exec
dictionary than all of the functions ;)
(snip)

Now I'm not sure what the value of "semi" abstraction is here.

Nor am I - I never came to write such a code in now 6+ years of Python
programming (and believe me, I sometimes write weird and hairy code !-).
If we
are going to make the class semi generic, might as well have it use
exec, then the function determined by the switch statement could be
dynamically changed at runtime

Actually, there are few things that can't be dynamically changed at
runtime in Python !-p
(and run like a pig).
Otherwise, we need
to have those functions implemented so what do we gain?

If you find a way to run code that's not been written one way or
another, *please* share with us !-)
I was also
going to say, why not make the functions instance methods, but --
surprise surprise - when I timed it out, the lookup was ever so
slightly slower for the instance methods.

Hardly surprising - method lookup and invocation is a somewhat involved
process (read about the descriptor protocol and __getattribute__'s
implementation for new style classes...).
I had to do about 100,000
loops of it for the speed difference to be definitive, but oddly enough
lookup to the module function was faster than looking up an instance
method.

Yes. The faster lookup is probably in the local namespace, but then it
would requires a rewrite of the switcher for each and every use case -
in which case it's simpler to rewrite the whole mechanism where needed...

As as side note, the absence of a switch statement in Python usually
leads me to think of a better (IMHO at least) design - but I'm not
really a speed-freak, so YMMV...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,056
Latest member
GlycogenSupporthealth

Latest Threads

Top