Speed quirk: redundant line gives six-fold speedup

MrJean1 · Aug 25, 2005

Two observations:

1 - The difference in run time with and without the dummy* globals is
due to a difference in the number of invokations of the search()
function: 1,140 resp. 27,530 in my environment.

To verify, just change the line

def search():
....

to

searches = 0
def search():
global searches
searches += 1
....

and add at the very end

print searches, "searches"

2 - The run times with and without the dummy* variables is equal(ly
slow) if the LLentry() class and min() function call are modified to be
independent of the object value.

Change line

class LLentry: pass

to

LLinst = 0
class LLentry(object):
def __init__(self):
global LLinst
LLinst += 1
self.I = LLinst

and change line

mm = min((c.S, c) for c in rowitems(h))[1].D

to

mm = min((c.S, c.I, c) for c in rowitems(h))[2].D

/Jean Brouwers

Claudio Grondi · Aug 26, 2005

I've learnt my lesson

Thank you for your help, and apologies

for wasting other people's time with this as well as my own!

I've learnt my lesson reading through this thread, too.

I am glad to be given the chance of wasting my time
with it and very happy and thankful, that you posted
your problem here before I have bumped into similar
one myself. I suppose, that there are many others
on this newsgroup who appreciated your posting,
too.

Claudio

Benjamin Niemann · Aug 26, 2005

Raymond said:
Thanks for your post. It is cute, brilliant, and interesting.

I haven't had time to investigate but can point at the likely cause.
All of the global names are stored in a single hash table. Search time
is dictated by two factors, the sparseness of the hash table and the
distribution of hash values. With respect to sparseness, whenever that
table becomes 2/3 full, it grows by a factor of four and becomes only
1/6 full (leading to many fewer collisions). With respect to
distribution, it should be noted that string hash values are decidely
non-random and your variable names likely congested consecutive spaces
in a nearly full table (resulting in seven times as many search probes
to find a global value).

When the extra value was added, it likely resized the table four-fold
and redistributed the hash values into fewer consecutive positions.

If that's the cause, then here's another reason to use long, descriptive
names instead of C64-BASIC style a, b, c, i, j... - with long names the
chances of hash collisions are pretty low.
Or everyone will start optimizing their programs by using long, *random*
names

Raymond Hettinger · Aug 26, 2005

[Raymond Hettinger]
[Benjamin Niemann]

If that's the cause, then here's another reason to use long, descriptive
names instead of C64-BASIC style a, b, c, i, j... - with long names the
chances of hash collisions are pretty low.
Or everyone will start optimizing their programs by using long, *random*
names

The wink applied to second line but could have also applied to the
first. For the most part, the non-randomness of string hashes tends to
work in your favor -- it can result in collision free hash tables.

For optimization, the best approach is to use local variables in your
innermost loops.

Alternatively, if you want to be tricky, try something like this:

globals().update(globals())

What this does and why it can improve search times is left as an
exercise for the reader

Raymond

Blue J Ciphertext Program	2	Nov 22, 2023
Fibonacci	0	May 13, 2023
My Status, Ciphertext	2	Nov 28, 2023
Python code problem	2	Apr 23, 2023
Beginner. 2d rotation gives unexpected results.	4	Jul 23, 2013
ChatGPT will make us Job(Home)less	3	Jan 22, 2023
How to play corresponding sound?	2	Jun 10, 2023
Speed of Python	8	Sep 7, 2007

Speed quirk: redundant line gives six-fold speedup

MrJean1

Claudio Grondi

Benjamin Niemann

Raymond Hettinger

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads