LRU cache?

P

Paul Rubin

Anyone got a favorite LRU cache implementation? I see a few in google
but none look all that good. I just want a dictionary indexed by
strings, that remembers the last few thousand entries I put in it.

It actually looks like this is almost a FAQ. A well-written
implementation would probably make a good standard library module.
 
T

Thomas Wittek

Paul said:
Anyone got a favorite LRU cache implementation? I see a few in google
but none look all that good. I just want a dictionary indexed by
strings, that remembers the last few thousand entries I put in it.

I don't know a module for that (although it might exist), but I could
imagine how to implement it.

Generally a LRU cache could be implemented as a (linked) list with a
max. size of N.
On usage of an object this object will be moved to the top of the list
(and removed from the old position).
If it's not yet stored in the cache, you (fetch it from the source and)
add it on top and remove the last one, if the list exceeds N entries.

So you need to do two things:
1) Maintain a list: Use a (linked) list.
2) Random access to that list: Use a dict (hash), that also has to be
updated on removal/addition of objects.
 
S

Steve Holden

Thomas said:
I don't know a module for that (although it might exist), but I could
imagine how to implement it.

Generally a LRU cache could be implemented as a (linked) list with a
max. size of N.
On usage of an object this object will be moved to the top of the list
(and removed from the old position).
If it's not yet stored in the cache, you (fetch it from the source and)
add it on top and remove the last one, if the list exceeds N entries.

So you need to do two things:
1) Maintain a list: Use a (linked) list.
2) Random access to that list: Use a dict (hash), that also has to be
updated on removal/addition of objects.
Paul's an old enough hand that he's well able to determine this for
himself. He's also polite enough not to reply along those lines :)

*I* don't know of module for tracking how plants grow, but I could
imagine how to implement it ...

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------
 
A

Alex Martelli

Paul Rubin said:
Anyone got a favorite LRU cache implementation? I see a few in google
but none look all that good. I just want a dictionary indexed by
strings, that remembers the last few thousand entries I put in it.

So what's wrong with Evan Prodromou's lrucache.py module that's in pypi?
Haven't used it, but can't see anything wrong at a glance.


Alex
 
N

Nikita the Spider

Paul Rubin said:
Anyone got a favorite LRU cache implementation? I see a few in google
but none look all that good. I just want a dictionary indexed by
strings, that remembers the last few thousand entries I put in it.

It actually looks like this is almost a FAQ. A well-written
implementation would probably make a good standard library module.

This one works for me:
http://www.webfast.com/~skip/python/Cache.py
 
P

Paul Rubin

So what's wrong with Evan Prodromou's lrucache.py module that's in pypi?
Haven't used it, but can't see anything wrong at a glance.

Thanks, I wasn't aware of that one. I notice two things about it:
1) it's under a GPL-incompatible license (therefore also incompatible
with the PSF license); that's a problem for what I'm doing.

2) it uses heapq to figure out which entry is the oldest. I guess
that's not too unreasonable and maybe it's necessary. But unless I'm
missing something it's never necessary to insert new records into the
middle of the list-by-age. So I think it may be easiest to just keep
a doubly linked list of records with the invariant that they are
ordered by age, keeping track of where both ends are; plus a hash
table to map indices to nodes in the linked list. On any access that
hits the cache, the found node gets removed and moved to the front of
the list in O(1) operations. This may be easiest with a circular
list.
 
P

Paul Rubin

Nikita the Spider said:

Thanks, this one keeps track of the actual clock time rather than just
the relative age of cache entries, and has kind of a weird ageing
mechanism, building and sorting an O(n) sized list when the cache gets
95% full, giving amortized O(log n) operations per update. But the
license is ok and I guess the functionality I need is there. So I may
use it and/or code something.
 
Q

qyloxe

#simple, but effective (sometimes)

class ICORCache:
def __init__(self,agetfunc,amaxlen=100):
self.GetValue=agetfunc
self.MaxLen=amaxlen
self.VDict={}
self.KDict={}
self.VPos=1
self.AccessRatio=0
self.HitRatio=0
def __getitem__(self,key):
self.AccessRatio=self.AccessRatio+1
id=self.KDict.get(key,0)
if id:
self.HitRatio=self.HitRatio+1
return self.VDict[id][1]
else:
v=self.VDict.get(self.VPos,0)
if v:
del self.KDict[v[0]]
ret=apply(self.GetValue,(key,))
self.VDict[self.VPos]=[key,ret]
self.KDict[key]=self.VPos
self.VPos=self.VPos+1
if self.VPos>self.MaxLen:
self.VPos=1
return ret
def dump(self):
print 'Access ratio:',self.AccessRatio
print 'Hit ratio:',self.HitRatio
print 'Hit %:',100.0*self.HitRatio/self.AccessRatio
print 'VDict len:',len(self.VDict.keys())
print 'KDict len:',len(self.KDict.keys())

#tests
if __name__=='__main__':
import random

def GetCalculatedValue(key):
x=`key`+'aaaa'
y=key*0.5
z=x+`y`
return z

# model of distribution
amax=1000
def GetIterative(key):
return key
def GetRandom(key):
return random.randint(1,amax)

l1,l2,l3=random.randint(1,amax),random.randint(1,amax),random.randint(1,amax)
def GetNDist(key):
global l1,l2,l3
l1,l2,l3=l2,l3,random.randint(1,amax)
return int((l1+l2+l3)/3)

def Main():
for aname,afunc in [['Iterative',GetIterative],
['Random',GetRandom],['NDist',GetNDist]]:
acache=ICORCache(GetCalculatedValue,100)
for i in range(1000):
k=afunc(i)
x=acache[k]
y=GetCalculatedValue(k)
if x!=y:
print 'Error!!!',k,x,y
print 'Name:',aname
acache.dump()
print
return

Main()
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,898
Latest member
BlairH7607

Latest Threads

Top