Looking for a buffered/windowed iterator

samwyse

I have a Python program that lets me interact with a bunch of files.
Unfortunately, the program assumes that the bunch is fairly small, and
I have thousands of files on relatively slow storage. Just creating a
list of the file names takes several minutes, so I'm planning to
replace the list with an iterator in another thread. However, each
file requires several seconds to load before I can work with it. The
current code alleviates this via a thread that loads the next file
while I'm examining the current one. I'd like my new iterator to
provide a fixed window around the current item. I've looked at the
pairwise recipe, but it doesn't allow me to peek at items within the
window (to generate thumbnails, etc), and I've looked at arrayterator,
but it divides the stream into small contiguous blocks where crossing
a block boundary is relatively expensive.

Previous discussions in c.l.py (primarily those that propose new
functions to be added to itertools) claim that people do this all the
time, but seem woefully short of actual examples. Before I possibly
re-invent the wheel(*), could someone point me to some actual code
that approximates what I want to do? Thanks.

(*) Re-inventing a wheel is generally pretty simple. But then you
discover that you also need to invent axle grease, a leaf spring
suspension, etc.
 
Robert Kern

> Previous discussions in c.l.py (primarily those that propose new
> functions to be added to itertools) claim that people do this all the
> time, but seem woefully short of actual examples. Before I possibly
> re-invent the wheel(*), could someone point me to some actual code
> that approximates what I want to do? Thanks.

From grin, my grep-alike:
http://pypi.python.org/pypi/grin

def sliding_window(seq, n):
    """ Returns a sliding window (up to width n) over data from the iterable

    Adapted from the itertools documentation.

    s -> (s0,), (s0, s1), ... (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
    """
    it = iter(seq)
    result = ()
    # Grow the window one element at a time until it holds n items.
    # zip() is lazy in Python 3; on Python 2 use itertools.izip instead.
    for _, elem in zip(range(n), it):
        result += (elem,)
        yield result
    # Then slide: drop the oldest element and append the newest.
    for elem in it:
        result = result[1:] + (elem,)
        yield result


A slight modification of this should get what you want.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
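
For readers wondering what such a "slight modification" might look like, here is a minimal sketch of a window centered on the current item, so that upcoming items can be peeked at (for preloading or thumbnails) while a few earlier ones stay available. It is not taken from grin; the name centered_window and the parameters behind and ahead are made up for illustration.

```python
from collections import deque
from itertools import islice


def centered_window(iterable, behind, ahead):
    """Yield (before, current, after) for each item of iterable.

    before: tuple of up to `behind` items already visited
    after:  tuple of up to `ahead` items still to come
    Both shrink near the ends of the stream. Hypothetical helper,
    not part of grin or itertools.
    """
    it = iter(iterable)
    past = deque(maxlen=behind)             # oldest items fall off automatically
    future = deque(islice(it, ahead + 1))   # current item plus the look-ahead
    while future:
        current = future.popleft()
        yield tuple(past), current, tuple(future)
        past.append(current)
        # Top the look-ahead back up with one more item, if any remain.
        for item in islice(it, 1):
            future.append(item)


if __name__ == "__main__":
    for before, current, after in centered_window("abcde", 1, 2):
        print(before, current, after)
```

The underlying iterator is only ever read ahead + 1 items past the current position, so a slow source (such as a directory listing produced in another thread) is consumed lazily. Loading each file in the background would still have to be layered on top of this.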
 
