How to iterate through a sequence, grabbing subsequences?


Matthew Wilson

I wrote a function that I suspect may already exist as a python builtin,
but I can't find it:

    def chunkify(s, chunksize):
        "Yield sequence s in chunks of size chunksize."
        for i in range(0, len(s), chunksize):
            yield s[i:i+chunksize]

I wrote this because I need to take a really, really long string and
process it 4000 bytes at a time.
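
For illustration, here is a minimal sketch of how the function behaves on a 10000-byte input, written for Python 3. The `data` value is an invented stand-in for the real string; note that the last chunk is shorter than the others.

```python
def chunkify(s, chunksize):
    """Yield sequence s in chunks of size chunksize."""
    for i in range(0, len(s), chunksize):
        yield s[i:i + chunksize]

# Hypothetical input: 10000 bytes standing in for the really long string.
data = b"x" * 10000
sizes = [len(chunk) for chunk in chunkify(data, 4000)]
print(sizes)  # [4000, 4000, 2000]
```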

Is there a better solution?

Matt
 

Bruno Desthuilliers


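I don't know if it's better, but StringIO lets you read a string as if it
were a file:

    import StringIO  # Python 2 module; Python 3 provides io.StringIO instead

    def chunkify(s, chunksize):
        f = StringIO.StringIO(s)  # wrap the string in a file-like object
        chunk = f.read(chunksize)
        while chunk:  # read() returns '' at the end of the string
            yield chunk
            chunk = f.read(chunksize)
        f.close()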
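
The same file-like idea can be sketched for Python 3, where the class lives in the `io` module. This is only one possible spelling, not the poster's code: the name `chunkify_stream` is made up here, and the two-argument form of `iter()` replaces the explicit while loop.

```python
import io
from functools import partial

def chunkify_stream(s, chunksize):
    """Yield str s in chunks of chunksize by reading it like a file."""
    with io.StringIO(s) as f:
        # iter(callable, sentinel) keeps calling f.read(chunksize)
        # until it returns the sentinel '' at the end of the string.
        for chunk in iter(partial(f.read, chunksize), ""):
            yield chunk

print(list(chunkify_stream("abcdefghij", 4)))  # ['abcd', 'efgh', 'ij']
```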

Now I'm sure someone will come up with a solution that's both far better
and much more obvious (at least if you're Dutch <g>)
 

Fredrik Lundh


What's wrong with your solution?

</F>
 

George Sakkis


There isn't a builtin for this, but the same topic came up just three
days ago: http://tinyurl.com/qec2p.
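
Slicing only works for sequences that support len() and indexing. As a hedged sketch for arbitrary iterables (Python 3), `itertools.islice` can pull a fixed number of items at a time; the name `chunkify_iter` is invented for this example and is not taken from the linked thread. Newer Python versions (3.12 and later) also ship `itertools.batched`, which covers a similar case.

```python
from itertools import islice

def chunkify_iter(iterable, chunksize):
    """Yield lists of up to chunksize items from any iterable."""
    it = iter(iterable)
    while True:
        # islice consumes at most chunksize items from the shared iterator.
        chunk = list(islice(it, chunksize))
        if not chunk:
            return  # iterator exhausted
        yield chunk

print(list(chunkify_iter(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```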

Regards,
George
 

Tim Chase


My first thought, if len(s) is truly huge, would be to replace
range() with xrange() so that you don't build a list of
len(s)/chunksize elements just to throw it away.
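
The range()/xrange() distinction applies to Python 2. In Python 3, range() is already lazy and xrange() no longer exists, so the original loop needs no change there. A small sketch, assuming CPython 3 on a 64-bit build (the size check is an implementation detail, not a language guarantee):

```python
import sys

# Offsets for a hypothetical 10**12-character string in 4000-byte chunks.
offsets = range(0, 10**12, 4000)

print(len(offsets))  # 250000000
print(offsets[1])    # 4000

# The range object stores only start, stop and step, so it stays small
# no matter how many offsets it represents.
small = sys.getsizeof(offsets) < 1000
print(small)
```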

However, I think that's about as good as this common idiom gets.

I've seen variants that always yield chunks of exactly chunksize,
padding the last one out to the proper length, but that's a
specification issue that you don't seem to want/need.
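
A padding variant of that kind might look roughly like the following Python 3 sketch for strings. The name `chunkify_padded` and the `pad` parameter are hypothetical, and `pad` is assumed to be a single character because `str.ljust` requires that.

```python
def chunkify_padded(s, chunksize, pad=" "):
    """Yield str s in chunks of exactly chunksize, padding the last one."""
    for i in range(0, len(s), chunksize):
        # ljust only adds padding when the slice is shorter than chunksize.
        yield s[i:i + chunksize].ljust(chunksize, pad)

print(list(chunkify_padded("abcdefghij", 4, ".")))  # ['abcd', 'efgh', 'ij..']
```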

-tkc
 
