advice : how do you iterate with an acc ?

vd12005

hello,

i'm wondering how people from here handle this, as i often encounter
something like:

import fileinput

acc = [] # accumulator ;)
for line in fileinput.input():
    if condition(line):
        if acc: #1
            doSomething(acc) #1
            acc = []
    else:
        acc.append(line)
if acc: #2
    doSomething(acc) #2

BTW i am particularly annoyed by #1 and #2 as they are a repetition, and i
think that is quite error prone. how would you do it in a pythonic way ?

regards
 
bonono

I think it is quite OK: the last "if acc:" is just an implicit
"end-of-stream" marker, whereas inside the loop you have explicit markers
to signal the end/start of blocks. No unwanted variable is introduced,
and I don't see how it can be error prone.

This is one of the cases where I won't try to make it a one-liner,
because it is already very natural :)
 
Bengt Richter

It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

import fileinput
import itertools

for condresult, acciter in itertools.groupby(fileinput.input(), condition):
    if not condresult:
        doSomething(list(acciter)) # or doSomething(acciter) if an iterator is usable

IOW, groupby collects contiguous lines for which condition evaluates to the same
value. Assuming condition is a function that returns only two distinct values (for true
and false, like True and False), then if I understand your program's logic, you
do nothing with the line(s) that actually satisfy the condition; you just trigger
on them as delimiters and want to process the nonempty groups of the other lines,
so the "if not condresult:" should select those. Groupby won't return an empty group AFAIK,
so you don't need to test for that. Also, you won't need the list call in list(acciter)
if your doSomething can accept an iterator instead of a list.
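
For readers on Python 3, the groupby idea above can be sketched as a small generator. This is only an illustration, not code from the thread: the names `blocks`, `is_delimiter` and `sample` are hypothetical, and the key is wrapped in `bool()` on the assumption that the condition might return arbitrary truthy values rather than exactly True/False.

```python
from itertools import groupby

def blocks(lines, is_delimiter):
    """Yield each run of consecutive non-delimiter lines as a list."""
    # bool() keeps the key to two values, so adjacent delimiter lines
    # fall into one group even if is_delimiter returns varied truthy objects.
    for matched, group in groupby(lines, key=lambda line: bool(is_delimiter(line))):
        if not matched:
            yield list(group)

# Hypothetical sample input: lines starting with "#" act as delimiters.
sample = ["# one", "a", "b", "# two", "# three", "c"]
result = list(blocks(sample, lambda line: line.startswith("#")))
print(result)  # [['a', 'b'], ['c']]
```

Neither the "if acc:" test nor the trailing flush appears in the caller, because groupby never produces an empty group and ends its last group when the input ends.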

Regards,
Bengt Richter
 
Dan Sommers

In reply to vd12005's post of 2 Dec 2005 16:45:38 -0800:

If doSomething handled an empty list gracefully, then you would have
less repetition:

acc = []
for line in fileinput.input():
    if condition(line):
        doSomething(acc) #1
        acc = []
    else:
        acc.append(line)
doSomething(acc) #2

If condition were simple enough and the file(s) small enough, perhaps
you could read the whole file at once and use split to separate the
pieces:

contents = file.read()
for acc in contents.split( "this is the delimiter line\n" ):
    doSomething(acc.split("\n"))

(There are probably some strange cases of repeated delimiter lines or
delimiter lines occurring at the beginning or end of the file for which
the above code will not work. Caveat emptor.)

If condition were a little more complicated, perhaps re.split would
work.
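
A rough sketch of the re.split idea, in Python 3, might look like the following. The function name `split_blocks`, the `DELIMITER` pattern and the sample text are all made up for illustration; the pattern assumes delimiter lines start with "---" and must not contain capturing groups, since re.split would then return the captured text as well.

```python
import re

def split_blocks(text, delimiter_pattern):
    """Split text on whole lines matching delimiter_pattern; drop empty blocks."""
    chunks = re.split(delimiter_pattern, text, flags=re.MULTILINE)
    # Empty chunks come from leading, trailing or repeated delimiter lines.
    return [chunk.splitlines() for chunk in chunks if chunk]

# Hypothetical delimiter: any line that starts with "---".
DELIMITER = r"^---.*\n?"

text = "--- head\na\nb\n--- 1\n--- 2\nc\n--- tail"
print(split_blocks(text, DELIMITER))  # [['a', 'b'], ['c']]
```

Filtering out the empty chunks is what covers the strange cases mentioned above (repeated delimiters, or delimiters at the very beginning or end), at the cost of reading the whole input into memory.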

Or maybe you could look at split and see what it does (since your code
is conceptually very similar to it).

OTOH, FWIW, your version is very clean and very readable and fits my
brain perfectly.

HTH,
Dan
 
bonono

Bengt said:
It looks to me like itertools.groupby could get you close to what you want,
e.g., (untested)

Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.
 
Jeffrey Schwab

Could you add a sentry to the end of your input? E.g.:

for line in fileinput.input() + line_that_matches_condition:

This way, you wouldn't need a separate check at the end.
 
Ben Finney

Looks like you'd be better off making an Accumulator that knows what
to do.
>>> class Accumulator(list):
...     def flush(self):
...         if len(self):
...             print "Flushing items: %s" % self
...             del self[:]
...
>>> lines = [
...     "spam", "eggs", "FLUSH",
...     "beans", "rat", "FLUSH",
...     "strawberry",
...     ]
>>> acc = Accumulator()
>>> for line in lines:
...     if line == 'FLUSH':
...         acc.flush()
...     else:
...         acc.append(line)
...
Flushing items: ['spam', 'eggs']
Flushing items: ['beans', 'rat']
>>> acc.flush()
Flushing items: ['strawberry']
>>>
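
A Python 3 variation on the same idea, as a sketch only: here the accumulator is given a callback (the hypothetical `on_flush` parameter) instead of printing, so the "is there anything to flush?" test lives in exactly one place. The class layout and the sample data are assumptions for illustration.

```python
class Accumulator(list):
    """A list that hands its contents to a callback when flushed."""

    def __init__(self, on_flush):
        super().__init__()
        self.on_flush = on_flush

    def flush(self):
        if self:                       # nothing happens for an empty buffer
            self.on_flush(list(self))  # pass a copy, then empty the buffer
            del self[:]

flushed = []
acc = Accumulator(flushed.append)
for line in ["spam", "eggs", "FLUSH", "FLUSH", "beans", "FLUSH", "strawberry"]:
    if line == "FLUSH":
        acc.flush()
    else:
        acc.append(line)
acc.flush()  # still needed once, for a trailing block
print(flushed)  # [['spam', 'eggs'], ['beans'], ['strawberry']]
```

The final flush call is still there, but the emptiness check is no longer repeated at both call sites.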
 
Bengt Richter

bonono said:
Ah, groupby. The generic string.split() equivalent. But the doc said
the input needs to be sorted.

>>> seq = [3,1,4,'t',0,3,4,2,'t',3,1,4]
>>> import itertools
>>> def condition(item): return item=='t'
...
>>> def dosomething(it): return 'doing something with %r'%list(it)
...
>>> for condresult, acciter in itertools.groupby(seq, condition):
...     if not condresult:
...         dosomething(acciter)
...
'doing something with [3, 1, 4]'
'doing something with [0, 3, 4, 2]'
'doing something with [3, 1, 4]'

I think the input only needs to be sorted if you are trying to collect all items with the same key into one group.
I.e., you can't get them extracted together unless the items sharing a key are contiguous, which
is only guaranteed if the input is sorted. But AFAIK the grouping logic just scans, applies the key function,
and returns iterators for the subsequences that yield the same key function result, along with that result.
So it's a general subsequence extractor. You just have to supply a key function that makes the value
change when a group ends and a new one begins. And the value can be arbitrary, or just toggle between two values, e.g.
>>> for condresult, acciter in itertools.groupby(range(20), lambda x:x%3==0 or x==5):
...     print '%6s: %r'%(condresult, list(acciter))
...
  True: [0]
 False: [1, 2]
  True: [3]
 False: [4]
  True: [5, 6]
 False: [7, 8]
  True: [9]
 False: [10, 11]
  True: [12]
 False: [13, 14]
  True: [15]
 False: [16, 17]
  True: [18]
 False: [19]

or a condresult that stays the same in groups, but every group result is different:
>>> for condresult, acciter in itertools.groupby(range(20), lambda x:x//3):
...     print '%6s: %r'%(condresult, list(acciter))
...
     0: [0, 1, 2]
     1: [3, 4, 5]
     2: [6, 7, 8]
     3: [9, 10, 11]
     4: [12, 13, 14]
     5: [15, 16, 17]
     6: [18, 19]
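
To make the scanning behaviour concrete, here is a simplified pure-Python sketch of that grouping logic, written for Python 3. It is not the real itertools implementation: `simple_groupby` is a hypothetical name, and unlike itertools.groupby it builds each group eagerly as a list instead of returning a lazy iterator.

```python
def simple_groupby(iterable, key=lambda x: x):
    """Yield (key_value, items) for each run of items whose key compares equal."""
    missing = object()              # marks "no group started yet"
    current_key = missing
    run = []
    for item in iterable:
        k = key(item)
        if current_key is missing:
            current_key = k         # first item starts the first run
        elif not (k == current_key):
            yield current_key, run  # key changed: close the finished run
            current_key, run = k, []
        run.append(item)
    if current_key is not missing:
        yield current_key, run      # the last run ends with the input

print(list(simple_groupby(range(7), lambda x: x // 3)))
# [(0, [0, 1, 2]), (1, [3, 4, 5]), (2, [6])]
```

Note the trailing yield after the loop: the end-of-stream flush from the original question has not vanished, it has only moved inside the helper.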

Regards,
Bengt Richter
 
bonono

Thanks. So it basically has an internal state storing the last
"condition" result and if it flips(different), a new group starts.
 
Scott David Daniels

Jeffrey said:
Could you add a sentry to the end of your input? E.g.:
for line in fileinput.input() + line_that_matches_condition:
This way, you wouldn't need a separate check at the end.

Check itertools for a good way to do this:

import itertools
SENTRY = 'something for which condition(SENTRY) is True'

acc = []
f = open(filename)
try:
    for line in itertools.chain(f, [SENTRY]):
        if condition(line):
            if acc:
                doSomething(acc)
                acc = []
        else:
            acc.append(line)
    assert acc == []
finally:
    f.close()
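
The same sentry trick can be sketched as a Python 3 generator. This is a variation, not the code above: instead of requiring a value for which the condition is true, it uses a unique object and an identity test, so the condition function never sees the marker. The names `SENTINEL` and `blocks_with_sentinel` are hypothetical.

```python
from itertools import chain

SENTINEL = object()  # a unique marker that no real line can be

def blocks_with_sentinel(lines, condition):
    """Yield lists of consecutive lines, split wherever condition(line) is true."""
    acc = []
    for line in chain(lines, [SENTINEL]):
        # The sentinel is checked first, so condition() is never called on it.
        if line is SENTINEL or condition(line):
            if acc:
                yield acc
                acc = []
        else:
            acc.append(line)

print(list(blocks_with_sentinel(["a", "", "b", "c"], lambda s: s == "")))
# [['a'], ['b', 'c']]
```

Because the sentinel always arrives last, the single "if acc:" inside the loop also handles the final block.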


--Scott David Daniels
 
Bengt Richter

bonono said:
Thanks. So it basically has an internal state storing the last
"condition" result and if it flips(different), a new group starts.

So it appears. But note that "flips(different)" seems to be based on ==,
and default key function is just passthrough like lambda x:x, so e.g. integers
and floats will group together if their values are equal.
E.g., to elucidate further,

Default key function:
>>> from itertools import groupby
>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j]):
...     print k, list(g)
...
0 [0, 0.0, 0j]
[] [[]]
() [()]
None [None]
1 [1, 1.0]
1j [1j]

Group by bool value:
>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j], key=bool):
...     print k, list(g)
...
False [0, 0.0, 0j, [], (), None]
True [1, 1.0, 1j]

It's not trying to sort, so it doesn't trip on complex
>>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]):
...     print k, list(g)
...
0 [0, 0.0, 0j]
[] [[]]
() [()]
None [None]
1 [1, 1.0]
1j [1j]
2j [2j]

But you have to watch out if you try to pre-sort stuff that includes complex numbers
>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j])):
...     print k, list(g)
...
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

And if you do sort using a key function, it doesn't mean groupby inherits that key function for grouping
unless you specify it
>>> def keyfun(x):
...     if isinstance(x, (int, long, float)): return x
...     else: return type(x).__name__
...
>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun)):
...     print k, list(g)
...
0 [0, 0.0]
1 [1, 1.0]
None [None]
0j [0j]
1j [1j]
2j [2j]
[] [[]]
() [()]

Vs giving groupby the same keyfun
>>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun), keyfun):
...     print k, list(g)
...
0 [0, 0.0]
1 [1, 1.0]
NoneType [None]
complex [0j, 1j, 2j]
list [[]]
tuple [()]


Example of unsorted vs sorted subgroup extraction:
>>> for k,g in groupby('this that other thing note order'.split(), key=lambda s:s[0]):
...     print k, list(g)
...
t ['this', 'that']
o ['other']
t ['thing']
n ['note']
o ['order']

vs.
>>> for k,g in groupby(sorted('this that other thing note order'.split()), key=lambda s:s[0]):
...     print k, list(g)
...
n ['note']
o ['order', 'other']
t ['that', 'thing', 'this']

Oops, that key would be less brittle as (untested) key=lambda s:s[:1], e.g., in case a split with args was used.
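
A quick Python 3 check of that last remark, as an illustration with made-up sample data: splitting on an explicit separator can produce empty strings, for which s[0] raises IndexError while s[:1] simply returns an empty string. The helper name `first_letter` is hypothetical.

```python
from itertools import groupby

def first_letter(s):
    return s[:1]  # '' for an empty string; s[0] would raise IndexError

words = "this  that".split(" ")  # two spaces, so the result contains ''
print(words)  # ['this', '', 'that']

grouped = [(k, list(g)) for k, g in groupby(words, key=first_letter)]
print(grouped)  # [('t', ['this']), ('', ['']), ('t', ['that'])]
```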

Regards,
Bengt Richter
 
