How to override "for line in file:"

David Morgenthaler

How does one override the iterator implied by the construct "for line
in file:"?

For example, suppose I have a file containing row,col pairs on each
line, and I wish to write a subclass of file that will transform these
to x,y coordinates using an affine transform. I'd like it to look
something like this (but this clearly doesn't work):

class myFile(file):
    def __init__(self,name,affine):
        self.affine = affine
        file.__init__(name)

    def next(self):
        for line in file.__iter__(self):
            r,c = ",".split(line[:-1])
            yield self.affine(r,c)


Thanks in advance!
dave
 
John Roth

I'm not sure what you're trying to do, but you may be
looking for the readline() method.

John Roth
 
Aahz

David said:
How does one override the iterator implied by the construct "for line
in file:"?

You don't.

David said:
For example, suppose I have a file containing row,col pairs on each
line, and I wish to write a subclass of file that will transform these
to x,y coordinates using an affine transform.

Don't. Write a completely separate iterator (a generator) that does the
transform, which takes an iterator providing strings (lines) as its
input. Then you can do this:

for item in transform(f):

which should do what you want and be completely Pythonic.
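
A minimal sketch of such a generator, written for Python 3. The second parameter `affine` and the helper `shift` are assumptions added for illustration; the post only names `transform(f)`.

```python
def transform(lines, affine):
    """Yield affine(row, col) for each "row,col" line in an iterable of strings."""
    for line in lines:
        row, col = line.strip().split(",")
        yield affine(float(row), float(col))


def shift(row, col):
    # Hypothetical stand-in for the real affine transform.
    return col + 10.0, row + 20.0


# Any iterable of lines will do: a list here, an open file in real use.
for x, y in transform(["1,2\n", "3,4\n"], shift):
    print(x, y)
```

Because `transform` only asks its input for one line at a time, it works the same way on an open file object.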
 
Shalabh Chaturvedi

David said:
How does one override the iterator implied by the construct "for line
in file:"?

'for' uses __iter__().

Subclass file and redefine __iter__(self). It should return an object that
has the next() method which returns the next item, or raises StopIteration
if finished. Returning self from __iter__() is ok (then you can put the
next() in the same class).

An easier alternative is to make __iter__() a generator function (so calling
it automatically returns a 'generator' object, which has next() on it, and
also raises StopIteration when done).
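
A rough Python 3 sketch of the generator-as-`__iter__` idea. Python 3 has no `file` builtin to subclass, so this version wraps an already open file object instead, which is a swapped-in technique rather than the subclassing described above. The names `CoordFile` and `swap` are hypothetical.

```python
import io


class CoordFile:
    """Wrap an open text file; iterating yields transformed pairs, not lines."""

    def __init__(self, fileobj, affine):
        self.fileobj = fileobj
        self.affine = affine

    def __iter__(self):
        # A generator function: calling it returns a generator object,
        # which supplies __next__() and raises StopIteration when done.
        for line in self.fileobj:
            r, c = line.strip().split(",")
            yield self.affine(int(r), int(c))


def swap(r, c):
    # Hypothetical transform used only for illustration.
    return c, r


coords = CoordFile(io.StringIO("0,1\n2,3\n"), swap)
result = list(coords)
print(result)
```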

HTH,
Shalabh
 
Terry Reedy

Wrap it, don't replace it. See below.

Your class needs an __iter__ function. However, unless you have some other
reason for the heavy duty class option, much easier is something like:

def xycoords(lines):
    # lines is an iterable that yields text lines of appropriate format
    for line in lines:
        <extract numbers and transform>
        yield x,y

You can feed xycoords a literal list of lines for testing and then an open
file for production use.
Chaining iterators like this is part of their intended use.
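
To make the chaining point concrete, here is a small Python 3 sketch with two generator stages, one that parses and one that transforms. The stage names `parse_pairs` and `to_xy` and the scale-and-offset mapping are assumptions made up for the example.

```python
import io


def parse_pairs(lines):
    """Yield (row, col) integer pairs from "row,col" lines, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            row, col = line.split(",")
            yield int(row), int(col)


def to_xy(pairs, scale=2.0, offset=10.0):
    """Yield (x, y) per (row, col) pair; the mapping is a hypothetical stand-in."""
    for row, col in pairs:
        yield scale * col + offset, scale * row + offset


test_lines = ["1,2\n", "\n", "3,4\n"]

# Testing: feed the chain a literal list of lines.
from_list = list(to_xy(parse_pairs(test_lines)))

# Production-style use: a file-like object; StringIO stands in for open(...).
from_file = list(to_xy(parse_pairs(io.StringIO("".join(test_lines)))))

print(from_list)
```

Each stage pulls one item at a time from the stage before it, so the chain never needs the whole input in memory.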

Terry J. Reedy
 
David Morgenthaler

Hmmm. It was the "list of lines" that I was trying to avoid. Because
my files (several files, opened simultaneously) of row,col pairs are
very long (e.g., the locations of many things at every second for a
very long time), I was trying to get the effect of xreadlines, but
with the newer idiom used both "in and out" -- that is, used by the
super to actually read the data, and made available to myFile
instances as well. And since the only behavior of the file class that
I wanted to change was to, in effect, "filter" the input, subclassing
file seemed reasonable.

I could (in fact, I did, so as to not halt all progress) make a
solution using the even older "while True:/readline/test and break"
idiom. So I can use the idiom "outside" of my implementation. But my
curiosity remains: Is it possible to use the idiom "inside" (by the
super) to make the idiom available "outside" (by the subclass's
instances)?

Another poster has suggested that I subclass file and redefine
__iter__(self), possibly even point to self. The problem I have when
doing this is that it seems to break the idiom "for line in file"
that is provided by the superclass, file. And I would still like to use
this idiom for doing the actual reading within my subclass.

So, what I was trying to achieve was something like the following
generator, but without the use of the deprecated xreadlines:

for line in xreadlines.xreadlines():
    <extract number and transform>
    yield x,y

Thanks, all, for your answers!
dave
 
Shalabh Chaturvedi

David said:
So, what I was trying to achieve was something like the following
generator, but without the use of the deprecated xreadlines:

for line in xreadlines.xreadlines():
    <extract number and transform>
    yield x,y

Did you mean myfile.xreadlines()? In which case your code is quite correct.
You could even get rid of the .xreadlines() part and have it work the 'new'
way (and identical to what Terry Reedy suggested :).

Now I'm not sure what you want but note that the example from Terry Reedy
does not have a 'list of lines' - lines is an iterable, which means it
could be a file object. So, you could call that example as:

for x,y in xycoords(file('..')):
    <do something>

Now, each time the above for loops once, the for inside xycoords() loops
once. There is never a list of lines created anywhere.
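
One way to see this interleaving is to log when each line is read and when each result is used. The following is a Python 3 sketch: `logged_lines`, the `events` list, and the field-swapping transform are hypothetical additions, and `io.StringIO` stands in for a real file.

```python
import io

events = []  # records the order in which things happen


def logged_lines(fileobj):
    """Pass lines through unchanged, noting when each one is actually read."""
    for line in fileobj:
        events.append("read " + line.strip())
        yield line


def xycoords(lines):
    for line in lines:
        row, col = line.strip().split(",")
        yield int(col), int(row)  # hypothetical transform: swap the fields


for x, y in xycoords(logged_lines(io.StringIO("1,2\n3,4\n"))):
    events.append("used %d,%d" % (x, y))

print(events)
```

The log alternates between reads and uses, which shows that the second line is not read until the first result has been consumed.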

HTH,
Shalabh
 
Peter Otten

David said:
How does one override the iterator implied by the construct "for line
in file:"?

For example, suppose I have a file containing row,col pairs on each
line, and I wish to write a subclass of file that will transform these
to x,y coordinates using an affine transform. I'd like it to look
something like this (but this clearly doesn't work):

class myFile(file):
    def __init__(self,name,affine):
        self.affine = affine
        file.__init__(name)

    def next(self):
        for line in file.__iter__(self):
            r,c = ",".split(line[:-1])
            yield self.affine(r,c)

Here are two ways to make the above work:

# add an additional iterator
class myFile(file):
    def __init__(self, name, affine):
        self.affine = affine
        file.__init__(self, name) # oops, self was missing

    def pairs(self): # oops, name clash
        for line in self:
            r,c = line[:-1].split(",") # oops, wrong use of split()
            yield self.affine(r,c)

# change the original iterator's behaviour
class myFile2(file):
    def __init__(self, name, affine):
        self.affine = affine
        file.__init__(self, name)

    def next(self):
        line = file.next(self)
        r, c = line[:-1].split(",")
        return self.affine(r,c)

# create sample data
r = iter(range(10))
sample = "".join(["%d,%d\n" % (i,j) for (i,j) in zip(r,r)])
file("tmp.txt", "w").write(sample)

def op(a, b):
    return int(a)*int(b)

for v in myFile("tmp.txt", op).pairs():
    print v,
print

for v in myFile2("tmp.txt", op):
    print v,
print


Peter
 
David Morgenthaler

Thank you, all!

Shalabh points out the failure of my understanding ["...note that the
example from Terry Reedy does not have a 'list of lines' - lines is an
iterable, which means it could be a file object."], and your second
example [class myFile2] is exactly what I had been trying to do.

dave
 
Harry George

Slightly off topic, how do you "reset" or back up an iter? Someone
asked me a couple of days ago how to use yield and then back up if the
yielded line failed to meet certain criteria. Essentially a
lookahead(1) situation.

Historically, I've made a list, kept track of the line index, and
decremented that when I needed to back up. Does iter have something
equiv?
 
Terry Reedy

Harry George said:
Slightly off topic, how do you "reset" or back up an iter?

Write your own bidirectional iterator class with a backup or prev method
and use it within a while loop instead of a for loop. Iterators produced by
iter() and generator functions are meant to be once through. This
simplicity allows very efficient yield/next control transfers. Some
iterables, such as actual sequences, can be reiterated from the beginning.
And generator functions can be re-called, with the same or different params.

Harry George said:
Historically, I've made a list, kept track of the line index, and
decremented that when I needed to back up. Does iter have something
equiv?

If you mean builtin iter(), no.
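
A short Python 3 illustration of the once-through behaviour described in this reply. The generator `countdown` is a made-up example, not something from the thread.

```python
def countdown(n):
    """A generator function; each call returns a fresh, independent iterator."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
first_pass = list(gen)   # consumes the generator
second_pass = list(gen)  # nothing left: generators are once through

seq = [3, 2, 1]
seq_twice = (list(seq), list(seq))  # an actual sequence can be reiterated

fresh = list(countdown(3))  # re-calling the generator function starts over
print(first_pass, second_pass, fresh)
```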

Terry J. Reedy
 
Duncan Booth

Harry George said:
Slightly off topic, how do you "reset" or back up an iter? Someone
asked me a couple of days ago how to use yield and then back up if the
yielded line failed to meet certain criteria. Essentially a
lookahead(1) situation.

Python defines iterables as having a method '__iter__' which returns an
iterator, and iterators as having methods 'next' to return the next value,
and '__iter__' which returns the iterator itself.

That's it. What you see on the tin is what you get.
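
The two roles can be sketched as follows. This is written for Python 3, where the iterator method is spelled `__next__` rather than `next`; the class names `Countdown` and `Launch` are hypothetical.

```python
class Countdown:
    """An iterator: __next__() returns the next value, __iter__() returns self."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class Launch:
    """An iterable: __iter__() hands out a new iterator on every call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return Countdown(self.start)


it = Countdown(3)
same_object = iter(it) is it  # an iterator's __iter__ returns itself
values = list(it)
launch = Launch(2)
twice = (list(launch), list(launch))  # the iterable can be looped over again
print(same_object, values, twice)
```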

Of course, if you need a resettable iterator, or one that can backtrack,
then there is no reason why you can't program it for yourself, but this
functionality isn't part of the definition of an iterator.

Harry George said:
Historically, I've made a list, kept track of the line index, and
decremented that when I needed to back up. Does iter have something
equiv?

You could almost certainly implement this kind of functionality once and
then use it to wrap any iterators that might need to backtrack:
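
# NOTE: the class name Pushback and the range(5) input are assumptions;
# only the method bodies and the loop body appear in the post.
class Pushback:
    def __init__(self, it):
        self.__iter = iter(it)
        self.__next = []
    def __iter__(self):
        return self
    def next(self):
        if self.__next:
            return self.__next.pop()
        return self.__iter.next()
    def pushback(self, value):
        self.__next.append(value)

myIter = Pushback(range(5))
for i in myIter:
    if i==3:
        myIter.pushback('a')
        myIter.pushback('z')
    print i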


0
1
2
3
z
a
4
This has unbounded pushback, and because you have to say what you are
pushing back it doesn't need to hold any unneeded values (at the point
where you are pushing values back you probably know what values you just
got so you can push them back if you want the same values, or push back
something different if you prefer).
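
For the lookahead(1) case raised earlier, a Python 3 sketch of the same wrapper might look like the following. The class name `Pushback`, the helper `read_block`, and the "#" header convention are all assumptions chosen for the example.

```python
class Pushback:
    """Iterator wrapper with unbounded pushback (next() becomes __next__())."""

    def __init__(self, it):
        self._iter = iter(it)
        self._pending = []

    def __iter__(self):
        return self

    def __next__(self):
        if self._pending:
            return self._pending.pop()
        return next(self._iter)

    def pushback(self, value):
        self._pending.append(value)


def read_block(lines):
    """Collect lines up to, but not including, the next line starting with '#'."""
    block = []
    for line in lines:
        if line.startswith("#"):
            lines.pushback(line)  # not ours: hand it back for the next reader
            break
        block.append(line)
    return block


lines = Pushback(["a", "b", "# header", "c"])
first = read_block(lines)
header = next(lines)
rest = read_block(lines)
print(first, header, rest)
```

The reader that meets a line it cannot handle pushes it back, so the next consumer sees it as if it had never been read.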
 
