Strange behavior with iterables - is this a bug?

akameswaran · May 30, 2006

Ok, I am confused about this one. I'm not sure if it's a bug or a
feature.. but

================================ RESTART
f1 = open('word1.txt')
f2 = open('word2.txt')
f3 = open('word3.txt')
print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in f1 for i2 in f2 for i3 in f3]
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]
l1 = ['a\n','b\n','c\n']
l2 = ['a\n','b\n','c\n']
l3 = ['a\n','b\n','c\n']
print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2 for i3 in l3]
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'a'),
('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'a'), ('a', 'c', 'b'),
('a', 'c', 'c'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'),
('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'c', 'a'),
('b', 'c', 'b'), ('b', 'c', 'c'), ('c', 'a', 'a'), ('c', 'a', 'b'),
('c', 'a', 'c'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'),
('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c')]

Explanation of the code: the files word1.txt, word2.txt and word3.txt are
all identical, containing the letters a, b and c, one letter per line.
I added the "\n" to the list items so that the lists are identical to what
the file objects return, just to eliminate any possible
differences.

Notice that with the file objects I don't get the full set
of permutations. I was playing around with doing this via recursion,
etc., but nothing was working, so I reduced it to the simplest case of
explicit nesting. Still no go.
Why does this not work with the file objects, or with any other class I've
made which implements __iter__ and next?

Seems like a bug to me, but maybe I am missing something. Seems to
happen in 2.3 and 2.4.

Terry Reedy · May 30, 2006

Ok, I am confused about this one. I'm not sure if it's a bug or a
feature.. but

================================ RESTART
f1 = open('word1.txt')
f2 = open('word2.txt')
f3 = open('word3.txt')
print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in f1 for i2 in f2
for i3 in f3]

[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]

A file is something like an iterator and something like an iterable. At
this point, the internal cursor for f3 points at EOF. To reiterate through
the file, you must rewind it in the inner loops. So try (untested by me)

def initf(fil):
    fil.seek(0)
    return fil

and ...for i2 in initf(f2) for i3 in initf(f3)
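
A minimal sketch of the rewind idea above, written for Python 3 rather than the 2.3/2.4 discussed in this thread. The helper name `rewound` and the temporary file standing in for word1.txt through word3.txt are assumptions made so the example is self-contained.

```python
import os
import tempfile


def rewound(f):
    """Move an open file back to its start and return it for iteration."""
    f.seek(0)
    return f


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the three identical word files.
    path = os.path.join(tmp, "words.txt")
    with open(path, "w") as out:
        out.write("a\nb\nc\n")

    with open(path) as f1, open(path) as f2, open(path) as f3:
        # The inner "in" expressions are re-evaluated on every pass of the
        # enclosing loop, so f2 and f3 are rewound before each reuse.
        triples = [(i1.strip(), i2.strip(), i3.strip())
                   for i1 in f1
                   for i2 in rewound(f2)
                   for i3 in rewound(f3)]

print(len(triples))
```

The outermost file is iterated only once, so it does not need rewinding.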

l1 = ['a\n','b\n','c\n']
l2 = ['a\n','b\n','c\n']
l3 = ['a\n','b\n','c\n']
print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2 for i3 in l3]
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'a'),
('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'a'), ('a', 'c', 'b'),
('a', 'c', 'c'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'a', 'c'),
('b', 'b', 'a'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'c', 'a'),
('b', 'c', 'b'), ('b', 'c', 'c'), ('c', 'a', 'a'), ('c', 'a', 'b'),
('c', 'a', 'c'), ('c', 'b', 'a'), ('c', 'b', 'b'), ('c', 'b', 'c'),
('c', 'c', 'a'), ('c', 'c', 'b'), ('c', 'c', 'c')]

Explanation of the code: the files word1.txt, word2.txt and word3.txt are
all identical, containing the letters a, b and c, one letter per line.
I added the "\n" to the list items so that the lists are identical to what
the file objects return, just to eliminate any possible
differences.

But lists are not file objects and you did not eliminate the crucial
difference in reiterability. Try your experiment with StringIO objects,
which are more nearly identical to file objects.
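
A rough sketch of that experiment, assuming Python 3's `io.StringIO` (the thread itself predates it). Like an open file, each StringIO object keeps a single read position, so it is used up after one pass.

```python
import io

# Three in-memory "files" with the same contents as the word files.
s1, s2, s3 = (io.StringIO("a\nb\nc\n") for _ in range(3))

# Same nesting as the original post: s3 is exhausted after the first
# pass of the innermost loop, and s2 after the first pass of the middle one.
result = [(i1.strip(), i2.strip(), i3.strip())
          for i1 in s1
          for i2 in s2
          for i3 in s3]

print(result)
```

This reproduces the short three-tuple result from the file version, which suggests the difference lies in reiterability, not in the data.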

Terry Jan Reedy

Inyeol Lee · May 30, 2006

You're comparing a file, which is an ITERATOR, with a list, which is an
ITERABLE but not an ITERATOR. To get the result you want, use this instead:

print [(i1.strip(),i2.strip(),i3.strip(),)
       for i1 in open('word1.txt')
       for i2 in open('word2.txt')
       for i3 in open('word3.txt')]

FYI, to get the same buggy(?) result using lists, try this instead:

l1 = iter(['a\n','b\n','c\n'])
l2 = iter(['a\n','b\n','c\n'])
l3 = iter(['a\n','b\n','c\n'])
print [(i1.strip(),i2.strip(),i3.strip(),) for i1 in l1 for i2 in l2 for i3 in l3]
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c')]
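
One way to see the iterator/iterable distinction directly is to check what `iter()` hands back. This is a small sketch in Python 3 syntax; the variable names are only illustrative.

```python
words = ['a\n', 'b\n', 'c\n']
it = iter(words)

# A list is an iterable: every iter() call creates a new, independent iterator.
list_returns_itself = iter(words) is words

# An iterator returns itself from iter(), so a second loop resumes
# wherever the first one stopped.
iterator_returns_itself = iter(it) is it

first_pass = list(it)    # consumes everything
second_pass = list(it)   # nothing left to yield

print(list_returns_itself, iterator_returns_itself, first_pass, second_pass)
```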

-Inyeol Lee

Gary Herron · May 30, 2006

Ok, I am confused about this one. I'm not sure if it's a bug or a
feature.. but

List comprehension is a great shortcut, but when the shortcut starts
causing trouble, better to go with the old ways. You need to reopen each
file each time you want to iterate through it. You should be able to
understand the difference between the bits of code below.

The first bit opens each file once but uses two of them multiple times.
Reading from a file at EOF returns an empty sequence.

The second bit reopens each file every time it is reused. That
works correctly.

And that suggests the third bit of correctly working code, which uses a list
comprehension.

# Fails because files are opened once but reused
f1 = open('word1.txt')
f2 = open('word2.txt')
f3 = open('word3.txt')
for i1 in f1:
    for i2 in f2:
        for i3 in f3:
            print (i1.strip(),i2.strip(),i3.strip())

and

# Works because files are reopened for each reuse:
f1 = open('word1.txt')
for i1 in f1:
    f2 = open('word2.txt')
    for i2 in f2:
        f3 = open('word3.txt')
        for i3 in f3:
            print (i1.strip(),i2.strip(),i3.strip())

and

# Also works because files are reopened for each use:
print [(i1.strip(),i2.strip(),i3.strip())
       for i1 in open('word1.txt')
       for i2 in open('word2.txt')
       for i3 in open('word3.txt')]

Hope that's clear!

Gary Herron

akameswaran · May 30, 2006

DOH!!

thanks a lot. had to be something stupid on my part.

Now I get it

akameswaran · May 31, 2006

My original problem was with recursion. I explicitly nested it out to
try to understand the behavior, and foolishly looked in the wrong
spot for the problem, which is that a file is not reiterable. In truth I
was never concerned about file objects. The problem was failing with my
own custom iterators (which also were not reiterable), and I switched to
files to rule out deficiencies in my own code. I was
simply chasing down the wrong problem. As was pointed out to me in
another thread, the cleanest implementation that would let me use
one copy of the file (in my example the files are identical) would be
a trivial iterator class that opens the file, uses tell to track
position and seek to set position, and returns the appropriate line for
that instance, thus eliminating unnecessary file opens and closes.
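
A rough sketch of what such a class might look like, in Python 3 syntax. `SharedLines` is a hypothetical name, and `io.StringIO` stands in for the opened file. It is not the code from the other thread, only one possible reading of the description above.

```python
import io


class SharedLines:
    """Reiterable view of one shared, seekable file object.

    Every __iter__ call starts a new generator with its own saved
    position, so nested loops can share a single open file.
    """

    def __init__(self, f):
        self._f = f

    def __iter__(self):
        pos = 0
        while True:
            self._f.seek(pos)          # restore this loop's own position
            line = self._f.readline()  # readline keeps tell() usable
            if not line:
                return
            pos = self._f.tell()       # remember where this loop left off
            yield line


shared = io.StringIO("a\nb\nc\n")      # stand-in for one opened word file
lines = SharedLines(shared)
triples = [(i1.strip(), i2.strip(), i3.strip())
           for i1 in lines
           for i2 in lines
           for i3 in lines]
print(len(triples))
```

The price is a seek before every line read, which trades the repeated opens and closes for repeated repositioning.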

Gary Herron · May 31, 2006

I see.

I wouldn't call "tell" and "seek" clean. Here's another suggestion. Use
l1 = open(...).readlines()
to read the whole file into a (nicely reiterable) list residing in
memory, and then iterate through the list as you wish. Only if your
files are MANY megabytes long would this be a problem with memory
consumption. (But if they were that big, you wouldn't be trying to find
all permutations would you!)
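
A short sketch of the in-memory approach, in Python 3 syntax, with `io.StringIO` standing in for the opened file. The `itertools.product` comparison is an addition for illustration: it did not exist in 2.3 or 2.4 (it arrived in 2.6), and it also copies each input into memory before producing anything.

```python
import io
import itertools

# Stand-in for readlines() on one of the word files.
lines = io.StringIO("a\nb\nc\n").readlines()

# A list can be iterated any number of times, so plain nesting works.
triples = [(i1.strip(), i2.strip(), i3.strip())
           for i1 in lines
           for i2 in lines
           for i3 in lines]

# Same tuples, same order, using the standard library helper.
via_product = list(itertools.product([w.strip() for w in lines], repeat=3))

print(len(triples), triples == via_product)
```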

Gary Herron

akameswaran · May 31, 2006

My original concern, and my reason for going the iterator/generator route,
was exactly very large lists. It's unnecessary in this example, but
it's exactly what I was exploring. I wouldn't be using a list comprehension
to generate the permutations. All this came from
creating a generator/iterator to handle very large permutations.
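
For what it's worth, a recursive generator along these lines could look like the sketch below (Python 3 syntax). `lazy_product` and `letters` are hypothetical names, not code from this thread. The key assumption is that each position is given as a zero-argument callable returning a fresh iterator, which sidesteps the reiterability problem discussed above.

```python
import itertools


def lazy_product(*factories):
    """Yield one tuple at a time without building the full result list.

    Each factory is a zero-argument callable that returns a fresh
    iterator over the values for one position.
    """
    if not factories:
        yield ()
        return
    first, rest = factories[0], factories[1:]
    for item in first():
        # A new inner generator is built per outer item, so the inner
        # sources are asked for a fresh iterator each time.
        for tail in lazy_product(*rest):
            yield (item,) + tail


def letters():
    # Stand-in source; for real data this could reopen a file instead.
    return iter(['a', 'b', 'c'])


# Only the first four tuples are ever computed here.
first_four = list(itertools.islice(lazy_product(letters, letters, letters), 4))
print(first_four)
```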
