how do I "peek" into the next line?

les_ander

Hi,
suppose I am reading lines from a file or stdin.
I want to just "peek" into the next line, and if it starts
with a special character I want to break out of a for loop,
otherwise I want to do readline().

Is there a way to do this?
for example:
while 1:
    line=stdin.peek_nextline()
    if not line: break
    if line[0]=="|":
        break
    else:
        x=stdin.nextline()
        # do something with x

thanks
 
Skip Montanaro

les> suppose I am reading lines from a file or stdin. I want to just
les> "peek" into the next line, and if it starts with a special
les> character I want to break out of a for loop, otherwise I want to
les> do readline().

Create a wrapper around the file object:

class PeekFile:
    def __init__(self, f):
        self.f = f
        self.last = None

    def push(self, line):
        self.last = line

    def readline(self):
        if self.last is not None:
            line = self.last
            self.last = None
            return line
        return self.f.readline()

    def __getattr__(self, attr):
        return getattr(self.f, attr)

Then use it like so:

import sys

stdin = PeekFile(sys.stdin)
while 1:
    line = stdin.readline()
    if not line:
        break
    if line[0] == "|":
        stdin.push(line)
        break
    else:
        pass  # do something with line

Of course, this could be tested (which my example hasn't been), PeekFile
could probably be written as a subclass of the file class, and it could be
made more generally robust (handle filenames as inputs instead of file
objects, support write modes, etc.), but this should give you the general
idea.
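A minimal Python 3 sketch of the same wrapper idea, with a peek() method added so the caller does not have to push the line back by hand. The peek() method, the read_until_marker() helper and the io.StringIO sample input are illustrative assumptions, not part of the class above.

```python
import io


class PeekFile:
    """Wrap a file-like object and allow one line of look-ahead."""

    def __init__(self, f):
        self.f = f
        self.last = None  # holds a line that was read ahead, if any

    def peek(self):
        # Read the next line once and remember it for the next readline().
        if self.last is None:
            self.last = self.f.readline()
        return self.last

    def readline(self):
        # Hand out the remembered line first, then fall back to the file.
        if self.last is not None:
            line, self.last = self.last, None
            return line
        return self.f.readline()

    def __getattr__(self, attr):
        # Delegate everything else to the wrapped file object.
        return getattr(self.f, attr)


def read_until_marker(pf, marker="|"):
    """Collect lines up to, but not including, the first marker line."""
    collected = []
    while True:
        line = pf.peek()
        if not line or line.startswith(marker):
            break
        collected.append(pf.readline())
    return collected


source = PeekFile(io.StringIO("a\nb\n|rest\nc\n"))
head = read_until_marker(source)
next_line = source.readline()  # the marker line is still available
```

Because the look-ahead lives in the wrapper, any other function that is handed `source` sees the marker line as its first readline() result.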

Skip
 
Jeffrey Maitland

Hi,
suppose I am reading lines from a file or stdin.
I want to just "peek" into the next line, and if it starts
with a special character I want to break out of a for loop,
otherwise I want to do readline().

Is there a way to do this?
for example:
while 1:
    line=stdin.peek_nextline()
    if not line: break
    if line[0]=="|":
        break
    else:
        x=stdin.nextline()
        # do something with x

thanks

Well, what you can do is read the line regardless into a testing variable.

Here is some sample code (writing this off the top of my head, so the syntax
might be off some):

import re

file = open("test.txt", 'r')

# this would be your object.. in my example using a string for the file data
variablestr = ''

file.seek(0,2)
eof = file.tell() # this is the position of the end of the file.
file.seek(0,0)

while file.tell() != eof:
    testline = file.readline()
    if re.match("#", testline) == True:
        break
    else:
        variablestr += testline

file.close()

Now, if I were concerned with being back at the beginning of the testline
that was just read in, what you can do inside the if is something like:
    file.seek((file.tell() - len(testline)), 0)
and that will move you back to the beginning of that line, which is where the
previous readline call left you before the "peek".

hope that helps some..

Jeff
 
Jeffrey Maitland

I noticed something in my code.

re.match("#", testline) == True can never be true; it should be more like:
re.match("#", testline) is not None
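A short Python 3 check of that point, as a sketch: re.match() returns a match object or None, never the value True. The variable names and sample strings are only illustrative.

```python
import re

line = "# a comment line\n"
m = re.match("#", line)

# A match object is truthy, but it does not compare equal to True.
matched_equals_true = (m == True)

# The explicit test for "did it match?".
matched_is_not_none = m is not None

# When nothing matches, re.match() returns None.
no_match = re.match("#", "plain text\n")
```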
 
Fredrik Lundh

Jeffrey said:
file.seek(0,2)
eof = file.tell() #what this is the position of the end of the file.
file.seek(0,0)

while file.tell() != eof:

no no no. that's not how you read all lines in a text file -- in any
programming language.

in recent versions of Python, use:

for testline in file:

to loop over all lines in a given file.

if re.match("#", testline) == True:
    break

ahem. RE's might be nice, but using them to check if a string
starts with a given string literal is a pretty lousy idea.

if testline[0] == "#":
    break

works fine in this case (when you use the for-loop, at least).

if testline might be empty, use startswith() or slicing:

if testline.startswith("#"):
    break

if testline[:1] == "#":
    break

(if the thing you're looking for is longer than one character, startswith
is almost always the best choice)

if you insist on using a RE, you should use a plain if statement:

if re.match(pattern, testline):
    break # break if it matched
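To make the empty-string case concrete, here is a small Python 3 sketch comparing the three tests. The helper function names are made up for the illustration.

```python
def starts_with_hash(line):
    # startswith() is safe on an empty string and works for longer prefixes.
    return line.startswith("#")


def starts_with_hash_slice(line):
    # Slicing never raises: ""[:1] is just "".
    return line[:1] == "#"


def starts_with_hash_index(line):
    # Indexing raises IndexError on an empty string.
    return line[0] == "#"


try:
    starts_with_hash_index("")
    index_failed = False
except IndexError:
    index_failed = True
```

Lines produced by iterating over a file always contain at least the newline, which is why the index form is safe inside the for-loop but not on an arbitrary string.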

</F>
 
Craig Ringer

Hi,
suppose I am reading lines from a file or stdin.
I want to just "peek" into the next line, and if it starts
with a special character I want to break out of a for loop,
otherwise I want to do readline().

Assuming there's a good reason, such as monster lines, not to just read
the next line anyway, I'd suggest read()ing the next character then
seek()ing back by one character to restore the file position.

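def peekChar(fileobj):
    # Read one character, then step back so the file position is unchanged.
    ch = fileobj.read(1)
    if ch:  # at end of file nothing was read, so there is nothing to undo
        fileobj.seek(-1,1)
    return ch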
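In Python 3, text-mode files reject non-zero seeks relative to the current position, so a variant that remembers tell() and seeks back to it may be easier to use there. A minimal sketch, assuming the stream is seekable (so not a pipe or an interactive stdin); the peek_char name is hypothetical.

```python
import io


def peek_char(fileobj):
    """Return the next character without moving the file position."""
    pos = fileobj.tell()   # remember where we are
    ch = fileobj.read(1)   # look at one character ("" at end of file)
    fileobj.seek(pos)      # go back to the remembered position
    return ch


stream = io.StringIO("|marker\nnext\n")
first = peek_char(stream)
line = stream.readline()  # still returns the whole first line
```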
 
Peter Otten

Hi,
suppose I am reading lines from a file or stdin.
I want to just "peek" into the next line, and if it starts
with a special character I want to break out of a for loop,
otherwise I want to do readline().

Is there a way to do this?
for example:
while 1:
    line=stdin.peek_nextline()
    if not line: break
    if line[0]=="|":
        break
    else:
        x=stdin.nextline()
        # do something with x

thanks

The itertools approach:

(some preparations)
... a
... b
... c
... |
... d
... e
... f
... """)

(1) You are only interested in lines after and including the "special" line:
...     return not line.startswith("|")
...
...     print line,
...
|
d
e
f

(2) You want both the "head" and the "tail", where the tail includes the
"special" line:
...     if line.startswith("|"):
...         lines = itertools.chain([line], lines)
...         break
...     print "head", line,
...
head a
head b
head c
...     print "tail", line,
...
tail |
tail d
tail e
tail f

Peter
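A self-contained Python 3 sketch of the two cases, assuming itertools.dropwhile for (1) and itertools.chain for (2). The sample data, the is_head helper name and the use of io.StringIO in place of a real file are illustrative, not taken from the post.

```python
import io
import itertools

TEXT = "a\nb\nc\n|\nd\ne\nf\n"


def is_head(line):
    # True for every line before the "special" one.
    return not line.startswith("|")


# (1) Only the lines after and including the "special" line.
tail_only = list(itertools.dropwhile(is_head, io.StringIO(TEXT)))

# (2) Both head and tail, where the tail includes the "special" line.
lines = iter(io.StringIO(TEXT))
head = []
for line in lines:
    if line.startswith("|"):
        # Put the line back in front of the not-yet-consumed remainder.
        lines = itertools.chain([line], lines)
        break
    head.append(line)
tail = list(lines)
```

Both variants read lazily, so they should also work on input that does not fit in memory, as long as the results are processed line by line rather than collected into lists as done here for illustration.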
 
Steven Bethard

Skip said:
les> suppose I am reading lines from a file or stdin. I want to just
les> "peek" into the next line, and if it starts with a special
les> character I want to break out of a for loop, otherwise I want to
les> do readline().

Create a wrapper around the file object:
[snip]

Of course, this could be tested (which my example hasn't been)

If you'd like something that works similarly and has been tested a bit,
try one of my recipes for peeking into an iterator:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/304373

You also might consider writing an iterator wrapper for a file object:
...     def __init__(self, file):
...         self.itr = iter(file)
...     def __iter__(self):
...         while True:
...             next = self.itr.next()
...             if not next or next.rstrip('\n') == "|":
...                 break
...             yield next
...
... some text
... some more
... |
... not really text""")
...     print repr(line)
...
'some text\n'
'some more\n'
... some text
... some more
...
... really text""")
...     print repr(line)
...
'some text\n'
'some more\n'
'\n'
'really text'


Steve
 
les_ander

OK, I am sorry, I did not explain my problem completely.
I can easily break from the loop when I see the character in a line
that I just read; however, my problem involves putting back the line I
just read, since if I have seen this special character, I have read one
line too many. Let me illustrate:
suppose the file has 3 lines

line1.line1.line1
line2.line2.line
line3.line3.line3

now suppose I have read the first line already.
then I read the second line and notice
that there is a ">" in front (my special character)
then I want to put back the second line into the
file or the stdin.

An easy way is if I read all the lines into an array
and then I can put the line with the special
character back into this array. However,
this file I am reading is huge, and I can't read
it all in one swoop (not enough memory).

for x in stdin:
    if x[0]==">":
        #### put back the x some how... <-----
        break
    else:
        print x

I hope this is clear
thanks
 
Steven Bethard

now suppose I have read the first line already.
then I read the second line and notice
that there is a ">" in front (my special character)
then I want to put back the second line into the
file or the stdin.

Amended iterator class example using my peekable recipe:
...     def __init__(self, file):
...         self.iter = peekable(file)
...     def __iter__(self):
...         while True:
...             next = self.iter.peek()
...             if not next or next.rstrip('\n') == "|":
...                 break
...             yield self.iter.next()
...
... some text
... some more
... |
... not really text""")
...     print repr(line)
...
'some text\n'
'some more\n'
...     print repr(line)
...
'|\n'
'not really text'

Note that because I wrap the file iterator with a peekable[1] object,
the lines aren't consumed when I test them. So to access the remaining
lines, I just use the iter field of the strangefileiter object.
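The recipe itself is not reproduced in this thread. A minimal Python 3 sketch of what such a wrapper might look like follows; the Peekable class and the read_section helper are hypothetical stand-ins, not the recipe's actual code.

```python
import io


class Peekable:
    """Iterator wrapper with one item of look-ahead (hypothetical sketch)."""

    _EMPTY = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = self._EMPTY

    def __iter__(self):
        return self

    def peek(self):
        # Fetch the next item without consuming it.
        # Raises StopIteration when the underlying iterator is exhausted.
        if self._buffer is self._EMPTY:
            self._buffer = next(self._it)
        return self._buffer

    def __next__(self):
        # Hand out the buffered item first, then fall back to the iterator.
        if self._buffer is not self._EMPTY:
            item, self._buffer = self._buffer, self._EMPTY
            return item
        return next(self._it)


def read_section(lines, marker="|"):
    """Yield lines until the next one is the marker; the marker stays unread."""
    while True:
        try:
            upcoming = lines.peek()
        except StopIteration:
            return  # end of input
        if upcoming.rstrip("\n") == marker:
            return
        yield next(lines)


lines = Peekable(io.StringIO("some text\nsome more\n|\nnot really text"))
before = list(read_section(lines))
after = list(lines)  # starts with the marker line, which was never consumed
```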

Steve

[1] http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/304373
 
fuzzylollipop

reads are not destructive; there is nothing to "put back" because it is
never removed to begin with.
just change your logic and think about what you really need to do
 
Fredrik Lundh

now suppose I have read the first line already.
then I read the second line and notice
that there is a ">" in front (my special character)
then I want to put back the second line into the
file or the stdin.

the line doesn't disappear from the file just because you
read it. so who do you want to pass it to? someone else?
some other part of your program? what exactly are you
doing?

</F>
 
Steven Bethard

I said:
...         while True:
...             next = self.iter.peek()
...             if not next or next.rstrip('\n') == "|":
...                 break
...             yield self.iter.next()
...

Actually, the 'not next' test is not necessary since I'm using an
iterator over the file (end of file is signified by StopIteration, not a
value of '' returned by the iterator). The amended version:

...         while True:
...             next = self.iter.peek()
...             if next.rstrip('\n') == "|":
...                 break
...             yield self.iter.next()

Steve
 
Steven Bethard

now suppose I have read the first line already.
then I read the second line and notice
that there is a ">" in front (my special character)
then I want to put back the second line into the
file or the stdin.

Another possibility -- have each call to __iter__ produce the next
"section" of your data:
...     def __init__(self, file):
...         self.iter = peekable(file)
...         self.yield_last = False
...     def __iter__(self):
...         if self.yield_last:
...             yield self.iter.next()
...             self.yield_last = False
...         while True:
...             next = self.iter.peek()
...             if next.rstrip('\n') == "|":
...                 self.yield_last = True
...                 break
...             yield self.iter.next()
...
... text
... before first |
... |
... text after
... first |
... |
... final text""")
...     print repr(line)
...
'text\n'
'before first |\n'
...     print repr(line)
...
'|\n'
'text after\n'
'first |\n'
...     print repr(line)
...
'|\n'
'final text'

I'm not sure I'd do it this way -- it makes me a little nervous that
each call to __iter__ iterates over something different. But it might
be closer to what you were trying to do...
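One alternative that avoids state hidden inside __iter__ is a plain generator that yields one section at a time. A minimal Python 3 sketch, assuming each section fits in memory even if the whole file does not; the sections name and the sample data layout are hypothetical.

```python
import io


def sections(lines, marker="|"):
    """Yield lists of lines; every section after the first starts with the marker line."""
    current = []
    for line in lines:
        # A marker line closes the section collected so far (if any)
        # and becomes the first line of the next one.
        if line.rstrip("\n") == marker and current:
            yield current
            current = []
        current.append(line)
    if current:
        yield current


data = io.StringIO("text\nbefore first |\n|\ntext after\nfirst |\n|\nfinal text")
result = list(sections(data))
```

Here each section is an explicit value, so nothing depends on how many times the object has been iterated before.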

Steve
 
Adam DePrince

reads are not destructive; there is nothing to "put back" because it is
never removed to begin with.
just change your logic and think about what you really need to do

Not true. Consider:

Character special devices in Unix
Named pipes in Windows NT
Pipes in Unix
COMx:, where x is your COM port number, in DOS
Sockets
gzip and zlib file objects; this file was described as being *so* large
that it should not be read into memory. IIRC seek operations are
linear; while it is correct that you can seek backwards, I don't think you
would really want to.
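A small Python 3 sketch of the pipe case, assuming a POSIX system: once a line has been read from a pipe it is gone, and the stream reports that it cannot seek. The sample data is made up.

```python
import os

# Create an anonymous pipe and write two lines into it.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "w") as writer:
    writer.write("line1\n>line2\n")

with os.fdopen(read_fd, "r") as reader:
    can_seek = reader.seekable()  # a pipe cannot rewind
    first = reader.readline()
    second = reader.readline()    # consumed; it cannot be read again
    rest = reader.read()          # nothing left after the two lines
```

This is the situation where a wrapper that remembers the extra line is needed, since seek() is not an option.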


Adam DePrince
 
Adam DePrince

OK, I am sorry, I did not explain my problem completely.
I can easily break from the loop when I see the character in a line
that I just read; however, my problem involves putting back the line I
just read, since if I have seen this special character, I have read one
line too many. Let me illustrate:
suppose the file has 3 lines

line1.line1.line1
line2.line2.line
line3.line3.line3

now suppose I have read the first line already.
then I read the second line and notice
that there is a ">" in front (my special character)
then I want to put back the second line into the
file or the stdin.

An easy way is if I read all the lines into an array
and then I can put the line with the special
character back into this array. However,
this file I am reading is huge, and I can't read
it all in one swoop (not enough memory).

for x in stdin:
    if x[0]==">":
        #### put back the x some how... <-----
        break
    else:
        print x

I hope this is clear
thanks

class stackfile:
    def __init__( self, f ):
        self.f = f
        self.stack = []

    def readline( self ):
        if len( self.stack ):
            return self.stack.pop()
        return self.f.readline()

    def unreadline( self, lastline ):
        self.stack.append( lastline )



if __name__=="__main__":
    import StringIO

    f = stackfile(StringIO.StringIO("a\nb\nc\nd\ne\nf\ng\n"))
    # You would say: myfile = stackfile( stdin )

    line = f.readline()
    print line,

    line = f.readline()
    print line,

    line = f.readline()
    print line,

    f.unreadline( line )

    line = f.readline()
    print line,

    line = f.readline()
    print line,

Running this prints:

a
b
c
c
d

Hope this helps.


Adam DePrince
 
les_ander

I should have also said that I am using the same file handle.
So suppose the file handle is fp
then I read some lines from fp and I hand it over to other
functions that read other things from fp (in an ordered manner).
les
 
Fredrik Lundh

I should have also said that I am using the same file handle.
So suppose the file handle is fp
then I read some lines from fp and I hand it over to other
functions that read other things from fp (in an ordered manner).

another part of your program? so why not use one of the wrappers
that people have posted? or just pass the extra line to the next step:

while 1:
    s = file.readline()
    if not s or s[0] == "#":
        break
    # process lines that belong to section 1

while 1:
    # process lines that belong to section 2
    if not s or s[0] == "#":
        break
    s = file.readline()

(etc)
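A runnable Python 3 sketch of the "pass the extra line to the next step" idea, assuming every section is introduced by a line starting with "#". The read_section function, its first_line parameter and the sample data are hypothetical.

```python
import io


def read_section(f, first_line=None, marker="#"):
    """Return (lines, extra): one section plus the marker line that ended it.

    first_line is the extra line handed over by the previous step, if any.
    extra is "" when the section was ended by end of file.
    """
    lines = [first_line] if first_line else []
    while True:
        s = f.readline()
        if not s or s.startswith(marker):
            return lines, s
        lines.append(s)


f = io.StringIO("a\nb\n#two\nc\n#three\nd\n")
section1, extra1 = read_section(f)
section2, extra2 = read_section(f, extra1)
section3, extra3 = read_section(f, extra2)
```

Nothing is pushed back into the file here; the line that was read one step too far simply travels along with the file handle.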

</F>
 
Peter Hansen

suppose I am reading lines from a file or stdin.
I want to just "peek" into the next line, and if it starts
with a special character I want to break out of a for loop,
otherwise I want to do readline().
[snip example]

Neither your description above nor your example actually
requires looking ahead. Are you sure you need to?

You would only need it if *after* breaking out of the
loop, you wanted to continue reading from the file
*and* wanted the line which triggered the loop termination
still to be available for reading.

-Peter
 
fuzzylollipop

Read the original poster's SPECIFIC question: he makes reference only to a
FILE or STDIN, which can't be destructively read; either way, he doesn't
require the look-ahead to do what he wants. So my advice is still
valid. Think about it in a different way.
 
