Reading just a few lines from a text file

tkpmep

I have a text file with many hundreds of lines of data. The data of
interest to me, however, resides at the bottom of the file, in the last
20 lines. Right now, I read the entire file and discard the stuff I
don't need. I'd like to speed up my program by reading only the last 20
lines. How do I do this?

Thomas Philips
 
rafi

If you are on a Unix system:

tail -20 file.txt
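
In recent Python (3.7+) you can also shell out to tail with the subprocess
module; a minimal sketch, assuming a Unix-like system with tail on the PATH:

import subprocess

def tail_lines(path, n=20):
    # Ask the external tail utility for the last n lines of the file.
    result = subprocess.run(
        ["tail", "-n", str(n), path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

tail does the seeking itself, so only the end of the file is actually read.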
 
draghuram

I just did strace on "tail -20 <filename>". Apparently, it seeks to the
end and reads enough data to cover 20 lines. I guess it calculates this
"size" by counting 20 newlines. You may try to do the same thing.

Thanks,
Raghu.
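
A rough sketch of that idea in Python: seek to near the end, read one
block, and keep the last 20 lines. The 4096-byte block size is a guess,
so there is a fallback for the case where the last 20 lines don't fit in it:

import os

def last_lines(path, n=20, guess=4096):
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Jump back (at most) `guess` bytes from the end and read from there.
        f.seek(max(0, size - guess))
        lines = f.read().splitlines()
    if len(lines) <= n and size > guess:
        # The guess was too small; give up and read the whole file.
        with open(path, "rb") as f:
            lines = f.read().splitlines()
    return [line.decode() for line in lines[-n:]]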
 
Paul McGuire

Are you sure this is really slowing down your program? "Many hundreds
of lines" is not nearly enough to start Python breathing hard. I have
been really impressed with just how quickly Python is able to do file
input and processing, zipping through whole megs of data in just
seconds.

How are you currently reading the file in? A character at a time?
That *will* be slow. Try file.readlines(), or xreadlines(), and spin
off the last 20 in the list.

-- Paul
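
A related idiom, not necessarily what Paul means by "spin off the last
20", is to iterate the file once and let a bounded deque keep only the
most recent 20 lines, so memory stays small even for large files:

from collections import deque

def last_20_lines(path):
    with open(path) as f:
        # A deque with maxlen=20 discards older lines as new ones arrive.
        return list(deque(f, maxlen=20))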
 
tkpmep

Right now my code reads as follows:

infile = file(FileName)
for line in reversed(infile.readlines()):   # search from the bottom up
    if int(line.split()[0]) == MyDate:
        Data = float(line.split()[-1])
        break
infile.close()

I have to read about 10,000 files like this one, so I'm looking to speed
up each individual open/read/close cycle.

Thomas Philips
 
Diez B. Roggisch

> Right now my code reads as follows:
>
> infile=file(FileName)
> for line in reversed(infile.readlines()): #Search from the bottom up

Not sure if Python does some tricks here, but to me that seems to be
unnecessary shuffling around of data. Better do

for line in reversed(infile.readlines()[-20:]):
    ...


Diez
 
John Machin

What discernible speed increase are you talking about? How long does it
take to read the "many hundreds" of lines?

For "many hundreds", IMHO it's not worth the bother, the complexity, the
documentation, ...

Just do this, it's about as fast as you'll get in pure Python for a
smallish file:

def last_n_lines(filename, n):
    return open(filename, 'r').readlines()[-n:]

For many hundreds of thousands of lines, one approach might be to open
the file in binary mode, seek to the end of the file, then loop reading
chunks backwards and unpacking the chunks until you've found 21 line
terminators. Or perhaps 20 line separators :)
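
A sketch of that backwards-reading approach, assuming Unix newline
terminators and an arbitrary 1024-byte chunk size:

import os

def tail_by_seeking(path, n=20, chunk=1024):
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read fixed-size chunks backwards until n+1 newlines have been
        # seen (or the whole file has been consumed).
        while pos > 0 and data.count(b"\n") <= n:
            step = min(chunk, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    # The earliest line read may be partial; keeping only the last n drops it.
    return [line.decode() for line in data.splitlines()[-n:]]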
 
