Best method for seeking to n lines before end of file [ implementation of tail ]

graemenewlands

Hello,

Perhaps someone can help me with a problem that is causing some debate
with my colleagues:

What is the best technique to seek to the end of a file and load in,
say, the last 10 lines, like the tail command does?

My implementation simply seeks to 100 bytes before the end of file, as
I can be fairly sure, for the purpose I'm using it for, that this will
give me the data I require. This method is obviously limited as a
generic solution to that problem.

Any ideas would be greatly appreciated,

Graeme Newlands.
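
The fixed-offset approach described above might look roughly like the following. This is only a sketch: the sub name tail_fixed_offset and its parameters are hypothetical, and it is written for Perl 5.6 or later (lexical filehandles, three-argument open). It assumes the last $bytes bytes are enough to hold the lines wanted, which is exactly the limitation mentioned.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $want lines taken from the last
# $bytes bytes of $file.
sub tail_fixed_offset {
    my ($file, $want, $bytes) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;                                  # work in raw bytes
    my $size  = -s $fh;
    my $start = $size > $bytes ? $size - $bytes : 0;
    seek $fh, $start, 0 or die "Cannot seek in $file: $!";   # 0 == SEEK_SET
    my @lines = <$fh>;
    close $fh;
    # Unless we started at the beginning of the file, the first line read
    # may be partial, so discard it.
    shift @lines if $start > 0;
    # Keep only the last $want lines of whatever is left.
    splice @lines, 0, @lines - $want if @lines > $want;
    return @lines;
}
```

If the lines are longer than expected, the call simply returns fewer than $want lines, so a caller could compare the count and retry with a larger $bytes.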
 
Brian McCauley

> Perhaps someone can help me with a problem that is causing some debate
> with my colleagues:
>
> What is the best technique to seek to the end of a file and load in,
> say, the last 10 lines, like the tail command does?

For small numbers I just read the last 10 lines with
File::ReadBackwards and reverse() them.

I probably wouldn't use File::Tail (although at first glance it seems
the natural choice) because although it could do this it's really
concerned with implementing 'tail -f'.
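
A minimal sketch of the File::ReadBackwards approach, assuming the module is installed from CPAN. The sub name tail_readbackwards is hypothetical; new, readline and close are the module's documented methods.

```perl
use strict;
use warnings;
use File::ReadBackwards;

# Hypothetical helper: return the last $want lines of $file in file order.
sub tail_readbackwards {
    my ($file, $want) = @_;
    my $bw = File::ReadBackwards->new($file)
        or die "Cannot open $file: $!";
    my @lines;
    while (@lines < $want) {
        my $line = $bw->readline;     # last line first
        last unless defined $line;    # reached the start of the file
        push @lines, $line;
    }
    $bw->close;
    # The lines were collected newest-first, so flip them back
    # (call this sub in list context).
    return reverse @lines;
}
```

Only the tail end of the file is read, so the cost depends on the length of the last few lines rather than on the file size.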
 
xhoster

> Hello,
>
> Perhaps someone can help me with a problem that is causing some debate
> with my colleagues:
>
> What is the best technique

Best according to what criteria?

> to seek to the end of a file and load in,
> say, the last 10 lines, like the tail command does?

I like:

open my $fh, "-|", "tail", "-10", $file or die $!;
my @foo = <$fh>;

Perhaps not best in terms of portability, though.

Xho
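
For anyone wanting to wrap the pipe-open idea in a routine, here is a hedged sketch. The sub name tail_pipe is hypothetical. It assumes a tail program on the PATH that accepts "-n COUNT", and a Perl new enough (5.8 or later) to support the list form of pipe open on the platform in question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last $want lines of $file by running
# an external tail.
sub tail_pipe {
    my ($file, $want) = @_;
    # The list form of pipe open bypasses the shell, so spaces or quotes
    # in $file need no escaping.
    open my $fh, '-|', 'tail', '-n', $want, $file
        or die "Cannot run tail: $!";
    my @lines = <$fh>;
    # close() on a pipe returns false if the child exited non-zero;
    # the exit status is then in $?.
    close $fh or die "tail failed on $file (status $?)";
    return @lines;
}
```

Checking the result of close is the main addition: it catches the case where tail itself could not read the file.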
 
John Bokma

> Best according to what criteria?
>
> I like:
>
> open my $fh, "-|", "tail", "-10", $file or die $!;
> my @foo = <$fh>;
>
> Perhaps not best in terms of portability, though.

If you have to install Perl, installing a tail port shouldn't be a big
issue :)
 
Charles DeRykus

> Hello,
>
> Perhaps someone can help me with a problem that is causing some debate
> with my colleagues:
>
> What is the best technique to seek to the end of a file and load in,
> say, the last 10 lines, like the tail command does?
>
> My implementation simply seeks to 100 bytes before the end of file, as
> I can be fairly sure, for the purpose I'm using it for, that this will
> give me the data I require. This method is obviously limited as a
> generic solution to that problem.

Not fast, but Tie::File is another option.
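
A sketch of the Tie::File option, assuming a Perl that ships the module (it is in the core from 5.8). The sub name tail_tiefile is hypothetical; the mode option and O_RDONLY come from the Tie::File and Fcntl documentation.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: return the last $want lines of $file via Tie::File.
sub tail_tiefile {
    my ($file, $want) = @_;
    tie my @all, 'Tie::File', $file, mode => O_RDONLY
        or die "Cannot tie $file: $!";
    # Asking for the array size makes Tie::File scan the whole file to
    # locate every line ending, which is why this is slow on big files.
    my $count = @all;
    my $first = $count > $want ? $count - $want : 0;
    my @lines = @all[$first .. $count - 1];
    untie @all;
    # With the default autochomp setting the records come back without
    # their line endings.
    return @lines;
}
```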
 
graemenewlands

Hi, thanks for your answers,

The criterion we have is speed - a routine may be called several thousand
times a day on files that are between 1kB and 100MB in size.

One of our limitations is that we're using ActiveState perl, version
5.005_03 - our servers have "stable" builds that fortunately include
the MKS Toolkit, which has an implementation of tail. Unfortunately, this
means that Tie::File and the File:: modules are not available.

I guess another method would be to seek backwards byte by byte and look
for 0D 0A pairs, just in case anyone's written a whole bunch of spaces
(or tabs etc) at EOL?

Thanks again,

Graeme.
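
The backwards-seeking idea can be sketched without any modules. The version below reads fixed-size blocks from the end rather than single bytes and counts newlines until it has enough. The sub name tail_blocks and the default block size are assumptions for illustration, and the code uses lexical filehandles and three-argument open, so it needs Perl 5.6 or later; on 5.005_03 the open call would have to be rewritten with a bareword filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last $want lines of $file by scanning
# backwards from the end in blocks of $block_size bytes.
sub tail_blocks {
    my ($file, $want, $block_size) = @_;
    $block_size ||= 4096;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;                       # raw bytes, so CR LF stays intact
    my $pos = -s $fh;
    my $buf = '';
    # Prepend blocks until the buffer holds more than $want newlines (the
    # extra one ends the line just before the ones we want) or until we
    # reach the start of the file.
    while ($pos > 0 && ($buf =~ tr/\n//) <= $want) {
        my $len = $pos < $block_size ? $pos : $block_size;
        $pos -= $len;
        seek $fh, $pos, 0 or die "Cannot seek in $file: $!";   # 0 == SEEK_SET
        my $got = read $fh, my $chunk, $len;
        die "Cannot read $file: $!" unless defined $got && $got == $len;
        $buf = $chunk . $buf;
    }
    close $fh;
    my @lines = split /(?<=\n)/, $buf;  # split after each newline, keeping it
    splice @lines, 0, @lines - $want if @lines > $want;
    return @lines;
}
```

Searching for the LF alone is enough to find line boundaries in CR LF files, and trailing spaces or tabs before the line ending do not affect the count.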
 
xhoster

> Hi, thanks for your answers,
>
> The criterion we have is speed - a routine may be called several thousand
> times a day on files that are between 1kB and 100MB in size.

I can pipe-open an external "tail" on a 100MB file, and read from it,
about 500 times a second. So I don't think speed is going to be a problem
with that method. (Of course, the size of the file is irrelevant,
as long as it is seekable. It is the size of the last 10
lines that matters, in this case 1670 bytes.) Of course, this was on a
modern machine with Linux and YMMV.

$ time perl -le 'foreach (1..1_000) {open my $fh, "-|", "tail foo.txt" or die $!; my @x = <$fh>}'

0.374u 1.278s 0:02.21 74.2% 0+0k 0+0io 0pf+0w

> One of our limitations is that we're using ActiveState perl, version
> 5.005_03 - our servers have "stable" builds that fortunately include
> the MKS Toolkit, which has an implementation of tail. Unfortunately, this
> means that Tie::File and the File:: modules are not available.

You don't want Tie::File anyway; it will be slow if you only want the end
of a large file. I got File::ReadBackwards to work on 5.004. All I had to
do was hack it to manually define SEEK_SET and SEEK_END.
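
The actual patch is not shown in this thread. As a guess at what defining the constants by hand could look like, here is a small sketch that sticks to syntax available on old Perls (bareword filehandle, two-argument open). The values 0 and 2 are the conventional whence values for seek; the sub name size_by_seek is hypothetical and is there only to exercise the constants.

```perl
use strict;

# Conventional whence values for seek(), defined by hand instead of
# being imported from a module.
sub SEEK_SET () { 0 }   # offset counts from the start of the file
sub SEEK_END () { 2 }   # offset counts from the end of the file

# Hypothetical helper: size of $file in bytes, found by seeking to its end.
sub size_by_seek {
    my ($file) = @_;
    local *FH;
    open(FH, "< $file") or die "Cannot open $file: $!";
    binmode FH;
    seek(FH, 0, SEEK_END) or die "Cannot seek in $file: $!";
    my $size = tell(FH);
    close FH;
    return $size;
}
```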

> I guess another method would be to seek backwards byte by byte and look
> for 0D 0A pairs, just in case anyone's written a whole bunch of spaces
> (or tabs etc) at EOL?

I don't quite understand this question. Do your people have a known habit
of writing a whole bunch of spaces or tabs at (before?) EOL? We certainly
can't answer that question for you. If so, what's to keep them from
writing any other kind of gibberish to the file?

Xho
 
