Reading a file twice, back to back?

Discussion in 'Perl Misc' started by martin, Apr 16, 2006.

  1. martin

    martin Guest

    Hi, I have a question regarding reading a file.

    I like to read a file (let's say FILE1) line by line and print the
    lines that match a certain criteria to another file.

    The question is, on second pass do I need to first close FILE1 and then
    reopen it (with read access). Or is there a better way to do it, to
    basically go back to the beginning of the file as if I am opening it
    for reading for the second time.

    Thanks. Martin
    martin, Apr 16, 2006
    #1

  2. martin wrote:
    > Hi, I have a question regarding reading a file.
    >
    > I like to read a file (let's say FILE1) line by line and print the
    > lines that match a certain criteria to another file.
    >
    > The question is, on second pass do I need to first close FILE1 and then
    > reopen it (with read access). Or is there a better way to do it, to
    > basically go back to the beginning of the file as if I am opening it
    > for reading for the second time.
    >


    i don't understand why you need a second pass at all...
    it_says_BALLS_on_your_forehead, Apr 16, 2006
    #2

  3. martin wrote:
    > Hi, I have a question regarding reading a file.
    >
    > I like to read a file (let's say FILE1) line by line and print the
    > lines that match a certain criteria to another file.
    >
    > The question is, on second pass do I need to first close FILE1 and then
    > reopen it (with read access). Or is there a better way to do it, to
    > basically go back to the beginning of the file as if I am opening it
    > for reading for the second time.


    perldoc -f seek


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Apr 16, 2006
    #3
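John's pointer can be sketched as follows. This is not from the original posts: the file name, contents, and match pattern are invented for illustration. The point is that seek with a WHENCE of 0 rewinds the same open handle to the start, with no close/reopen.

```perl
use strict;
use warnings;

# Create a small sample file (name and contents are illustrative).
my $file = 'sample.txt';
open my $out, '>', $file or die "Can't create $file: $!";
print $out "apple\nbanana\napple pie\n";
close $out;

open my $fh, '<', $file or die "Can't open $file: $!";

# First pass: count the lines.
my $lines = 0;
$lines++ while <$fh>;

# Rewind to byte 0 (WHENCE 0 = from the start) instead of close/reopen.
seek $fh, 0, 0 or die "Can't seek: $!";

# Second pass: the same handle now reads from the top again.
my @matches;
while (<$fh>) {
    push @matches, $_ if /apple/;
}
close $fh;

print "lines=$lines matches=", scalar(@matches), "\n";  # lines=3 matches=2
```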
  7. martin

    martin Guest

    Thanks. Actually I had read about

    seek(MYFILEHANDLE, 0, 0)

    but I was wondering if "seek" is safer than opening and closing and
    re-opening. Or if there was any other way. In one of the posts there
    was a suggestion that opening twice could potentially modify the
    content of a file and is not considered safe or reliable practice.

    Martin

    John W. Krahn wrote:
    > martin wrote:
    > > Hi, I have a question regarding reading a file.
    > >
    > > I like to read a file (let's say FILE1) line by line and print the
    > > lines that match a certain criteria to another file.
    > >
    > > The question is, on second pass do I need to first close FILE1 and then
    > > reopen it (with read access). Or is there a better way to do it, to
    > > basically go back to the beginning of the file as if I am opening it
    > > for reading for the second time.

    >
    > perldoc -f seek
    >
    >
    > John
    > --
    > use Perl;
    > program
    > fulfillment
    martin, Apr 16, 2006
    #7
  8. it_says_BALLS_on_your_forehead wrote:
    > martin wrote:
    >> Hi, I have a question regarding reading a file.
    >>
    >> I like to read a file (let's say FILE1) line by line and print the
    >> lines that match a certain criteria to another file.
    >>
    >> The question is, on second pass do I need to first close FILE1 and
    >> then reopen it (with read access). Or is there a better way to do
    >> it, to basically go back to the beginning of the file as if I am
    >> opening it for reading for the second time.
    >>

    >
    > i don't understand why you need a second pass at all...


    Yes, we heard you the first time already.

    This can be a totally valid scenario if the file is too large to keep in
    memory and e.g. the first pass is to determine the most frequent key and the
    second pass to collect all lines containing that key.

    jue
    Jürgen Exner, Apr 16, 2006
    #8
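Jürgen's two-pass scenario might look roughly like the sketch below. The log format (key as the first whitespace-separated field) and the file name are assumptions made for the example; only the per-key counts stay in memory, not the lines themselves.

```perl
use strict;
use warnings;

# Build a tiny stand-in for the "too large to keep in memory" file.
my $file = 'big.log';
open my $out, '>', $file or die "Can't create $file: $!";
print $out "a one\nb two\na three\n";
close $out;

open my $fh, '<', $file or die "Can't open $file: $!";

# Pass 1: tally the keys (assumed to be the first field of each line).
my %count;
while (<$fh>) {
    my ($key) = split ' ', $_;
    $count{$key}++;
}
my ($top) = sort { $count{$b} <=> $count{$a} } keys %count;

# Pass 2: rewind the same handle and collect lines carrying the top key.
seek $fh, 0, 0 or die "Can't seek: $!";
my @hits;
while (<$fh>) {
    my ($key) = split ' ', $_;
    push @hits, $_ if $key eq $top;
}
close $fh;

print "top=$top hits=", scalar(@hits), "\n";  # top=a hits=2
```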
  9. martin wrote:
    > Hi, I have a question regarding reading a file.
    >
    > I like to read a file (let's say FILE1) line by line and print the
    > lines that match a certain criteria to another file.
    >
    > The question is, on second pass do I need to first close FILE1 and then
    > reopen it (with read access). Or is there a better way to do it, to
    > basically go back to the beginning of the file as if I am opening it
    > for reading for the second time.
    >


    Tie::File could help since you'd have an array containing
    all the lines of the file without slurping the whole file
    into memory.

    If it's a huge file and you're doing lots of searching,
    Tie::File might be a bit slow. If that's an issue, you
    might try File::Slurp which, despite its name, does try to
    avoid over-gulping memory, if I remember correctly.

    hth,
    --
    Charles DeRykus
    Charles DeRykus, Apr 16, 2006
    #9
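A minimal sketch of the Tie::File route suggested above; the file name and contents are invented. One caveat not mentioned in the post: Tie::File opens the file read-write by default (read-only needs a mode flag from Fcntl), so treat the tied array carefully if the file must not change.

```perl
use strict;
use warnings;
use Tie::File;

# Sample file for illustration.
my $file = 'tied.txt';
open my $out, '>', $file or die "Can't create $file: $!";
print $out "red\ngreen\nred again\n";
close $out;

# Tie the file to an array; lines are fetched lazily, not slurped.
tie my @lines, 'Tie::File', $file or die "Can't tie $file: $!";

# "Two passes" become two plain loops over the same array.
my $total = @lines;                 # pass 1: line count
my @reds  = grep { /red/ } @lines;  # pass 2: matching lines

untie @lines;
print "total=$total reds=", scalar(@reds), "\n";  # total=3 reds=2
```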
  10. Jürgen Exner wrote:
    > it_says_BALLS_on_your_forehead wrote:
    > > martin wrote:
    > >> Hi, I have a question regarding reading a file.
    > >>
    > >> I like to read a file (let's say FILE1) line by line and print the
    > >> lines that match a certain criteria to another file.
    > >>
    > >> The question is, on second pass do I need to first close FILE1 and
    > >> then reopen it (with read access). Or is there a better way to do
    > >> it, to basically go back to the beginning of the file as if I am
    > >> opening it for reading for the second time.
    > >>

    > >
    > > i don't understand why you need a second pass at all...

    >
    > Yes, we heard you the first time already.
    >
    > This can be a totally valid scenario if the file is too large to keep in
    > memory and e.g. the first pass is to determine the most frequent key and the
    > second pass to collect all lines containing that key.


    sorry, google groups was acting up.
    it_says_BALLS_on_your_forehead, Apr 16, 2006
    #10
  11. martin

    Xicheng Jia Guest

    Charles DeRykus wrote:
    > martin wrote:
    > > Hi, I have a question regarding reading a file.
    > >
    > > I like to read a file (let's say FILE1) line by line and print the
    > > lines that match a certain criteria to another file.
    > >
    > > The question is, on second pass do I need to first close FILE1 and then
    > > reopen it (with read access). Or is there a better way to do it, to
    > > basically go back to the beginning of the file as if I am opening it
    > > for reading for the second time.
    > >

    >

    => Tie::File could help since you'd have an array containing
    => all the lines of the file without slurping the whole file
    => into memory.

    I don't think this could be much better than "seek"; tied things introduce
    extra implementation overhead, while by using seek you can keep
    handling the file in line mode or whatever you previously set.

    Xicheng

    > If it's a huge file and you're doing lots of searching,
    > Tie::File might be a bit slow. If that's an issue, you
    > might try File::Slurp which, despite its name, does try to
    > avoid over-gulping memory, if I remember correctly.
    >
    > hth,
    > --
    > Charles DeRykus
    Xicheng Jia, Apr 16, 2006
    #11
  12. [Rearranged and trimmed quoting for better readability. Humans are used
    to read from top to bottom, so please quote relevant context first and
    add your comments after that.]

    martin wrote:
    > John W. Krahn wrote:
    >> martin wrote:
    >> > The question is, on second pass do I need to first close FILE1 and
    >> > then reopen it (with read access). Or is there a better way to do
    >> > it, to basically go back to the beginning of the file as if I am
    >> > opening it for reading for the second time.

    >>
    >> perldoc -f seek

    >
    > Thanks, Actually I had read about
    >
    > seek(MYFILEHANDLE, 0, 0)
    >
    > but I was wondering if "seek" is safer than opening and closing and
    > re-opening.


    (If you were wondering, why didn't you ask that?)

    Depends on what you mean by "safe". For a regular file, seek always
    rewinds to the beginning of the same file, while reopening the file may
    not do that. OTOH, seek doesn't work on some special files (like pipes
    or sockets).

    > Or if there was any other way.


    I can't think of a third way at the moment.

    > In one of the posts there was a suggestion that opening twice could
    > potentially modify the content of a file and is not considered safe or
    > reliable practice.


    Closing and reopening the file doesn't modify the content of the file.
    But when you open a file with the same name twice you are not guaranteed
    to open the same file. Consider the following scenario:

    1) You open file "foo" and start reading it.

    2) Some other process renames "foo" to "foo.old" and creates a new file
    "foo".

    3) You continue to read from the file you have opened (which is now
    called "foo.old").

    4) You close the file.

    5) You open the file "foo". Oops! This is now a different file than you
    read in steps 1, 3 and 4.

    hp

    --
    _ | Peter J. Holzer | Löschung von at.usenet.schmankerl?
    |_|_) | Sysadmin WSR/LUGA |
    | | | | Diskussion derzeit in at.usenet.gruppen
    __/ | http://www.hjp.at/ |
    Peter J. Holzer, Apr 16, 2006
    #12
  13. Xicheng Jia wrote:
    > Charles DeRykus wrote:
    >> martin wrote:
    >>> Hi, I have a question regarding reading a file.
    >>>
    >>> I like to read a file (let's say FILE1) line by line and print the
    >>> lines that match a certain criteria to another file.
    >>>
    >>> The question is, on second pass do I need to first close FILE1 and then
    >>> reopen it (with read access). Or is there a better way to do it, to
    >>> basically go back to the beginning of the file as if I am opening it
    >>> for reading for the second time.
    >>>

    > => Tie::File could help since you'd have an array containing
    > => all the lines of the file without slurping the whole file
    > => into memory.
    >
    > I don't think this could be much better than "seek"; tied things introduce
    > extra implementation overhead, while by using seek you can keep
    > handling the file in line mode or whatever you previously set.
    >


    Yes, that's the inference I expected to be drawn when I said "if it's
    a huge file and you're doing lots of searching... might be a bit slow."

    For convenience and ease of use though, it'd be much easier to make a
    2nd pass by looping through an array instead of a seek to rewind and
    re-reading...

    >
    >> If it's a huge file and you're doing lots of searching,
    >> Tie::File might be a bit slow. If that's an issue, you
    >> might try File::Slurp which, despite its name, does try to
    >> avoid over-gulping memory, if I remember correctly.
    >>


    --
    Charles DeRykus
    Charles DeRykus, Apr 16, 2006
    #13
  14. martin

    martin Guest

    But this could be avoided by locking the file before reading (the first
    pass). Can't one do that?

    Martin




    Peter J. Holzer wrote:
    > [Rearranged and trimmed quoting for better readability. Humans are used
    > to read from top to bottom, so please quote relevant context first and
    > add your comments after that.]
    >
    > martin wrote:
    > > John W. Krahn wrote:
    > >> martin wrote:
    > >> > The question is, on second pass do I need to first close FILE1 and
    > >> > then reopen it (with read access). Or is there a better way to do
    > >> > it, to basically go back to the beginning of the file as if I am
    > >> > opening it for reading for the second time.
    > >>
    > >> perldoc -f seek

    > >
    > > Thanks, Actually I had read about
    > >
    > > seek(MYFILEHANDLE, 0, 0)
    > >
    > > but I was wondering if "seek" is safer than opening and closing and
    > > re-opening.

    >
    > (If you were wondering, why didn't you ask that?)
    >
    > Depends on what you mean by "safe". For a regular file, seek always
    > rewinds to the beginning of the same file, while reopening the file may
    > not do that. OTOH, seek doesn't work on some special files (like pipes
    > or sockets).
    >
    > > Or if there was any other way.

    >
    > I can't think of a third way at the moment.
    >
    > > In one of the posts there was a suggestion that opening twice could
    > > potentially modify the content of a file and is not considered safe or
    > > reliable practice.

    >
    > Closing and reopening the file doesn't modify the content of the file.
    > But when you open a file with the same name twice you are not guaranteed
    > to open the same file. Consider the following scenario:
    >
    > 1) You open file "foo" and start reading it.
    >
    > 2) Some other process renames "foo" to "foo.old" and creates a new file
    > "foo".
    >
    > 3) You continue to read from the file you have opened (which is now
    > called "foo.old").
    >
    > 4) You close the file.
    >
    > 5) You open the file "foo". Oops! This is now a different file than you
    > read in steps 1, 3 and 4.
    >
    > hp
    >
    > --
    > _ | Peter J. Holzer | Löschung von at.usenet.schmankerl?
    > |_|_) | Sysadmin WSR/LUGA |
    > | | | | Diskussion derzeit in at.usenet.gruppen
    > __/ | http://www.hjp.at/ |
    martin, Apr 16, 2006
    #14
  15. martin <> wrote:
    > But this could be avoided by locking the file before reading (the first
    > pass). Can't one do that?



    > Peter J. Holzer wrote:
    >> [Rearranged and trimmed quoting for better readability. Humans are used
    >> to read from top to bottom, so please quote relevant context first and
    >> add your comments after that.]



    [ snip TOFU]


    Your rudeness is now seen as being intentional.

    Off to the killfile you go.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Apr 16, 2006
    #15
  16. martin

    Joe Smith Guest

    martin wrote:
    > But this could be avoided by locking the file before reading (the first
    > pass). Can't one do that?


    You're supposed to put your question *AFTER* the text you are referring
    to, and should cut the quoted text to the bare essentials.

    >> 1) You open file "foo" and start reading it.
    >>
    >> 2) Some other process renames "foo" to "foo.old" and creates a new file
    >> "foo".
    >>
    >> 3) You continue to read from the file you have opened (which is now
    >> called "foo.old").
    >>
    >> 4) You close the file.
    >>
    >> 5) You open the file "foo". Oops! This is now a different file than you
    >> read in steps 1, 3 and 4.


    martin wrote:
    > But this could be avoided by locking the file before reading (the first
    > pass). Can't one do that?


    No, it can't be avoided by locking the file. That's not the sort of
    thing that locking guards against.
    -Joe
    Joe Smith, Apr 18, 2006
    #16
  17. martin

    Guest

    Joe Smith <> writes:
    > martin wrote:
    > > But this could be avoided by locking the file before reading (the first
    > > pass). Can't one do that?

    >
    > You're supposed to put your question *AFTER* the text you are referring
    > to, and should cut the quoted text to the bare essentials.
    >
    > >> 1) You open file "foo" and start reading it.
    > >>
    > >> 2) Some other process renames "foo" to "foo.old" and creates a new file
    > >> "foo".
    > >>
    > >> 3) You continue to read from the file you have opened (which is now
    > >> called "foo.old").
    > >>
    > >> 4) You close the file.
    > >>
    > >> 5) You open the file "foo". Oops! This is now a different file than you
    > >> read in steps 1, 3 and 4.

    >
    > martin wrote:
    > > But this could be avoided by locking the file before reading (the first
    > > pass). Can't one do that?

    >
    > No, it can't be avoided by locking the file. That's not the sort of
    > thing that locking guards against.
    > -Joe


    You can limit the window of vulnerability
    by opening the same file twice immediately,
    then reading one filehandle through,
    then starting over with the second filehandle.

    I thought about using the "<&" version of open
    to dup the first filehandle, but that won't keep the
    second file pointer independent.

    # tested on Linux with the system commands echo and mv
    use strict; use warnings;
    system("echo 'file1' > file1");  # create file1
    open( my $FH1, '<', 'file1' ) or die "Can't open FH1 file1: $!\n";
    open( my $FH2, '<', 'file1' ) or die "Can't open FH2 file1: $!\n"; # dual open
    #open( my $FH2, '<&', $FH1 ) or die "Can't open FH2 dup FH1: $!\n"; # dup instead
    system("mv file1 file1.old");    # some other process
    system("echo 'file2' > file1");  # some other process
    while (<$FH1>) { print "FH1 ", $_; }
    while (<$FH2>) { print "FH2 ", $_; }

    output:
    FH1 file1
    FH2 file1

    If you move the # some other process
    lines in between the two opens,
    you will see the vulnerability as output
    FH1 file1
    FH2 file2

    --
    Joel
    , Apr 18, 2006
    #17
