Opening large data file with Perl

pppe

Hi

I am posting this from my post to Perl Programming.
--

Is someone able to tell me if when using the open command to read data
from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?

To expand further, I currently use a perl script to open a data file,
search each record for a match to a user input and then store the
matched data in an array. I stop the search if there are more than 200
results so the array never exceeds this, but as the data file is so
large I wonder if there is any load on server memory, especially if
numerous users were accessing it at once.

Also, should I expect the server load to be greater with such a large
file due to execution time? I already run this type of script on
smaller files (5mb+) but an 80mb file concerns me on a shared server.

Thanks
pppe
 
robic0

pppe said:
I am posting this from my post to Perl Programming.
--
Is someone able to tell me if when using the open command to read data
from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?
[snip]

No problem, but post some code. This is the Perl code group, so post
your Perl sample code. If you are asking about general concepts rather
than code, you might not get a response here...
 
Anno Siegel

pppe said:
Hi

I am posting this from my post to Perl Programming.
--

Is someone able to tell me if when using the open command to read data
from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?

The size of the file doesn't matter when you open it. How much of the
file is in memory during processing depends on how you do the processing.

Anno
 
Joe Smith

pppe said:
Is someone able to tell me if when using the open command to read data
from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?

Depends entirely on what code you are using to do the reading.

@entire_file = <>;    # Reads the entire file all at once
whereas
while (<>) { ... }    # Reads one line at a time

pppe said:
Also, should I expect the server load to be greater with such a large
file due to execution time?

If the average number of lines required to get the data you want is
the same, then it does not matter how big the unread portion of
the file is.
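To illustrate the point about the unread portion, here is a minimal sketch of a search that stops as soon as it has enough matches. The helper name first_matches, the throwaway sample file, and the pattern are assumptions made up for this example, not code from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # core module, used only for the demo file

# Hypothetical helper: collect at most $limit lines of $path matching $re.
sub first_matches {
    my ($path, $re, $limit) = @_;
    my @matched;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    while (my $line = <$fh>) {          # one record in memory at a time
        next unless $line =~ $re;
        push @matched, $line;
        last if @matched >= $limit;     # stop here; the rest is never read
    }
    close($fh) or die "Can't close $path: $!";
    return @matched;
}

# Demo on a throwaway file with made-up records.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "record $_\n" for 1 .. 1000;
close($tmp) or die "Can't close $tmpname: $!";

my @hits = first_matches($tmpname, qr/7/, 200);
print scalar(@hits), " matches kept\n";
```

With a cap like this, the cost of a search depends on how far into the file the last needed match sits, not on the total file size. If the pattern comes from user input, wrapping it as qr/\Q$input\E/ makes Perl treat it as literal text.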
-Joe
 
Bart Van der Donck

pppe said:
Is someone able to tell me if when using the open command to read data
from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?

You should process an 80MB file line by line and operate on each line
directly in the read loop, without storing the lines in extra variables.

pppe said:
To expand further, I currently use a perl script to open a data file,
search each record for a match to a user input and then store the
matched data in an array. I stop the search if there are more than 200
results so the array never exceeds this, but as the data file is so
large I wonder if there is any load on server memory, especially if
numerous users were accessing it at once.

I think the most memory-efficient way is the following:

#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);    # provides the LOCK_SH constant

my @matched = ();
my $file = 'file.dat';

# 'or' binds more loosely than '||', so die really fires when open fails
open(my $F, '<', $file) or die "Can't open $file: $!";
flock($F, LOCK_SH) or die "Can't get LOCK_SH on $file: $!";
while (<$F>) {
    if (/pattern/) { push(@matched, $_); }
}
close($F) or die "Can't close $file: $!";

# small report utility
print for @matched;

See also PerlFAQ 3.16 (perlfaq3): "How can I make my Perl program take
less memory?"

Hope this helps,
 
Jürgen Exner

pppe said:
I am posting this from my post to Perl Programming.

Why did you separate the bulk of your post with a signature separator?
My newsreader automatically snips signatures, so now I have to manually
copy and paste the text I want to quote.

<quote>
when using the open command to read data
from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?
</quote>

Neither, nor. open() doesn't load any content into memory. It just opens the
file.
How you read it is totally up to you:

To read the next line:
while (<MYFILE>)
or
$line = <MYFILE>

To read the whole file:
@everything = <MYFILE>
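
As a rough sketch of the difference between the two styles, the following counts lines both ways. The helper names and the sample file are made up for this example, and it uses lexical filehandles rather than the bareword MYFILE above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # core module, used only for the demo file

# Hypothetical helper: holds a single line in memory at any moment.
sub count_lines_streaming {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    my $count = 0;
    while (my $line = <$fh>) {    # scalar context: one line per iteration
        $count++;
    }
    close($fh) or die "Can't close $path: $!";
    return $count;
}

# Hypothetical helper: loads every line first, so memory grows with the file.
sub count_lines_slurping {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    my @everything = <$fh>;       # list context: the whole file at once
    close($fh) or die "Can't close $path: $!";
    return scalar @everything;
}

# Demo on a throwaway three-line file.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "alpha\nbeta\ngamma\n";
close($tmp) or die "Can't close $tmpname: $!";

my $streamed = count_lines_streaming($tmpname);
my $slurped  = count_lines_slurping($tmpname);
print "streaming: $streamed, slurping: $slurped\n";
```

Both give the same answer; only the peak memory differs. Note that foreach my $line (<$fh>) also evaluates the read in list context, so it loads the whole file just like the array assignment does.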


jue
 
Brian Wakem

Jürgen Exner said:
Why did you separate the bulk of your post with a signature separator?
My newsreader automatically snips signatures, so now I have to manually
copy and paste the text I want to quote.


Your newsreader is broken; that was not a sig separator.
 
Tad McClellan

pppe said:
Is someone able to tell me if when using the open command to read data


The open() function does not read data!

"opening a file" and "reading from a file" are distinct operations.

from a file, if the file was large, say 80mb, would it load much of the
file into memory or would it only take a record at a time?


It would not load _any_ of the file contents into memory.

How much memory will be used depends on how you are *reading* the file.

How are you reading the file?
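
One way to see that opening and reading are separate steps is to check the file position with tell() before and after the first read. This is only a sketch on a made-up sample file, and the variable names are assumptions for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);    # core module, used only for the demo file

# Throwaway sample file; the contents are invented for this demo.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 100_000;
close($tmp) or die "Can't close $tmpname: $!";

open(my $fh, '<', $tmpname) or die "Can't open $tmpname: $!";
my $pos_after_open = tell($fh);    # open() alone has consumed nothing

my $first = <$fh>;                 # the first actual read: one record
my $pos_after_read = tell($fh);    # position just past that record

close($fh) or die "Can't close $tmpname: $!";

printf "file size: %d bytes\n", (-s $tmpname);
print  "position after open: $pos_after_open\n";
print  "position after one line: $pos_after_read\n";
```

Behind the scenes Perl's I/O layer fetches data in small buffered chunks (a few kilobytes), so reading one line at a time keeps roughly one buffer plus the current line in memory, however large the file is.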
 