-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
> I have the same problem.
> I have two txt files (txta and txtb) each with 15,000-20,000 lines. Each
> line contains a username + info. Now what I want to do is take a line
> from txta, retrieve the username from this line, search for the
> username in txtb and return the info that's related to the username in
> txtb. My code looks like:
> open (txta, "txta");
> @txta= <txta>;
> close(txta);
> open (txtb, "txtb");
> @txtb= <txtb>;
> close(txtb);
> foreach $txta_line (@txta)
> {
[Eric gets out the Holy Baseball Bat of Education]
Do NOT [whap!] read [whap!] an entire *file* [whap!] into an *array*
[whap!] just so you can [whap!] loop [whap!] over it!!! [whap-whap-whap!]
[Eric puts the baseball bat away]
Okay, now that we're in the proper frame of mind for some education:
One: You read an entire file (txta) into an array, then apparently do
nothing more than loop over that array. This is Bad. [don't make me get
the baseball bat out again.] It's a waste of memory, and it doesn't buy
you anything. It's foolish. It means that you copied a bad habit from
some bad book or bad instructor.
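For point One, here is a rough sketch of the line-at-a-time habit to use instead. The helper name process_lines and the sample data are made up for illustration, and an in-memory string stands in for the real file so the snippet runs on its own; a real script would open "txta" instead.

```perl
use strict;
use warnings;

# Hypothetical helper: call $handler on each line of $fh, one line at a time.
# Returns the number of lines processed.
sub process_lines {
    my ($fh, $handler) = @_;
    my $count = 0;
    while (my $line = <$fh>) {    # reads a single line per iteration
        chomp $line;
        $handler->($line);
        $count++;
    }
    return $count;
}

# Demo on an in-memory "file" (made-up data, assumed "userid;info" format).
my $sample = "alice;staff\nbob;guest\n";
open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!";
my $n = process_lines($fh, sub { print "got: $_[0]\n" });
close($fh);
print "$n lines processed\n";
```

Only the current line lives in memory here, so the cost stays flat no matter how long the file is.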
Two: You read another entire file (txtb) into memory, and repeatedly loop
over it in search of userids. In this case, slurping the whole file isn't
entirely hopeless, because you reference it frequently, but there's a
better way. Read through txtb once, put each userid and its info into a
hash, and look the userid up in that hash while going through txta. That
way you scan each file only once, instead of rescanning txtb for every
line of txta.
Something like this:
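my %uid_lookup;
open(my $txtb, '<', 'txtb') or die "Can't open txtb: $!";
while (<$txtb>)
{
    chomp; # (?) keeps the trailing newline out of the stored info
    # Assumes lines look like "userid;info"; adjust to your real format.
    $uid_lookup{$1} = $2 if /^([^;]*);(.*)/;
}
close($txtb);
open(my $txta, '<', 'txta') or die "Can't open txta: $!";
while (<$txta>)
{
    chomp;
    # get userid (same assumed "userid;info" format)
    my ($userid) = /^([^;]*)/;
    my $userinfo = $uid_lookup{$userid};
    # ...
}
close($txta);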
There. One pass per file. No huge arrays in memory (although %uid_lookup
does hold one entry per userid in txtb, so it grows with that file).
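To make the size question concrete, here is a self-contained sketch of the hash approach. It rests on assumptions: the "userid;info" line format is a guess, build_lookup is a made-up helper name, and the two strings are invented sample data standing in for txta and txtb. It also shows one way to handle a userid that appears in txta but not in txtb.

```perl
use strict;
use warnings;

# Hypothetical helper: build a userid => info hash from "userid;info" lines.
sub build_lookup {
    my ($fh) = @_;
    my %lookup;
    while (my $line = <$fh>) {
        chomp $line;
        my ($userid, $info) = split /;/, $line, 2;
        next unless defined $info;    # skip lines without a ";"
        $lookup{$userid} = $info;     # a later duplicate overwrites an earlier one
    }
    return \%lookup;
}

# Made-up sample data standing in for the two files.
my $txtb_data = "alice;room 12\nbob;room 7\n";
my $txta_data = "bob;x\ncarol;y\n";

open(my $fh_b, '<', \$txtb_data) or die "Can't open txtb data: $!";
my $lookup = build_lookup($fh_b);
close($fh_b);

open(my $fh_a, '<', \$txta_data) or die "Can't open txta data: $!";
while (my $line = <$fh_a>) {
    chomp $line;
    my ($userid) = split /;/, $line, 2;
    next unless defined $userid;      # skip empty lines
    if (exists $lookup->{$userid}) {
        print "$userid: $lookup->{$userid}\n";
    } else {
        print "$userid: not found in txtb\n";
    }
}
close($fh_a);
```

The hash ends up with one entry per distinct userid, so for files of the size described it should hold at most about 20,000 short strings.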
- --
Eric
$_ = reverse sort $ /. r , qw p ekca lre uJ reh
ts p , map $ _. $ " , qw e p h tona e and print
-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use <
http://www.pgp.com>
iQA/AwUBP4LxsWPeouIeTNHoEQIGtQCfXsffnYbsXe1F5ph2Z21mUBU/HwQAnA8W
05GqIlnA5zusx6yRqyX7PyP4
=Wk1v
-----END PGP SIGNATURE-----