Updating one file with another file's data

clearguy02

Hi,

I have two text files: each one has four fields (delimited by a
space) on each line: id, group, email and manager_id. The first file is a
small file with 50 entries and the second one is a huge file with
5,000 entries. The "id" field is the same in both files, but the
manager_ids may be different. By comparing against the entries in the
second file (which has the correct manager ids), I need to update the
manager_id field in the first file.

Here is the code I am thinking of:

-----------------------------------------
open (INPUT1,"smallFile.txt") or die "Cannot open the file: $!";
open (INPUT2,"bigFile.txt") or die "Cannot open the file: $!";


while $line1 (<INPUT1>)
{
@small_arr = split /\s+/, $line1;
}

while $line2 (<INOUT2>)
{
@big_arr = split /\s+/, $line2;
}


foreach (@small_arr)
{
if ($small_arr[0] == $big_arr[0] ) # first ID is same in both files
{
$small_arr[3] = $big_arr[3];
print "$_\n";
}
}

-------------------------------------------------

I know something is certainly wrong here. Can someone please tell me what?

Thanks,
J
 
davidfilmer

while $line1 (<INPUT1>)
{
@small_arr = split /\s+/, $line1;
}

You are reading the entire file and assigning the fields of each line
to an array. But the array only holds the fields of a single line.
By the time the while loop is done, the array holds only the fields
for the last line of the file. You have thrown away all the rest of
the file. Same thing for your second loop. That's the main reason
why your program doesn't work.
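
A minimal sketch of the difference between overwriting an array and accumulating into one. The sample lines and the names @last_only and @all_rows are made up for illustration; they stand in for the contents of smallFile.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines standing in for smallFile.txt.
my @lines = (
    "id1 group1 email1 mgr1",
    "id2 group2 email2 mgr2",
    "id3 group3 email3 mgr3",
);

# Overwriting: each assignment replaces the previous contents,
# so only the fields of the last line survive the loop.
my @last_only;
for my $line (@lines) {
    @last_only = split /\s+/, $line;
}

# Accumulating: push a reference to each line's fields instead,
# so every line is still available after the loop.
my @all_rows;
for my $line (@lines) {
    push @all_rows, [ split /\s+/, $line ];
}

print "last_only holds: @last_only\n";
print "all_rows holds ", scalar(@all_rows), " rows\n";
```

With the three sample lines, @last_only ends up with just the four fields of the id3 line, while @all_rows keeps all three rows.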

I would approach it by building a hash of the id/manager values from
the big file and then running through the small file to make the
corrections. Here's an example suitable for a newsgroup posting which
uses a DATA block for the big file and a hardcoded array for the small
file. In reality these would both be files, and you would read the
small file one line at a time with a while loop.

#!/usr/bin/perl
use strict; use warnings;

my @small = (
"id2 group1 email1 WRONG1",
"id3 group3 email3 mgr3",
"id5 group5 email5 WRONG2",
);

my %big_file_mgr;
while (<DATA>) {
    # keep only the id (field 0) and the manager_id (field 3)
    my ($id, $mgr) = (split(/\s+/, $_))[0,3];
    $big_file_mgr{$id} = $mgr;
}

foreach (@small) { #this would be: while (<SMALL>) {
    my ($id, $group, $email, $mgr) = (split(/\s+/, $_));
    # take the manager from the big file whenever the id is known there
    $mgr = $big_file_mgr{$id} if exists $big_file_mgr{$id};
    print "$id $group $email $mgr\n";
}

__DATA__
id1 group1 email1 mgr1
id2 group2 email2 mgr2
id3 group3 email3 mgr3
id4 group4 email4 mgr4
id5 group5 email5 mgr5
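
For the case where both inputs are real files, the same idea might look roughly like the sketch below. The sub names read_managers and corrected_lines are hypothetical, and the in-memory strings only stand in for bigFile.txt and smallFile.txt so that the sketch runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an id => manager_id lookup from the big file's handle.
sub read_managers {
    my ($fh) = @_;
    my %mgr_for;
    while ( my $line = <$fh> ) {
        my ( $id, $mgr ) = ( split ' ', $line )[ 0, 3 ];
        next unless defined $mgr;    # skip blank or short lines
        $mgr_for{$id} = $mgr;
    }
    return \%mgr_for;
}

# Return the small file's lines with manager_id corrected where known.
sub corrected_lines {
    my ( $fh, $mgr_for ) = @_;
    my @out;
    while ( my $line = <$fh> ) {
        my ( $id, $group, $email, $mgr ) = split ' ', $line;
        next unless defined $mgr;    # skip blank or short lines
        $mgr = $mgr_for->{$id} if exists $mgr_for->{$id};
        push @out, "$id $group $email $mgr";
    }
    return @out;
}

# In-memory stand-ins (assumed data) so the sketch runs without the real files.
my $big_data   = "id1 group1 email1 mgr1\nid2 group2 email2 mgr2\n";
my $small_data = "id2 group1 email1 WRONG1\nid9 group9 email9 mgr9\n";
open my $big,   '<', \$big_data   or die "Cannot open big data: $!";
open my $small, '<', \$small_data or die "Cannot open small data: $!";

my $mgr_for = read_managers($big);
my @fixed   = corrected_lines( $small, $mgr_for );
print "$_\n" for @fixed;
```

To run it against the actual files, the two scalar references in the open calls would be replaced with 'bigFile.txt' and 'smallFile.txt'. Writing the corrected lines to a new file and renaming it afterwards is probably safer than overwriting the small file while it is being read.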
 
Gunnar Hjalmarsson

I have two text files: each one has four fields (delimited by a
space) on each line: id, group, email and manager_id. First file is a
small file with 50 entries and the second one is a huge file with
5,000 entries. The "id" field is same in both files, but the
manager_id's may be different. By comparing all the entries in the
second file (that has the correct manager id), I need to update the
manager_id field in the first file.

This approach is similar to the one posted by David, both making use of
a hash for 'bigFile.txt'. I chose to use 'Tie::File' for updating
'smallFile.txt'.

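use strict;
use warnings;
use Tie::File;
my %bighash;

# Put the data of 'bigFile.txt' into %bighash;
# key = id, value = the rest of the line ("group email manager_id")
open my $BIG, '<', 'bigFile.txt' or die $!;
while ( <$BIG> ) {
    chomp;
    my ( $key, $value ) = split ' ', $_, 2;
    next unless defined $value;    # skip blank or malformed lines
    $bighash{ $key } = $value;
}
close $BIG;

# Update manager_id of 'smallFile.txt' if needed
tie my @small, 'Tie::File', 'smallFile.txt' or die $!;
foreach my $line ( @small ) {
    my @fields = split ' ', $line;
    # leave the line alone if it is malformed or its id is not in the big file
    next unless @fields == 4 and exists $bighash{ $fields[0] };
    my $man_id_big = ( split ' ', $bighash{ $fields[0] } )[2];
    if ( defined $man_id_big and $fields[3] ne $man_id_big ) {
        $line = join ' ', @fields[ 0..2 ], $man_id_big;
    }
}
untie @small or die $!;

__END__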
 
Gunnar Hjalmarsson

my ( $key, $value ) = split ' ', $_, 2;
-------------------------------------^
Please note the LIMIT argument.
FWIW, if the order of the fields in the input file is the same order
that the OP stipulated then this will not grab the correct fields (the
hash needs the first and fourth fields, not the first two fields).

The above does not grab only the first two fields; it assigns the first
field to $key and all the other fields to $value.

In the foreach loop I have:

my $man_id_big = ( split ' ', $bighash{ $fields[0] } )[2];

Please feel free to claim that it's a clumsy solution, but it does
address the OP's problem. :) Furthermore, my solution makes the code
easier to adapt in case the OP is also interested in altering the
'group' and/or 'email' fields.
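
A short sketch of the LIMIT behaviour discussed above, using a made-up sample line. The variable names mirror the ones in my code, apart from $man_id, which is only for illustration.

```perl
use strict;
use warnings;

# Hypothetical line in the id/group/email/manager_id layout.
my $line = "id1 group1 email1 mgr1";

# With a LIMIT of 2, split produces at most two fields:
# the first field, and everything after the first separator.
my ( $key, $value ) = split ' ', $line, 2;

# Splitting the remainder again, index 2 is the manager id.
my $man_id = ( split ' ', $value )[2];

print "key=$key\n";
print "value=$value\n";
print "man_id=$man_id\n";
```

Here $key is "id1", $value is "group1 email1 mgr1", and $man_id is "mgr1".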
 
clearguy02

Ah. I had noted it, but had forgotten exactly how it behaved. I was
thinking it split everything and then returned only the first two
splits. Thanks for refreshing my memory on how this limit (which I
seldom use) actually behaves.

Thanks to everyone.

J
 
