If the hash key exists, why isn't it being found?

jason

Hello.

I'm trying to merge two files by a common key field.

file1.csv format:
B1,B2,B3,B4,BK

file2.csv format:
A1,AK,A3,A4


Need to produce:
file3.csv:
A3,A4,B2,B3,B4,BK

Where file1.csv is using BK=AK to lookup A3 and A4. Every record in
file1.csv must end up in file3.csv

Here's the code I have :



#!/usr/bin/perl
open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";
my %a;
open FILEA, 'filea.csv' "could not open 'filea' $!";
while ( <FILEA> ) {
    $_ =~ s/"([^"]*)"/&comma_fixer($1)/ge;
    my @f = split /,/;
    my $key = $f[4];
    #print $key;
    if ( exists $a{$key} )
    {
        print "Duplicate key found in ", $key ," ",$_," ", @f,"\n";
    }
    else
    {
        $a{$key} = \@f;
    }
}

open FILEB, 'fileb.csv' or die "could not open 'fileb' $!";
while ( <FILEB> ) {
    my @b = split /,/;
    my $key = $b[1];
    #BELOW IS THE SUSPECT LINE
    if ( exists $a{$key} ) {
        my $m = join ',', $key,
            $a{$key}[2],
            $b[1],
            $b[2],
            $b[3];
        print FILEC "$m\n";
        print "entry found\n";
    }
    else
    {
        # print "key not found ",$key,"\n";
    }

}

close FILEA;
close FILEB;
close FILEC;

sub comma_fixer {
    $string = @_[0];
    $string =~ s/,/ /g; ## replace , with blank,
    return $string;


=====

Every single line spits out "key not found".
What am I doing wrong and is there a better way to approach this?
 
Mumia W.

> Hello.
>
> I'm trying to merge two files by a common key field.
>
> file1.csv format:
> B1,B2,B3,B4,BK
>
> file2.csv format:
> A1,AK,A3,A4
>
> Need to produce:
> file3.csv:
> A3,A4,B2,B3,B4,BK
>
> Where file1.csv is using BK=AK to lookup A3 and A4. Every record in
> file1.csv must end up in file3.csv
>
> Here's the code I have :
>
> #!/usr/bin/perl
> open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";
> [...]

That's a deceptive error message. There are a number of
warnings and errors in your program.

Please put these lines at the top of your program and modify
your program to work with them:

use strict;
use warnings;
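
For illustration, here is a rough sketch of how the comma_fixer sub from the question might look once it compiles under both pragmas. Under strict, the original `$string = @_[0];` is rejected because `$string` is never declared, and warnings flags `@_[0]` as better written `$_[0]`. The sample line is made up for the example.

```perl
use strict;
use warnings;

# Same job as the original comma_fixer, but the argument is copied
# into a lexical declared with "my", so strict accepts it.
sub comma_fixer {
    my ($string) = @_;
    $string =~ s/,/ /g;    # replace each comma with a blank
    return $string;
}

# Blank out commas inside double-quoted fields, as the original loop does.
# The record below is invented sample data.
my $line = 'B1,"Smith, John",B3,B4,K1';
$line =~ s/"([^"]*)"/comma_fixer($1)/ge;
print "$line\n";
```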

And please try again to explain what you are trying to do. You
need a few more data records in filea and fileb to demonstrate.

Of course you know that AK != BK unless you mean that
Ameriking == Burger King ;-)
 
Paul Lalli

Mumia said:
> Of course you know that AK != BK unless you mean that
> Ameriking == Burger King ;-)

Just to be pedantic...

'AK' == 'BK' is actually true.
'AK' eq 'BK', on the other hand, is false. :)
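
A small sketch of the difference: `==` compares numerically, and a string with no leading digits converts to 0, so both sides become 0. The `no warnings 'numeric'` line is only there to silence the "isn't numeric" warnings that `use warnings` would otherwise print for this comparison.

```perl
use strict;
use warnings;

my ( $ak, $bk ) = ( 'AK', 'BK' );

{
    # Neither string starts with a number, so each is treated as 0.
    no warnings 'numeric';
    print "numeric ==: ", ( $ak == $bk ? "equal" : "different" ), "\n";
}

# String comparison looks at the characters themselves.
print "string eq:  ", ( $ak eq $bk ? "equal" : "different" ), "\n";
```

Hash lookups always compare keys as strings, so this matters only for explicit comparisons in your own code.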

Paul Lalli
 
anno4000

> Hello.
>
> I'm trying to merge two files by a common key field.
>
> file1.csv format:
> B1,B2,B3,B4,BK
>
> file2.csv format:
> A1,AK,A3,A4
>
> Need to produce:
> file3.csv:
> A3,A4,B2,B3,B4,BK
>
> Where file1.csv is using BK=AK to lookup A3 and A4. Every record in
> file1.csv must end up in file3.csv
>
> Here's the code I have :
>
> #!/usr/bin/perl

Missing:

use strict;
use warnings;

> open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";

You never use that file anywhere. Why is it there?

> my %a;
> open FILEA, 'filea.csv' "could not open 'filea' $!";
> while ( <FILEA> ) {
>     $_ =~ s/"([^"]*)"/&comma_fixer($1)/ge;

If you need that kind of fixing you'd better use one of the CSV modules
to read the file. Use a direct split on comma only if that's all
there's to it.
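
As a sketch of what that could look like: Text::CSV from CPAN is the usual choice, but to keep this example free of non-core dependencies it uses parse_line from the core Text::ParseWords module instead, which is a swapped-in substitute and not a full CSV parser (it also treats single quotes as quoting characters, so a stray apostrophe in the data can make it fail). The sample record is invented.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Invented record: the second field contains a comma inside quotes.
my $line = 'B1,"Smith, John",B3,B4,K1';

# parse_line( delimiter, keep_quotes, text ): with keep_quotes = 0 the
# surrounding quotes are removed and the embedded comma is preserved.
my @f = parse_line( ',', 0, $line );

print scalar(@f), " fields, key is '$f[4]'\n";
```

A plain `split /,/` on the same line would return six fields and shift the key out of position 4.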

>     my @f = split /,/;
>     my $key = $f[4];
>     #print $key;
>     if ( exists $a{$key} )
>     {
>         print "Duplicate key found in ", $key ," ",$_," ", @f,"\n";
>     }
>     else
>     {
>         $a{$key} = \@f;
>     }
> }
>
> open FILEB, 'fileb.csv' or die "could not open 'fileb' $!";
> while ( <FILEB> ) {

No comma-fixing for FILEB?

>     my @b = split /,/;
>     my $key = $b[1];
>     #BELOW IS THE SUSPECT LINE
>     if ( exists $a{$key} ) {
>         my $m = join ',', $key,
>             $a{$key}[2],
>             $b[1],
>             $b[2],
>             $b[3];
>         print FILEC "$m\n";
>         print "entry found\n";
>     }
>     else
>     {
>         # print "key not found ",$key,"\n";
>     }
>
> }
>
> close FILEA;
> close FILEB;
> close FILEC;
>
> sub comma_fixer {
>     $string = @_[0];
>     $string =~ s/,/ /g; ## replace , with blank,
>     return $string;
>
> =====
>
> Every single line spits out "key not found".
> What am I doing wrong and is there a better way to approach this?

I don't know why exists() doesn't find the strings you expect. You
haven't provided runnable code including test data. I'm not going
to make them up for you, so I can't debug it.

If your files aren't huge, I'd use a different approach. Write a sub
that reads a CSV file into a hash, keying on a given field in each
record. That is more or less the code you've written for FILEA. Use
it to read both files, and then build the output from the hashes.
Roughly, making some simplifying assumptions, and untested:

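# Index each file by its key column (0-based field number).
my $f1 = read_csv( 'filea.csv', 4 );
my $f2 = read_csv( 'fileb.csv', 1 );

open my $out, '>', 'filec.csv' or die "Can't overwrite 'filec.csv': $!";

while ( my ( $key, $reca ) = each %$f1 ) {
    my $recb = $f2->{$key};
    unless ( $recb ) {
        # warn first, then skip; "warn ..., next" would skip before warning
        warn "key '$key' not found";
        next;
    }
    # A3,A4 from the fileb record, then B2,B3,B4,BK from the filea record
    print $out join( ',', @$recb[ 2, 3 ], @$reca[ 1 .. 4 ] ), "\n";
}
close $out or die "Can't close 'filec.csv': $!";

# Read a CSV file into a hash of array refs, keyed on field $ikey.
sub read_csv {
    my ( $file, $ikey ) = @_;
    open my $in, '<', $file or die "Can't read '$file': $!";
    my %h;
    while ( <$in> ) {
        chomp;
        my @rec = split /,/;
        warn "Duplicate key '$rec[$ikey]'" if exists $h{ $rec[$ikey] };
        $h{ $rec[$ikey] } = \@rec;
    }
    return \%h;
}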

The output file will be in arbitrary order (not input order). If that
matters, modify or use a different approach.
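
One possible modification, sketched under some assumptions: index only fileb in a hash and walk the filea records in the order they were read, so the output keeps input order. The helper name merge_in_order, the sample records, and the choice to emit empty A3/A4 fields for an unmatched key (so that every filea record still appears) are all assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: both arguments are array refs of chomped lines.
sub merge_in_order {
    my ( $a_lines, $b_lines ) = @_;

    # Index the fileb records by AK (field 1).
    my %by_key;
    for my $line (@$b_lines) {
        my @rec_b = split /,/, $line;
        $by_key{ $rec_b[1] } = \@rec_b;
    }

    # Walk the filea records in input order and look each key up.
    my @out;
    for my $line (@$a_lines) {
        my @rec_a = split /,/, $line;
        my $match = $by_key{ $rec_a[4] };

        # Assumption: an unmatched record gets empty A3 and A4 fields.
        my @extra = $match ? @{$match}[ 2, 3 ] : ( '', '' );
        push @out, join ',', @extra, @rec_a[ 1 .. 4 ];
    }
    return @out;
}

# Invented sample data in the two layouts from the question.
my @merged = merge_in_order(
    [ 'b1,b2,b3,b4,K2', 'c1,c2,c3,c4,K9', 'd1,d2,d3,d4,K1' ],
    [ 'a1,K1,x3,x4',    'a1,K2,y3,y4' ],
);
print "$_\n" for @merged;
```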

Anno
 
Gary E. Ansok

> Hello.
>
> I'm trying to merge two files by a common key field.
>
> file1.csv format:
> B1,B2,B3,B4,BK
>
> file2.csv format:
> A1,AK,A3,A4

Any time you have two strings that you think should be identical
but aren't, the first thing you should do is to look at the strings.
Often, one or the other will have extra whitespace (such as a trailing
newline) or otherwise won't contain exactly what you think it does.

If the output will appear on a Web page or any other place where
output might get re-wrapped or where spaces will be difficult to see,
take extra care (I sometimes print out the string length in this case).
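
A minimal sketch of that kind of check, assuming a made-up helper named show: it brackets the value so stray whitespace is visible and reports the length as well. The two sample values stand in for a key stored with a trailing newline and the same key without one.

```perl
use strict;
use warnings;

# Hypothetical helper: bracket the value and report its length.
sub show {
    my ( $label, $value ) = @_;
    return sprintf "%s: >>>%s<<< (length %d)", $label, $value, length $value;
}

my $stored = "K1\n";    # e.g. the last field of a line that was not chomped
my $lookup = "K1";      # the same key taken from the middle of a line

print show( 'stored', $stored ), "\n";
print show( 'lookup', $lookup ), "\n";
```

The stored key reports length 3 and its closing `<<<` lands on the next line, which makes the trailing newline hard to miss.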

> #!/usr/bin/perl
> open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";
> my %a;
> open FILEA, 'filea.csv' "could not open 'filea' $!";
> while ( <FILEA> ) {
>     $_ =~ s/"([^"]*)"/&comma_fixer($1)/ge;
>     my @f = split /,/;
>     my $key = $f[4];
>     #print $key;
>     if ( exists $a{$key} )
>     {
>         print "Duplicate key found in ", $key ," ",$_," ", @f,"\n";
>     }
>     else
>     {
>         $a{$key} = \@f;
>     }
> }

Here, you could insert

use Data::Dumper;
print Dumper \%a;

to see exactly what keys have been stored in %a.
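
A slightly fuller sketch of the same idea: setting `$Data::Dumper::Useqq` makes Dumper print strings in double quotes with control characters escaped, so a trailing newline in a key shows up as a literal `\n`. The hash contents here are invented to mimic what the first loop would store without chomp.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # show "\n" and other control characters as escapes
$Data::Dumper::Sortkeys = 1;    # stable key order, easier to compare runs

# Invented data: the key is the un-chomped last field of a line.
my %a = ( "K1\n" => [ 'b1', 'b2', 'b3', 'b4', "K1\n" ] );

print Dumper( \%a );
```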

> open FILEB, 'fileb.csv' or die "could not open 'fileb' $!";
> while ( <FILEB> ) {
>     my @b = split /,/;
>     my $key = $b[1];

Here, you could insert something like

print "Looking for key >>>$key<<<\n";

>     #BELOW IS THE SUSPECT LINE
>     if ( exists $a{$key} ) {
>         my $m = join ',', $key,
>             $a{$key}[2],
>             $b[1],
>             $b[2],
>             $b[3];
>         print FILEC "$m\n";
>         print "entry found\n";
>     }
>     else
>     {
>         # print "key not found ",$key,"\n";
>     }
>
> }
>
> close FILEA;
> close FILEB;
> close FILEC;
>
> sub comma_fixer {
>     $string = @_[0];
>     $string =~ s/,/ /g; ## replace , with blank,
>     return $string;
>
> Every single line spits out "key not found".
> What am I doing wrong and is there a better way to approach this?

Check the two strings to verify that they are identical.

I notice that you aren't using chomp() on the data you read from
the input files -- this is probably causing one of your strings to
contain a trailing newline. For more information, do "perldoc -f chomp".
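
To make the effect concrete, here is a small sketch that indexes the same data on its last field with and without chomp. In the posted code the filea key is the last field on the line (`$f[4]`), so it keeps the newline, while the fileb key (`$b[1]`) comes from the middle of the line and does not. The helper name index_by_last_field and the sample data are made up, and the data is read from an in-memory filehandle so the example needs no files.

```perl
use strict;
use warnings;

# Hypothetical helper: index lines on their last field, with or without chomp.
sub index_by_last_field {
    my ( $data, $do_chomp ) = @_;
    open my $in, '<', \$data or die "Can't read in-memory data: $!";
    my %seen;
    while ( my $line = <$in> ) {
        chomp $line if $do_chomp;
        my @f = split /,/, $line;
        $seen{ $f[-1] } = \@f;    # without chomp this key ends in "\n"
    }
    close $in;
    return \%seen;
}

# Invented sample data in the filea layout (key in the last column).
my $filea = "b1,b2,b3,b4,K1\nc1,c2,c3,c4,K2\n";

for my $do_chomp ( 0, 1 ) {
    my $index = index_by_last_field( $filea, $do_chomp );
    printf "chomp %s: key 'K1' %s\n",
        ( $do_chomp ? 'on ' : 'off' ),
        ( exists $index->{K1} ? 'found' : 'not found' );
}
```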

Gary
 
