If hash key exists, why is it not being found?

Discussion in 'Perl Misc' started by jason@cyberpine.com, Sep 7, 2006.

  1. Guest

    Hello.

    I'm trying to merge two files by a common key field.

    file1.csv format:
    B1,B2,B3,B4,BK

    file2.csv format:
    A1,AK,A3,A4


    Need to produce:
    file3.csv:
    A3,A4,B2,B3,B4,BK

    Where file1.csv is using BK=AK to lookup A3 and A4. Every record in
    file1.csv must end up in file3.csv

    Here's the code I have :



    #!/usr/bin/perl
    open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";
    my %a;
    open FILEA, 'filea.csv' "could not open 'filea' $!";
    while ( <FILEA> ) {
    $_ =~ s/"([^"]*)"/&comma_fixer($1)/ge;
    my @f = split /,/;
    my $key = $f[4];
    #print $key;
    if ( exists $a{$key} )
    {
    print "Duplicate key found in ", $key ," ",$_," ", @f,"\n";
    }
    else
    {
    $a{$key} = \@f;
    }
    }

    open FILEB, 'fileb.csv' or die "could not open 'fileb' $!";
    while ( <FILEB> ) {
    my @b = split /,/;
    my $key = $b[1];
    #BELOW IS THE SUSPECT LINE
    if ( exists $a{$key} ) {
    my $m = join ',', $key,
    $a{$key}[2],
    $b[1],
    $b[2],
    $b[3];
    print FILEC "$m\n";
    print "entry found\n";
    }
    else
    {
    # print "key not found ",$key,"\n";
    }

    }

    close FILEA;
    close FILEB;
    close FILEC;

    sub comma_fixer {
    $string = @_[0];
    $string =~ s/,/ /g; ## replace , with blank,
    return $string;


    =====

    Every single line spits out "key not found".
    What am I doing wrong and is there a better way to approach this?
    , Sep 7, 2006
    #1

  2. Mumia W. Guest

    On 09/07/2006 06:58 AM, wrote:
    > Hello.
    >
    > I'm trying to merge two files by a common key field.
    >
    > file1.csv format:
    > B1,B2,B3,B4,BK
    >
    > file2.csv format:
    > A1,AK,A3,A4
    >
    >
    > Need to produce:
    > file3.csv:
    > A3,A4,B2,B3,B4,BK
    >
    > Where file1.csv is using BK=AK to lookup A3 and A4. Every record in
    > file1.csv must end up in file3.csv
    >
    > Here's the code I have :
    >
    >
    >
    > #!/usr/bin/perl
    > open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";
    > [...]


    That's a deceptive error message. There are a number of
    warnings and errors in your program.

    Please put these lines at the top of your program and modify
    your program to work with them:

    use strict;
    use warnings;
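
    As a small illustration of what those two lines change, here is a
    sketch of the poster's comma_fixer rewritten so that it compiles under
    strict and warnings. The sample line is made up; the point is only the
    lexical declaration and the use of $_[0] style argument handling.

```perl
use strict;
use warnings;

# Replace commas inside a quoted field with blanks.
sub comma_fixer {
    my ($string) = @_;    # lexical copy; strict rejects an undeclared $string
    $string =~ s/,/ /g;
    return $string;
}

# Made-up sample record with a comma inside a quoted field.
my $line = qq{B1,"Smith, John",B3,B4,K100};
( my $fixed = $line ) =~ s/"([^"]*)"/comma_fixer($1)/ge;
print "$fixed\n";
```

    Under strict, the original "$string = @_[0];" stops compilation
    because $string was never declared, and warnings flags @_[0] as
    better written $_[0].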

    And please try again to explain what you are trying to do. You
    need a few more data records in filea and fileb to demonstrate.

    Of course you know that AK != BK unless you mean that
    Ameriking == Burger King ;-)
    Mumia W., Sep 7, 2006
    #2

  3. Paul Lalli Guest

    Mumia W. wrote:

    > Of course you know that AK != BK unless you mean that
    > Ameriking == Burger King ;-)


    Just to be pedantic...

    'AK' == 'BK' is actually true.
    'AK' eq 'BK', on the other hand, is false. :)
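
    A quick sketch of why that is: == compares numerically, and both
    strings numify to 0, while eq compares the characters. The variable
    names here are only for illustration.

```perl
use strict;
use warnings;

# Numeric comparison: both strings numify to 0, so == reports them equal.
my $numeric_equal = do {
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    'AK' == 'BK';
};

# String comparison looks at the characters themselves.
my $string_equal = 'AK' eq 'BK';

print "==: ", ( $numeric_equal ? "true" : "false" ), "\n";
print "eq: ", ( $string_equal  ? "true" : "false" ), "\n";
```

    With warnings enabled and no local "no warnings", Perl would also
    complain that the arguments to == are not numeric.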

    Paul Lalli
    Paul Lalli, Sep 7, 2006
    #3
  4. -berlin.de Guest

    <> wrote in comp.lang.perl.misc:
    > Hello.
    >
    > I'm trying to merge two files by a common key field.
    >
    > file1.csv format:
    > B1,B2,B3,B4,BK
    >
    > file2.csv format:
    > A1,AK,A3,A4
    >
    >
    > Need to produce:
    > file3.csv:
    > A3,A4,B2,B3,B4,BK
    >
    > Where file1.csv is using BK=AK to lookup A3 and A4. Every record in
    > file1.csv must end up in file3.csv
    >
    > Here's the code I have :
    >
    >
    >
    > #!/usr/bin/perl


    Missing:

    use strict;
    use warnings;

    > open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";


    You never use that file anywhere. Why is it there?

    > my %a;
    > open FILEA, 'filea.csv' "could not open 'filea' $!";
    > while ( <FILEA> ) {
    > $_ =~ s/"([^"]*)"/&comma_fixer($1)/ge;


    If you need that kind of fixing you'd better use one of the CSV modules
    to read the file. Use a direct split on comma only if that's all
    there's to it.
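
    As a rough sketch of the idea, and not the CSV module approach itself:
    the core module Text::ParseWords can split a line on commas while
    keeping quoted fields together. The helper name split_csv_line and the
    sample record are made up. A dedicated CSV module from CPAN (Text::CSV,
    for example) is the sturdier choice for real data, since parse_line
    also treats single quotes and backslashes as special.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);    # core module

# Split one CSV line into fields, honouring double-quoted fields that
# contain commas. The second argument (0) tells parse_line to drop the quotes.
sub split_csv_line {
    my ($line) = @_;
    chomp $line;
    return parse_line( ',', 0, $line );
}

# Made-up record: the second field contains a comma inside quotes.
my @fields = split_csv_line(qq{B1,"Smith, John",B3,B4,K100\n});
print scalar(@fields), " fields; key is '$fields[4]'\n";
```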

    > my @f = split /,/;
    > my $key = $f[4];
    > #print $key;
    > if ( exists $a{$key} )
    > {
    > print "Duplicate key found in ", $key ," ",$_," ", @f,"\n";
    > }
    > else
    > {
    > $a{$key} = \@f;
    > }
    > }
    >
    > open FILEB, 'fileb.csv' or die "could not open 'fileb' $!";
    > while ( <FILEB> ) {


    No comma-fixing for FILEB?

    > my @b = split /,/;
    > my $key = $b[1];
    > #BELOW IS THE SUSPECT LINE
    > if ( exists $a{$key} ) {
    > my $m = join ',', $key,
    > $a{$key}[2],
    > $b[1],
    > $b[2],
    > $b[3];
    > print FILEC "$m\n";
    > print "entry found\n";
    > }
    > else
    > {
    > # print "key not found ",$key,"\n";
    > }
    >
    > }
    >
    > close FILEA;
    > close FILEB;
    > close FILEC;
    >
    > sub comma_fixer {
    > $string = @_[0];
    > $string =~ s/,/ /g; ## replace , with blank,
    > return $string;
    >
    >
    > =====
    >
    > Every single line spits out "key not found".
    > What am I doing wrong and is there a better way to approach this?


    I don't know why exists() doesn't find the strings you expect. You
    haven't provided runnable code including test data. I'm not going
    to make them up for you, so I can't debug it.

    If your files aren't huge, I'd use a different approach. Write a sub
    that reads a CSV file into a hash, keying on a given field in each
    record. That is more or less the code you've written for FILEA. Use
    it to read both files, and then build the output from the hashes.
    Roughly, making some simplifying assumptions, and untested:

    my $f1 = read_csv( 'filea.csv', 4);
    my $f2 = read_csv( 'fileb.csv', 1);

    my $out;
    open $out, '>', $_ or die "Can't overwrite '$_': $!" for 'filec.csv';

    while ( my ( $key, $reca) = each %$f1 ) {
        my $recb = $f2->{ $key};
        unless ( $recb ) {
            warn "key '$key' not found\n";
            next;
        }
        # A3, A4 come from the fileb record; B2, B3, B4, BK from filea
        print $out join( ',' => @$recb[ 2, 3], @$reca[ 1 .. 4]), "\n";
    }
    close $out or die "Can't close 'filec.csv': $!";

    # Read a CSV file into a hash of array refs, keyed on field $ikey
    sub read_csv {
        my ( $file, $ikey) = @_;
        open my $in, '<', $file or die "Can't read '$file': $!";
        my %h;
        while ( <$in> ) {
            chomp;
            my @rec = split /,/;
            warn "Duplicate key '$rec[ $ikey]'\n" if exists $h{ $rec[ $ikey]};
            $h{ $rec[ $ikey]} = \ @rec;
        }
        return \ %h;
    }

    The output file will be in arbitrary order (not input order). If that
    matters, modify or use a different approach.
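
    One possible different approach, sketched under assumptions: index only
    the lookup file (the A1,AK,A3,A4 one) in a hash, then stream the other
    file line by line so the output keeps its order and every record is
    written, with A3 and A4 left empty when there is no match. The sub
    names and the in-memory sample data are made up for illustration.

```perl
use strict;
use warnings;

# Index the lookup data (A1,AK,A3,A4) by the key column given in $ikey.
sub index_by_key {
    my ( $in, $ikey ) = @_;
    my %h;
    while ( my $line = <$in> ) {
        chomp $line;
        my @rec = split /,/, $line;
        $h{ $rec[$ikey] } = \@rec;
    }
    return \%h;
}

# Stream the main data (B1,B2,B3,B4,BK) in input order and write
# A3,A4,B2,B3,B4,BK for every record; A3 and A4 are empty without a match.
sub merge_in_order {
    my ( $main, $lookup, $out ) = @_;
    while ( my $line = <$main> ) {
        chomp $line;
        my @b        = split /,/, $line;
        my $match    = $lookup->{ $b[4] };
        my @a_fields = $match ? @$match[ 2, 3 ] : ( '', '' );
        print {$out} join( ',', @a_fields, @b[ 1 .. 4 ] ), "\n";
    }
}

# Demo on in-memory data (made up) instead of real files.
my $lookup_data = "a1,K1,a3,a4\n";
my $main_data   = "b1,b2,b3,b4,K1\nc1,c2,c3,c4,K9\n";
open my $in_lookup, '<', \$lookup_data or die $!;
open my $in_main,   '<', \$main_data   or die $!;
my $merged = '';
open my $out, '>', \$merged or die $!;
merge_in_order( $in_main, index_by_key( $in_lookup, 1 ), $out );
close $out;
print $merged;
```

    Only one file has to fit in memory this way, which may also help if
    the files turn out to be large.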

    Anno
    -berlin.de, Sep 7, 2006
    #4
  5. Gary E. Ansok Guest

    In article <>,
    <> wrote:
    >Hello.
    >
    >I'm trying to merge two files by a common key field.
    >
    >file1.csv format:
    >B1,B2,B3,B4,BK
    >
    >file2.csv format:
    >A1,AK,A3,A4


    Any time you have two strings that you think should be identical
    but aren't, the first thing you should do is to look at the strings.
    Often, one or the other will have extra whitespace (such as a trailing
    newline) or otherwise won't contain exactly what you think it does.

    If the output will appear on a Web page or any other place where
    output might get re-wrapped or where spaces will be difficult to see,
    take extra care (I sometimes print out the string length in this case).
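
    A minimal sketch of that kind of check, with a made-up helper name
    (show_string): it brackets the string, spells out newline, carriage
    return and tab, and reports the length.

```perl
use strict;
use warnings;

# Return a printable description of a string: a bracketed copy with
# newline, carriage return and tab spelled out, plus the original length.
sub show_string {
    my ($s) = @_;
    my $len = length $s;
    ( my $visible = $s ) =~ s/\n/\\n/g;
    $visible =~ s/\r/\\r/g;
    $visible =~ s/\t/\\t/g;
    return "[$visible] (length $len)";
}

print show_string("K100\n"), "\n";
print show_string("K100"),   "\n";
```

    Two keys that look the same on screen but report different lengths
    are not the same hash key.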

    > #!/usr/bin/perl
    >open FILEC, '>filec.csv' or die "could not open 'matched.csv' $!";
    >my %a;
    >open FILEA, 'filea.csv' "could not open 'filea' $!";
    >while ( <FILEA> ) {
    > $_ =~ s/"([^"]*)"/&comma_fixer($1)/ge;
    > my @f = split /,/;
    > my $key = $f[4];
    > #print $key;
    > if ( exists $a{$key} )
    > {
    > print "Duplicate key found in ", $key ," ",$_," ", @f,"\n";
    > }
    > else
    > {
    > $a{$key} = \@f;
    > }
    >}


    Here, you could insert

    use Data::Dumper;
    print Dumper \%a;

    to see exactly what keys have been stored in %a.
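
    A slightly fuller sketch, assuming a made-up one-record hash: setting
    $Data::Dumper::Useqq makes Dumper print control characters as escapes,
    so a stray newline in a key shows up as \n rather than as a line break
    in the output.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # print "\n" and other control chars as escapes
$Data::Dumper::Sortkeys = 1;    # stable key order in the dump

# Made-up hash whose key still carries the trailing newline from the file.
my %records = ( "K100\n" => [ 'b1', 'b2', 'b3', 'b4', "K100\n" ] );

my $dump = Dumper( \%records );
print $dump;
```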

    >open FILEB, 'fileb.csv' or die "could not open 'fileb' $!";
    >while ( <FILEB> ) {
    > my @b = split /,/;
    > my $key = $b[1];


    Here, you could insert something like

    print "Looking for key >>>$key<<<\n";

    > #BELOW IS THE SUSPECT LINE
    > if ( exists $a{$key} ) {
    > my $m = join ',', $key,
    > $a{$key}[2],
    > $b[1],
    > $b[2],
    > $b[3];
    > print FILEC "$m\n";
    > print "entry found\n";
    > }
    > else
    > {
    > # print "key not found ",$key,"\n";
    > }
    >
    >}
    >
    >close FILEA;
    >close FILEB;
    >close FILEC;
    >
    >sub comma_fixer {
    > $string = @_[0];
    > $string =~ s/,/ /g; ## replace , with blank,
    > return $string;
    >
    >Every single line spits out "key not found".
    >What am I doing wrong and is there a better way to approach this?


    Check the two strings to verify that they are identical.

    I notice that you aren't using chomp() on the data you read from
    the input files -- this is probably causing one of your strings to
    contain a trailing newline. For more information, do "perldoc -f chomp".
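
    A small sketch of the effect, using a made-up record and helper name
    (keys_from_last_field): when the key is the last field on the line,
    skipping chomp leaves the newline attached to it, so a lookup with the
    bare key fails.

```perl
use strict;
use warnings;

# Build a hash keyed on the fifth field of each line, with or without chomp.
sub keys_from_last_field {
    my ( $data, $use_chomp ) = @_;
    my %seen;
    for my $line ( split /^/, $data ) {    # split into lines, keeping "\n"
        chomp $line if $use_chomp;
        my @f = split /,/, $line;
        $seen{ $f[4] } = 1;
    }
    return \%seen;
}

my $data = "b1,b2,b3,b4,K100\n";    # made-up record, key in the last field

my $raw     = keys_from_last_field( $data, 0 );
my $chomped = keys_from_last_field( $data, 1 );

print "without chomp: ", ( exists $raw->{K100}     ? "found" : "not found" ), "\n";
print "with chomp:    ", ( exists $chomped->{K100} ? "found" : "not found" ), "\n";
```

    In the posted program the key from filea is the last field ($f[4]),
    while the key from fileb is a middle field ($b[1]), which would explain
    why only one side carries the newline.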

    Gary
    --
    The recipe says "toss lightly," but I suppose that depends
    on how much you eat and how bad the cramps get. - J. Lileks
    Gary E. Ansok, Sep 7, 2006
    #5
