Identifying common elements from multiple hashes

Discussion in 'Perl Misc' started by Neil, Dec 15, 2005.

  1. Neil

    Neil Guest

    Hello,

    I'm new to this group and I greatly need and would appreciate your
    help. I am trying to write a program that will compare multiple
    hashes. Each hash is basically a table of 'x' and 'y' values. For
    example, Table 1 could look something like this:

    Table 1
    x-values y-values
    1 10
    5 21
    11 1000
    17 43
    21 10000

    First, the program needs to identify the values of 'x' that appear in
    all 'n' hashes. Once identified, it needs to compute the average of
    the y-values that correspond to that x-value.
    This is better illustrated by an example. Consider Table 2:

    Table 2
    x-values y-values
    1 8
    7 21
    12 1000
    17 45
    22 10000

    Both Table 1 and Table 2 (hashes) contain the x-values: 1 and 17.
    Therefore, Table 3 (generated by the program) would be:

    Table 3
    x-values y-values
    1 9 # as (10 + 8)/2 = 9
    17 44 # as (43 + 45)/2 = 44

    The program needs to extend to any number of hashes, though.
    If someone could just outline the general approach, I would be greatly
    indebted. Thank you so much.

    Sincerely, Neal
     
    Neil, Dec 15, 2005
    #1

  2. usenet

    usenet Guest

    Welcome to comp.lang.perl.misc. Being new to the group, you may not be
    aware that many of the regular posters here encourage you to read and
    abide by the group posting guidelines, which you may read on the web
    here:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
    These guidelines are for YOUR benefit, because they show you how to ask
    a good question which is highly likely to get a good answer.
    You are speaking English. It is hard to understand exactly what your
    data structure looks like. That's why the posting guidelines encourage
    you to speak Perl (i.e., show us the code that creates your hashes, or
    show us a Data::Dumper representation of the hash).
    I hope you're familiar with CPAN; I would refer you to the
    List::Compare module to easily compute the intersection of your keys:

    http://search.cpan.org/~jkeenan/List-Compare-0.32/lib/List/Compare.pm

    This works (but it may not mimic your data structure, since you didn't
    really specify your data structure):

    #!/usr/bin/perl
    use warnings; use strict;
    use List::Compare;

    my %hash1 = qw/1 10 5 21 11 1000 17 43 21 10000/;
    my %hash2 = qw/1 8 7 21 12 1000 17 45 22 10000/;

    my $lc = List::Compare -> new([keys %hash1], [keys %hash2]);

    my %hash3;
    for ($lc->get_intersection) {
        $hash3{$_} = ($hash1{$_} + $hash2{$_}) / 2;
        print "$_\t$hash3{$_}\n";
    }

    __END__
     
    usenet, Dec 15, 2005
    #2
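
The post above asks for a Data::Dumper representation of the hash. As a minimal sketch of what that could look like, the following dumps Table 1 from the original question. The variable name %table1 is an assumption for illustration, not something from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Table 1 from the original question, as an x => y hash.
# (%table1 is an illustrative name, not taken from the thread.)
my %table1 = ( 1 => 10, 5 => 21, 11 => 1000, 17 => 43, 21 => 10000 );

# Sort the keys so the dump is stable between runs.
$Data::Dumper::Sortkeys = 1;

# Dumper takes a reference and returns the structure as Perl source text.
my $dump = Dumper( \%table1 );
print $dump;
```

Pasting output like this into a post lets readers see the exact shape of the data without guessing.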

  3. John W. Krahn

    $ perl -e'
    my %hash1 = qw/
    1 10
    5 21
    11 1000
    17 43
    21 10000
    /;
    my %hash2 = qw/
    1 8
    7 21
    12 1000
    17 45
    22 10000
    /;

    use Data::Dumper;
    my @hashes = \( %hash1, %hash2 );

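    # Count how many of the hashes contain each key.
    my %common_keys;
    for my $hash_ref ( @hashes ) {
        $common_keys{ $_ }++ for keys %$hash_ref;
    }

    # Sum the values of the keys that occur in every hash.
    my %averages;
    for my $hash_ref ( @hashes ) {
        $averages{ $_ } += $hash_ref->{ $_ }
            for grep $common_keys{ $_ } == @hashes, keys %common_keys;
    }
    # Divide each sum by the number of hashes to get the mean.
    $_ /= @hashes for values %averages;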

    print Dumper \%averages;
    '
    $VAR1 = {
    '1' => '9',
    '17' => '44'
    };



    John
     
    John W. Krahn, Dec 16, 2005
    #3
  4. Neil

    Neil Guest

    Hi everyone,

    Thank you to those who gave me your excellent suggestions. I want to
    analyze 'n' tables composed of 'x' and 'y' values and build a new hash
    that contains only the 'x' values common to all 'n' hashes, where the
    value for each key 'x' is the average of all 'n' y-values. The code
    included below shows how I generate the data structure that I want
    to analyze. I tried to use the module List::Compare, but this module
    requires that you use the format as given:

    @Al = qw(abel abel baker camera delta edward fargo golfer);
    @Bob = qw(baker camera delta delta edward fargo golfer hilton);
    @Carmen = qw(fargo golfer hilton icon icon jerky kappa);
    @Don = qw(fargo icon jerky);
    @Ed = qw(fargo icon icon jerky);

    $lcm = List::Compare->new(\@Al, \@Bob, \@Carmen, \@Don, \@Ed);

    However, if you will look at the way that I am generating my data
    structure and how the number of lists can be totally different each
    time, it is apparent that this method will not work. I would really
    appreciate it if someone could please offer me some advice. Thank you
    so much.


    #!/usr/bin/perl
    use strict;
    use warnings;
    use List::Compare;

    print "Please enter the number of related tables that you want to analyze.\n";

    my $num_of_spectra = <STDIN>;
    chomp $num_of_spectra;

    # Collect one .dta file name per table.
    my @dtafilename_array;
    for my $i ( 0 .. $num_of_spectra - 1 ) {
        print "Please enter the name of the .dta file.\n";
        $dtafilename_array[$i] = <STDIN>;
        chomp $dtafilename_array[$i];
    }

    # One hash per file: $ms_data[$n]{x} = y
    my @ms_data;
    foreach my $dtafile (@dtafilename_array) {

        unless ( -e $dtafile ) {
            print "File \"$dtafile\" doesn't seem to exist!!\n";
            exit;
        }
        my $dta_fh;
        unless ( open( $dta_fh, '<', $dtafile ) ) {
            print "Cannot open file \"$dtafile\"\n\n";
            exit;
        }

        # Each line holds a tab-separated x and y value.
        my %table;
        while ( my $dtafileline = <$dta_fh> ) {
            chomp $dtafileline;
            my @columns = split /\t/, $dtafileline;
            $table{ $columns[0] } = $columns[1];
        }
        push @ms_data, \%table;

        close $dta_fh;
    }

    # Here is some sample data.

    File 1:

    6 100
    7 95
    8 96

    File 2:

    6 109
    11 87
    12 45

    File 3:

    6 103
    7 87
    15 43
     
    Neil, Dec 16, 2005
    #4
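
The key-counting approach from post #3 carries over to an array of hash references such as @ms_data, however many files are read. The following is a minimal sketch under stated assumptions, not a definitive implementation: average_common is a hypothetical helper name, and the three sample files from post #4 are written inline instead of being read from disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes any number of hash
# references and returns a hash of key => mean value for the keys
# that are present in every input hash.
sub average_common {
    my @hashes = @_;
    return () unless @hashes;

    # Count in how many hashes each key occurs.
    my %seen;
    for my $hash_ref (@hashes) {
        $seen{$_}++ for keys %$hash_ref;
    }

    # Keep the keys seen in all hashes and average their values.
    my %averages;
    for my $key ( grep { $seen{$_} == @hashes } keys %seen ) {
        my $sum = 0;
        $sum += $_->{$key} for @hashes;
        $averages{$key} = $sum / @hashes;
    }
    return %averages;
}

# The three sample files from post #4, as they would sit in @ms_data.
my @ms_data = (
    { 6 => 100, 7  => 95, 8  => 96 },
    { 6 => 109, 11 => 87, 12 => 45 },
    { 6 => 103, 7  => 87, 15 => 43 },
);

my %averages = average_common(@ms_data);
print "$_\t$averages{$_}\n" for sort { $a <=> $b } keys %averages;
# prints "6\t104": (100 + 109 + 103) / 3 = 104
```

Because the helper only looks at the number of references it receives, the same call works for two tables or twenty. With the sample data, 6 is the only x-value found in all three files, so it is the only key in the result.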
  5. Neil

    Neil Guest

    Thank you very much, David, John and Jim!
     
    Neil, Dec 16, 2005
    #5
