Identifying common elements from multiple hashes

Neil · Dec 15, 2005

Hello,

I'm new to this group and I greatly need and would appreciate your
help. I am trying to write a program that will compare multiple
hashes. Each hash is basically a table of 'x' and 'y' values. For
example, Table 1 could look something like this:

Table 1
x-values y-values
1 10
5 21
11 1000
17 43
21 10000

First, the program needs to identify the values of 'x' that appear in
all 'n' hashes. Once identified, it needs to compute the average of
the y-values that correspond to that x-value.
This is better illustrated by an example. Consider, Table 2:

Table 2
x-values y-values
1 8
7 21
12 1000
17 45
22 10000

Both Table 1 and Table 2 (hashes) contain the x-values: 1 and 17.
Therefore, Table 3 (generated by the program) would be:

Table 3
x-values y-values
1 9 # as (10 + 8)/2 = 9
17 44 # as (43 + 45)/2 = 44

This program needs to be able to be extended to any number of hashes
though.
If someone could just outline the general approach, I would be greatly
indebted. Thank you so much.

Sincerely, Neal

usenet · Dec 15, 2005

Neil said:
Hello,

I'm new to this group and I greatly need and would appreciate your help.

Welcome to comp.lang.perl.misc. Being new to the group, you may not be
aware that many of the regular posters here encourage you to read and
abide by the group posting guidelines, which you may read on the web
here:
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
These guidelines are for YOUR benefit, because they show you how to ask
a good question which is highly likely to get a good answer.

I am trying to write a program that will compare multiple hashes.
Each hash is basically a table of 'x' and 'y' values. For example...

You are speaking English. It is hard to understand exactly what your
data structure looks like. That's why the posting guidelines encourage
you to speak Perl (i.e., show us the code that creates your hashes, or
show us a Data::Dumper representation of the hash).

First, the program needs to identify the values of 'x' that appear in
all 'n' hashes. Once identified, it needs to compute the average of
the y-values that correspond to that x-value.

I hope you're familiar with CPAN; I would refer you to the
List::Compare module to easily compute the intersection of your keys:

http://search.cpan.org/~jkeenan/List-Compare-0.32/lib/List/Compare.pm

This works (but it may not mimic your data structure, since you didn't
really specify your data structure):

#!/usr/bin/perl
use warnings; use strict;
use List::Compare;

my %hash1 = qw/1 10 5 21 11 1000 17 43 21 10000/;
my %hash2 = qw/1 8 7 21 12 1000 17 45 22 10000/;

my $lc = List::Compare -> new([keys %hash1], [keys %hash2]);

my %hash3;
for ($lc->get_intersection) {
    $hash3{$_} = ($hash1{$_} + $hash2{$_}) / 2;
    print "$_\t$hash3{$_}\n";
}

__END__
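
The two-hash version above hard-codes %hash1 and %hash2. One possible way to extend it to any number of hashes, sketched here in core Perl rather than with List::Compare, is to pass the hashes as a list of references and repeatedly filter the candidate keys with grep and exists. The names average_common and @tables are illustrative only and do not come from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper (not from the thread): takes any number of hash
# references and returns a hash reference that maps each key present in
# ALL of them to the mean of its values.
sub average_common {
    my @tables = @_;
    return {} unless @tables;

    # Start from the keys of the first table and keep only those that
    # also exist in every remaining table.
    my ($first, @rest) = @tables;
    my @common = keys %$first;
    for my $table (@rest) {
        @common = grep { exists $table->{$_} } @common;
    }

    # Average the values stored under each surviving key.
    my %average;
    for my $x (@common) {
        $average{$x} = sum(map { $_->{$x} } @tables) / @tables;
    }
    return \%average;
}

my %hash1 = qw/1 10 5 21 11 1000 17 43 21 10000/;
my %hash2 = qw/1 8 7 21 12 1000 17 45 22 10000/;

my $avg = average_common(\%hash1, \%hash2);
print "$_\t$avg->{$_}\n" for sort { $a <=> $b } keys %$avg;
```

Because the hashes arrive as a plain list of references, a third or fourth table only needs one more argument; nothing inside the helper changes.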

John W. Krahn · Dec 16, 2005

Neil said:
I'm new to this group and I greatly need and would appreciate your
help. I am trying to write a program that will compare multiple
hashes. Each hash is basically a table of 'x' and 'y' values. For
example, Table 1 could look something like this:

Table 1
x-values y-values
1 10
5 21
11 1000
17 43
21 10000

First, the program needs to identify the values of 'x' that appear in
all 'n' hashes. Once identified, it needs to compute the average of
the y-values that correspond to that x-value.
This is better illustrated by an example. Consider, Table 2:

Table 2
x-values y-values
1 8
7 21
12 1000
17 45
22 10000

Both Table 1 and Table 2 (hashes) contain the x-values: 1 and 17.
Therefore, Table 3 (generated by the program) would be:

Table 3
x-values y-values
1 9 # as (10 + 8)/2 = 9
17 44 # as (43 + 45)/2 = 44

This program needs to be able to be extended to any number of hashes
though.
If someone could just outline the general approach, I would be greatly
indebted. Thank you so much.

$ perl -e'
my %hash1 = qw/
1 10
5 21
11 1000
17 43
21 10000
/;
my %hash2 = qw/
1 8
7 21
12 1000
17 45
22 10000
/;

use Data::Dumper;
my @hashes = \( %hash1, %hash2 );

my %common_keys;
for my $hash_ref ( @hashes ) {
    $common_keys{ $_ }++ for keys %$hash_ref;
}

my %averages;
for my $hash_ref ( @hashes ) {
    $averages{ $_ } += $hash_ref->{ $_ }
        for grep $common_keys{ $_ } == @hashes, keys %common_keys;
}
$_ /= @hashes for values %averages;

print Dumper \%averages;
'
$VAR1 = {
'1' => '9',
'17' => '44'
};

John

Neil · Dec 16, 2005

Hi everyone,

Thank you to those who gave me your excellent suggestions. My problem
is that I want to analyze 'n' tables that are composed of 'x' and 'y'
values and make a new hash that contains only the 'x' values that are
common to all 'n' hashes, where the value that corresponds to each
key 'x' is the average of all 'n' y-values. The code that I am
including below illustrates how I generate the data structure that I
want to analyze. I tried to use the module List::Compare, but this
module requires that you use the format as given:

@Al = qw(abel abel baker camera delta edward fargo golfer);
@Bob = qw(baker camera delta delta edward fargo golfer hilton);
@Carmen = qw(fargo golfer hilton icon icon jerky kappa);
@Don = qw(fargo icon jerky);
@Ed = qw(fargo icon icon jerky);

$lcm = List::Compare->new(\@Al, \@Bob, \@Carmen, \@Don, \@Ed);

However, if you will look at the way that I am generating my data
structure and how the number of lists can be totally different each
time, it is apparent that this method will not work. I would really
appreciate it if someone could please offer me some advice. Thank you
so much.

#!/usr/bin/perl
use List::Compare;
print "Please enter the number of related tables that you want to analyze.";
print "\n";

my $num_of_spectra = <STDIN>;
chomp $num_of_spectra;

my @dtafilename_array;
$c1=0;
$c2=0;
$c3=0;

while ($c1 < $num_of_spectra) {
print "Please enter the name of the .dta file.\n";
$dtafilename_array[$c1] = <STDIN>;
chomp $dtafilename_array[$c1];
$c1++;
}

foreach $dta (@dtafilename_array) {

$dtafile = $dtafilename_array[$c2];

unless ( -e $dtafile) {
print "File \"$dtafile\" doesn\'t seem to exist!!\n";
exit;
}
unless ( open(DTAFILE, $dtafile) ) {
print "Cannot open file \"$dtafile\"\n\n";
exit;
}

$c2++;

while ($dtafileline = <DTAFILE>) {
chomp $dtafileline;
@columns = split("\t", $dtafileline);
$ms_data[$c3]{$columns[0]} = $columns[1];
}

$c3++;

close DTAFILE;

}

# Here is some sample data.

File 1:

6 100
7 95
8 96

File 2:

6 109
11 87
12 45

File 3:

6 103
7 87
15 43
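
For the array-of-hashes structure built above (@ms_data), the counting idea from John's reply carries over directly, because the number of tables is just the length of the array. The following is a minimal sketch under some assumptions: the sample data is tab-separated (the posted code splits on "\t"), the three files are replaced by in-memory strings so the example is self-contained, and the names read_table, @sample_files, %sum and %seen are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for the three sample .dta files; real code would open the
# file names collected from STDIN instead of these in-memory strings.
my @sample_files = (
    "6\t100\n7\t95\n8\t96\n",
    "6\t109\n11\t87\n12\t45\n",
    "6\t103\n7\t87\n15\t43\n",
);

# Read one tab-separated table from an open filehandle into a hash.
sub read_table {
    my ($fh) = @_;
    my %table;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;          # skip blank lines
        my ($x, $y) = split /\t/, $line;
        $table{$x} = $y;
    }
    return \%table;
}

# Build the array of hashes, one hash per table.
my @ms_data;
for my $contents (@sample_files) {
    open my $fh, '<', \$contents or die "Cannot open in-memory file: $!";
    push @ms_data, read_table($fh);
    close $fh;
}

# Sum the y-values and count the occurrences of every x-value.
my (%sum, %seen);
for my $table (@ms_data) {
    for my $x (keys %$table) {
        $sum{$x} += $table->{$x};
        $seen{$x}++;
    }
}

# Keep only x-values seen in every table and divide by the table count.
my %average;
for my $x (grep { $seen{$_} == @ms_data } keys %seen) {
    $average{$x} = $sum{$x} / @ms_data;
}

print "$_\t$average{$_}\n" for sort { $a <=> $b } keys %average;
```

With the sample data only x = 6 appears in all three tables, so the single output line is 6 and (100 + 109 + 103)/3 = 104. Separately, since Perl flattens argument lists, the List::Compare constructor quoted earlier can presumably also be given a list of array references built at run time, for example from map { [ keys %$_ ] } @ms_data; that variant is untested here.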

Neil · Dec 16, 2005

Thank you very much, David, John and Jim!
