Amer Neely
I'm stuck trying to get a count of duplicate emails in a HoH.
Input is a pipe-delimited file of users with their email addresses.
Element 0 of each record is a unique ID number; element 10 is the user's email.
The problem is some users are present multiple times, so that their ID
number is unique but their email remains the same.
I'm trying to prepend a string to the duplicated emails, so I can use
them to populate a MySQL table along with the ID numbers. The modified
email addresses will (likely) not work. Not my problem.
What I have is a working script from the 'Perl Cookbook' (recipe #4.6)
that shows me which emails are duplicated.
I'm trying to get a count of those emails ALONG with their respective IDs.
My attempts so far:
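#!/usr/bin/perl
use strict;
use warnings;
# Assumption: the input file name comes from the command line
# (the declaration of $InFile is not shown in this excerpt).
my $InFile = shift @ARGV or die "Usage: $0 <file>\n";
my ($ID,$Email);
my %SeenEmails=();
open IN,"<",$InFile or die "Can't open $InFile: $!\n";
# Read one record at a time and count how often each email occurs.
while (my $line = <IN>)
{
    chomp $line;
    ($ID,$Email) = ((split /\|/, $line)[0,10]);
    $SeenEmails{$Email}++;
}
close IN or die "Can't close $InFile: $!\n";
for (sort keys %SeenEmails)
{
    if ($SeenEmails{$_} > 1)
    {
        print "$_ -> $SeenEmails{$_}\n";
        # The first occurrence keeps its address; the rest get a prefix.
        for my $i (1..($SeenEmails{$_}-1))
        {
            my $thisone = 'dupe' . $i . '_' . $_;
            print "\t$thisone\n";
        }
    }
}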
This prepends the email with a dynamically generated string, but I'm
missing the ID number when I populate the hash initially.
If I use
$SeenEmails{$ID}{$Email}++;
I get the ID number, but I'm counting the wrong thing.
How can I get the duplicated emails (and how many there are) with their
respective IDs, so I can prepend my string?
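One possible direction, offered only as a sketch: key the hash on the email and store the list of IDs under it (a hash of arrays rather than a HoH), so the count is simply the length of each list. The sample records, the names %ids_for and %email_for, and the choice to leave the first occurrence unprefixed are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample records: field 0 is the ID, field 10 is the email.
my @lines = (
    "101|a|b|c|d|e|f|g|h|i|pat\@example.com\n",
    "102|a|b|c|d|e|f|g|h|i|sam\@example.com\n",
    "103|a|b|c|d|e|f|g|h|i|pat\@example.com\n",
    "104|a|b|c|d|e|f|g|h|i|pat\@example.com\n",
);

# Map each email to the list of IDs that use it (a hash of arrays).
my %ids_for;
for my $line (@lines) {
    chomp $line;
    my ($id, $email) = (split /\|/, $line)[0, 10];
    push @{ $ids_for{$email} }, $id;
}

# Build ID => email, prefixing every occurrence after the first.
my %email_for;
for my $email (sort keys %ids_for) {
    my @ids = @{ $ids_for{$email} };

    # The number of IDs stored under an email is its duplicate count.
    printf "%s -> %d\n", $email, scalar @ids if @ids > 1;

    for my $i (0 .. $#ids) {
        $email_for{ $ids[$i] } = $i == 0 ? $email : "dupe${i}_$email";
    }
}

# One line per ID, ready to feed into an INSERT statement.
print "$_\t$email_for{$_}\n" for sort { $a <=> $b } keys %email_for;
```

With these sample records, ID 101 keeps pat@example.com while 103 and 104 become dupe1_pat@example.com and dupe2_pat@example.com. Reading from the real file would replace @lines with a loop over the filehandle.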