Randy
Hello,
I have a text file that stores names and email addresses. This data is built
from a feedback form on my website. Here is the format of my text file
entries:
Dan Smith,[email protected]
Mike Roberts,[email protected]
Steve Anderson,[email protected]
and so on.
As you can see, it's pretty much a standard CSV text file. Over time, this
database has grown very big, and there are several duplicate email addresses
in the data. Until now I have had to go through the data by eye and remove
whatever duplicate email addresses I can find, regardless of what is in
the name field. I am seeking assistance on how I could write a script that
would scan each line, separate the name field from the email address field,
then find and remove the duplicates. So far all I have is the following:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use CGI::Carp qw(fatalsToBrowser);
# "my" needs parentheses to declare more than one variable
my (@data, $data, $name, $email);
open (FH, "<data.txt") or die "Can't open file: $!";
@data=<FH>;
close(FH);
foreach $data (@data) {
    chomp ($data);    # $data aliases the array element, so @data loses its newlines
    ($name,$email)=split(/,/,$data);
    # Missing scan for duplicates and removal code here
}
open (FH, ">data.txt") or die "Can't open file: $!";
print FH map { "$_\n" } @data;    # put back the newlines that chomp removed
close(FH);
Yes, I am a newbie Perl programmer. I'm not very good at brainstorming an
approach to sorting/matching routines. I would very much appreciate some
help understanding and building the missing piece. Another complication:
what if there are two identical email addresses, but one is all caps and the
other isn't? I'm not looking for someone to write the code for me, but
rather to point me in the right direction so that I actually learn
something and improve my Perl skills. Thanks, everyone.
Robert
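One common direction for the missing step is a "seen" hash keyed on the lowercased email address, which handles the all-caps case as well. What follows is only a minimal sketch under some assumptions: every record is a single "name,email" line with no commas inside the name, the first occurrence of an address is the one worth keeping, and the sub name dedupe_by_email and the example.com sample addresses are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the input lines with later duplicates of an
# email address removed. Addresses are compared case-insensitively and the
# first occurrence of each address is kept.
sub dedupe_by_email {
    my @lines = @_;
    my %seen;      # lowercased email => number of times encountered so far
    my @unique;    # lines to keep, in their original order
    for my $line (@lines) {
        chomp(my $copy = $line);        # work on a copy, leave the input alone
        next unless length $copy;       # skip blank lines
        my ($name, $email) = split /,/, $copy, 2;
        unless (defined $email) {       # no comma: keep the line untouched
            push @unique, "$copy\n";
            next;
        }
        $email =~ s/^\s+|\s+$//g;       # trim stray whitespace around the address
        next if $seen{ lc($email) }++;  # true from the second sighting onward
        push @unique, "$name,$email\n";
    }
    return @unique;
}

# Made-up sample data; in the real script these lines would come from data.txt
my @sample = (
    "Dan Smith,dan\@example.com\n",
    "D. Smith,DAN\@EXAMPLE.COM\n",
    "Mike Roberts,mike\@example.com\n",
);
print dedupe_by_email(@sample);
```

With the sample above, the "D. Smith" line is dropped because its address differs from Dan Smith's only in case, and the other two lines are printed unchanged. The post-increment is what makes the one-line test work: the hash entry is false the first time an address shows up and true on every later sighting.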