n00b needs help pls.

Koncept · May 1, 2004

Sorry for asking, but I give up on this situation. I am a total n00b at
Perl and have only used it seriously for about 1 week now. I would
really appreciate somebody's help here because I am really feeling
stuck.

** THIS IS NOT A SPAM LIST FIRST AND FOREMOST **
I have a list of email accounts. Many of the email accounts are from
the same domain.

I want to separate this huge list into separate lists, where each list
only contains one address from each domain.

Example:

If my addresses in the source file are as follows:

bill at one.com
jane at one.com
frank at two.com
ted at one.com
jess at three.com

My first run should return:
--------------------
bill at one.com
frank at two.com
jess at three.com

2nd run:
------
jane at one.com

3rd run:
------
ted at one.com

I came up with this to grep unique domains once, but the problem is
that the script only runs once.

#!/usr/bin/perl -w

sub div() { "+","-" x 50, "+\n"; }

die "Usage: $0 emailList" if (@ARGV!=1);

open( EML, $ARGV[0] ) || die "Can't open file : $!\n";

while(<EML>){
chomp;
push(@addys, $_) if $_ =~
/^[a-zA-Z0-9_\.\-]+\@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/;
}

close( EML );

foreach $email( @addys ) {
@parts = split( "@", $email );
$domain = $parts[1];
unless( $seen{$domain} ) {
push( @users, $email );
$seen{$domain} = 1;
}
}

if(@users>0){
print &div, "The following are uniq users per domain:\n", &div;
print join( "\n", sort( @users ) ), "\n", &div;
} else {
print "Sorry. I could not find any email addresses.\n";
}

Koncept · May 2, 2004

Titus A said:
As a follow up surely you mean (e-mail address removed), (e-mail address removed) (e-mail address removed)
rather than (e-mail address removed) and b @y.co.uk etc...

1) re: emails => Yes. I just didn't want to make active links.
2) re: excel => No. I don't intend to use Microsoft products. I would
really like to know how to do this in Perl. This is why I posted here.
Thanks for your advice though.

Bob Walton · May 2, 2004

Koncept wrote:

....

I want to separate this huge list into separate lists, where each list
only contains one address from each domain.

Example:

If my addresses in the source file are as follows:

bill at one.com
jane at one.com
frank at two.com
ted at one.com
jess at three.com

My first run should return:
--------------------
bill at one.com
frank at two.com
jess at three.com

2nd run:
------
jane at one.com

3rd run:
------
ted at one.com

I came up with this to grep unique domains once, but the problem is
that the script only runs once.

I added some commentary:

#!/usr/bin/perl -w

use strict; #let Perl help you all it can
use warnings; #preferable to -w switch

sub div() { "+","-" x 50, "+\n"; }

die "Usage: $0 emailList" if (@ARGV!=1);

open( EML, $ARGV[0] ) || die "Can't open file : $!\n";

----------------------------------------------------^^
The trailing "\n" in the die message suppresses the line number of the
error. Usually you don't want that.

my @addys; #with strict, you need to declare variables prior to use.

my %seen;

my @users;

while(<EML>){
chomp;
push(@addys, $_) if $_ =~
/^[a-zA-Z0-9_\.\-]+\@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/;

---------------^-^---^-----------^---------------^-^
It is not necessary to escape the indicated characters. It makes your
program harder to read and understand when unnecessary quoting is used.

}

close( EML );

foreach $email( @addys ) {
my--------^

@parts = split( "@", $email );

--------------------^-^
split takes a pattern. It is better written as:

my @parts = split( /@/, $email );

$domain = $parts[1];
my--^

unless( $seen{$domain} ) {
push( @users, $email );
$seen{$domain} = 1;
}
}

if(@users>0){
print &div, "The following are uniq users per domain:\n", &div;
print join( "\n", sort( @users ) ), "\n", &div;
} else {
print "Sorry. I could not find any email addresses.\n";
}
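
The three annotations above (the die message, the escapes inside the
character classes, and the split pattern) can be checked with a short
script. This is only a sketch: the sample addresses are made up for
illustration, and the "\@" is left escaped in both patterns because it is
harmless there.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# 1) die with and without a trailing newline; eval traps the message in $@.
eval { die "Can't open file\n" };
my $without_line = $@;    # the message exactly as written
eval { die "Can't open file" };
my $with_line = $@;       # Perl appends " at SCRIPT line N.\n"

# 2) Inside a character class '.' is literal, and a '-' placed last
#    needs no backslash, so these two patterns should match the same strings.
my $escaped   = qr/^[a-zA-Z0-9_\.\-]+\@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/;
my $unescaped = qr/^[a-zA-Z0-9_.-]+\@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$/;

# Hypothetical sample strings, not taken from the thread.
my @samples = ( 'bill@one.com', 'jane.doe@mail-server.co.uk', 'not an address' );
my @agree =
  grep { ( $_ =~ $escaped ? 1 : 0 ) == ( $_ =~ $unescaped ? 1 : 0 ) } @samples;

# 3) split with a pattern as its first argument.
my ( $user, $domain ) = split( /@/, 'bill@one.com' );

print "die without newline: $with_line";
print "die with newline:    $without_line";
printf "patterns agree on %d of %d samples\n", scalar @agree, scalar @samples;
print "user=$user domain=$domain\n";
```

The first printed line should carry the script name and line number; the
second should not.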

Well, it doesn't look like you really have much of a Perl problem, just
a minor logistics problem. I take it you want to run the program
multiple times, and want to have a list of email addresses, just one per
domain, come out each time. In order to do that you will need to save
what remains of the list of emails each time you run your program. It
would be convenient to save it back to the same file, providing you
don't need that file later. If you don't want to destroy the file, then
have the program copy it to a temporary file, and have the program
automatically take from the temporary file if it exists (and delete the
temporary file if it ends up empty). So you will need to close the
input file, open it again for output, and build and output to that file
a list of the email addresses that were not output by the program on the
current pass. Based on the program you've already written, you should
be able to handle that.
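
A minimal sketch of that approach, assuming the file holds one address
per line and that it is acceptable to overwrite it (work on a copy if
not). The sub names pick_one_per_domain and run_pass are made up for
illustration, and the simple /\@/ filter stands in for the stricter
pattern in the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list of addresses into this run's picks and the leftovers.
sub pick_one_per_domain {
    my @addresses = @_;
    my ( %seen, @picked, @remaining );
    foreach my $email (@addresses) {
        my ( undef, $domain ) = split( /@/, $email, 2 );
        if ( $seen{$domain}++ ) {
            push @remaining, $email;    # domain already used in this run
        }
        else {
            push @picked, $email;       # first address seen for this domain
        }
    }
    return ( \@picked, \@remaining );
}

# Run one pass against a file and write the leftovers back to the same file.
sub run_pass {
    my ($file) = @_;
    open( my $in, '<', $file ) or die "Can't open $file for reading: $!";
    chomp( my @addresses = grep { /\@/ } <$in> );    # skip lines without '@'
    close($in);

    my ( $picked, $remaining ) = pick_one_per_domain(@addresses);

    open( my $out, '>', $file ) or die "Can't open $file for writing: $!";
    print {$out} "$_\n" for @$remaining;
    close($out) or die "Can't close $file: $!";

    return @$picked;
}

# Only touch a file when one is named on the command line.
if ( @ARGV == 1 ) {
    print "$_\n" for run_pass( $ARGV[0] );
}
```

Run against the five example addresses, the first pass should print the
bill, frank and jess addresses, the second jane, and the third ted,
leaving the file empty.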

Koncept · May 2, 2004

Bob Walton said:
Based on the program you've already written, you should
be able to handle that.

Thanks for the advice Bob. Great tips.
