Can this be more efficient?

Discussion in 'Perl Misc' started by Colin chaplin, Mar 15, 2005.

  1. Hi All

    I've got a little quick and dirty perl program that scans through logfiles
    for certain words and counts them up. Trouble is, it's quite slow and I was
    wondering if you smart chaps could suggest a way of speeding it up. The
    logfiles are about 50MB in size, and I'm running it on a reasonably powerful
    PC

    The Psuedo code is

    For each logfile in the directory:
    read into an @rray
    close file
    for each line of array
    do text processing stuff

    print results on screen

    I've played about with not reading into an array just reading the file line
    by line, no obvious change either way

    As I said its written using basic perl data types, so I was wondering if
    there is a quicker way to store and process this data ?
     
    Colin chaplin, Mar 15, 2005
    #1
    1. Advertising

  2. Colin chaplin

    Paul Lalli Guest

    "Colin chaplin" <> wrote in message
    news:d16qmq$46t$1$...
    > Hi All
    >
    > I've got a little quick and dirty perl program that scans through

    logfiles
    > for certain words and counts them up. Trouble is, it's quite slow and

    I was
    > wondering if you smart chaps could suggest a way of speeding it up.

    The
    > logfiles are about 50MB in size, and I'm running it on a reasonably

    powerful
    > PC
    >
    > The Psuedo code is
    >
    > For each logfile in the directory:
    > read into an @rray
    > close file
    > for each line of array
    > do text processing stuff
    >
    > print results on screen
    >
    > I've played about with not reading into an array just reading the file

    line
    > by line, no obvious change either way


    I don't know about speed efficiency, but this is certainly more
    efficient space-wise.

    How are you determining "no obvious change"? Have you used the
    Benchmark module?

    Also, is there any reason you can't just use grep, rather than entire
    perl script?

    if you really want Perl...

    perl -ne '$words++ if /\bsearchedWord\b/; END { print "$words words
    found\n" }' *.log

    (note that this assumes the word will only appear a maximum of once per
    line.

    Paul Lalli
     
    Paul Lalli, Mar 15, 2005
    #2
    1. Advertising

  3. > >
    > > I've got a little quick and dirty perl program that scans through

    > logfiles
    > > for certain words and counts them up. Trouble is, it's quite slow and

    >
    > > by line, no obvious change either way

    >
    > I don't know about speed efficiency, but this is certainly more
    > efficient space-wise.
    >
    > How are you determining "no obvious change"? Have you used the
    > Benchmark module?
    >

    By the very scientific "gosh this is taking a long time" make-a-cup-of-tea,
    tidy-up, "oh it's still going" approach. I was looking for a magic bullet
    perhaps that'd make it work in a matter of seconds. Not so much to speed
    this up as it's only going to run a few times, but so in future when I do
    similar it's better.


    > Also, is there any reason you can't just use grep, rather than entire
    > perl script?
    >


    I was being a little brief with my description, there's quite a lot of
    pattern matching going on there that I use a function for. Almost do-able in
    grep I guess but apart from speed the program works pretty well so I'm happy
    to keep with it. Id post it here but I'd probably only get slated for coding
    style >:)

    > if you really want Perl...
    >
    > perl -ne '$words++ if /\bsearchedWord\b/; END { print "$words words
    > found\n" }' *.log
    >


    I can't get that syntax to run from the command line, not sure why (NT
    platform "can't find string terminator ' before EOF") but if I understand
    the -ne switch it's effectively openinging every file, going through it line
    by line, which is pretty much what I'm doing now ?

    I did just have a look at $/ but not sure that would help me out.

    Thanks for your time!

    Colin
     
    Colin chaplin, Mar 15, 2005
    #3
  4. Colin chaplin <> wrote:

    > for certain words and counts them up. Trouble is, it's quite slow



    > The Psuedo code is
    >
    > For each logfile in the directory:
    > read into an @rray
    > close file
    > for each line of array
    > do text processing stuff
    >
    > print results on screen



    The Psuedo answer is: improve the "do text processing stuff" part.

    Of course we can't help you with that, since we cannot see that part...


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Mar 15, 2005
    #4
  5. Colin chaplin

    Paul Lalli Guest

    "Colin chaplin" <> wrote in message
    news:d16uk2$9h5$1$...
    > Paul Lalli wrote
    >
    > > Also, is there any reason you can't just use grep, rather than

    entire
    > > perl script?
    > >

    >
    > I was being a little brief with my description,


    You should always specify your description as completely as possible.
    How else can you expect people to help you do what you want to do?

    > there's quite a lot of
    > pattern matching going on there that I use a function for. Almost

    do-able in
    > grep I guess but apart from speed the program works pretty well so I'm

    happy
    > to keep with it. Id post it here but I'd probably only get slated for

    coding
    > style >:)


    You say that as though it's a bad thing...

    > > if you really want Perl...
    > >
    > > perl -ne '$words++ if /\bsearchedWord\b/; END { print "$words words
    > > found\n" }' *.log
    > >

    >
    > I can't get that syntax to run from the command line, not sure why (NT
    > platform "can't find string terminator ' before EOF")


    Windows requires args to be delimited by " instead of '. Which means
    you'll have to backslash all the " in the code.

    > but if I understand
    > the -ne switch it's effectively openinging every file, going through

    it line
    > by line, which is pretty much what I'm doing now ?


    No. What you said you were currently doing is reading the file line by
    line, storing each line onto an array, and then going through the array
    line by line. Quite different.


    Paul Lalli
     
    Paul Lalli, Mar 15, 2005
    #5
  6. Colin chaplin

    Guest

    "Colin chaplin" <> wrote:
    > Hi All
    >
    > I've got a little quick and dirty perl program that scans through
    > logfiles for certain words and counts them up. Trouble is, it's quite
    > slow and I was wondering if you smart chaps could suggest a way of
    > speeding it up. The logfiles are about 50MB in size, and I'm running it
    > on a reasonably powerful PC
    >
    > The Psuedo code is
    >
    > For each logfile in the directory:
    > read into an @rray
    > close file
    > for each line of array
    > do text processing stuff
    >
    > print results on screen


    So, how about not doing the text processing stuff? If it is still slow,
    then you know the slow step is one of the earlier steps. If not still
    slow, you know it is the text processing stuff. Then you can show is the
    real code used to do the part that is slow. I give us real numbers, so we
    can try to replicate the problem.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Mar 15, 2005
    #6
  7. Colin chaplin

    Ala Qumsieh Guest

    Colin chaplin wrote:
    > Id post it here but I'd probably only get slated for coding
    > style >:)


    What you're saying is akin to me going to a car mechanic and telling her
    "my car is broken. how do I fix it?" without me allowing her to see the car.

    You won't get any useful replies like that. Plus, a little critique of
    your coding style should be a Good Thing (tm).

    --Ala
     
    Ala Qumsieh, Mar 15, 2005
    #7
  8. "Ala Qumsieh" <> wrote in message
    news:g4GZd.21999$...
    > Colin chaplin wrote:
    > > Id post it here but I'd probably only get slated for coding
    > > style >:)

    >
    > What you're saying is akin to me going to a car mechanic and telling her
    > "my car is broken. how do I fix it?" without me allowing her to see the

    car.
    >
    > You won't get any useful replies like that. Plus, a little critique of
    > your coding style should be a Good Thing (tm).



    Ok, here goes, warts and all (I removed names to protect the guilty). The
    program reads through Mailsweeper log files and spits out how many emails
    for certain domains are recorded in the logs, and also a list of recipients
    and number of emails they receive. The log is a trimmed version of an SMTP
    communication.


    Please note:
    * I'm not a programmer, be gentle
    * This isnt a production program
    * Clever regular expressions make my head explode when I look at them a
    while after writing the code
    * I'm barely a techy these days
    * Don't run this scissors

    In answer to another post, I had the program setup so that it read line by
    line, and changed it to gobble the entire file at once to see if would have
    a performance impact

    Thanks to all for taking the time!

    &opendir (".");
    $DOMS{'thisdom'}=12;

    sub opendir
    {
    # Recursively goes through directories and process all mailsweeper files
    my $dir=$_[0];
    my $name;
    opendir(DIRHANDLE, $dir) || die "Cannot opendir $dir: $!";
    foreach $name (sort readdir(DIRHANDLE))
    {
    if (($name eq '.') || ($name eq '..') || (lc($name) eq lc($currentlog)))
    {



    }
    else
    {

    # Only open the file if it looks like a MSW Logfile
    if ($name =~ /\Aopr[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\.log/i)
    {

    &processFile($dir . "\\" . $name)

    }
    if (-d $dir . "\\" . $name)
    {
    print "opening " .$dir . "\\" . $name ."\n";
    &opendir($dir . "\\" . $name);
    }
    }
    }

    closedir(DIRHANDLE);
    }



    sub processFile
    {
    my $filename=$_[0];
    my @filetext;

    print " Opening $filename\n";
    open (INFILE,$filename) || die ("EK $!: $filename");
    while (<INFILE>)
    {
    push(@filetext,$_);
    }
    close (INFILE);
    foreach (@filetext)
    {
    if (/RCPT\sTO\:/i)
    {
    #print "[$_]\n";

    if (/(\<.*?\>)/)
    {

    $bits=$1;
    $bits =~ s/\<//g;
    $bits =~ s/\>//g;
    $bits =~ s/RCPT TO\://g;
    $bits =~ s/\s//g;
    $bits = lc ($bits);
    if (&valid_domain($bits))
    {
    $DOMS{$bits}++;
    }
    # print $bits;
    }
    }
    # else {print "NO: $_\n";}
    }

    }
    $DOMMER{'testdom'}=5;

    open (OUTFILE,">allemails.csv") || die ("eeK:$!");
    open (OUTF,">alldoms.csv") || die ("yikes:$!");
    foreach $key ( sort by_this keys %DOMS )
    {

    ($pre,$aft) = split(/\@/,$key);
    $DOMMER{$aft}=$DOMMER{$aft}+1;
    print OUTFILE "\"$key\",\"$DOMS{$key}\"\n";

    }

    foreach $key ( sort by_this keys %DOMMER )
    {
    print OUTF "\"$key\",\"$DOMMER{$key}\"\n";

    }


    sub by_this
    {
    ($tat,$abit)=split (/\@/,$a);
    ($tat,$bbit)=split (/\@/,$b);
    ($abit cmp $bbit) || ($a cmp $b);
    }

    sub valid_domain
    {
    #decide if this is a xxxxx doman

    my $dom=$_[0];
    ($tat,$dom)=split(/\@/,$dom);
    if ($dom =~ (/thisdomain\.co\.uk/)) {return 1;}
    if ($dom =~(/thatdomain.co\.uk/)) {return 1;}
    if ($dom =~(/otherdomain\.com/)) {return 1;}

    return (0);
    }
     
    Colin chaplin, Mar 15, 2005
    #8
  9. Colin chaplin wrote :

    > I was being a little brief with my description, there's quite a lot of
    > pattern matching going on there that I use a function for.


    The regexes might be the reason, why the script takes such a long time
    to run.
    I am pretty new to perl and I am reading the Camel Book.
    In the Chapter 5 Pattern-Matching there are examples which are said to
    take years (actually millions of years) because of backtracking.

    So one approach to improve the performance of your script might be the
    optimization of you regxes.

    HTH
    Martin

    --
    perl -e '$S=[[73,116,114,115,31,96],[108,109,114,102,99,112],
    [29,77,98,111,105,29],[100,93,95,103,97,110]];
    for(0..3){for$s(0..5){print(chr($S->[$_]->[$s]+$_+1))}}'
     
    Martin Kissner, Mar 15, 2005
    #9
  10. * Colin chaplin wrote:
    >
    > Please note:
    > * I'm not a programmer, be gentle


    .... but learning something new cannot be wrong, isn't?

    > * This isnt a production program


    .... but speeding up a program is mostly interesting.

    > * Clever regular expressions make my head explode when I look at them a
    > while after writing the code


    .... when clever regexes would be necessary I'll took them (and write a
    short comment aside to remember me after a while).

    > * I'm barely a techy these days
    > * Don't run this scissors
    >
    > In answer to another post, I had the program setup so that it read line by
    > line, and changed it to gobble the entire file at once to see if would have
    > a performance impact
    >
    > Thanks to all for taking the time!
    >
    > &opendir (".");
    > $DOMS{'thisdom'}=12;
    >
    > sub opendir {
    > # Recursively goes through directories and process all mailsweeper files


    [...]

    > }


    For me it looks smarter, doing such a recursive directory traversal with
    a standard perl module like File::Find. Your sub could look like:

    use File::Find;
    sub my_opendir {
    my $dir = shift;
    find( sub {
    # $_ contains the current filename only, while
    # $File::Find::name contains the complete pathname
    if ( $_ =~ /opr\d{8}\.log$/i ) {
    processFile( $File::Find::name )
    }
    }, $dir );
    }

    Btw, it's always a bad idea to name your own functions like perl ones.
    Hence I named your sub »my_opendir« instead of »opendir«.

    >
    > sub processFile {
    > my $filename=$_[0];
    > my @filetext;
    > print " Opening $filename\n";
    > open (INFILE,$filename) || die ("EK $!: $filename");
    > while (<INFILE>){
    > push(@filetext,$_);
    > }
    > close (INFILE);


    Eh, this will read the complete file into the array @filetext. That can
    be made easier by »my @filetext = <INFILE>;« -- but that's not what some
    others in this thread meaning with "use a while loop". See below.

    > foreach (@filetext){
    > if (/RCPT\sTO\:/i){
    > #print "[$_]\n";
    > if (/(\<.*?\>)/){
    > $bits=$1;
    > $bits =~ s/\<//g;
    > $bits =~ s/\>//g;


    If you don't catch those angle brackets in $1 you haven't delete them.

    > $bits =~ s/RCPT TO\://g;
    > $bits =~ s/\s//g;
    > $bits = lc ($bits);


    The search pattern is modified by »i« previously, i.e. case-insensitive..
    Here you delete all uppercased occurrences of "RCPT TO:" but not the
    lowercased ones. Perhaps you should do the lc() before.

    > if (&valid_domain($bits)){
    > $DOMS{$bits}++;
    > }
    > # print $bits;
    > }
    > }
    > # else {print "NO: $_\n";}
    > }
    > }


    Using a while loop to read in a file line by line:

    sub processFile {
    my $filename = shift;
    print " Opening $filename\n";
    open INFILE, $filename or die "EK $!: $filename";
    while ( <INFILE> ) {
    if ( /RCPT\sTO\:/i and /<(.*?)>/ ) {
    my $bits = lc $1;
    $bits =~ s/RCPT TO\://g;
    $bits =~ s/\s//g;
    $DOMS{$bits}++ if valid_domain($bits);
    }
    }
    close INFILE;
    }

    >
    > $DOMMER{'testdom'}=5;
    >
    > open (OUTFILE,">allemails.csv") || die ("eeK:$!");
    > open (OUTF,">alldoms.csv") || die ("yikes:$!");
    > foreach $key ( sort by_this keys %DOMS )
    > {
    >
    > ($pre,$aft) = split(/\@/,$key);
    > $DOMMER{$aft}=$DOMMER{$aft}+1;
    > print OUTFILE "\"$key\",\"$DOMS{$key}\"\n";


    Before using such backslashed quotes, learn more about perl's quoting
    operators like q// and qq//.

    print OUTFILE qq{"$key","$DOMS{$key}"\n};

    >
    > }
    >
    > foreach $key ( sort by_this keys %DOMMER )
    > {
    > print OUTF "\"$key\",\"$DOMMER{$key}\"\n";
    >
    > }
    >
    >
    > sub by_this
    > {
    > ($tat,$abit)=split (/\@/,$a);
    > ($tat,$bbit)=split (/\@/,$b);
    > ($abit cmp $bbit) || ($a cmp $b);
    > }


    If a function returns values which aren't interesting this moment, just
    throw them away:

    ( undef, $abit ) = split /@/, $a;

    >
    > sub valid_domain
    > {
    > #decide if this is a xxxxx doman
    >
    > my $dom=$_[0];
    > ($tat,$dom)=split(/\@/,$dom);


    See above and throw $tat away ;-)

    >
    > if ($dom =~ (/thisdomain\.co\.uk/)) {return 1;}
    > if ($dom =~(/thatdomain.co\.uk/)) {return 1;}
    > if ($dom =~(/otherdomain\.com/)) {return 1;}


    In Perl you can note if-statements behind. In cases like this I think
    it's more readable due to fewer parenthesis ;-)

    return 1 if $dom =~ /thisdomain\.co\.uk/;
    return 1 if $dom =~ /thatdomain\.co\.uk/;
    return 1 if $dom =~ /otherdomain\.com/;

    You could do it with »or« also:

    return 1 if $dom =~ /thisdomain\.co\.uk/
    or $dom =~ /thatdomain\.co\.uk/
    or $dom =~ /otherdomain\.com/;

    I hope you like my suggestions.

    regards,
    fabian
     
    Fabian Pilkowski, Mar 15, 2005
    #10
  11. Colin chaplin

    Ala Qumsieh Guest

    Colin chaplin wrote:

    > &opendir (".");


    In general, it's not a good idea to use & when calling a subroutine.
    This was necessary for Perl4 and before, but that is at least 10 years
    old. Now, they are not needed, and using them has some side effects
    which are described in perlsub.

    > $DOMS{'thisdom'}=12;
    >
    > sub opendir
    > {
    > # Recursively goes through directories and process all mailsweeper files


    As suggested by another poster, it's better to use File::Find in this case.

    > my $dir=$_[0];
    > my $name;
    > opendir(DIRHANDLE, $dir) || die "Cannot opendir $dir: $!";
    > foreach $name (sort readdir(DIRHANDLE))
    > {
    > if (($name eq '.') || ($name eq '..') || (lc($name) eq lc($currentlog)))
    > {
    >
    >
    >
    > }


    An empty if() body is never a nice thing to look at. I would change this to:

    next if $name eq '.'
    || $name eq '..'
    || lc($name) eq lc($currentlog);

    This also allows you to save one level of indentation, which can make
    the code a bit more aesthetically pleasing. But that's subjective.

    > else
    > {
    >
    > # Only open the file if it looks like a MSW Logfile
    > if ($name =~ /\Aopr[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\.log/i)


    [0-9] is identical to \d. And you can match a specific number of \d's by
    using curly braces:

    if ($name =~ /\Aopr\d{8}\.log/i) {

    > {
    >
    > &processFile($dir . "\\" . $name)


    Perl allows you to use unix-style dir hierarchy delimiters, even on
    Windows. This is more readable (IMO):

    processFile("$dir/$name");

    > }
    > if (-d $dir . "\\" . $name)


    That should've been an 'elsif' .. but I don't think it will improve much
    in your case.

    [snip]

    > open (INFILE,$filename) || die ("EK $!: $filename");
    > while (<INFILE>)
    > {
    > push(@filetext,$_);
    > }


    The following is more efficient:
    @filetext = <INFILE>;

    But you don't need to do that ... see below.

    > close (INFILE);
    > foreach (@filetext)


    Why not execute this loop while reading the file in the first place?

    while (<INFILE>) {

    > {
    > if (/RCPT\sTO\:/i)
    > {
    > #print "[$_]\n";
    >
    > if (/(\<.*?\>)/)
    > {
    >
    > $bits=$1;
    > $bits =~ s/\<//g;
    > $bits =~ s/\>//g;
    > $bits =~ s/RCPT TO\://g;
    > $bits =~ s/\s//g;
    > $bits = lc ($bits);


    All of the above can probably be optimized to this:

    if (/RCPT TO:\s*<(.*?)\>/) {
    my $bits = lc $1;

    [more snips]

    > sub valid_domain
    > {
    > #decide if this is a xxxxx doman
    >
    > my $dom=$_[0];
    > ($tat,$dom)=split(/\@/,$dom);
    > if ($dom =~ (/thisdomain\.co\.uk/)) {return 1;}
    > if ($dom =~(/thatdomain.co\.uk/)) {return 1;}
    > if ($dom =~(/otherdomain\.com/)) {return 1;}


    If you are matching literal strings, using 'eq' is faster than a regexp:

    return 1 if $dom eq 'thisdomain.co.uk'
    || $dom eq 'thatdomain.co.uk'
    || $dom eq 'otherdomain.co.uk';

    > return (0);
    > }


    --Ala
     
    Ala Qumsieh, Mar 15, 2005
    #11
  12. Colin chaplin <> wrote:

    > Please note:
    > * I'm not a programmer, be gentle
    > * This isnt a production program



    Then there is no need to optimize it. :)


    > * Clever regular expressions make my head explode when I look at them a
    > while after writing the code



    Join the club.


    > sub opendir


    [snip]

    > if (($name eq '.') || ($name eq '..') || (lc($name) eq lc($currentlog)))
    > {
    >
    >
    >
    > }



    An empty block should be a red flag.

    You can use next to eliminate an entire level of indent:

    next if $name eq '.' or $name eq '..' or lc $name eq lc $currentlog;


    > if ($name =~ /\Aopr[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\.log/i)



    if ( $name =~ /\Aopr\d{8}\.log/i)

    Now we can see that is expecting 8 digit chars without getting
    fingerprints on the screen counting them.


    > sub processFile


    [snip]

    > while (<INFILE>)
    > {
    > push(@filetext,$_);
    > }



    You can replace all of that with:

    @filetext = <INFILE>;


    > if (/(\<.*?\>)/)


    Angle brackets are not special in regexes, there is no need to backslash them:

    if (/(<.*?>)/)


    > $bits =~ s/\<//g;
    > $bits =~ s/\>//g;



    tr/// will always be faster that s///g, so:

    $bits =~ tr/<>//d;


    > sub by_this



    That is a horridly poor choice of sub name...


    > {
    > ($tat,$abit)=split (/\@/,$a);
    > ($tat,$bbit)=split (/\@/,$b);
    > ($abit cmp $bbit) || ($a cmp $b);
    > }



    That can be sped up with a Schwartzian Transform, as described
    in one of the sorting FAQs:

    perldoc -q sort


    > if ($dom =~ (/thisdomain\.co\.uk/)) {return 1;}



    Pattern matching is not the Right Tool when what you want to match
    is a constant string rather than a pattern:

    if ( index $dom, 'thisdomain.co.uk' >= 0 ) { return 1 }

    index() will be faster than m// too.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Mar 15, 2005
    #12
  13. Colin chaplin wrote:
    > "Ala Qumsieh" <> wrote in message
    > news:g4GZd.21999$...
    >
    >>Colin chaplin wrote:
    >>
    >>>Id post it here but I'd probably only get slated for coding
    >>>style >:)

    >>
    >>What you're saying is akin to me going to a car mechanic and telling her
    >>"my car is broken. how do I fix it?" without me allowing her to see the

    >
    > car.
    >
    >>You won't get any useful replies like that. Plus, a little critique of
    >>your coding style should be a Good Thing (tm).

    >
    > Ok, here goes, warts and all (I removed names to protect the guilty). The
    > program reads through Mailsweeper log files and spits out how many emails
    > for certain domains are recorded in the logs, and also a list of recipients
    > and number of emails they receive. The log is a trimmed version of an SMTP
    > communication.
    >
    >
    > Please note:
    > * I'm not a programmer, be gentle
    > * This isnt a production program
    > * Clever regular expressions make my head explode when I look at them a
    > while after writing the code
    > * I'm barely a techy these days
    > * Don't run this scissors
    >
    > In answer to another post, I had the program setup so that it read line by
    > line, and changed it to gobble the entire file at once to see if would have
    > a performance impact
    >
    > Thanks to all for taking the time!
    >
    > [snip code]
    >


    Perhaps this will work better: (UNTESTED)

    use warnings;
    use strict;
    use File::Find;

    my @valid_domains = qw(
    thisdomain.co.uk
    thatdomain.co.uk
    otherdomain.com
    );

    my %DOMS = ( thisdom => 12 );

    find( sub {
    return unless /\Aopr\d{8}\.log\z/i
    print " Opening $_\n";
    open INFILE, '<', $_ or die "EK $!: $_";
    while ( my $line = <INFILE> ) {
    next unless $line =~ /RCPT\sTO:/i and $line =~ /</ and $line =~ />/;
    for my $valid_domain ( @valid_domains ) {
    if ( $line =~ /<(\S+\@\S*\Q$valid_domain\E)>/ ) {
    $DOMS{ lc $1 }++;
    }
    }
    }
    close INFILE;
    }, '.' );


    sub by_this {
    ( split /\@/, $a )[ 1 ] cmp ( split /\@/, $b )[ 1 ] || $a cmp $b
    }

    open OUTFILE, '>', 'allemails.csv' or die "eeK:$!";
    open OUTF, '>', 'alldoms.csv' or die "yikes:$!";

    my %DOMMER = ( testdom => 5 );

    for my $key ( sort by_this keys %DOMS ) {
    $DOMMER{ ( split /\@/, $key )[ 1 ] }++;
    print OUTFILE qq("$key","$DOMS{$key}"\n);
    }

    for my $key ( sort by_this keys %DOMMER ) {
    print OUTF qq("$key","$DOMMER{$key}"\n);
    }

    __END__




    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Mar 15, 2005
    #13
  14. Ala Qumsieh <> wrote:
    > Colin chaplin wrote:
    >
    >> &opendir (".");

    >
    > In general, it's not a good idea to use & when calling a subroutine.
    > This was necessary for Perl4 and before, but that is at least 10 years
    > old. Now, they are not needed, and using them has some side effects
    > which are described in perlsub.



    Yes, but this happens to be the one case where using ampersand _is_
    a good idea, namely when you have a collision between a user-defined
    sub and a builtin and you want to call the user-defined one.


    >> &processFile($dir . "\\" . $name)

    >
    > Perl allows you to use unix-style dir hierarchy delimiters, even on
    > Windows.



    It isn' Perl that is OK with forward slashes it is the Operating
    System (apart from the command interpreter) that is OK with them.

    You can use forward slashes on Windows even if you are not in
    Perl (provided you are not in the command interpreter).


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Mar 15, 2005
    #14
  15. Colin chaplin

    Thomas Kratz Guest

    Tad McClellan wrote:

    > You can use forward slashes on Windows even if you are not in
    > Perl (provided you are not in the command interpreter).


    Even cmd.exe can handle forward slashes, provided you enclose the path or
    filename in double quotes.

    Thomas

    --
    $/=$,,$_=<DATA>,s,(.*),$1,see;__END__
    s,^(.*\043),,mg,@_=map{[split'']}split;{#>J~.>_an~>>e~......>r~
    $_=$_[$%][$"];y,<~>^,-++-,?{$/=--$|?'"':#..u.t.^.o.P.r.>ha~.e..
    '%',s,(.),\$$/$1=1,,$;=$_}:/\w/?{y,_, ,,#..>s^~ht<._..._..c....
    print}:y,.,,||last,,,,,,$_=$;;eval,redo}#.....>.e.r^.>l^..>k^.-
     
    Thomas Kratz, Mar 16, 2005
    #15
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Roy
    Replies:
    6
    Views:
    394
    Scott Allen
    Feb 16, 2005
  2. Man Alive
    Replies:
    6
    Views:
    10,433
    Joshua Cranmer
    May 20, 2008
  3. Ken Fine
    Replies:
    3
    Views:
    538
    Steven Cheng [MSFT]
    Jul 23, 2008
  4. PerlFAQ Server
    Replies:
    0
    Views:
    105
    PerlFAQ Server
    Jan 28, 2011
  5. PerlFAQ Server
    Replies:
    0
    Views:
    103
    PerlFAQ Server
    Feb 22, 2011
Loading...

Share This Page