Regex for finding email addresses inside text file

Discussion in 'Perl Misc' started by Doug Wells, Jan 27, 2005.

  1. Doug Wells

    Doug Wells Guest

    Can anyone help me with a regex that looks through an entire text file
    which might have multiple email addresses in it, and writes those email
    addresses out to a second file?

    Thanks for the help
    Doug
     
    Doug Wells, Jan 27, 2005
    #1

  2. Doug Wells wrote:

    > Can anyone help me with a regex



    Sure.

    Show us the regex in question, and we will help you fix it.


    > that looks through an entire text file
    > which might have multiple email addresses in it, and writes those email
    > addresses out to a second file?



    Regexes do not read/write files.
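
    To make that concrete: the file handling has to come from the Perl
    code around the regex. Below is a minimal sketch, not a definitive
    solution. The helper name extract_addresses and the pattern in
    $loose_email are assumptions made up for illustration, and the pattern
    is deliberately loose ("something@host.tld"), so it will miss some valid
    addresses and accept some invalid ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deliberately loose pattern, assumed for illustration only.
# It is not a full RFC-compliant address matcher.
my $loose_email = qr/[\w.+-]+\@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/;

# Hypothetical helper: copy everything in $in_path that looks like an
# address to $out_path, one per line. Returns the number of matches written.
sub extract_addresses {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {
        # With /g in list context, the match returns every hit on the line.
        for my $addr ($line =~ /($loose_email)/g) {
            print {$out} "$addr\n";
            $count++;
        }
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $count;
}

# Usage (file names are placeholders):
# extract_addresses('input.txt', 'addresses.txt');
```

    The regex only decides what counts as a match; open, the read loop,
    and print do the reading and writing.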


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jan 27, 2005
    #2

  3. Doug Wells wrote:
    > Can anyone help me with a regex that looks through an entire text file
    > which might have multiple email addresses in it, and writes those
    > email addresses out to a second file?


    You may want to read the FAQ "How do I check a valid mail address?".

    While using REs to identify email addresses may be possible in theory,
    no sane person would try to do it that way, just as with parsing HTML.

    jue
     
    Jürgen Exner, Jan 27, 2005
    #3
  4. On Thu, 27 Jan 2005, Doug Wells wrote:

    > Can anyone help me with a regex that looks through an entire text file
    > which might have multiple email addresses in it, and writes those email
    > addresses out to a second file?
    >


    just to give you some idea of how difficult that would be, consider for a
    moment just how many top-level domains there are.

    hint: 200+
    there are 247 ccTLDs,
    from .ac (Ascension Island) through .zw (Zimbabwe).
    consider the ccTLD .us:
    there are numerous second-level subdomains for the 50 states and the
    numerous territories.
    then there are the generic TLDs:
    .aero, .biz, .com, .coop, .info, .museum, .name, .net, .org, .pro,
    .gov, .edu, .mil, and .int

    to learn more, please refer to http://www.iana.org and look under domain
    name services.

    this is a nontrivial task.

    >
    > Thanks for the help
    > Doug
    >


    --
    terry l. ridder ><>
     
    terry l. ridder, Jan 27, 2005
    #4
  5. Tore Aursand

    Tore Aursand Guest

    Doug Wells wrote:
    > Can anyone help me with a regex that looks through an entire text file
    > which might have multiple email addresses in it, and writes those email
    > addresses out to a second file?


    Take a look at the Mail::Address module on CPAN. It will let you scan
    text for email addresses:

    #!/usr/bin/perl
    #
    use strict;
    use warnings;
    use Mail::Address;

    my $text = '...';
    my @addresses = Mail::Address->parse( $text );

    my %addresses;
    foreach ( @addresses ) {
        $addresses{ $_->address() }++;
    }

    The rest is up to you, as it really is very simple.
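
    For completeness, the remaining step (writing the unique addresses to
    a second file) might look something like the sketch below. The helper
    name write_addresses and the output file name are hypothetical; the
    helper only assumes a hash whose keys are address strings, like
    %addresses above.

```perl
use strict;
use warnings;

# Hypothetical helper: write the keys of an address-count hash, sorted,
# one per line, to $out_path. Returns the number of addresses written.
sub write_addresses {
    my ($addresses, $out_path) = @_;

    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    my @unique = sort keys %{$addresses};
    print {$out} "$_\n" for @unique;
    close $out or die "Cannot close $out_path: $!";

    return scalar @unique;
}

# Usage, with the %addresses hash built earlier (file name is a placeholder):
# write_addresses( \%addresses, 'addresses.txt' );
```

    Using the hash keys means each address is written once, however often
    it appeared in the input.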


    --
    Tore Aursand
    "Those people who think they know everything are a great annoyance to
    those of us who do." (Isaac Asimov)
     
    Tore Aursand, Jan 27, 2005
    #5
