Extracting Message body from email using POP3Client

Discussion in 'Perl Misc' started by Eadmund@letterbee.com, Jan 17, 2007.

  1. Guest

    Hi,

    I'm using POP3Client to successfully extract e-mail from my mail
    server, but depending on the format in which they are sent (i.e. plain
    text, rich text or HTML), I end up with a body that requires "massaging"
    with regular expressions to get a clean message. I am concerned that
    different systems will send me mails in "slightly" different formats
    that won't work with my tidy routines.

    Question: Has anyone got any code, or can anyone recommend a module,
    that will extract a "clean message body" from the email regardless of
    the format or the system it was sent from?

    Ta

    Eadmund

     
    Jan 17, 2007
    #1

  2. Guest

    On 17 Jan, 18:25, wrote:
    > Hi,
    >
    > I'm using POP3Client to successfully extract e-mail from my mail
    > server, but depending on the format in which they are sent (i.e. plain
    > text, rich text or HTML), I end up with a body that requires "massaging"
    > with regular expressions to get a clean message. I am concerned that
    > different systems will send me mails in "slightly" different formats
    > that won't work with my tidy routines.
    >
    > Question: Has anyone got any code, or can anyone recommend a module,
    > that will extract a "clean message body" from the email regardless of
    > the format or the system it was sent from?
    >
    > Ta
    >
    > Eadmund
    >
    >



    Hi,

    I'm in a similar position and haven't quite figured this one out. Did
    you manage to find something?

    I too am using the Mail::POP3Client module, but by this stage I've
    already dumped the email into a MySQL database.

    Here's what I have:

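    # $message holds the raw e-mail; $pibody accumulates the extracted text.
    my $marker  = "quoted-printable";
    my $bodystr = index($message, $marker);
    my $bodyend = index($message, "</body");

    if ($bodystr != -1)    # If it's -1 then it is a plain text message
    {
        # Start just after the first marker; the length is shortened by the
        # size of one hard-coded MIME boundary string.
        my $boundary = "------_=_NextPart_001_01C759B8.536E5E3B--";
        my $bodytxt  = substr($message, $bodystr + 1,
                              $bodyend - $bodystr - length($boundary) - 2);
        # Find the next marker and keep everything that follows it.
        $bodystr = index($bodytxt, $marker);
        my $bodytxt2 = substr($bodytxt, $bodystr + length($marker));
        $pibody .= $bodytxt2;
    } else {
        $pibody .= $message;
    }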
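
    The fixed offsets above rely on one particular boundary string, which
    changes from message to message. A minimal sketch of an alternative,
    assuming the message is well-formed MIME: read the boundary from the
    Content-Type header and pick a part by its type. The helper name
    find_part is hypothetical, and only core Perl is used; a dedicated CPAN
    MIME parser such as Email::MIME or MIME::Parser is likely to be more
    robust than this.

```perl
use strict;
use warnings;

# Hypothetical helper: find the first MIME part whose Content-Type begins
# with $wanted (for example "text/plain") and return its transfer encoding
# and still-encoded body. Nested multipart containers are searched too.
sub find_part {
    my ($entity, $wanted) = @_;

    # Headers end at the first blank line.
    my ($head, $body) = split /\r?\n\r?\n/, $entity, 2;
    return unless defined $body;
    $head =~ s/\r?\n[ \t]+/ /g;    # unfold continued header lines

    my ($type)     = $head =~ /^Content-Type:\s*([^;\s]+)/mi;
    my ($boundary) = $head =~ /^Content-Type:.*?\bboundary="?([^";\r\n]+)"?/mi;
    my ($encoding) = $head =~ /^Content-Transfer-Encoding:\s*(\S+)/mi;
    $type     = 'text/plain' unless defined $type;        # MIME default
    $encoding = '7bit'       unless defined $encoding;    # MIME default

    if (defined $boundary) {
        # The boundary comes from the headers instead of being hard-coded.
        my @chunks =
            split /^--\Q$boundary\E(?:--)?[ \t]*(?:\r?\n|\z)/m, $body;
        shift @chunks;    # drop the preamble before the first boundary
        for my $chunk (@chunks) {
            my $hit = find_part($chunk, $wanted);
            return $hit if $hit;
        }
        return;
    }

    return unless index(lc $type, lc $wanted) == 0;
    return { encoding => lc $encoding, body => $body };
}
```

    Calling find_part($message, 'text/plain') returns a hash reference with
    the keys encoding and body, or nothing when no matching part exists. The
    body is returned as sent, so it still has to be decoded according to the
    reported encoding.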

    You can probably tell from the code, I'm new to this.

    I'm still getting some extra "=" in the body of an HTML email which I
    haven't investigated yet.
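
    The stray "=" characters are most likely quoted-printable encoding that
    has not been decoded yet: "=3D" stands for a literal "=", and a "=" at
    the end of a line is a soft line break. A minimal sketch of decoding
    such text with the core MIME::QuotedPrint module follows; the sample
    string is made up for illustration and is not taken from a real message.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Made-up quoted-printable sample: "=3D" encodes "=", "=20" encodes a space,
# and the "=" that ends the first line is a soft break joining the two lines.
my $encoded = "<p class=3D\"note\">A long=20line that the send=\n"
            . "er wrapped</p>\n";

# decode_qp undoes the escapes and removes the soft line breaks.
my $decoded = decode_qp($encoded);
print $decoded;
```

    This only applies to parts whose Content-Transfer-Encoding header says
    quoted-printable; running it over 7bit or base64 content would be wrong.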

    Thanks,
    Simon.
     
    Feb 27, 2007
    #2
