Parsing Email

Dan

What is the best way to get the body of the following email message
into a file? The following code gets the subject and from fields
nicely, but I can't figure out how to get the body:

my ($summary, $i);

for (file_read "$f_email_html") {
    print "$i";
    if (/<b>From\:<\/b> <a href\=\'mailto\: \&quot(.+)\&quot/) {
        $i++;
        $summary .= "From: $1\;\n ";
    }
    elsif (/<b>Subject\:<\/b>(.+)<br>/) {
        $i++;
        $summary .= "Subject: $1\;\n ";
    }
}

file_write "$f_email_summary", $summary;


Here is the .html file I am trying to parse:


(01) <a name='10962432060' href='#top'>Back to Index</a> , <a
href='#top'>Previous</a> , <br><b>Date:</b> Sun 09/26/04 19:00:06<br>
<b>To:</b> &lt;[email protected]&gt;<br>
<b>From:</b> <a href='mailto: &quot;Dan Hoffard&quot;
&lt;[email protected]&gt;'>Dan Hoffard</a><br>
<b>Reply to:</b> <a href='mailto:'></a><br>
<b>Subject:</b> test<br>
<blockquote><pre>This is a multi-part message in MIME format.

------=_NextPart_000_0039_01C4A3F7.606C81F0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

test
asdf1
asdf2
asdf3
asdf4
asdff
Dan Hoffard
(e-mail address removed)

------=_NextPart_000_0039_01C4A3F7.606C81F0
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
</pre>
<html><p>
<HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; =
charset=3Diso-8859-1">
<META content=3D"MSHTML 6.00.2800.1106" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ffffff>
<DIV><FONT face=3DArial size=3D2>test</FONT></DIV>
<DIV><FONT face=3DArial size=3D2>Dan Hoffard<BR><A=20
href=3D"mailto:[email protected]">[email protected]</A><BR>=
</FONT></DIV></BODY></HTML>

------=_NextPart_000_0039_01C4A3F7.606C81F0--

</blockquote><br><hr>
 
Dan

I think MIME::Parser may be overkill for what I am doing. All I need
to do is get the body of the message. Isn't there an easy way to do
it with file_read?

Thanks,
Dan
 
Joe Smith

Dan said:
I think MIME::Parser may be overkill for what I am doing. All I need
to do is get the body of the message. Isn't there an easy way to do
it with file_read?

Maybe, if you're parsing a simple plain-text message.
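
For that simple case, a minimal sketch might look like the following. It assumes the lines of the .html file are already in a list (for example, from the file_read call in the question) and that the body sits between `<blockquote><pre>` and `</blockquote>`, as in the sample above. The name extract_body is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines between "<blockquote><pre>"
# and "</blockquote>". Takes the file's lines as a list.
sub extract_body {
    my @lines = @_;
    my ($in_body, $body) = (0, '');

    for my $line (@lines) {
        chomp(my $text = $line);    # works whether or not lines end in "\n"

        if (!$in_body) {
            # The body starts on the same line as the opening tags.
            next unless $text =~ s/^.*<blockquote><pre>//;
            $in_body = 1;
        }

        # Stop at the closing tag; everything after it is page chrome.
        last if $text =~ m{</blockquote>};

        $body .= "$text\n";
    }
    return $body;
}
```

With the question's variables this would be called as something like `file_write "$f_email_summary", $summary . extract_body(file_read "$f_email_html");`. For the sample message it returns every MIME part, headers and boundaries included, not just the readable text.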

But if you're parsing a multi-part message with boundaries like
"------=_NextPart_000_0039_01C4A3F7.606C81F0", you will need MIME::Parser
or the equivalent.
-Joe
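
As a rough illustration of "the equivalent" for a message like the sample, here is a hand-rolled sketch that uses only the core MIME::QuotedPrint module. It is not MIME::Parser and makes several assumptions: the input is the raw text after the message headers, the first line beginning with "--" is the boundary delimiter, parts are not nested, and only quoted-printable needs decoding. The name plain_text_part is made up for illustration.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);    # core module

# Hypothetical helper: return the decoded text/plain part of a
# multipart body, or nothing if no such part is found.
sub plain_text_part {
    my ($raw) = @_;

    # Assumption: the first line starting with "--" is the delimiter.
    my ($delim) = $raw =~ /^(--\S+)\s*$/m or return;

    # Split on delimiter lines; the closing one carries a trailing "--".
    for my $part (split /^\Q$delim\E(?:--)?[ \t]*\r?\n?/m, $raw) {
        # Each part is: part headers, a blank line, then the content.
        my ($head, $content) = split /\r?\n\r?\n/, $part, 2;
        next unless defined $content;
        next unless $head =~ m{^Content-Type:\s*text/plain}mi;

        $content = decode_qp($content)
            if $head =~ /^Content-Transfer-Encoding:\s*quoted-printable/mi;

        $content =~ s/\s+\z/\n/;    # collapse trailing blank lines
        return $content;
    }
    return;
}
```

Nested parts, base64 bodies, and boundaries declared only in the top-level Content-Type header are not handled here. Those cases are the reason a full parser such as MIME::Parser (from the MIME-tools distribution on CPAN) is the safer choice for arbitrary mail.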
 
