afrinspray
I'm writing a MIME::Tools email parsing engine. This utility rocks, by
the way... the whole package makes MIME processing very easy.
My problem, however, is with Outlook emails and their horrible styling.
While normal people use the <br> tag for line breaks, Outlook
likes to do stuff like this:
<DIV dir=ltr align=left><FONT size=2><SPAN
class=3D671020819-14062006></SPAN></FONT> </DIV>
They like to use these weird CSS classes as well, like
3D671020819-14062006 (which isn't defined anywhere in the document) and
MsoNormal. Also, they like to use random garbage pseudo-breaks here
and there that don't show up in Outlook, but show up in every other
HTML parser I've seen... so I'm using the HTML::Tree class to remove
empty breaks. Ugh... it's just a total mess.
Is there a reliable Perl module for converting Outlook garbage into
real HTML?
Thanks,
Mike