Phillip B Oldham
Is there a standard library for parsing emails that can cope with the
different ways email clients quote?
Phillip said: Is there a standard library for parsing emails that can cope with the
different ways email clients quote?
What do you mean by "quote" here?
2. Prefix of quoted text like your text above in my mail
If there isn't a standard library for parsing emails, is there one for
connecting to a pop/imap resource and reading the mailbox?
The search [1] yielded these results:
Phillip said: If there isn't a standard library for parsing emails, is there one for
connecting to a pop/imap resource and reading the mailbox?
For parsing the mails I would recommend pyparsing.
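On the mailbox question: Python's standard library includes poplib and imaplib. A minimal sketch of reading a POP3 mailbox follows, assuming Python 3 and a server that accepts POP3 over SSL on the default port. The names fetch_messages and lines_to_message, and the host and credentials passed in, are hypothetical placeholders, not anything from this thread.

```python
import poplib
from email import message_from_bytes, policy


def lines_to_message(lines):
    """Join the raw byte lines returned by POP3 RETR and parse them."""
    return message_from_bytes(b"\r\n".join(lines), policy=policy.default)


def fetch_messages(host, user, password, limit=10):
    """Download up to `limit` messages from a POP3-over-SSL mailbox.

    host, user and password are placeholders supplied by the caller.
    """
    conn = poplib.POP3_SSL(host)  # connects on port 995 by default
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()  # (message count, mailbox size)
        messages = []
        for number in range(1, min(count, limit) + 1):
            _resp, lines, _octets = conn.retr(number)
            messages.append(lines_to_message(lines))
        return messages
    finally:
        conn.quit()
```

imaplib follows a similar pattern if the server speaks IMAP instead of POP3.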
Maric said: On Wednesday 30 July 2008 17:55:35, Aspersieman wrote:
Why? The email module is a great parser, IMO.
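To illustrate what the email module does and does not cover, here is a minimal sketch, assuming Python 3.6 or later. The function name plain_text_body is made up for this example. It parses the headers and MIME structure and returns the plain-text body; it does nothing about quoting inside that body.

```python
from email import message_from_string, policy


def plain_text_body(raw_message):
    """Parse a raw RFC 5322 message and return its text/plain body.

    Returns an empty string if the message has no plain-text part.
    """
    msg = message_from_string(raw_message, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""
```

Whatever comes back from this still has to be split into new and quoted text by some other means.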
Basically, just be able to parse an email into its actual and "quoted"
parts - lines which have been prefixed to indent from a previous
email.
Most clients use ">" which is easy to check for, but I've seen some
which use "|" and some which *don't* quote at all. It's causing us
nightmares in parsing responses to system-generated emails. I was
hoping someone might've seen the problem previously and released some
code.
He talks about parsing the *content*, not the email envelope and possible
mime-body.
Most clients use ">" which is easy to check for, but I've seen some
which use "|" and some which *don't* quote at all. It's causing us
nightmares in parsing responses to system-generated emails. I was hoping
someone might've seen the problem previously and released some code.
Something quoted.
[end quote]
But I'm writing for a human audience, not for a program.
The simple answer is that you can catch 90% of cases by checking for ">",
and another 1% by checking for "|". If the email contains HTML, I have
found that quoted text is sometimes in another colour. As for the rest,
well, sometimes even human beings can't easily determine what's quoted
and what isn't. Good luck getting a program to do it.
(Percentages are plucked out of thin air. YMMV.)
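The prefix check described above can be sketched roughly as follows. This is only a heuristic under the stated assumptions: a line counts as quoted if its first non-blank character is ">" or "|". The names QUOTE_PREFIX and split_quoted are invented for the example.

```python
import re

# Assumption: a quoted line starts with optional whitespace, then ">" or "|".
QUOTE_PREFIX = re.compile(r"^\s*[>|]")


def split_quoted(body):
    """Split a plain-text body into (new_lines, quoted_lines)."""
    new_lines, quoted_lines = [], []
    for line in body.splitlines():
        if QUOTE_PREFIX.match(line):
            quoted_lines.append(line)
        else:
            new_lines.append(line)
    return new_lines, quoted_lines
```

For example, split_quoted("Thanks.\n> Did it work?") puts "Thanks." in the first list and "> Did it work?" in the second. Replies from clients that don't mark quoted text at all will come back entirely as "new" text, so this cannot handle those cases.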
My sympathies.
I've even seen clients that prefix new (unquoted) text with the quote
character ">".