translating MS Word codes using regexps

neologist

We have a Web form where some of our users fill in certain textarea
boxes by cutting and pasting text from MS Word documents. They think
it's plain text, but it's not. We would like to clean up that input
before saving it to the database, e.g., to translate the funny Word
quotes to plain double and single quotes using a regexp substitution
such as

s/$Some_MS_Word_Code/"/g;

I'm sure there's a table of standard codes and some examples
of how to make typical substitutions somewhere, possibly
even a CPAN package to assist with this, but I'm not able to
find it because I don't know exactly what I'm looking for.

Can someone point me in the right direction?
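
To make the idea concrete, here is a minimal sketch of the kind of substitution meant. It assumes the "funny" characters are the usual Word smart quotes, dashes and ellipsis, and that the form input is either already decoded to Perl characters or arrives as raw Windows-1252 bytes. The lookup table and the name plain_text_from_word are hypothetical, made up for illustration, and the list of characters is not exhaustive.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical lookup table (an assumption, not a standard list):
# common Word "smart" characters and a plain ASCII stand-in for each.
my %plain_for = (
    "\x{2018}" => "'",     # left single quotation mark
    "\x{2019}" => "'",     # right single quotation mark / apostrophe
    "\x{201C}" => '"',     # left double quotation mark
    "\x{201D}" => '"',     # right double quotation mark
    "\x{2013}" => '-',     # en dash
    "\x{2014}" => '--',    # em dash
    "\x{2026}" => '...',   # horizontal ellipsis
);

# Build a character class such as [\x{2018}\x{2019}...] from the table keys.
my $class   = join '', map { sprintf '\\x{%04X}', ord } keys %plain_for;
my $word_re = qr/[$class]/;

# Hypothetical helper: takes a decoded (character) string and returns
# a copy with every character from the table replaced.
sub plain_text_from_word {
    my ($text) = @_;
    $text =~ s/($word_re)/$plain_for{$1}/g;
    return $text;
}

# If the form delivers raw Windows-1252 bytes (an assumption about the
# setup), decode them to characters first.
my $bytes = "\x93quoted\x94 and it\x92s";
my $clean = plain_text_from_word( decode('cp1252', $bytes) );
print "$clean\n";
```

Keeping the mapping in one hash means a newly discovered character only needs one new line in the table; the regexp is rebuilt from the keys.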
 
