UTF-8, LWP and http-equiv meta tags

Donald Gordon

Hi

I'm trying to retrieve an HTML document in UTF-8 format using LWP, but
have hit a snag: the document redefines the Content-type: header from
"text/html" to "text/html; charset=UTF-8" using a <meta
http-equiv="Content-type"... /> tag. LWP doesn't pick this up, so I
seem to end up with a string that contains raw UTF-8 bytes, which perl
treats as though it had already been decoded.

Is there any way to tell perl to turn a string of bytes that look like
UTF-8 into a string with real wide characters? Or a way to get LWP to
make the problem go away?
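
To make the first question concrete, here is a minimal sketch of one possible approach. It assumes the core Encode module (shipped with Perl 5.8 and later) is available. The variable names and the sample string are hypothetical stand-ins for the content LWP returned, and nothing here touches the network.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the body LWP returned: the raw UTF-8 bytes
# of "café" (the "é" is the two bytes 0xC3 0xA9).
my $bytes = "caf\xC3\xA9";

# decode() interprets the bytes as UTF-8 and returns a character string.
my $chars = decode('UTF-8', $bytes);

# Before decoding, each byte counts as one character; afterwards the
# two-byte sequence has become the single character U+00E9.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
# prints "bytes: 5, characters: 4"
```

The charset is hard-coded here. Reading it out of the <meta http-equiv="Content-type"... /> tag instead would need an extra parsing step that this sketch leaves out.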

thanks in advance

donald
 
