Donald Gordon
Hi
I'm trying to retrieve an HTML document in UTF-8 format using LWP, but
have hit a snag: the document redefines the Content-type: header from
"text/html" to "text/html; charset=UTF-8" using a <meta
http-equiv="Content-type"... /> tag. LWP doesn't pick this up, and I
seem to be ending up with a string with UTF-8 in it, but perl thinks
it's already been decoded.
Is there any way to tell perl to turn a string with bytes in it that look
like UTF-8 into a string with real wide characters? Or a way to get LWP
to make the problem go away?
thanks in advance
donald
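
For illustration, the conversion asked about above (bytes that look like UTF-8 into a string of real characters) might be sketched with the core Encode module. This is only a sketch under assumptions: the helper name utf8_octets_to_text is hypothetical, and the hard-coded octets stand in for whatever raw body LWP returns (for example from $response->content).

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: interpret a byte string as UTF-8 and
# return a Perl character string.
sub utf8_octets_to_text {
    my ($octets) = @_;
    return decode('UTF-8', $octets);
}

# Stand-in for the raw response body: the UTF-8 octets for "café"
# (4 characters, 5 bytes, since "é" is encoded as 0xC3 0xA9).
my $octets = "caf\xC3\xA9";
my $text   = utf8_octets_to_text($octets);

printf "%d bytes -> %d characters\n", length($octets), length($text);
# prints "5 bytes -> 4 characters"
```

The point of the sketch is that the string is decoded exactly once; decoding an already-decoded string would be a separate bug.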