Dave
How can I translate this:
gi
to this:
"gi"
I've tried urllib.unencode and it doesn't work.
Thanks!
Fredrik said:
Dave said: How can I translate this:
gi
to this:
"gi"
the easiest way is to run it through an HTML or XML parser (depending on
what the source is). or you could use something like this:
import re

def fix_charrefs(text):
    def fixup(m):
        text = m.group(0)
        try:
            if text[:3] == "&#x":
                return unichr(int(text[3:-1], 16))
            else:
                return unichr(int(text[2:-1]))
        except ValueError:
            pass
        return text # leave as is
    return re.sub("&#?\w+;", fixup, text)
'gi'
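For readers on Python 3, where unichr no longer exists, here is a rough port of the same idea. It is only a sketch, not Fredrik's code: the pattern is tightened to match numeric references only, and the input string "&#103;&#105;" is an assumed example, since the exact input in the original question is not shown here.

```python
import re

# Matches numeric character references only: &#NNN; (decimal) or &#xHHH; (hex).
# Named entities such as &amp; are deliberately not matched by this pattern.
_CHARREF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")

def fix_charrefs(text):
    """Replace numeric character references in text with the characters they name."""
    def fixup(m):
        hex_digits, dec_digits = m.groups()
        try:
            code = int(hex_digits, 16) if hex_digits else int(dec_digits)
            return chr(code)  # chr replaces Python 2's unichr
        except (ValueError, OverflowError):
            return m.group(0)  # code point out of range: leave as is
    return _CHARREF.sub(fixup, text)

# Assumed example input (not taken from the thread).
print(fix_charrefs("&#103;&#105;"))
```

Anything that is not a valid numeric reference, including named entities, passes through unchanged.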
also see:
http://effbot.org/zone/re-sub.htm#strip-html
Dave said: I've tried urllib.unencode and it doesn't work.
those are HTML/XML character references, not encoded URL characters.
</F>
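To illustrate the distinction between character references and URL encoding, here is a small Python 3 sketch using two standard library functions, html.unescape and urllib.parse.unquote. Both sample strings are assumed examples rather than the data from the question.

```python
from html import unescape
from urllib.parse import unquote

charrefs = "&#103;&#105;"   # HTML/XML character references (assumed example)
url_encoded = "%67%69"      # URL percent-encoding of the same two letters

print(unescape(charrefs))    # decodes character references
print(unquote(url_encoded))  # decodes percent-escapes
print(unquote(charrefs))     # no percent-escapes present, so nothing changes
```

The URL decoder only looks for percent-escapes, which would explain why a urllib function leaves character references untouched.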