HTML Encoded Translation

Dave

How can I translate this:

gi

to this:

"gi"

I've tried urllib.unencode and it doesn't work.

Thanks!
 
Fredrik Lundh

Dave said:
How can I translate this:

gi

to this:

"gi"

the easiest way is to run it through an HTML or XML parser (depending on
what the source is). or you could use something like this:

import re

def fix_charrefs(text):
    def fixup(m):
        text = m.group(0)
        try:
            if text[:3] == "&#x":
                return unichr(int(text[3:-1], 16))
            else:
                return unichr(int(text[2:-1]))
        except ValueError:
            pass
        return text # leave as is
    return re.sub("&#?\w+;", fixup, text)
'gi'
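
For readers on Python 3, where unichr no longer exists, the same idea might look roughly like the sketch below. It is not part of the original post: the tightened pattern, the CHARREF name, and the sample string "&#34;gi&#34;" are assumptions made for illustration. The standard library's html.unescape covers this case as well, and also handles named references such as &quot;.

```python
import html
import re

# Assumed pattern: decimal (&#34;) or hexadecimal (&#x22;) numeric references only.
CHARREF = re.compile(r"&#(x[0-9a-f]+|[0-9]+);", re.IGNORECASE)

def fix_charrefs(text):
    """Replace numeric character references; leave everything else as is."""
    def fixup(m):
        body = m.group(1)
        try:
            if body[0] in "xX":
                return chr(int(body[1:], 16))
            return chr(int(body))
        except (ValueError, OverflowError):
            return m.group(0)  # code point out of range, leave as is
    return CHARREF.sub(fixup, text)

sample = "&#34;gi&#34;"  # assumed input, not taken from the thread
print(fix_charrefs(sample))
print(html.unescape(sample))
```

Named references such as &quot; pass through fix_charrefs untouched, so html.unescape is probably the simpler choice when the input may contain both kinds.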

also see:

http://effbot.org/zone/re-sub.htm#strip-html
I've tried urllib.unencode and it doesn't work.

those are HTML/XML character references, not encoded URL characters.

</F>
 
Sybren Stuvel

Dave enlightened us with:
How can I translate this:

gi

to this:

"gi"

I've tried urllib.unencode and it doesn't work.

As you put so nicely in the subject: it is HTML encoding, not URL
encoding. Those are two very different things! Try an HTML decoder;
you'll have more luck with that...

Sybren
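
To make the difference between the two encodings concrete, here is a small Python 3 sketch that runs a URL decoder and an HTML decoder over both kinds of input. The sample strings are assumptions chosen for illustration; urllib.parse.unquote and html.unescape are the standard-library calls in current Python, not something named in the thread.

```python
from html import unescape
from urllib.parse import unquote

html_text = "&#34;gi&#34;"  # assumed sample: HTML character references
url_text = "%22gi%22"       # assumed sample: URL percent-encoding

# A URL decoder only understands %XX escapes, so HTML references pass through.
print(unquote(html_text))
# An HTML decoder resolves the character references.
print(unescape(html_text))

# The reverse holds for percent-encoded text.
print(unquote(url_text))
print(unescape(url_text))
```

Each decoder leaves the other format unchanged, which would explain why the urllib attempt appeared to do nothing.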
 
Dave

Got it, great. This worked like a charm. I knew I was barking up the
wrong tree with urllib, but I didn't know which tree to bark up...

Thanks!

 
