Convert xml symbol notation

D

dumbkiwi

Hi,

I'm working on a script to download and parse a web page, and it
includes xml symbol notation, such as ' for the ' character. Does
anyone know of a pre-existing python script/lib to convert the xml
notation back to the actual symbol it represents?
 
G

Gabriel Genellina

dumbkiwi said:
I'm working on a script to download and parse a web page, and it
includes xml symbol notation, such as ' for the ' character. Does
anyone know of a pre-existing python script/lib to convert the xml
notation back to the actual symbol it represents?

Try the htmlentitydefs module.
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

I'm working on a script to download and parse a web page, and it
Try the htmlentitydefs module.

That won't help: this is a character reference, not an entity reference.
htmlentitydefs only contains the definitions of entities.

Regards,
Martin
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

I'm working on a script to download and parse a web page, and it
includes xml symbol notation, such as ' for the ' character. Does
anyone know of a pre-existing python script/lib to convert the xml
notation back to the actual symbol it represents?

If you have this given in an XML file (rather than an HTML file which
is not well-formed XML), you could use an XML parser for the entire
file. This would automatically unescape character references. Likewise,
you can parse it with HTMLParser, which will invoke the handle_charref
method for these.

If you just want to unescape references, you can use the code in

http://effbot.org/zone/re-sub.htm

HTH,
Martin
 
G

Gabriel Genellina

Martin said:
That won't help: this is a character reference, not an entity reference.
htmlentitydefs only contains the definitions of entities.

Ouch! Sorry.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top