How to *build* new elements and *replace* elements with xml.dom.minidom?

Chris Seberino

How do I build new elements to replace existing ones using xml.dom.minidom?

Specifically, I have an HTML table of numbers. I want to replace
those numbers with hyperlinks to create a table of hyperlinks.

So I need to build hyperlinks (a elements) with an href attribute and
replace the text nodes (numbers) somehow.

How do I do that?

chris
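
For reference, a minimal sketch of what the question asks for using only xml.dom.minidom. The sample table, the LINKS list, the example.com URLs and the helper name linkify_cells are all assumptions made up for illustration, and the input has to be well-formed XML because minidom is not an HTML parser.

```python
from xml.dom import minidom

# Hypothetical input: a small, well-formed table.
SOURCE = (
    "<table>"
    "<tr><td>0</td><td>1</td></tr>"
    "<tr><td>x</td><td>2</td></tr>"
    "</table>"
)
# Hypothetical link targets, indexed by the number found in each cell.
LINKS = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
]


def linkify_cells(doc, links):
    """Replace numeric text nodes inside <td> cells with <a href=...> elements."""
    for td in doc.getElementsByTagName("td"):
        # Iterate over a copy: replaceChild changes td.childNodes as we go.
        for node in list(td.childNodes):
            if node.nodeType != node.TEXT_NODE:
                continue
            text = node.data.strip()
            if not text.isdecimal():
                continue  # leave non-numeric cells untouched
            index = int(text)
            if index >= len(links):
                continue  # no link available for this number
            a = doc.createElement("a")
            a.setAttribute("href", links[index])
            a.appendChild(doc.createTextNode(text))
            td.replaceChild(a, node)  # argument order: (new, old)
    return doc


doc = linkify_cells(minidom.parseString(SOURCE), LINKS)
result = doc.documentElement.toxml()
print(result)
```

The three steps are createElement plus setAttribute to build the a element, createTextNode for its label, and replaceChild on the parent cell to swap it in for the old text node.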
 
Stefan Behnel

Try lxml.html instead. It makes it really easy to do these things. For
example, you can use XPath to find all table cells that contain numbers:

td_list = doc.xpath("//td[number() >= 0]")

or maybe using regular expressions to make sure it's an int:

td_list = doc.xpath("//td[re:match(., '^[0-9]+$')]",
                    namespaces={'re': 'http://exslt.org/regular-expressions'})

and then replace each number with a hyperlink inside its cell:

# assuming links = ['http://...', ...]

from lxml.html.builder import A
for td in td_list:
    index = int(td.text)
    a = A("some text", href=links[index])
    # keep the <td> and put the link inside it, so the table stays valid
    td.text = None
    td.append(a)

Stefan
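
Putting the pieces of this reply together, a self-contained sketch might look like the following. The sample table, the LINKS list and the example.com URLs are assumptions for illustration only. It uses re:test, the boolean EXSLT regular-expression function that lxml also supports, and it assumes each matching cell holds nothing but the number as plain text.

```python
import lxml.html
from lxml.html.builder import A

# Hypothetical input table and link targets (not from the thread).
SOURCE = (
    "<table>"
    "<tr><td>0</td><td>1</td></tr>"
    "<tr><td>x</td><td>2</td></tr>"
    "</table>"
)
LINKS = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
]
EXSLT_RE = "http://exslt.org/regular-expressions"

doc = lxml.html.fromstring(SOURCE)

# Select only the cells whose text is a non-negative integer.
td_list = doc.xpath("//td[re:test(., '^[0-9]+$')]",
                    namespaces={"re": EXSLT_RE})

for td in td_list:
    index = int(td.text)
    if index >= len(LINKS):
        continue  # no link available for this number
    # Build <a href="...">number</a> and move it into the cell.
    a = A(td.text, href=LINKS[index])
    td.text = None
    td.append(a)

result = lxml.html.tostring(doc, encoding="unicode")
print(result)
```

The cell with "x" is not selected by the XPath expression, so it is left as it was.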
 
Johannes Bauer

Oh no! I was looking for something like this for *ages* but always
fought with minidom - where this is a real pain :-(

Had I only known before that such a wonderful library exists. I'll
definitely use lxml from now on. Does it compile with Python3?

Kind regards,
Johannes
 
Stefan Behnel

Johannes said:
Oh no! I was looking for something like this for *ages* but always
fought with minidom - where this is a real pain :-(

Had I only known before that such a wonderful library exists. I'll
definitely use lxml from now on.

Yep, I keep advertising it all over the place, but there are still so many
references to minidom on the web that it's hard to become the first hit in
Google when you search for "Python XML". ;)

Actually, the first hit (for me) is currently PyXML, which is officially
unmaintained. Wasn't there 'some' Python developer working for Google? What
about fixing their database?

Johannes said:
Does it compile with Python3?

Sure. :)

Stefan
 
