How *build* new elements and *replace* elements with xml.dom.minidom?

Discussion in 'Python' started by Chris Seberino, Jun 10, 2009.

  1. How build new elements to replace existing ones using xml.dom.minidom?

    Specifically, I have an HTML table of numbers. I want to replace
    those numbers with hyperlinks to create a table of hyperlinks.

    So I need to build hyperlinks (a elements) with href attribute and
    replace the text elements (numbers) somehow.

    How do that?

    chris
    Chris Seberino, Jun 10, 2009
    #1
    1. Advertising

  2. Chris Seberino wrote:
    > How build new elements to replace existing ones using xml.dom.minidom?
    >
    > Specifically, I have an HTML table of numbers. I want to replace
    > those numbers with hyperlinks to create a table of hyperlinks.
    >
    > So I need to build hyperlinks (a elements) with href attribute and
    > replace the text elements (numbers) somehow.


    Try lxml.html instead. It makes it really easy to do these things. For
    example, you can use XPath to find all table cells that contain numbers:

    td_list = doc.xpath("//td[number() >= 0]")

    or maybe using regular expressions to make sure it's an int:

    td_list = doc.xpath("//td[re:match(., '^[0-9]+$')]",
    namespaces={'re':'http://exslt.org/regular-expressions'})

    and then replace them by a hyperlink:

    # assuming links = ['http://...', ...]

    from lxml.html.builder import A
    for td in td_list:
    index = int(td.text)
    a = A("some text", href=links[index])
    td.getparent().replace(td, a)

    Stefan
    Stefan Behnel, Jun 11, 2009
    #2
    1. Advertising

  3. Stefan Behnel schrieb:

    >> So I need to build hyperlinks (a elements) with href attribute and
    >> replace the text elements (numbers) somehow.

    >
    > Try lxml.html instead. It makes it really easy to do these things. For
    > example, you can use XPath to find all table cells that contain numbers:
    >
    > td_list = doc.xpath("//td[number() >= 0]")
    >
    > or maybe using regular expressions to make sure it's an int:
    >
    > td_list = doc.xpath("//td[re:match(., '^[0-9]+$')]",
    > namespaces={'re':'http://exslt.org/regular-expressions'})
    >
    > and then replace them by a hyperlink:
    >
    > # assuming links = ['http://...', ...]
    >
    > from lxml.html.builder import A
    > for td in td_list:
    > index = int(td.text)
    > a = A("some text", href=links[index])
    > td.getparent().replace(td, a)


    Oh no! I was looking for something like this for *ages* but always
    fought with minidom - where this is a real pain :-(

    Had I only known before that such a wonderful library exists. I'll
    definitely use lxml from now on. Does it compile with Python3?

    Kind regards,
    Johannes

    --
    "Meine Gegenklage gegen dich lautet dann auf bewusste Verlogenheit,
    verlästerung von Gott, Bibel und mir und bewusster Blasphemie."
    -- Prophet und Visionär Hans Joss aka HJP in de.sci.physik
    <48d8bf1d$0$7510$>
    Johannes Bauer, Jun 11, 2009
    #3
  4. Johannes Bauer wrote:
    > Stefan Behnel schrieb:
    >
    >>> So I need to build hyperlinks (a elements) with href attribute and
    >>> replace the text elements (numbers) somehow.

    >> Try lxml.html instead. It makes it really easy to do these things. For
    >> example, you can use XPath to find all table cells that contain numbers:
    >>
    >> td_list = doc.xpath("//td[number() >= 0]")
    >>
    >> or maybe using regular expressions to make sure it's an int:
    >>
    >> td_list = doc.xpath("//td[re:match(., '^[0-9]+$')]",
    >> namespaces={'re':'http://exslt.org/regular-expressions'})
    >>
    >> and then replace them by a hyperlink:
    >>
    >> # assuming links = ['http://...', ...]
    >>
    >> from lxml.html.builder import A
    >> for td in td_list:
    >> index = int(td.text)
    >> a = A("some text", href=links[index])
    >> td.getparent().replace(td, a)

    >
    > Oh no! I was looking for something like this for *ages* but always
    > fought with minidom - where this is a real pain :-(
    >
    > Had I only known before that such a wonderful library exists. I'll
    > definitely use lxml from now on.


    Yep, I keep advertising it all over the place, but there are still so many
    references to minidom on the web that it's hard to become the first hit in
    Google when you search for "Python XML". ;)

    Actually, the first hit (for me) is currently PyXML, which is officially
    unmaintained. Wasn't there 'some' Python developer working for Google? What
    about fixing their database?


    > Does it compile with Python3?


    Sure. :)

    Stefan
    Stefan Behnel, Jun 12, 2009
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Greg Wogan-Browne
    Replies:
    1
    Views:
    791
    Uche Ogbuji
    Jan 28, 2005
  2. Replies:
    3
    Views:
    523
    Stefan Behnel
    Aug 3, 2007
  3. dpapathanasiou

    mod_python and xml.dom.minidom

    dpapathanasiou, May 9, 2009, in forum: Python
    Replies:
    8
    Views:
    549
    dpapathanasiou
    May 15, 2009
  4. Johannes Bauer
    Replies:
    7
    Views:
    1,059
    Johannes Bauer
    Jun 11, 2009
  5. ming
    Replies:
    2
    Views:
    138
Loading...

Share This Page