libxml2 and XPath - Iterate through repeating elements?

Discussion in 'Python' started by nickheppleston@gmail.com, Dec 2, 2005.

  1. nickheppleston@gmail.com Guest

    I'm trying to iterate through repeating elements to extract data using
    libxml2 but I'm having zero luck - any help would be appreciated.

    My XML source is similar to the following - I'm trying to extract the
    line number and product code from the repeating line elements:

    <order xmlns="some-ns">
    <header>
    <orderno>123456</orderno>
    </header>
    <lines>
    <line>
    <lineno>1</lineno>
    <productcode>PENS</productcode>
    </line>
    <line>
    <lineno>2</lineno>
    <productcode>STAPLER</productcode>
    </line>
    <line>
    <lineno>3</lineno>
    <productcode>RULER</productcode>
    </line>
    </lines>
    </order>

    With the following code I can get at the non-repeating elements in the
    header, and get the lines elements, but cannot extract the
    lineno/productcode data via xpath:

    import libxml2

    XmlDoc = libxml2.parseFile(XmlFile)
    XPathDoc = XmlDoc.xpathNewContext()
    XPathDoc.xpathRegisterNs('so', 'some-ns')

    # Extract data from the order header
    PurchaseOrderNo = XPathDoc.xpathEval('//so:order/so:header/so:orderno')

    # Extract data from the order lines
    for line in XPathDoc.xpathEval('//so:order/so:lines/so:line'):
        print(line.content)

    # Explicitly free Xml document and XPath context
    XmlDoc.freeDoc()
    XPathDoc.xpathFreeContext()

    Ideally, once I have a line element, I'd like to select its data with
    XPath relative to that element (much like an XSLT query inside a
    'for-each'), i.e. xpathEval('so:lineno') and
    xpathEval('so:productcode').

    Any suggestions greatly appreciated!

    Cheers, Nick.
     
    nickheppleston@gmail.com, Dec 2, 2005

  2. Paul Boddie Guest

    nickheppleston@gmail.com wrote:
    > I'm trying to iterate through repeating elements to extract data using
    > libxml2 but I'm having zero luck - any help would be appreciated.


    Here's how I attempt to solve the problem using libxml2dom [1] (and I
    imagine others will suggest their own favourite modules, too):

    import libxml2dom
    d = libxml2dom.parseFile(filename)
    order_numbers = d.xpath("//so:order/so:header/so:orderno",
                            namespaces={"so": "some-ns"})

    At this point, you have a list of nodes. (I imagine that whatever
    object the libxml2 module API produces probably has those previous and
    next attributes to navigate the result list instead.) The nodes in the
    list represent the orderno elements in this case, and in libxml2dom you
    can choose to invoke the usual DOM methods on such node objects, or
    even the toString method if you want the document text. For the line
    items...

    lines = d.xpath("//so:order/so:lines/so:line",
                    namespaces={"so": "some-ns"})
    for line in lines:
        print(line.toString())

    I can't remember what the libxml2 module produces for the content
    attribute of a node, although the underlying libxml2 API produces a
    "text-only" representation of the document text, as opposed to the
    actual document text that the toString method produces in the above
    example. I imagine that an application working with the line item
    information would use additional DOM or XPath processing to get the
    line item index and the product code directly.
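
To illustrate the distinction drawn above between a text-only view of a node and its serialized markup, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree as a stand-in rather than libxml2 or libxml2dom, so itertext and tostring are ElementTree names and not the libxml2 API. The one-line sample element is taken from the question, with the namespace left out for brevity.

```python
import xml.etree.ElementTree as ET

# A single line item from the question's sample (namespace omitted here).
LINE = "<line><lineno>1</lineno><productcode>PENS</productcode></line>"
line = ET.fromstring(LINE)

# Text-only view: the text of all descendants run together, markup dropped.
text_only = "".join(line.itertext())

# Serialized view: the element written back out as markup.
serialized = ET.tostring(line, encoding="unicode")

print(text_only)
print(serialized)
```

The text-only form runs the line number and product code together, which is why per-field lookups on each line element are the more useful route for extracting data.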

    Anyway, I recommend libxml2dom because if you're already using the
    bundled libxml2 module, you should be able to install libxml2dom and
    plug into the same infrastructure that the bundled module depends upon.
    Moreover, libxml2dom is a "pure Python" package that doesn't require
    any extension module compilation, so it should be quite portable to
    whatever platform you're using.

    Paul

    [1] http://www.python.org/pypi/libxml2dom
     
    Paul Boddie, Dec 2, 2005

  3. Jean-Roch SOTTY Guest

    On Friday, 2 December 2005 at 18:31, nickheppleston@gmail.com wrote:
    > I'm trying to iterate through repeating elements to extract data using
    > libxml2 but I'm having zero luck - any help would be appreciated.
    >
    > My XML source is similar to the following - I'm trying to extract the
    > line number and product code from the repeating line elements:
    >
    > <order xmlns="some-ns">
    > <header>
    > <orderno>123456</orderno>
    > </header>
    > <lines>
    > <line>
    > <lineno>1</lineno>
    > <productcode>PENS</productcode>
    > </line>
    > <line>
    > <lineno>2</lineno>
    > <productcode>STAPLER</productcode>
    > </line>
    > <line>
    > <lineno>3</lineno>
    > <productcode>RULER</productcode>
    > </line>
    > </lines>
    > </order>

    The result of an XPath evaluation is a list of nodes, and you can call
    xpathEval() again on each of them:

    import libxml2
    doc = libxml2.parseFile(XmlFile)
    root = doc.getRootElement()
    line_nodes = root.xpathEval('lines/line')
    for line_node in line_nodes:
        print(line_node.xpathEval('lineno')[0].content)
        print(line_node.xpathEval('productcode')[0].content)
    doc.freeDoc()
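
One caveat when adapting this to the sample document: the order element declares a default namespace (some-ns), and in XPath 1.0 an unprefixed name such as lines or line only matches elements in no namespace, so the relative steps will probably need a registered prefix. The sketch below shows the same per-line lookup with namespace-qualified paths. It swaps in the standard library's xml.etree.ElementTree so that it runs without libxml2 installed. The function name extract_lines and the so prefix are illustrative choices, not part of either library.

```python
import xml.etree.ElementTree as ET

# Sample document from the original question (default namespace "some-ns").
XML = """<order xmlns="some-ns">
<header><orderno>123456</orderno></header>
<lines>
<line><lineno>1</lineno><productcode>PENS</productcode></line>
<line><lineno>2</lineno><productcode>STAPLER</productcode></line>
<line><lineno>3</lineno><productcode>RULER</productcode></line>
</lines>
</order>"""

# Prefix-to-URI map; "so" is an arbitrary prefix used only in the queries.
NS = {"so": "some-ns"}


def extract_lines(xml_text):
    """Return a list of (lineno, productcode) pairs, one per line element."""
    root = ET.fromstring(xml_text)
    rows = []
    # First select the repeating elements relative to the root ...
    for line in root.findall("so:lines/so:line", NS):
        # ... then query each one relative to itself.
        lineno = line.findtext("so:lineno", namespaces=NS)
        code = line.findtext("so:productcode", namespaces=NS)
        rows.append((lineno, code))
    return rows


for lineno, code in extract_lines(XML):
    print(lineno, code)
```

With the libxml2 bindings themselves, the corresponding step would presumably be to register the prefix once with xpathRegisterNs on the context returned by xpathNewContext, then call setContextNode(line_node) on that context before evaluating 'so:lineno' and 'so:productcode'.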

    --
    Cordially

    Jean-Roch SOTTY
     
    Jean-Roch SOTTY, Dec 5, 2005
