Re: python and parsing an xml file

Discussion in 'Python' started by Stefan Behnel, Feb 21, 2011.

  1. Matt Funk, 21.02.2011 18:30:
    > I want to create a set of xml input files to my code that look as follows:
    > <?xml version="1.0" encoding="UTF-8"?>
    >
    > <!-- Settings for the algorithm to be performed
    > -->
    > <Algorithm>
    >
    > <!-- The algorithm type.
    > -->
    > <!-- The supported options are:
    > -->
    > <!-- - Alg0
    > -->
    > <!-- - Alg1
    > -->
    > <Type>Alg1</Type>
    >
    > <!-- the location/path of the input file for this algorithm
    > -->
    > <path>./Alg1.in</path>
    >
    > </Algorithm>
    >
    >
    > <!-- Relevant information during the processing will be written to a
    > logfile -->
    > <Logfile>
    >
    > <!-- the location/path of the logfile (i.e. where to put the
    > logfile) -->
    > <path>c:\tmp</path>
    >
    > <!-- verbosity level (i.e. how much to print)
    > -->
    > <!-- The supported options are:
    > -->
    > <!-- - 0 (nothing printed)
    > -->
    > <!-- - 1 (print on error)
    > -->
    > <verbosity>1</verbosity>
    >
    > </Logfile>


    That's not XML. XML documents have exactly one root element, i.e. you need
    an enclosing element around these two tags.


    > So there are comments, whitespace etc ... in it.
    > I would like to be able to put everything into some sort of structure


    Including the comments or without them? Note that ElementTree will ignore
    comments.


    > such that i can access it as:
    > structure['Algorithm']['Type'] == Alg1


    Have a look at lxml.objectify. It allows you to write

    alg_type = root.Algorithm.Type.text

    and a couple of other niceties.

    http://lxml.de/objectify.html


    > I was wondering if there is something out there that does this.
    > I found and tried a few things:
    > 1) http://code.activestate.com/recipes/534109-xml-to-python-data-structure/
    > It simply doesn't work. I get the following error:
    > raise exception
    > xml.sax._exceptions.SAXParseException:<unknown>:1:2: not well-formed
    > (invalid token)


    "not well formed" == "not XML".


    > But i removed everything from the file except:<?xml version="1.0"
    > encoding="UTF-8"?>
    > and i still got the error.


    That's not XML, either.


    > Anyway, i looked at ElementTree, but that error out with:
    > xml.parsers.expat.ExpatError: junk after document element: line 19, column 0


    In any case, ElementTree is preferable over a SAX based solution, both for
    performance and maintainability reasons.

    Stefan
    Stefan Behnel, Feb 21, 2011
    #1
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Jason
    Replies:
    2
    Views:
    621
    Jason
    Apr 28, 2007
  2. Matt Funk

    python and parsing an xml file

    Matt Funk, Feb 21, 2011, in forum: Python
    Replies:
    3
    Views:
    780
    Paul Anton Letnes
    Feb 22, 2011
  3. John Levine
    Replies:
    0
    Views:
    709
    John Levine
    Feb 2, 2012
  4. PL
    Replies:
    2
    Views:
    215
    Brian McCauley
    Dec 14, 2004
  5. Erik Wasser
    Replies:
    5
    Views:
    414
    Peter J. Holzer
    Mar 5, 2006
Loading...

Share This Page