ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

Discussion in 'Python' started by Kee Nethery, Jun 26, 2009.

  1. Kee Nethery

    Kee Nethery Guest

    Summary: I have XML as string and I want to pull it into ElementTree
    so that I can play with it but it is not working for me. XML and
    fromstring when used with a string do not do the same thing as parse
    does with a file. How do I get this to work?

    Details:
    I have a CGI that receives XML via an HTTP POST as a POST variable
    named 'theXml'. The POST data is a string that the CGI receives, it is
    not a file on a hard disk.

    The POSTed string looks like this when viewed in pretty format:

    <xml>
    <purchase id="1" lang="en">
    <item id="1" productId="369369">
    <name>Autumn</name>
    <quantity>1</quantity>
    <price>8.46</price>
    </item>
    <javascript>YES</javascript>
    </purchase>
    <customer id="123456" time="1227449322">
    <shipping>
    <street>19 Any Street</street>
    <city>Berkeley</city>
    <state>California</state>
    <zip>12345</zip>
    <country>People's Republic of Berkeley</country>
    <name>Jon Roberts</name>
    </shipping>
    <email></email>
    </customer>
    </xml>


    The pseudocode in Python 2.6.2 looks like:

    import cgi
    import xml.etree.ElementTree as et

    formPostData = cgi.FieldStorage()
    theXmlData = formPostData['theXml'].value
    theXmlDataTree = et.XML(theXmlData)

    and when this runs, theXmlDataTree is set to:

    theXmlDataTree instance <Element xml at 7167b0>
    attrib dict {}
    tag str xml
    tail NoneType None
    text NoneType None

    I get the same result with fromstring:

    formPostData = cgi.FieldStorage()
    theXmlData = formPostData['theXml'].value
    theXmlDataTree = et.fromstring(theXmlData)

    I can put the xml in a file and reference the file by its URL and use:

    et.parse(urllib.urlopen(theUrl))

    and that will set theXmlDataTree to:

    theXmlDataTree instance <xml.etree.ElementTree.ElementTree instance at 0x67cb48>

    This result I can play with. It contains all the XML.

    et.parse seems to pull in the entire XML document and give me
    something to play with whereas et.XML and et.fromstring do not.

    Questions:
    How do I get this to work?
    Where in the docs did it give me an example of how to make this work
    (what did I miss from reading the docs)?

    .... and for bonus points ...

    Why isn't et.parse the only way to do this? Why have XML or fromstring
    at all? Why not enhance parse and deprecate XML and fromstring with
    something like:

    formPostData = cgi.FieldStorage()
    theXmlData = formPostData['theXml'].value
    theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))

    Thanks in advance,
    Kee Nethery
    Kee Nethery, Jun 26, 2009
    #1

  2. Kee Nethery

    Nobody Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    On Thu, 25 Jun 2009 18:02:25 -0700, Kee Nethery wrote:

    > Summary: I have XML as string and I want to pull it into ElementTree
    > so that I can play with it but it is not working for me. XML and
    > fromstring when used with a string do not do the same thing as parse
    > does with a file. How do I get this to work?


    Why do you need an ElementTree rather than an Element? XML(string) returns
    the root element, as if you had used et.parse(f).getroot(). You can turn
    this into an ElementTree with e.g. et.ElementTree(XML(string)).
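A minimal sketch of that relationship, written for Python 3 rather than the 2.6 used in this thread. The sample XML and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, shortened from the order XML in the question.
xml_text = "<xml><purchase id='1'/><customer id='123456'/></xml>"

# fromstring() (and its alias XML()) returns the root Element directly.
root = ET.fromstring(xml_text)

# Wrapping that Element yields the same kind of object parse() returns.
tree = ET.ElementTree(root)

# The children are already there on the Element; nothing is missing.
child_tags = [child.tag for child in root]
print(child_tags)
print(tree.getroot() is root)
```

The wrapper holds a reference to the same root object, so either form gives access to the full document.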

    > Why isn't et.parse the only way to do this? Why have XML or fromstring
    > at all? Why not enhance parse and deprecate XML and fromstring with
    > something like:
    >
    > formPostData = cgi.FieldStorage()
    > theXmlData = formPostData['theXml'].value
    > theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))


    If you want to treat a string as a file, use StringIO.
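As a rough Python 3 sketch of the same idea: the thread's Python 2 `StringIO` module corresponds to `io.StringIO` for text and `io.BytesIO` for bytes. The sample data is invented for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for the POSTed string.
xml_text = "<xml><purchase id='1'/></xml>"

# A text string becomes file-like via io.StringIO ...
tree_from_text = ET.parse(io.StringIO(xml_text))

# ... and a byte string via io.BytesIO.
tree_from_bytes = ET.parse(io.BytesIO(xml_text.encode("utf-8")))

purchase_id = tree_from_text.getroot().find("purchase").get("id")
print(purchase_id)
```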
    Nobody, Jun 26, 2009
    #2

  3. Kee Nethery

    unayok Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    On Jun 25, 9:02 pm, Kee Nethery wrote:
    > Summary: I have XML as string and I want to pull it into ElementTree
    > so that I can play with it but it is not working for me. XML and
    > fromstring when used with a string do not do the same thing as parse
    > does with a file. How do I get this to work?
    >
    > Details:
    > I have a CGI that receives XML via an HTTP POST as a POST variable
    > named 'theXml'. The POST data is a string that the CGI receives, it is
    > not a file on a hard disk.
    >
    > The POSTed string looks like this when viewed in pretty format:

    [...]
    > et.parse seems to pull in the entire XML document and give me
    > something to play with whereas et.XML and et.fromstring do not.
    >
    > Questions:
    > How do I get this to work?
    > Where in the docs did it give me an example of how to make this work
    > (what did I miss from reading the docs)?
    >

    [skipping bonus points question]

    I'm not sure what you're expecting. It looks to me like things are
    working okay:

    My test script:

    import xml.etree.ElementTree as ET

    data="""<xml>
    <purchase id="1" lang="en">
    <item id="1" productId="369369">
    <name>Autumn</name>
    <quantity>1</quantity>
    <price>8.46</price>
    </item>
    <javascript>YES</javascript>
    </purchase>
    <customer id="123456" time="1227449322">
    <shipping>
    <street>19 Any Street</street>
    <city>Berkeley</city>
    <state>California</state>
    <zip>12345</zip>
    <country>People's Republic of Berkeley</country>
    <name>Jon Roberts</name>
    </shipping>
    <email></email>
    </customer>
    </xml>"""

    xml = ET.fromstring( data )

    print xml
    print "attrib ", xml.attrib
    print "tag ", xml.tag
    print "text ", xml.text
    print "contents "
    for element in xml:
        print element
    print "tostring"
    print ET.tostring( xml )

    when run, produces:

    <Element xml at 7f582c2e82d8>
    attrib {}
    tag xml
    text

    contents
    <Element purchase at 7f582c2e8320>
    <Element customer at 7f582c2e85a8>
    tostring
    <xml>
    <purchase id="1" lang="en">
    <item id="1" productId="369369">
    <name>Autumn</name>
    <quantity>1</quantity>
    <price>8.46</price>
    </item>
    <javascript>YES</javascript>
    </purchase>
    <customer id="123456" time="1227449322">
    <shipping>
    <street>19 Any Street</street>
    <city>Berkeley</city>
    <state>California</state>
    <zip>12345</zip>
    <country>People's Republic of Berkeley</country>
    <name>Jon Roberts</name>
    </shipping>
    <email></email>
    </customer>
    </xml>

    Which seems to me quite useful (i.e. it has the full XML available).
    Maybe you can explain how you were trying to "play with" the results
    of fromstring() that you can't do from parse().

    The documentation for elementtree indicates:

    > The ElementTree wrapper type adds code to load XML files as trees
    > of Element objects, and save them back again.


    and

    > The Element type can be used to represent XML files in memory.
    > The ElementTree wrapper class is used to read and write XML files.


    In the above case, you should find that the getroot() of your loaded
    ElementTree instance ( parse().getroot() ) to be the same as the
    Element generated by fromstring().
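One way to check that claim, sketched in Python 3 with a shortened, made-up document: parse the same text both ways and compare the serialized results.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data.
data = "<xml><purchase id='1'><name>Autumn</name></purchase></xml>"

# Route 1: fromstring() gives the root Element.
from_string = ET.fromstring(data)

# Route 2: parse() gives an ElementTree; getroot() unwraps it.
from_parse = ET.parse(io.StringIO(data)).getroot()

# They are distinct objects, but they serialize to the same markup.
same_markup = ET.tostring(from_string) == ET.tostring(from_parse)
print(same_markup)
```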
    unayok, Jun 26, 2009
    #3
  4. Kee Nethery

    Carl Banks Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    On Jun 25, 6:02 pm, Kee Nethery wrote:
    > Summary: I have XML as string and I want to pull it into ElementTree  
    > so that I can play with it but it is not working for me. XML and  
    > fromstring when used with a string do not do the same thing as parse  
    > does with a file. How do I get this to work?
    >
    > Details:
    > I have a CGI that receives XML via an HTTP POST as a POST variable  
    > named 'theXml'. The POST data is a string that the CGI receives, it is  
    > not a file on a hard disk.
    >
    > The POSTed string looks like this when viewed in pretty format:
    >
    > <xml>
    >         <purchase id="1" lang="en">
    >                 <item id="1" productId="369369">
    >                         <name>Autumn</name>
    >                         <quantity>1</quantity>
    >                         <price>8.46</price>
    >                 </item>
    >                 <javascript>YES</javascript>
    >         </purchase>
    >         <customer id="123456" time="1227449322">
    >                 <shipping>
    >                         <street>19 Any Street</street>
    >                         <city>Berkeley</city>
    >                         <state>California</state>
    >                         <zip>12345</zip>
    >                         <country>People's Republic of Berkeley</country>
    >                         <name>Jon Roberts</name>
    >                 </shipping>
    >                 <email></email>
    >         </customer>
    > </xml>
    >
    > The pseudocode in Python 2.6.2 looks like:
    >
    > import xml.etree.ElementTree as et
    >
    > formPostData = cgi.FieldStorage()
    > theXmlData = formPostData['theXml'].value
    > theXmlDataTree = et.XML(theXmlData)
    >
    > and when this runs, theXmlDataTree is set to:
    >
    > theXmlDataTree  instance        <Element xml at 7167b0>
    >         attrib  dict    {}
    >         tag     str     xml
    >         tail    NoneType        None
    >         text    NoneType        None
    >
    > I get the same result with fromstring:
    >
    > formPostData = cgi.FieldStorage()
    > theXmlData = formPostData['theXml'].value
    > theXmlDataTree = et.fromstring(theXmlData)
    >
    > I can put the xml in a file and reference the file by it's URL and use:
    >
    > et.parse(urllib.urlopen(theUrl))
    >
    > and that will set theXmlDataTree to:
    >
    > theXmlDataTree  instance        <xml.etree.ElementTree.ElementTree instance at 0x67cb48>
    >
    > This result I can play with. It contains all the XML.


    I believe you are misunderstanding something. et.XML and
    et.fromstring return Elements, whereas et.parse returns an
    ElementTree. These are two different things; however, both of them
    "contain all the XML". In fact, an ElementTree (which is returned by
    et.parse) is just a container for the root Element (returned by
    et.fromstring)--and it adds no important functionality to the root
    Element as far as I can tell.

    Given an Element (as returned by et.XML or et.fromstring) you can pass
    it to the ElementTree constructor to get an ElementTree instance. The
    following line should give you something you can "play with":

    theXmlDataTree = et.ElementTree(et.fromstring(theXmlData))

    Conversely, given an ElementTree (as returned by et.parse) you can
    call the getroot method to obtain the root Element, like so:

    theXmlRootElement = et.parse(xmlfile).getroot()

    I have no use for ElementTree instances so I always call getroot right
    away and only store the root element. You may prefer to work with
    ElementTrees rather than with Elements directly, and that's perfectly
    fine; just use the technique above to wrap up the root Element if you
    use et.fromstring.
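For what it is worth, one thing the wrapper does offer is document-level output through its `write()` method. A small Python 3 sketch, with invented sample data, that wraps a root Element and writes it with an XML declaration:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data.
root = ET.fromstring("<xml><purchase id='1'/></xml>")
tree = ET.ElementTree(root)

# ElementTree.write() serializes the whole document, here into an
# in-memory buffer, and can prepend an XML declaration.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8", xml_declaration=True)
document = buffer.getvalue()

# tostring() on the bare Element gives just the markup by default.
fragment = ET.tostring(root)

print(document)
print(fragment)
```

Whether that is "important functionality" is a matter of taste; for in-memory work the root Element alone is usually enough.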


    [snip]
    > Why isn't et.parse the only way to do this? Why have XML or fromstring  
    > at all?


    Because Fredrik Lundh wanted it that way. Unlike most Python
    libraries ElementTree is under the control of one person, which means
    it was not designed or vetted by the community, which means it would
    tend to have some interface quirks. You shouldn't complain: the
    library is superb compared to XML solutions like DOM. A few minor
    things should be no big deal.


    Carl Banks
    Carl Banks, Jun 26, 2009
    #4
  5. Kee Nethery

    Kee Nethery Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    Thank you to everyone. I'll play with these suggestions tomorrow at
    work and report back.

    On Jun 25, 2009, at 8:04 PM, Carl Banks wrote:

    > Because Fredrik Lundh wanted it that way. Unlike most Python
    > libraries ElementTree is under the control of one person, which means
    > it was not designed or vetted by the community, which means it would
    > tend to have some interface quirks.


    Yep

    > You shouldn't complain: the
    > library is superb compared to XML solutions like DOM.


    Which is why I want to use it.

    > A few minor
    > things should be no big deal.


    True and I will eventually get past the minor quirks. As a newbie,
    figured I'd point out the difficult portions, things that conceptually
    are confusing. I know that after lots of use I'm not going to notice
    that it is strange that I have to stand on my head and touch my nose 3
    times to open the fridge door. The contortions will seem normal.

    Results tomorrow, thanks everyone for the assistance.

    Kee Nethery
    Kee Nethery, Jun 26, 2009
    #5
  6. Kee Nethery

    Carl Banks Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    On Jun 25, 8:53 pm, Kee Nethery wrote:
    > On Jun 25, 2009, at 8:04 PM, Carl Banks wrote:
    > > A few minor
    > > things should be no big deal.

    >
    > True and I will eventually get past the minor quirks. As a newbie,  
    > figured I'd point out the difficult portions, things that conceptually  
    > are confusing. I know that after lots of use I'm not going to notice  
    > that it is strange that I have to stand on my head and touch my nose 3  
    > times to open the fridge door. The contortions will seem normal.


    Well it's not *that* bad.

    (That would be PIL. :)


    Carl Banks
    Carl Banks, Jun 26, 2009
    #6
  7. Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    Carl Banks wrote:
    >> Why isn't et.parse the only way to do this? Why have XML or fromstring
    >> at all?

    >
    > Because Fredrik Lundh wanted it that way. Unlike most Python
    > libraries ElementTree is under the control of one person, which means
    > it was not designed or vetted by the community, which means it would
    > tend to have some interface quirks.


    Just for the record: Fredrik doesn't actually consider it a design "quirk".
    He argues that it's designed for different use cases. While parse() parses
    a file, which normally contains a complete document (represented in ET as
    an ElementTree object), fromstring() and especially the 'literal wrapper'
    XML() are made for parsing strings, which (most?) often only contain XML
    fragments. With a fragment, you normally want to continue doing things like
    inserting it into another tree, so you need the top-level element in almost
    all cases.
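A short Python 3 sketch of that fragment use case. The element names and contents are hypothetical.

```python
import xml.etree.ElementTree as ET

# An existing document, held as its root Element.
document = ET.fromstring("<order><items/></order>")

# A fragment written as a literal in the source code; XML() returns
# its top-level Element, which is what append() needs.
fragment = ET.XML("<item id='1'><name>Autumn</name></item>")

# Insert the fragment into the existing tree.
document.find("items").append(fragment)

result = ET.tostring(document, encoding="unicode")
print(result)
```

Had XML() returned an ElementTree, the append would need an extra getroot() call first.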

    Stefan
    Stefan Behnel, Jun 26, 2009
    #7
  8. Kee Nethery

    Carl Banks Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    On Jun 25, 10:11 pm, Stefan Behnel wrote:
    > Carl Banks wrote:
    > >> Why isn't et.parse the only way to do this? Why have XML or fromstring  
    > >> at all?

    >
    > > Because Fredrik Lundh wanted it that way.  Unlike most Python
    > > libraries ElementTree is under the control of one person, which means
    > > it was not designed or vetted by the community, which means it would
    > > tend to have some interface quirks.

    >
    > Just for the record: Fredrik doesn't actually consider it a design "quirk".


    Well of course he wouldn't--it's his library.

    > He argues that it's designed for different use cases. While parse() parses
    > a file, which normally contains a complete document (represented in ET as
    > an ElementTree object), fromstring() and especially the 'literal wrapper'
    > XML() are made for parsing strings, which (most?) often only contain XML
    > fragments. With a fragment, you normally want to continue doing things like
    > inserting it into another tree, so you need the top-level element in almost
    > all cases.


    Whatever, like I said I am not going to nit-pick over small things,
    when all the big things are done right.


    Carl Banks
    Carl Banks, Jun 26, 2009
    #8
  9. Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    Carl Banks wrote:
    > On Jun 25, 10:11 pm, Stefan Behnel wrote:
    >> Carl Banks wrote:
    >>>> Why isn't et.parse the only way to do this? Why have XML or fromstring
    >>>> at all?
    >>> Because Fredrik Lundh wanted it that way. Unlike most Python
    >>> libraries ElementTree is under the control of one person, which means
    >>> it was not designed or vetted by the community, which means it would
    >>> tend to have some interface quirks.

    >> Just for the record: Fredrik doesn't actually consider it a design "quirk".

    >
    > Well of course he wouldn't--it's his library.


    That's not an argument at all. Fredrik put out an alpha of ET 1.3 (long ago,
    actually), which is (or was?) meant as a clean-up release for a number of
    real quirks in the library (lxml also fixes most of them since 2.0). The
    above definitely hasn't changed, simply because it's not considered 'wrong'
    by the author(s).

    Stefan
    Stefan Behnel, Jun 26, 2009
    #9
  10. Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    Hi,

    Kee Nethery wrote:
    > Why isn't et.parse the only way to do this? Why have XML or fromstring
    > at all?


    Well, use cases. XML() is an alias for fromstring(), because it's
    convenient (and well readable) to write

    section = XML('<section id="XYZ"><title>A to Z</title></section>')
    section.append(paragraphs)

    for XML literals in source code. fromstring() is there because when you
    want to parse a fragment from a string that you got from whatever source,
    it's easy to express that with exactly that function, as in

    el = fromstring(some_string)

    If you want to parse a document from a file or file-like object, use
    parse(). Three use cases, three functions. The fourth use case of parsing a
    document from a string does not have its own function, because it is
    trivial to write

    tree = parse(BytesIO(some_byte_string))

    I do not argue that fromstring() should necessarily return an Element, as
    parsing fragments is more likely for literals than for strings that come
    from somewhere else. However, given that the use case of parsing a document
    from a string is so easily handled with parse(), I find it ok to give the
    second use case its own function, simply because

    tree = fromstring(some_string)
    fragment_top_element = tree.getroot()

    absolutely does not catch it.


    > Why not enhance parse and deprecate XML and fromstring with
    > something like:
    >
    > formPostData = cgi.FieldStorage()
    > theXmlData = formPostData['theXml'].value
    > theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))

    This will not work because ET cannot parse from unicode strings (unless
    they only contain plain ASCII characters and you happen to be using Python
    2.x). lxml can parse from unicode strings, but it requires that the XML
    must not have an encoding declaration (which would render it non
    well-formed). This is convenient for parsing HTML, it's less convenient for
    XML usually.

    If what you meant is actually parsing from a byte string, this is easily
    done using BytesIO(), or StringIO() in Py2.x (x<6).
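To illustrate why byte strings are the natural parser input, here is a small Python 3 sketch (where the two types are `bytes` and `str`). The payload is invented: ISO-8859-1 bytes that carry their own encoding declaration, which the parser uses to decode them.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical payload: the declaration describes how the BYTES are encoded.
text = '<?xml version="1.0" encoding="iso-8859-1"?><city>Zürich</city>'
payload = text.encode("iso-8859-1")

# parse() reads the bytes and honors the declared encoding.
root = ET.parse(io.BytesIO(payload)).getroot()
city = root.text
print(city)
```

Once the data has already been decoded to a text string, the declaration no longer describes anything real, which is the mismatch discussed above.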

    Stefan
    Stefan Behnel, Jun 26, 2009
    #10
  11. Kee Nethery

    Carl Banks Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    On Jun 25, 11:20 pm, Stefan Behnel wrote:
    > Carl Banks wrote:
    > > On Jun 25, 10:11 pm, Stefan Behnel wrote:
    > >> Carl Banks wrote:
    > >>>> Why isn't et.parse the only way to do this? Why have XML or fromstring  
    > >>>> at all?
    > >>> Because Fredrik Lundh wanted it that way.  Unlike most Python
    > >>> libraries ElementTree is under the control of one person, which means
    > >>> it was not designed or vetted by the community, which means it would
    > >>> tend to have some interface quirks.
    > >> Just for the record: Fredrik doesn't actually consider it a design "quirk".

    >
    > > Well of course he wouldn't--it's his library.

    >
    > That's not an argument at all.


    I can't even imagine what you think I was arguing when I wrote this,
    or what issue you could have with this statement.


    Carl Banks
    Carl Banks, Jun 26, 2009
    #11
  12. Kee Nethery

    Kee Nethery Guest

    Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    First, thanks to everyone who responded. Figured I'd test all the
    suggestions and provide a response to the list. Here goes ...

    On Jun 25, 2009, at 7:38 PM, Nobody wrote:

    > Why do you need an ElementTree rather than an Element? XML(string) returns
    > the root element, as if you had used et.parse(f).getroot(). You can turn
    > this into an ElementTree with e.g. et.ElementTree(XML(string)).


    I tried this:
    et.ElementTree(XML(theXmlData))
    and it did not work.

    I had to modify it to this to get it to work:
    et.ElementTree(et.XML(theXmlData))


    >> formPostData = cgi.FieldStorage()
    >> theXmlData = formPostData['theXml'].value
    >> theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))

    >
    > If you want to treat a string as a file, use StringIO.


    I tried this:
    import StringIO
    theXmlDataTree = et.parse(StringIO.StringIO(theXmlData))
    orderXml = theXmlDataTree.findall('purchase')

    and it did work. StringIO converts the string into what looks like a
    file so parse can process it as a file. Cool.
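The same steps sketched for Python 3, where the module is `io` rather than `StringIO`. The XML is a shortened, illustrative version of the order data, and the variable names are made up.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the POSTed order XML.
the_xml_data = """<xml>
  <purchase id="1" lang="en">
    <item id="1" productId="369369"><name>Autumn</name><price>8.46</price></item>
  </purchase>
  <customer id="123456"/>
</xml>"""

# Wrap the string so parse() can read it like a file.
tree = ET.parse(io.StringIO(the_xml_data))

# findall() on the tree searches the children of the root element.
purchases = tree.findall("purchase")

# Drill down to the item names inside each purchase.
item_names = [
    item.findtext("name")
    for purchase in purchases
    for item in purchase.findall("item")
]
print(item_names)
```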

    On Jun 25, 2009, at 7:47 PM, unayok wrote:

    > I'm not sure what you're expecting. It looks to me like things are
    > working okay:
    >
    > My test script:
    >
    > [snip]


    I agree your code works.

    When I tried:
    theXmlDataTree = et.fromstring(theXmlData)
    orderXml = theXmlDataTree.findall('purchase')

    When I modified mine to programmatically look inside using the "for
    element in theXmlDataTree" I was able to see the contents. The
    debugger I am using does not offer me a window into the ElementTree
    data and that was part of the problem. So yes, et.fromstring is
    working correctly. It helps to have someone show me the missing step
    needed to confirm the code works and the IDE does not.



    On Jun 25, 2009, at 8:04 PM, Carl Banks wrote:
    > I believe you are misunderstanding something. et.XML and
    > et.fromstring return Elements, whereas et.parse returns an
    > ElementTree. These are two different things; however, both of them
    > "contain all the XML". In fact, an ElementTree (which is returned by
    > et.parse) is just a container for the root Element (returned by
    > et.fromstring)--and it adds no important functionality to the root
    > Element as far as I can tell.


    Thank you for explaining the difference. I absolutely was
    misunderstanding this.

    > Given an Element (as returned by et.XML or et.fromstring) you can pass
    > it to the ElementTree constructor to get an ElementTree instance. The
    > following line should give you something you can "play with":
    >
    > theXmlDataTree = et.ElementTree(et.fromstring(theXmlData))


    Yes this works.



    On Jun 25, 2009, at 11:39 PM, Stefan Behnel wrote:

    > If you want to parse a document from a file or file-like object, use
    > parse(). Three use cases, three functions. The fourth use case of parsing a
    > document from a string does not have its own function, because it is
    > trivial to write
    >
    > tree = parse(BytesIO(some_byte_string))


    :) Trivial for someone familiar with the language. For a newbie like
    me, that step was non-obvious.

    > If what you meant is actually parsing from a byte string, this is easily
    > done using BytesIO(), or StringIO() in Py2.x (x<6).


    Yes, thanks! Looks like BytesIO is a v.3.x enhancement. Looks like the
    StringIO does what I need since all I'm doing is pulling the unicode
    string into et.parse. Am guessing that either would work equally well.


    >> theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))
    >
    > This will not work because ET cannot parse from unicode strings (unless
    > they only contain plain ASCII characters and you happen to be using Python
    > 2.x). lxml can parse from unicode strings, but it requires that the XML
    > must not have an encoding declaration (which would render it non
    > well-formed). This is convenient for parsing HTML, it's less convenient for
    > XML usually.


    Right for my example, if the data is coming in as UTF-8 I believe I
    can do:
    theXmlDataTree = et.parse(StringIO.StringIO(theXmlData), encoding='utf-8')


    Again, as a newbie, thanks to everyone who took the time to respond.
    Very helpful.
    Kee
    Kee Nethery, Jun 26, 2009
    #12
  13. Re: ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

    Kee Nethery wrote:
    > On Jun 25, 2009, at 11:39 PM, Stefan Behnel wrote:
    >> parsing a
    >> document from a string does not have its own function, because it is
    >> trivial to write
    >>
    >> tree = parse(BytesIO(some_byte_string))

    >
    > :) Trivial for someone familiar with the language. For a newbie like
    > me, that step was non-obvious.


    I actually meant the code complexity, not the fact that you need to know
    BytesIO to do the above.


    >> If what you meant is actually parsing from a byte string, this is easily
    >> done using BytesIO(), or StringIO() in Py2.x (x<6).

    >
    > Yes, thanks! Looks like BytesIO is a v.3.x enhancement.


    It should be available in 2.6 AFAIR, simply as an alias for StringIO.


    > Looks like the
    > StringIO does what I need since all I'm doing is pulling the unicode
    > string into et.parse.


    As I said, this won't work, unless you are either

    a) passing a unicode string with plain ASCII characters in Py2.x
    or
    b) confusing UTF-8 and Unicode


    >>> theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))
    >>
    >> This will not work because ET cannot parse from unicode strings (unless
    >> they only contain plain ASCII characters and you happen to be using
    >> Python 2.x). lxml can parse from unicode strings, but it requires that
    >> the XML must not have an encoding declaration (which would render it non
    >> well-formed). This is convenient for parsing HTML, it's less
    >> convenient for XML usually.

    >
    > Right for my example, if the data is coming in as UTF-8 I believe I can do:
    > theXmlDataTree = et.parse(StringIO.StringIO(theXmlData), encoding='utf-8')


    Yes, although in this case you are not parsing a unicode string but a UTF-8
    encoded byte string. Plus, passing 'UTF-8' as encoding to the parser is
    redundant, as it is the default for XML.
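A hedged Python 3 sketch of both halves of that point. As far as I know, `parse()` itself takes a `parser` argument rather than an `encoding` keyword, so an explicit override goes through `XMLParser(encoding=...)`. The sample data is invented.

```python
import io
import xml.etree.ElementTree as ET

# No encoding declaration: the parser assumes UTF-8, so nothing extra is needed.
utf8_bytes = "<city>Zürich</city>".encode("utf-8")
city_default = ET.parse(io.BytesIO(utf8_bytes)).getroot().text

# Bytes in another encoding, still without a declaration: pass a parser
# that overrides the encoding instead of an encoding argument to parse().
latin1_bytes = "<city>Zürich</city>".encode("iso-8859-1")
latin1_parser = ET.XMLParser(encoding="iso-8859-1")
city_override = ET.parse(io.BytesIO(latin1_bytes), parser=latin1_parser).getroot().text

print(city_default, city_override)
```

For data that really is UTF-8, the plain call in the first half is all that is required.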

    Stefan
    Stefan Behnel, Jun 27, 2009
    #13
