Ignoring XML Namespaces with ElementTree

Discussion in 'Python' started by Pete, Dec 3, 2009.

  1. Pete

    Pete Guest

    Is there any way to configure ElementTree to ignore the XML namespace?
    For the past couple of months, I've been using minidom to parse an XML
    file that is generated by a unit within my organization that can't
    stick with a standard. This hasn't been a problem until recently, when
    the script was given a 30MB file that, once parsed, increased the
    Python memory footprint by 1.0GB, and now I'm running into
    MemoryErrors. Based on Google searches and testing, it looks like
    ElementTree is much more memory-efficient and I'd like to switch.
    However, I'd like to be able to ignore the namespaces. These XML files
    tend to switch namespaces at random for no reason, and ignoring the
    namespaces would help the script adapt to the changes. Any help on
    this would be greatly appreciated. I'm having a hard time finding the
    answer.

    Additionally, does anyone know how ElementTree handles XML elements
    that include Unicode?
     
    Pete, Dec 3, 2009
    #1

  2. Pete, 03.12.2009 19:21:
    > Is there any way to configure ElementTree to ignore the XML namespace?
    > For the past couple of months, I've been using minidom to parse an XML
    > file that is generated by a unit within my organization that can't
    > stick with a standard. This hasn't been a problem until recently, when
    > the script was given a 30MB file that, once parsed, increased the
    > Python memory footprint by 1.0GB, and now I'm running into
    > MemoryErrors. Based on Google searches and testing, it looks like
    > ElementTree is much more memory-efficient and I'd like to switch.


    Make sure you use cElementTree; then it is certainly the right choice.
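
    A minimal sketch of that advice, under a few assumptions: the
    separate cElementTree module was removed in Python 3.9 (the plain
    xml.etree.ElementTree module has used the C accelerator on its own
    since 3.3), so the import falls back. The "report" and "item" element
    names and the in-memory sample are made up for illustration. Streaming
    with iterparse() is an extra suggestion, not something proposed in
    this thread; it may help keep memory flat on a 30MB input.

```python
import io

# Prefer the C accelerator where it still exists as a separate module;
# on Python 3.9+ the import fails and the plain module is used instead.
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# Hypothetical stand-in for the large file; a real script would pass a path.
sample = io.BytesIO(
    b"<report><item>1</item><item>2</item><item>3</item></report>"
)

values = []
# iterparse() yields each element as soon as its end tag has been read.
for event, elem in ET.iterparse(sample, events=("end",)):
    if elem.tag == "item":
        values.append(int(elem.text))
        # Drop the element's text, attributes and children once handled,
        # so processed data does not pile up in memory.
        elem.clear()
```

    Whether streaming is worth the extra code depends on how the script
    walks the document; a plain ET.parse() may already be enough.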


    > however I'd
    > like to be able to ignore the namespaces. These XML files tend to
    > randomly switch the namespace for no reason and ignoring these
    > namespaces would help the script adapt to the changes. Any help on
    > this would be greatly appreciated. I'm having a hard time finding the
    > answer.


    ET uses namespace URIs as part of the tag name, so if you want to ignore
    namespaces, just strip the leading "{...}" (if any) from the tag and work
    with the rest (so-called "local name").
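
    The stripping described above could look roughly like this. It is a
    sketch, not a library feature: local_name() and strip_namespaces()
    are hypothetical helper names, and the sample document and its
    namespace URI are invented. Only element tags are rewritten;
    namespaced attribute names are left alone.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without a leading "{namespace-uri}" part, if any."""
    if isinstance(tag, str) and tag.startswith("{"):
        return tag.split("}", 1)[1]
    return tag


def strip_namespaces(root):
    """Rewrite every element tag under root to its local name, in place."""
    for elem in root.iter():
        elem.tag = local_name(elem.tag)
    return root


# Invented sample mixing a namespaced and a plain element.
doc = (
    '<r:report xmlns:r="http://example.com/v1">'
    "<r:item>a</r:item><item>b</item>"
    "</r:report>"
)

root = strip_namespaces(ET.fromstring(doc))
items = [elem.text for elem in root.findall("item")]
```

    After the rewrite, plain names such as "item" match in find() and
    findall() no matter which namespace the file happened to use.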


    > Additionally, does anyone know how ElementTree handles XML elements
    > that include Unicode?


    It's an XML parser, so the answer is: without any difficulties.
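
    A small illustration of that point, assuming Python 3 and a UTF-8
    encoded input; the "name" element and its text are made up:

```python
import xml.etree.ElementTree as ET

# Invented sample: UTF-8 encoded bytes containing non-ASCII characters.
data = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<name>Gr\u00fc\u00dfe</name>"
).encode("utf-8")

root = ET.fromstring(data)
text = root.text  # decoded text as a str, not raw bytes

# Serialize back to UTF-8 bytes and parse again to check the round trip.
round_trip = ET.fromstring(ET.tostring(root, encoding="utf-8")).text
```

    The parser decodes according to the encoding declared in the
    document, so the script works with ordinary text strings.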

    Stefan
     
    Stefan Behnel, Dec 3, 2009
    #2

  3. Pete

    Pete Guest

    Perfect... I can work with that. Thanks.
     
    Pete, Dec 3, 2009
    #3
