ElementTree Issue - Search and remove elements

Discussion in 'Python' started by Tharanga Abeyseela, Oct 17, 2012.

  1. Hi Guys,

    I need to remove the parent node if a particular match is found.

    Example:


    <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
    <Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
    <TVEpisode>
    <Provider>0x5</Provider>
    <ItemId>http://fxxxxxxl</ItemId>
    <Title>WWE</Title>
    <SortTitle>WWE </SortTitle>
    <Description>WWE</Description>
    <IsUserGenerated>false</IsUserGenerated>
    <Images>
    <Image>
    <ImagePurpose>BoxArt</ImagePurpose>
    <Url>https://xxxxxx.xx/@006548-thumb.jpg</Url>
    </Image>
    </Images>
    <LastModifiedDate>2012-10-16T00:00:19.814+11:00</LastModifiedDate>
    <Genres>
    <Genre>xxxxx</Genre>
    </Genres>
    <ParentalControl>
    <System>xxxx</System>
    <Rating>M</Rating>


    If I find <Rating>NC</Rating>, I need to remove the enclosing <TVEpisode>
    from the XML. The feed also contains TV series, movies, and several other
    item types, each with a Rating element. I need to remove every item whose
    <Rating> contains NC.


    I'm using the code below.

    When I do the following in the Python shell, I can see the result (NC, M, etc.):

    >>> x[1].text

    'NC'

    But when I do this inside the script, I get the following error:

    Traceback (most recent call last):
      File "./test.py", line 10, in ?
        x = child.find('Rating').text
    AttributeError: 'NoneType' object has no attribute 'text'


    How should I remove the parent node when I find the string "NC"? I need
    to do this for all element types (TVEpisode, Movies, TVshow, etc.), not
    only for TVEpisode.


    #!/usr/bin/env python

    import elementtree.ElementTree as ET

    tree = ET.parse('test.xml')
    root = tree.getroot()


    for child in root.findall(".//{http://schemas.CCC.com/CCC/2011/06/13/CC}Rating"):
        x = child.find('Rating').text
        if child[1].text == 'NC':
            print "found"
            root.remove('TVEpisode') ?????
    tree.write('output.xml')


    Really appreciate your thoughts on this.

    Thanks in advance,
    Tharanga
    Tharanga Abeyseela, Oct 17, 2012

  2. Tharanga Abeyseela writes:

    > I need to remove the parent node, if a particular match found.


    It looks like you can't get the parent of an Element with elementtree (I
    would love to be proven wrong on this).

    The solution is to find all nodes that have a Rating (grand-) child, and
    then test explicitly for the value you're looking for.

    > <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
    > <Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
    > <TVEpisode>

    [...]
    > <ParentalControl>
    > <System>xxxx</System>
    > <Rating>M</Rating>



    > for child in root.findall(".//{http://schemas.CCC.com/CCC/2011/06/13/CC}Rating"):
    >     x = child.find('Rating').text
    >     if child[1].text == 'NC':
    >         print "found"
    >         root.remove('TVEpisode') ?????


    Your code doesn't work because findall() already returns Rating
    elements, and these have no Rating child (so your first call to find()
    fails, i.e., returns None). And list indexes start at 0, btw.

    Also, Rating is not a child of TVEpisode, it is a child of
    ParentalControl.

    Here is my suggestion:

    # Find nodes having a ParentalControl child
    for child in root.findall(".//*[ParentalControl]"):
        x = child.find("ParentalControl/Rating").text
        if x == "NC":
            ...

    Note that a complete XPath implementation would make that simpler: your
    query basically is //*[ParentalControl/Rating="NC"]
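
    For reference, here is a minimal sketch of the approach suggested above,
    written for Python 3 with the standard-library xml.etree.ElementTree
    rather than the older elementtree package. The namespace URI, the sample
    feed, and the helper name remove_rated are made up for illustration, and
    the sketch assumes that the items (TVEpisode, Movie, ...) are direct
    children of the root element. The paths carry the namespace because the
    feed declares a default xmlns.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and sample data standing in for the real feed.
NS = "http://example.com/feed"
SAMPLE = """<Feed xmlns="http://example.com/feed">
  <TVEpisode>
    <Title>A</Title>
    <ParentalControl><System>xxxx</System><Rating>M</Rating></ParentalControl>
  </TVEpisode>
  <Movie>
    <Title>B</Title>
    <ParentalControl><System>xxxx</System><Rating>NC</Rating></ParentalControl>
  </Movie>
</Feed>"""


def remove_rated(root, rating, ns=NS):
    """Remove each direct child of root whose ParentalControl/Rating equals rating."""
    path = "{%s}ParentalControl/{%s}Rating" % (ns, ns)
    removed = []
    # Iterate over a copy: removing from root while iterating it skips elements.
    for item in list(root):
        # findtext() returns None when the path does not match, so items
        # without a Rating are simply kept.
        if item.findtext(path) == rating:
            root.remove(item)
            removed.append(item)
    return removed


root = ET.fromstring(SAMPLE)
removed = remove_rated(root, "NC")
print([el.tag for el in removed])
print([el.tag for el in root])
```

    Because remove() is called on the element that actually contains the
    item, no parent lookup is needed here. With a file on disk, ET.parse()
    and tree.write() would replace ET.fromstring() and the print calls.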

    -- Alain.
    Alain Ketterlin, Oct 17, 2012

  3. Alain Ketterlin, 17.10.2012 08:25:
    > It looks like you can't get the parent of an Element with elementtree (I
    > would love to be proven wrong on this).


    No, that's by design. ElementTree allows you to reuse subtrees in a
    document, for example, which wouldn't work if you enforced a single parent.
    Also, keeping parent references out simplifies the tree structure
    considerably, saves space and time and all that. ElementTree is really
    great for what it does.

    If you need to access the parent more often in a read-only tree, you can
    quickly build up a back reference dict that maps each Element to its parent
    by traversing the tree once.
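
    A minimal sketch of such a back-reference dict, assuming Python 3 and the
    standard-library xml.etree.ElementTree. The sample document (without a
    namespace, to keep it short) and the name parent_of are illustrative
    only.

```python
import xml.etree.ElementTree as ET

# Illustrative sample document, not the real feed.
SAMPLE = """<Feed>
  <TVEpisode>
    <ParentalControl><Rating>NC</Rating></ParentalControl>
  </TVEpisode>
  <Movie>
    <ParentalControl><Rating>M</Rating></ParentalControl>
  </Movie>
</Feed>"""

root = ET.fromstring(SAMPLE)

# One traversal of the tree: map every element to its parent.
# The root has no parent, so it never appears as a key.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Collect the matches first, so the tree is not modified while iterating it.
matches = [r for r in root.iter("Rating") if r.text == "NC"]

for rating in matches:
    control = parent_of[rating]    # the ParentalControl element
    item = parent_of[control]      # TVEpisode, Movie, ...
    parent_of[item].remove(item)   # detach the whole item from its parent

print([el.tag for el in root])
```

    The dict reflects the tree at the time it was built, so it should be
    rebuilt (or updated by hand) if it is needed again after elements have
    been removed.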

    Alternatively, use lxml.etree, in which Elements have a getparent() method
    and in which single parents are enforced (also by design).

    Stefan
    Stefan Behnel, Oct 17, 2012
