ElementTree Issue - Search and remove elements



Tharanga Abeyseela

Hi guys,

I need to remove the parent node if a particular match is found.

For example:


<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
<TVEpisode>
<Provider>0x5</Provider>
<ItemId>http://fxxxxxxl</ItemId>
<Title>WWE</Title>
<SortTitle>WWE </SortTitle>
<Description>WWE</Description>
<IsUserGenerated>false</IsUserGenerated>
<Images>
<Image>
<ImagePurpose>BoxArt</ImagePurpose>
<Url>https://xxxxxx.xx/@006548-thumb.jpg</Url>
</Image>
</Images>
<LastModifiedDate>2012-10-16T00:00:19.814+11:00</LastModifiedDate>
<Genres>
<Genre>xxxxx</Genre>
</Genres>
<ParentalControl>
<System>xxxx</System>
<Rating>M</Rating>


If I find <Rating>NC</Rating>, I need to remove the enclosing <TVEpisode>
from the XML. I also have TV series, movies, and several other item types
(they have a Rating element too). I need to remove every item that has the
NC keyword inside <Rating>.


I'm using the following code.

When I try this in the Python shell, I can see the result (NC, M, etc.):
'NC'

But when I do the same thing inside the script, I get the following error:

Traceback (most recent call last):
  File "./test.py", line 10, in ?
    x = child.find('Rating').text
AttributeError: 'NoneType' object has no attribute 'text'


How should I remove the parent node when I find the string "NC"? I need
to do this for all element types (TVEpisode, Movies, TVshow, etc.), not
only for TVEpisode. How can I do that in Python?


#!/usr/bin/env python

import elementtree.ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()


for child in root.findall(".//{http://schemas.CCC.com/CCC/2011/06/13/CC}Rating"):
    x = child.find('Rating').text
    if child[1].text == 'NC':
        print "found"
        root.remove('TVEpisode') ?????
tree.write('output.xml')


I'd really appreciate your thoughts on this.

Thanks in advance,
Tharanga
 

Alain Ketterlin

Tharanga Abeyseela said:
I need to remove the parent node, if a particular match found.

It looks like you can't get the parent of an Element with elementtree (I
would love to be proven wrong on this).

The solution is to find all nodes that have a Rating (grand-) child, and
then test explicitly for the value you're looking for.
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
<TVEpisode> [...]
<ParentalControl>
<System>xxxx</System>
<Rating>M</Rating>


for child in root.findall(".//{http://schemas.CCC.com/CCC/2011/06/13/CC}Rating"):
    x = child.find('Rating').text
    if child[1].text == 'NC':
        print "found"
        root.remove('TVEpisode') ?????

Your code doesn't work because findall() already returns Rating
elements, and these have no Rating child (so your first call to find()
fails, i.e., returns None). And list indexes start at 0, by the way.

Also, Rating is not a child of TVEpisode, it is a child of
ParentalControl.
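To make the failure concrete, here is a minimal sketch (not the original script) that reproduces it on a tiny document. It assumes Python 3 and the standard-library xml.etree.ElementTree, and it leaves the namespace out to keep the example short; the document string is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document, no namespace (an assumption made for brevity).
doc = ("<Feed><TVEpisode><ParentalControl>"
       "<System>xxxx</System><Rating>NC</Rating>"
       "</ParentalControl></TVEpisode></Feed>")
root = ET.fromstring(doc)

# findall() already yields the Rating elements themselves.
ratings = root.findall(".//Rating")
rating = ratings[0]

nested = rating.find("Rating")  # a Rating has no Rating child, so this is None
child_count = len(rating)       # 0 children, so rating[1] would raise IndexError
text = rating.text              # the value is on the element itself

print(nested, child_count, text)  # prints: None 0 NC
```

So inside the loop, `child.text` is the value to compare, and `.find('Rating').text` is what raises the AttributeError.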

Here is my suggestion:

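# Find nodes having a ParentalControl child
# (with the feed's default namespace, each tag name in these paths
# needs the {namespace-uri} prefix, as in your findall() call)
for child in root.findall(".//*[ParentalControl]"):
    x = child.find("ParentalControl/Rating").text
    if x == "NC":
        ...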

Note that a complete XPath implementation would make that simpler: your
query is basically //*[ParentalControl/Rating="NC"]
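Putting these points together, a minimal sketch of the removal step might look like the following. It assumes Python 3 with the standard-library xml.etree.ElementTree, and it assumes that every item (TVEpisode, Movie, and so on) is a direct child of the root Feed element, as in the sample. The namespace URI, the Movie element, and the helper name remove_rated are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; substitute the one from the real feed.
NS = "{http://schemas.example.com/feed}"

XML = """<Feed xmlns="http://schemas.example.com/feed">
  <TVEpisode>
    <Title>Kept</Title>
    <ParentalControl><System>x</System><Rating>M</Rating></ParentalControl>
  </TVEpisode>
  <Movie>
    <Title>Dropped</Title>
    <ParentalControl><System>x</System><Rating>NC</Rating></ParentalControl>
  </Movie>
</Feed>"""


def remove_rated(root, rating="NC"):
    """Remove direct children of root whose ParentalControl/Rating equals rating."""
    removed = []
    # Iterate over a copy, because root is modified inside the loop.
    for item in list(root):
        found = item.findtext(NS + "ParentalControl/" + NS + "Rating")
        if found is not None and found.strip() == rating:
            root.remove(item)  # remove() takes the Element itself, not a tag name
            removed.append(item)
    return removed


root = ET.fromstring(XML)
removed = remove_rated(root)
remaining_titles = [item.findtext(NS + "Title") for item in root]
print(remaining_titles)
```

Because the loop walks the children of the root directly, the parent of each item is already known and no parent lookup is needed. With a parsed file, calling tree.write('output.xml') afterwards would save the result.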

-- Alain.
 

Stefan Behnel

Alain Ketterlin, 17.10.2012 08:25:
It looks like you can't get the parent of an Element with elementtree (I
would love to be proven wrong on this).

No, that's by design. ElementTree allows you to reuse subtrees in a
document, for example, which wouldn't work if you enforced a single parent.
Also, keeping parent references out simplifies the tree structure
considerably, saves space and time and all that. ElementTree is really
great for what it does.

If you need to access the parent more often in a read-only tree, you can
quickly build up a back reference dict that maps each Element to its parent
by traversing the tree once.
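A back reference dict of that kind can be sketched as follows, assuming Python 3 and the standard-library xml.etree.ElementTree. The Shows wrapper element and the names build_parent_map and remove_items_rated are made up for the example, and the namespace is omitted for brevity. The dict is a snapshot: it goes stale once the tree is edited, so here it is used for a single removal pass only.

```python
import xml.etree.ElementTree as ET

# Illustrative document; the <Shows> wrapper is hypothetical and only
# demonstrates that the parent need not be the root element.
XML = """<Feed>
  <Shows>
    <TVEpisode>
      <Title>Kept</Title>
      <ParentalControl><Rating>M</Rating></ParentalControl>
    </TVEpisode>
    <TVEpisode>
      <Title>Dropped</Title>
      <ParentalControl><Rating>NC</Rating></ParentalControl>
    </TVEpisode>
  </Shows>
</Feed>"""


def build_parent_map(root):
    """Map each element to its parent by traversing the tree once."""
    return {child: parent for parent in root.iter() for child in parent}


def remove_items_rated(root, rating="NC"):
    """Remove the grandparent (the item) of every Rating with the given text."""
    parents = build_parent_map(root)
    # findall() returns a list, so removing elements in the loop is safe.
    for rating_el in root.findall(".//Rating"):
        if rating_el.text == rating:
            control = parents[rating_el]  # the ParentalControl element
            item = parents[control]       # TVEpisode, Movie, ...
            parents[item].remove(item)


root = ET.fromstring(XML)
remove_items_rated(root)
titles = [t.text for t in root.iter("Title")]
print(titles)
```

The root element has no entry in the dict, which is a convenient way to tell that the top of the tree has been reached.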

Alternatively, use lxml.etree, in which Elements have a getparent() method
and in which single parents are enforced (also by design).

Stefan
 
