parsing XML

kaklis

Hi all, let's say we have the following XML:
<team>
  <player name='Mick Fowler' age='27' height='1.96m'>
    <points>17.1</points>
    <rebounds>6.4</rebounds>
  </player>
  <player name='Ivan Ivanovic' age='29' height='2.04m'>
    <points>15.5</points>
    <rebounds>7.8</rebounds>
  </player>
</team>

How can I get each player's name, age and height?
DOM or SAX, and how?

Thanks
Antonis
 
Martin v. Loewis

Homework?

Martin
 
Stefan Behnel

(e-mail address removed), 14.05.2010 16:57:
Hi all, let's say we have the following XML:
<team>
  <player name='Mick Fowler' age='27' height='1.96m'>
    <points>17.1</points>
    <rebounds>6.4</rebounds>
  </player>
  <player name='Ivan Ivanovic' age='29' height='2.04m'>
    <points>15.5</points>
    <rebounds>7.8</rebounds>
  </player>
</team>

How can I get each player's name, age and height?

Here's an overly complicated solution, but I thought that an object
oriented design would help here.


import xml.etree.ElementTree as ET

class Player(object):
    def __init__(self, name, age, height):
        self.name, self.age, self.height = name, age, height

attributes = ['name', 'age', 'height']

players = []
# iterparse yields each element once its end tag has been read
for _, element in ET.iterparse("teamfile.xml"):
    if element.tag == 'player':
        players.append(
            Player(*[element.get(attr) for attr in attributes]))

for player in players:
    print(player.name, player.age, player.height)
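
For readers who want to try this without a teamfile.xml on disk, here is a minimal sketch of the same attribute lookup on an in-memory string. The names TEAM_XML and read_players are made up for this illustration and are not part of the original post; ET.fromstring and Element.iter are standard ElementTree calls.

```python
import xml.etree.ElementTree as ET

# Sample document from the original question (hypothetical variable name).
TEAM_XML = """\
<team>
  <player name='Mick Fowler' age='27' height='1.96m'>
    <points>17.1</points>
    <rebounds>6.4</rebounds>
  </player>
  <player name='Ivan Ivanovic' age='29' height='2.04m'>
    <points>15.5</points>
    <rebounds>7.8</rebounds>
  </player>
</team>
"""

ATTRIBUTES = ['name', 'age', 'height']


def read_players(xml_text):
    """Return one (name, age, height) tuple of strings per <player> element."""
    root = ET.fromstring(xml_text)
    return [tuple(player.get(attr) for attr in ATTRIBUTES)
            for player in root.iter('player')]


for name, age, height in read_players(TEAM_XML):
    print(name, age, height)
```

All values come back as strings, so age and height would still need converting if they are to be used as numbers.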


Stefan
 
kaklis

Thanks, Stefan!

A.K.
 
Stefan Behnel

Martin v. Loewis, 14.05.2010 17:15:
Homework?

I would hope that every school teacher who teaches Python is able to skip
through c.l.py and the python-tutor list before accepting a homework result.

Stefan
 
Adam Tauno Williams

from lxml import etree

# Parse the file; the with-block closes the handle afterwards
with open('file', 'rb') as handle:
    doc = etree.parse(handle)

players = []
for player in doc.xpath('/team/player'):
    # xpath() returns a list, so take the first (only) match
    players.append({'name': player.xpath('./@name')[0],
                    'age': player.xpath('./@age')[0],
                    'height': player.xpath('./@height')[0]})
print(players)
 
Martin v. Loewis

Stefan Behnel said:
I would hope that every school teacher who teaches Python is able to
skip through c.l.py and the python-tutor list before accepting a
homework result.

If he uses your proposed solution, it probably wouldn't pass, anyway,
because it's neither DOM nor SAX. If he's really interested in a
solution to the original problem, then ElementTree is fine, of course.
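
Since the original question asked specifically about DOM and SAX, here is a rough sketch of what each might look like with the standard library modules xml.dom.minidom and xml.sax. The names TEAM_XML, players_via_dom, PlayerHandler and players_via_sax are invented for this illustration; nobody in the thread posted this code.

```python
import xml.dom.minidom
import xml.sax

# Sample document from the original question (hypothetical variable name).
TEAM_XML = """\
<team>
  <player name='Mick Fowler' age='27' height='1.96m'>
    <points>17.1</points>
    <rebounds>6.4</rebounds>
  </player>
  <player name='Ivan Ivanovic' age='29' height='2.04m'>
    <points>15.5</points>
    <rebounds>7.8</rebounds>
  </player>
</team>
"""

ATTRIBUTES = ['name', 'age', 'height']


def players_via_dom(xml_text):
    """DOM style: build the whole tree in memory, then query it."""
    doc = xml.dom.minidom.parseString(xml_text)
    return [tuple(node.getAttribute(attr) for attr in ATTRIBUTES)
            for node in doc.getElementsByTagName('player')]


class PlayerHandler(xml.sax.ContentHandler):
    """SAX style: react to each start tag while the parser streams the input."""

    def __init__(self):
        super().__init__()
        self.players = []

    def startElement(self, name, attrs):
        if name == 'player':
            self.players.append(tuple(attrs.get(attr) for attr in ATTRIBUTES))


def players_via_sax(xml_text):
    handler = PlayerHandler()
    # Passing bytes keeps this working on older Python 3 releases too.
    xml.sax.parseString(xml_text.encode('utf-8'), handler)
    return handler.players


print(players_via_dom(TEAM_XML))
print(players_via_sax(TEAM_XML))
```

The DOM version holds the full document in memory and lets you navigate it afterwards; the SAX version never builds a tree and only keeps what the handler chooses to store.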

As for teachers scanning relevant forums: that's often impractical.
For example, for an XML lecture, the choice of programming language may be
left to the student. You then have to search web forums, mailing lists, and
newsgroups for Java, Python, C#, Ruby, Scala, plus StackOverflow.

Solutions copied from the net often show a level of cuteness beyond
what you'd expect from a student (like your solution: who'd be using
reflection to access three attributes?). So you rather take these clues
as the starting point for an investigation (and then hope that Google
comes up with the specific source code).

Of course, it may also be that getting help is explicitly allowed.

Regards,
Martin
 
Lawrence D'Oliveiro

Stefan said:
Here's an overly complicated solution, but I thought that an object
oriented design would help here.

How many times are you going to write the “"name", "age", "height"”
sequence? The next assignment question I would ask is: how easy would it be
to add a fourth attribute?
attributes = ['name', 'age', 'height']

I would put this at the top.

Then this
class Player(object):
    def __init__(self, name, age, height):
        self.name, self.age, self.height = name, age, height

can become

class Player(object):
    def __init__(self, **rest):
        for attr in attributes:
            setattr(self, attr, rest[attr])
        #end for
    #end __init__
#end Player

and
for player in players:
    print(player.name, player.age, player.height)

can become

for player in players:
    print(" ".join(getattr(player, attr) for attr in attributes))
#end for
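
Put together, the refactoring above might look roughly like the following sketch. The post does not show how Player would be constructed once __init__ takes keyword arguments, so the dict-based call in load_players is an assumption, as are the helper names TEAM_XML, load_players and describe.

```python
import xml.etree.ElementTree as ET

# The only place the attribute names are listed; a fourth attribute
# would be added here and nowhere else.
attributes = ['name', 'age', 'height']

# Sample document from the original question (hypothetical variable name).
TEAM_XML = """\
<team>
  <player name='Mick Fowler' age='27' height='1.96m'>
    <points>17.1</points>
    <rebounds>6.4</rebounds>
  </player>
  <player name='Ivan Ivanovic' age='29' height='2.04m'>
    <points>15.5</points>
    <rebounds>7.8</rebounds>
  </player>
</team>
"""


class Player(object):
    def __init__(self, **rest):
        # Copy each listed attribute from the keyword arguments.
        for attr in attributes:
            setattr(self, attr, rest[attr])


def load_players(xml_text):
    """Assumed caller: build the keyword arguments from the same list."""
    root = ET.fromstring(xml_text)
    return [Player(**dict((attr, element.get(attr)) for attr in attributes))
            for element in root.iter('player')]


def describe(player):
    """Join the listed attributes into one line of text."""
    return " ".join(getattr(player, attr) for attr in attributes)


for player in load_players(TEAM_XML):
    print(describe(player))
```

One thing to watch: element.get returns None for an attribute that is missing from the XML, and " ".join raises a TypeError on None, so a newly listed attribute has to be present on every player element.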
 
Pietro Campesato

I've found some code which converts an XML string to a dictionary
here:
http://nonplatonic.com/ben.php?title=python_xml_to_dict_bow_to_my_recursive_g&more=1&c=1&tb=1&pb=1

Once your data is in a dictionary extracting info will be much easier.
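
To give an idea of what such a conversion could look like, here is a small sketch built on ElementTree. It is not the code from the link above; element_to_dict is a hypothetical helper, and the '#text' key and the list-per-child-tag layout are arbitrary choices made for this example.

```python
import xml.etree.ElementTree as ET

# Sample document from the original question (hypothetical variable name).
TEAM_XML = """\
<team>
  <player name='Mick Fowler' age='27' height='1.96m'>
    <points>17.1</points>
    <rebounds>6.4</rebounds>
  </player>
  <player name='Ivan Ivanovic' age='29' height='2.04m'>
    <points>15.5</points>
    <rebounds>7.8</rebounds>
  </player>
</team>
"""


def element_to_dict(element):
    """Recursively convert an element into a plain dictionary.

    Attributes become string-valued keys, each child tag maps to a list
    of converted children, and non-blank text is stored under '#text'.
    Limitation: an attribute and a child tag with the same name collide.
    """
    result = dict(element.attrib)
    for child in element:
        result.setdefault(child.tag, []).append(element_to_dict(child))
    text = (element.text or '').strip()
    if text:
        result['#text'] = text
    return result


team = element_to_dict(ET.fromstring(TEAM_XML))
for player in team['player']:
    print(player['name'], player['age'], player['height'])
```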
 
Jake b

Check out Amara: http://www.xml3k.org/Amara/QuickRef

It looks promising as a Pythonic alternative to SAX / DOM:
iter(doc.team.player) # or
doc.team.player[0].name

 
Stefan Behnel

Jake b, 16.05.2010 09:40:
Check out Amara: http://www.xml3k.org/Amara/QuickRef

It looks promising. For a pythonic solution over sax / dom.
Iter(doc.team.player) # or
doc.team.player[0].name

Ah, right, and there's also lxml.objectify:

from lxml.objectify import parse

root = parse('teamfile.xml').getroot()

for player in root.player:
    print(player.attrib)

Prints an attribute dict for each player.

Stefan
 
kaklis

Thank you all so much!!!
And it's not for homework.
I'm just trying to find out which style (DOM, SAX, ElementTree, etc.)
best fits my way of thinking.
Antonis
 
