xmltodict - TypeError: list indices must be integers, not str

flebber

I am using xmltodict.

This is how I have accessed and loaded my file.

import xmltodict

# Read the whole file, then parse it into nested dicts/lists.
with open("/home/sayth/Scripts/va_benefits/20140508GOSF0.xml", "r") as document:
    read_doc = document.read()
xml_doc = xmltodict.parse(read_doc)

This is the start of the file I am trying to get data out of:

<meeting id="35483" barriertrial="0" venue="Gosford" date="2014-05-08T00:00:00" gearchanges="-1" stewardsreport="-1" gearlist="-1" racebook="0" postracestewards="0" meetingtype="TAB" rail="True" weather="Fine " trackcondition="Dead " nomsdeadline="2014-05-02T11:00:00" weightsdeadline="2014-05-05T16:00:00" acceptdeadline="2014-05-06T09:00:00" jockeydeadline="2014-05-06T12:00:00">
<club abbrevname="Gosford Race Club" code="49" associationclass="2" website="http://" />
<race id="185273" number="1" nomnumber="7" division="0" name="GOSFORD ROTARY MAIDEN HANDICAP" mediumname="MDN" shortname="MDN" stage="Acceptances" distance="1600" minweight="55" raisedweight="0" class="MDN " age="~ " grade="0" weightcondition="HCP " trophy="0" owner="0" trainer="0" jockey="0" strapper="0" totalprize="22000" first="12250" second="4250" third="2100" fourth="1000" fifth="525" time="2014-05-08T12:30:00" bonustype="BX02 " nomsfee="0" acceptfee="0" trackcondition=" " timingmethod=" " fastesttime=" " sectionaltime=" " formavailable="0" racebookprize="Of $22000. First $12250, second $4250, third $2100, fourth $1000, fifth $525, sixth $375, seventh $375, eighth $375, ninth $375, tenth $375">
<condition line="1">

So I thought I had it figured out. I can access the attributes of meeting and the attributes of club like this:

In [5]: xml_doc['meeting']['club']['@abbrevname']
Out[5]: u'Gosford Race Club'

However whenever I try and access race in the same manner I get errors.

In [11]: xml_doc['meeting']['club']['race']['@id']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-11-cce362d7e6fc> in <module>()
----> 1 xml_doc['meeting']['club']['race']['@id']

KeyError: 'race'

In [12]: xml_doc['meeting']['race']['@id']
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-c304e2b8f9be> in <module>()
----> 1 xml_doc['meeting']['race']['@id']

TypeError: list indices must be integers, not str

Why is accessing race @id any different from accessing club @abbrevname, and how do I get it for race?

Thanks

Sayth
 
Peter Otten

If I were to guess: there are multiple races per meeting, xmltodict puts
them into a list under the "race" key, and you have to pick one:
>>> doc = xmltodict.parse("""
... <meeting>
... <race id="first race">...</race>
... <race id="second race">...</race>
... </meeting>
... """)
>>> type(doc["meeting"]["race"])
>>> doc["meeting"]["race"][0]["@id"]
'first race'
>>> doc["meeting"]["race"][1]["@id"]
'second race'

So

xml_doc['meeting']['race'][0]['@id']

or

for race in xml_doc["meeting"]["race"]:
    print(race["@id"])

might work for you.
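
One caveat with the loop above: xmltodict gives you a list only when an element is repeated. A meeting with a single race would come back as a plain dict under "race", and the same loop would then iterate over that dict's keys instead. Here is a minimal sketch of one way to guard against that. The helper names as_list and race_ids are made up for illustration, and the two dicts are hand-written to mimic the shape of xmltodict output rather than parsed from your file.

```python
def as_list(value):
    """Return value as a list: None -> [], list -> unchanged, else [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def race_ids(doc):
    """Collect the @id of every race, whether there is zero, one, or many."""
    races = as_list(doc["meeting"].get("race"))
    return [race["@id"] for race in races]


# Hand-written stand-ins for parsed documents (illustrative data only).
one_race = {"meeting": {"race": {"@id": "first race"}}}
two_races = {
    "meeting": {
        "race": [{"@id": "first race"}, {"@id": "second race"}],
    }
}
no_races = {"meeting": {}}

print(race_ids(one_race))
print(race_ids(two_races))
print(race_ids(no_races))
```

With real data you would pass xml_doc to race_ids instead of the stand-in dicts. xmltodict.parse also takes a force_list argument, e.g. force_list=("race",), which should make "race" a list even when there is only one; check the documentation of your installed version before relying on it.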
 
flebber

Thanks so much Peter, yes both worked indeed.

Sayth
 
