Help with optimisation

special_dragonfly

Hello,
I know this might be a little cheeky, and if it is, please say, but I need a
little hand optimising some code. Because this is 'company' code and I have
no idea what I'm allowed to release, I've changed anything that could give an
indication of the company, if that makes any sense...

For the code below:
text_buffer is a single record from an XML stream. I can't read in the
entire XML at once because it isn't all available straight away, so I
capture line by line, and when a full message is available I use parseString
under the minidom API.
The SQL version is SQLite. It was recommended to me, and is adequate for the
uses I put it to.
The function doesn't return anything, but it's called often enough and
depending on the optimisation I'll be able to use the same style in other
areas of the program.

Previous code:
def CreatePerson(text_buffer):
    dom = xml.dom.minidom.parseString(text_buffer)
    reflist = dom.getElementsByTagName('Country')
    Country = reflist[0].firstChild.nodeValue
    reflist = dom.getElementsByTagName('Age')
    Age = reflist[0].firstChild.nodeValue
    reflist = dom.getElementsByTagName('Surname')
    Surname = reflist[0].firstChild.nodeValue
    reflist = dom.getElementsByTagName('Forename')
    Forename = reflist[0].firstChild.nodeValue
    cursor.execute('INSERT INTO Person VALUES(?,?,?)',
                   (Forename + "-" + Surname, Age, Country))
    connection.commit()

I've changed it now to this:
def CreatePerson(text_buffer):
    dom = xml.dom.minidom.parseString(text_buffer)
    elements = ['Country', 'Age', 'Surname', 'Forename']
    Values = []
    for element in elements:
        reflist = dom.getElementsByTagName(element)
        Values.append(reflist[0].firstChild.nodeValue)
    # I can get away with the above because I know the structure of the XML
    # Unpack in the same order as `elements` so the names used below exist
    Country, Age, Surname, Forename = Values
    cursor.execute('INSERT INTO Person VALUES(?,?,?)',
                   (Forename + "-" + Surname, Age, Country))
    connection.commit()

They both seem ugly IMO (read: longer than intuitively necessary), and so I
was wondering whether there was any way to combine Forename and Surname
together within the Values list (think merge cells with the '-' in between)
so I could use the unary(?) operator within the SQL?

I suppose if this is a cheeky request then I won't get any replies.
Thank you for any help
Dominic
 
Alex Martelli

special_dragonfly said:
dom=xml.dom.minidom.parseString(text_buffer)

If you need to optimize code that parses XML, use ElementTree (some
other parsers are also fast, but minidom ISN'T).
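
As a rough illustration of that suggestion, here is a minimal sketch of the same field extraction with the standard library's xml.etree.ElementTree. The sample record and the function name extract_person are assumptions made for the example; the real message layout is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the actual XML structure is not given in the thread.
SAMPLE = (
    "<Person>"
    "<Forename>Ada</Forename><Surname>Lovelace</Surname>"
    "<Age>36</Age><Country>UK</Country>"
    "</Person>"
)


def extract_person(text_buffer):
    """Return (name, age, country) for one complete XML record."""
    root = ET.fromstring(text_buffer)

    def first_text(tag):
        # './/tag' matches the first descendant with that tag at any depth,
        # roughly what getElementsByTagName(tag)[0] did in the minidom version.
        return root.findtext(".//" + tag)

    name = first_text("Forename") + "-" + first_text("Surname")
    return (name, first_text("Age"), first_text("Country"))


person = extract_person(SAMPLE)
```

The returned tuple is already in the order the INSERT statement expects, so it could be passed to cursor.execute as the parameter sequence.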


Alex
 
Bruno Desthuilliers

special_dragonfly wrote:
Hello, (snip)
The function doesn't return anything, but it's called often enough and
depending on the optimisation I'll be able to use the same style in other
areas of the program.

A common Python optimisation trick is to store local references to save
on attribute lookup time, i.e.:

import xml.dom.minidom

# local ref to parseString
dom_parseString = xml.dom.minidom.parseString

def CreatePerson(text_buffer):
    dom = dom_parseString(text_buffer)
    elements = ['Country', 'Age', 'Surname', 'Forename']
    values = []
    # local ref to the bound method, looked up once instead of on every pass
    getElementsByTagName = dom.getElementsByTagName
    for element in elements:
        reflist = getElementsByTagName(element)
        values.append(reflist[0].firstChild.nodeValue)
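
Whether the local-reference trick pays off here is best measured rather than assumed. The sketch below uses timeit to compare the two call styles; the record, the function names, and the repeat count are all made up for the example, and the saving is likely to be small next to the cost of parsing itself.

```python
import timeit
import xml.dom.minidom

# Hypothetical record, only here to give the parser something to do.
RECORD = "<Person><Country>UK</Country><Age>36</Age></Person>"


def parse_with_attribute_lookup():
    # Resolves xml -> dom -> minidom -> parseString on every call.
    return xml.dom.minidom.parseString(RECORD)


# Resolved once at import time, as in the trick described above.
parse_string = xml.dom.minidom.parseString


def parse_with_local_reference():
    return parse_string(RECORD)


def compare(number=500):
    """Return the total seconds taken by each variant over `number` calls."""
    return {
        "attribute_lookup": timeit.timeit(parse_with_attribute_lookup, number=number),
        "local_reference": timeit.timeit(parse_with_local_reference, number=number),
    }
```

Calling compare() and printing the result on the real messages would show whether the difference is worth the extra names.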


But as Alex already pointed out, you'd be better off using (c)ElementTree.

special_dragonfly wrote:
They both seem ugly IMO (read: longer than intuitively necessary),

I'd say this is a common problem with XML :-/
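
On the original question of merging Forename and Surname inside the list so that it can be handed straight to the SQL call: one possible approach is to build the list in column order and pass it as the parameter sequence. This is only a sketch; the table schema, the sample record, and the helper name person_values are assumptions, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import xml.dom.minidom


def person_values(text_buffer):
    """Build [name, age, country] in the column order the INSERT expects."""
    dom = xml.dom.minidom.parseString(text_buffer)

    def text_of(tag):
        return dom.getElementsByTagName(tag)[0].firstChild.nodeValue

    # Merge the two name fields first, then append the remaining columns.
    name = "-".join(text_of(tag) for tag in ("Forename", "Surname"))
    return [name] + [text_of(tag) for tag in ("Age", "Country")]


# Hypothetical schema and record, for illustration only.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Person (Name TEXT, Age TEXT, Country TEXT)")
record = (
    "<Person><Country>UK</Country><Age>36</Age>"
    "<Surname>Lovelace</Surname><Forename>Ada</Forename></Person>"
)

# The list is passed directly as the parameters; no unpacking is needed.
connection.execute("INSERT INTO Person VALUES (?,?,?)", person_values(record))
connection.commit()
rows = connection.execute("SELECT * FROM Person").fetchall()
```

The qmark placeholders accept any sequence, so a list works as well as a tuple.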
 
