Help with optimisation

Discussion in 'Python' started by special_dragonfly, Aug 13, 2007.

  1. Hello,
    I know this might be a little cheeky, and if it is, please say, but I need a
    little hand optimising some code. This is 'company' code and I have no idea
    what I'm allowed to release, so I've changed anything that could give an
    indication of the company - if that makes any sense...

    for the code below:
    text_buffer is a single record from an XML stream. I can't read in the
    entire XML at once because it isn't all available straight away, so I
    capture line by line, and when a full message is available I use parseString
    under the minidom API.
    The SQL version is SQLite. It was recommended to me, and is adequate for the
    uses I put it to.
    The function doesn't return anything, but it's called often enough and
    depending on the optimisation I'll be able to use the same style in other
    areas of the program.

    previous code:
    def CreatePerson(text_buffer):
        dom = xml.dom.minidom.parseString(text_buffer)
        reflist = dom.getElementsByTagName('Country')
        Country = reflist[0].firstChild.nodeValue
        reflist = dom.getElementsByTagName('Age')
        Age = reflist[0].firstChild.nodeValue
        reflist = dom.getElementsByTagName('Surname')
        Surname = reflist[0].firstChild.nodeValue
        reflist = dom.getElementsByTagName('Forename')
        Forename = reflist[0].firstChild.nodeValue
        cursor.execute('INSERT INTO Person VALUES(?,?,?)',
                       (Forename + "-" + Surname, Age, Country))
        connection.commit()

    I've changed it now to this:
    def CreatePerson(text_buffer):
        dom = xml.dom.minidom.parseString(text_buffer)
        elements = ['Country', 'Age', 'Surname', 'Forename']
        Values = []
        for element in elements:
            reflist = dom.getElementsByTagName(element)
            Values.append(reflist[0].firstChild.nodeValue)
        # I can get away with the above because I know the structure of the XML
        # Unpack in the same order as `elements` so the names below are defined
        Country, Age, Surname, Forename = Values
        cursor.execute('INSERT INTO Person VALUES(?,?,?)',
                       (Forename + "-" + Surname, Age, Country))
        connection.commit()

    They both seem ugly IMO (read: longer than intuitively necessary), and so I
    was wondering whether there was any way to combine Forename and Surname
    together within the Values list (think merge cells with the '-' in between)
    so I could use the unary(?) operator within the SQL?

    I suppose if this is a cheeky request then I won't get any replies.
    Thank you for any help
    Dominic
     
    special_dragonfly, Aug 13, 2007
    #1

  2. special_dragonfly wrote:
    ...
    > dom=xml.dom.minidom.parseString(text_buffer)


    If you need to optimize code that parses XML, use ElementTree (some
    other parsers are also fast, but minidom ISN'T).
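
    A minimal sketch of how the function might look with ElementTree, assuming
    each record is a small element such as <Person> with one child per field.
    The sample record, the table layout and the name create_person are made up
    for illustration; they are not from the original code.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_person(cursor, text_buffer):
    # fromstring parses one complete XML record and returns its root element.
    root = ET.fromstring(text_buffer)
    # findtext(".//Tag") returns the text of the first matching descendant.
    forename = root.findtext(".//Forename")
    surname = root.findtext(".//Surname")
    age = root.findtext(".//Age")
    country = root.findtext(".//Country")
    cursor.execute(
        "INSERT INTO Person VALUES (?, ?, ?)",
        (forename + "-" + surname, age, country),
    )


# Hypothetical record and in-memory table, for illustration only.
record = (
    "<Person><Forename>Ada</Forename><Surname>Lovelace</Surname>"
    "<Age>36</Age><Country>UK</Country></Person>"
)
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE Person (name TEXT, age TEXT, country TEXT)")
create_person(cursor, record)
connection.commit()
rows = cursor.execute("SELECT * FROM Person").fetchall()
```

    Whether this is actually faster for your records is worth measuring on
    real data before switching.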


    Alex
     
    Alex Martelli, Aug 13, 2007
    #2

  3. special_dragonfly wrote:
    > Hello,

    (snip)
    > The function doesn't return anything, but it's called often enough and
    > depending on the optimisation I'll be able to use the same style in other
    > areas of the program.
    >
    > previous code:
    > def CreatePerson(text_buffer):
    >     dom = xml.dom.minidom.parseString(text_buffer)
    >     reflist = dom.getElementsByTagName('Country')
    >     Country = reflist[0].firstChild.nodeValue
    >     reflist = dom.getElementsByTagName('Age')
    >     Age = reflist[0].firstChild.nodeValue
    >     reflist = dom.getElementsByTagName('Surname')
    >     Surname = reflist[0].firstChild.nodeValue
    >     reflist = dom.getElementsByTagName('Forename')
    >     Forename = reflist[0].firstChild.nodeValue
    >     cursor.execute('INSERT INTO Person VALUES(?,?,?)',
    >                    (Forename + "-" + Surname, Age, Country))
    >     connection.commit()
    >
    > I've changed it now to this:
    > def CreatePerson(text_buffer):
    >     dom = xml.dom.minidom.parseString(text_buffer)
    >     elements = ['Country', 'Age', 'Surname', 'Forename']
    >     Values = []
    >     for element in elements:
    >         reflist = dom.getElementsByTagName(element)
    >         Values.append(reflist[0].firstChild.nodeValue)
    >     # I can get away with the above because I know the structure of the XML
    >     # Unpack in the same order as `elements` so the names below are defined
    >     Country, Age, Surname, Forename = Values
    >     cursor.execute('INSERT INTO Person VALUES(?,?,?)',
    >                    (Forename + "-" + Surname, Age, Country))
    >     connection.commit()


    A common Python optimisation trick is to store local references to save
    on attribute lookup time, i.e.:

    import xml.dom.minidom

    # local ref to parseString
    dom_parseString = xml.dom.minidom.parseString

    def CreatePerson(text_buffer):
        dom = dom_parseString(text_buffer)
        elements = ['Country', 'Age', 'Surname', 'Forename']
        values = []
        # local ref to the bound method, looked up once instead of per iteration
        getElementsByTagName = dom.getElementsByTagName
        for element in elements:
            reflist = getElementsByTagName(element)
            values.append(reflist[0].firstChild.nodeValue)


    But as Alex already pointed out, you'd be better using (c)ElementTree.

    > They both seem ugly IMO (read: longer than intuitively necessary),


    I'd say this is a common problem with XML :-/
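
    On merging Forename and Surname: one possible sketch is to let SQLite do
    the join with its || concatenation operator, so the list of values can be
    passed to execute() unchanged. execute() takes its parameters as a single
    sequence, so no * unpacking is needed. The sample record, the table layout
    and the names ELEMENTS and insert_person are assumptions for illustration.

```python
import sqlite3
import xml.dom.minidom

# Order matters: it must match the placeholders in the INSERT statement.
ELEMENTS = ("Forename", "Surname", "Age", "Country")


def insert_person(cursor, text_buffer):
    dom = xml.dom.minidom.parseString(text_buffer)
    get = dom.getElementsByTagName  # local reference to the bound method
    values = [get(tag)[0].firstChild.nodeValue for tag in ELEMENTS]
    # The first two placeholders are concatenated by SQLite itself.
    cursor.execute("INSERT INTO Person VALUES (? || '-' || ?, ?, ?)", values)


# Hypothetical record and in-memory table, for illustration only.
record = (
    "<Person><Forename>Ada</Forename><Surname>Lovelace</Surname>"
    "<Age>36</Age><Country>UK</Country></Person>"
)
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE Person (name TEXT, age TEXT, country TEXT)")
insert_person(cursor, record)
connection.commit()
rows = cursor.execute("SELECT * FROM Person").fetchall()
```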
     
    Bruno Desthuilliers, Aug 13, 2007
    #3
