Anyone happen to have optimization hints for this loop?

Discussion in 'Python' started by dp_pearce, Jul 9, 2008.

  1. dp_pearce

    dp_pearce Guest

    I have some code that takes data from an Access database and processes
    it into text files for another application. At the moment, I am using
    a number of loops that are pretty slow. I am not a hugely experienced
    python user so I would like to know if I am doing anything
    particularly wrong or that can be hugely improved through the use of
    another method.

    Currently, all of the values that are to be written to file are pulled
    from the database and into a list called "domainVa". These values
    represent 3D data and need to be written to text files using line
    breaks to seperate 'layers'. I am currently looping through the list
    and appending a string, which I then write to file. This list can
    regularly contain upwards of half a million values...

    count = 0
    dmntString = ""
    for z in range(0, Z):
    for y in range(0, Y):
    for x in range(0, X):
    fraction = domainVa[count]
    dmntString += " "
    dmntString += fraction
    count = count + 1
    dmntString += "\n"
    dmntString += "\n"
    dmntString += "\n***\n

    dmntFile = open(dmntFilename, 'wt')
    dmntFile.write(dmntString)
    dmntFile.close()

    I have found that it is currently taking ~3 seconds to build the
    string but ~1 second to write the string to file, which seems wrong (I
    would normally guess the CPU/Memory would out perform disc writing
    speeds).

    Can anyone see a way of speeding this loop up? Perhaps by changing the
    data format? Is it wrong to append a string and write once, or should
    hold a file open and write at each instance?

    Thank you in advance for your time,

    Dan
    dp_pearce, Jul 9, 2008
    #1
    1. Advertising

  2. dp_pearce wrote:

    > I have some code that takes data from an Access database and processes
    > it into text files for another application. At the moment, I am using
    > a number of loops that are pretty slow. I am not a hugely experienced
    > python user so I would like to know if I am doing anything
    > particularly wrong or that can be hugely improved through the use of
    > another method.
    >
    > Currently, all of the values that are to be written to file are pulled
    > from the database and into a list called "domainVa". These values
    > represent 3D data and need to be written to text files using line
    > breaks to seperate 'layers'. I am currently looping through the list
    > and appending a string, which I then write to file. This list can
    > regularly contain upwards of half a million values...
    >
    > count = 0
    > dmntString = ""
    > for z in range(0, Z):
    > for y in range(0, Y):
    > for x in range(0, X):
    > fraction = domainVa[count]
    > dmntString += " "
    > dmntString += fraction
    > count = count + 1
    > dmntString += "\n"
    > dmntString += "\n"
    > dmntString += "\n***\n
    >
    > dmntFile = open(dmntFilename, 'wt')
    > dmntFile.write(dmntString)
    > dmntFile.close()
    >
    > I have found that it is currently taking ~3 seconds to build the
    > string but ~1 second to write the string to file, which seems wrong (I
    > would normally guess the CPU/Memory would out perform disc writing
    > speeds).
    >
    > Can anyone see a way of speeding this loop up? Perhaps by changing the
    > data format? Is it wrong to append a string and write once, or should
    > hold a file open and write at each instance?


    Don't use in-place adding to concatenate strings. It might lead to
    quadaratic behavior.

    Use the "".join()-idiom instead:

    dmntStrings = []
    .....
    dmntStrings.append("\n")
    .....
    dmntFile.write("".join(dmntStrings))

    Diez
    Diez B. Roggisch, Jul 9, 2008
    #2
    1. Advertising

  3. dp_pearce

    Terry Reedy Guest

    Diez B. Roggisch wrote:
    > dp_pearce wrote:




    >>
    >> count = 0
    >> dmntString = ""
    >> for z in range(0, Z):
    >> for y in range(0, Y):
    >> for x in range(0, X):
    >> fraction = domainVa[count]
    >> dmntString += " "
    >> dmntString += fraction


    Minor point, just construct " "+domain[count] all at once

    >> count = count + 1
    >> dmntString += "\n"
    >> dmntString += "\n"
    >> dmntString += "\n***\n
    >>
    >> dmntFile = open(dmntFilename, 'wt')
    >> dmntFile.write(dmntString)
    >> dmntFile.close()
    >>
    >> I have found that it is currently taking ~3 seconds to build the
    >> string but ~1 second to write the string to file, which seems wrong (I
    >> would normally guess the CPU/Memory would out perform disc writing
    >> speeds).
    >>
    >> Can anyone see a way of speeding this loop up? Perhaps by changing the
    >> data format? Is it wrong to append a string and write once, or should
    >> hold a file open and write at each instance?

    >
    > Don't use in-place adding to concatenate strings. It might lead to
    > quadaratic behavior.


    Semantically, repeated extension of an immutable is inherently
    quadratic. And it is for strings in Python until 2.6 or possibly 2.5
    (not sure), when more sophisticated code was added because people kept
    falling into this trap. But since the more sophisticated code 'cheats'
    by mutating the immutable (with an algorithm similar to list.extend),I
    am pretty sure it will only be used when there is only one reference to
    the string and the extension is with +=, so that the extension can
    reliably be done in-place without changing semantics. Thus, Python
    programmers should still learn the following.

    > Use the "".join()-idiom instead:


    (Or upgrade)
    >
    > dmntStrings = []
    > ....
    > dmntStrings.append("\n")
    > ....
    > dmntFile.write("".join(dmntStrings))


    Note that the minor point above will cut the number of things to be
    joined nearly in half.

    tjr
    Terry Reedy, Jul 9, 2008
    #3
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Kim Petersen

    Help and optimization hints, anyone?

    Kim Petersen, Jan 23, 2004, in forum: Python
    Replies:
    4
    Views:
    366
    Kim Petersen
    Jan 23, 2004
  2. Replies:
    8
    Views:
    849
  3. dp_pearce
    Replies:
    5
    Views:
    272
    dp_pearce
    Jul 15, 2008
  4. Replies:
    2
    Views:
    457
  5. Isaac Won
    Replies:
    9
    Views:
    364
    Ulrich Eckhardt
    Mar 4, 2013
Loading...

Share This Page