Printing to file, how do I do it efficiently?

Discussion in 'Python' started by Cameron Walsh, Nov 10, 2006.

  1. Hi all,

    I have a numpy.array of 89x512x512 uint8's, set up with code like this:


    data=numpy.array([],dtype="uint8")
    data.resize((89,512,512))
    # Data filled in about 4 seconds from 89 image slices
    <snip lots of processing code>


    I first tried writing this data to a binary raw format (for use in a
    program called Drishti) as follows:

    # The slow bit:
    volumeFile=file("/tmp/test.raw","wb")
    for z in xrange(Z):
        for y in xrange(Y):
            for x in xrange(X):
                volumeFile.write("%c" %(data[z,y,x]))
    volumeFile.close()

    That took about 39 seconds.

    My second attempt was as follows:
    volumeFile = open("/tmp/test2.raw","wb")
    data.resize((X*Y*Z)) # Flatten the array
    for i in data:
        volumeFile.write("%c" %i)
    data.resize((Z,Y,X))
    volumeFile.close()

    This took 32 seconds. (For those of you unfamiliar with numpy, the
    data.resize() operations take a negligible amount of time; all they do is
    allow the data to be accessed differently.)

    I'm guessing that the slow part is the fact that I am converting the
    data to character format and writing it one character at a time. What
    is a better way of doing this, or where should I look to find a better way?


    Thanks,

    Cameron.
    Cameron Walsh, Nov 10, 2006
    #1

  2. Robert Kern

    Cameron Walsh wrote:
    > Hi all,
    >
    > I have a numpy.array of 89x512x512 uint8's, set up with code like this:


    numpy questions are best asked on the numpy list, not here.

    http://www.scipy.org/Mailing_Lists

    > data=numpy.array([],dtype="uint8")
    > data.resize((89,512,512))


    You might want to look at using numpy.empty() here, instead.
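As a rough illustration of that suggestion, the sketch below allocates the volume in one step with numpy.empty(). The shape and dtype come from the question; the syntax assumes a current Python and NumPy rather than the 2006 versions used in this thread.

```python
import numpy

# Allocate the whole volume with its final shape in a single call.
# numpy.empty() does not initialise the memory, so the contents are
# arbitrary until the image slices are copied in.
data = numpy.empty((89, 512, 512), dtype=numpy.uint8)
```

If the array should start out zeroed instead, numpy.zeros() takes the same arguments.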

    > # Data filled in about 4 seconds from 89 image slices
    > <snip lots of processing code>
    >
    >
    > I first tried writing this data to a binary raw format (for use in a
    > program called Drishti) as follows:
    >
    > # The slow bit:
    > volumeFile=file("/tmp/test.raw","wb")
    > for z in xrange(Z):
    >     for y in xrange(Y):
    >         for x in xrange(X):
    >             volumeFile.write("%c" %(data[z,y,x]))
    > volumeFile.close()
    >
    > That took about 39 seconds.
    >
    > My second attempt was as follows:
    > volumeFile = open("/tmp/test2.raw","wb")
    > data.resize((X*Y*Z)) # Flatten the array
    > for i in data:
    >     volumeFile.write("%c" %i)
    > data.resize((Z,Y,X))
    > volumeFile.close()
    >
    > This took 32 seconds. (For those of you unfamiliar with numpy, the
    > data.resize() operations take a negligible amount of time; all they do is
    > allow the data to be accessed differently.)


    No, if the total size is different, it will also copy the array. Use .reshape()
    if you want to simply alter the shape, not the total number of elements.
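A small sketch of the reshape approach, using a tiny stand-in array (the 2x3x4 shape is made up for illustration) and assuming a NumPy recent enough to provide numpy.shares_memory(). For a contiguous array like this one, reshape() returns a view onto the same buffer, so flattening and restoring the shape copies nothing.

```python
import numpy

# Tiny stand-in for the 89x512x512 volume (hypothetical shape).
data = numpy.zeros((2, 3, 4), dtype=numpy.uint8)

# Flatten without changing the number of elements.
flat = data.reshape(-1)

# Writing through the flattened view changes the original array too.
flat[0] = 255
assert data[0, 0, 0] == 255

# Going back to three dimensions is equally cheap.
volume = flat.reshape((2, 3, 4))
```

Unlike resize(), reshape() raises an error if the requested shape does not match the total number of elements, so a mistyped dimension cannot silently reallocate the data.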

    > I'm guessing that the slow part is the fact that I am converting the
    > data to character format and writing it one character at a time. What
    > is a better way of doing this, or where should I look to find a better way?


    data.tostring()
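A minimal sketch of how that might be used to write the raw file in a single call. It assumes a current NumPy, where this method is spelled tobytes() (tostring() was the name at the time of this thread), and it uses a small stand-in array and a temporary path, both of which are hypothetical.

```python
import os
import tempfile

import numpy

# Small stand-in for the real volume (hypothetical shape and contents).
data = numpy.arange(2 * 3 * 4, dtype=numpy.uint8).reshape((2, 3, 4))

# Hypothetical output location; the question used /tmp/test.raw.
path = os.path.join(tempfile.mkdtemp(), "test.raw")

# Serialise the whole array at once instead of one byte per write() call.
# The bytes come out in C order: z varies slowest, x fastest, which matches
# the nested z/y/x loops in the question.
with open(path, "wb") as volume_file:
    volume_file.write(data.tobytes())
```

The speed-up comes from replacing millions of Python-level write() calls with one bulk write of the array's bytes.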

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
    Robert Kern, Nov 10, 2006
    #2

  3. Robert Kern wrote:
    > Cameron Walsh wrote:
    >> Hi all,
    >>
    >> I have a numpy.array of 89x512x512 uint8's, set up with code like this:

    >
    > numpy questions are best asked on the numpy list, not here.


    At first I thought it was a generic python question, since it had more
    to do with writing array data to file rather than the specific format of
    the array data.

    >
    >> data=numpy.array([],dtype="uint8")
    >> data.resize((89,512,512))

    >
    > You might want to look at using numpy.empty() here, instead.
    >


    Thanks!

    [...]
    >> I'm guessing that the slow part is the fact that I am converting the
    >> data to character format and writing it one character at a time. What
    >> is a better way of doing this, or where should I look to find a better way?

    >
    > data.tostring()
    >


    And here I see I was wrong, it was a numpy question. I assumed the
    tostring() method would produce the same output as printing the array to
    the screen by just calling "data". But of course, that would be the job
    of the __repr__() method.

    It is now ridiculously fast (<1 second). Thank you for your help.

    Cameron.
    Cameron Walsh, Nov 10, 2006
    #3
