Seeking help: reading text file with genfromtxt

Discussion in 'Python' started by frankenstein, Apr 4, 2012.

  1. frankenstein

    frankenstein Guest

    Hi all

    I have a text file that is only 32 MB in size and consists of lines
    of the following type (the columns are fixed):

    ==
    Header text 1 line
    ....
    01-Jan-2006 0055 145.069 -16.0449 83.2246 84.2835 499.14680 0.074029965
    01-Jan-2006 0065 15.069 -1.0449 83.2246 84.2835 499.14680 12.074029965
    ....
    12-Dec-2006 1255 145.069 23.0449 3.2246 4.2835 49.140 0.74029965
    ....
    ==

    I have 3 questions:

    1. Why is my translation (read_slow) of the IDL code so slow (IDL:
    13 sec, Python: 2 min 16 sec), even though both IDL and Python consume
    about 40 MB?

    2. Why is my faster version (read_fast, 13 sec) so memory hungry (it
    takes 200 MB)?
    2.1 Why is my second-fastest version (read_second_fast, 16 sec) still
    memory hungry?

    3. What do I need to do to get both the speed and the memory
    footprint of IDL (40 MB in this case)?


    # convdate converts the date in the first column (e.g. 12-Dec-2006)
    # into the day of the year
    # convtime does something else
    ==
    import fileinput
    import numpy as np
    import datetime
    import time
    from StringIO import StringIO

    def read_slow(file):

        # index of the last line == number of data lines (one header line)
        count=max(enumerate(open(file)))[0]

        erg=np.zeros((count,10),dtype=np.float64)

        convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
        convtime= lambda x: int(np.float64(x)*1.0e-1)

        i=0
        with open(file) as infile:
            #read first header line
            infile.readline()
            for line in infile:
                tmp=np.genfromtxt(StringIO(line),
                                  dtype=np.float64,
                                  converters={0:convdate,
                                              1:convtime})
                #not sure if it does the right thing here:
                erg[i,:]=tmp
                i=i+1
        return erg

    ==
    def read_fast(file):

        convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
        convtime= lambda x: int(np.float64(x)*1.0e-1)

        with open(file) as infile:
            erg=np.genfromtxt(infile, autostrip=True,skip_header=1,
                              dtype=np.float64,
                              converters={0:convdate,1:convtime})
        return erg
    ==

    ==
    def read_second_fast(file):

        convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
        convtime= lambda x: int(np.float64(x)*1.0e-1)

        erg=np.loadtxt(file,skiprows=1,
                       dtype=np.float64,
                       converters={0:convdate,1:convtime})
        return erg
    ==
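
    One direction for question 3 might be to skip genfromtxt/loadtxt
    altogether: count the data lines once, preallocate the array, and
    fill it row by row with str.split. The following is only a sketch
    under assumptions, not a measured solution. The name read_manual and
    the ncols parameter are made up for illustration, 8 columns are
    assumed from the sample lines above (read_slow allocates 10), and
    the file is assumed to have exactly one header line and no blank
    lines. No timing or memory figures are claimed for it.

```python
import time

import numpy as np


def convdate(text):
    """Day of year for a date string such as '12-Dec-2006'."""
    return time.strptime(text, "%d-%b-%Y").tm_yday


def convtime(text):
    """Same arithmetic as the convtime lambda in the post."""
    return int(float(text) * 1.0e-1)


def read_manual(path, ncols=8):
    """Hypothetical reader: two passes over the file, one preallocated array."""
    # First pass: count the data lines (everything after the header line).
    with open(path) as infile:
        infile.readline()
        count = sum(1 for _ in infile)

    erg = np.zeros((count, ncols), dtype=np.float64)

    # Second pass: split each line on whitespace and fill one row in place.
    with open(path) as infile:
        infile.readline()  # skip the header line
        for i, line in enumerate(infile):
            fields = line.split()
            erg[i, 0] = convdate(fields[0])
            erg[i, 1] = convtime(fields[1])
            erg[i, 2:] = [float(f) for f in fields[2:]]
    return erg
```

    The only large object this keeps alive is the result array; whether
    it gets close to the IDL numbers would have to be measured on the
    real file.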

    Thanks for all the help.

    By the way: a colleague told me my code is poorly written and more
    or less unreadable and unmaintainable because of the use of lambda. I
    am just learning, but is his observation true?
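
    For comparison, the two lambdas could be written as ordinary named
    functions with docstrings. This is a minimal sketch whose only
    intended difference is the builtin int in place of np.int; the
    CONVERTERS name is made up here, and the dict is the same mapping
    that is passed as converters= in the functions above.

```python
import time


def convdate(text):
    """Convert a date such as '12-Dec-2006' into the day of the year."""
    return time.strptime(text, "%d-%b-%Y").tm_yday


def convtime(text):
    """Scale the time column by 0.1 and truncate to an integer."""
    return int(float(text) * 1.0e-1)


# Column index -> converter, in the form expected by the converters= argument.
CONVERTERS = {0: convdate, 1: convtime}
```

    A named function shows up with its own name in tracebacks and can
    be tested on its own, e.g. convdate("01-Jan-2006") gives 1.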
     
    frankenstein, Apr 4, 2012
    #1