Seeking help: reading text file with genfromtxt

frankenstein

Hi all

I have a text file that is only 32 MB in size and consists of lines of
the following type (the columns are fixed):

==
Header text 1 line
....
01-Jan-2006 0055 145.069 -16.0449 83.2246 84.2835 499.14680 0.074029965
01-Jan-2006 0065 15.069 -1.0449 83.2246 84.2835 499.14680 12.074029965
....
12-Dec-2006 1255 145.069 23.0449 3.2246 4.2835 49.140 0.74029965
....
==

I have 3 questions:

1. Why is my translation (read_slow) of the IDL code so slow
(IDL: 13 sec, Python: 2 min 16 sec), although both IDL and Python
consume about 40 MB?

2. Why is my faster version (read_fast, 13 sec) so memory hungry (it
takes 200 MB)?
2.1 Why is my second-fastest version (read_second_fast, 16 sec) still
memory hungry?

3. What do I need to do to get both the speed and the memory
footprint of IDL (in this case 40 MB)?
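
For question 3, here is a minimal sketch of one approach that might get closer to the IDL behaviour: count the rows in a first pass, preallocate the array, then fill it row by row with plain `str.split` instead of calling genfromtxt on every line. The name `read_preallocated` and the date cache are assumptions for illustration, not part of the code below. The converters mirror convdate and convtime, and the column count is taken from the file rather than hard-coded.

```python
import time

import numpy as np


def read_preallocated(path):
    """Two-pass reader: count the rows, then fill a preallocated array."""
    # First pass: count non-empty data lines and find the widest record.
    n_rows = 0
    n_cols = 0
    with open(path) as infile:
        infile.readline()  # skip the one-line header
        for line in infile:
            fields = line.split()
            if fields:
                n_rows += 1
                n_cols = max(n_cols, len(fields))

    out = np.zeros((n_rows, n_cols), dtype=np.float64)

    # The same date repeats on many lines, so each distinct date string
    # is parsed with strptime only once and then looked up.
    day_cache = {}

    # Second pass: parse each line directly into its row.
    with open(path) as infile:
        infile.readline()
        i = 0
        for line in infile:
            fields = line.split()
            if not fields:
                continue
            date = fields[0]
            if date not in day_cache:
                day_cache[date] = time.strptime(date, "%d-%b-%Y").tm_yday
            out[i, 0] = day_cache[date]
            out[i, 1] = int(float(fields[1]) * 0.1)
            out[i, 2:len(fields)] = [float(f) for f in fields[2:]]
            i += 1
    return out
```

Since only one line is held as Python objects at a time, the peak memory should stay close to the size of the final array, but I have not measured this against the 32 MB file.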


#convdate converts the date in the first column (e.g. 12-Dec-2006) into day of year
#convtime does something else
==
import fileinput
import numpy as np
import datetime
import time
from StringIO import StringIO

def read_slow(file):

    count=max(enumerate(open(file)))[0]

    erg=np.zeros((count,10),dtype=np.float64)

    convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
    convtime= lambda x: np.int(np.float64(x)*1.0e-1)

    i=0
    with open(file) as infile:
        #read first header line
        infile.readline()
        for line in infile:
            tmp=np.genfromtxt(StringIO(line),\
                              dtype=np.float64,\
                              converters={0:convdate,
                                          1:convtime})
            #not sure if it does the right thing here:
            erg[i,:]=tmp
            i=i+1
    infile.close()
    return erg

==
def read_fast(file):

    convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
    convtime= lambda x: np.int(np.float64(x)*1.0e-1)

    with open(file) as infile:
        erg=np.genfromtxt(infile, autostrip=True,skip_header=1,\
                          dtype=np.float64,\
                          converters={0:convdate,1:convtime})
    infile.close()
    return erg
==

==
def read_second_fast(file):

    convdate= lambda x: time.strptime(x,"%d-%b-%Y").tm_yday
    convtime= lambda x: np.int(np.float64(x)*1.0e-1)

    erg=np.loadtxt(file,skiprows=1,\
                   dtype=np.float64,\
                   converters={0:convdate,1:convtime})
    return erg
==

Thanks for all the help.

By the way: a colleague told me my code is poorly written and more
or less unreadable and unmaintainable because of the use of lambda. I
am just learning, but is his observation true?
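
For comparison on the lambda point, here is a minimal sketch of the same two converters written as named functions with docstrings, which is one common way to make such helpers easier to read and test on their own. The names convdate and convtime are taken from the code above. The docstrings are only my reading of what the lambdas do, and `int`/`float` stand in for `np.int`/`np.float64` as an assumption.

```python
import time


def convdate(text):
    """Convert a date such as '12-Dec-2006' to its day of year."""
    return time.strptime(text, "%d-%b-%Y").tm_yday


def convtime(text):
    """Scale the raw time field by 0.1 and truncate to an integer."""
    return int(float(text) * 0.1)


# Column index -> converter, in the form genfromtxt/loadtxt expect.
# Depending on the NumPy version, converters may receive bytes rather
# than str; decoding is left out of this sketch.
converters = {0: convdate, 1: convtime}
```

Written this way, each converter can be checked by hand, e.g. `convdate("12-Dec-2006")` or `convtime("1255")`, before it is passed to the reader.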
 
