processing input from multiple files

Discussion in 'Python' started by Christopher Steele, Oct 14, 2010.

  1. Hi

    I've been trying to decode a series of observations from multiple files
    (each file holds a different observation time) and put each type of
    observation into its own separate file. The script runs successfully for
    one file, but whenever I try it for more they just overwrite each other.
    I'm new to Python and I'm not sure how to go about efficiently running
    through the process once and then appending to the output file for all
    other input files. Has anyone done something similar to this before?



    If it helps, I'll also attach a sample of one of the input files.


    #!/usr/bin/python

    import sys
    import os
    import re
    import fileinput

    #load in file list
    #obs = os.system('ls s[i,m,n]uk[0,2,4][1,2,3]d_??00P.DATA')
    obs = ['siuk21d_0300P.DATA', 'siuk21d_0900P.DATA']
    print obs
    #code for file type "datalist"
    #fname = "datalist_201081813.txt"


    #output files
    foutname1 = 'prestest.txt'
    foutname2 = 'temptest.txt'
    foutname3 = 'tempdtest.txt'
    foutname4 = 'wspeedtest.txt'
    foutname5 = 'winddtest.txt'


    #prepare times
    time=[]
    year="2009"
    month="09"
    day="18"
    hour=[]

    #outputs
    pres_out = ''
    temp_out = ''
    dtemp_out = ''
    dir_out = ''
    speed_out = ''
    x =''


    #load in station file with lat/lons
    file2 = open("uk_stations.txt","r")
    stations = file2.readlines()
    ids=[]
    names=[]
    lats=[]
    lons=[]
    for item in stations:
        item_list = item.strip().split(',')
        ids.append(item_list[0])
        names.append(item_list[1])
        lats.append(item_list[2])
        lons.append(item_list[3])

    #create loop over file list
    time= [item.split('_')[1].split('.')[0] for item in obs]
    print time
    for x in time:
        hour= x[:2]
        print hour
        newtime = year+month+day+'_'+hour+'00'
        print newtime
    for file in fileinput.input(obs):
        data=file[:file.find(' 333 ')]
        #data=st[split:]
        print data
        elements=data.split(' ')
        print elements
        station_id = elements[0]
        try:
            index = ids.index(station_id)
            lat = lats[index]
            lon = lons[index]
            message_type = 'ADPSFC'
        except:
            print 'Station ID',station_id,'not in list!'
            lat = lon = 'NaN'
            message_type = 'Bad_station_id'
        try:
            temp = [item for item in elements if item.startswith('1')][0]
            temperature = float(temp[2:])/10
            sign = temp[1]
            #sign is a one-character string, so compare with '1'
            if sign == '1':
                temperature=-temperature
        except:
            temperature='NaN'

        try:
            dtemp = [item for item in elements if item.startswith('2')][0]
            dtemperature = float(dtemp[2:])/10
            sign = dtemp[1]
            if sign == '1':
                dtemperature=-dtemperature
        except:
            dtemperature='NaN'
        try:
            press = [item for item in elements[2:] if item.startswith('4')][0]
            if press[1]=='9':
                pressure = float(press[1:])/10
            else:
                pressure = float(press[1:])/10+1000
        except:
            pressure = 'NaN'

        try:
            wind = elements[elements.index(temp)-1]
            direction = float(wind[1:3])*10
            speed = float(wind[3:])*0.514444444
        except:
            direction=speed='NaN'

        newline = (message_type+c+str(station_id)+c+newtime+c+lat+c+lon+c+c
                   +"-9999"+c+"002"+c+"-9999"+c+"-9999"+c+str(pressure)+c)
        pres_out+=newline+'\n'

        newline2 = (message_type+c+str(station_id)+c+newtime+c+lat+c+lon+c+c
                    +"-9999"+c+"011"+c+"-9999"+c+"-9999"+c+str(temperature)+c)
        print newline2
        temp_out+=newline2+'\n'
        fout = open(foutname2,'w')
        fout.writelines(temp_out)
        fout.close()

        newline3 = (message_type+c+str(station_id)+c+newtime+c+lat+c+lon+c+c
                    +"-9999"+c+"017"+c+"-9999"+c+"-9999"+c+str(dtemperature)+c)
        print newline3
        dtemp_out+=newline3+'\n'
        fout = open(foutname3,'w')
        fout.writelines(dtemp_out)
        fout.close()

        newline4 = (message_type+c+str(station_id)+c+newtime+c+lat+c+lon+c+c
                    +"-9999"+c+"031"+c+"-9999"+c+"-9999"+c+str(direction)+c)
        print newline4
        dir_out+=newline4+'\n'
        fout = open(foutname4,'w')
        fout.writelines(dir_out)
        fout.close()

        newline5 = (message_type+c+str(station_id)+c+newtime+c+lat+c+lon+c+c
                    +"-9999"+c+"032"+c+"-9999"+c+"-9999"+c+str(speed)+c)
        print newline5
        speed_out+=newline5+'\n'

    fout = open(foutname1,'w')
    fout.writelines(pres_out)
    fout.close()
    fout = open(foutname2,'w')
    fout.writelines(temp_out)
    fout.close()
    fout = open(foutname3,'w')
    fout.writelines(dtemp_out)
    fout.close()
    fout = open(foutname4,'w')
    fout.writelines(dir_out)
    fout.close()
    fout = open(foutname5,'w')
    fout.writelines(speed_out)
    fout.close()

    cheers

    Chris
     
    Christopher Steele, Oct 14, 2010

  2. John Posner Guest

    On 10/14/2010 6:08 AM, Christopher Steele wrote:
    > Hi
    >
    > I've been trying to decode a series of observations from multiple files
    > (each file is a different time) and put each type of observation into
    > their own separate file. The script runs successfully for one file but
    > whenever I try it for more they just overwrite each other.


    fileinput.input() iterates over *lines* not entire *files*. So take a
    look at this location in the code:

    for file in fileinput.input(obs):
        data=file[:file.find(' 333 ')]

    Did you mean your iteration variable to be "file", implying that it will
    hold an entire file of input data?

    If you meant the iteration variable to be named "textline" instead of
    "file", is it guaranteed that string ' 333 ' will occur in every such
    text line?
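
    A minimal sketch of the point above, written for Python 3 (so print is
    a function, unlike in the original script). It shows that each
    iteration yields one text line and that the FileInput object's
    filename() method reports which file that line came from. The helper
    names lines_with_source and before_marker, and the file contents, are
    invented for illustration. Note that str.find() returns -1 when the
    marker is absent, so the slice in the original code would silently drop
    the last character of such a line; str.partition() is one way to avoid
    that.

```python
import fileinput
import os
import tempfile


def lines_with_source(paths):
    """Collect (file name, text line) pairs from several files."""
    pairs = []
    with fileinput.input(files=paths) as stream:
        for textline in stream:  # one *line* per iteration, not one file
            source = os.path.basename(stream.filename())
            pairs.append((source, textline.rstrip("\n")))
    return pairs


def before_marker(textline, marker=" 333 "):
    """Return the text before the marker, or the whole line if it is absent."""
    return textline.partition(marker)[0]


# Hypothetical two-file demo; the contents are invented placeholders.
demo_dir = tempfile.mkdtemp()
demo_files = [
    ("siuk21d_0300P.DATA", "03002 aaa 333 bbb\n03005 ccc\n"),
    ("siuk21d_0900P.DATA", "03002 ddd 333 eee\n"),
]
demo_paths = []
for name, body in demo_files:
    path = os.path.join(demo_dir, name)
    with open(path, "w") as handle:
        handle.write(body)
    demo_paths.append(path)

demo_pairs = lines_with_source(demo_paths)
for source, textline in demo_pairs:
    print(source, "->", before_marker(textline))
```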


    -John
     
    John Posner, Oct 14, 2010

  3. John Posner Guest

    On 10/14/2010 10:44 AM, Christopher Steele wrote:
    > The issue is that I need to be able both to split the names of the
    > files so that I can extract the relevant times, and to open each
    > individual file and process each line individually. Once I have
    > achieved this I need to append the sorted files onto one another in
    > one long file so that I can pass them into a verification package.
    > I've tried changing the name to textline and I get the same result

    I'm very happy to hear that changing the name of a variable did not
    affect the way the program works! Anything else would be worrisome.


    > - the sorted files overwrite one another.


    Variable *time* names a list, with one member for each input file. But
    variable *newtime* names a scalar value, not a list. That looks like a
    problem to me. Either of the following changes might help:

    Original:

    for x in time:
        hour= x[:2]
        print hour
        newtime = year+month+day+'_'+hour+'00'

    Alternative #1:

    newtime = []
    for x in time:
        hour= x[:2]
        print hour
        newtime.append(year+month+day+'_'+hour+'00')

    Alternative #2:
    newtime = [year + month + day + '_' + x[:2] + '00' for x in time]
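
    Once newtime is a list, each input file still has to be matched with its
    own entry. The following is only a sketch of one way to do that, written
    for Python 3 and under several assumptions: the names build_newtimes and
    write_tagged are hypothetical, the separator is assumed to be a comma
    (the original script never shows how c is defined), and the file
    contents are invented placeholders. The output file is opened once,
    before the loop over input files, so lines from later files are added
    after those from earlier ones instead of replacing them.

```python
import os
import tempfile

year, month, day = "2009", "09", "18"


def build_newtimes(filenames):
    """One timestamp per file name, as in Alternative #2."""
    times = [name.split("_")[1].split(".")[0] for name in filenames]
    return [year + month + day + "_" + x[:2] + "00" for x in times]


def write_tagged(paths, out_path, sep=","):
    """Write 'station_id<sep>timestamp' for every non-blank input line."""
    newtimes = build_newtimes([os.path.basename(p) for p in paths])
    with open(out_path, "w") as fout:  # opened once for all input files
        for path, newtime in zip(paths, newtimes):
            with open(path) as fin:
                for textline in fin:
                    if not textline.strip():
                        continue  # skip blank lines
                    station_id = textline.split(" ")[0]
                    fout.write(station_id + sep + newtime + "\n")


# Hypothetical demo data; station IDs and payloads are placeholders.
demo_dir = tempfile.mkdtemp()
samples = {
    "siuk21d_0300P.DATA": "03002 first\n03005 second\n",
    "siuk21d_0900P.DATA": "03002 third\n",
}
paths = []
for name in sorted(samples):
    path = os.path.join(demo_dir, name)
    with open(path, "w") as handle:
        handle.write(samples[name])
    paths.append(path)

out_path = os.path.join(demo_dir, "tagged.txt")
write_tagged(paths, out_path)
with open(out_path) as handle:
    tagged = handle.read().splitlines()
print(tagged)
```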


    HTH,
    John
     
    John Posner, Oct 14, 2010