writing results to array

Bevan Jenkins · Dec 3, 2007

Hello,

I have recently discovered the Python language and am having a lot of
fun getting my head around the basics of it.
However, I have run into a stumbling block that I have not been able
to overcome, so I thought I would ask for help.
<Overview>
I am trying to import a text file that has the following format:
02/01/2000 @ 00:00:00 0.983896 Q10 T2
03/01/2000 @ 00:00:00 0.557377 Q10 T2
04/01/2000 @ 00:00:00 0.508871 Q10 T2
05/01/2000 @ 00:00:00 0.583196 Q10 T2
06/01/2000 @ 00:00:00 0.518281 Q10 T2
when there is missing data:
12/09/2000 @ 00:00:00 Q151 T2
13/09/2000 @ 00:00:00 Q151 T2

I have cobbled together some code which imports the data. The next
step is to create an array in which each column contains a year's worth
of values. Thus, if I have 6 years of data (2001-2006 inclusive),
there will be six columns, with 365 rows (not all years have a full
data set; some may only have, say, 340 days of data).
<The question>
In the code below,
print answer[j,1] is giving me the right answer but I can't write it
to an array.
Any suggestions welcomed.

This is what I have:
flow=[]
flowdate=[]
yeardate=[]
uniqueyear=[]
#flow_order=
flow_rank=[]
icount=[]
p=[]

filename=r"C:\Documents and Settings\bevanj\Desktop\flow_duration.tsf"
linesep ="\n"

# read in whole file
tempdata = open( filename).read()
# break into lines
tempdata = string.split( tempdata, linesep )
# for each record, get the field values
for i in range( len( tempdata)):
    # split into the lines
    fields = string.split( tempdata[i])
    if len(fields)>5:
        flowdate.append(fields[0])
        list =string.split(fields[0],"/")
        yeardate.append(list[2])
        flow.append(float(fields[3]))
        answer=column_stack((flowdate,flow))

for rows in yeardate:
    if rows not in uniqueyear:
        uniqueyear.append(rows)

#print answer[:,0] #date
flow_order=empty((0,0),dtype=float)
#for yr in enumerate(uniqueyear):
for iyr,yr in enumerate(uniqueyear):
    for j, val, in enumerate (answer[:,0]):
        flowyr=string.split(val,"/")
        if int(flowyr[2])==int(yr):
            print answer[j,1]
            #flow_order =

Matimus · Dec 3, 2007


I'm not sure what you mean by `write it to an array'. `answer' is an
array. Perhaps you could show an example that has the bad behavior you
are observing, or at least an example of what you expect to get.

Also, just a couple of pointers:

this:

tempdata = open( filename).read()
# break into lines
tempdata = string.split( tempdata, linesep )
# for each record, get the field values
for i in range( len( tempdata)):
    # split into the lines
    fields = string.split( tempdata[i])

is better written (and usually written) in python like this:

for line in open(filename):
    fields = line.split()

Don't use the string module, use the methods of the strings
themselves.
Don't use built-in type names as variable names, as seen on this line:

list =string.split(fields[0],"/") # list is a built-in type


You only need to use enumerate if you actually want the index. If you
don't need the index, just iterate over the sequence. eg. use this:

for yr in uniqueyear:


You don't need to re-create the column-stack each time you get a value
from the file. It is very inefficient.

eg. this:

for i in range( len( tempdata)):
    # split into the lines
    fields = string.split( tempdata[i])
    if len(fields)>5:
        flowdate.append(fields[0])
        list =string.split(fields[0],"/")
        yeardate.append(list[2])
        flow.append(float(fields[3]))
        answer=column_stack((flowdate,flow))

to this:

for i in range( len( tempdata)):
    # split into the lines
    fields = string.split( tempdata[i])
    if len(fields)>5:
        flowdate.append(fields[0])
        list =string.split(fields[0],"/")
        yeardate.append(list[2])
        flow.append(float(fields[3]))
# build the array once, after the loop has finished
answer=column_stack((flowdate,flow))

or, with the other suggested changes:

for line in open(filename):
    # split into the lines
    fields = line.split()
    if len(fields) > 5:
        flowdate.append(fields[0])
        year = fields[0].split("/")[2]
        yeardate.append(year)
        flow.append(float(fields[3]))
answer=column_stack((flowdate,flow))

If I was doing this though, I would use a dictionary (dict) where the
keys are the year and the values are lists of flows for that year.

Something like this:

Code:

filename=r"C:\Documents and Settings\bevanj\Desktop\flow_duration.tsf"
year2flows = {}
fin = open(filename)
for line in fin:
    # split into the lines
    fields = line.split()
    if len(fields)>5:
        date = fields[0]
        year = fields[0].split("/")[-1]  # the year is kept as a string
        flow = float(fields[3])
        year2flows.setdefault(year, []).append((date, flow))
fin.close()

# This does what you were doing.
for yr in sorted(year2flows.keys()):
    for date, flow in year2flows[yr]:
        print flow

# If you just wanted one year though you could do something like this
# (the keys are strings, so the year is quoted):
for date, flow in year2flows["2004"]:
    print flow

The above code is untested, so I make no guarantees. If you are using
python 2.5, you might look into using defaultdict (in the collections
module). It will simplify the code a bit.

from this:
year2flows = {}
# bunch of stuff...
year2flows.setdefault(year, []).append((date, flow))
to this:
from collections import defaultdict
year2flows = defaultdict(list)
# bunch of stuff...
year2flows[year].append((date, flow))

Matt
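For readers on Python 3, here is a minimal sketch of the same defaultdict grouping. It is not the code from the post: the sample lines are a made-up subset in the format shown in the question, and the names SAMPLE and group_by_year are hypothetical. It assumes a complete record splits into exactly six whitespace-separated fields.

```python
from collections import defaultdict

# Hypothetical sample in the file format from the question; the third
# line has no flow value and therefore only five fields.
SAMPLE = """\
02/01/2000 @ 00:00:00 0.983896 Q10 T2
03/01/2000 @ 00:00:00 0.557377 Q10 T2
12/09/2000 @ 00:00:00 Q151 T2
02/01/2001 @ 00:00:00 0.408871 Q10 T2
"""


def group_by_year(lines):
    """Return {year: [(date, flow), ...]}, skipping records with no flow."""
    year2flows = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) == 6:  # complete records have six fields
            date = fields[0]
            year = date.split("/")[-1]  # year stays a string, e.g. "2000"
            year2flows[year].append((date, float(fields[3])))
    return dict(year2flows)


year2flows = group_by_year(SAMPLE.splitlines())
for year in sorted(year2flows):
    for date, flow in year2flows[year]:
        print(year, date, flow)
```

Because the keys are strings, a single year is looked up as year2flows["2000"], not year2flows[2000].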

Chris · Dec 4, 2007


Maybe you're looking for something more in the line of:

fInput = open('tst.txt')
dictObj = {}
"""{ Year_Key: { DayKey: FloatValue}}"""
for each_line in fInput.readlines():
    if each_line.strip():
        line = each_line.strip().split()
        if len(line) == 6:
            if dictObj.has_key(line[0].split('/')[-1]):
                tmpDict = dictObj[line[0].split('/')[-1]]
                # convert to float so the stored value matches the layout above
                tmpDict[line[0]] = float(line[3])
            else:
                dictObj[line[0].split('/')[-1]] = {line[0]: float(line[3])}
fInput.close()
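dict.has_key() was removed in Python 3, so the snippet above only runs on Python 2. A rough Python 3 sketch of the same nested layout ({year: {date: flow}}) follows, using setdefault in place of the has_key branch. The function name nest_by_year and the sample records are hypothetical, and the flow is converted to float on the assumption that numeric values are wanted.

```python
def nest_by_year(lines):
    """Return {year: {date: flow}} for complete six-field records."""
    by_year = {}
    for raw in lines:
        fields = raw.split()
        if len(fields) == 6:  # blank lines and missing-data lines are skipped
            date, flow = fields[0], float(fields[3])
            year = date.split("/")[-1]
            # setdefault creates the inner dict the first time a year is seen
            by_year.setdefault(year, {})[date] = flow
    return by_year


# Hypothetical records: one complete, one missing its value, one blank.
records = [
    "02/01/2000 @ 00:00:00 0.983896 Q10 T2",
    "12/09/2000 @ 00:00:00 Q151 T2",
    "",
    "04/01/2001 @ 00:00:00 0.408871 Q10 T2",
]
nested = nest_by_year(records)
print(nested["2000"]["02/01/2000"])
```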

Dennis Lee Bieber · Dec 4, 2007

<The question>
In the code below
print answer[j,1] is giving me the right answer but i can't write it
to an array.

Unless you are using some module/class that you didn't show us in
the code, Python doesn't really have arrays (there is an array module in
the standard library, but I don't recall ever seeing it used, and then
there are the various numeric processing modules: numarray, Numeric, and
numpy [which supersedes the other two]).

answer=column_stack((flowdate,flow))

You don't supply the code/definition for column_stack(), other than
that you are passing in a single argument -- which is a tuple containing
a list of dates and a list of whatever "flow" represents. Lacking this,
I can not guess what "answer" is supposed to represent.

#print answer[:,0] #date
flow_order=empty((0,0),dtype=float)

Where did empty() come from, and what is it supposed to be doing?

A cut at the parsing half of the problem:

-=-=-=-=-=-=-

#FILENAME = r"C:\Documents and Settings\bevanj\Desktop\flow_duration.tsf"
FILENAME = "test.data"
#convention is that "constants" be all UPPERCASE name

data = {}   #empty dictionary

fin = open(FILENAME, "r")

for ln in fin:  #automatically reads by lines
    flds = ln.split()   #use string methods, not module functions
    if len(flds) == 6:
        (day, mon, year) = flds[0].split("/")
        if year in data:    #does dictionary already have the year?
            #append to previous list
            data[year].append((flds[0], float(flds[3])))
        else:
            #new list created
            data[year] = [(flds[0], float(flds[3]))]

fin.close()

#at this point, we should have a dictionary keyed by year, each year
#contains a list of (date, value) tuples.

# no code was given for column_stack() which looks to be taking
# ONE argument: a tuple containing two lists -> a list of dates and a
# list of values [just the opposite of what the above code produces,
# to wit:
# ( [ "01/01/2001", "01/02/2001", ...], [ 0.9, 0.5, ...] )
# vs
# [ ( "01/01/2001", 0.9), ( "01/02/2001", 0.5), ... ( ..., ...) ]
#
# with no code for it, I can not guess at what "answer" is supposed to
# contain
#
# furthermore, for normal Python lists (NOT arrays -- arrays are special
# module creatures and don't work quite like lists) one does not write
# multidimensional (nested lists) using mdl[x, y] notation, but by
# mdl[x][y]

import pprint
pprint.pprint(data)
-=-=-=-=-=-=-=-

When fed the following data (note that I mixed some orders to
illustrate the code)

-=-=-=-=-=-=-=-
02/01/2000 @ 00:00:00 0.983896 Q10 T2
03/01/2000 @ 00:00:00 0.557377 Q10 T2
04/01/2000 @ 00:00:00 0.508871 Q10 T2
05/01/2000 @ 00:00:00 0.583196 Q10 T2
06/01/2000 @ 00:00:00 0.518281 Q10 T2
12/09/2000 @ 00:00:00 Q151 T2
13/09/2000 @ 00:00:00 Q151 T2
02/01/2001 @ 00:00:00 0.983896 Q10 T2
03/01/2001 @ 00:00:00 0.557377 Q10 T2
04/01/2002 @ 00:00:00 0.608871 Q10 T2
05/01/2001 @ 00:00:00 0.583196 Q10 T2
06/01/2001 @ 00:00:00 0.518281 Q10 T2
12/09/2001 @ 00:00:00 Q151 T2
13/09/2002 @ 00:00:00 Q151 T2
02/01/2002 @ 00:00:00 0.983896 Q10 T2
03/01/2002 @ 00:00:00 0.557377 Q10 T2
04/01/2001 @ 00:00:00 0.408871 Q10 T2
05/01/2002 @ 00:00:00 0.583196 Q10 T2
06/01/2002 @ 00:00:00 0.518281 Q10 T2
12/09/2002 @ 00:00:00 Q151 T2
13/09/2001 @ 00:00:00 Q151 T2

-=-=-=-=-=-=-=-

produces:

pythonw -u "Script11.py"

{'2000': [('02/01/2000', 0.98389599999999999),
('03/01/2000', 0.55737700000000001),
('04/01/2000', 0.50887099999999996),
('05/01/2000', 0.58319600000000005),
('06/01/2000', 0.51828099999999999)],
'2001': [('02/01/2001', 0.98389599999999999),
('03/01/2001', 0.55737700000000001),
('05/01/2001', 0.58319600000000005),
('06/01/2001', 0.51828099999999999),
('04/01/2001', 0.40887099999999998)],
'2002': [('04/01/2002', 0.60887100000000005),
('02/01/2002', 0.98389599999999999),
('03/01/2002', 0.55737700000000001),
('05/01/2002', 0.58319600000000005),
('06/01/2002', 0.51828099999999999)]}

Exit code: 0
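The original goal was one column per year. Below is a hedged Python 3 sketch of one way to get there from a dictionary shaped like the output above. It is an illustration, not code from the thread: the function name year_columns and the small data sample are hypothetical, dates are assumed to be dd/mm/yyyy, and short years are padded with NaN so every column has the same length.

```python
from datetime import datetime
from itertools import zip_longest

# Same shape as the parsed dictionary: {year: [(date, flow), ...]}.
# Hypothetical sample; 2001 is deliberately short and out of order.
data = {
    "2000": [("02/01/2000", 0.983896), ("03/01/2000", 0.557377),
             ("04/01/2000", 0.508871)],
    "2001": [("03/01/2001", 0.557377), ("02/01/2001", 0.983896)],
}


def year_columns(data, fill=float("nan")):
    """Return (years, rows); rows[i][k] is the i-th flow of years[k]."""
    years = sorted(data)
    columns = []
    for year in years:
        # Sort each year's records chronologically (dates are dd/mm/yyyy).
        ordered = sorted(
            data[year],
            key=lambda rec: datetime.strptime(rec[0], "%d/%m/%Y"),
        )
        columns.append([flow for _date, flow in ordered])
    # zip_longest transposes columns into rows and pads the short years.
    rows = [list(row) for row in zip_longest(*columns, fillvalue=fill)]
    return years, rows


years, rows = year_columns(data)
print(years)
for row in rows:
    print(row)
```

Note that rows line up by position within each year, not by calendar day, so a year with gaps in the middle will be shifted relative to the others. If NumPy is available, numpy.array(rows) should turn the result into a two-dimensional array that supports the answer[j, 1] style of indexing used in the question.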

--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/

Bevan Jenkins · Dec 4, 2007

Thank you all very much.

Firstly for providing an answer that does exactly what I require. But
also for the hints on the naming conventions and the explanations of
how I was going wrong.

Thanks again,
b
