table from csv file


M

marlowe

I am trying to create a table in python from a csv file where I input
which columns I would like to see, and the table only shows those
columns. I have attached an example of the csv file i am using, and
some of the code I have written. I am having trouble converting
variables between lists, dictionaries and tuples. Is there a name for
what I am attempting to do? any help to get me on the right track with
this is appreciated.

test.csv
Date Open High Low Close Volume Adj Close
12/14/09 110.01 110.7 109.5 110.24 16316000 110.24
12/11/09 110.6 110.82 108.72 109.32 28983100 109.32
12/10/09 110.66 111.03 110.04 110.82 20491700 110.82
12/9/09 111.6 112.48 109.38 110.84 37104900 110.84
12/8/09 112.76 113.06 110.21 110.95 37630800 110.95
12/7/09 111.51 114.22 111.44 113.11 42480500 113.11
12/4/09 117.15 117.26 112.41 113.75 79182600 113.75
12/3/09 118.57 119.54 118.03 118.7 28802100 118.7
12/2/09 118.8 119.27 118.3 119.18 30994600 119.18
12/1/09 117.3 117.93 116.78 117.38 27641000 117.38
11/30/09 114.48 115.89 114.27 115.64 16402300 115.64
11/27/09 113.08 115.81 113.02 115.06 21234400 115.06
11/25/09 115.69 116.88 115.53 116.62 24553300 116.62
11/24/09 114.73 114.81 113.97 114.73 22599700 114.73
11/23/09 114.67 115.12 113.99 114.29 24422700 114.29
11/20/09 111.74 112.94 111.54 112.94 17302500 112.94
11/19/09 111.85 112.4 110.76 112.3 21239800 112.3
11/18/09 112.69 113.09 111.8 112.25 22320600 112.25
11/17/09 111.09 111.99 110.9 111.97 19732900 111.97
11/16/09 110.7 112.16 110.65 111.63 25002300 111.63
11/13/09 108.32 109.8 108.14 109.74 17246000 109.74
11/12/09 109.16 109.56 108.12 108.21 17848300 108.21
11/11/09 109.49 109.71 109 109.6 17654100 109.6
11/10/09 108.03 108.78 107.7 108.39 15973300 108.39
11/9/09 108.69 108.75 107.91 108.19 18444800 108.19
11/6/09 107.38 108.04 107.06 107.43 14789000 107.43
11/5/09 106.81 107.2 106.6 106.98 10189000 106.98
11/4/09 107.11 107.68 106.43 107.1 27125500 107.1

table.py;

import csv

(D, O, H, L, C, V, A) = (11, 'open', 'high', 'low', 'close', 66, 77)

d = {'high':H, 'low':L, 'close':C, 'open':O}

spacing = '%-*s'

w=raw_input('what do you want to see? (use commas to seperate values)
')
y=w.lower()
x=y.replace(' ','')
print x

p = x.split(',') #this takes string, converts to
list
print p
num = len(p)
format = num*spacing
width = 12*num
wi = 12
secwidth = width - wi


bb = []
i=0
while i < num: #creates new list with variables
ll = d[p]
bb.insert(i,ll)
i+=1
print bb
i = 0
while i < num:
bb.insert(i*2, 12) #this works on list
i+=1
print bb



i = 0
while i < num:
p.insert(i*2, 12) #this works on list
i+=1

q = tuple (p) #takes list, converts to tuple


reader = csv.reader(open('/prog/test.csv', "rb"))



rownum = 0
for row in reader:
if rownum == 0:
print '=' * width
print format % (q)
print '-' * width
print ''

else:
D, O, H, L, C, V, A = row [:7]
o = tuple (bb)
print format % (o)

rownum += 1
 
Ad

Advertisements

J

Jon Clements

I am trying to create a table in python from a csv file where I input
which columns I would like to see, and the table only shows those
columns. I have attached an example of the csv file i am using, and
some of the code I have written. I am having trouble converting
variables between lists, dictionaries and tuples. Is there a name for
what I am attempting to do? any help to get me on the right track with
this is appreciated.

test.csv
Date    Open     High    Low    Close   Volume  Adj Close
12/14/09        110.01  110.7   109.5   110.24  16316000        110.24
12/11/09        110.6   110.82  108.72  109.32  28983100        109.32
12/10/09        110.66  111.03  110.04  110.82  20491700        110.82
12/9/09 111.6   112.48  109.38  110.84  37104900        110.84
12/8/09 112.76  113.06  110.21  110.95  37630800        110.95
12/7/09 111.51  114.22  111.44  113.11  42480500        113.11
12/4/09 117.15  117.26  112.41  113.75  79182600        113.75
12/3/09 118.57  119.54  118.03  118.7   28802100        118.7
12/2/09 118.8   119.27  118.3   119.18  30994600        119.18
12/1/09 117.3   117.93  116.78  117.38  27641000        117.38
11/30/09        114.48  115.89  114.27  115.64  16402300        115.64
11/27/09        113.08  115.81  113.02  115.06  21234400        115.06
11/25/09        115.69  116.88  115.53  116.62  24553300        116.62
11/24/09        114.73  114.81  113.97  114.73  22599700        114.73
11/23/09        114.67  115.12  113.99  114.29  24422700        114.29
11/20/09        111.74  112.94  111.54  112.94  17302500        112.94
11/19/09        111.85  112.4   110.76  112.3   21239800        112.3
11/18/09        112.69  113.09  111.8   112.25  22320600        112.25
11/17/09        111.09  111.99  110.9   111.97  19732900        111.97
11/16/09        110.7   112.16  110.65  111.63  25002300        111.63
11/13/09        108.32  109.8   108.14  109.74  17246000        109.74
11/12/09        109.16  109.56  108.12  108.21  17848300        108.21
11/11/09        109.49  109.71  109     109.6   17654100        109.6
11/10/09        108.03  108.78  107.7   108.39  15973300        108.39
11/9/09 108.69  108.75  107.91  108.19  18444800        108.19
11/6/09 107.38  108.04  107.06  107.43  14789000        107.43
11/5/09 106.81  107.2   106.6   106.98  10189000        106.98
11/4/09 107.11  107.68  106.43  107.1   27125500        107.1

table.py;

import csv

(D, O, H, L, C, V, A) = (11, 'open', 'high', 'low', 'close', 66, 77)

d = {'high':H, 'low':L, 'close':C, 'open':O}

spacing = '%-*s'

w=raw_input('what do you want to see? (use commas to seperate values)
')
y=w.lower()
x=y.replace(' ','')
print x

p = x.split(',')                    #this takes string, converts to
list
print p
num = len(p)
format = num*spacing
width = 12*num
wi = 12
secwidth = width - wi

bb = []
i=0
while i < num:                      #creates new list with variables
    ll = d[p]
    bb.insert(i,ll)
    i+=1
print bb
i = 0
while i < num:
    bb.insert(i*2, 12)               #this works on list
    i+=1
print bb

i = 0
while i < num:
    p.insert(i*2, 12)               #this works on list
    i+=1

q = tuple (p)                       #takes list, converts to tuple

reader = csv.reader(open('/prog/test.csv', "rb"))

rownum = 0
for row in reader:
    if rownum == 0:
        print '=' * width
        print format % (q)
        print '-' * width
        print ''

    else:
        D, O, H, L, C, V, A = row [:7]
        o = tuple (bb)
        print format % (o)

    rownum += 1


This might be a useful starting point (I'm guessing this is what
you're after...)

Let's assume your 'CSV' file is tab separated as it's certainly not
comma separated :)

import csv
csvin = csv.reader(open('test.csv'), delimiter='\t')
header = dict( (val.strip(),idx) for idx, val in enumerate(next
(csvin)) )

We can use header as a column name->column index lookup eg header
['Open'] == 1

from operator import itemgetter
wanted = ['Open', 'Close'] # Although you'll want to use raw_input and
split on ','
getcols = itemgetter(*[header[col] for col in wanted])

getcols is a helper function that'll return a tuple of the columns in
the requested order...

for row in csvin:
print getcols(row)

Loop over the rest of the file and output the required columns.


hth
Jon.
 
J

J

I had to edit that and comma delimit it, because cut and paste gave me
random numbers/types of whitespace...

[code snipped]
This might be a useful starting point (I'm guessing this is what
you're after...)

Let's assume your 'CSV' file is tab separated as it's certainly not
comma separated :)

import csv
csvin = csv.reader(open('test.csv'), delimiter='\t')
header = dict( (val.strip(),idx) for idx, val in enumerate(next
(csvin)) )

We can use header as a column name->column index lookup eg header
['Open'] == 1

from operator import itemgetter
wanted = ['Open', 'Close'] # Although you'll want to use raw_input and
split on ','
getcols = itemgetter(*[header[col] for col in wanted])

getcols is a helper function that'll return a tuple of the columns in
the requested order...

for row in csvin:
   print getcols(row)

Loop over the rest of the file and output the required columns.

As someone who knows just enough to be dangerous... what about this:

import csv

reader = open('C:/test.txt','rb')
data = csv.DictReader(reader,restval='000',restkey='Misc')

print "Options are: Date, Open, Close, High, Low, Volume, Adj Close"
column = raw_input('What do you want? (comma delimited)?')
choice = column.split(',')
f = 12

print ''.join([s.center(f) for s in choice])

for row in data:
print ''.join([(row.get(s)).center(f) for s in choice])
 
J

Jon Clements

I had to edit that and comma delimit it, because cut and paste gave me
random numbers/types of whitespace...

[code snipped]


This might be a useful starting point (I'm guessing this is what
you're after...)
Let's assume your 'CSV' file is tab separated as it's certainly not
comma separated :)
import csv
csvin = csv.reader(open('test.csv'), delimiter='\t')
header = dict( (val.strip(),idx) for idx, val in enumerate(next
(csvin)) )
We can use header as a column name->column index lookup eg header
['Open'] == 1
from operator import itemgetter
wanted = ['Open', 'Close'] # Although you'll want to use raw_input and
split on ','
getcols = itemgetter(*[header[col] for col in wanted])
getcols is a helper function that'll return a tuple of the columns in
the requested order...
for row in csvin:
   print getcols(row)
Loop over the rest of the file and output the required columns.

As someone who knows just enough to be dangerous... what about this:

import csv

reader = open('C:/test.txt','rb')
data = csv.DictReader(reader,restval='000',restkey='Misc')

[snip]

DictReader works, but what use to bug me was the fact you couldn't
then output the cols in the 'correct' order afterwards, so you had
to store the header row anyway to re-order the rows...
(although admittedly this doesn't affect the OP's question).

However, I see that 2.6+ offers .fieldnames on DictReader objects.

Cheers,

Jon.
 
Ad

Advertisements

J

J

reader = open('C:/test.txt','rb')
data = csv.DictReader(reader,restval='000',restkey='Misc')

[snip]

DictReader works, but what use to bug me was the fact you couldn't
then output the cols in the 'correct' order afterwards, so you had
to store the header row anyway to re-order the rows...
(although admittedly this doesn't affect the OP's question).

However, I see that 2.6+ offers .fieldnames on DictReader objects.

At a guess, and I stipulate that because I am just learning about all
this, I'd say that's because DictReader returns a dictionary for each
row retrieved... It DOES take the first row and make those the dict
keys, but after that, it just returns a Dict for each rows and
dictionaries seem to be randomly ordered.

It may not be that great if you're doing something that needs order,
but all he's doing is pulling arbitrary columns from a file and
printing as a table, so that works great in this case...

It was quite interesting messing around with that yesterday. I saw
the question, and it looked interesting, so I went and learned and
applied ;-) I'm rather smugly satisfied with myself ...

Cheers,

Jeff
 

Top