Handling tabular data in Python (newbie question)


hyena

Hi,
I started with Python just a few days ago. I am wondering how to store and
index a table in Python effectively and easily. I think the basic data types
are not very straightforward for handling a table (e.g., from a CSV file or a
database).

I have a CSV file; the first row holds the column names and the remaining rows
are data. There are some tens of columns and hundreds of rows in the file. I am
planning to use the column names as variables to access the data. Currently I
am thinking of using a dictionary to store this file, but I have not figured
out an elegant way to start.

Any comments and suggestions are welcome. Please forgive me if this
question is too naive. And yes, I did search Google for a while but did not
find what I wanted.

Thanks
 

Bruno Desthuilliers

hyena wrote:
Hi,
I started with Python just a few days ago. I am wondering how to store and
index a table in Python effectively and easily. I think the basic data types
are not very straightforward for handling a table (e.g., from a CSV file or a
database).

What makes you think such a thing?

I have a CSV file; the first row holds the column names and the remaining rows
are data. There are some tens of columns and hundreds of rows in the file. I am
planning to use the column names as variables to access the data. Currently I
am thinking of using a dictionary to store this file, but I have not figured
out an elegant way to start.

Use the csv module - it's in the standard lib. It has an option to use
dicts to allow keyed access to 'columns' (look for csv.DictReader).
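
A minimal sketch of the csv.DictReader approach, assuming a comma-separated file whose first row holds the column names. The sample data and the column names (name, age, city) are made up for illustration, and an in-memory io.StringIO stands in for a real file so the example runs on its own.

```python
import csv
import io

# Hypothetical sample data; a real program would pass an open file instead,
# e.g. open("data.csv", newline="").
sample = io.StringIO("name,age,city\nalice,30,Paris\nbob,25,Lyon\n")

# DictReader takes the first row as field names and yields one dict per data row.
reader = csv.DictReader(sample)
records = list(reader)

# Every value comes back as a string; convert where a number is needed.
ages = [int(rec["age"]) for rec in records]

print(records[0]["city"])
print(ages)
```

Each record is an ordinary mapping keyed by column name, so rec["city"] reads much like accessing a named column.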

Anyway, if you have anything complex to do with your data, you'd
probably be better off using SQLite (possibly using the csv module to
help import your data).
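
One possible way to combine the two, sketched with the standard library's csv and sqlite3 modules. The table name (people), the sample data, and the in-memory database are assumptions made for the example, not something from the original question.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real CSV file.
sample = io.StringIO(
    "name,age,city\nalice,30,Paris\nbob,25,Lyon\ncarol,41,Paris\n"
)
reader = csv.reader(sample)
header = next(reader)  # first row: column names

conn = sqlite3.connect(":memory:")  # in-memory database, discarded on close

# Build the table from the header. The column names are pasted into the SQL
# text, so only do this with files you trust.
columns = ", ".join('"%s"' % name for name in header)
placeholders = ", ".join("?" for _ in header)
conn.execute("CREATE TABLE people (%s)" % columns)

# The remaining rows of the reader are inserted one by one.
conn.executemany("INSERT INTO people VALUES (%s)" % placeholders, reader)

# Once loaded, filtering and sorting are plain SQL.
paris = conn.execute(
    "SELECT name FROM people WHERE city = ? ORDER BY name", ("Paris",)
).fetchall()
print(paris)
conn.close()
```

For a few hundred rows this is probably more machinery than needed, but it pays off once the questions asked of the data involve joins, grouping, or sorting on several columns.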

My 2 cents...
 

Steve Holden

One way would be to store each row as a dictionary. Suppose your data
file is called "myfile.txt" and, for simplicity, that columns are
separated by whitespace. Please note the following code is untested.

    f = open("myfile.txt", "r")
    names = next(f).split()  # the first line holds the column names

So now names contains a list of the field names you want to use. Let's
store the rows in a dictionary of dictionaries, using the first column
to index each row.

    rows = {}
    for line in f:  # continues with the lines after the header
        cols = line.split()
        rdict = dict(zip(names, cols))  # column name -> value for this row
        rows[cols[0]] = rdict

dict(zip(names, cols)) should create a dictionary where each field is
stored against its column name. I assume that cols[0] is unique for each
row, otherwise you will suffer data loss unless you check for that
circumstance. You can check this kind of thing in the interactive
interpreter:
>>> names = ["first", "second", "third"]
>>> dict(zip(names, [1, 2, 3]))
{'second': 2, 'third': 3, 'first': 1}
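
The uniqueness check mentioned above could look roughly like this. It is only a sketch: the column names and data lines are invented, and keeping the first row while recording later clashes is one policy among several (raising an error would be another).

```python
names = ["id", "name", "city"]  # hypothetical column names
# Hypothetical whitespace-separated data lines; "a1" appears twice on purpose.
lines = ["a1 alice Paris", "b2 bob Lyon", "a1 anna Nice"]

rows = {}
duplicates = []
for line in lines:
    cols = line.split()
    key = cols[0]
    if key in rows:
        # Keep the first row seen and record the clash instead of overwriting.
        duplicates.append(key)
        continue
    rows[key] = dict(zip(names, cols))

print(rows["a1"]["name"])
print(duplicates)
```
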
Another alternative, however, would be to create an object for each row
where the columns are stored as attributes. This approach would be
useful if the column names are predictable, but rather less so if each
of your data files were to use different column names. Let us know and
if appropriate someone can point you at the "bunch" class.
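
For reference, a common form of the "bunch" recipe is sketched below. It is not necessarily the exact class meant above, and the column names and row data are made up. Note that it only works when the column names are valid Python identifiers.

```python
class Bunch(object):
    """Minimal attribute container: Bunch(a=1).a == 1."""

    def __init__(self, **kwargs):
        # Every keyword argument becomes an attribute of the instance.
        self.__dict__.update(kwargs)

    def __repr__(self):
        items = ", ".join(
            "%s=%r" % pair for pair in sorted(self.__dict__.items())
        )
        return "Bunch(%s)" % items


names = ["id", "name", "city"]  # hypothetical column names
cols = "a1 alice Paris".split()  # one hypothetical data row

row = Bunch(**dict(zip(names, cols)))
print(row.name)
print(row)
```

With this, row.city reads the value that the dictionary version would reach as rdict["city"].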

Welcome to Python!

regards
Steve
 

hyena

Thanks, Bruno and Steve, for the quick answers!

Bruno wrote:
What makes you think such a thing?

I also use R and Java from time to time, and I find it very convenient
that R handles tables as matrices or data frames, so I expected Python
to have something similar. :)

I went with Steve's first suggestion, which is just what I need at the
moment.

Thanks again.
 
