Processing file with lists.

Geobird

I have a text file with fields delimited by ; . The first line holds
the field names, and every line after it holds the data for those
fields. For example:
FAMILY NAME;SPECIES/SUBSPECIES;GENUS NAME;SUBGENUS NAME;SPECIES NAME;SUBSPECIES NAME;AUTHORSHIP
Acrididae;Acanthacris ruficornis (Fabricius, 1787);Acanthacris;;ruficornis;;(Fabricius, 1787)
Acrididae;Acrida bicolor (Thunberg, 1815);Acrida;;bicolor;;(Thunberg, 1815)
Acrididae;Acrida oxycephala (Pallas, 1771);Acrida;;oxycephala;;(Pallas, 1771)
Acrididae;Acrida turrita (Linnaeus, 1758);Acrida;;turrita;;(Linnaeus, 1758)

I want to know how I could process this file using 'lists', so that
I can answer questions like "How many ...?", "Who did ...?", etc.

I am a newbie and would appreciate your help.
 
Alice Bevan–McGregor

You describe a two-part problem. The first, loading the data, is
easily accomplished with the Python CSV module:

http://docs.python.org/library/csv.html

e.g.: reader = csv.reader(open('filename', 'rb'), delimiter=';',
quotechar=None)

In the above example, you can iterate over 'reader' in a for loop to
read out each row. The values will be returned in a list.
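
The one-liner above opens the file in 'rb' mode, which is what the csv module expected on Python 2. As a minimal sketch for Python 3 (an assumption, since the thread does not say which version is in use), the file would be opened in text mode with newline='' instead. Here io.StringIO stands in for the real file, and read_rows is a hypothetical helper name:

```python
import csv
import io

# Two rows copied from the question. A real script would pass
# open('filename', newline='') instead of io.StringIO(SAMPLE).
SAMPLE = """\
FAMILY NAME;SPECIES/SUBSPECIES;GENUS NAME;SUBGENUS NAME;SPECIES NAME;SUBSPECIES NAME;AUTHORSHIP
Acrididae;Acanthacris ruficornis (Fabricius, 1787);Acanthacris;;ruficornis;;(Fabricius, 1787)
Acrididae;Acrida bicolor (Thunberg, 1815);Acrida;;bicolor;;(Thunberg, 1815)
"""


def read_rows(handle):
    """Return (header, rows); every row is a list of strings."""
    # QUOTE_NONE disables quote handling, like quotechar=None above.
    reader = csv.reader(handle, delimiter=';', quoting=csv.QUOTE_NONE)
    header = next(reader)          # first line holds the field names
    return header, list(reader)    # remaining lines hold the data


header, rows = read_rows(io.StringIO(SAMPLE))
print(header[2])    # GENUS NAME
print(rows[0][2])   # Acanthacris
print(len(rows))    # 2
```

Each row is an ordinary list, so the usual list operations (indexing, len, slicing, loops) apply directly.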

You could also use a DictReader to make the data more naturally
accessible using name=value pairs.
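
A rough sketch of the DictReader variant, again assuming Python 3 and using io.StringIO in place of the real file:

```python
import csv
import io

# Two rows copied from the question; replace io.StringIO(SAMPLE) with
# open('filename', newline='') to read a real file.
SAMPLE = """\
FAMILY NAME;SPECIES/SUBSPECIES;GENUS NAME;SUBGENUS NAME;SPECIES NAME;SUBSPECIES NAME;AUTHORSHIP
Acrididae;Acanthacris ruficornis (Fabricius, 1787);Acanthacris;;ruficornis;;(Fabricius, 1787)
Acrididae;Acrida bicolor (Thunberg, 1815);Acrida;;bicolor;;(Thunberg, 1815)
"""

# DictReader takes the field names from the first line automatically.
reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=';',
                        quoting=csv.QUOTE_NONE)
records = list(reader)

first = records[0]
print(first['GENUS NAME'], first['SPECIES NAME'])   # Acanthacris ruficornis
print(repr(first['SUBGENUS NAME']))                 # '' (empty field)
```

Empty fields such as SUBGENUS NAME come back as empty strings, not None.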

> I want to know how could I process this file using 'lists',
> that could answer questions like: How many? Who did ...?
> etc.

This isn't very clear, but if your dataset is small (< 1000 rows or so)
you can fairly quickly read the data into RAM then run through the data
with loops designed to pull out certain data, though it seems your data
would need additional processing. (The authorship information should
be split into two separate columns, for example.)
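
To illustrate the in-memory approach, here is a sketch under a few assumptions: Python 3, the five sample lines from the question, and authorship always of the form "Author, year" with optional parentheses. The names split_authorship, AUTHOR and YEAR are made up for this example.

```python
import csv
import io
import re
from collections import Counter

SAMPLE = """\
FAMILY NAME;SPECIES/SUBSPECIES;GENUS NAME;SUBGENUS NAME;SPECIES NAME;SUBSPECIES NAME;AUTHORSHIP
Acrididae;Acanthacris ruficornis (Fabricius, 1787);Acanthacris;;ruficornis;;(Fabricius, 1787)
Acrididae;Acrida bicolor (Thunberg, 1815);Acrida;;bicolor;;(Thunberg, 1815)
Acrididae;Acrida oxycephala (Pallas, 1771);Acrida;;oxycephala;;(Pallas, 1771)
Acrididae;Acrida turrita (Linnaeus, 1758);Acrida;;turrita;;(Linnaeus, 1758)
"""

records = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=';',
                              quoting=csv.QUOTE_NONE))

# Assumed pattern: optional "(", author text, comma, 4-digit year, optional ")".
AUTHORSHIP = re.compile(r'^\(?(?P<author>.+?),\s*(?P<year>\d{4})\)?$')


def split_authorship(text):
    """Split '(Linnaeus, 1758)' into ('Linnaeus', 1758)."""
    match = AUTHORSHIP.match(text.strip())
    if match is None:
        return text.strip(), None   # leave unparseable values untouched
    return match.group('author'), int(match.group('year'))


# Add the two derived columns to every record.
for rec in records:
    rec['AUTHOR'], rec['YEAR'] = split_authorship(rec['AUTHORSHIP'])

# "How many?": number of entries per genus.
per_genus = Counter(rec['GENUS NAME'] for rec in records)
print(per_genus['Acrida'])   # 3

# "Who did ...?": author of a given species, and everything by one author.
authors = {rec['SPECIES/SUBSPECIES']: rec['AUTHOR'] for rec in records}
print(authors['Acrida bicolor (Thunberg, 1815)'])   # Thunberg
by_linnaeus = [rec['SPECIES NAME'] for rec in records
               if rec['AUTHOR'] == 'Linnaeus']
print(by_linnaeus)           # ['turrita']
```

Real authorship strings can be messier (several authors, missing years), so the pattern would likely need adjusting for the full file.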

An alternative would be to load the data into a relational database
like MySQL or even SQLite (which offers in-memory databases), or a
document-oriented database such as MongoDB, which supports advanced
querying using map/reduce.
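
For the SQLite route, a minimal sketch using the sqlite3 module from the standard library with an in-memory database might look like this. It assumes Python 3; the table name taxa and its column names are invented for the example.

```python
import csv
import io
import sqlite3

SAMPLE = """\
FAMILY NAME;SPECIES/SUBSPECIES;GENUS NAME;SUBGENUS NAME;SPECIES NAME;SUBSPECIES NAME;AUTHORSHIP
Acrididae;Acanthacris ruficornis (Fabricius, 1787);Acanthacris;;ruficornis;;(Fabricius, 1787)
Acrididae;Acrida bicolor (Thunberg, 1815);Acrida;;bicolor;;(Thunberg, 1815)
Acrididae;Acrida oxycephala (Pallas, 1771);Acrida;;oxycephala;;(Pallas, 1771)
Acrididae;Acrida turrita (Linnaeus, 1758);Acrida;;turrita;;(Linnaeus, 1758)
"""

# ':memory:' keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE taxa (family TEXT, full_name TEXT, genus TEXT, '
    'subgenus TEXT, species TEXT, subspecies TEXT, authorship TEXT)'
)

reader = csv.reader(io.StringIO(SAMPLE), delimiter=';',
                    quoting=csv.QUOTE_NONE)
next(reader)   # skip the header line
# Each CSV row has exactly seven fields, one per column.
conn.executemany('INSERT INTO taxa VALUES (?, ?, ?, ?, ?, ?, ?)', reader)

# "How many?": entries per genus.
counts = conn.execute(
    'SELECT genus, COUNT(*) FROM taxa GROUP BY genus ORDER BY genus'
).fetchall()
print(counts)   # [('Acanthacris', 1), ('Acrida', 3)]

# "Who did ...?": authorship for one species.
who = conn.execute(
    'SELECT authorship FROM taxa WHERE genus = ? AND species = ?',
    ('Acrida', 'turrita'),
).fetchone()[0]
print(who)      # (Linnaeus, 1758)
```

The same queries would work against a file-backed database by passing a filename to sqlite3.connect instead of ':memory:'.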

You'd have to examine the documentation on these different systems to
see which would best fit your use case. I prefer Mongo as it is very
easy to get data into and out of, supports SQL-like queries, and
map/reduce is extremely powerful.

— Alice.
 
