lists vs. NumPy arrays for sets of dates and strings


beliavsky

I am going to read a multivariate time series from a CSV file that looks like

Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
....

The numerical data I will store in a NumPy array, since they are more convenient to work with than lists of lists. What are the advantages and disadvantages of storing the symbols [A,B] and dates [2014-01-01,2014-01-02] as lists vs. NumPy arrays?
 

Denis McMahon

You could also use a dictionary of either lists or tuples or even NumPy
arrays keyed on the date.

$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> x = {}
>>> y = numpy.array( [0,1] )
>>> x['2014-06-05'] = y
>>> x['2014-06-05']
array([0, 1])
>>> x
{'2014-06-05': array([0, 1])}
>>> x['2014-06-05'][0]
0
>>> x['2014-06-05'][1]
1
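
For the CSV layout in the question, the date-keyed dictionary could be built along these lines. This is only a sketch in Python 3: the helper name read_rows_by_date and the inline sample text are made up for illustration, and a real script would pass an open file instead of a StringIO.

```python
import csv
import io

import numpy as np

# Sample data in the layout from the question (stand-in for a real file).
CSV_TEXT = """Date,A,B
2014-01-01,10.0,20.0
2014-01-02,10.1,19.9
"""


def read_rows_by_date(handle):
    """Return (symbols, rows); rows maps a date string to a float array."""
    reader = csv.reader(handle)
    header = next(reader)
    symbols = header[1:]  # column names after "Date"
    rows = {}
    for record in reader:
        # record[0] is the date string, the rest are the numeric fields
        rows[record[0]] = np.array([float(v) for v in record[1:]])
    return symbols, rows


symbols, rows = read_rows_by_date(io.StringIO(CSV_TEXT))
print(symbols)
print(rows["2014-01-02"][symbols.index("B")])
```

Lookup by exact date is cheap this way, but the values are no longer one 2-D array, so column-wise operations across all dates need an extra step to stack the rows again.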
 

Peter Otten

If you don't mind the NumPy dependency, I can't see any disadvantages.
You might also have a look at pandas:
>>> import io
>>> import pandas
>>> import pylab
>>> ts = pandas.read_csv(io.StringIO("""\
... Date,A,B
... 2014-01-01,10.0,20.0
... 2014-01-02,10.1,19.9
... """), parse_dates=[0])
>>> ts
                 Date     A     B
0 2014-01-01 00:00:00  10.0  20.0
1 2014-01-02 00:00:00  10.1  19.9
>>> ts["A"]
0    10.0
1    10.1
Name: A, dtype: float64
>>> ts["Date"]
0   2014-01-01 00:00:00
1   2014-01-02 00:00:00
Name: Date, dtype: datetime64[ns]
>>> ts["Date"][0]
Timestamp('2014-01-01 00:00:00', tz=None)
>>> pylab.show(ts.plot(x="Date", y=["A", "B"]))
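
If pandas is more than you need, the list vs. array question can be tried out with plain NumPy. The following is a rough sketch, assuming the two data rows from the question; the variable names are illustrative. It keeps the symbols in a list and the dates in a datetime64 array.

```python
import numpy as np

# Numeric block from the question: one row per date, one column per symbol.
values = np.array([[10.0, 20.0],
                   [10.1, 19.9]])

# A plain list is enough for the symbols: lookups go through .index().
symbols = ["A", "B"]

# Dates as a datetime64 array so they can be compared element-wise.
dates = np.array(["2014-01-01", "2014-01-02"], dtype="datetime64[D]")

# A boolean mask over the dates selects the matching rows of the values.
mask = dates >= np.datetime64("2014-01-02")
selected = values[mask]

# Column lookup by symbol name through the list.
col_b = values[:, symbols.index("B")]

# Date arithmetic is vectorized on the array; a list would need a loop.
gaps = np.diff(dates)

print(selected)
print(col_b)
print(gaps)
```

With the dates in an array, the same mask indexes both the dates and the values, which a list of date strings cannot do without a comprehension. For a handful of symbols a list costs nothing, and symbols.index("B") reads clearly.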
 
