time series data and NumPy

B

BBands

Good morning,

I store time series data in a SQL database. The results of a typical
query using pyodbc look like this.

Date Close
"2007-01-17" 22.57

Where Date is a datetime.date object and Close is a float.

I'd like to put this data in a NumPy array for processing, but am
unsure as to how to handle the date. In the past I've used lists, but I
am looking to boost speed a bit as I wish to do a large number of
transformations and comparisons.

Can one index an array using datetime objects?

For example it would be nice to do a union of two arrays so that any
dates missing in either one were eliminated.

Thoughts on doing rolling operations, such as an n-period average or
variance?

Thoughts on working with time series data in arrays in general?

Thanks in advance,

jab--who is very happily returning to Python after a sojourn in
R-land
 
D

Diez B. Roggisch

BBands said:
Good morning,

I store time series data in a SQL database. The results of a typical
query using pyodbc look like this.

Date Close
"2007-01-17" 22.57

Where Date is a datetime.date object and Close is a float.

I'd like to put this data in a NumPy array for processing, but am
unsure as to how to handle the date. In the past I've used lists, but I
am looking to boost speed a bit as I wish to do a large number of
transformations and comparisons.

Can one index an array using datetime objects?

For example it would be nice to do a union of two arrays so that any
dates missing in either one were eliminated.

Thoughts on doing rolling operations, such as an n-period average or
variance?

Thoughts on working with time series data in arrays in general?

I'm pretty sure you're out of luck here - even _if_ NumPy would handle
arbitrary data-types (AFAIK it doesn't, but then I'm not a total expert
there), it certainly won't be able to make its hi-performance functions
work on them.

What you could do would be to convert the date-column into a timestamp,
which is a int/long, and use that. Would that help?

diez
 
B

BBands

What you could do would be to convert the date-column into a timestamp,
which is a int/long, and use that. Would that help?

Actually that might help, as all I need the date for is to index
values.

Thanks, I'll give it a spin.

jab
 
B

Bob Greschke

Actually that might help, as all I need the date for is to index
values.

Thanks, I'll give it a spin.

jab

You're using the Python-MySQL module mysqldb, right? You can select
the data from the database and have MySQL do the conversion with an SQL
command.

select to_days(<your date column name here>), <your close column> from
<your table>;

Bob
 
R

Robert Kern

Diez said:
I'm pretty sure you're out of luck here - even _if_ NumPy would handle
arbitrary data-types (AFAIK it doesn't, but then I'm not a total expert
there), it certainly won't be able to make its hi-performance functions
work on them.

Yes, one can make numpy arrays with "object" as its type. One can even extend
the C-level parts as well. For example, we have an experimental package in the
scipy sandbox for uniform time series that uses mx.DateTime.

http://www.scipy.org/TimeSeriesPackage
What you could do would be to convert the date-column into a timestamp,
which is a int/long, and use that. Would that help?

This is frequently what I do. For dates, I like Modified Julian Day Numbers
although I am sure that would horrify some people more knowledgeable than I.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
D

Dennis Lee Bieber

Actually that might help, as all I need the date for is to index
values.
Check your database functions -- it may already have a function that
returns dates as integers, so all you'd need do is modify the SQL doing
the retrieval.
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
B

BBands

You're using the Python-MySQL module mysqldb, right?

Actually I using MySQL with pyodbc as the mysqldb Windows binaries for
Python 2.5 aren't out yet. :-(
You can select the data from the database and have
MySQL do the conversion with an SQL command.

select to_days(<your date column name here>),
<your close column> from <your table>;

That works very nicely indeed.

Thanks,

jab
 
B

BBands

Yes, one can make numpy arrays with "object" as its type. One can even extend
the C-level parts as well. For example, we have an experimental package in the
scipy sandbox for uniform time series that uses mx.DateTime.

http://www.scipy.org/TimeSeriesPackage

I saw that, but I am working with irregular time series.
This is frequently what I do. For dates, I like Modified Julian Day Numbers
although I am sure that would horrify some people more knowledgeable than I.

Not horrified, just doing it. ;-)

jab
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,048
Latest member
verona

Latest Threads

Top