analyzing time

N

noydb

Hello All,

I have a table with a column of type date, with dates and time combined (like '1/6/2013 3:52:69PM'), that spans many months. How would I pull out records that are the first and last entries per day?

Also, if I wanted to find time clusters per day (or per week) -- like if an entry is made every day around 11am -- is there a way to get at that temporal statistical cluster?

Python 2.7, Windows 7.

Any guidance would be greatly appreciated! Time seems tricky...

Thanks,

N
 
N

Neil Cerutti

Hello All,

I have a table with a column of type date, with dates and time
combined (like '1/6/2013 3:52:69PM'), that spans many months.
How would I pull out records that are the first and last
entries per day?

Also, if I wanted to find time clusters per day (or per week)
-- like if an entry is made every day around 11am -- is there a
way to get at that temporal statistical cluster?

Python 2.7, Windows 7.

Any guidance would be greatly appreciated! Time seems tricky...

Time *is* really tricky, but that's because we humans created a
tricky system. If can ignore issues of timespampts, timezones and
daylight savings time, then time handling in Python can be
simple.

datetime.datetime.strptime can translate the time format above
into datetime.datetime objects, which provide all the methods you
will need.

To find clusters and min and max values you will likely need to
put the datetime objects in a list, and use some Python builtins
and list methods.
 
S

Skip Montanaro

I have a table with a column of type date, with dates and time combined (like '1/6/2013 3:52:69PM'), that spans many months. How would I pull out records that are the first and last entries per day?

You mentioned "table" and "column", which leads me to think you are
dealing with data in a SQL database. If so, that would likely change
the problem solution significantly.

If you have something like lists of string data in Python though, you
might want to make sure your timestamps are in a normalized form
first, so you can accurately sort by the timestamps. For that, my
weapon of choice is the dateutil package, and in particular, the
dateutil.parser module:
2013-01-06 15:52:59

Once your timestamps are represented as Python datetime objects, the
problem gets a bit easier, especially if you want to find the
beginning and ending timestamps of a bunch of dates. Sort, then throw
some itertools.groupby pixie dust at it. My ancient, reptilian brain
has never quite grokked all that iterator stuff, so I won't hazard a
guess how to spell the exact solution.

Skip
 
G

Gary Herron

Hello All,

I have a table with a column of type date, with dates and time combined (like '1/6/2013 3:52:69PM'), that spans many months. How would I pull out records that are the first and last entries per day?

Also, if I wanted to find time clusters per day (or per week) -- like if an entry is made every day around 11am -- is there a way to get at that temporal statistical cluster?

Python 2.7, Windows 7.

Any guidance would be greatly appreciated! Time seems tricky...

Thanks,

N

Are you asking a Python question, like how to turn a string "1/6/2013
3:52:69PM" into an internal representation of time, or are you asking a
data analysis and statistical question?

If the former, then look at datetime.strptime from the datetime module.

If the later, then you may get an answer here, but I'd suggest trying
somewhere that discusses statistics and analysis.

Gary Herron
 
T

Terry Reedy

I have a table with a column of type date, with dates and time

This is a datetime in Python parlance.
combined (like '1/6/2013 3:52:69PM'), that spans many months. How
would I pull out records that are the first and last entries per
day?

Sort on that column. Look at pairs of rows. If the days differ, you have
the last of the first and the first of the second. One way:

it = <table iterator>
dt1 = next(it)
d1 = date(dt1) # whatever that looks like

for row in it:
dt2 = row
d2 = date(dt2)
if d1 != d2:
do_whatever(dt1, dt2)
dt1, d1 = dt2, d2
Also, if I wanted to find time clusters per day (or per week) -- like
if an entry is made every day around 11am -- is there a way to get at
that temporal statistical cluster?

Make a histogram of time, ignoring date.
Python 2.7, Windows 7.

Any guidance would be greatly appreciated! Time seems tricky...

Yes
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,431
Messages
2,571,678
Members
48,796
Latest member
Greg L.

Latest Threads

Top