Portable general timestamp format, not 2038-limited

J

James Harris

I have a requirement to store timestamps in a database. Simple enough
you might think but finding a suitably general format is not easy. The
specifics are

1) subsecond resolution - milliseconds or, preferably, more detailed
2) not bounded by Unix timestamp 2038 limit
3) readable in Java
4) writable portably in Perl which seems to mean that 64-bit values
are out
5) readable and writable in Python
6) storable in a free database - Postgresql/MySQL

The formats provided by the two database systems are such as 8-byte or
12-byte values which, even if I could get Perl to work with I guess it
would be messy. Keeping to 32-bit values should give me portability
and be easy enough to work with without obscuring the program logic.
Since 32 bits of microseconds is less than 50 days I have to store two
32-bit values. How to split them? The option I favour at the moment is
to split days and parts of days like this:

a) store, as a 32-bit number, days since a virtual year zero (there is
no year zero in common era time <http://en.wikipedia.org/wiki/
Common_Era>). This allows over five million years plus and minus.
Still not completely general, I know.
b) store parts of days as another 32-bit value. Its range would have
to go to 86401 seconds - the number of seconds in a leap day. This
means each 'tick' would be around 21 microseconds. For regularity I
could make the ticks 25 microseconds so there would be 40,000 in a
second and 3,456,000,000 in a day; and, finally, the counter could
tick about 5 hours into the next day if not caught.

Any thoughts on a better way to do this? (Please reply-all. Thanks).
 
O

Owen Jacobson

James said:
I have a requirement to store timestamps in a database. Simple enough
you might think but finding a suitably general format is not easy. The
specifics are

1) subsecond resolution - milliseconds or, preferably, more detailed
2) not bounded by Unix timestamp 2038 limit
3) readable in Java
4) writable portably in Perl which seems to mean that 64-bit values
are out
5) readable and writable in Python
6) storable in a free database - Postgresql/MySQL

Use your database's TIMESTAMP type (which is a standard SQL type) and
the java.sql.Timestamp class and call it a day. As far as I know the
various perl and python database libraries can cope with SQL
TIMESTAMPs too.

For postgresql, TIMESTAMP is internally an 8-byte value representing a
point in MJD, with millisecond precision and 4713 BC to 5874897 AD
range.

For mysql, TIMESTAMP is internally a time_t; this does give it the
same limitations as time_t on the platform it was compiled on (1970 to
2038 AD and 1 second resolution for 32-bit systems); you may consider
using DATETIME which the various MySQL drivers for different languages
all convert to timestamp equivalents. There are no built-in temporal
types for mysql with resolution better than 1 second, a problem which
is (as far as I know) unique to that database.
 
R

Roger Miller

I have a requirement to store timestamps in a database. Simple enough
you might think but finding a suitably general format is not easy.
...
Any thoughts on a better way to do this? (Please reply-all. Thanks).


My rule of thumb in situations like this is "When in doubt store it as
text". The one format I am pretty sure we will still be able to deal
with in 2039.

- Roger
 
P

Paul Rubin

James Harris said:
I have a requirement to store timestamps in a database. Simple enough
you might think but finding a suitably general format is not easy. The
specifics are

1) subsecond resolution - milliseconds or, preferably, more detailed
...

There are subtle issues that have been messed up many times. See:

http://cr.yp.to/time.html

particularly the TAI stuff for some info.
 
J

James Harris

My rule of thumb in situations like this is "When in doubt store it as
text". The one format I am pretty sure we will still be able to deal
with in 2039.

Interesting. I hadn't thought about using text. It would add to the
storage a bit as each record is otherwise quite short. But this sounds
like a good option and may help - at least while debugging - to see
the raw date and time as digits. I will consider using this, perhaps
as yyyymmddhhmmssttt.
 
R

rossum

Interesting. I hadn't thought about using text. It would add to the
storage a bit as each record is otherwise quite short. But this sounds
like a good option and may help - at least while debugging - to see
the raw date and time as digits. I will consider using this, perhaps
as yyyymmddhhmmssttt.
You might prefer to use one of the ISO 8601 formats:
yyyymmddThhmmssttt or yyyy-mm-ddThh:mm:ss.ttt

http://www.cl.cam.ac.uk/~mgk25/iso-time.html

rossum
 
R

Roedy Green

1) subsecond resolution - milliseconds or, preferably, more detailed
2) not bounded by Unix timestamp 2038 limit
3) readable in Java
4) writable portably in Perl which seems to mean that 64-bit values
are out
5) readable and writable in Python
6) storable in a free database - Postgresql/MySQL

Unix gets in trouble in 2038 only with 32-bit timestamps. Java's
64-bit longs are fine.

If you need code to create timestamps, you can modify parts of BigDate
to run in Perl or Python.
see http://mindprod.com/products1.html#BIGDATE

To get more detailed, just use a unix long timestamp multiplied by
1000 to track in microseconds.

You can use MS nanosecond timestamps. see
http://mindprod.com/products1.html#FILETIMES

just store them as longs in the database. The only catch is ad-hoc
queries won't work with them.

JDBC out the box should be fine.
one of :
DATE java.sql.Date
TIME java.sql.Time
TIMESTAMP java.sql.Timestamp
BIGINT long

will be what you need.
 
R

Robert Maas, see http://tinyurl.com/uh3t

From: James Harris said:
I have a requirement to store timestamps in a database. ...
1) subsecond resolution - milliseconds or, preferably, more detailed

How do you plan to deal with leap seconds?
- Stick to astronomical time, which is absolutely consistent but
which drifts from legal time?
- Stick to legal time (UTC), which stalls by one second from time
to time, causing time-difference calculations to be incorrect by
varying numbers of seconds?
Only after you make *that* crucial decision, will it be reasonable
to consider milliseconds or other sub-second resolution.

As for the representation to store in the DB, somebody suggested
text, and I agree, with one clarification: Stick to US-ASCII, which
has been incorporated into UniCode hence is pretty much guaranteed
to be stable for longer than you care about.
 
R

Roedy Green

- Stick to astronomical time, which is absolutely consistent but
which drifts from legal time?

depends what you are measuring. IF you are doing astronomy, your
advice would apply. If you are doing payrolls, you want effectively to
pretend the leap seconds never happened, just as Java does.
 
S

Steve O'Hara-Smith

depends what you are measuring. IF you are doing astronomy, your
advice would apply. If you are doing payrolls, you want effectively to
pretend the leap seconds never happened, just as Java does.

Which leaves you about 30 seconds out by now - smelly.
 
M

Martin Gregorie

Steve said:
Which leaves you about 30 seconds out by now - smelly.
Easy solution: always read Zulu time directly from a recognized
real-time clock and store the result in a database as a
ccyymmddhhmmssfff ASCII string where fff is milliseconds). By
"recognized real-time clock) that I mean an atomic clock and
distribution network such as GPS or (in the UK or Germany) an MSF low
frequency radio broadcast. NTP using tier-1 sources may do the job too.
The clock interface may need to be JINI because most suitable receivers
have serial interfaces.

This is certainly accurate for financial transactions: the UK CHAPS
inter-bank network has a Rugby MSF receiver in each bank's gateway
computer and uses that for all timestamps.
 
J

James Harris

How do you plan to deal with leap seconds?
- Stick to astronomical time, which is absolutely consistent but
which drifts from legal time?
- Stick to legal time (UTC), which stalls by one second from time
to time, causing time-difference calculations to be incorrect by
varying numbers of seconds?
Only after you make *that* crucial decision, will it be reasonable
to consider milliseconds or other sub-second resolution.

Not a problem for me. I will be taking samples and storing either
point samples or averages depending on the value being sampled. Pseudo-
GMT will be good enough. Astronimical time will be no good as the time
is to relate to the time of day the samples were taken. I think I can
just use the time as returned by the language I am using (which
presumably will get it from a C system call or similar). If one sample
over midnight when a leap second adjustment happens turns out to be
slightly incorrect it won't skew the results significantly. I could
sanity check the time, though. Hmmm.....
 
P

Paul Rubin

Martin Gregorie said:
Easy solution: always read Zulu time directly from a recognized
real-time clock

That's no good, it doesn't let you accurately compute the difference
between timestamps. Nixon resigned the US presidency at noon EDT
(1800 UTC, I think) on August 9, 1974. You cannot accurately compute
the number of seconds between Nixon's resignation and 1800 UTC today,
unless you take into account the leap seconds have been occurred
between then and now. If you want a precise timestamp and you don't
want to deal with leap seconds, TAI is one approach. There is
currently some political pressure to get rid of leap seconds to ease
computer synchronization, but (at least some of) the astronomy
community is opposed; see

http://en.wikipedia.org/wiki/Leap_second
http://www.ucolick.org/~sla/leapsecs/

TAI really does seem like the most absolute--if you are a user in
orbit or on Mars, then UTC timestamps will seem pretty meaningless and
artificial.
By "recognized real-time clock) that I mean an atomic clock and
distribution network such as GPS or (in the UK or Germany) an MSF
low frequency radio broadcast. NTP using tier-1 sources may do the
job too. The clock interface may need to be JINI because most
suitable receivers have serial interfaces.

No do NOT use stratum 1 sources for something like this. They are
reference clocks for stratum 2 servers and are overloaded from being
used unnecessarily for other purposes. You are fine using GPS or one
of the many public lower stratum servers for just about any purpose.
See:

http://support.ntp.org/bin/view/Servers/RulesOfEngagement
This is certainly accurate for financial transactions: the UK CHAPS
inter-bank network has a Rugby MSF receiver in each bank's gateway
computer and uses that for all timestamps.

That is much more sensible than using a stratum 1 server.
 
S

sla29970

TAI really does seem like the most absolute--if you are a user in
orbit or on Mars, then UTC timestamps will seem pretty meaningless and
artificial.

TAI makes sense for clocks on the surface of the earth (at least until
ion trap clocks and picosecond intercomparison become routine, at
which point not even TAI tells what time it is for you), but clocks
off the surface of the earth tick at rates which already differ
nonlinearly from TAI by measurable amounts.
 
M

Martin Gregorie

Paul said:
That's no good, it doesn't let you accurately compute the difference
between timestamps.
>
I don't recall the OP mentioning time interval computability - just a
requirement for sub second accuracy timestamps.
If you want a precise timestamp and you don't
want to deal with leap seconds, TAI is one approach.
>
TAI? Care to provide a reference?
There is
currently some political pressure to get rid of leap seconds to ease
computer synchronization, but (at least some of) the astronomy
community is opposed; see
Yes, that's just silly, especially because if you're trying to do
date-time calculations across historic time or non-western calendars
(e.g. Islamic) the minuscule accumulated leap second error is dwarfed by
all the other uncertainties.
No do NOT use stratum 1 sources for something like this.
>
Fair comment. I was thinking about network delays and jitter and should
not have forgotten Stratum 1 congestion. Of course, you could always run
your own local Stratum 1 clock if accuracy is that important.

IIRC the major American interbank networks use GPS as their time
standard because its about the only system that can avoid jitter and
propagation delays over continental areas without introducing smoothing
engines, e.g. ntpd.
 
P

Paul Rubin

Martin Gregorie said:
I don't recall the OP mentioning time interval computability - just a
requirement for sub second accuracy timestamps.

That Y2038 is an issue suggests the OP wants a timestamp format that
is future-proof and that means it should be good for all plausible
applications. That would include computing intervals.
TAI? Care to provide a reference?

Same one already given: http://cr.yp.to/proto/utctai.html
 
M

Martin Gregorie

TAI makes sense for clocks on the surface of the earth (at least until
ion trap clocks and picosecond intercomparison become routine, at
which point not even TAI tells what time it is for you), but clocks
off the surface of the earth tick at rates which already differ
nonlinearly from TAI by measurable amounts.
True. The first direct demonstration of relativistic time dilation was
made in 1971 with three HP cesium beam atomic clocks. One stayed in the
lab, while the other were shipped round the world in opposite directions
on commercial jet flights. When the clocks were compared afterwards
the errors in the traveling clocks agreed with theory within
experimental error. See:

http://hyperphysics.phy-astr.gsu.edu/hbase/relativ/airtim.html

for more detail. This shows the clocks don't have to be moving at
interplanetary speeds to be significantly affected.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,571
Members
45,045
Latest member
DRCM

Latest Threads

Top