Parsing ISO date/time strings - where did the parser go?

John Nagle

In Python 2.7:

I want to parse standard ISO date/time strings such as

2012-09-09T18:00:00-07:00

into Python "datetime" objects. The "datetime" object offers
an output method, datetimeobj.isoformat(), but not an input
parser. There ought to be

classmethod datetime.fromisoformat(s)

but there isn't. I'd like to avoid adding a dependency on
a third party module like "dateutil".

The "Working with time" section of the Python wiki is so
ancient it predates "datetime", and says so.

There's an iso8601 module on PyPI, but it's abandoned; it hasn't been
updated since 2007 and has many outstanding issues.

There are mentions of "xml.utils.iso8601.parse" in
various places, but the "xml" module that comes
with Python 2.7 doesn't have xml.utils.

http://www.seehuhn.de/pages/pdate
says:

"Unfortunately there is no easy way to parse full ISO 8601 dates using
the Python standard library."

It looks like this was taken out of "xml" at some point,
but not moved into "datetime".

John Nagle
 
Thomas Jollans

John Nagle said:
In Python 2.7:

I want to parse standard ISO date/time strings such as

2012-09-09T18:00:00-07:00

into Python "datetime" objects. The "datetime" object offers
an output method, datetimeobj.isoformat(), but not an input
parser. There ought to be

classmethod datetime.fromisoformat(s)

http://docs.python.org/library/datetime.html#datetime.datetime.strptime

The ISO date/time format is dead simple and well-defined. strptime is
quite suitable.
 
Paul Rubin

John Nagle said:
There's an iso8601 module on PyPI, but it's abandoned; it hasn't been
updated since 2007 and has many outstanding issues.

Hmm, I have some code that uses ISO date/time strings and just checked
to see how I did it, and it looks like it uses iso8601-0.1.4-py2.6.egg .
I don't remember downloading that module (I must have done it and
forgotten). I'm not sure what its outstanding issues are, as it works
ok in the limited way I use it.

I agree that this functionality ought to be in the stdlib.
 
Dave Angel

John Nagle said:
In Python 2.7:

I want to parse standard ISO date/time strings such as

2012-09-09T18:00:00-07:00

into Python "datetime" objects. The "datetime" object offers
an output method, datetimeobj.isoformat(), but not an input
parser. There ought to be

classmethod datetime.fromisoformat(s)

but there isn't. I'd like to avoid adding a dependency on
a third party module like "dateutil".

The "Working with time" section of the Python wiki is so
ancient it predates "datetime", and says so.

There's an iso8601 module on PyPI, but it's abandoned; it hasn't been
updated since 2007 and has many outstanding issues.

There are mentions of "xml.utils.iso8601.parse" in
various places, but the "xml" module that comes
with Python 2.7 doesn't have xml.utils.

http://www.seehuhn.de/pages/pdate
says:

"Unfortunately there is no easy way to parse full ISO 8601 dates using
the Python standard library."

It looks like this was taken out of "xml" at some point,
but not moved into "datetime".

For working with datetime, see
http://docs.python.org/library/datetime.html#datetime.datetime

and look up datetime.strptime()

Likewise for generalized output, check out datetime.strftime().
 
John Nagle

Paul Rubin said:
Hmm, I have some code that uses ISO date/time strings and just checked
to see how I did it, and it looks like it uses iso8601-0.1.4-py2.6.egg .
I don't remember downloading that module (I must have done it and
forgotten). I'm not sure what its outstanding issues are, as it works
ok in the limited way I use it.

I agree that this functionality ought to be in the stdlib.

Yes, it should. There's no shortage of implementations.
PyPI has four. Each has some defect.

PyPI offers:

iso8601 0.1.4 Simple module to parse ISO 8601 dates
iso8601.py 0.1dev Parse utilities for iso8601 encoding.
iso8601plus 0.1.6 Simple module to parse ISO 8601 dates
zc.iso8601 0.2.0 ISO 8601 utility functions

Unlike CPAN, PyPI has no quality control.

Looking at the first one, it's in Google Code.

http://code.google.com/p/pyiso8601/source/browse/trunk/iso8601/iso8601.py

The first bug is at line 67. For a timestamp with a "Z"
at the end, the offset should always be zero, regardless of the default
timezone. See "http://en.wikipedia.org/wiki/ISO_8601".
The code uses the default time zone in that case, which is wrong.
So don't call that code with your local time zone as the default;
it will return bad times.
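
The rule described above can be sketched as follows. This is a minimal illustration, not the code of any package listed here: the helper name `parse_offset` and its `default_tz` parameter are hypothetical, and it relies on `datetime.timezone`, which exists only in Python 3.2 and later (not in 2.7).

```python
import re
from datetime import timedelta, timezone

# Hypothetical helper: turn the trailing designator of an ISO 8601
# timestamp into a tzinfo object. Not taken from any package above.
_OFFSET_RE = re.compile(r"([+-])([0-9]{2}):?([0-9]{2})")

def parse_offset(text, default_tz=None):
    """Return a tzinfo for 'Z', '+hh:mm' or '-hhmm'; default_tz if text is empty."""
    if not text:
        # No designator at all: only in this case does the default apply.
        return default_tz
    if text == "Z":
        # "Z" always means a zero offset, whatever default was supplied.
        return timezone.utc
    match = _OFFSET_RE.fullmatch(text)
    if match is None:
        raise ValueError("unrecognized UTC offset: %r" % text)
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return timezone(-delta if sign == "-" else delta)
```

With this shape, passing a local zone as `default_tz` cannot change the meaning of a timestamp that ends in "Z".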

Looking at the second one, it's on github:

https://github.com/accellion/iso8601.py/blob/master/iso8601.py

Giant regular expressions! The code to handle the offset
is present, but it doesn't make the datetime object a
timezone-aware object. It returns a naive object in UTC.

The third one is at

https://github.com/jimklo/pyiso8601plus

This is a fork of the first one, because the first one is abandonware.
The bug in the first one, mentioned above, isn't fixed. However, if
a time zone is present, it does return an "aware" datetime object.

The fourth one is the Zope version. This brings in the pytz
module, which brings in the Olson database of named time zones and
their historical conversion data. None of that information is
used, or necessary, to parse ISO dates and times. Somebody
just wanted the pytz.FixedOffset() function, which does something
datetime already does.

(For all the people who keep saying "use strptime", that doesn't
handle time zone offsets at all.)
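
One possible workaround, sketched here under stated assumptions: let strptime parse the date and time fields and handle the offset by hand. The function name `parse_with_offset` is made up for illustration, only the `YYYY-MM-DDTHH:MM:SS` layout with a trailing `Z` or `±hh:mm` is accepted, and `datetime.timezone` requires Python 3.2 or later.

```python
from datetime import datetime, timedelta, timezone

def parse_with_offset(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS' followed by 'Z' or '+hh:mm' / '-hh:mm'."""
    if text.endswith("Z"):
        stamp, tz = text[:-1], timezone.utc
    else:
        # The last six characters hold the offset, e.g. "-07:00".
        stamp, offset = text[:-6], text[-6:]
        if len(offset) != 6 or offset[0] not in "+-" or offset[3] != ":":
            raise ValueError("missing or malformed UTC offset: %r" % text)
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
        tz = timezone(-delta if offset[0] == "-" else delta)
    # strptime handles the date and time fields; the offset was handled above.
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=tz)
```

This covers only a small slice of ISO 8601 (no fractional seconds, no basic format, no week or ordinal dates), which is part of the argument for a single well-tested parser.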

John Nagle
 
Roy Smith

John Nagle said:
In Python 2.7:

I want to parse standard ISO date/time strings such as

2012-09-09T18:00:00-07:00

into Python "datetime" objects. The "datetime" object offers
an output method, datetimeobj.isoformat(), but not an input
parser. There ought to be

classmethod datetime.fromisoformat(s)

but there isn't. I'd like to avoid adding a dependency on
a third party module like "dateutil".

I'm curious why? I really think dateutil is the way to go.

It's really amazing (and unfortunate) that datetime has isoformat(), but
no way to go in the other direction.
 
Roy Smith

Dave Angel said:
For working with datetime, see
http://docs.python.org/library/datetime.html#datetime.datetime

and look up datetime.strptime()

strptime has two problems.

One is that it's a pain to use (you have to look up all those
inscrutable %-thingies every time).

The second is that it doesn't always work. To correctly parse an
ISO-8601 string, you need '%z', which isn't supported on all platforms.

The third is that I never use methods I can't figure out how to
pronounce.
 
John Nagle

Here are three more on PyPI you can try:

iso-8601 0.2.3 Flexible ISO 8601 parser...
PySO8601 0.1.7 PySO8601 aims to parse any ISO 8601 date...
isodate 0.4.8 An ISO 8601 date/time/duration parser and formater

All three have been updated this year.

There's another one inside feedparser, and there used to be
one in the xml module.

Filed issue 15873: "datetime" cannot parse ISO 8601 dates and times
http://bugs.python.org/issue15873

This really should be handled in the standard library, instead of
everybody rolling their own, badly. Especially since in Python 3.x,
there's finally a useful "tzinfo" subclass for fixed time zone
offsets. That provides a way to directly represent ISO 8601 date/time
strings with offsets as "time zone aware" date time objects.
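
For illustration, the fixed-offset class referred to here is presumably `datetime.timezone`, added in Python 3.2. A short sketch of how the example timestamp from this thread can be represented with it (the variable names are arbitrary):

```python
from datetime import datetime, timedelta, timezone

# datetime.timezone is a tzinfo subclass with a fixed UTC offset.
minus_seven = timezone(timedelta(hours=-7))

local = datetime(2012, 9, 9, 18, 0, 0, tzinfo=minus_seven)
utc = datetime(2012, 9, 10, 1, 0, 0, tzinfo=timezone.utc)

# Aware objects compare by instant, so these two are equal.
print(local == utc)        # True
# isoformat() writes the offset back out in ISO 8601 form.
print(local.isoformat())   # 2012-09-09T18:00:00-07:00
```

No time zone database is involved; the object stores only the offset that appeared in the string.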

John Nagle
 
Roy Smith

Thomas Jollans said:
The ISO date/time format is dead simple and well-defined.

Well defined, perhaps. But nobody who has read the standard could call
it "dead simple". ISO-8601-2004(E) is 40 pages long.

Of course, the fact that it's complicated enough to generate 40 pages
worth of standards document just argues that much more strongly for it
being in the standard lib (so there can be one canonical, well-tested,
way to do it).
 
Pete Forman

John Nagle said:
I want to parse standard ISO date/time strings such as

2012-09-09T18:00:00-07:00

into Python "datetime" objects.

Consider whether RFC 3339 might be a more suitable format.

It is a subset of ISO 8601 extended format. Some of the restrictions are:

  Year must be 4 digits
  Fraction separator is period, not comma
  All components including time-offset are mandatory, except for time-secfrac
  time-minute in time-offset is not optional, must use ±hh:mm or Z

Some latitude is allowed:

  T may be replaced by e.g. space

Extra feature:

  time-offset of -00:00 means UTC but local time is unknown
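
The restrictions listed above are tight enough to check with one short pattern. The following is a rough sketch, not a full RFC 3339 implementation: the names `RFC3339_RE` and `looks_like_rfc3339` are invented for this example, and the pattern checks layout only, not field ranges.

```python
import re

# Sketch of the RFC 3339 date-time layout as summarized above. It checks
# shape only: month 13 or hour 25 would still match.
RFC3339_RE = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"      # four-digit year, month, day
    r"[T ]"                            # "T", or a space as permitted latitude
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}"      # hour, minute, second are mandatory
    r"(\.[0-9]+)?"                     # optional fraction, period only
    r"(Z|[+-][0-9]{2}:[0-9]{2})"       # offset is mandatory: Z or +/-hh:mm
)

def looks_like_rfc3339(text):
    """Return True if text has the RFC 3339 date-time layout."""
    return RFC3339_RE.fullmatch(text) is not None

print(looks_like_rfc3339("2012-09-09T18:00:00-07:00"))  # True
print(looks_like_rfc3339("2012-09-09T18:00:00"))        # False (no offset)
```

A string that passes this check still needs real parsing and range validation before it becomes a datetime object.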
 
