Natural Language Date Processing.

Discussion in 'Python' started by Andrew Gwozdziewycz, Feb 7, 2006.

  1. I've been looking recently for date processing modules that can parse
    dates in forms such as "next week" or "6 days from next sunday". PHP has
    a function that will do this called 'strtotime', but I have not found a
    Python implementation. I've checked the standard date, datetime and time
    modules and also the mx.Date* modules. Am I overlooking something here,
    or does it not exist? Does anyone know of an implementation? Also, I
    realize that this is perhaps very English-specific, so I apologize to
    any non-native English speakers.

    ---
    Andrew Gwozdziewycz

    http://ihadagreatview.org
    http://plasticandroid.org
    Andrew Gwozdziewycz, Feb 7, 2006
    #1

  2. Andrew Gwozdziewycz

    Guest

    From the docs for PHP's 'strtotime':

    Parameters

    time
    The string to parse, according to the GNU Date Input Formats syntax.
    Before PHP 5.0, microseconds weren't allowed in the time, since PHP 5.0
    they are allowed but ignored.
    ....

    It seems that the string to be parsed has to be provided in the GNU
    date format. One option would be to provide a function that calls out
    to the GNU date program with whatever string you want to parse.

    I'm not aware of an existing function in Python that does this.

    Regards,
    Andy
    , Feb 7, 2006
    #2
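
The option mentioned in post #2, calling out to the GNU date program, could be sketched roughly as follows. This is only an illustration: the function name parse_with_gnu_date is made up here, and the sketch assumes a `date` binary on the PATH that accepts GNU-style `-d` input (as GNU coreutils does). It is not a portable replacement for strtotime.

```python
import subprocess
from datetime import datetime, timezone


def parse_with_gnu_date(text):
    """Ask the external `date` program to interpret `text`.

    Hypothetical helper: assumes `date` supports the GNU `-d STRING` option.
    Returns a timezone-aware datetime in UTC.
    """
    # Arguments are passed as a list, so no shell is involved and the
    # input string cannot be interpreted as shell syntax.
    result = subprocess.run(
        ["date", "-u", "-d", text, "+%s"],  # +%s prints seconds since the epoch
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if `date` rejects the string
    )
    seconds = int(result.stdout.strip())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

With GNU date installed, a call such as parse_with_gnu_date("next week") returns a datetime one week from the current time. Starting a process per call is slow, so this mostly suits occasional use.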

  3. On 7 Feb 2006 05:51:40 -0800,
    > It seems that the string to be parsed has to be provided in the GNU
    > date format. One option would be to provide a function that calls out
    > to the GNU date program with whatever string you want to parse.


    Actually, I looked at the source of the PHP function and it parses the
    string manually (I assume according to the GNU date rules). However, I'd
    prefer not to have to port the function to Python if someone else has
    already done so, or has a more Pythonic implementation.


    --
    Andrew Gwozdziewycz <>
    http://ihadagreatview.org
    http://plasticandroid.org
    Andrew Gwozdziewycz, Feb 7, 2006
    #3
  4. Andrew Gwozdziewycz wrote:
    > I've been looking recently for date processing modules that can parse
    > dates in forms such as "next week" or "6 days from next sunday".


    This is, in fact, a fairly difficult problem in general. See the TERN_
    competition that's currently held yearly on this task. There are a few
    taggers available that do this at:

    http://timex2.mitre.org/taggers/timex2_taggers.html

    But none of them are available as Python modules. You might be able to
    port the Perl script there, but it won't do as well as the ATEL program
    from CU which uses machine learning techniques.

    .. _TERN: http://timex2.mitre.org/tern.html

    STeVe
    Steven Bethard, Feb 7, 2006
    #4
  5. Andrew Gwozdziewycz wrote:
    > I've been looking recently for date processing modules that can parse
    > dates in forms such as
    > "next week" or "6 days from next sunday". PHP has a function that will
    > do this called 'strtotime',
    > but I have not found a Python implementation. I've checked the standard
    > date, datetime and time
    > modules and also mx.Date* modules. Am I overlooking something here or
    > does it not exist?
    > Anyone know of an implementation? Also, I realize that this is perhaps
    > very English specific
    > so I apologize to any non-native English speakers.
    >
    > ---
    > Andrew Gwozdziewycz
    >
    > http://ihadagreatview.org
    > http://plasticandroid.org
    >
    >


    You may take a look at http://labix.org/python-dateutil
    Have fun
    Michael
    Michael Amrhein, Feb 7, 2006
    #5
  7. > You may take a look at http://labix.org/python-dateutil
    > Have fun
    > Michael
    >


    Looks like it does a good job parsing dates, but doesn't seem to do
    English dates. I found a JavaScript implementation of a few functions
    that will probably be relatively easy to port to Python. Whether or
    not it'll perform well is another story... Thanks for the help.

    --
    Andrew Gwozdziewycz <>
    http://ihadagreatview.org
    http://plasticandroid.org
    Andrew Gwozdziewycz, Feb 7, 2006
    #7
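
To make the kind of input from the original question concrete, here is a minimal standard-library sketch that handles a few relative phrases. It is not the JavaScript code mentioned in post #7 and not a port of strtotime; the names WEEKDAYS, next_weekday and parse_relative are invented for this example, and reading "next sunday" as "the first Sunday strictly after today" is an assumption.

```python
import re
from datetime import date, timedelta

# Index matches date.weekday(): Monday is 0, Sunday is 6.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def next_weekday(start, name):
    """Return the first date strictly after `start` that falls on `name`."""
    target = WEEKDAYS.index(name)
    # Yields 1..7, so a Sunday start maps to the Sunday one week later.
    delta = (target - start.weekday() - 1) % 7 + 1
    return start + timedelta(days=delta)


def parse_relative(text, today=None):
    """Parse a tiny, hypothetical subset of English relative dates."""
    today = today or date.today()
    text = text.strip().lower()
    if text == "today":
        return today
    if text == "tomorrow":
        return today + timedelta(days=1)
    if text == "next week":
        return today + timedelta(weeks=1)
    match = re.fullmatch(r"next (\w+)", text)
    if match and match.group(1) in WEEKDAYS:
        return next_weekday(today, match.group(1))
    match = re.fullmatch(r"(\d+) days? from (.+)", text)
    if match:
        # Resolve the inner phrase first, then add the day offset.
        base = parse_relative(match.group(2), today)
        return base + timedelta(days=int(match.group(1)))
    raise ValueError(f"unsupported date expression: {text!r}")


# 7 Feb 2006 (the day this thread started) was a Tuesday.
print(parse_relative("6 days from next sunday", today=date(2006, 2, 7)))  # 2006-02-18
```

Anything outside these few patterns raises ValueError, which shows how quickly a hand-written grammar runs out compared with a full natural-language date parser.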
  8. Andrew Gwozdziewycz wrote:
    >> You may take a look at http://labix.org/python-dateutil
    >> Have fun
    >> Michael
    >>

    >
    > Looks like it does a good job parsing dates, but doesn't seem to do
    > english dates. I found a javascript implementation of a few functions
    > that will probably be relatively easy to port to python. Whether or
    > not it'll perform well is another story... Thanks for the help.
    >
    > --
    > Andrew Gwozdziewycz <>
    > http://ihadagreatview.org
    > http://plasticandroid.org


    >>> from dateutil.parser import parse
    >>> parse("April 16th, 2003")
    datetime.datetime(2003, 4, 16, 0, 0)
    >>> parse("4/16/2003")
    datetime.datetime(2003, 4, 16, 0, 0)

    Aren't these "english dates"?

    Michael
    Michael Amrhein, Feb 9, 2006
    #8
  10. On Thu, 2006-02-09 at 09:47 +0100, Michael Amrhein wrote:
    > Andrew Gwozdziewycz wrote:
    > >> You may take a look at http://labix.org/python-dateutil
    > >> Have fun
    > >> Michael
    > >>

    > >
    > > Looks like it does a good job parsing dates, but doesn't seem to do
    > > english dates. I found a javascript implementation of a few functions
    > > that will probably be relatively easy to port to python. Whether or
    > > not it'll perform well is another story... Thanks for the help.
    > >
    > > --
    > > Andrew Gwozdziewycz <>
    > > http://ihadagreatview.org
    > > http://plasticandroid.org

    >
    > >>> from dateutil.parser import parse
    > >>> parse("April 16th, 2003")
    > datetime.datetime(2003, 4, 16, 0, 0)
    > >>> parse("4/16/2003")
    > datetime.datetime(2003, 4, 16, 0, 0)
    >
    > Aren't these "english dates"?


    I suspect that by "english dates" the OP means non-US formats that
    follow the international day-first convention (DD/MM/YYYY).

    You can use the dayfirst parameter of the parse function to always
    assume the international date format:

    US format:
    >>> parse("10/12/2004")
    datetime.datetime(2004, 10, 12, 0, 0)

    International format:
    >>> parse("10/12/2004", dayfirst=True)
    datetime.datetime(2004, 12, 10, 0, 0)


    John




    John McMonagle, Feb 9, 2006
    #10
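
The month-first versus day-first ambiguity described in post #10 can also be shown with only the standard library, by stating the expected format explicitly. This is a small sketch; the helper name parse_slash_date and its dayfirst argument are chosen here to mirror the dateutil example and are not part of any library.

```python
from datetime import datetime


def parse_slash_date(text, dayfirst=False):
    """Parse NN/NN/YYYY, reading the first field as day or month."""
    fmt = "%d/%m/%Y" if dayfirst else "%m/%d/%Y"
    return datetime.strptime(text, fmt)


print(parse_slash_date("10/12/2004"))                 # 2004-10-12 00:00:00
print(parse_slash_date("10/12/2004", dayfirst=True))  # 2004-12-10 00:00:00
```

Unlike a guessing parser, strptime raises ValueError when the string does not fit the stated format, for example "16/4/2003" read month-first.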