Regular expression for different date formats in Python

Discussion in 'Python' started by undesputed.hackerz@gmail.com, Nov 26, 2012.

  1. Guest

    Hello Developers,

    I am a beginner in Python and need help writing a regular expression to fetch dates and times from some HTML documents. In the following code I walk through the HTML files in a folder called event and print the h1 headings using BeautifulSoup. These HTML pages also contain dates and times in different formats, and I want to fetch and display this information as well. The date formats in these HTML documents are:

    21 - 27 Nov 2012
    1 Dec 2012
    30 Nov - 2 Dec 2012
    26 Nov 2012

    Can someone help me fetch these formats from the HTML documents?
    Here is my code for walking through the files and fetching the h1 from each HTML file:


    Code:


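    import re
    import os
    from bs4 import BeautifulSoup

    # Walk every file under the event folder.
    for subdir, dirs, files in os.walk("/home/himanshu/event/"):
        for fle in files:
            path = os.path.join(subdir, fle)
            # Parse the file; the with block closes the handle afterwards.
            with open(path) as handle:
                soup = BeautifulSoup(handle, "html.parser")

            # Print the text of the first h1 heading.
            print(soup.h1.string)

            #Date and Time detection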
     
    , Nov 26, 2012
    #1

  2. On 11/26/2012 06:15 AM, wrote:
    > I am a beginner in python and need help with writing a regular
    > expression for date and time to be fetched from some html documents.


    Would the "parser" module from the third-party dateutil package work for you?

    http://pypi.python.org/pypi/python-dateutil
    http://labix.org/python-dateutil#head-c0e81a473b647dfa787dc11e8c69557ec2c3ecd2

    I don't believe the library is updated for Python 3 yet, sadly. But I
    bet it could be ported fairly easily. I think it's pure python.
     
    Michael Torrie, Nov 26, 2012
    #2

  3. 2012/11/26 <>:
    > Different formats of date in these html documents are:
    >
    > 21 - 27 Nov 2012
    > 1 Dec 2012
    > 30 Nov - 2 Dec 2012
    > 26 Nov 2012


    Hi,
    the following pattern seems to match all of your examples,

    (\d{1,2} )?(Nov|Dec)?( ?- )?(\d{1,2}) (Nov|Dec) (\d{4})

    however, it doesn't look very robust - of course, you have to add
    the remaining months' abbreviations and test it against the (parts of
    the) HTML documents you are interested in.
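A minimal sketch of how the pattern above might be extended to all twelve month abbreviations and turned into start/end dates. The names MONTHS, DATE_RE, find_dates and to_range are hypothetical, and the sketch assumes English three-letter abbreviations and ranges that do not cross a year boundary:

```python
import re
from datetime import date

# Assumption: English three-letter month abbreviations, as in the examples.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
MONTH_NUM = {name: i for i, name in enumerate(MONTHS.split("|"), start=1)}

# Optional "start day [start month] - " prefix, then the mandatory
# "day month year" part that every example ends with.
DATE_RE = re.compile(
    r"\b(?:(?P<start_day>\d{1,2})(?: (?P<start_month>" + MONTHS + r"))? ?- ?)?"
    r"(?P<day>\d{1,2}) (?P<month>" + MONTHS + r") (?P<year>\d{4})\b"
)


def find_dates(text):
    """Return every matched date or date range in text, in order."""
    return [m.group(0) for m in DATE_RE.finditer(text)]


def to_range(match):
    """Convert a DATE_RE match into a (start, end) pair of date objects.

    Assumption: both ends of a range fall in the same year.
    """
    year = int(match.group("year"))
    end = date(year, MONTH_NUM[match.group("month")], int(match.group("day")))
    if match.group("start_day") is None:
        # A single date is treated as a one-day range.
        return end, end
    # "21 - 27 Nov 2012" has no start month, so reuse the end month.
    start_month = match.group("start_month") or match.group("month")
    start = date(year, MONTH_NUM[start_month], int(match.group("start_day")))
    return start, end


sample = "Fair: 21 - 27 Nov 2012. Show: 1 Dec 2012. Tour: 30 Nov - 2 Dec 2012, 26 Nov 2012"
for m in DATE_RE.finditer(sample):
    print(m.group(0), "->", to_range(m))
```

The word boundaries keep the pattern from starting inside a longer number, but it will still accept impossible days such as "45 Nov 2012"; to_range raises ValueError on those, which may or may not be what you want.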

    hth,
    vbr
     
    Vlastimil Brom, Nov 26, 2012
    #3
  4. Miki Tebeka Guest

    On Monday, November 26, 2012 8:34:22 AM UTC-8, Michael Torrie wrote:
    > http://pypi.python.org/pypi/python-dateutil
    > ...
    > I don't believe the library is updated for Python 3 yet, sadly.

    dateutil has supported Python 3.x since version 2.0.
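For the single-date formats, a rough sketch of the dateutil suggestion under Python 3, assuming python-dateutil 2.x is installed. The helper name parse_single_date and the fallback to datetime.strptime are illustrative additions, not something from the thread; ranges such as "21 - 27 Nov 2012" would still have to be split before parsing:

```python
from datetime import datetime

try:
    # Third-party package: pip install python-dateutil
    from dateutil import parser as date_parser
except ImportError:
    # Assumption: fall back to the standard library if dateutil is missing.
    date_parser = None


def parse_single_date(text):
    """Parse a single date such as "26 Nov 2012" into a datetime."""
    if date_parser is not None:
        return date_parser.parse(text)
    # %b matches the abbreviated month name in the current locale.
    return datetime.strptime(text, "%d %b %Y")


for raw in ("26 Nov 2012", "1 Dec 2012"):
    print(raw, "->", parse_single_date(raw).date())
```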
     
    Miki Tebeka, Nov 26, 2012
    #4
