Re: parse date/time from a log entry with only strftime (and no regexen)

Discussion in 'Python' started by Gabriel Genellina, Feb 3, 2009.

  1. On Tue, 03 Feb 2009 11:52:07 -0200, Simon Mullis wrote:

    > I'm writing a script to help with analyzing log file timestamps and
    > have a very specific question on which I'm momentarily stumped....
    >
    > I'd like the script to support multiple log file types, so allow a
    > strftime format to be passed in as a cli switch (default is
    > %Y-%m-%d %H:%M:%S).


    > >>> import datetime
    > >>> p = datetime.datetime.strptime("2008-07-23 12:18:28 this is the remainder of the log line that I do not care about", "%Y-%m-%d %H:%M:%S")
    > Traceback (most recent call last):
    >   File "<stdin>", line 1, in <module>
    >   File "/opt/local/lib/python2.5/_strptime.py", line 333, in strptime
    >     data_string[found.end():])
    > ValueError: unconverted data remains:  this is the remainder of the log line that I do not care about


    If the logfile has fixed-width fields, you could ask the user for a
    column range from which to extract the date/time information.
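
A minimal sketch of the column-range idea, assuming the timestamp always occupies the same character positions in every line. The function name `parse_fixed_columns` and its `start`/`end` parameters are hypothetical, not something from the original script:

```python
from datetime import datetime

def parse_fixed_columns(line, start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse the timestamp found in line[start:end] with a strptime format."""
    # Slice out only the date/time columns, so strptime never sees the
    # rest of the log line.
    return datetime.strptime(line[start:end], fmt)

line = "2008-07-23 12:18:28 this is the remainder of the log line"
# The default format produces 19 characters, hence columns 0..19 here.
stamp = parse_fixed_columns(line, 0, 19)
print(stamp)  # 2008-07-23 12:18:28
```

The obvious drawback is that the user has to supply two things (format and column range) that must agree with each other.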

    --
    Gabriel Genellina
     
    Gabriel Genellina, Feb 3, 2009
    #1

  2. andrew cooke Guest

    > > ValueError: unconverted data remains:  this is the remainder of the log line that I do not care about


    you could catch the ValueError and split at the ':' in the .args
    attribute to find the extra data. you could then find the extra data
    in the original string, use the index to remove it, and re-parse the
    time.

    ugly, but should work.
    andrew
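
A rough sketch of the catch-and-retry approach, under the assumption that the exception text starts with "unconverted data remains: " as in the traceback above. That wording is an implementation detail of `_strptime.py`, so this is fragile. The helper name `parse_leading_timestamp` is hypothetical, and the sketch strips the known prefix instead of splitting at ':' because the leftover text may itself contain colons:

```python
from datetime import datetime

# Assumed message prefix, taken from the traceback quoted in this thread.
PREFIX = "unconverted data remains: "

def parse_leading_timestamp(line, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp at the start of line, ignoring trailing text."""
    try:
        return datetime.strptime(line, fmt)
    except ValueError as exc:
        message = str(exc)
        if not message.startswith(PREFIX):
            raise  # some other parse failure; nothing to trim
        leftover = message[len(PREFIX):]
        if not leftover or not line.endswith(leftover):
            raise
        # Drop the unconverted tail and parse only the timestamp part.
        return datetime.strptime(line[:len(line) - len(leftover)], fmt)

line = "2008-07-23 12:18:28 this is the remainder of the log line"
print(parse_leading_timestamp(line))  # 2008-07-23 12:18:28
```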
     
    andrew cooke, Feb 3, 2009
    #2

  3. Simon Mullis Guest

    This was a long time ago.... But just in case anyone googling ever has
    the same question, this is what I did (last year). The user just needs
    to supply a strftime-formatted string, such as "%A, %e %b %H:%M", and
    this class figures out the regex to use on the log entries...

    import re

    class RegexBuilder(object):
        """Create a regex from a strftime format string.

        Pass in a strftime string; the regex, with one capture group per
        directive, ends up in self.created_re."""

        lookup_table = {
            '%a': r"(\w{3})",                # locale's abbrev day name
            '%A': r"(\w{6,9})",              # full day name (Monday..Wednesday)
            '%b': r"(\w{3})",                # abbrev month name
            '%B': r"(\w{3,9})",              # full month name (May..September)
            '%d': r"(3[0-1]|[1-2]\d|0[1-9]|[1-9])",  # day of month
            # Two-digit alternative first, so "23" is not cut short to "2"
            '%e': r"([1-3][0-9]|[1-9])",     # day of month, no leader
            '%H': r"(2[0-3]|[0-1]\d|\d)",    # Hour (24h clock)
            '%I': r"(1[0-2]|0[1-9]|[1-9])",  # Hour (12h clock)
            # Adjacent literals: a backslash-newline inside a raw string
            # would end up in the pattern itself
            '%j': (r"(36[0-6]|3[0-5]\d|[1-2]\d\d|0[1-9]\d|00[1-9]"
                   r"|[1-9]\d|0[1-9]|[1-9])"),  # Day of year
            '%m': r"(1[0-2]|0[1-9]|[1-9])",  # Month as decimal
            '%M': r"([0-5]\d|\d)",           # Minute
            '%S': r"(6[0-1]|[0-5]\d|\d)",    # Second
            '%U': r"(5[0-3]|[0-4]\d|\d)",    # Week of year (Sun = 0)
            '%w': r"([0-6])",                # Weekday (Sun = 0)
            '%W': r"(5[0-3]|[0-4]\d|\d)",    # Week of year (Mon = 0)
            '%y': r"(\d{2})",                # Year (no century)
            '%Y': r"(\d{4})",                # Year with 4 digits
            '%p': r"(AM|PM)",
            '%P': r"(am|pm)",
            '%f': r"(\d+)",  # TODO: microseconds. Only in Py 2.6+
        }

        # Format of the keys in the table above
        strftime_re = r'%\w'

        def __init__(self, date_format):
            r = re.compile(RegexBuilder.strftime_re)
            self.created_re = r.sub(self._lookup, date_format)

        def _lookup(self, match):
            """Return the regex fragment for one strftime directive."""
            return RegexBuilder.lookup_table[match.group()]
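
For anyone wanting to see how a format-derived regex might be put to use, here is a self-contained sketch. It is not the class above: `DIRECTIVES`, `format_to_regex` and `parse_log_line` are hypothetical names, and the table is cut down to the directives in the default format. Unlike the class, it also escapes the literal text between directives, on the assumption that a format could contain regex metacharacters such as "." or "[":

```python
import re
from datetime import datetime

# Hypothetical, reduced table: only the directives of the default format.
DIRECTIVES = {
    "%Y": r"\d{4}",
    "%m": r"1[0-2]|0[1-9]|[1-9]",
    "%d": r"3[01]|[12]\d|0[1-9]|[1-9]",
    "%H": r"2[0-3]|[01]\d|\d",
    "%M": r"[0-5]\d|\d",
    "%S": r"6[01]|[0-5]\d|\d",
}

def format_to_regex(fmt):
    """Translate a strftime format into a regex with one group per directive."""
    parts = []
    # The capture group makes re.split keep the directives in the result.
    for token in re.split(r"(%\w)", fmt):
        if re.fullmatch(r"%\w", token):
            if token not in DIRECTIVES:
                raise ValueError("unsupported directive: " + token)
            parts.append("(" + DIRECTIVES[token] + ")")
        else:
            parts.append(re.escape(token))  # literal text such as "-" or ":"
    return "".join(parts)

def parse_log_line(line, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the datetime at the start of line, or None if there is none."""
    match = re.match(format_to_regex(fmt), line)
    if match is None:
        return None
    # Hand only the matched text to strptime, so nothing is left unconverted.
    return datetime.strptime(match.group(0), fmt)

line = "2008-07-23 12:18:28 this is the remainder of the log line"
print(parse_log_line(line))  # 2008-07-23 12:18:28
```

Using the regex only to find the timestamp and leaving the actual conversion to strptime keeps the two in agreement, since both are driven by the same format string.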


     
    Simon Mullis, Nov 12, 2010
    #3