Discussion in 'Python' started by TYR, May 29, 2008.

  1. TYR

    TYR Guest

    I'm doing some data normalisation, which involves data from a Web site
    being extracted with BeautifulSoup, cleaned up with a regex, then
    having the current year as returned by time()'s tm_year attribute
    inserted, before the data is concatenated with string.join() and fed
    to time.strptime().

    Here's some code:
    timeinput = re.split('[\s:-]', rawtime)
    print timeinput #trace statement
    print year #trace statement
    t = timeinput.insert(2, year)
    print t #trace statement
    t1 = string.join(t, '')
    timeobject = time.strptime(t1, "%d %b %Y %H %M")

    year is a Unicode string; so is the data in rawtime (BeautifulSoup
    gives you Unicode, dammit). And here's the output:

    [u'29', u'May', u'01', u'00'] (OK, so the regex is working)
    2008 (OK, so the year is a year)
    None (...but what's this?)
    Traceback (most recent call last):
      File "", line 71, in <module>
        t1 = string.join(t, '')
      File "/usr/lib/python2.5/", line 316, in join
        return sep.join(words)
    TYR, May 29, 2008
  2. First, don't use the string module anymore. Use the join method of
    the separator string instead, e.g. ''.join(t).

    Second, you can only join strings, but year is an integer, so convert
    it to a string first:

    t = timeinput.insert(2, str(year))
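
    Both points can be sketched roughly as follows. This is only an
    illustration written in Python 3 syntax: the list contents and the
    integer year are hypothetical sample values, and the insert call sits
    on its own line because list.insert returns None rather than the list.

```python
# Hypothetical sample values standing in for the scraped data.
timeinput = ['29', 'May', '01', '00']
year = 2008  # an integer, as tm_year would be

# str.join only accepts strings, so a bare integer raises TypeError.
try:
    ''.join(timeinput + [year])
    join_error = None
except TypeError as exc:
    join_error = exc

# Convert the year first, then call join on the separator string
# instead of going through the string module.
timeinput.insert(2, str(year))  # works in place and returns None
joined = ' '.join(timeinput)
print(joined)
```

    The separator here is a single space, which is an assumption made so
    that the result lines up with a format such as "%d %b %Y %H %M".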

    Diez B. Roggisch, May 29, 2008
  3. alex23

    alex23 Guest

    list.insert modifies the list in-place:
    >>> l = [1, 2, 3]
    >>> l.insert(2, 4)
    >>> l
    [1, 2, 4, 3]

    It also returns None, which is what you're assigning to 't' and then
    trying to join.

    Replace your usage of 't' with 'timeinput' and it should work.
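
    Put together, the corrected steps might look like the sketch below.
    It is not the original program: parse_rawtime is a hypothetical helper
    name, the sample input is assumed (the thread never shows rawtime), the
    code uses Python 3 syntax, and the parts are joined with a space rather
    than an empty string on the assumption that the text should mirror the
    spaces in the format string.

```python
import re
import time


def parse_rawtime(rawtime, year):
    """Split the raw text, add the year, and parse the result."""
    parts = re.split(r'[\s:-]', rawtime)
    # insert modifies parts in place and returns None, so the return
    # value is deliberately not assigned to anything.
    parts.insert(2, str(year))
    # Join with spaces so the text matches the spaces in the format.
    return time.strptime(' '.join(parts), '%d %b %Y %H %M')


# '29 May 01:00' is a hypothetical input chosen to match the trace output.
parsed = parse_rawtime('29 May 01:00', 2008)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday, parsed.tm_hour, parsed.tm_min)
```

    Keeping the insert call on its own line makes it harder to end up
    joining None by accident.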
    alex23, May 29, 2008
  4. TYR

    TYR Guest

    Yes, tm_year is converted to a unicode string elsewhere in the program.
    TYR, May 29, 2008
  5. TYR

    TYR Guest

    Thank you.
    TYR, May 29, 2008
