Strange thing with types

TYR

I'm doing some data normalisation, which involves data from a Web site
being extracted with BeautifulSoup, cleaned up with a regex, then
having the current year as returned by time()'s tm_year attribute
inserted, before the data is concatenated with string.join() and fed
to time.strptime().

Here's some code:
timeinput = re.split('[\s:-]', rawtime)
print timeinput #trace statement
print year #trace statement
t = timeinput.insert(2, year)
print t #trace statement
t1 = string.join(t, '')
timeobject = time.strptime(t1, "%d %b %Y %H %M")

year is a Unicode string; so is the data in rawtime (BeautifulSoup
gives you Unicode, dammit). And here's the output:

[u'29', u'May', u'01', u'00'] (OK, so the regex is working)
2008 (OK, so the year is a year)
None (...but what's this?)
Traceback (most recent call last):
File "bothv2.py", line 71, in <module>
t1 = string.join(t, '')
File "/usr/lib/python2.5/string.py", line 316, in join
return sep.join(words)
TypeError
 
Diez B. Roggisch

First, don't use the string module anymore. Use e.g.

''.join(t)

Second, you can only join strings, but year is an integer, so convert it to
a string first:

t = timeinput.insert(2, str(year))

Diez
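
A minimal sketch of the join point above, written so it runs on Python 3. The list contents are hypothetical, with an int deliberately placed among the strings; it shows only the general behaviour of str.join and assumes nothing about what type year actually has in the original program. The space separator is also an assumption, picked so the result would line up with the spaces in the strptime format.

```python
# Hypothetical list with an int in it, to show what join() accepts.
parts = ["29", "May", 2008, "01", "00"]

error = None
try:
    " ".join(parts)            # join() only accepts string items
except TypeError as exc:
    error = exc                # the int in position 2 triggers this

# Converting every item to a string first makes the join succeed.
joined = " ".join(str(p) for p in parts)
print(joined)
```

The method form, separator.join(words), is what string.join(words, separator) calls internally, as the traceback in the question shows.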
 
alex23

list.insert modifies the list in-place:

>>> l = [1, 2, 3]
>>> l.insert(2, 4)
>>> l
[1, 2, 4, 3]

It also returns None, which is what you're assigning to 't' and then
trying to join.

Replace your usage of 't' with 'timeinput' and it should work.
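
Applied to the code in the question, the fix might look like the sketch below. It is written for Python 3, and several things in it are assumptions rather than the original program: the value of rawtime is made up to reproduce the list shown in the output, year is hard-coded as a string, and the pieces are joined with a single space so that the result matches the spaces in the strptime format.

```python
import re
import time

rawtime = "29 May 01:00"   # hypothetical input, chosen to give the list from the thread
year = "2008"              # assumed to be a string already

timeinput = re.split(r"[\s:-]", rawtime)

# insert() changes timeinput itself and returns None,
# so there is nothing useful to assign from it.
result = timeinput.insert(2, year)

# Join the list that was modified, not the return value of insert().
t1 = " ".join(timeinput)
timeobject = time.strptime(t1, "%d %b %Y %H %M")
print(t1)
```

Here result is None while timeinput holds the five fields, which is the same situation as the "None" trace line in the question.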
 
TYR

Diez B. Roggisch said:
First, don't use the string module anymore. Use e.g.

''.join(t)

Second, you can only join strings, but year is an integer, so convert it to
a string first:

t = timeinput.insert(2, str(year))

Diez

Yes, tm_year is converted to a unicode string elsewhere in the program.
 
TYR

alex23 said:
list.insert modifies the list in-place:
>>> l = [1, 2, 3]
>>> l.insert(2, 4)
>>> l
[1, 2, 4, 3]

It also returns None, which is what you're assigning to 't' and then
trying to join.

Replace your usage of 't' with 'timeinput' and it should work.

Thank you.
 
