Gilles Ganault
Hello
I'm trying to read pages from Amazon JP, whose web pages are
supposed to be encoded in Shift_JIS, and decode their contents into
Unicode to keep Python happy:
www.amazon.co.jp
<meta http-equiv="content-type" content="text/html; charset=Shift_JIS" />
But this doesn't work:
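======
# "title_re" is a placeholder name for the compiled regex; "try" is a
# reserved word in Python and cannot be used as a variable name
m = title_re.search(the_page)
if m:
    # UnicodeEncodeError: 'charmap' codec can't encode characters in
    # position 49-55: character maps to <undefined>
    title = m.group(1).decode('shift_jis').strip()
======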
Has anyone successfully read Shift_JIS-encoded Japanese content
with Python?
Thank you.
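
For reference, a minimal sketch of the decode step in Python 3, assuming the page has already been downloaded as raw bytes. The sample page, the title regex, and the extract_title helper are all made up for illustration; none of them come from the code above. A 'charmap' UnicodeEncodeError is an encoding error, so it most likely comes from writing the decoded text to a console or file whose encoding cannot represent Japanese, rather than from the Shift_JIS decode itself. The sketch therefore decodes once up front and encodes explicitly (UTF-8 here) on output.

```python
import re

# Hypothetical sample standing in for the downloaded page bytes (the_page).
sample_title = "日本語のタイトル"
the_page = (
    "<html><head><title> " + sample_title + " </title></head></html>"
).encode("shift_jis")


def extract_title(page_bytes, encoding="shift_jis"):
    """Decode the raw bytes to text first, then search the text."""
    # errors="replace" substitutes U+FFFD for bytes the codec cannot map,
    # so one stray byte does not abort the whole page.
    text = page_bytes.decode(encoding, errors="replace")
    m = re.search(r"<title>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return m.group(1).strip() if m else None


title = extract_title(the_page)

# Encode explicitly when writing out, instead of relying on the
# console's default code page (the usual source of 'charmap' errors).
utf8_bytes = title.encode("utf-8")
```

If a real page contains characters outside plain Shift_JIS (vendor extensions are common on Japanese sites), passing encoding="cp932" to the helper may work better; that is a guess about the page, not something verified here.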