Unicode error in sax parser

  • Thread starter Rickard Lindberg

Hi,

Here is a shell script that reproduces my error:

#!/bin/sh

cat > å.timeline <<EOF
<?xml version="1.0" encoding="utf-8"?>
<timeline>
<version>0.13.0devb38ace0a572b+</version>
<categories>
</categories>
<events>
<event>
<start>2011-02-01 00:00:00</start>
<end>2011-02-03 08:46:00</end>
<text>asdsd</text>
</event>
</events>
<view>
<displayed_period>
<start>2011-01-24 16:38:11</start>
<end>2011-02-23 16:38:11</end>
</displayed_period>
<hidden_categories>
</hidden_categories>
</view>
</timeline>
EOF

python <<EOF
# encoding: utf-8
from xml.sax import parse
from xml.sax.handler import ContentHandler
parse(u"å.timeline", ContentHandler())
EOF

If I instead do

parse(u"å.timeline".encode("utf-8"), ContentHandler())

the script runs without errors.
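
Another variant that might sidestep the file name handling altogether is to open the file myself and pass the stream to parse, which accepts either a file name or a file-like object. This is only a minimal sketch: parse_path and ElementCounter are hypothetical names made up for illustration, and I am assuming that the built-in open copes with the unicode file name on the platform in question.

```python
# encoding: utf-8
from xml.sax import parse
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler: counts start tags, just to show the parse ran."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_path(path, handler):
    """Hypothetical helper: parse the XML file at `path` via a byte stream."""
    # Open the file ourselves so the parser receives a stream and never
    # has to derive a system id from the unicode file name.
    with open(path, "rb") as stream:
        parse(stream, handler)
    return handler
```

With the file from the script above, the call would be parse_path(u"å.timeline", ElementCounter()), after which the handler's count attribute holds the number of elements seen.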

Is this a bug or expected behavior?