UnicodeDecodeError, how to elegantly deal with this?

Jorgen Bodde

Hi All,

I am relatively new to python unicode pains and I would like to have
some advice. I have this snippet of code:

def playFile(cmd, args):
    argstr = list()
    for arg in appcfg.options[appcfg.CFG_PLAYER_ARGS].split():
        thefile = args["file"]
        filemask = u"%file%"
        therep = arg.replace(filemask, thefile) ##### error here
        argstr.append(therep)
    argstr.insert(0, appcfg.options[appcfg.CFG_PLAYER_PATH])

    try:
        subprocess.Popen( argstr )
    except OSError:
        cmd.html = "<h1>Can't play file</h1></br>" + args["file"]
        return

    cmd.redirect = _getBaseURL("series?cmd_get_series=%i" % args["id"])
    cmd.html = ""

-------------------

It crashes on this:

20:03:49: File
"D:\backup\important\src\airs\webserver\webdispatch.py", line 117, in
playFile therep = arg.replace(filemask, thefile)

20:03:49: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in
position 93: ordinal not in range(128)

20:03:49: Unhandled Error: <type 'exceptions.UnicodeDecodeError'>:
'ascii' codec can't decode byte 0xc2 in position 93: ordinal not in
range(128)

It chokes on a ` character in a file name. I read this file from disk,
and I would like to play it. However, in the replace action it cannot
translate this character. How can I deal with this issue transparently?
In my eyes it is simply replacing a string with a string, and
I do not want to be bothered with unicode problems. I am not sure
which encoding it is in, and I am not experienced enough to see how I
can solve this.

Can anybody guide me to an elegant solution?

Thanks in advance!
- Jorgen
 
John Machin

> Hi All,
>
> I am relatively new to python unicode pains and I would like to have
> some advice. I have this snippet of code:
>
> thefile = args["file"]
> filemask = u"%file%"
> therep = arg.replace(filemask, thefile) ##### error here
>
> It crashes on this:
>
> 20:03:49: File
> "D:\backup\important\src\airs\webserver\webdispatch.py", line 117, in
> playFile therep = arg.replace(filemask, thefile)
>
> 20:03:49: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in
> position 93: ordinal not in range(128)
>
> 20:03:49: Unhandled Error: <type 'exceptions.UnicodeDecodeError'>:
> 'ascii' codec can't decode byte 0xc2 in position 93: ordinal not in
> range(128)
>
> It chokes on a ` character in a file name. I read this file from disk,
> and I would like to play it. However in the replace action it cannot
> translate this character. How can I transparently deal with this issue
> because in my eyes it is simply replacing a string with a string, and
> I do not want to be bothered with unicode problems. I am not sure in
> which encoding it is in, but I am not experienced enough to see how I
> can solve this

If you don't want to be bothered with "unicode problems":
(1) Don't create a "unicode problem" when one doesn't exist.
(2) Don't bother other people with *your* "unicode problems".

> Can anybody guide me to an elegant solution?

Short path:
In this case, less is more; remove the u prefix in the line
filemask = u"%file%"
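
A minimal sketch of why the u prefix matters. It is written for Python 3, where the automatic conversion no longer exists, so the decode that Python 2 performed behind the scenes is spelled out by a hypothetical helper, implicit_ascii_decode. The file name and the "--play=" argument are made up for illustration.

```python
# A hypothetical file name containing U+00B4, stored as UTF-8 bytes.
thefile = "caf\u00b4 song.mp3".encode("utf-8")


def implicit_ascii_decode(raw):
    """Mimic the hidden step Python 2 ran when a byte string met a unicode string."""
    return raw.decode("ascii")


try:
    implicit_ascii_decode(thefile)
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
# 'ascii' codec can't decode byte 0xc2 in position 3: ordinal not in range(128)

# With bytes on both sides (the effect of dropping the u prefix),
# nothing is decoded and the replacement simply copies the bytes through.
arg = b"--play=%file%"
therep = arg.replace(b"%file%", thefile)
print(therep)
# b'--play=caf\xc2\xb4 song.mp3'
```

The error appears only when the two string types are mixed; keeping everything as byte strings, or everything as unicode, avoids it.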

Long Path:
Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
an instrument of Satan. Read this:
http://www.amk.ca/python/howto/unicode

By the way, how one's filesystem encodes file names can be a good
thing to know; in your case it appears to be UTF-8.
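
A rough sketch of the other direction: decode the file name once, using the encoding the file system is believed to use, and then work with text throughout. The file name and argument template are hypothetical, and UTF-8 is an assumption taken from the guess above, not something verified for this system.

```python
import sys

# What Python believes the file system encoding to be on this machine.
print(sys.getfilesystemencoding())

raw_name = b"caf\xc2\xb4 song.mp3"   # hypothetical file name as raw bytes

# Assumption: the bytes are UTF-8. Decode once, at the boundary.
name = raw_name.decode("utf-8")

# From here on everything is text, so replace() has nothing to convert.
template = "--play=%file%"           # hypothetical player argument
argument = template.replace("%file%", name)
print(argument)

# Encode again only when bytes are needed, for example to hand to another program.
encoded = argument.encode("utf-8")
```

If the assumed encoding is wrong, decode() raises UnicodeDecodeError at the boundary, which is easier to diagnose than a failure deep inside a replace().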

HTH,
John
 
Jorgen Bodde

Hi John,

> If you don't want to be bothered with "unicode problems":
> (1) Don't create a "unicode problem" when one doesn't exist.
> (2) Don't bother other people with *your* "unicode problems".

Well I guess you misunderstood what I meant. I meant I am a simple
developer, getting a string from the file system that happens to be in
some kind of encoding. It is a total mystery to me why it crashes on
that, so that is what I meant by not wanting to be bothered with it:
I don't see any obvious reason why it fails. It is not that I am too
lazy to deal with it; it simply seems strange to me.

> In this case, less is more; remove the u prefix in the line
> filemask = u"%file%"

Ok thanks. I thought making it unicode, because it is a search string
that is used in a UTF-8 encoded replacement, would solve it.

> Long Path:
> Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
> an instrument of Satan. Read this:
> http://www.amk.ca/python/howto/unicode

I never said that I have an attitude towards unicode, I simply
misunderstood its inner workings. Thanks for the link, I will look at
it.

ps. sorry for the direct mail, I can't get used to one mailing list
always replying to the list, and the other replying to the user by
default ;-)

With regards,
- Jorgen

 
John Machin

> Hi John,
>
> Well I guess you misunderstood what I meant.

Sorry, it's my ETL (English as a Third Language) problem; my mother
tongue is the Queensland dialect of Australian :)

> Ok thanks. I thought making it unicode because it is a search string
> that is used in a UTF-8 encoded replacement, would solve it,

"UTF-8 encoded" implies a str object (a sequence of 8-bit bytes), not a
unicode object. Solve what? What problem did you have before you put
the u in there?
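
To make the distinction concrete, here is a small sketch in Python 3 terms, where the unicode type is called str and the old 8-bit str is called bytes. The character used is only an example whose UTF-8 form starts with byte 0xc2; whether it is the one in the original file name is an assumption.

```python
text = "\u00b4"                    # one character of text
encoded = text.encode("utf-8")     # the same character as UTF-8 bytes

print(len(text), len(encoded))     # 1 2
print(encoded)                     # b'\xc2\xb4'

# Decoding with the matching codec recovers the original text.
roundtrip = encoded.decode("utf-8")
print(roundtrip == text)           # True
```

The encoded form is what lives on disk or travels over a pipe; the decoded form is what string operations such as replace() should see.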

> I never said that I have an attitude towards unicode, I simply
> misunderstood its inner workings.

I must have misunderstood "pains" and "bother", eh?

Cheers,
John
 
