decoding a byte array that is unicode escaped?

sam

I have a byte stream read over the internet:

responseByteStream = urllib.request.urlopen(httpRequest)
responseByteArray = responseByteStream.read()

The characters are encoded with unicode escape sequences. For example,
a copyright symbol appears in the stream as the bytes:

5C 75 30 30 61 39

which translates to:
\u00a9

which is unicode for the copyright symbol.

I am simply trying to display this copyright symbol on a webpage, so
how do I encode the byte array to utf-8 given that it is 'escape
encoded' in the above way? I tried:

responseByteArray.decode('utf-8')
and responseByteArray.decode('unicode_escape')
and str(responseByteArray).

I am using Python 3.1.
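
For reference, here is a small sketch of what the three attempts listed above return for the six example bytes. The variable names are illustrative only and are not taken from the original code. If the second attempt appeared not to work, one possible cause (an assumption, since the post does not say what went wrong) is that the console could not display the character.

```python
# The six bytes from the example: 5C 75 30 30 61 39
raw = bytes.fromhex("5C 75 30 30 61 39")

# Attempt 1: UTF-8 decoding succeeds, but the bytes are plain ASCII,
# so the result is the six characters \u00a9, not the symbol itself.
as_utf8 = raw.decode("utf-8")

# Attempt 2: the unicode_escape codec interprets the escape sequence.
as_escape = raw.decode("unicode_escape")

# Attempt 3: str() on a bytes object returns its repr, b'...' included.
as_str = str(raw)

# ascii() keeps the output printable on consoles that lack the symbol.
print(len(as_utf8), ascii(as_escape), as_str)
```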
 
Peter Otten

sam said:
I have a byte stream read over the internet:

responseByteStream = urllib.request.urlopen(httpRequest)
responseByteArray = responseByteStream.read()

The characters are encoded with unicode escape sequences. For example,
a copyright symbol appears in the stream as the bytes:

5C 75 30 30 61 39

which translates to:
\u00a9

which is unicode for the copyright symbol.

I am simply trying to display this copyright symbol on a webpage, so
how do I encode the byte array to utf-8 given that it is 'escape
encoded' in the above way? I tried:

responseByteArray.decode('utf-8')
and responseByteArray.decode('unicode_escape')
and str(responseByteArray).

I am using Python 3.1.

Decode the bytes to a unicode string first:
'©'

Then encode the string to UTF-8 bytes:
b'\xc2\xa9'
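
The two steps above can be sketched as follows. This is a minimal illustration, assuming the unicode_escape codec is what produces the first result; the helper name unescape_to_utf8 is hypothetical and not part of any library.

```python
def unescape_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: turn \\uXXXX escapes in raw into UTF-8 bytes."""
    # Step 1: bytes -> str, interpreting escape sequences such as \u00a9.
    text = raw.decode("unicode_escape")
    # Step 2: str -> bytes, encoded as UTF-8 for the web page.
    return text.encode("utf-8")


# Example input: the escaped copyright sign followed by plain ASCII text.
raw = b"\\u00a9 2009"
page_bytes = unescape_to_utf8(raw)
print(page_bytes)
```

One caveat: unicode_escape treats any byte that is not part of an escape sequence as Latin-1, so this sketch assumes the response is pure ASCII apart from the escapes.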
 
strong.drug

On Friday, 6 November 2009, 12:48:47 UTC+4, sam wrote:
I am simply trying to display this copyright symbol on a webpage, so
how do I encode the byte array to utf-8 given that it is 'escape
encoded' in the above way? I tried:

responseByteArray.decode('utf-8')
and responseByteArray.decode('unicode_escape')
and str(responseByteArray).

I am using Python 3.1.
I had a problem reading a zip archive in raw (binary) mode.
I solved it this way (Python 2, where the 'string_escape' codec exists):
.....
open(filename, 'rb').read().encode('string_escape')
# now the strange symbols in the strings are escaped,
# so we can handle them without decoding exceptions, for example:
body = '\r\n'.join(lines)
.....
# with unescaped strings we could get an exception there
# after the operations we need, we must unescape all the content
# and send it out to the network (in my case)
body = body.decode('string_escape')
.....
# then we can send it to the server
connection.request('POST', upload_url, body, headers)

BR)
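
Since the original question is about Python 3.1, where str and bytes are separate types and the 'string_escape' codec is not available, a rough Python 3 counterpart of the approach above is to keep the whole body as bytes, so no escaping round trip is needed. This is only a sketch under assumptions: the payload, the boundary and the form field names below are made up for illustration, and the actual upload format is not given in the post.

```python
# Arbitrary binary data standing in for the zip file contents
# (in practice: open(filename, 'rb').read()).
payload = bytes(range(256))

# Hypothetical multipart boundary and headers, for illustration only.
boundary = b"example-boundary"
lines = [
    b"--" + boundary,
    b'Content-Disposition: form-data; name="file"; filename="archive.zip"',
    b"Content-Type: application/zip",
    b"",
    payload,
    b"--" + boundary + b"--",
    b"",
]

# Joining bytes with a bytes separator never triggers a decoding step,
# so arbitrary binary content passes through unchanged.
body = b"\r\n".join(lines)
print(len(body))
```

In Python 3, http.client's connection.request() accepts a bytes body, so a body built this way can be passed to it directly.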
 
