Is there any way to decode a string using an unknown codec?

howmuchistoday

Hi,
I'm Korean, and when I use modules like sys, os, etc.,
the interpreter sometimes shows me broken strings like
'\x13\xb3\x12\xc8'.
It must be the Korean "alphabet", but I can't decode it the right way.
I tried decoding it with codecs such as cp949, mbcs and utf-8,
but that failed.
The only way I have found is eval('\x13\xb3\x12\xc8').
It raises an error whose message shows the right Korean.
Is there any way to handle these strings so that they are not broken?
 
Benjamin Kaplan

It's not broken. You're just using the wrong encodings. Try utf-16le.
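
As a minimal sketch of that suggestion, assuming Python 3 (where the escaped data has to be written as a bytes literal) and that the four bytes from the question really are little-endian UTF-16:

```python
# The bytes from the question, written as a Python 3 bytes literal.
raw = b"\x13\xb3\x12\xc8"

# Interpret each pair of bytes as one little-endian UTF-16 code unit.
text = raw.decode("utf-16-le")

print(text)
# Show the code points so the result can be checked even if the
# terminal cannot display Hangul.
print([hex(ord(ch)) for ch in text])
```

The two code points fall in the Hangul Syllables block, which is consistent with the data being Korean text.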
 
MRAB

It might be UTF-16. Decoding those bytes as UTF-16 gives:
'댓젒'

I don't know Korean, but that looks reasonable!
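
When the codec really is unknown, one rough approach is to try a short list of candidates and look at what each one produces. The sketch below assumes Python 3; the helper name `try_decodings` and the candidate list are made up for illustration and are not part of any library.

```python
def try_decodings(raw, candidates):
    """Return {codec_name: decoded_text} for every candidate that decodes."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this codec.
            # LookupError: the codec does not exist on this platform
            # (for example, mbcs is only available on Windows).
            continue
    return results


raw = b"\x13\xb3\x12\xc8"
decoded = try_decodings(raw, ["utf-8", "cp949", "mbcs", "utf-16-le", "utf-16-be"])
for name, text in decoded.items():
    print(name, repr(text))
```

More than one candidate may decode without raising an error, so a successful decode does not prove the codec is the right one. Someone who can read the language still has to judge which result makes sense.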
 
Dieter Maurer

Since the error raised by eval shows the right Korean, it looks as if
sys.stdout/sys.stderr know the correct encoding.
Check it like this:

import sys
print(sys.stdout.encoding)
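
A slightly fuller version of that check, as a sketch assuming Python 3, also reports the filesystem and locale encodings, since strings coming from os and sys can depend on those as well. The helper name `report_encodings` is hypothetical.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings the interpreter is currently using."""
    return {
        # getattr guards against replaced streams that lack the attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
    }


for name, encoding in report_encodings().items():
    print(name, "->", encoding)
```

If one of these values differs from the codec you are decoding with, that mismatch is a likely reason the text looks broken.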
 
howmuchistoday

On Thursday, June 28, 2012 at 11:20:28 AM UTC+9, Benjamin Kaplan wrote:
It's not broken. You're just using the wrong encodings. Try utf-16le.

Thank you guys. The problem is solved!
 
