Trouble with the encoding of os.getcwd() in Korean Windows

Erik Bethke

Hello All,

I have found much help in the google archives but I am still stuck...

here is my code snippet:
path = os.getcwd()
path = path.decode('UTF8')

Now the trouble is I am getting that darn UnicodeDecodeError, where it
is tripping up on the Korean hangul for My Desktop. Now I have tried
utf8 and utf16 and neither of these works.

So I guess my question is: what encoding does Windows use on Korean
Windows? I was not sure, so I surfed around
(http://foundationstone.com.au/HtmlSupport/OnlineHelp/Localisation/SupportedEncodings.html)
and there appears to be an encoding called windows-949, labeled as
Korean Windows, which of course is *not* one of the encodings to be
found in the encodings package... which would suck.

But then I thought about it some more: how could you make software that
does things like read the current directory work on machines set up for
different languages? It would be madness to have a try statement for each and
every encoding under the sun...

Why isn't there a get system encoding method?

Or am I on the entirely wrong track?

Thanks,
-Erik
 
Erik Bethke

Hello All,

Well as usual, after I post I keep on digging and I found the answer...

http://cjkpython.i18n.org/

It has the encodings for Chinese, Korean and Japanese... and I took the
hint that I found on the foundationstone page and tried cp949 and voilà!
It works...

Now, the question remains, how do I write windows python code that will
work on all flavors of windows languages? The wxPython demo works,
because I have installed it on a path on my machine that does not have
Hangul in the path. But if I distribute something to end users, they
most certainly have Hangul in their path, or Japanese or Chinese, or
some other encoding... so how do you get the correct encoding from the
system?

Thanks,
-Erik
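
A small sketch of why the first attempt failed and cp949 worked. It is written in Python 3 syntax (the snippets in this thread are Python 2), and the Hangul folder name and the helper try_decode are illustrative assumptions, not taken from the thread.

```python
# Illustrative sample: a Hangul folder name, not taken from the thread.
folder_name = "바탕 화면"

# The bytes a cp949 (Korean Windows ANSI code page) system would hand back.
raw = folder_name.encode("cp949")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


utf8_result = try_decode(raw, "utf-8")    # None: these bytes are not valid UTF-8
cp949_result = try_decode(raw, "cp949")   # round-trips to the original text
```

Hard-coding cp949 only helps on Korean systems, which is why the question of asking the system for its encoding still stands.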
 
Vincent Wehren

Erik said:
Hello All,

I have found much help in the google archives but I am still stuck...

here is my code snippet:
path = os.getcwd()
path = path.decode('UTF8')

Now the trouble is I am getting that darn UnicodeDecodeError, where it
is tripping up on the Korean hangul for My Desktop. Now I have tried
utf8 and utf16 and neither of these works.


So is this my question?: What encoding does windows use for Korean
Windows?

Try "mbcs". This is a built-in encoding available only on Windows
that corresponds to the system's default ANSI code page. With "mbcs", which is
short for "multi-byte character set", the conversions to and from
Unicode (decode/encode) are handled internally by the corresponding
Win32 API functions.
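
One way this suggestion might look in code, as a rough sketch in Python 3 syntax. The helper names ansi_encoding and decode_ansi are hypothetical, and falling back to locale.getpreferredencoding() outside Windows is an assumption added here so the sketch runs anywhere.

```python
import locale
import sys


def ansi_encoding():
    """Pick a codec name for the system's ANSI code page (hypothetical helper)."""
    if sys.platform == "win32":
        # Windows-only codec that defers to the Win32 conversion functions.
        return "mbcs"
    # Assumption: elsewhere, fall back to the locale's preferred encoding.
    return locale.getpreferredencoding(False)


def decode_ansi(raw):
    """Decode bytes received from the OS into text using the codec chosen above."""
    return raw.decode(ansi_encoding())
```

The point of "mbcs" is that the same line of code follows whatever code page the user's Windows installation is set to, so no per-language list of encodings is needed.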
 
Erik Bethke

Thank you Vincent, I will try this...

I did get over my troubles with this new code snippet:

encoding = locale.getpreferredencoding()
htmlpath = os.getcwd()
htmlpath = htmlpath.decode( encoding )


That seems to be working well too. I can write to these files and I
can open them with the file dialog, but this is now failing with the
famous ASCII error:

webbrowser.open( htmlpath, True, True )
 
Erik Bethke

Hello All,

sorry for all the posts... I am *almost* there now...

okay I have this code:

import locale, os, webbrowser

encoding = locale.getpreferredencoding()
htmlpath = os.getcwd()
htmlpath = htmlpath.decode( encoding )

..... write to the file .....
...... file is written fine, and can be opened by both Firefox and IE
and displays fine ...

webbrowser.open( htmlpath.encode ( encoding ), True, True )

the line above now works fine (fixed the ascii error)

but *NOW* my problem is that Firefox pops up a message box
complaining that the file does not exist, but it certainly does, it
just doesn't like what it is called...

Any ideas now?

Thanks,
-Erik
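
One thing that might be worth trying here is handing the browser a percent-encoded file:// URL instead of a raw path in the local code page. This is only a sketch in Python 3 syntax, and nothing in this thread confirms that it cures the Firefox message; path_to_file_url and open_local_html are hypothetical helper names.

```python
import webbrowser
from pathlib import Path


def path_to_file_url(path):
    """Turn a filesystem path into a percent-encoded file:// URL."""
    # absolute() is needed because as_uri() rejects relative paths.
    return Path(path).absolute().as_uri()


def open_local_html(path):
    """Open a local file in the default browser via its file:// URL."""
    return webbrowser.open(path_to_file_url(path), new=1, autoraise=True)
```

The resulting URL is plain ASCII, so it no longer matters which code page the browser assumes for its command-line arguments.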
 
Erik Bethke

Ah and PS, again this is only for paths that are non-ASCII or at least
have Korean in them...

The browser bit launches successfully in other locations.

-Erik
 
Erik Bethke

Wow, even more information. When I set my default browser to IE, it
launches fine... so it is something about Firefox being more picky than
IE...

Where would I hunt down this sort of problem? Sounds rare, should I
contact Mozilla, or can you guys spot something silly I am doing?

Thank you,
-Erik
 
Walter Dörwald

Erik said:
Hello All,

sorry for all the posts... I am *almost* there now...

okay I have this code:

import locale, os, webbrowser

encoding = locale.getpreferredencoding()
htmlpath = os.getcwd()
htmlpath = htmlpath.decode( encoding )

You might want to try os.getcwdu() instead, which returns the path as a
Unicode object so no manual decoding is needed. According to
http://www.python.org/doc/2.4/lib/os-file-dir.html
it was added in Python 2.3 and should work on Windows.

Bye,
Walter Dörwald
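
A minimal sketch of that suggestion which also runs on Python 3, where os.getcwd() already returns text and os.getcwdu() no longer exists. The wrapper name getcwd_text is hypothetical.

```python
import os
import sys


def getcwd_text():
    """Return the current directory as text on both Python 2 and Python 3."""
    if sys.version_info[0] < 3:
        return os.getcwdu()  # Python 2.3+: Unicode result, no manual decode
    return os.getcwd()       # Python 3: already returns str


htmlpath = getcwd_text()
```

This sidesteps the question of which encoding to pass to decode(), since the path never exists as a byte string in the program.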
 
