Encoding confusion, please help

Pekka Niiranen

Hi,

Probing my system from Python 2.3.4, sys.getdefaultencoding() gives
'iso-8859-1'

Manual says:

locale.getpreferredencoding():
Return the encoding used for text data, according to user preferences...

sys.getdefaultencoding()
Return the name of the current default string encoding
used by the Unicode implementation

When should I use locale.getpreferredencoding() and when
sys.getdefaultencoding()?

Why are two different encodings, 'cp1252' and 'iso-8859-1', provided
for my Windows 2000 system?

-pekka-
 
Martin v. Löwis

Pekka said:
'iso-8859-1'

This is already troublesome; it means somebody (perhaps you)
has tampered with your Python installation. The default system
encoding is ascii, and it should not be changed unless
absolutely necessary.

Pekka said:
When should I use locale.getpreferredencoding() and when
sys.getdefaultencoding()?

There should never be a need to probe sys.getdefaultencoding(),
as it should always be ascii.
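
To illustrate why an ascii default is the safe choice: in Python 2 the default encoding is what implicit conversions between Unicode and byte strings fall back on, so with ascii any non-ASCII character fails loudly instead of being guessed at. The sketch below only mimics that by calling the ascii codec explicitly, and it is written for a current Python 3 interpreter rather than 2.3.4. The helper name `encode_strictly` is made up for this example.

```python
def encode_strictly(text, encoding="ascii"):
    """Return (True, encoded_bytes) on success, or (False, error_message)."""
    try:
        return True, text.encode(encoding)
    except UnicodeEncodeError as exc:
        # The codec refuses characters it cannot represent.
        return False, str(exc)


if __name__ == "__main__":
    # Pure ASCII text encodes without trouble.
    print(encode_strictly(u"Lowis"))
    # A non-ASCII character is rejected by the ascii codec ...
    print(encode_strictly(u"L\xf6wis"))
    # ... and accepted once an encoding is named explicitly.
    print(encode_strictly(u"L\xf6wis", "iso-8859-1"))
```

The point of the sketch is that the encoding should be chosen explicitly at the place where text leaves or enters the program, not by changing the interpreter-wide default.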

locale.getpreferredencoding() should be used when converting
Unicode strings to and from byte strings to be stored on the local
system (e.g. in files). Notice that this may or may not be adequate
also when printing data to the terminal. Specifically, on Windows,
the terminal often uses yet another encoding.
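
A minimal sketch of that use, assuming a current Python 3 interpreter: the text is encoded by hand with locale.getpreferredencoding() before the bytes are written, and decoded the same way when read back. The function names `save_text` and `load_text` and the file name are hypothetical, chosen only for this example.

```python
import locale
import os
import tempfile


def save_text(path, text, encoding=None):
    """Encode a Unicode string and write the resulting bytes to path."""
    if encoding is None:
        # Fall back on the user's preferred encoding for local files.
        encoding = locale.getpreferredencoding()
    with open(path, "wb") as f:
        f.write(text.encode(encoding))


def load_text(path, encoding=None):
    """Read bytes from path and decode them back into a Unicode string."""
    if encoding is None:
        encoding = locale.getpreferredencoding()
    with open(path, "rb") as f:
        return f.read().decode(encoding)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "note.txt")
    save_text(path, u"plain text")
    print(load_text(path))
```

Passing `encoding` explicitly (for example "cp1252") overrides the locale lookup, which may be useful when the file is meant for another machine.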

Pekka said:
Why are two different encodings, 'cp1252' and 'iso-8859-1', provided
for my Windows 2000 system?

Python provides many more encodings, including UTF-8, KOI8-R,
ISO-8859-2, cp1250, and so on. Having many codecs available in
the library is a good thing, because different applications have
different needs.
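
As a rough illustration of how these codecs differ, the sketch below encodes the same character with several of them (assuming a current Python 3 interpreter). The helper `encode_with_each` and the list `CODECS` are names invented for this example. Note that cp1252 and iso-8859-1 agree on "ö" but not on the euro sign, which only cp1252 can represent.

```python
def encode_with_each(text, codec_names):
    """Map each codec name to the encoded bytes, or None if it cannot encode text."""
    results = {}
    for name in codec_names:
        try:
            results[name] = text.encode(name)
        except UnicodeEncodeError:
            results[name] = None
    return results


CODECS = ["ascii", "iso-8859-1", "cp1252", "cp437", "utf-8", "koi8-r"]

if __name__ == "__main__":
    # u"\xf6" is o-umlaut, u"\u20ac" is the euro sign.
    for sample in (u"\xf6", u"\u20ac"):
        print(repr(sample), encode_with_each(sample, CODECS))
```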

I somehow feel this doesn't answer your question, but then, I don't
fully understand the question.

Regards,
Martin
 
Pekka Niiranen

Martin said:
This is already troublesome; it means somebody (perhaps you)
has tampered with your Python installation. The default system
encoding is ascii, and it should not be changed unless
absolutely necessary.

I do not recall changing it manually so just in case I reinstalled
latest versions of my default set of python tools in this order:

Python-2.3.4.exe
pywin32-203.win32-py2.3.exe
wxPython2.5-win32-unicode-2.5.3.1-py23.exe
ctypes-0.9.2.win32-py2.3.exe
numarray-1.1.win32-py2.3.exe
pychecker-0.8.14

and now it IS "ascii".

Martin said:
There should never be a need to probe sys.getdefaultencoding(),
as it should always be ascii.

locale.getpreferredencoding() should be used when converting
Unicode strings to and from byte strings to be stored on the local
system (e.g. in files). Notice that this may or may not be adequate
also when printing data to the terminal. Specifically, on Windows,
the terminal often uses yet another encoding.

Can I find out the terminal encoding somehow?

Martin said:
Python provides many more encodings, including UTF-8, KOI8-R,
ISO-8859-2, cp1250, and so on. Having many codecs available in
the library is a good thing, because different applications have
different needs.

I somehow feel this doesn't answer your question, but then, I don't
fully understand the question.

The reason I asked was that since my Windows regional settings
match "cp1252", I was puzzled by sys.getdefaultencoding()
not being the same.
 
Kent Johnson

Pekka said:
Can I find out the terminal encoding somehow?

On Windows:

D:\>help chcp
Displays or sets the active code page number.

CHCP [nnn]

nnn Specifies a code page number.

Type CHCP without a parameter to display the active code page number.

D:\>chcp
Active code page: 437
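
From inside Python, one possible approach is to look at the `encoding` attribute of sys.stdout, which may be empty or missing when output is redirected. The sketch below (assuming a current Python 3 interpreter) falls back on locale.getpreferredencoding() in that case, and also turns a code page number as printed by chcp into a codec name. Both function names are hypothetical.

```python
import codecs
import locale
import sys


def terminal_encoding(stream=None):
    """Best guess at the encoding of a terminal stream (default: sys.stdout)."""
    if stream is None:
        stream = sys.stdout
    # The attribute can be absent or None, e.g. for pipes or replaced streams.
    enc = getattr(stream, "encoding", None)
    if not enc:
        enc = locale.getpreferredencoding()
    return enc


def codepage_to_codec(number):
    """Translate a Windows code page number (as shown by chcp) to a codec name."""
    # codecs.lookup raises LookupError if Python has no codec for this page.
    return codecs.lookup("cp%d" % number).name


if __name__ == "__main__":
    print(terminal_encoding())
    print(codepage_to_codec(437))
```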

Kent
 
