Unicode question

  • Thread starter Ben Edwards (lists)

Ben Edwards (lists)

I am using Python 2.4 on Ubuntu Dapper and working through Dive into
Python.

There are a couple of inconsistencies.

Firstly, sys.setdefaultencoding('iso-8859-1') does not work; I have to do
sys.setdefaultencoding = 'iso-8859-1'

Secondly, the following does not give a 'UnicodeError: ASCII encoding
error:', and I would expect it to. In fact it prints out the n with ~
above it fine:

sys.setdefaultencoding = 'ascii'
s = u'La Pe\xf1a'
print s

Any insight?
Ben
 

Steve M

Ben said:
I am using python 2.4 on Ubuntu dapper, I am working through Dive into
Python.

There are a couple of inconsistencies.

Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
sys.setdefaultencoding = 'iso-8859-1'

When you run a Python script, the interpreter runs its own startup code
(the site module) before executing your script. One of the things that
code does is delete the name sys.setdefaultencoding. By the time the
first line of your script runs, that name no longer exists, so calling
it as in your first attempt fails with an AttributeError.

The second attempt, "sys.setdefaultencoding = 'iso-8859-1'", creates
a new name in the sys namespace and binds it to a string. Nothing reads
that name, so it will not change the default encoding.
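
A minimal sketch of that difference, using the string from the question. The helper name rebinding_changes_default is made up for illustration. It binds the attribute, checks what sys.getdefaultencoding() reports before and after, and then puts sys back the way it was:

```python
import sys

def rebinding_changes_default(new_value="iso-8859-1"):
    """Return True if assigning to sys.setdefaultencoding changed the default."""
    before = sys.getdefaultencoding()
    had_attr = hasattr(sys, "setdefaultencoding")
    saved = getattr(sys, "setdefaultencoding", None)

    # This only binds a name on the sys module; no function is called.
    sys.setdefaultencoding = new_value
    after = sys.getdefaultencoding()

    # Restore the module to its previous state.
    if had_attr:
        sys.setdefaultencoding = saved
    else:
        del sys.setdefaultencoding
    return before != after

print(rebinding_changes_default())
```

This should print False: the assignment succeeds, but the default encoding is unchanged.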

I have found that in order to change the default encoding with that
function, you can put the command in a file called sitecustomize.py
which, when placed in the appropriate location (which is
platform-dependent), will be called in time to have the desired effect.
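
As a rough sketch, a sitecustomize.py along these lines should do it. The helper apply_default_encoding is a made-up name, and the encoding is just the one from the question. The getattr guard is an addition so the file is harmless on interpreters where the function is not available at all (Python 3, for example):

```python
# sitecustomize.py (a sketch, not a definitive recipe)
import sys

def apply_default_encoding(encoding="iso-8859-1"):
    """Call sys.setdefaultencoding if the interpreter still exposes it."""
    setter = getattr(sys, "setdefaultencoding", None)
    if setter is None:
        # The name has already been removed, or never existed.
        return False
    setter(encoding)
    return True

applied = apply_default_encoding()
```

Afterwards, sys.getdefaultencoding() in your own script tells you whether it took effect.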

So the order of events is something like:
1. Invoke Python on myscript.py
2. Python runs its startup code, which executes sitecustomize.py if it
finds one.
3. Python deletes the name sys.setdefaultencoding, thereby making the
function that was so named inaccessible.
4. Python then begins executing myscript.py.
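
The order above can be modelled with a toy simulation. This is not the interpreter's real startup code: fake_sys, simulate_startup, set_latin1 and do_nothing are all hypothetical stand-ins, used only to show why the call works in step 2 but fails in step 4:

```python
import types

def simulate_startup(sitecustomize, script):
    """Model the startup order with a stand-in object instead of the real sys."""
    fake_sys = types.SimpleNamespace(encoding="ascii")

    def setdefaultencoding(name):
        fake_sys.encoding = name

    fake_sys.setdefaultencoding = setdefaultencoding

    sitecustomize(fake_sys)            # step 2: sitecustomize.py runs first
    del fake_sys.setdefaultencoding    # step 3: the name is deleted
    try:
        script(fake_sys)               # step 4: the user's script runs
        script_error = None
    except AttributeError as exc:
        script_error = type(exc).__name__
    return fake_sys.encoding, script_error

def set_latin1(s):
    s.setdefaultencoding("iso-8859-1")

def do_nothing(s):
    pass

print(simulate_startup(set_latin1, do_nothing))  # call made in sitecustomize
print(simulate_startup(do_nothing, set_latin1))  # call made in the script
```

In the first run the encoding changes; in the second the script hits an AttributeError and the encoding stays 'ascii'.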


Regarding the location of sitecustomize.py, on Windows it is
C:\Python24\Lib\sitecustomize.py.

My guess is that you should put it in the same directory as the bulk of
the Python standard library files. (Also in that directory is a
subdirectory called site-packages, where you can put custom modules
that will be available for import from any of your scripts.)
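
If you are not sure where that directory is on your system, you can ask Python. This is only a sketch: it assumes the os module was loaded from an ordinary file in the standard library directory, and the function names are made up. Some distributions keep third-party modules elsewhere, so treat the site-packages path as a candidate to check rather than a guarantee:

```python
import os

def stdlib_directory():
    # os is a plain-Python standard library module, so its file normally
    # lives in the same directory as the rest of the standard library.
    return os.path.dirname(os.path.abspath(os.__file__))

def site_packages_candidate():
    # The usual location of site-packages relative to the standard library.
    return os.path.join(stdlib_directory(), "site-packages")

print(stdlib_directory())
print(site_packages_candidate())
```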
 

Guest

Ben said:
Firstly sys.setdefaultencoding('iso-8859-1') does not work, I have to do
sys.setdefaultencoding = 'iso-8859-1'

That "works", but has no effect. You bind the variable
sys.setdefaultencoding to some value, but that value is never used for
anything (do sys.getdefaultencoding() to see what I mean). You could
just as well write

sys.standardkodierung = 'iso-8859-1'

Ben said:
secondly the following does not give a 'UnicodeError: ASCII encoding
error:', and I would expect it to. In fact it prints out the n with ~
above it fine:

sys.setdefaultencoding = 'ascii'
s = u'La Pe\xf1a'
print s

Any insight?

The print statement uses sys.stdout.encoding, not the default encoding.
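
To see the error you were expecting, encode the string to ASCII explicitly instead of relying on print. A small sketch (try_encode is a made-up helper; the string is the one from your post):

```python
import sys

s = u'La Pe\xf1a'

def try_encode(text, encoding):
    """Return the encoded bytes, or None if the text cannot be encoded."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# The encoding print uses for the terminal; it may be None when output is piped.
stdout_encoding = getattr(sys.stdout, "encoding", None)

ascii_bytes = try_encode(s, "ascii")         # None: \xf1 is not ASCII
latin1_bytes = try_encode(s, "iso-8859-1")   # succeeds: one byte per character
```

So on a terminal whose encoding can represent the character, print succeeds regardless of what the default encoding is.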

Regards,
Martin
 
