print u"\u0432": why is this so hard? UnicodeEncodeError

Martin v. Löwis

Skip said:
Martin> I wonder how much would break if Python would assume the
Martin> terminal encoding is UTF-8 on Darwin. Do people use different
Martin> terminal encodings?

I generally use xterm instead of Terminal.app. I think its encoding is
latin-1.

Not on OS X. Terminal.App is a software package completely different
from xterm. For one thing, xterm is for X11, whereas Terminal.App is
for Window Server (or whatever the name of the Apple GUI is).

It may be that Apple has changed the defaults at some time, but
at least on my installation, the default encoding of Terminal.App for
all users is UTF-8.

Regards,
Martin
 
Michael Hudson

Martin v. Löwis said:
Not on OS X. Terminal.App is a software package completely different
from xterm. For one thing, xterm is for X11, whereas Terminal.App is
for Window Server (or whatever the name of the Apple GUI is).

I think Skip knows this :)

Cheers,
mwh
 
Skip Montanaro

David> Now I'm curious -- how do you even find out it's a Terminal
David> window you're looking at, rather than say an xterm?

I just compared the output of "env" in both xterm and Terminal windows and
came up with these clues:

* TERM in an xterm is "xterm". In Terminal it's "vt100".

* In Terminal a TERM_PROGRAM environment variable is defined with a
value of "Apple_Terminal".

* The xterm also defines DISPLAY for obvious reasons.
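The clues above could be turned into a small check. This is only a sketch: the function name `guess_terminal` and its return strings are made up for illustration, it is written in Python 3 compatible syntax, and the variable values are simply the ones observed on this one machine, so other setups may differ.

```python
import os


def guess_terminal(environ=None):
    """Guess the terminal emulator from environment variables.

    Hypothetical helper: relies only on the clues listed in this post.
    """
    env = os.environ if environ is None else environ
    # Terminal.app sets TERM_PROGRAM to "Apple_Terminal".
    if env.get("TERM_PROGRAM") == "Apple_Terminal":
        return "Terminal.app"
    # An xterm reports TERM=xterm and needs DISPLAY to reach the X server.
    if env.get("TERM") == "xterm" and "DISPLAY" in env:
        return "xterm"
    return "unknown"


print(guess_terminal())
```

Passing a plain dict instead of `os.environ` makes it easy to try the logic without changing the real environment.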

Skip
 
Skip Montanaro

Skip> I generally use xterm instead of Terminal.app. I think its
Skip> encoding is latin-1.

Martin> Not on OS X.

I run xterms under XDarwin. I don't think the default encoding is different
than xterms in any other X environment. If I execute

print u'\xed'.encode("latin-1")

in an xterm I get an accented "i". If I execute

print u'\xed'.encode("utf-8")

in a Terminal window I also get an accented "i".
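To see why each terminal needs a different encoding, it may help to look at the bytes themselves. The sketch below uses Python 3 compatible syntax rather than the Python 2 print statement above, and the variable names are illustrative only.

```python
ch = u"\xed"  # LATIN SMALL LETTER I WITH ACUTE

latin1_bytes = ch.encode("latin-1")  # one byte: 0xED
utf8_bytes = ch.encode("utf-8")      # two bytes: 0xC3 0xAD

print(repr(latin1_bytes))
print(repr(utf8_bytes))

# A latin-1 terminal handed the UTF-8 bytes shows two characters, not one.
mojibake = utf8_bytes.decode("latin-1")
print(len(mojibake))

# A UTF-8 terminal handed the lone latin-1 byte gets an invalid sequence.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
print(latin1_is_valid_utf8)
```

So the same character displays correctly only when the bytes written match what the terminal expects.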

Skip
 
Martin v. Löwis

Skip said:
Skip> I generally use xterm instead of Terminal.app. I think its
Skip> encoding is latin-1.

Martin> Not on OS X.

I run xterms under XDarwin.

Sorry, I completely misunderstood.

My apologies,
Martin
 
Scott Schwartz

Skip Montanaro said:
I run xterms under XDarwin. I don't think the default encoding is different
than xterms in any other X environment. If I execute

These days you can invoke it as uxterm to force it into utf-8 mode.
 
Scott Schwartz

Martin v. Löwis said:
It is: however, your locale only tells Python the encoding of your
terminal, not the encoding of an arbitrary file you may write to.

That's not the usual interpretation of locale. It's not about
terminals, it's about everything, especially files.
 
David Eppstein

It is: however, your locale only tells Python the encoding of your
terminal, not the encoding of an arbitrary file you may write to.

That's not the usual interpretation of locale. It's not about
terminals, it's about everything, especially files.

Files should be in a format that specifies the encoding explicitly,
either within the file or as part of an external file format
specification. One should not have to hope that the locale of the
person using the file is the same as that of the person who created it.

(I realize locale also affects e.g. collation order, but most
problems come from encoding mismatches.)
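As a rough illustration of naming the encoding explicitly instead of relying on the locale, here is a sketch that writes and reads a file as UTF-8. The temporary path and the choice of UTF-8 are assumptions made for the example; `io.open` is used because it accepts an `encoding` argument on both Python 2.6+ and Python 3.

```python
import io
import os
import tempfile

text = u"\u0432"  # CYRILLIC SMALL LETTER VE, the character from the subject line

# Hypothetical scratch file, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# The writer states the encoding, so the locale plays no part.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# The reader states the same encoding and gets the same text back.
with io.open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

# The bytes on disk are the UTF-8 form regardless of who runs this.
with io.open(path, "rb") as f:
    raw = f.read()

print(round_tripped == text)
print(repr(raw))
```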
 
Martin v. Löwis

Scott said:
That's not the usual interpretation of locale. It's not about
terminals, it's about everything, especially files.

At the application's choice, though. Python should not guess unless
it is likely that it is guessing right. For files, it is likely
guessing wrong - even for text files.
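For terminal output, where the locale-derived guess is more likely to be right, one way to avoid a UnicodeEncodeError is to encode explicitly with a fallback. This is only a sketch: `encode_for_terminal` is a made-up helper, and the `backslashreplace` fallback is one possible choice among several.

```python
import locale
import sys


def encode_for_terminal(text, encoding=None):
    """Encode text for the terminal, escaping what cannot be represented.

    Hypothetical helper: picks the encoding from sys.stdout, then the
    locale, then falls back to ASCII.
    """
    enc = (
        encoding
        or getattr(sys.stdout, "encoding", None)
        or locale.getpreferredencoding(False)
        or "ascii"
    )
    # backslashreplace turns unencodable characters into \uXXXX escapes
    # instead of raising UnicodeEncodeError.
    return text.encode(enc, "backslashreplace")


print(repr(encode_for_terminal(u"\u0432", "ascii")))
print(repr(encode_for_terminal(u"\u0432", "utf-8")))
```

The result is a byte string; nothing here applies to files, where the encoding should still be given explicitly.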

Regards,
Martin
 
