Different answers with isalpha()

nuno

Hi,

I have python with sys.version_info = (2, 4, 4, 'final', 0)

In IDLE, when I do print 'á'.isalpha() I get True. When I put the same
code in a script file and run it, I get False.

Why do I get different answers?


Thank you
 
Gabriel Genellina

Do you include an encoding directive at the top of the script? (If you omit
it you get a warning in 2.4 and an error in 2.5.)

Try this:

import unicodedata
print unicodedata.name(u'á')

from IDLE and from inside a script. You should get "LATIN SMALL LETTER A
WITH ACUTE"; if not, Python thinks your terminal uses a different encoding
than the actual one.
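
As an illustration of the encoding directive mentioned above, here is a minimal sketch of a script file. It assumes the file is saved as UTF-8; if your editor saves in another encoding, the directive has to name that encoding instead. The variable name letter is only for illustration. Printing the name of every character, rather than calling unicodedata.name() once, makes a mismatch visible: a correctly decoded á should show up as a single LATIN SMALL LETTER A WITH ACUTE line, while a misread file typically yields more than one character.

```python
# -*- coding: utf-8 -*-
# Assumption: this file is saved as UTF-8, matching the directive above.
import unicodedata

letter = u'á'

# One line per character the interpreter actually decoded from the source.
for ch in letter:
    print(unicodedata.name(ch))
```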
 
Jyotirmoy Bhattacharya

Non-ASCII characters in ordinary (8-bit) strings behave strangely in
several ways. First, the answer of isalpha() and friends depends on the
current locale. By default, Python uses the "C" locale, where the
alphabetic characters are a-zA-Z only. To set the locale to whatever
the OS setting is for the current user, put this near the beginning of
your script:

import locale
locale.setlocale(locale.LC_ALL,'')

Apparently IDLE does this for you. Hence the discrepancy you noted.

Second, there is the matter of encoding. String literals like the one
you used in your example are stored in whatever encoding your text
editor chose to store your program in. If it doesn't match the
encoding used by the current locale, the bytes are misinterpreted and
the test fails again.
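
To make the encoding point concrete, here is a small sketch showing that the same character becomes different bytes depending on the encoding the editor used. The character is written as an escape so the example does not depend on how this text itself is stored; Latin-1 and UTF-8 are just two common encodings picked for illustration.

```python
# LATIN SMALL LETTER A WITH ACUTE, written as an escape on purpose.
a_acute = u'\xe1'

latin1_bytes = a_acute.encode('latin-1')  # a single byte, 0xE1
utf8_bytes = a_acute.encode('utf-8')      # two bytes, 0xC3 0xA1

print(len(latin1_bytes))  # 1
print(len(utf8_bytes))    # 2

# Reading UTF-8 bytes as if they were Latin-1 gives two unrelated
# characters instead of the original one.
misread = utf8_bytes.decode('latin-1')
print(len(misread))         # 2
print(misread == a_acute)   # False
```

An 8-bit string holds only such bytes, so a locale that expects one encoding cannot classify bytes written in the other.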

As I see it, the only way to properly handle characters outside the
ASCII set is to use Unicode strings.
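
A minimal sketch of that approach: decode incoming bytes to a Unicode string as early as possible, using the encoding they were really written in, and call isalpha() on the result. The Latin-1 choice and the names raw and text are assumptions for the example, not something from the original script.

```python
# Stand-in for bytes read from a file or terminal that uses Latin-1.
raw = u'\xe1'.encode('latin-1')

# Decode once, with the real encoding of the data, then work in Unicode.
text = raw.decode('latin-1')

# Unicode strings classify characters from the Unicode database,
# so the answer does not depend on the current locale.
print(text.isalpha())  # True
```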
 
nuno

Jyotirmoy,

You are right. Thank you for your information.

I will follow your advice but it gets me into another problem with
string.maketrans/translate that I can't solve.
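
For reference, one possible way around the maketrans limitation, offered as a sketch rather than as what was eventually used here: string.maketrans builds a 256-byte table for 8-bit strings, whereas the translate() method of Unicode strings accepts a dictionary that maps code points to replacements. The table name to_plain and the sample word are made up for the example.

```python
# Hypothetical mapping: code point -> replacement text.
to_plain = {
    ord(u'\xe1'): u'a',  # a with acute
    ord(u'\xe9'): u'e',  # e with acute
    ord(u'\xe7'): u'c',  # c with cedilla
}

word = u'\xe1gua'
plain = word.translate(to_plain)
print(plain)  # agua
```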