Jorgen Grahn
[Long posting due to the examples, but pretty simple question.]
I'm sitting here with a Debian Linux 'Woody' system with the default Python
2.2 installation, and I want the re module to understand that
re.compile(r'\W+', re.LOCALE) should not match my national, accented
characters.
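To make the goal concrete, here is a minimal sketch of the behaviour I am after. It is written for a current Python 3 (where re.LOCALE is only accepted with bytes patterns), not for the 2.2 installation above. The helper name split_words_in_locale, the sample word and the locale name 'sv_SE.ISO8859-1' are made up for illustration, and the locale may not be installed on every system:

```python
import locale
import re


def split_words_in_locale(data, locale_name):
    """Split bytes on runs of non-word bytes, as judged by locale_name.

    Returns None if the requested locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query only: remember current setting
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
    except locale.Error:
        return None  # assumption: the caller treats "not installed" as no result
    try:
        # With re.LOCALE, \w and \W follow the C library's idea of
        # alphanumeric bytes in the current LC_CTYPE locale.
        return re.compile(br'\W+', re.LOCALE).split(data)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the previous setting


sample = b'sm\xf6rg\xe5s bord'  # Latin-1 bytes containing two accented letters
print(split_words_in_locale(sample, 'C'))
print(split_words_in_locale(sample, 'sv_SE.ISO8859-1'))
```

In the C locale the accented bytes count as non-word characters and split the word apart; in a Swedish Latin-1 locale I would expect them to be kept as part of the word.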
I don't quite understand how the locale module reasons about these things,
and Python doesn't seem to behave like other programs on my system. Bug or my
mistake? Here's my environment:
frailea> env |grep -e LC -e LANG
LC_MESSAGES=C
LC_TIME=C
LANG=sv_SE
LC_NUMERIC=C
LC_MONETARY=C
frailea> locale
LANG=sv_SE
LC_CTYPE="sv_SE"
LC_NUMERIC=C
LC_TIME=C
LC_COLLATE="sv_SE"
LC_MONETARY=C
LC_MESSAGES=C
LC_PAPER="sv_SE"
LC_NAME="sv_SE"
LC_ADDRESS="sv_SE"
LC_TELEPHONE="sv_SE"
LC_MEASUREMENT="sv_SE"
LC_IDENTIFICATION="sv_SE"
LC_ALL=
This seems to indicate that $LANG acts as a fallback when other variables
(e.g. LC_CTYPE) aren't defined, and that's also what the glibc setlocale(3)
man page says. It works well for me in general, too. However, consider this
tiny Python program:
frailea> cat foo
import locale
print locale.getlocale()
locale.setlocale(locale.LC_CTYPE)
print locale.getlocale()
When I paste it into an interactive Python session, the locale is already
set up correctly (which is what I suppose interactive mode /should/ do):
>>> import locale
>>> print locale.getlocale()
['sv_SE', 'ISO8859-1']
>>> locale.setlocale(locale.LC_CTYPE)
'sv_SE'
>>> print locale.getlocale()
['sv_SE', 'ISO8859-1']
When I run it as a script, though, the locale is not set up, and the
setlocale() call does not appear to fall back to $LANG as it's supposed
to(?), so my LC_CTYPE remains in the POSIX locale:
frailea> python foo
(None, None)
(None, None)
The corresponding program written in C works as expected:
frailea> cat foot.c
#include <stdio.h>
#include <locale.h>
int main(void) {
    printf("%s\n", setlocale(LC_CTYPE, 0));
    printf("%s\n", setlocale(LC_CTYPE, ""));
    printf("%s\n", setlocale(LC_CTYPE, 0));
    return 0;
}
frailea> ./foot
C
sv_SE
sv_SE
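For a like-for-like comparison, the C program can be mirrored in Python roughly as follows. This is only a sketch, written for a current Python 3 and not for the 2.2 installation above. The function name ctype_locale_steps is made up for illustration. As the locale module is documented, calling locale.setlocale() without a second argument only queries the current setting, so the empty string has to be passed explicitly to correspond to setlocale(LC_CTYPE, "") in C:

```python
import locale


def ctype_locale_steps():
    """Mirror the three setlocale() calls of the C program for LC_CTYPE."""
    # No second argument: query the current setting, like setlocale(LC_CTYPE, 0).
    before = locale.setlocale(locale.LC_CTYPE)
    try:
        # Empty string: pick the locale up from the environment
        # (LC_ALL, LC_CTYPE, LANG), like setlocale(LC_CTYPE, "").
        chosen = locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        chosen = None  # the environment names a locale that is not installed
    # Query again to see what is in effect now.
    after = locale.setlocale(locale.LC_CTYPE)
    return before, chosen, after


print(ctype_locale_steps())
```

Like the C program, this leaves LC_CTYPE changed for the rest of the process. The first value may differ from the C program's, since a newer interpreter may already have adjusted LC_CTYPE at startup.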
So, is this my fault or Python's? I realize I could just adapt and set
$LC_CTYPE explicitly in my environment, but I don't want to capitulate to a
Python bug, if that's what this is.
BR,
Jorgen