Martin v. Löwis wrote:
PEP 1 specifies that PEP authors need to collect feedback from the
community. As the author of PEP 3131, I'd like to encourage comments
to the PEP included below, either here (comp.lang.python), or to
(e-mail address removed)
In summary, this PEP proposes to allow non-ASCII letters as
identifiers in Python. If the PEP is accepted, the following
identifiers would also become valid as class, function, or
variable names: Löffelstiel, changé, ошибка, or 売り場
(hoping that the latter one means "counter").
I believe this PEP differs from other Py3k PEPs in that it really
requires feedback from people with different cultural background
to evaluate it fully - most other PEPs are culture-neutral.
So, please provide feedback, e.g. perhaps by answering these
questions:
- should non-ASCII identifiers be supported? why?
- would you use them if it was possible to do so? in what cases?
I strongly prefer to stay with the current standard, limited to ASCII for
identifiers.
Ideally, it would be nice to have variables like Greek letters for
some scientific variables, or for French people to use éèçà in names...
But... (I share the common objections):
* where are they on my keyboard, and how can I type them?
(I can see the French éèçà, but a US-layout keyboard doesn't have them; imagine
kanji or Greek)
* how do I spell this Cyrillic/kanji character?
* when there are very similar characters, how can I distinguish them?
(not to mention characters that look identical but have different Unicode
names)
* are the variables "amédé" and "amede" the same?
* it's an anti-KISS rule.
* I don't only write code, I read it too, and allowing such variation
in names makes code much less readable.
(unless I learn other scripts, which may not be a bad thing
in itself, but it's not the objective here).
* I've read "Restricting the language to ASCII-only identifiers does
not enforce comments and documentation to be English, or the identifiers
actually to be English words, so an additional policy is necessary,
anyway."
But even with comments in German or Spanish or Japanese, I can work out
what (well-written) code is doing with its data. It would be
very difficult with identifiers spanning all of Unicode.
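To make two of the objections above concrete (the "amédé" vs "amede" question and the look-alike characters), here is a small sketch using only the standard unicodedata module. It assumes identifiers are compared after NFKC normalization, which is what Python 3 does; the helper name same_identifier is made up for illustration.

```python
import unicodedata

def same_identifier(a: str, b: str) -> bool:
    """Compare two names after NFKC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)

# Normalization does not strip accents, so these stay two distinct names.
print(same_identifier("amédé", "amede"))      # False

# Precomposed é (U+00E9) versus e followed by a combining acute (U+0301):
# different code point sequences, but equal once normalized.
composed = "am\u00e9d\u00e9"
decomposed = "ame\u0301de\u0301"
print(composed == decomposed)                 # False
print(same_identifier(composed, decomposed))  # True

# Look-alikes: Latin "a" and Cyrillic "а" render almost identically,
# but they are different characters with different Unicode names.
for ch in ("a", "\u0430"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

So normalization would settle the "same character typed two ways" case, but it does nothing for the reader who has to tell a Latin letter from a Cyrillic one by eye.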
==> I wouldn't use them.
So, keep ASCII only.
Basic ASCII is the lowest common denominator, known and available
everywhere; it's known by all developers, who can identify these characters
correctly (though 1 vs I or O vs 0 can cause problems with poor
fonts).
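For a project that wants to stay ASCII-only for identifiers whatever the language allows, the rule could be checked mechanically. This is only a rough sketch using the standard tokenize module; the function name non_ascii_names and the sample source are invented for the example, and it assumes Python 3.7 or later for str.isascii.

```python
import io
import tokenize

def non_ascii_names(source: str):
    """Return (line, name) for each identifier token with non-ASCII characters."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Keywords and identifiers both arrive as NAME tokens;
        # comments and string literals have their own token types.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found

sample = "total = 1\nchangé = total + 1\n# commentaire accentué\n"
print(non_ascii_names(sample))  # [(2, 'changé')]
```

Comments and strings are left alone, so documentation can stay in any language while the names themselves remain typeable on every keyboard.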
Maybe make the default file encoding UTF-8 and make strings Unicode
strings by default (with an s"" prefix for basic strings, for example), but that
is another problem.
L.Pointal.