PEP 3131: Supporting Non-ASCII Identifiers

Ross Ridge · May 15, 2007

So, please provide feedback, e.g. perhaps by answering these
questions:
- should non-ASCII identifiers be supported? why?

I think the biggest argument against this PEP is how little similar
features are used in other languages and how poorly they are supported
by third party utilities. Your PEP gives very little thought to how
the change would affect the standard Python library. Are non-ASCII
identifiers going to be poorly supported in Python's own library and
utilities?

Ross Ridge

Guest · May 15, 2007

Stefan said:
You're not trying to suggest that writing code in a closed area project is a
bad habit, are you?

I think that the idea that you know today, with 100% certainty, that all
parts of your closed area project will stay closed forever is an
illusion and thus a bad idea, yes.

What would be bad about allowing a project to decide about the best and
clearest way to name identifiers?

That very same argument could be used to allow all sorts of "strange
stuff" in Python like gotos and worse. What would be bad about allowing
a project to decide about how to do flow control, especially if you
never get in touch with it?

And: if it's not a project you will ever get in touch with - what do
you care?

I just fear that I *will* get in touch with identifiers using non-ASCII
symbols if this PEP is implemented.

Carsten Haese · May 15, 2007

- should non-ASCII identifiers be supported? why?

Yes. I have wanted this for years. I am Chinese, and I teach programming
to some 12-year-old children. The biggest problem is that we cannot
use Chinese words for identifiers. As the program source gets
longer, they always lose track of the program logic.

English keywords and libraries are not the problem, because we only use
about 30 - 50 of these words for teaching programming ideas. They can
remember these words in one week. But for the names of variables or
functions, it is difficult to remember all the English words. For
example, when we are working with numbers, maybe these words: [odd, even,
prime, minus ...], when we are programming for drawing: [line, circle,
pixel, ...], when it's for GUI: [button, event, menu...]. There are
so many words that they cannot remember them all and use them to
explain their ideas.

That is a good point, but I'd like to ask out of curiosity, at what age
do children generally learn pinyin? (Assuming you speak Mandarin. If
not, replace pinyin with the name of whatever phonetic transliteration
is common in your region.) Granted, pinyin shoehorned into ASCII loses
its tone marks, but the result would still be more mnemonic than an
English word that the student has to learn.

Regards,

Guest · May 15, 2007

Thorsten said:
* René Fleschenberg (Tue, 15 May 2007 14:04:07 +0200)

No, because it's the /Standard/ Library to be used by everyone. And
the lowest common denominator is ASCII and English.

This makes the argument that this PEP would allow people to write
"Chinese only" Python code invalid (unless they do not use the stdlib).

Stefan Behnel · May 15, 2007

René Fleschenberg said:
I just fear that I *will* get in touch with identifiers using non-ASCII
symbols if this PEP is implemented.

That's easy to prevent: just keep your fingers from projects that work with
them and make sure that projects you start do not use them.

Stefan

Thorsten Kampe · May 15, 2007

* René Fleschenberg (Tue, 15 May 2007 14:50:41 +0200)

You are doing just the same. Your argument that encouraging code-sharing
is not a worthwhile goal is an ideological one, just as the opposite
argument is, too.

No, if you claim that something by itself is good and has to be
encouraged then you are obliged to prove or give arguments for that.

If I say that something is worth encouraging only in special cases
rather than in general, then I don't have to prove that or give
special arguments for it.

If you say there is a man in the moon and I say there isn't, then you
have to prove your point, not me.

(I do think that code sharing is very different from
sharing of material goods). That is why I do not think it makes a lot of
sense to argue about it. If you don't consider code sharing to be a
value of its own, then that is of course also not an argument against
this PEP. I just happen to have different beliefs.

Exactly. So whether this PEP encourages or discourages code sharing
(and I don't think it does either) has nothing to do with the value of
this PEP.

Thorsten Kampe · May 15, 2007

* René Fleschenberg (Tue, 15 May 2007 14:57:45 +0200)

I think that the idea that you know today, with 100% certainty, that all
parts of your closed area project will stay closed forever is an
illusion and thus a bad idea, yes.

That very same argument could be used to allow all sorts of "strange
stuff" in Python like gotos and worse. What would be bad about allowing
a project to decide about how to do flow control, especially if you
never get in touch with it?

GOTOs are not understandable. Identifiers in foreign languages are
perfectly understandable. Just not to you.

For coding problems better solutions (approaches) exist than using
GOTOs (procedural programming, modules). For identifier naming
problems it's not a better approach to stick to English or ASCII. It's
just a different approach.

Guest · May 15, 2007

Stefan said:
That's easy to prevent: just keep your fingers from projects that work with
them and make sure that projects you start do not use them.

You keep bringing up that argument that completely neglects reality. The
same argument can be used to justify anything else (including the
opposite of your position: Don't like the fact that Python does not
support non-ASCII identifiers? Pick another language!). Let's introduce
gotos and all other kinds of funny stuff -- after all, no one is forced
to work on a project that uses it!

Thorsten Kampe · May 15, 2007

* René Fleschenberg (Tue, 15 May 2007 15:01:42 +0200)

This makes the argument that this PEP would allow people to write
"Chinese only" Python code invalid (unless they do not use the stdlib).

It has nothing to do with that. It simply allows people to write their
own identifier names with their native character set without using
dodgy transcriptions (that might not even exist). There is no sense in
making people write "ueber" instead of "über". That's 20th century
thinking and archaic.
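
A minimal sketch of what this looks like in practice, assuming an interpreter that accepts non-ASCII identifiers as the PEP proposes (Python 3 does). The variable names are made up for illustration.

```python
# Illustrative names only; requires an interpreter that accepts
# non-ASCII identifiers as proposed in PEP 3131 (Python 3 does).
über = "native spelling"
ueber = "ASCII transcription"

# Both spellings are syntactically valid identifier names.
both_valid = "über".isidentifier() and "ueber".isidentifier()

# They are two distinct identifiers, not aliases of each other.
distinct = über != ueber
```

Note that the transcription is a separate name, so a project has to pick one spelling and stick to it.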

Marc 'BlackJack' Rintsch · May 15, 2007

René said:
As I have said, I don't need to be able to do that (model the
application in perfect English terms). It is better to model it in
non-perfect English terms than to model it in perfect German terms. Yes,
I do sometimes use a dictionary to look up the correct English term for
a domain-specific German word when programming. It is rarely necessary,
but when it is, I usually prefer to take that effort over writing a
mixture of German and English.

What about words that can't really be translated because they are not only
domain specific but some "code" within the organization the project is
written for? Wouldn't it be much easier for maintenance if the
specification, the code, and the users of the program use the same terms
for the same things or concepts instead of mixing this with some
artificial translations?

Maybe you don't need this. The field of programming is very broad and many
domains can be translated and make sense in an international context, but
e.g. software that should map the workflow of a local company with local
laws and regulations and internal "names" for things and concepts looks
strange in both, pure "english" and mixed local language and english. But
the latter is easier to map to the specifications and language of the end
users.

Ciao,
Marc 'BlackJack' Rintsch

Marc 'BlackJack' Rintsch · May 15, 2007

Now you are starting to troll?

I thought he was starting to argue like you. ;-)

Ciao,
Marc 'BlackJack' Rintsch

Guest · May 15, 2007

Thorsten said:
GOTOs are not understandable. Identifiers in foreign languages are
perfectly understandable. Just not to you.
For coding problems better solutions (approaches) exist than using
GOTOs (procedural programming, modules). For identifier naming
problems it's not a better approach to stick to English or ASCII. It's
just a different approach.

Just by stating your personal opinion that it is not a better but just
different approach, you won't convince anyone. There are important
arguments for why it actually *is* a better approach -- these have been
brought up in this thread many times now.

Guest · May 15, 2007

Marc said:
I thought he was starting to argue like you. ;-)

I did not argue that way. I doubted that non-ASCII identifiers have
proven substantial benefits for *anyone* (except very rare and special
cases). That is quite different from the with-statement.

Thorsten Kampe · May 15, 2007

* René Fleschenberg (Tue, 15 May 2007 15:14:20 +0200)

You keep bringing up that argument that completely neglects reality. The
same argument can be used to justify anything else (including the
opposite of your position: Don't like the fact that Python does not
support non-ASCII identifiers? Pick another language!). Let's introduce
gotos and all other kinds of funny stuff -- after all, no one is forced
to work on a project that uses it!

You are right, except that using the correct characters for words is
not a "funny thing". Using Polish diacritics (for example) for
identifier names just makes sense for a project that already uses
Polish comments and Polish names for their code. You will never get in
touch with that. Using the right charset for these Polish words
doesn't change a bit in your ability to debug or understand this code.

Thorsten Kampe · May 15, 2007

* René Fleschenberg (Tue, 15 May 2007 15:18:48 +0200)

Just by stating your personal opinion that it is not a better but just
different approach, you won't convince anyone. There are important
arguments for why it actually *is* a better approach -- these have been
brought up in this thread many times now.

Yeah. For code sharing for instance *chuckle*.

Guest · May 15, 2007

Thorsten said:
It has nothing to do with that. It simply allows people to write their
own identifier names with their native character set without using
dodgy transcriptions (that might not even exist). There is no sense in
making people write "ueber" instead of "über". That's 20th century
thinking and archaic.

"ueber" at least displays correctly on virtually any computer system in
use today, which for example makes it possible to read and process
tracebacks, something that might actually not only concern developers,
but also end-users.
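
To make this concern concrete, here is a rough sketch, assuming an interpreter with PEP 3131 support (Python 3) and a made-up function name `prüfe`. It shows that identifier names do end up in traceback text, and what a display limited to ASCII would have to fall back to.

```python
import traceback

def prüfe(wert):
    # Hypothetical function name, chosen only to contain a non-ASCII letter.
    raise ValueError(wert)

try:
    prüfe(1)
except ValueError:
    # The formatted traceback names the function that raised.
    text = traceback.format_exc()

# What a system restricted to ASCII could show instead of the real name:
# characters outside ASCII are replaced by backslash escapes.
ascii_view = text.encode("ascii", "backslashreplace").decode("ascii")
```

The escaped form keeps the traceback processable, but the name is no longer readable as written in the source.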

HYRY · May 15, 2007

That is a good point, but I'd like to ask out of curiosity, at what age
do children generally learn pinyin? (Assuming you speak Mandarin. If
not, replace pinyin with the name of whatever phonetic transliteration
is common in your region.) Granted, pinyin shoehorned into ASCII loses
its tone marks, but the result would still be more mnemonic than an
English word that the student has to learn.

Yes, we use Pinyin, and add a number to deal with tone marks. It is
better than English words, but for a Chinese reader, reading Pinyin is much
slower than reading HanZi.

Carsten Haese · May 15, 2007

I think the biggest argument against this PEP is how little similar
features are used in other languages

That observation is biased by your limited sample. You only see open
source code that chooses to restrict itself to ASCII and mostly English
identifiers to allow for easier code sharing. There could be millions of
kids in China learning C# in native Mandarin and you'd never know about
it.

and how poorly they are supported
by third party utilities. Your PEP gives very little thought to how
the change would affect the standard Python library. Are non-ASCII
identifiers going to be poorly supported in Python's own library and
utilities?

How would a choice of identifiers interact in any way with Python's
standard or third-party libraries? The only things that get passed
between an application and the libraries are objects that neither know
nor care what identifiers, if any, are attached to them.
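
A small sketch of this point, assuming an interpreter with PEP 3131 support (Python 3). The name `größen` is hypothetical, and `statistics.mean` is used only as a convenient standard-library function.

```python
import statistics

# Hypothetical non-ASCII name bound to an ordinary list object.
größen = [1.0, 2.0, 3.0]

# A second, ASCII name bound to the very same object.
sizes = größen

# The library function receives the object, not the name it was bound to,
# so both calls behave identically.
mean_native = statistics.mean(größen)
mean_ascii = statistics.mean(sizes)
```

This only covers passing objects across the boundary; it says nothing about tools that read source text, which is a separate question.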

Guest · May 15, 2007

Thorsten said:
No, if you claim that something by itself is good and has to be
encouraged then you are obliged to prove or give arguments for that.

That would be well outside the scope of this newsgroup, and if you
cannot see the reasons for this yourself, I am afraid that I won't be
able to convince you anyway.

Exactly. So whether this PEP encourages or discourages code sharing
(and I don't think it does either) has nothing to do with the value of
this PEP.

That completely depends on how you look at code-sharing. My impression
always was that the Python community in general does regard code-sharing
as A Good Thing. It is not as if we were talking about forcing people to
share code. Just about creating/keeping an environment that makes this
easily possible and encourages it. That *is* something I regard as good
and which therefore, for me, forms an argument against this PEP --
whether you share that opinion or not.

Guest · May 15, 2007

Thorsten said:
You are right, except that using the correct characters for words is
not a "funny thing". Using Polish diacritics (for example) for
identifier names just makes sense for a project that already uses
Polish comments and Polish names for their code. You will never get in
touch with that. Using the right charset for these Polish words
doesn't change a bit in your ability to debug or understand this code.

I will get in touch with it. I currently have applications installed on
my computer that come with my Linux distribution and that use code with
identifiers and comments written in a non-English language which I would
like to understand. This is tough enough as it is, the same code with
non-ASCII characters that maybe do not even display on my screen would
be even tougher to understand, let alone modify. It does make a
difference whether I can at least recognize and type in the characters or
not. The expectation that such code will only be used for "very closed"
projects that no one else will ever want to get in touch with is unrealistic.
