Using non-ascii symbols

Christoph Zwerschke · Jan 24, 2006

On the page http://wiki.python.org/moin/Python3.0Suggestions
I noticed an interesting suggestion:

"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readibility (and make language more accessible to
beginners).

This should be an evolution similar to the digraphe and trigraph
(digramme et trigramme) from C and C++ languages."

How do people on this group feel about this suggestion?

The symbols above are not even latin-1, you need utf-8.

(There are not many useful symbols in latin-1. Maybe one could use ×
for cartesian products...)

And while they are better readable, they are not better typable (at
least with most current editors).

Is this idea absurd or will one day our children think that restricting
to 7-bit ascii was absurd?

Are there similar attempts in other languages? I can only think of APL,
but that was a long time ago.

Once you open your mind for using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabian... Or you could
use Greek letters if you run out of one-letter variable names, just as
Mathematicians do. Would this be desirable or rather a horror scenario?
Opinions?

-- Christoph
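
A minimal sketch of how the two ideas in this post behave in Python 3, for readers arriving later. It relies on two facts: Python 3 accepts non-ASCII identifiers (PEP 3131), and ≤ is still not an operator. The helper name `compiles` is made up for illustration.

```python
def compiles(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Greek letters work as variable names in Python 3 (U+03C0 is pi).
greek_ok = compiles("\u03c0 = 3.14159\narea = \u03c0 * 2 ** 2")

# U+2264 (less-than-or-equal sign) is not a Python operator.
symbol_ok = compiles("1 \u2264 2")

print(greek_ok, symbol_ok)
```

So the variable-name half of the question was eventually answered with "yes", while the operator half was not.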

James Stroud · Jan 24, 2006

Christoph said:
On the page http://wiki.python.org/moin/Python3.0Suggestions
I noticed an interesting suggestion:

"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readibility (and make language more accessible to
beginners).

This should be an evolution similar to the digraphe and trigraph
(digramme et trigramme) from C and C++ languages."

How do people on this group feel about this suggestion?

The symbols above are not even latin-1, you need utf-8.

(There are not many useful symbols in latin-1. Maybe one could use ×
for cartesian products...)

And while they are better readable, they are not better typable (at
least with most current editors).

Is this idea absurd or will one day our children think that restricting
to 7-bit ascii was absurd?

Are there similar attempts in other languages? I can only think of APL,
but that was a long time ago.

Once you open your mind for using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabian... Or you could
use Greek letters if you run out of one-letter variable names, just as
Mathematicians do. Would this be desirable or rather a horror scenario?
Opinions?

-- Christoph

I can't find "≤, ≥, or ≠" on my keyboard.

James

Robert Kern · Jan 24, 2006

James said:
I can't find "≤, ≥, or ≠" on my keyboard.

Get a better keyboard? or OS?

On OS X,

≤ is Alt-,
≥ is Alt-.
≠ is Alt-=

Fewer keystrokes than <= or >= or !=.

--
Robert Kern

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Giovanni Bajo · Jan 24, 2006

Robert said:
Get a better keyboard? or OS?

On OS X,

? is Alt-,
? is Alt-.
? is Alt-=

Fewer keystrokes than <= or >= or !=.

Sure, but I can't find OS X listed as a prerequisite for using Python. So,
while I don't give a damn if those symbols are going to be supported by Python,
I don't think the plain ASCII version should be deprecated. There are too many
situations where it's still useful (coding across old terminals and whatnot).

Steven D'Aprano · Jan 24, 2006

On the page http://wiki.python.org/moin/Python3.0Suggestions
I noticed an interesting suggestion:

"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readibility (and make language more accessible to
beginners).

This should be an evolution similar to the digraphe and trigraph
(digramme et trigramme) from C and C++ languages."

How do people on this group feel about this suggestion?

The symbols above are not even latin-1, you need utf-8.

(There are not many useful symbols in latin-1. Maybe one could use ×
for cartesian products...)

Or for multiplication

And while they are better readable, they are not better typable (at
least with most current editors).

Is this idea absurd or will one day our children think that restricting
to 7-bit ascii was absurd?

Are there similar attempts in other languages? I can only think of APL,
but that was a long time ago.

My earliest programming was on (classic) Macintosh, which supported a
number of special characters including ≤ ≥ ≠ with the obvious
meanings. They were easy to enter too: the Mac keyboard had (has?) an
option key, and holding the option key down while typing a character would
enter a special character. E.g. option-s gave Greek sigma, option-p gave
pi, option-less-than gave ≤, and so forth. Much easier than trying to
memorize character codes.

I greatly miss the Mac's ease of entering special characters, and I miss
the ability to use proper mathematical symbols for (e.g.) pi, not equal,
and so forth.

Once you open your mind for using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabian... Or you could
use Greek letters if you run out of one-letter variable names, just as
Mathematicians do. Would this be desirable or rather a horror scenario?
Opinions?

I think the use of digraphs like != for not equal is a poor substitute for
a real not-equal symbol. I think the reliance on 7-bit ASCII is horrible
and primitive, but without easier, more intuitive ways of entering
non-ASCII characters, and better support for displaying non-ASCII
characters in the console, I can't see this suggestion going anywhere.

Claudio Grondi · Jan 24, 2006

Christoph said:
On the page http://wiki.python.org/moin/Python3.0Suggestions
I noticed an interesting suggestion:

"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readibility (and make language more accessible to
beginners).

This should be an evolution similar to the digraphe and trigraph
(digramme et trigramme) from C and C++ languages."

How do people on this group feel about this suggestion?

The symbols above are not even latin-1, you need utf-8.

(There are not many useful symbols in latin-1. Maybe one could use ×
for cartesian products...)

And while they are better readable, they are not better typable (at
least with most current editors).

Is this idea absurd or will one day our children think that restricting
to 7-bit ascii was absurd?

Are there similar attempts in other languages? I can only think of APL,
but that was a long time ago.

Once you open your mind for using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabian... Or you could
use Greek letters if you run out of one-letter variable names, just as
Mathematicians do. Would this be desirable or rather a horror scenario?
Opinions?

-- Christoph

One of the issues in Python is cross-platform portability. Limiting the
range of symbols to lower ASCII, with ASCII specified as the code table,
is a good deal here. I think that Unicode is not yet everywhere, and as
long as that is the case it does not make much sense to go for it in
Python.

Claudio

Ido Yehieli · Jan 24, 2006

Is this idea absurd or will one day our children think
Both... this idea will only become non-absurd when unicode becomes
as prevalent as ascii, i.e. unicode keyboards, universal support under
almost every application, and so on. Even if you can easily type it on
your macintosh, good luck using it while using said macintosh to ssh or
telnet to a remote server and trying to type unicode...

Juho Schultz · Jan 24, 2006

Christoph said:
"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readibility (and make language more accessible to
beginners).

I assume most python beginners know some other programming language, and
are familiar with the >= and friends. Those learning python as their
first programming language will benefit from learning the >= when they
learn a new language.

Unicode is not yet supported everywhere, so some editors/terminals might
display the suggested one-char operators as something else, effectively
"guess what operator I was thinking".

Fortran 90 allowed >, >= instead of .GT., .GE. of Fortran 77. But F90
uses ! as the comment symbol and therefore needs /= instead of != for
inequality. I guess just because they wanted to. However, it is one more
needless detail to remember. Same with the suggested operators.

Rocco Moretti · Jan 24, 2006

Giovanni said:
Robert Kern wrote:

Posting code to newsgroups might get harder too.

Robert Kern · Jan 24, 2006

Rocco Moretti wrote:

[James Stroud wrote:]

Posting code to newsgroups might get harder too.

His post made it through fine. Your newsreader messed it up.


Christoph Zwerschke · Jan 24, 2006

Giovanni said:
Sure, but I can't find OS X listed as a prerequisite for using Python. So,
while I don't give a damn if those symbols are going to be supported by Python,
I don't think the plain ASCII version should be deprecated. There are too many
situations where it's still useful (coding across old terminals and whatnot).

I think we should limit the discussion to allowing non-ascii symbols
*alternatively* to (combinations of) ascii chars. Nobody should be
forced to use them since not all editors/OSs and keyboards support it.

Think about moving from ASCII to LATIN-1 or UTF-8 as similar to moving
from ISO 646 to ASCII (http://en.wikipedia.org/wiki/C_trigraph).

I think it is a legitimate question, after UTF-8 becomes more and more
supported.

Editors could provide means to easily enter these symbols once
programming languages start supporting them: Automatic expansion of
ascii combinations, Alt-Combinations (like in OS-X) or popup menus with
all supported symbols.

-- Christoph

Robert Kern · Jan 24, 2006

Ido said:
Both... this idea will only become non-absurd when unicode becomes
as prevalent as ascii, i.e. unicode keyboards, universal support under
almost every application, and so on. Even if you can easily type it on
your macintosh, good luck using it while using said macintosh to ssh or
telnet to a remote server and trying to type unicode...

[~]$ ssh [email protected]
[email protected]'s password:
Linux rkernx2 2.6.12-9-amd64-generic #1 Mon Oct 10 13:27:39 BST 2005 x86_64
GNU/Linux

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
Last login: Mon Jan 9 12:40:28 2006 from 192.168.1.141
[~]$ cat > utf-8.txt
x + y ≥ z
[~]$ cat utf-8.txt
x + y ≥ z

Luck isn't involved.


Paul Watson · Jan 24, 2006

Christoph said:
On the page http://wiki.python.org/moin/Python3.0Suggestions
I noticed an interesting suggestion:

"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readibility (and make language more accessible to
beginners).

This should be an evolution similar to the digraphe and trigraph
(digramme et trigramme) from C and C++ languages."

How do people on this group feel about this suggestion?

The symbols above are not even latin-1, you need utf-8.

(There are not many useful symbols in latin-1. Maybe one could use ×
for cartesian products...)

And while they are better readable, they are not better typable (at
least with most current editors).

Is this idea absurd or will one day our children think that restricting
to 7-bit ascii was absurd?

Are there similar attempts in other languages? I can only think of APL,
but that was a long time ago.

Once you open your mind for using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabian... Or you could
use Greek letters if you run out of one-letter variable names, just as
Mathematicians do. Would this be desirable or rather a horror scenario?
Opinions?

-- Christoph

This will eventually happen in some form. The problem is that we are
still in the infancy of computing. We are using stones and chisels to
express logic. We are currently faced with text characters with which
to express intent. There will come a time when we are able to represent
a program in another form that is readily portable to many platforms.

In the meantime (probably 50 years or so), it would be advantageous to
use a universal character set for coding programs. To that end, the
input to the Python interpreter should be ISO-10646 or a subset such as
Unicode. If the # -*- coding: ? -*- line specifies something other than
ucs-4, then a preprocessor should convert it to ucs-4. When it is
desirable to avoid the overhead of the preprocessor, developers will
find a way to save source code in ucs-4 encoding.

The problem with using Unicode in utf-8 and utf-16 forms is that the
code will forever need to be written and forever execute additional
processing to handle the MBCS and MSCS (Multiple-Short Character Set)
situation.

Ok. Maybe computing is past infancy. But most development environments
are not much past toddler stage.
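
A small sketch of the size trade-off behind this argument, assuming Python's built-in codecs `utf-8`, `utf-16-le` and `utf-32-le` as stand-ins for the encoding forms mentioned (UTF-32 being the fixed-width form that corresponds to ucs-4). The helper name `encoded_sizes` is made up for illustration.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(ch: str) -> dict:
    """Byte length of one character in each of the three encoding forms."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


ascii_sizes = encoded_sizes("a")
symbol_sizes = encoded_sizes("\u2265")  # U+2265, greater-than-or-equal sign

print("a     ", ascii_sizes)
print("U+2265", symbol_sizes)
```

UTF-8 spends one byte on "a" and three on the symbol, while the 32-bit form spends four bytes on both: fixed width is paid for with larger files.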

Christoph Zwerschke · Jan 24, 2006

Juho said:
Fortran 90 allowed >, >= instead of .GT., .GE. of Fortran 77. But F90
uses ! as comment symbol and therefore need /= instead of != for
inequality. I guess just because they wanted. However, it is one more
needless detail to remember. Same with the suggested operators.

The point is that it is just *not* the same. The suggested operators are
universal symbols (unicode). Nobody would use ≠ as a comment sign. No
need to remember was it .NE. or -ne or <> or != or /= ...

There is also this old dispute of using "=" for both the assignment
operator and equality and how it can confuse newcomers and cause errors.
A consistent use of unicode could solve this problem:

a ← b # Assignment (now "a = b" in Python, a := b in Pascal)
a = b # Equality (now "a == b" in Python, a = b in Pascal)
a ≡ b # Identity (now "a is b" in Python, @a = @b in Pascal)
a ≈ b # Approximately equal (may be interesting for floats)

(I know this goes one step further as it is incompatible to the existing
use of the = sign in Python).

Another aspect: Supporting such symbols would also be in accord with
Python's trait of being "executable pseudo code."

-- Christoph
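
A minimal sketch of what an "approximately equal" comparison for floats looks like without a dedicated symbol, assuming the standard-library function `math.isclose` (available since Python 3.5) as the closest existing equivalent. The variable names are made up for illustration.

```python
import math

a = 0.1 + 0.2
b = 0.3

# Exact comparison fails because neither value is exactly representable.
exactly_equal = (a == b)

# isclose compares within a relative tolerance (default rel_tol=1e-09).
approx_equal = math.isclose(a, b)

print(exactly_equal, approx_equal)
```

Note that any such comparison needs a tolerance, which a bare infix symbol would have to pick implicitly.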

Dave Hansen · Jan 24, 2006

On Tue, 24 Jan 2006 16:33:16 +0200 in comp.lang.python, Juho Schultz

[...]

Fortran 90 allowed >, >= instead of .GT., .GE. of Fortran 77. But F90
uses ! as comment symbol and therefore need /= instead of != for
inequality. I guess just because they wanted. However, it is one more
needless detail to remember. Same with the suggested operators.

C uses ! as a unary logical "not" operator, so != for "not equal" just
seems to follow, um, logically.

Pascal used <>, which intuitively (to me, anyway ;-) read "less than
or greater than," i.e., "not equal." Perl programmers might see a
spaceship.

Modula-2 used # for "not equal." I guess that wouldn't work well in
Python...

Regards,
-=Dave

Dave Hansen · Jan 24, 2006

On Tue, 24 Jan 2006 04:09:00 +0100 in comp.lang.python, Christoph

[...]

Once you open your mind for using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabian... Or you could
use Greek letters if you run out of one-letter variable names, just as
Mathematicians do. Would this be desirable or rather a horror scenario?

The latter, IMHO. Especially variable names. Consider i vs. ì vs. í
vs. î vs. ï vs. ...

Regards,
-=Dave
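
A short sketch of why these look-alikes matter, assuming the standard `unicodedata` module. Each of the five characters is a separate code point, and the NFKC normalization that Python 3 applies to identifiers does not fold the accented forms into plain "i".

```python
import unicodedata

# Plain i followed by the grave, acute, circumflex and diaeresis variants.
lookalikes = ["i", "\u00ec", "\u00ed", "\u00ee", "\u00ef"]

for ch in lookalikes:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# All five are distinct characters, so they would be five distinct names.
distinct_count = len(set(lookalikes))

# Normalization leaves the accent in place rather than mapping to "i".
still_accented = unicodedata.normalize("NFKC", "\u00ec") != "i"
```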

Rocco Moretti · Jan 24, 2006

Robert said:
Rocco Moretti wrote:

[James Stroud wrote:]

Posting code to newsgroups might get harder too.

His post made it through fine. Your newsreader messed it up.

I'm not exactly sure what happened - I can see the three characters
just fine in your (Robert's) and the original (Christoph's) post. In
Giovanni's post, they're rendered as question marks.

My point still stands: _somewhere_ along the way the rendering got messed
up for _some_ people - something that wouldn't have happened with the
<=, >= and != digraphs.

(FWIW, my newsreader is Thunderbird 1.0.6.)
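
A sketch of two ways such rendering damage can arise, assuming the symbol travels as UTF-8 and the receiving side either decodes it with a Windows-1252 table or forces it into ASCII. Which of these actually happened to the posts in this thread is not known; the code only illustrates the mechanisms.

```python
symbol = "\u2265"                     # greater-than-or-equal sign
utf8_bytes = symbol.encode("utf-8")   # three bytes: e2 89 a5

# Decoding UTF-8 bytes with the wrong table yields three unrelated characters.
garbled = utf8_bytes.decode("cp1252")

# Forcing the symbol into ASCII replaces it with a question mark.
lossy = symbol.encode("ascii", errors="replace").decode("ascii")

# The wrong-table case can be undone; the question mark cannot.
recovered = garbled.encode("cp1252").decode("utf-8")

print(repr(garbled), repr(lossy), recovered == symbol)
```

The ASCII digraphs survive both paths unchanged, which is the point being made above.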

Claudio Grondi · Jan 24, 2006

Christoph said:
The point is that it is just *not* the same. The suggested operators are
universal symbols (unicode). Nobody would use ≠ as a comment sign. No
need to remember was it .NE. or -ne or <> or != or /= ...

There is also this old dispute of using "=" for both the assignment
operator and equality and how it can confuse newcomers and cause errors.
A consequent use of unicode could solve this problem:

Being involved in the discussion about assignment and looking for new
terms which do not cause confusion when explaining what assignment does,
this proposal seems to be a kind of solution:

a ← b # Assignment (now "a = b" in Python, a := b in Pascal)

^-- this seems to me to be still open for further proposals and
discussion. There is no symbol coming to my mind, but I would be glad if
it would express, that 'a' becomes a reference to a Python object being
currently referred by the identifier 'b' (maybe some kind of ...)

a = b # Equality (now "a == b" in Python, a = b in Pascal)
a ≡ b # Identity (now "a is b" in Python, @a = @b in Pascal)
a ≈ b # Approximately equal (may be interesting for floats)

^-- these three seem to me to be obvious and don't need to be
further discussed (only implemented when the time for such things comes).

Claudio

Christoph Zwerschke · Jan 24, 2006

Rocco said:
My point still stands: _somewhere_ along the way the rendering got messed
up for _some_ people - something that wouldn't have happened with the
<=, >= and != digraphs.

Yes, but Python is already a bit handicapped concerning posting code
anyway because of its significant whitespace. Also, I believe once
Python will support this, the editors will allow converting "digraphs"
<=, >= and != to symbols back and forth, just as all editors learned to
convert tabs to spaces back and forth... And newsreaders and mailers are
also improving. Some years ago, I used to write all German Umlauts as
digraphs because you could never be sure how they arrived. Nowadays, I'm
using Umlauts as something very normal.

-- Christoph
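
A deliberately naive sketch of the back-and-forth conversion an editor could offer, assuming plain text substitution. The names `TO_SYMBOL`, `to_symbols` and `to_digraphs` are hypothetical, and this is not how any particular editor does it.

```python
# Mapping between the ASCII digraphs and the proposed one-character symbols.
TO_SYMBOL = {"<=": "\u2264", ">=": "\u2265", "!=": "\u2260"}
TO_DIGRAPH = {symbol: digraph for digraph, symbol in TO_SYMBOL.items()}


def to_symbols(text: str) -> str:
    """Replace each ASCII digraph with its one-character symbol."""
    for digraph, symbol in TO_SYMBOL.items():
        text = text.replace(digraph, symbol)
    return text


def to_digraphs(text: str) -> str:
    """Replace each symbol with its ASCII digraph."""
    for symbol, digraph in TO_DIGRAPH.items():
        text = text.replace(symbol, digraph)
    return text


source = "if a <= b and c != d: pass"
shown = to_symbols(source)      # what the editor would display
saved = to_digraphs(shown)      # what would be written back to disk
```

Because it works on raw text, this version also rewrites digraphs inside string literals and comments, and it mangles the display of `<<=` and `>>=`. A real tool would have to work on tokens instead.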

Fredrik Lundh · Jan 24, 2006

Christoph said:
Yes, but Python is already a bit handicapped concerning posting code
anyway because of its significant whitespace. Also, I believe once
Python will support this, the editors will allow converting "digraphs"
<=, >= and != to symbols back and forth

umm. if you have an editor that can convert things back and forth, you
don't really need language support for "digraphs"...

</F>
