Problems of Symbol Congestion in Computer Languages


Xah Lee

On 2011-02-16, Xah Lee  wrote:
│ The vast majority of computer languages use ASCII as their character set.
│ This means they jam a multitude of operators into about 20 symbols.
│ Often, a symbol has multiple meanings depending on context.

On 2011-02-17, rantingrick wrote:
…

On 2011-02-17, Cthun wrote:
│ And you omitted the #1 most serious objection to Xah's proposal,
│ rantingrick, which is that to implement it would require unrealistic
│ things such as replacing every 101-key keyboard with 10001-key keyboards
│ and training everyone to use them. Xah would have us all replace our
│ workstations with machines that resemble pipe organs, rantingrick, or
│ perhaps the cockpits of the three surviving Space Shuttles. No doubt
│ they'd be enormously expensive, as well as much more difficult to learn
│ to use, rantingrick.

The keyboard shouldn't be a problem.

Look at APL users.
http://en.wikipedia.org/wiki/APL_(programming_language)
they are happy campers.

Look at Mathematica, which has supported a lot of math symbols since v3
(~1997), before Unicode became popular.
see:
〈How Mathematica does Unicode?〉
http://xahlee.org/math/mathematica_unicode.html

Word processors also automatically do symbols such as “curly quotes”,
the trademark sign ™, copyright sign ©, arrow →, bullet •, ellipsis …,
etc., and the number of people who produce documents with these chars
is probably greater than the number of programmers.

In Emacs, I recently also wrote a mode that lets you easily input a few
hundred Unicode chars.
〈Emacs Math Symbols Input Mode (xmsi-mode)〉
http://xahlee.org/emacs/xmsi-math-symbols-input.html

The essence is that you just need an input system.

Look at Chinese, Japanese, Korean, or Arabic users. They happily type
without requiring that every symbol they use have a corresponding key
on the keyboard. For some languages, such as Chinese, that would be
impossible or impractical.

When an input system is well designed, it can actually be more
efficient than keyboard combinations for typing special symbols (such
as Mac OS X's Option key or Windows's AltGr), because an input system
can be context-based: it looks at adjacent text to guess what you want.

For example, when you type >= in Python, the text editor can
automatically change it to ≥ (when it detects that it's appropriate,
e.g. there's an “if” nearby).
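Here is a rough Python sketch of the idea (just an illustration; the
trigger check and the replacement table below are made-up examples, not
a real implementation):

# sketch of a context-based symbol substitution: only rewrite lines that
# look like a comparison context (here, simply: they start a conditional)
REPLACEMENTS = {">=": "≥", "<=": "≤", "!=": "≠"}

def maybe_substitute(line):
    if not line.lstrip().startswith(("if ", "elif ", "while ")):
        return line                      # not enough context; leave it alone
    for ascii_op, symbol in REPLACEMENTS.items():
        line = line.replace(ascii_op, symbol)
    return line

print(maybe_substitute("if x >= 0 and y != 3:"))   # if x ≥ 0 and y ≠ 3:
print(maybe_substitute("s = '>='"))                # unchanged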

Chinese phonetic input systems use this extensively. Abbrev systems in
word processors and Emacs are also a form of this. I wrote some
thoughts about this here:

〈Designing a Math Symbols Input System〉
http://xahlee.org/comp/design_math_symbol_input.html

Xah Lee
 

Xah Lee

Chris Jones wrote:
«.. from a quite different perspective it may be worth noting that
practically all programming languages (not to mention the attached
documentation) are based on the English language. And interestingly
enough, most any software of note appears to have come out of cultures
where English is either the native language, or where the native
language is relatively close to English.. Northern Europe mostly..
and not to some small extent, countries where English is well-
established as a universal second language, such as India. Always
struck me as odd that a country like Japan for instance, with all its
achievements in the industrial realm, never came up with one single
major piece of software.»

Btw, English is one of India's two official languages. It's used
between Indians, and I think it's rare or non-existent for a college
in India to use a local dialect. (This is second-hand knowledge; I
learned it from Wikipedia and from experience with Indian co-workers.)

I also wondered why Japan doesn't seem to have created major software
or OSes. Though, Ruby was invented in Japan. I do think they have some
OSes, just not that popular... I think for special-purpose OSes they
have quite a lot, from Mitsubishi, NEC, etc., in their huge robotics
industry among others. (Again, this is all second-hand knowledge.)

... I recall having read about non-English computer languages that
appeared recently...

Xah Lee
 

Chris Jones

│ I think you are badly misinformed.
│
│ The most widespread operating system in the world is not Windows. It's
│ something you've probably never heard of, from Japan, called TRON.
│
│ http://www.linuxinsider.com/story/31855.html
│ http://web-japan.org/trends/science/sci030522.html
│
│ Japan had an ambitious, but sadly failed, "Fifth Generation Computing"
│ project:
│
│ http://en.wikipedia.org/wiki/Fifth_generation_computer
│ http://vanemden.wordpress.com/2010/08/21/who-killed-prolog/
│
│ They did good work, but unfortunately were ahead of their time and the
│ project ended in failure.
│
│ Japan virtually *owns* the video game market. Yes, yes, Americans publish
│ a few high-profile first-person shooters. For every one of them, there's
│ about a thousand Japanese games that never leave the country.
│
│ There's no shortage of programming languages which have come out of
│ Japan:
│
│ http://hopl.murdoch.edu.au/findlanguages.prx?id=jp&which=ByCountry
│ http://no-sword.jp/blog/2006/12/programming-without-ascii.html
│
│ The one you're most likely to have used or at least know of is Ruby.

Food for thought.. Thanks much for the links..!

cj
 

rantingrick

│ No, evolution is the pursuit of something just barely better than what
│ the other guy has.

You fail to see from a larger perspective. You still see through the
eyes of part and not a whole. Each cog that is part of a machine is
just a means to an end. You are attempting to elevate one cog above
all other cogs, heck, you want to elevate one cog above the machine.
You are nothing, and until you accept your nothingness, only then will
you understand your place in the greater game.
│ Evolution is about gaining an edge, not gaining perfection.

Evolution is about one cog gaining an edge over another, yes. However
the system itself moves toward perfection at the expense of any and
all cogs.
│ Perfect is the enemy of good.

No. Perfect is the enemy of YOU. You are not able to be perfect,
therefore you claim that perfection is evil to justify your meaningless
existence. And who are YOU to weigh good and evil? What do YOU know
about the Universe that gives you these powers of judgment? Do you
think with your measly 60-100 years of lifetime you can contemplate
the meaning of good and evil in a Universe that has existed for eons?
Get REAL! We only see and know a small fraction of what the Universe
really is. A huge portion of the Universe cannot even be accounted
for. And you parade around like some all-knowing being with all the
answers and full of selfishness and vanity. Ha!

If perfection is evil then what is the pursuit of perfection: AKA:
gaining an edge?
 

rantingrick

Do you think that just because something has a negative impact on you
(or your existence) that the *something* is then evil? Take an animal,
for instance: we kill animals for sustenance. The act of killing the
animal is negative from the viewpoint of the animal; however, it does
not make the killing evil.

Of course, if the animal could speak I am sure it would tell you that
you are evil for ending its life. However, the animal would be speaking
from a selfish viewpoint. The animal fails to see that the human is
more important than itself in the evolutionary chain. And by consuming
the flesh of the animal the human can continue to evolve more
knowledge. However, the animal's life was not in vain, for its flesh
contributed to the life of an intelligent agent who then was able to
further the knowledge base of evolution far beyond what the animal
could have ever achieved!

Likewise *we* as intelligent agents are the tools of an intelligent
evolution. When the human becomes insufficient (from the rise of AI)
then the human will become the consumable. You are like the animal,
not understanding your place in the universe. You fail to see the
Universe from OUTSIDE your *own* selfish interpretation. You cannot
wield the meanings of good and evil from a selfish and naive
interpretation of the Universe. You must look much deeper.

"You are not *something* unless you first realize you are *nothing*."
 

Cthun

On 2011-02-17, Cthun wrote:
│ And you omitted the #1 most serious objection to Xah's proposal,
│ rantingrick, which is that to implement it would require unrealistic
│ things such as replacing every 101-key keyboard with 10001-key keyboards

What does your classic unsubstantiated and erroneous claim have to do
with Lisp, Lee?
 

Stephen Hansen

│ O. that's what you call that long-winded nonsense? Education? You must
│ live in America. Can I hazard a guess that your universal language might
│ be English? Has it not ever occurred to you that people take pride in
│ their language? It is part of their culture. And yet you rant on about
│ selfishness?

This is an old rant; there's nothing new to it. Rick's racist and
imperialistic anti-Unicode rants have all been fully hashed out months
if not years ago, Tyler. There's really nothing more to say about it.

He doesn't get it.

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/


 

Westley Martínez

On 2011-02-16, Xah Lee wrote:
│ The vast majority of computer languages use ASCII as their character set.
│ This means they jam a multitude of operators into about 20 symbols.
│ Often, a symbol has multiple meanings depending on context.

Xah Lee wrote:
│ The keyboard shouldn't be a problem.
│
│ Look at APL users.
│ http://en.wikipedia.org/wiki/APL_(programming_language)
│ they are happy campers.
│
│ [...]
More people despise APL than like it.

Allowing non-ASCII characters as operators is a silly idea, simply
because if xorg breaks, which it's very likely to do with the current
video drivers, I'm screwed. Not only does the Linux virtual terminal not
support displaying these special characters, but there's also no way of
inputting them. On top of that, these special characters require more
bytes to store than ASCII text, which would bloat source files
unnecessarily.

You say we have symbol congestion, but in reality we have our own symbol
bloat. Japanese has more or less three punctuation marks, while English
has perhaps more than the alphabet! The fundamental point here is that
using non-ASCII operators violates the Zen of Python. It violates
"Simple is better than complex," as well as "There should be one-- and
preferably only one --obvious way to do it."
 

Steven D'Aprano

│ For example, when you type >= in Python, the text editor can
│ automatically change it to ≥ (when it detects that it's appropriate,
│ e.g. there's an “if” nearby).

You can't rely on the presence of an `if`.

flag = x >= y
value = lookup[x >= y]
filter(lambda x: x >= y, sequence)

Not that you need to. There are no circumstances in Python where the
meaning of >= is changed by an `if` statement.


Followups set to comp.lang.python.
 

Steven D'Aprano

│ Allowing non-ASCII characters as operators is a silly idea, simply
│ because if xorg breaks, which it's very likely to do with the current
│ video drivers, I'm screwed.

And if your hard drive crashes, you're screwed too. Why stop at "xorg
breaks"?

Besides, Windows and MacOS users will be scratching their head asking
"xorg? Why should I care about xorg?"

Programming languages are perfectly allowed to rely on the presence of a
working environment. I don't think general purpose programming languages
should be designed with reliability in the presence of a broken
environment in mind.

Given the use-cases people put Python to, it is important for the
language to *run* without a GUI environment. It's also important (but
less so) to allow people to read and/or write source code without a GUI,
which means continuing to support the pure-ASCII syntax that Python
already supports. But Python already supports non-ASCII source files,
with an optional encoding line in the first two lines of the file, so it
is already possible to write Python code that runs without X but can't be
easily edited without a modern, Unicode-aware editor.

│ Not only does the Linux virtual terminal not
│ support displaying these special characters, but there's also no way of
│ inputting them.

That's a limitation of the Linux virtual terminal. In 1984 I used to use
a Macintosh which was perfectly capable of displaying and inputting non-
ASCII characters with a couple of key presses. Now that we're nearly a
quarter of the way into 2011, I'm using a Linux PC that makes entering a
degree sign or a pound sign a major undertaking, if it's even possible at
all. It's well past time for Linux to catch up with the 1980s.

│ On top of that, these special characters require more
│ bytes to store than ASCII text, which would bloat source files
│ unnecessarily.

Oh come on now, now you're just being silly. "Bloat source files"? From a
handful of double-byte characters? Cry me a river!

This is truly a trivial worry:


The source code to the decimal module in Python 3.1 is 205470 bytes in
size. It contains 63 instances of ">=" and 62 of "<=". Let's suppose
every one of those were changed to ≥ or ≤ characters. This would "bloat"
the file by 0.06%.

Oh the humanity!!! How will my 2TB hard drive cope?!?!
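A quick sketch of that arithmetic, using the figures above (UTF-8
encodes ≥ and ≤ as 3 bytes apiece versus 2 bytes for the digraphs):

# each ">=" or "<=" replaced by "≥"/"≤" costs one extra byte in UTF-8
size = 205470          # bytes, decimal.py in Python 3.1
digraphs = 63 + 62     # occurrences of ">=" and "<="
extra = digraphs * (len("≥".encode("utf-8")) - len(">="))
print(extra, "extra bytes =", round(100.0 * extra / size, 2), "% growth")
# prints: 125 extra bytes = 0.06 % growth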

│ You say we have symbol congestion, but in reality we have our own symbol
│ bloat. Japanese has more or less three punctuation marks, while English
│ has perhaps more than the alphabet! The fundamental point here is that
│ using non-ASCII operators violates the Zen of Python. It violates
│ "Simple is better than complex," as well as "There should be one-- and
│ preferably only one --obvious way to do it."

Define "simple" and "complex" in this context.

It seems to me that single character symbols such as ≥ are simpler than
digraphs such as >=, simply because the parser knows what the symbol is
after reading a single character. It doesn't have to read on to tell
whether you meant > or >=.
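A toy scanner fragment makes the difference concrete (illustration
only, not how CPython's tokenizer actually works):

# a single-character operator is decided immediately; ">" needs one
# character of lookahead to distinguish ">" from ">="
def next_op(text, i):
    ch = text[i]
    if ch == "≥":                                   # decided at once
        return "GE", i + 1
    if ch == ">":                                   # must peek ahead
        if i + 1 < len(text) and text[i + 1] == "=":
            return "GE", i + 2
        return "GT", i + 1
    raise ValueError("not an operator")

print(next_op("≥ 0", 0))    # ('GE', 1)
print(next_op(">= 0", 0))   # ('GE', 2)
print(next_op("> 0", 0))    # ('GT', 1)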

You can add complexity to one part of the language (hash tables are more
complex than arrays) in order to simplify another part (dict lookup is
simpler and faster than managing your own data structure in a list).
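For example, a small sketch of that trade-off (the names here are just
illustrative):

# the dict does more work internally (hashing), but using it is simpler
# and faster than scanning a hand-managed list of (key, value) pairs
pairs = [("alpha", 1), ("beta", 2), ("gamma", 3)]
value = next(v for k, v in pairs if k == "beta")   # O(n) manual lookup

table = dict(pairs)
value = table["beta"]                              # O(1) dict lookup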

And as for one obvious way, there's nothing obvious about using a | b for
set union. Why not a + b? The mathematician in me wants to spell set
union and intersection as a ⋃ b ⋂ c, which is the obvious way to me (even
if my lousy editor makes it a PITA to *enter* the symbols).

The lack of good symbols for operators in ASCII is a real problem. Other
languages have solved it in various ways, sometimes by using digraphs (or
higher-order symbols), and sometimes by using Unicode (or some platform/
language specific equivalent). I think that given the poor support for
Unicode in the general tools we use, the use of non-ASCII symbols will
have to wait until Python4. Hopefully by 2020 input methods will have
improved, and maybe even xorg be replaced by something less sucky.

I think that the push for better and more operators will have to come
from the Numpy community. Further reading:


http://mail.python.org/pipermail/python-dev/2008-November/083493.html
 

Westley Martínez

│ And if your hard drive crashes, you're screwed too. Why stop at "xorg
│ breaks"?

Because I can still edit text files in the terminal.

I guess you could manually control the magnet in the hard-drive if it
failed but that'd be horrifically tedious.

│ Besides, Windows and MacOS users will be scratching their head asking
│ "xorg? Why should I care about xorg?"

Why should I care if my programs run on Windows and Mac? Because I'm a
nice guy I guess....

│ Programming languages are perfectly allowed to rely on the presence of a
│ working environment. I don't think general purpose programming languages
│ should be designed with reliability in the presence of a broken
│ environment in mind.
│
│ Given the use-cases people put Python to, it is important for the
│ language to *run* without a GUI environment. It's also important (but
│ less so) to allow people to read and/or write source code without a GUI,
│ which means continuing to support the pure-ASCII syntax that Python
│ already supports. But Python already supports non-ASCII source files,
│ with an optional encoding line in the first two lines of the file, so it
│ is already possible to write Python code that runs without X but can't be
│ easily edited without a modern, Unicode-aware editor.
│
│ That's a limitation of the Linux virtual terminal. In 1984 I used to use
│ a Macintosh which was perfectly capable of displaying and inputting non-
│ ASCII characters with a couple of key presses. Now that we're nearly a
│ quarter of the way into 2011, I'm using a Linux PC that makes entering a
│ degree sign or a pound sign a major undertaking, if it's even possible at
│ all. It's well past time for Linux to catch up with the 1980s.

I feel it's unnecessary for Linux to "catch up" simply because we have
no need for these special characters! When I read Python code, I only
see text from Latin-1, which is easy to input and every *decent* font
supports it. When I read C code, I only see text from Latin-1. When I
read code from just about everything else that's plain text, I only see
text from Latin-1. Even Latex, which is designed for typesetting
mathematical formulas, only allows ASCII in its input. Languages that
accept non-ASCII input have always been somewhat esoteric. There's
nothing wrong with being different, but there is something wrong in
being so different that you're causing problems, or at least speed
bumps, for particular users.

│ Oh come on now, now you're just being silly. "Bloat source files"? From a
│ handful of double-byte characters? Cry me a river!
│
│ This is truly a trivial worry:
│
│ The source code to the decimal module in Python 3.1 is 205470 bytes in
│ size. It contains 63 instances of ">=" and 62 of "<=". Let's suppose
│ every one of those were changed to ≥ or ≤ characters. This would "bloat"
│ the file by 0.06%.
│
│ Oh the humanity!!! How will my 2TB hard drive cope?!?!

A byte saved is a byte earned. What about embedded systems trying to
conserve as many resources as possible?

Define "simple" and "complex" in this context.

It seems to me that single character symbols such as ≥ are simpler than
digraphs such as >=, simply because the parser knows what the symbol is
after reading a single character. It doesn't have to read on to tell
whether you meant > or >=.

You can add complexity to one part of the language (hash tables are more
complex than arrays) in order to simplify another part (dict lookup is
simpler and faster than managing your own data structure in a list).
I believe dealing with ASCII is simpler than dealing with Unicode, for
reasons on both the developer's and user's side.

│ And as for one obvious way, there's nothing obvious about using a | b for
│ set union. Why not a + b? The mathematician in me wants to spell set
│ union and intersection as a ⋃ b ⋂ c, which is the obvious way to me (even
│ if my lousy editor makes it a PITA to *enter* the symbols).

Not all programmers are mathematicians (in fact I'd say most aren't). I
know what those symbols mean, but some people might think "a u b n c ...
what?" | actually makes sense because it relates to bitwise OR in which
bits are turned on. Here's an example just for context:

0b01010101 | 0b10101010 == 0b11111111          # True
{1, 2, 3} | {4, 5, 6} == {1, 2, 3, 4, 5, 6}    # True

For me, someone who is deeply familiar with bitwise operations but not
very familiar with sets, I found the set syntax to be quite easy to
understand.

│ The lack of good symbols for operators in ASCII is a real problem. Other
│ languages have solved it in various ways, sometimes by using digraphs (or
│ higher-order symbols), and sometimes by using Unicode (or some platform/
│ language specific equivalent). I think that given the poor support for
│ Unicode in the general tools we use, the use of non-ASCII symbols will
│ have to wait until Python4. Hopefully by 2020 input methods will have
│ improved, and maybe even xorg be replaced by something less sucky.
│
│ I think that the push for better and more operators will have to come
│ from the Numpy community. Further reading:
│
│ http://mail.python.org/pipermail/python-dev/2008-November/083493.html

You have provided me with some well thought out arguments and have
stimulated my young programmer's mind, but I think we're coming from
different angles. You seem to come from a more math-minded, idealist
angle, while I come from a more practical angle. Being a person who has
had to deal with the í in my last name and Japanese text on a variety of
platforms, I've found the current methods of non-ascii input to be
largely platform-dependent and---for lack of a better word---crappy,
i.e. not suitable for a 'wide-audience' language like Python.
 

Paul Rubin

Westley Martínez said:
│ When I read Python code, I only
│ see text from Latin-1, which is easy to input and every *decent* font
│ supports it. When I read C code, I only see text from Latin-1. When I
│ read code from just about everything else that's plain text, I only see
│ text from Latin-1. Even Latex, which is designed for typesetting
│ mathematical formulas, only allows ASCII in its input. Languages that
│ accept non-ASCII input have always been somewhat esoteric.

Maybe we'll see more of them as time goes by. C, Python, and Latex all
predate Unicode by a number of years. If Latex were written today it
would probably accept Unicode for math symbols, accented and non-Latin
characters, etc.
 

Steven D'Aprano

│ Why should I care if my programs run on Windows and Mac? Because I'm a
│ nice guy I guess....

Python is a programming language that is operating system independent,
and not just a Linux tool. So you might not care about your Python
programs running on Windows, but believe me, the Python core developers
care about Python running on Windows and Mac OS. (Even if sometimes their
lack of resources make Windows and Mac somewhat second-class citizens.)

│ I feel it's unnecessary for Linux to "catch up" simply because we have
│ no need for these special characters!

Given that your name is Westley Martínez, that's an astonishing claim!
How do you even write your name in your own source code???

Besides, speak for yourself, not for "we". I have need for them.

│ When I read Python code, I only
│ see text from Latin-1, which is easy to input

Hmmm. I wish I knew an easy way to input it. All the solutions I've come
across are rubbish. How do you enter (say) í at the command line of an
xterm?

But in any case, ASCII != Latin-1, so you're already using more than
ASCII characters.

│ Languages that
│ accept non-ASCII input have always been somewhat esoteric.

Then I guess Python is esoteric, because with source code encodings it
supports non-ASCII literals and even variables:

[steve@sylar ~]$ cat encoded.py
# -*- coding: utf-8 -*-
résumé = "Some text here..."
print(résumé)

[steve@sylar ~]$ python3.1 encoded.py
Some text here...


[...]
│ A byte saved is a byte earned. What about embedded systems trying to
│ conserve as many resources as possible?

Then they don't have to use multi-byte characters, just like they can
leave out comments, and .pyo files, and use `ed` for their standard text
editor instead of something bloated like vi or emacs.

[...]
│ I believe dealing with ASCII is simpler than dealing with Unicode, for
│ reasons on both the developer's and user's side.

Really? Well, I suppose if you want to define "you can't do this AT ALL"
as "simpler", then, yes, ASCII is simpler.

Using pure-ASCII means I am forced to write extra code because there
aren't enough operators to be useful, e.g. element-wise addition versus
concatenation. It means I'm forced to spell out symbols in full, like
"British pound" instead of £, and use legally dubious work-arounds like
"(c)" instead of ©, and misspell words (including people's names) because
I can't use the correct characters, and am forced to use unnecessarily
long and clumsy English longhand for standard mathematical notation.

If by simple you mean "I can't do what I want to do", then I agree
completely that ASCII is simple.
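For instance, with only one "+" available, lists get concatenation and
element-wise addition has to be spelled out by hand (or handed off to
numpy):

a = [1, 2, 3]
b = [10, 20, 30]
print(a + b)                          # concatenation: [1, 2, 3, 10, 20, 30]
print([x + y for x, y in zip(a, b)])  # element-wise:  [11, 22, 33]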

│ Not all programmers are mathematicians (in fact I'd say most aren't). I
│ know what those symbols mean, but some people might think "a u b n c ...
│ what?" | actually makes sense because it relates to bitwise OR in which
│ bits are turned on.

Not all programmers are C programmers who have learned that | represents
bitwise OR. Some will say "a | b ... what?". I know I did, when I was
first learning Python, and I *still* need to look them up to be sure I
get them right.

In other languages, | might be spelled as any of:

bitor() OR .OR. || ∨


[...]
│ Being a person who has
│ had to deal with the í in my last name and Japanese text on a variety of
│ platforms, I've found the current methods of non-ascii input to be
│ largely platform-dependent and---for lack of a better word---crappy,

Agreed one hundred percent! Until there are better input methods for non-
ASCII characters, without the need for huge keyboards, Unicode is hard
and ASCII easy, and Python can't *rely* on Unicode tokens.

That doesn't mean that languages like Python can't support Unicode
tokens, only that they shouldn't be the only way to do things. For a long
time Pascal included (* *) as a synonym for { } because not all keyboards
included the { } characters, and C has support for trigraphs:

http://publications.gbdirect.co.uk/c_book/chapter2/alphabet_of_c.html

Eventually, perhaps in another 20 years, digraphs like != and <= will go
the same way as trigraphs. Just as people today find it hard to remember
a time when keyboards didn't include { and }, hopefully they will find it
equally hard to remember a time that you couldn't easily enter non-ASCII
characters.
 

Westley Martínez

│ Python is a programming language that is operating system independent,
│ and not just a Linux tool. So you might not care about your Python
│ programs running on Windows, but believe me, the Python core developers
│ care about Python running on Windows and Mac OS. (Even if sometimes their
│ lack of resources make Windows and Mac somewhat second-class citizens.)

You didn't seem to get my humor. It's ok; most people don't.

│ Given that your name is Westley Martínez, that's an astonishing claim!
│ How do you even write your name in your own source code???
│
│ Besides, speak for yourself, not for "we". I have need for them.

The í is easy to input. (Vim has a diacritic feature) It's the funky
mathematical symbols that are difficult.

│ Hmmm. I wish I knew an easy way to input it. All the solutions I've come
│ across are rubbish. How do you enter (say) í at the command line of an
│ xterm?

I use this in my xorg.conf:

Section "InputDevice"
Identifier "Keyboard0"
Driver "kbd"
Option "XkbLayout" "us"
Option "XkbVariant" "dvorak-alt-intl"
EndSection

Simply remove 'dvorak-' to get qwerty. It allows you to use the right
Alt key as AltGr. For example:
AltGr+' i = í
AltGr+c = ç
AltGr+s = ß

I don't work on Windows or Mac enough to have figured out how to do on
those platforms, but I'm sure there's a simple way.
Again, it's the funky symbols that would be difficult to input.
│ But in any case, ASCII != Latin-1, so you're already using more than
│ ASCII characters.
│
│ Languages that
│ accept non-ASCII input have always been somewhat esoteric.

│ Then I guess Python is esoteric, because with source code encodings it
│ supports non-ASCII literals and even variables:
│
│ [steve@sylar ~]$ cat encoded.py
│ # -*- coding: utf-8 -*-
│ résumé = "Some text here..."
│ print(résumé)
│
│ [steve@sylar ~]$ python3.1 encoded.py
│ Some text here...

I should reword that to "Languages that require non-ASCII input have
always been somewhat esoteric" i.e. APL.

[...]
│ A byte saved is a byte earned. What about embedded systems trying to
│ conserve as many resources as possible?
│
│ Then they don't have to use multi-byte characters, just like they can
│ leave out comments, and .pyo files, and use `ed` for their standard text
│ editor instead of something bloated like vi or emacs.

Hey, I've heard of jobs where all you do is remove comments from source
code, believe it or not!

[...]
│ I believe dealing with ASCII is simpler than dealing with Unicode, for
│ reasons on both the developer's and user's side.
│
│ Really? Well, I suppose if you want to define "you can't do this AT ALL"
│ as "simpler", then, yes, ASCII is simpler.
│
│ Using pure-ASCII means I am forced to write extra code because there
│ aren't enough operators to be useful, e.g. element-wise addition versus
│ concatenation. It means I'm forced to spell out symbols in full, like
│ "British pound" instead of £, and use legally dubious work-arounds like
│ "(c)" instead of ©, and misspell words (including people's names) because
│ I can't use the correct characters, and am forced to use unnecessarily
│ long and clumsy English longhand for standard mathematical notation.
│
│ If by simple you mean "I can't do what I want to do", then I agree
│ completely that ASCII is simple.

I guess it's a matter of taste. I don't mind seeing my name as
westley_martinez and am so used to seeing **, sqrt(), and / that seeing
the original symbols is a bit foreign!

│ Not all programmers are C programmers who have learned that | represents
│ bitwise OR. Some will say "a | b ... what?". I know I did, when I was
│ first learning Python, and I *still* need to look them up to be sure I
│ get them right.
│
│ In other languages, | might be spelled as any of:
│
│ bitor() OR .OR. || ∨

Good point, but C is a very popular language. I'm not saying we should
follow C, but we should be aware that that's where the majority of
Python's users are probably coming from (or from languages with C-like
syntax).

[...]
│ Being a person who has
│ had to deal with the í in my last name and Japanese text on a variety of
│ platforms, I've found the current methods of non-ascii input to be
│ largely platform-dependent and---for lack of a better word---crappy,
│
│ Agreed one hundred percent! Until there are better input methods for non-
│ ASCII characters, without the need for huge keyboards, Unicode is hard
│ and ASCII easy, and Python can't *rely* on Unicode tokens.
│
│ That doesn't mean that languages like Python can't support Unicode
│ tokens, only that they shouldn't be the only way to do things. For a long
│ time Pascal included (* *) as a synonym for { } because not all keyboards
│ included the { } characters, and C has support for trigraphs:
│
│ http://publications.gbdirect.co.uk/c_book/chapter2/alphabet_of_c.html
│
│ Eventually, perhaps in another 20 years, digraphs like != and <= will go
│ the same way as trigraphs. Just as people today find it hard to remember
│ a time when keyboards didn't include { and }, hopefully they will find it
│ equally hard to remember a time that you couldn't easily enter non-ASCII
│ characters.

That was good info. I think there is a possibility for more symbols, but
not for a long while, and I'll probably never use them if they do become
available, because I don't really care.
 

BartC

│ You have provided me with some well thought out arguments and have
│ stimulated my young programmer's mind, but I think we're coming from
│ different angles. You seem to come from a more math-minded, idealist
│ angle, while I come from a more practical angle. Being a person who has
│ had to deal with the í in my last name

What purpose does the í serve in your last name, and how is it different
from i?

(I'd have guessed it indicated stress, but it looks Spanish and I thought
that syllable was stressed anyway.)
 

alex23

rantingrick said:
│ You lack vision.

And you lack education.

│ Evolution is the pursuit of perfection at the expense of anything and
│ everything!

Evolution is the process by which organisms change over time through
genetically shared traits. There is no 'perfection', there is only
'fitness', that is, survival long enough to reproduce. Fitness is not
something any of your ideas possess.

The rest of your conjecture about my opinions and beliefs is just pure
garbage. You'd get far fewer accusations of being a troll if you
stopped putting words into other people's mouths; then we'd just think
you're exuberantly crazy.

Also, Enough! With! The! Hyperbole! Already! "Visionary" is _never_ a
self-appointed title.
 

Steven D'Aprano

│ Also, Enough! With! The! Hyperbole! Already! "Visionary" is _never_ a
│ self-appointed title.

You only say that because you lack the vision to see just how visionary
rantingrick's vision is!!!!1!11!



Followups set to c.l.p.
 

Tim Wintle

│ 2) Culture. In the West, a designer will decide the architecture of a
│ major system, and it is a basis for debate and progress. If he gets it
│ wrong, it is not a personal disgrace or career limiting. If it is
│ nearly right, then that is a major success. In Japan, the architecture
│ has to be debated and agreed. This takes ages, costs lots, and
│ ultimately fails. The failure is because architecture is always a
│ trade-off - there is no perfect answer.

I find this really interesting - we spend quite a lot of time studying
the Toyota production system and seeing how we can do programming work
in a similar way, and it's worked fairly well for us (Kanban, Genchi
Genbutsu, eliminating Muda & Mura, etc).

I would have expected Japanese software to have worked quite smoothly,
with continuous improvement taking in everybody's opinions etc -
although I suppose that if production never starts because the
improvements are done to a spec, rather than the product, it would be a
massive hindrance.

Tim Wintle
 

Westley Martínez

│ What purpose does the í serve in your last name, and how is it different
│ from i?
│
│ (I'd have guessed it indicated stress, but it looks Spanish and I thought
│ that syllable was stressed anyway.)

I don't know. I don't speak Spanish, but to my knowledge it's not a
critical diacritic like in some languages.
 

rantingrick

│ What purpose does the í serve in your last name, and how is it different
│ from i?

Simple, it does not have a purpose. Well, that is, except to give the
*impression* that a few pseudo-intellectuals have a reason to keep
their worthless tenure at universities worldwide. It's window
dressing, and nothing more.

│ (I'd have guessed it indicated stress, but it looks Spanish and I thought
│ that syllable was stressed anyway.)

The ascii char "i" would suffice. However some languages fell it
necessary to create an ongoing tutorial of the language. Sure French
and Latin can sound "pretty", however if all you seek is "pretty
music" then listen to music. Language should be for communication and
nothing more.
 
