An assessment of the Unicode standard

r

I was reading the thread here...
http://groups.google.com/group/comp...85050b4c6c84e?hl=en&lnk=raot#b0385050b4c6c84e

and it raised some fundamental philosophical questions to me about
natural languages and Unicode. Which IMO * Unicode* is simply a monkey
patch for this soup of multiple languages we have to deal with in
programming and communication.

Unicode (*puke*) seems nothing more than a brain fart of morons. And
sadly it was created by CS majors who i assumed used logic and
deductive reasoning but i must be wrong. Why should the larger world
keep supporting such antiquated languages and character sets through
Unicode? What purpose does this serve? Are we merely trying to make
everyone happy? A sort of Utopian free-language-love-fest-kinda-
thing?

But there is a larger problem that is at the heart of Unicode itself,
and it has to do with this multi-language/multi-culture world we live
in. Even today in 2009 AD with all our technology and advancements we
still live in a dark ages of societal communication. Of the many
things that divide us such as race, color, religion, geography, blah,
the most perplexing and devastating seems to be why have we not
accepted a single global language for all to speak.

Take for instance the Chinese language with it's thousands of
characters and BS, it's more of an art than a language. Why do we
need such complicated languages in this day and time. Many languages
have been perfected, (although not perfect) far beyond that of Chinese
language. The A-Z char set is flawless!

Some may say well how can we possibly force countries/people to speak/
code in a uniform manner? Well that's simple, you just stop supporting
their cryptic languages by dumping Unicode and returning to the
beautiful ASCII and adopting English as the universal world language.
Why English? Well because it is so widely spoken. But whatever we
choose just choose one language and stick with it, perfect it, and
maintain it.

IMO Multiple languages are barriers to communication, collaboration,
and the continuation of our future evolution as intelligent Human
beings and this language multiplicity will compromise our future until
it is reined in and utterly destroyed. And i think most of you are
missing how important this really is to our future. And so these
problems bring me to the main subject of this thread "Unicode Sucks"

Now you may say to yourself "I am not a sociologist i am a programmer,
so why should i give a flying fig about natural languages or Unicode,
i just accept things as they are". Well yes you could just go on
accepting the status quo that is surely the easy route, but your life
would be so much easier if you crusaded for change.

BUT STOP!, before i go any further i want to respond to what i know
will be condemnation from the sociology nuts out there. Yes
multiculturalism is great, yes art is great, but if you can't see how
the ability to communicate is severely dampened by multi-languages then
you only *feel* with your heart but you apparently have no ability to
reason with your mind intelligently.

[nested thoughts]
A few months ago i was watching some tear-jerking documentary called
something like "Save the Languages" or "The dying languages" blah! In
the documentary these two bleeding-heart-ivy-leaguers were running all
over the world to document some obscure languages that were on the
verge of extinction. And at one utterly amazing point in the episode
they start crying and moaning for the loss of these languages like
their own mother had died a horrible death. I watched all this in much
horror and disbelief with jaw-dropped and i am still scared by the
thought that people actually buy into this BS! Has the world gone mad?

[back on track]
The death of a language is as natural as the death of a flower, or a
fish, or even the ever long oxidation of an aluminum coke can. When a
life form dies it brings life to the next generation. With languages,
each death compiles upon the last and we are one step closer to the
unifying language that we all so desperately need. We should herald
the death of the languages like the jazz funerals of New Orleans with
much happiness and fanfare for there is no reason to be sad.

It's called evolution people! Ever heard of science? So ditch the
useless Unicode and save us all a few keystrokes and bottles of
aspirin for the persistent headaches! Simplicity is beautiful!!
 
John Machin

Take for instance the Chinese language with it's thousands of
characters and BS, it's more of an art than a language.  Why do we
need such complicated languages in this day and time. Many languages
have been perfected, (although not perfect) far beyond that of Chinese
language.

The Chinese language is more widely spoken than English, is quite
capable of expression in ASCII ("r tongzhi shi sha gua") and doesn't
have those pesky it's/its problems.
The A-Z char set is flawless!

.... for expressing the sounds of a very limited number of languages,
and English is *NOT* one of those.
 
Neil Hodgson

r:
Unicode (*puke*) seems nothing more than a brain fart of morons. And
sadly it was created by CS majors who i assumed used logic and
deductive reasoning but i must be wrong. Why should the larger world
keep supporting such antiquated languages and character sets through
Unicode? What purpose does this serve? Are we merely trying to make
everyone happy? A sort of Utopian free-language-love-fest-kinda-
thing?

Wow, I like this world you live in: all that altruism! Unicode was
developed by corporations from the US left coast in order to sell their
products in foreign markets at minimal cost.

Neil
 
r

The Chinese language is more widely spoken than English, is quite
capable of expression in ASCII ("r tongzhi shi sha gua") and doesn't
have those pesky it's/its problems.

Oh yes of course it is the most widely spoken amongst Chinese people
since one in every five people on this earth are Chinese. What i meant
to say was that English language is more widespread outside of
"normal" English speaking countries -- of course as a result of
colonialism, and arguably, imperialism. ;)
 
Dennis Lee Bieber

The Chinese language is more widely spoken than English, is quite
capable of expression in ASCII ("r tongzhi shi sha gua") and doesn't
have those pesky it's/its problems.
<heh>

WHICH Chinese language? Mandarin or Cantonese (though it should be
pointed out that international air travel flight control is conducted in
English)

A land where everyone "reads" the same language, but doesn't speak
the same language <G>
 
r

   Wow, I like this world you live in: all that altruism!

Well if i don't who will? *shrugs*
Unicode was
developed by corporations from the US left coast in order to sell their
products in foreign markets at minimal cost.

So why the heck are we supporting such capitalistic implementations as
Unicode. Sure we must support a winders installer but Unicode, dump
it! We don't support a Python group in Chinese or French, so why this?
Makes no sense to me really. Let M$ deal with it.
 
Benjamin Peterson

Neil Hodgson <nyamatongwe+thunder <at> gmail.com> writes:

Unicode was
developed by corporations from the US left coast in order to sell their
products in foreign markets at minimal cost.

Like Sanskrit or Snowman language?
 
Neil Hodgson

Benjamin Peterson:
Like Sanskrit or Snowman language?

Sanskrit is mostly written in Devanagari these days which is also
useful for selling things to people who speak Hindi and other Indian
languages.

Not sure if you are referring to the ☃ snowman character or Arctic
region languages like Canadian Aboriginal syllabic writing like ᐲᐦᒑᔨᕽ
which were added to Unicode 8 years after the initial version. I'd guess
that was added from political rather than marketing motives. ☃ was
required since it was present in Japanese character sets.

Neil
 
Chris Jones

Benjamin Peterson:
Sanskrit is mostly written in Devanagari these days which is also
useful for selling things to people who speak Hindi and other Indian
languages.

Is the implication that the principal usefulness of such languages as
Hindi and "other Indian languages" is us selling "things" to them..? I
am not from these climes but all the same, I do find your tone of voice
rather offensive, considering that you are referring to a culture that's
about 3000 years older and 3000 richer than ours and certainly deserves
our respect.

Maybe you didn't notice, but our plants shut down many years ago.. They
are selling _us_ their wares.
Not sure if you are referring to the snowman character or Arctic
region languages like Canadian Aboriginal syllabic writing like ᐲᐦᒑᔨᕽ
which were added to Unicode 8 years after the initial version. I'd
guess that was added from political rather than marketing motives. ☃
was required since it was present in Japanese character sets.

Oh.. so.. now Unicode is not only about marketing.. there is also a
political "aspect".. polytonic Greek, Runic, Shavian, Glagolitic,
Carian, Phoenician, Lydian, Cuneiform, not to mention Mathematical
symbols, Braille, Domino Tiles, the IPA..? What was I thinking..?

Nothing personal, I assure you.. maybe I misunderstood what you were
saying.

In any event, you shouldn't feed the troll, even if he's teething.

CJ
 
John Nagle

r said:
I was reading the thread here...
http://groups.google.com/group/comp...85050b4c6c84e?hl=en&lnk=raot#b0385050b4c6c84e

and it raised some fundamental philosophical questions

Rant ignored.

Actually, Python 3.x seems finally to have character sets right.
There's "bytes", for uninterpreted binary data, Unicode, and
proper ASCII, 0..127. Within Python, we finally got rid of
"upper code pages".
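A minimal sketch of that split, assuming Python 3. The helper name is_plain_ascii is made up for illustration and is not part of the standard library:

```python
# Python 3 keeps text (str) and uninterpreted binary data (bytes) apart.
text = "snowman: \u2603"      # str: a sequence of Unicode code points
data = text.encode("utf-8")   # bytes: raw octets, here the UTF-8 form


def is_plain_ascii(s: str) -> bool:
    """Return True if every code point falls in the ASCII range 0..127."""
    return all(ord(ch) < 128 for ch in s)


print(type(text).__name__, type(data).__name__)
print(is_plain_ascii("hello"), is_plain_ascii(text))

# Mixing the two types is an error rather than a silent code-page guess.
try:
    text + data
except TypeError as exc:
    print("refused to mix str and bytes:", exc)
```

Going between the two always names an encoding explicitly, which is what removes the implicit "upper code page" guess.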

(I wish the HTML standards people would do the same. HTML 5
should have been ASCII only (with the "&" escapes if desired)
or Unicode. No "Latin-1", no upper code pages, no JIS, etc.)
[nested thoughts]
A few months ago i was watching some tear-jerking documentary called
something like "Save the Languages" or "The dying languages" blah!

It may be a bit much that Unicode supports Cretan Linear B.

John Nagle
 
Terry Reedy

r said:
natural languages and Unicode. Which IMO * Unicode* is simply a monkey
patch for this soup of multiple languages we have to deal with in
programming and communication.

A somewhat fair characterization.

[snip]
everyone happy? A sort of Utopian free-language-love-fest-kinda-
thing?

Not utopian, but pragmatically political. Before Unicode, and still
today, we had and have multiple codes. Multiple ASCII extensions for
European languages and even multiple codes just for Japanese. To get
people in the major computing countries, including Japan, to agree to
eventually replace their national codes with one worldwide code, some
kludgy compromises were made.
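To make the "multiple codes just for Japanese" point concrete, here is a small sketch using three Japanese codecs that ship with Python. The sample word and variable names are only illustrative:

```python
# The same three kanji ("Japanese language") under several national
# encodings, plus UTF-8 for comparison.
text = "\u65e5\u672c\u8a9e"

JAPANESE_CODECS = ("shift_jis", "euc_jp", "iso2022_jp")

encoded = {name: text.encode(name) for name in JAPANESE_CODECS}
encoded["utf-8"] = text.encode("utf-8")

for name, raw in encoded.items():
    # Each codec produces a different byte sequence for identical text.
    print(f"{name:>10}: {raw.hex()}")
```

A receiver has to know which of these was used. The bytes alone do not say.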
language. The A-Z char set is flawless!

Hardly. There are too few characters. A basic set should have at least
50. The international phonetic alphabet (IPA) has about 150. Here is a
true Utopian proposal for you (from a non-CS major ;-): develop an
extended IPA 256-character set with just a few control chars (rather
than 32) and punctuation and other markers. Then develop dictionaries to
translate texts in every language and char set into and back out of
this universal character set.

Fat chance of approval, even if technical issues were resolved.
Some may say well how can we possibly force countries/people to speak/
code in a uniform manner? Well that's simple, you just stop supporting
their cryptic languages by dumping Unicode and returning to the
beautiful ASCII

But most everyone outside the US was not using ascii precisely because
it did not support their language.

Get over the imperfections of unicode. It improves on the prior status quo.

Terry Jan Reedy
 
steve

I was reading the thread here...
http://groups.google.com/group/comp...85050b4c6c84e?hl=en&lnk=raot#b0385050b4c6c84e
...
...
It's called evolution people! Ever heard of science? So ditch the
useless Unicode and save us all a few keystrokes and bottles of
aspirin for the persistent headaches! Simplicity is beautiful!!

You are right ! In the same vein, we should all also standardize on using the
Java language for programming, after all /everybody/ writes code in Java.

cheers,
- steve
 
Steven D'Aprano

Not sure if you are referring to the ☃ snowman character or Arctic
region languages like Canadian Aboriginal syllabic writing like ᐲᐦᒑᔨᕽ
which were added to Unicode 8 years after the initial version. I'd guess
that was added from political rather than marketing motives. ☃ was
required since it was present in Japanese character sets.


If I recall correctly, the snowman was specifically added at the request
of Japanese television producers, because it is a standard glyph used for
representing snow when showing the weather on TV.

Unicode's stated aim is to have a single universal standard for all
characters needed for communication. From the Unicode Consortium:

What is Unicode?
Unicode provides a unique number for every character, no matter what the
platform, no matter what the program, no matter what the language.

....
Even for a single language like English no single encoding was adequate
for all the letters, punctuation, and technical symbols in common use.

These encoding systems also conflict with one another. That is, two
encodings can use the same number for two different characters, or use
different numbers for the same character. Any given computer (especially
servers) needs to support many different encodings; yet whenever data is
passed between different encodings or platforms, that data always runs
the risk of corruption.

Unicode is changing all that!

Unicode provides a unique number for every character, no matter what the
platform, no matter what the program, no matter what the language.
[end quote]
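
As a rough illustration of "a unique number for every character", the snippet below looks up the snowman with Python's standard unicodedata module. It is a sketch, not anything taken from the Consortium's text:

```python
import unicodedata

snowman = "\u2603"

# The character's number (code point) and its official name.
print(hex(ord(snowman)))
print(unicodedata.name(snowman))

# The number is fixed. Only its byte representation depends on the
# encoding form chosen for storage or transmission.
for codec in ("utf-8", "utf-16-be", "utf-32-be"):
    print(codec, snowman.encode(codec).hex())
```

The code point stays the same on every platform; UTF-8, UTF-16 and UTF-32 are just different ways of serializing it.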

And from the FAQs:

Unicode covers all the characters for all the writing systems of the
world, modern and ancient. It also includes technical symbols,
punctuations, and many other characters used in writing text.
[end quote]


It's not just about supporting languages used by foreigners too stupid to
speak English (sarcasm!). It's about supporting business users who want a
standard way of referring to dingbats and pictographs, historians who
need to deal with ancient writings and obsolete characters, scientists
and mathematicians who want to use mathematical symbols, editors and book
publishers who want to use their own typographic symbols, including
Braille, musical symbols, and even TV producers who want to include
snowmen on their weather charts.

The Unicode system replaces dozens of incompatible, clashing systems with
a single universal, extensible system. Why would anyone want to go back
to the Bad Old Days where you couldn't transfer data from one OS to
another, or even from one application to another, without quote marks
turning into mathematical symbols or boxes?
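
A small sketch of the Bad Old Days, assuming a file written on a classic Mac (Mac Roman) and opened on Windows (code page 1252). The sample string is made up:

```python
# Curly quotes saved under one legacy encoding and read under another.
quoted = "\u201cquoted\u201d"

mac_bytes = quoted.encode("mac_roman")

# Reading with the wrong legacy code page silently yields other letters.
garbled = mac_bytes.decode("cp1252")
print(garbled)

# Reading with the encoding that was actually used recovers the text.
restored = mac_bytes.decode("mac_roman")
print(restored)
```

Nothing raises an error here, which is the real problem: the bytes are valid in both encodings and simply mean different characters.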
 
Steven D'Aprano

It may be a bit much that Unicode supports Cretan Linear B.

Thousands of historians who need to discuss Linear B would disagree.

Well, hundreds.


There are tens of thousands of characters available. If there's room for
chess pieces, dingbats with drop shadows and numbers inside circles,
there's room for actual characters from real (if extinct) languages.
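
For a sense of the room available: the Unicode code space runs from U+0000 to U+10FFFF. The sketch below checks that limit in Python 3.3 or later and looks up a few of the characters mentioned. The sample code points are my own picks:

```python
import sys
import unicodedata

# Highest code point a Python 3.3+ str can hold.
print(hex(sys.maxunicode))

samples = {
    "chess piece": "\u2654",
    "number inside a circle": "\u2460",
    "Linear B": "\U00010000",
}

for label, ch in samples.items():
    # Official character name and code point for each sample.
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Linear B sits outside the first 65,536 code points, so it takes nothing away from the range used by living scripts.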
 
Neil Hodgson

Chris Jones:
Is the implication that the principal usefulness of such languages as
Hindi and "other Indian languages" is us selling "things" to them..?

Unicode was developed by a group of US corporations: Xerox, Apple,
Sun, Microsoft, ... The main motivation was to avoid dealing with
multiple character set encodings since this was difficult, time
consuming and expensive.
I
am not from these climes but all the same, I do find your tone of voice
rather offensive, considering that you are referring to a culture that's
about 3000 years older and 3000 richer than ours and certainly deserves
our respect.

Eh? Was Unicode developed in India? China? What precisely is
disrespectful here? Is there a significant population that regards
Unicode as their 'holy patrimony' that will suffer distress due to my
post?
Maybe you didn't notice, but our plants shut down many years ago.. They
are selling _us_ their wares.

Maybe your plants shut down but some of the plants I have worked at
(such as the steelworks at Port Kembla) are still successfully exporting
to Asia.

Neil
 
Steven D'Aprano

Is the implication that the principal usefulness of such languages as
Hindi and "other Indian languages" is us selling "things" to them..? I
am not from these climes but all the same, I do find your tone of voice
rather offensive,

I think Neil's point is that Unicode has succeeded in the wider world,
outside of academic circles, because of the commercial need to
communicate between cultures using different character sets. I suppose he
could have worded it better, but fundamentally he's right: without the
commercial need to trade across the world (information as well as
physical goods) I doubt Unicode would be anything more than an
interesting curiosity of use only to a few academics and linguists.

considering that you are referring to a culture that's
about 3000 years older and 3000 richer than ours and certainly deserves
our respect.

Older, certainly, but richer? There's a reason that Indians come to the
West rather than Westerners going to India. As Terry Pratchett has
written, age is not linked to wisdom -- just because somebody is old,
doesn't mean they're wise, perhaps they've just been stupid for a very
long time. The same goes for cultures: old doesn't mean better.

Indian culture has been responsible for many wonderful things over the
millennia, but the caste system is not one of them, and any culture which
glorified sati (suttee) as an act of piety is not one we should look up
to. Sati was probably rare even at the height of its popularity, and
vanishingly rare now, and arguably could even be defended as the right of
an adult to end their own life when they see fit, but dowry-burning is
outright murder and is sadly very common across the Indian sub-continent:
some estimates suggest that in the mid-1990s there were nearly 6000 such
murders a year in India.

If we are to be truly non-racist, we must recognise that the West does
not have a monopoly on wickedness, ignorance, spite and sheer awfulness.

In any case, I'm not sure we should be talking about Indian culture in
the singular -- India is about as large as Western Europe, significantly
more varied, and the culture has changed over time. The India which
treated the Kama Sutra as a holy book is hardly the same India where
people literally rioted in the street because Richard Gere gave the
actress Shilpa Shetty a couple of rather theatrical and silly kisses on
the cheek.
 
garabik-news-2005-05

r said:
Some may say well how can we possibly force countries/people to speak/
code in a uniform manner? Well that's simple, you just stop supporting
their cryptic languages by dumping Unicode and returning to the
beautiful ASCII and adopting English as the universal world language.
Why English? Well because it is so widely spoken. But whatever we
choose just choose one language and stick with it, perfect it, and
maintain it.

Y’know, it is naïve to think that the “beautiful” ASCII is
sufficient for English…
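
Ordinary typeset English uses curly quotes, a diaeresis and an ellipsis, none of which exist in ASCII. A quick sketch in Python 3, with a made-up sample sentence:

```python
# English text with typographic punctuation and a diaeresis.
sentence = "Y\u2019know, it is na\u00efve to call ASCII \u201cbeautiful\u201d\u2026"

# Collect the distinct characters that fall outside ASCII (0..127).
non_ascii = sorted({ch for ch in sentence if ord(ch) > 127})
print([f"U+{ord(ch):04X}" for ch in non_ascii])

# A strict ASCII encode refuses the text instead of mangling it.
try:
    sentence.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    print("not ASCII:", exc.reason)
```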

Besides, there is the APL... (though, you are right, we should dump
those crappy old languages and use Python exclusively)

--
-----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__ garabik @ kassiopeia.juls.savba.sk |
-----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!
 
Thorsten Kampe

* r (Sat, 29 Aug 2009 18:30:34 -0700 (PDT))
We don't support a Python group in Chinese or French, so why this?

"We" do - you don't (or to be more realistic, you simply didn't know
it).
Makes no sense to me really.

Like probably 99.99999% of all things you hear, read, see and encounter
during the day.

By the way: the dumbness of your "Unicode rant" would have shamed even
the great XL himself.

Thorsten
 
Thorsten Kampe

* Neil Hodgson (Sun, 30 Aug 2009 06:17:14 GMT)
Chris Jones:


Eh? Was Unicode developed in India? China?

Chris was obviously talking about Sanskrit...

Thorsten
 
