using unicode symbols for variable names

Ashish · Dec 4, 2003

hi,

I just read that in Java, everything is translated to Unicode, so we can
use characters from various languages in variable names, strings, etc.
Do you know of any editor that supports entering Unicode text?
What I want to do is enter the symbol for "pi" (3.1415927...) as a
variable name in my program.

This is used in an example on page 8 of "The Java Programming Language".

thanks,
ashish

Michael Borgwardt · Dec 4, 2003

Ashish said:
I just read that in Java, everything is translated to Unicode, so we can
use characters from various languages in variable names, strings, etc.
Do you know of any editor that supports entering Unicode text?
What I want to do is enter the symbol for "pi" (3.1415927...) as a
variable name in my program.

It's less a problem of finding an editor that supports it: modern editors
(certainly Eclipse, and even the lowly Notepad, at least on Windows XP) let
you use any encoding you like. The real problem is finding a keyboard map
and/or entry method that allows you to enter all the characters you want.
These are part of the OS, not the editor, and are usually language-specific,
so there is no "catch-all".

Since pi is a Greek letter, installing Greek language support should enable you
to enter it. Then you need to use a file encoding that supports all the
characters you use (UTF-8 *is* the catch-all for that) and, when compiling,
tell the compiler to use the same encoding (for javac, the -encoding option).
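
To illustrate why the file encoding and the compiler's encoding have to agree, here is a minimal sketch. It is not how javac works internally; the class name EncodingMismatch and the helper reread are made up for this example. It takes the bytes a UTF-8 editor would write for the letter pi and decodes them once with the matching charset and once with a mismatched one. The letter is written as an escape so that the example itself compiles regardless of how the file is saved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Hypothetical helper: encode text as UTF-8 (what the editor saved),
    // then decode the bytes with the charset the reader assumes.
    public static String reread(String text, Charset assumed) {
        byte[] onDisk = text.getBytes(StandardCharsets.UTF_8);
        return new String(onDisk, assumed);
    }

    public static void main(String[] args) {
        // Greek small letter pi, written as an escape to keep this file pure ASCII.
        String pi = "\u03c0";

        // UTF-8 stores this letter as two bytes (0xCF 0x80).
        System.out.println(pi.getBytes(StandardCharsets.UTF_8).length); // 2

        // Same encoding on both sides: the letter survives.
        System.out.println(reread(pi, StandardCharsets.UTF_8).equals(pi)); // true

        // Mismatched encoding: each byte becomes its own character, so the
        // reader no longer sees the letter pi at all.
        System.out.println(reread(pi, StandardCharsets.ISO_8859_1).equals(pi)); // false
    }
}
```

With a literal pi in a source file saved as UTF-8, something like "javac -encoding UTF-8 YourFile.java" (file name is a placeholder) keeps the two sides in agreement.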

Bent C Dalager · Dec 4, 2003

These are a part of the OS, not the editor, and usually language-specific, so
there is no "catch-all".

Well, yes and no. Emacs' iso-accents-mode is one way to bridge the
gap.

Cheers
Bent D

Michael Borgwardt · Dec 4, 2003

Bent said:
Well, yes and no. Emacs' iso-accents-mode is one way to bridge the
gap.

Not really. It merely allows one to enter the non-ASCII characters of a small number of
European languages but does not cover Greek, Cyrillic or any of the Asian scripts.

Bent C Dalager · Dec 4, 2003

Not really. It merely allows one to enter the non-ASCII characters of a small number of
European languages but does not cover Greek, Cyrillic or any of the Asian scripts.

It is the method I find interesting. There isn't anything in the
method to limit which character sets can be employed.

You could have an everything-accents-mode that covered everything in
much the same manner. For all I know there may already be one.

Cheers
Bent D

David Alex Lamb · Dec 4, 2003

Not really. It merely allows one to enter the non-ASCII characters of a small number of
European languages but does not cover Greek, Cyrillic or any of the Asian scripts.

Emacs has MULE (MUlti-Lingual Environment) for coping with different character
sets; I haven't used it but it might be useful to you.

Michael Borgwardt · Dec 4, 2003

Bent said:
It is the method I find interesting. There isn't anything in the
method to limit which character sets can be employed.

Except that it would be very uncomfortable to use for entire non-Latin
scripts and downright unusable for Chinese or Japanese with their
thousands of different characters.

Michael Borgwardt · Dec 4, 2003

Emacs has MULE (MUlti-Lingual Environment) for coping with different character
sets; I haven't used it but it might be useful to you.

OK, I have to correct myself: the entry methods *can* be part of the editor, and there
*is* a catch-all, and it's Emacs. Then again, that
webbrowser-mailclient-adventuregame-psychiatrist-kitchensink behemoth can't really be
called an "editor" anymore.

Bent C Dalager · Dec 4, 2003

Except that it would be very uncomfortable to use for entire non-Latin
scripts and downright unusable for Chinese or Japanese with their
thousands of different characters.

That really depends on your definition of "unusable". The most naïve
approach that could work quite well would be to simply type in the
Unicode code, much as you can in Java source code to get Unicode
chars. And much as you could (can?) use alt+keypad in Windows to get
(extended?) ASCII chars.

But, certainly, the more focused the supported char set is, the more
likely it is that there will be a convenient user interface for
accessing it.
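
For reference, the Java-source variant of "type in the Unicode code" looks roughly like the sketch below. The class and method names are invented for illustration. The compiler translates Unicode escapes before it splits the source into tokens, so an escape can stand in for a letter inside an identifier, and the file itself stays plain ASCII.

```java
public class PiEscape {
    // The escape below is the Greek small letter pi, so this declares a
    // constant whose name is that single letter.
    static final double \u03c0 = Math.PI;

    // Area of a circle, using the pi-named constant.
    public static double circleArea(double radius) {
        return \u03c0 * radius * radius;
    }

    public static void main(String[] args) {
        // Radius 1 leaves the constant unchanged.
        System.out.println(circleArea(1.0) == Math.PI); // true

        // The letter is legal at the start of a Java identifier.
        System.out.println(Character.isJavaIdentifierStart('\u03c0')); // true
    }
}
```

It works, but it also shows the readability cost: nobody reading the escape sees a pi.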

Cheers
Bent D

Bent C Dalager · Dec 4, 2003

OK, I have to correct myself: the entry methods *can* be part of the editor, and there
*is* a catch-all, and it's Emacs. Then again, that
webbrowser-mailclient-adventuregame-psychiatrist-kitchensink behemoth can't really be
called an "editor" anymore.

It is _also_ an editor.

Cheers
Bent D

Harald Hein · Dec 4, 2003

Ashish said:
What I want to do is enter the symbol for "pi" (3.1415927...) as a
variable name in my program.

Good idea - if you want to make maintenance as difficult as possible.
Roedy has some more suggestions for really messing up code at his site.

Michael Borgwardt · Dec 5, 2003

Bent said:
That really depends on your definition of "unusable". The most naïve
approach that could work quite well would be to simply type in the
Unicode code

That's exactly what I call "unusable" since nobody can remember all those
numerical codes.

I'll tell you how Japanese input actually does work: you type in the
pronunciation of a character or word, then you're given a list of all
the characters or words that match that pronunciation and choose the
right one. Context analysis is used to put the more likely candidates
at the top of the list (most of the time the first choice is the
right one, thanks to that).
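
A toy sketch of that lookup-and-rank idea, under heavy assumptions: the class ToyInputMethod, its methods and the three-entry dictionary are all invented for this example, and ranking by how often a candidate was picked before is a much cruder stand-in for the context analysis a real input method uses. The candidates for the reading "kami" are the characters for paper, god and hair, written as escapes so the file stays ASCII.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyInputMethod {
    // Reading (pronunciation typed by the user) -> written forms that match it.
    private final Map<String, List<String>> candidatesByReading = new HashMap<>();
    // How often each written form has been picked so far.
    private final Map<String, Integer> timesChosen = new HashMap<>();

    public void addEntry(String reading, String written) {
        candidatesByReading.computeIfAbsent(reading, k -> new ArrayList<>()).add(written);
    }

    // Candidates for a reading, most often chosen first.
    // List.sort is stable, so ties keep their dictionary order.
    public List<String> candidates(String reading) {
        List<String> result = new ArrayList<>(
                candidatesByReading.getOrDefault(reading, Collections.<String>emptyList()));
        result.sort(Comparator.comparing(
                (String w) -> timesChosen.getOrDefault(w, 0)).reversed());
        return result;
    }

    // Record that the user picked this written form.
    public void choose(String written) {
        timesChosen.merge(written, 1, Integer::sum);
    }

    public static void main(String[] args) {
        ToyInputMethod ime = new ToyInputMethod();
        ime.addEntry("kami", "\u7d19"); // paper
        ime.addEntry("kami", "\u795e"); // god
        ime.addEntry("kami", "\u9aea"); // hair

        // Before any choice, candidates come back in dictionary order.
        System.out.println(ime.candidates("kami").get(0).equals("\u7d19")); // true

        // After the user picks "hair" once, it moves to the top.
        ime.choose("\u9aea");
        System.out.println(ime.candidates("kami").get(0).equals("\u9aea")); // true
    }
}
```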

Sam Brightman · Dec 5, 2003

Michael Borgwardt wrote:

-snip-

I'll tell you how Japanese input actually does work: you type in the
pronunciation of a character or word, then you're given a list of all
the characters or words that match that pronunciation and choose the
right one. Context analysis is used to put the more likely candidates
at the top of the list (most of the time the first choice is the
right one, thanks to that).

A little off-topic but kind of related: does anyone know why predictive
text on mobile phones never used either a simple frequency log or
context analysis? I've not studied it properly, but they always seem to
get the wrong choice for some words! Just limited memory?

Bent C Dalager · Dec 5, 2003

That's exactly what I call "unusable" since nobody can remember all those
numerical codes.

Luckily, that isn't a requirement for this method to be useful.

It is clear that if you spent your time writing a novel in Japanese,
you would probably want a specialized setup for writing Japanese. If,
on the other hand, you are writing predominately English text and need
to insert the odd Japanese symbol every now and then it could be quite
useful.

This crude method for entering foreign symbols is still infinitely
better than having no method to do it at all, which might very
well be the only alternative on offer.

Cheers
Bent D

Michael Borgwardt · Dec 5, 2003

Bent said:
It is clear that if you spent your time writing a novel in Japanese,
you would probably want a specialized setup for writing Japanese. If,
on the other hand, you are writing predominately English text and need
to insert the odd Japanese symbol every now and then it could be quite
useful.

Even then it would be very painful.

This crude method for entering foreign symbols is still infinitely
better than having no method to do it at all, which might very
well be the only alternative on offer.

Fortunately, modern OSes support multiple languages and allow you
to switch between the different entry methods.
