Multiple versions of "Extended ASCII characters" (No. 128 to 255)

wob · Jul 21, 2005

Many thanks to those who responded to my question about "putting greek char
into C string". While searching for a solution, I noticed that there is more
than one version of the "Extended ASCII characters" (No. 128 to 255). E.g., in
one version No. 224 is the symbol alpha; in another, it's an "a" with a ` on
it... How come?

You can see it here:

http://www.kturby.com/cables/ascii2.htm

http://www.idevelopment.info/data/Programming/ascii_table/PROGRAMMING_ascii_table.shtml

osmium · Jul 21, 2005

wob said:
Many thanks to those who responded to my question about "putting greek char
into C string". While searching for a solution, I noticed that there is
more than one version of the "Extended ASCII characters" (No. 128 to 255).
E.g., in one version No. 224 is the symbol alpha; in another, it's an "a"
with a ` on it... How come?

The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset. There are probably hundreds of these. ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.

There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.

Lew Pitcher · Jul 21, 2005

The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset.

Precisely!

IMHO, the phrase "Extended ASCII" should be banned from any discussion. People
too often say "Extended ASCII" when they mean "some unknown character set that
shares a common set of characters with ASCII", and then expect a precise answer
relating to ASCII.

There are probably hundreds of these.

One of the ISO working committees keeps a website just as a catalog of
character sets. The URL is http://anubis.dkuug.dk/i18n/charmaps/

ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.

"coded character set" or "coded characterset"
Also related: "characterset translation"

There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.

See http://anubis.dkuug.dk/i18n/charmaps/ASCII for an ASCII-to-Unicode table.
While you /can/ purchase the ASCII specs from ISO, the ECMA provides identical
specs for free at
http://www.ecma-international.org/publications/files/ecma-st/ECMA-006.pdf,
http://www.ecma-international.org/publications/files/ecma-st/ECMA-048.pdf, and
http://www.ecma-international.org/publications/files/ecma-st/ECMA-035.pdf

--
Lew Pitcher
IT Specialist, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')

Chris Croughton · Jul 21, 2005

Many thanks to those who responded to my question about "putting greek char
into C string". While searching for a solution, I noticed that there is more
than one version of the "Extended ASCII characters" (No. 128 to 255). E.g., in
one version No. 224 is the symbol alpha; in another, it's an "a" with a ` on
it... How come?

There is no such thing as "Extended ASCII" in any meaningful form. It's
like "C with extensions": the extended parts are done by whoever wants
them.

ASCII defines /only/ characters using the bottom 7 bits, thus the
characters numbered 0 to 127. Various people have decided that they
want more, so they allocated them to codes above 127 as they felt like
it. Line drawing characters, European accented characters (at least
four versions used commonly in Europe), mathematical symbols, Cyrillic
(Russian) characters, Greek, funny faces, you name it. And of course
Microsoft came up with its own ones different from any others.
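
To make the byte 224 example from the question concrete, here is a minimal sketch in C. It assumes the two tables being compared are IBM code page 437 and ISO 8859-1, which is a guess about the linked pages rather than something stated in the thread. The names charset and byte_to_code_point are made up for illustration, and only the one CP437 entry discussed here is tabulated.

```c
#include <stdint.h>

/* Two of the many 8-bit character sets that keep ASCII in 0..127.
 * Which sets the linked tables actually show is an assumption. */
enum charset { CS_CP437, CS_LATIN1 };

/*
 * Map one byte to a Unicode code point under the given character set.
 * Bytes 0..127 are ASCII and mean the same thing in both sets.
 * For CP437 only byte 224 (0xE0) is tabulated in this sketch; a real
 * converter would carry a full 128-entry table for the upper half.
 */
uint32_t byte_to_code_point(enum charset cs, unsigned char byte)
{
    if (byte < 0x80)
        return byte;        /* 7-bit ASCII: identical everywhere */
    if (cs == CS_LATIN1)
        return byte;        /* ISO 8859-1 maps 1:1 onto U+0080..U+00FF */
    if (cs == CS_CP437 && byte == 0xE0)
        return 0x03B1;      /* GREEK SMALL LETTER ALPHA */
    return 0xFFFD;          /* REPLACEMENT CHARACTER: not tabulated here */
}
```

With these assumptions, byte_to_code_point(CS_CP437, 224) gives U+03B1 (alpha) while byte_to_code_point(CS_LATIN1, 224) gives U+00E0 (a with grave accent): the same byte, two different characters.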

Recently (i.e. in the last 20 years) there have been attempts to
standardise, but because all of the characters can't fit into the
'spare' 128 available positions there are lots of variants in the
ISO-8859 standard (at least 10 variants). See for instance

http://czyborra.com/charsets/iso8859.html

It was realised that what was really wanted was a much expanded
character space, to allow for the thousands of Chinese characters and
other languages to be added, so Unicode was born. This assigns each
character to only one code point, commonly stored in 16 or 32 bit
units (some of the characters look alike but are in different
national or specific sets so they are treated as different characters).

Because much software still uses 8 bit strings (and 8 bit transport
paths), Unicode also specifies a method of converting a 'wide' (16 or
32 bit) character into a string of 8 bit characters. This system,
UTF-8 (Unicode Transformation Format, 8 bit), keeps the ASCII characters
as single bytes with the top bit zero, so it is
compatible with 7 bit ASCII, and characters with the top bit set are
not valid on their own, only as part of a "multi-byte character" string.
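
The UTF-8 scheme described above can be sketched as a small C encoder. This is an illustration only; the function name utf8_encode is made up here and is not part of the C standard library. It shows that ASCII code points stay as single bytes, while everything else becomes a sequence of bytes that all have the top bit set.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp cannot be encoded
 * (a UTF-16 surrogate, or a value above U+10FFFF).
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                    /* ASCII: one byte, top bit zero */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* three bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* four bytes: 11110xxx then three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                           /* beyond the Unicode range */
}
```

For instance, U+0041 ('A') encodes as the single byte 0x41, and U+03B1 (Greek alpha) encodes as the two bytes 0xCE 0xB1, neither of which is a valid character on its own.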

The web page above has descriptions of the ISO 8859 variants, and also
points to articles and descriptions of Unicode, UTF-8 and other matters.

This is relevant to C in the support for 'wide' characters and multibyte
characters, and the functions which transform and output them.

Chris C

Dave Thompson · Aug 1, 2005

The phrase "extended ASCII" has come to mean that the new character set
contains ASCII as a subset. There are probably hundreds of these. ISTM
there should have been a better way to express that thought, but it doesn't
leap out at me. Related words that might help you pursue this subject in
google: font, code page.

Right.

There is now, and always has been only one ASCII and it contains 128
characters, basically the American version of the latin alphabet, plus
digits and punctuation and control characters. There is no established
graphic to identify the control characters.

There is only one ASCII now, but it has changed significantly at least
once, when lowercase and the other 6/x and 7/x codes were added, IIRC about 1968.
And to be pedantic it went through periods of being designated USASCII
and ANSCII as the name of the organization changed, but this did not
imply any substantive change. The American alphabet is the (modern)
English alphabet, at least for America = US plus most of CA; there are
other American countries (primarily) using other languages.

There _is_ a standard for graphical representations for control
characters, albeit at least mostly just two-letter mnemonics jammed
together, not "graphical" in the common sense of pictorial or iconic:
ISO 2047, IIRC based on and superseding an X3.n like 646 versus ASCII;
but it certainly hasn't been widely used or even known. I have seen
what I believe(d) were displays obeying it on various datascopes, and
a few (real) terminals back-in-the-day in "show controls" mode.

- David.Thompson1 at worldnet.att.net
