Trigraphs

Daniel Rudy

What exactly are trigraphs, and why are they undesirable?


--
Daniel Rudy

Email address has been base64 encoded to reduce spam
Decode email address using b64decode or uudecode -m

Why geeks like computers: look chat date touch grep make unzip
strip view finger mount fcsk more fcsk yes spray umount sleep
 
Coos Haak

On Mon, 07 Nov 2005 00:49:47 GMT, Daniel Rudy wrote:
What exactly are trigraphs, and why are they undesirable?

K&R2 page 229.
And Google: 135000 hits!
 
Daniel Rudy

At about the time of 11/6/2005 5:04 PM, Coos Haak stated the following:
On Mon, 07 Nov 2005 00:49:47 GMT, Daniel Rudy wrote:

K&R2 page 229.
And Google: 135000 hits!

Following your advice to look at Google (I forgot about Google for some
reason...), I found this on the web:

http://www-ccs.ucsd.edu/c/charset.html#Trigraphs

Back in the day, it might have been useful, but not today. Thanks for
the direction.


Keith Thompson

Daniel Rudy said:
At about the time of 11/6/2005 5:04 PM, Coos Haak stated the following:
On Mon, 07 Nov 2005 00:49:47 GMT, Daniel Rudy wrote:
[snip]
http://www-ccs.ucsd.edu/c/charset.html#Trigraphs

Back in the day, it might have been useful, but not today. Thanks for
the direction.

But you still have to be careful about unintentional trigraphs.
For example, this:

printf("What's going on here??!\n");

won't do what you expect (unless you expect it to print a '|'
character).

Some compilers issue warnings for trigraphs, which seems like a good
idea. I've seen more accidental trigraphs than deliberate ones in
real code.

IMHO a better approach would have been to have trigraphs enabled for a
given source file *only* if there's an explicit directive in the file
itself, allowing the 99% of us who have no use for trigraphs to ignore
them.
 
Daniel Rudy

At about the time of 11/6/2005 5:40 PM, Keith Thompson stated the following:
Daniel Rudy said:
At about the time of 11/6/2005 5:04 PM, Coos Haak stated the following:
On Mon, 07 Nov 2005 00:49:47 GMT, Daniel Rudy wrote:

What exactly are trigraphs, and why are they undesirable?
[snip]

http://www-ccs.ucsd.edu/c/charset.html#Trigraphs

Back in the day, it might have been useful, but not today. Thanks for
the direction.


But you still have to be careful about unintentional trigraphs.
For example, this:

printf("What's going on here??!\n");

won't do what you expect (unless you expect it to print a '|'
character).

Some compilers issue warnings for trigraphs, which seems like a good
idea. I've seen more accidental trigraphs than deliberate ones in
real code.

I know that gcc will if the -Wtrigraphs option is specified on the
command line. Most of my C work is done with -std=c89 -ansi flags, so
-trigraphs is automatically enabled. The documentation gives no clue
about how to disable them either.

IMHO a better approach would have been to have trigraphs enabled for a
given source file *only* if there's an explicit directive in the file
itself, allowing the 99% of us who have no use for trigraphs to ignore
them.

I like your approach better than what gcc and many other compilers do. I
personally see no use for trigraphs in this day and age.

Flash Gordon

Daniel said:
At about the time of 11/6/2005 5:40 PM, Keith Thompson stated the following:



I know that gcc will if the -Wtrigraphs option is specified on the
command line. Most of my C work is done with -std=c89 -ansi flags, so
-trigraphs is automatically enabled. The documentation gives no clue
about how to disable them either.

That's because every conforming compiler is *required* to implement them.

I like your approach better than what gcc and many other compilers do. I
personally see no use for trigraphs in this day and age.

I agree, but unfortunately the standard is as it is. Of course, the
standard could add a "no-trigraphs" pragma so that they can be disabled
without breaking any existing code.
 
Skarmander

Mark said:
They're not.

You just try writing C code on a Mac G4 with a US keyboard. :-(

Even when trigraphs solve the problem, they are arguably an inferior
solution compared to just about every alternative: redefined keyboard
layouts, editors with macro capabilities, search-and-replace of
system-specific characters that are not so easily mistaken for genuine
character sequences, getting a platform with support for all the
characters in that newfangled contraption "ASCII". (Just in case. :)

It's hard to see why this admittedly annoying problem of input should be
solved by extending the C language. Mandating that every implementation
should have on-demand support for trigraphs would have been just about
acceptable, but unfortunately the standard just requires them to be
always on.

"Trigraphs: for when even ASCII is too much to ask for..."

S.
 
Jordan Abel

They're not.

You just try writing C code on a Mac G4 with a US keyboard. :-(

US keyboards aren't the problem. What characters are missing from your
Mac's US keyboard? The problem is with non-US ones, or, more like,
non-us iso646 codesets, which actually don't contain some of
$[\]{|}~_^@`
 
Lew Pitcher

Jordan said:
[snip]
The problem is with non-US ones, or, more like,
non-us iso646 codesets, which actually don't contain some of
$[\]{|}~_^@`

Or even US non-iso646 codesets, like the EBCDIC-* variations that IBM
mainframes use.


--
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
 
Skarmander

Lew said:
Jordan said:
On Mon, 07 Nov 2005 00:49:47 GMT, in comp.lang.c, Daniel Rudy wrote:

What exactly are trigraphs, and why are they undesirable?
[snip]

The problem is with non-US ones, or, more like,
non-us iso646 codesets, which actually don't contain some of
$[\]{|}~_^@`

$, @ and ` are never necessary to write portable C. How you get your
application's strings encoded is another problem.

Or even US non-iso646 codesets, like the EBCDIC-* variations that IBM
mainframes use.

EBCDIC has many funky versions, but do you actually know ones that do
not have the characters of the C trigraph set (# [ \ ] ^ { | } ~)? (This
is a genuine question, not an attempt at being snarky.)

Now, the ISO 646 variation of the Danish national character set, for
example, does not in fact have [ \ ] { | }; in their stead are Æ Ø Å æ ø å.

I sincerely doubt Danish C programmers need trigraphs so they can use
this character set to code in, however; most will probably have ISO
8859-1 compatible platforms. With cross-compilers, if necessary. Or else
they ought to have compilers that read Æ as [. :)

S.
 
Keith Thompson

Skarmander said:
Even when trigraphs solve the problem, they are arguably an inferior
solution compared to just about every alternative: redefined keyboard
layouts, editors with macro capabilities, search-and-replace of
system-specific characters that are not so easily mistaken for genuine
character sequences, getting a platform with support for all the
characters in that newfangled contraption "ASCII". (Just in case. :)

It's hard to see why this admittedly annoying problem of input should
be solved by extending the C language. Mandating that every
implementation should have on-demand support for trigraphs would have
been just about acceptable, but unfortunately the standard just
requires them to be always on.

"Trigraphs: for when even ASCII is too much to ask for..."

The 'A' in ASCII stands for American. There are character sets that
are similar to ASCII except for the substitution of accented letters
for some of the punctuation marks. I *think* these have largely been
superseded by 8-bit and larger character sets that have plenty of room
for both punctuation characters and accented letters, so the need for
trigraphs is less than it was when they were introduced back in the
late 1980s.

If I recall correctly, one of the European national bodies insisted on
some solution for the national character set problem before they would
ratify the standard (I'm not sure whether they insisted on trigraphs
in particular). It was basically a political decision. I'm sure the
details are googleable.
 
Skarmander

Keith said:
The 'A' in ASCII stands for American.

I'll admit to a little pro-American bias in this case...

Actually, I stress that I used ASCII here only as a lowest-common
denominator of which the characters (not the encoding itself) are
accepted as lowest-common denominator just about everywhere. That is,
even if you're not using ASCII, you'll very likely be able to represent
the printable ASCII characters in your character set.

There are character sets that are similar to ASCII except for the
substitution of accented letters for some of the punctuation marks.

There are, but they're not common, and not used for coding your C source
for exactly this reason.

I *think* these have largely been superseded by 8-bit and larger
character sets that have plenty of room for both punctuation
characters and accented letters, so the need for trigraphs is less
than it was when they were introduced back in the late 1980s.

Yes. Notwithstanding this, my remark on superior solutions stands. These
would have been workable in the 1980s too.

If I recall correctly, one of the European national bodies insisted on
some solution for the national character set problem before they would
ratify the standard (I'm not sure whether they insisted on trigraphs
in particular). It was basically a political decision. I'm sure the
details are googleable.

Politics would make more sense than technical limitations, yes. I'll
look up the full story someday. (Haven't found it yet, but I'm sure it's
been discussed before.)

S.
 
Christian Bau

Jordan Abel said:
They're not.

You just try writing C code on a Mac G4 with a US keyboard. :-(

US keyboards aren't the problem. What characters are missing from your
Mac's US keyboard? The problem is with non-US ones, or, more like,
non-us iso646 codesets, which actually don't contain some of
$[\]{|}~_^@`

You could enter all these characters even on a German Apple II computer
around 1980 or so (the only problem is that there was no C compiler, but
writing Pascal programs would have been a major pain without all these
characters).
 
Jordan Abel

Lew said:
Jordan said:
On Mon, 07 Nov 2005 00:49:47 GMT, in comp.lang.c, Daniel Rudy wrote:

What exactly are trigraphs, and why are they undesirable?
[snip]

The problem is with non-US ones, or, more like,
non-us iso646 codesets, which actually don't contain some of
$[\]{|}~_^@`
$, @ and ` are never necessary to write portable C. How you get your
application's strings encoded is another problem.

I listed those for completeness. Unfortunately, such completeness was
hurt by the fact that I forgot #.
 
Richard Tobin

US keyboards aren't the problem. What characters are missing from your
Mac's US keyboard?

There's no hash key.....

Really? I thought that was just on UK keyboards. What do you have
on shift-3?

Mac UK keyboards have the sterling symbol, and hash is available as
alt-3, which is a real pain if you use normal keyboards too.

-- Richard
 
Keith Thompson

Mark McIntyre said:
There's no hash key.....

Was "US keyboard" a typo for "UK keyboard"?

(Even if I had to use a keyboard with no '#' key, I'd try to find a
solution other than trigraphs. I'd at least filter the source code
before distributing it.)
 
Peter Nilsson

Mark said:
There's no hash key.....

If the alternative keystrokes are too annoying, just change the
keyboard mapping.

I'm surprised you didn't mention the situation where trigraphs were
commonly used on Macs, namely with 32-bit character constants like
'???\?'.
 
