printf in glorious colour

Xavier Roche

On 01/03/2013 11:04, Malcolm McLean wrote:
How widely supported are the escape ... m colour codes for text?

This is not a C-related question, but it depends on your terminal's
capabilities (i.e. the TERM environment variable on Unix; see man -i 5
terminfo). Most Unix-flavored terminals support vt220-style escape
sequences (http://en.wikipedia.org/wiki/ANSI_escape_code), for example.

You may want to use a more abstract way to do that, however (such as
http://en.wikipedia.org/wiki/Curses_(programming_library))
 
BartC

Malcolm McLean said:
How widely supported are the escape ... m colour codes for text?

They don't seem to work under Windows unless you have something like
'ansicon' installed (but which introduces problems of its own.)
 
Nobody

How widely supported are the escape ... m colour codes for text?

There are enough exceptions that hard-coding those sequences can't be
explained as anything other than the programmer not knowing about termcap
or terminfo.

Any Unix system will have either or both of those libraries. The Windows
console doesn't support escape sequences.

On a related note, even \r (carriage return), \b (backspace) and \a (bell)
aren't universally supported. And even where \r and \b are supported,
their behaviour differs between video and hardcopy terminals.
 
Malcolm McLean

There are enough exceptions that hard-coding those sequences can't be
explained as anything other than the programmer not knowing about termcap
or terminfo.

On a related note, even \r (carriage return), \b (backspace) and \a (bell)
aren't universally supported. And even where \r and \b are supported,
their behaviour differs between video and hardcopy terminals.
The situation is that an embedded system running embedded Linux is spitting out ASCII text to a console which runs under Windows. I'm writing both the embedded side code to get the text, and the Windows side code to communicate with the device and read it. So the question is where to strip the escape sequences out. There's no option other than to hard-code them because I can't control the embedded side programs that are producing the output stream.
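
For illustration only, here is a minimal sketch of what stripping the colour codes on the receiving side could look like. It assumes the stream is ASCII-compatible and that the sequences to drop are ECMA-48 CSI sequences (ESC, '[', parameters, one final byte), which includes the "ESC [ ... m" colour codes. The function name strip_csi is hypothetical and not taken from any poster's code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, dropping CSI sequences (ESC '[' ... final byte) and
 * any other two-byte ESC sequence. dst must have room for
 * strlen(src) + 1 bytes. Returns the number of bytes written, not
 * counting the terminating '\0'.
 */
size_t strip_csi(char *dst, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);

    while (*src != '\0') {
        if (*src != '\033') {        /* ordinary byte: keep it */
            dst[n++] = *src++;
            continue;
        }
        src++;                       /* skip ESC */
        if (*src == '[') {
            src++;                   /* skip '[' */
            /* parameter and intermediate bytes lie in 0x20..0x3F */
            while (*src >= 0x20 && *src <= 0x3F)
                src++;
            /* the final byte lies in 0x40..0x7E, e.g. 'm' for colours */
            if (*src >= 0x40 && *src <= 0x7E)
                src++;
        } else if (*src != '\0') {
            src++;                   /* some other two-byte escape */
        }
    }
    dst[n] = '\0';
    return n;
}
```

Calling strip_csi(out, "\033[1;34mls\033[0m ok") leaves "ls ok" in out. A filter working on a live stream would also have to carry its state across reads, because a sequence can be split between two buffers; this sketch only handles a complete string.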
 
James Kuyper

On 03/01/2013 07:29 AM, Nobody wrote:
....
On a related note, even \r (carriage return), \b (backspace) and \a (bell)
aren't universally supported. ...

If it's a conforming implementation of C, those characters must be
supported: "In the basic execution character set, there shall be control
characters representing alert, backspace, carriage return, and new
line." (5.2.1p3). If it's not a conforming implementation, there's no
limit to how far its failure to conform can go, so there's not much
point in talking about it here. The associated semantics are covered by
5.2.2p2:

"Alphabetic escape sequences representing nongraphic characters in the
execution character set are intended to produce actions on display
devices as follows:
\a (alert) Produces an audible or visible alert without changing the
active position.
\b (backspace) Moves the active position to the previous position on the
current line. If
....
\r (carriage return) Moves the active position to the initial position
of the current line."

Of course, "are intended to produce" is sufficiently vague to allow
implementations that don't meaningfully implement the specified semantics.
 
glen herrmannsfeldt

James Kuyper said:
On 03/01/2013 07:29 AM, Nobody wrote:
If it's a conforming implementation of C, those characters must be
supported: "In the basic execution character set, there shall be control
characters representing alert, backspace, carriage return, and new
line." (5.2.1p3). If it's not a conforming implementation, there's no
limit to how far its failure to conform can go, so there's not much
point in talking about it here. The associated semantics are covered by
5.2.2p2:

The corresponding ASCII characters are reasonably well defined.

There is an EBCDIC backspace, so that should also work.

EBCDIC has CR, LF, and NL. I am not sure which one is used for '\n'
for EBCDIC C systems. I don't know of EBCDIC systems with a bell.
(There is a BEL control character, but I don't know what it does
on any system.)

The IBM 2741, the non-ASCII terminal most commonly used with IBM
mainframes, isn't EBCDIC. (It has its own character set based on
the position of the characters on the selectric type ball.)
As I remember, and as Wikipedia says, there is no carriage return
character or bell character. There is BS, LF, and NL.

-- glen
 
Bart van Ingen Schenau

The situation is that an embedded system running embedded Linux is
spitting out ASCII text to a console which runs under Windows. I'm
writing both the embedded side code to get the text, and the Windows
side code to communicate with the device and read it. So the question is
where to strip the escape sequences out. There's no option other than to
hard-code them because I can't control the embedded side programs that
are producing the output stream.

This is going off-topic for c.l.c, but there might be a possibility to
control the programs that produce the output.
In Unix-like systems, it is very common to redirect the output from one
program to another program, and the programs are not expected to be able
to handle escape sequences.
Because of this, most (all?) programs that can produce colorized console-
output will do so only if they determine that their output goes to an
interactive terminal.
It might be worth it to investigate if the Linux-side of your remote
console can act as a non-interactive device.
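
As a rough sketch of the check described above, a program on a POSIX system can ask whether its output is attached to a terminal with isatty() and only emit colour codes when it is. The helper names stream_is_tty and print_maybe_red are made up for this example, and the hard-coded red sequence is only there to keep it short.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>              /* isatty(), POSIX */

/* Return nonzero if the stream is attached to a terminal. */
int stream_is_tty(FILE *fp)
{
    assert(fp != NULL);
    return isatty(fileno(fp));
}

/*
 * Write msg in red when fp is a terminal, and as plain text when it is
 * a pipe or a file. Returns the value returned by fprintf.
 */
int print_maybe_red(FILE *fp, const char *msg)
{
    if (stream_is_tty(fp))
        return fprintf(fp, "\033[31m%s\033[0m", msg);
    return fprintf(fp, "%s", msg);
}
```

This is also why a program's behaviour changes once it runs under a pseudo terminal: isatty() then reports a terminal, so the colour branch is taken.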

Bart v Ingen Schenau
 
Nobody

On 03/01/2013 07:29 AM, Nobody wrote: ...

If it's a conforming implementation of C, those characters must be
supported:

That only requires that the implementation map those sequences to the
appropriate codes in the platform's "native" encoding (e.g. 13, 8 and 7
respectively for ASCII, 13, 22 and 47 for EBCDIC).

The behaviour of terminals is outside the scope of the C standard.
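
To see which codes a given implementation actually uses, something like the following sketch could be compiled on the system in question. The struct and function names are invented for the example; the values it reports depend entirely on the execution character set.

```c
#include <assert.h>
#include <stdio.h>

struct control_codes {
    int alert;
    int backspace;
    int carriage_return;
};

/* Report the values this implementation assigns to '\a', '\b' and '\r'. */
struct control_codes get_control_codes(void)
{
    struct control_codes c;

    c.alert = '\a';
    c.backspace = '\b';
    c.carriage_return = '\r';
    return c;
}

/* Write the three values to out; returns the result of fprintf. */
int print_control_codes(FILE *out)
{
    struct control_codes c = get_control_codes();

    assert(out != NULL);
    return fprintf(out, "\\a=%d \\b=%d \\r=%d\n",
                   c.alert, c.backspace, c.carriage_return);
}
```

On an ASCII-based implementation, print_control_codes(stdout) prints \a=7 \b=8 \r=13. What a terminal then does with those bytes is a separate matter.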
 
James Kuyper

That only requires that the implementation map those sequences to the
appropriate codes in the platform's "native" encoding (e.g. 13, 8 and 7
respectively for ASCII, 13, 22 and 47 for EBCDIC).

No, it doesn't even require that. A fully conforming implementation
could map those sequences to inappropriate codes.
The behaviour of terminals is outside the scope of the C standard.

Perhaps - but 5.2.2 does in fact specify the intended behavior of
display devices.
 
Ivan Shmakov

[Cross-posting to news:comp.unix.misc, just in case.]
There are enough exceptions that hard-coding those sequences can't be
explained as anything other than the programmer not knowing about
termcap or terminfo.

Seconded. However, it should be noted that, say, both ls(1) of
GNU Coreutils and vlc(1) of the VideoLAN project have these
hardcoded. (And while it may be reasonable for the first, I
doubt that it is for the second.)
Any Unix system will have either or both of those libraries.

As well as a variant of the Curses library.
The Windows console doesn't support escape sequences.

Still, there's PDCurses.

[...]
 
Ivan Shmakov

[Cross-posting to and setting Followup-To:
there, for that'd be a more appropriate newsgroup.]
The situation is that an embedded system running embedded Linux is
spitting out ASCII text to a console which runs under Windows. I'm
writing both the embedded side code to get the text, and the Windows
side code to communicate with the device and read it. So the
question is where to strip the escape sequences out.

FWIW, "pure" ASCII text isn't supposed to contain such escape
sequences.

However, I wonder if it's possible to use either MinTTY or PuTTY
for the task at hand? Both of them are capable of ECMA-48
escape sequences, including that for colors. (And PuTTY also
supports serial port communication, BTW.)

[...]
 
Nick Keighley

The situation is that an embedded system running embedded Linux is spitting out ASCII text to a console which runs under Windows. I'm writing both the embedded side code to get the text, and the Windows side code to communicate with the device and read it. So the question is where to strip the escape sequences out. There's no option other than to hard-code them because I can't control the embedded side programs that are producing the output stream.

my instinct would be to leave the embedded side to do its own thing
and process the data into something escape free on the host (is it
always going to be windows?) side.

{red}hello! {blue,bold}world.

HTML I suppose if you must

Your console driver then translates it back into whatever the terminal
supports.

I tend to hate implementation detail from the edges of the system
leaking into the core application. I've seen highly compressed radio
protocol values stored in server side RdBs and no one being able to
see a problem (until the bit representation on the air interface
changed...)
 
Malcolm McLean

On Mar 1, 2:13 pm, Malcolm McLean <[email protected]>

my instinct would be to leave the embedded side to do its own thing
and process the data into something escape free on the host (is it
always going to be windows?) side.

I tend to hate implementation detail from the edges of the system
leaking into the core application. I've seen highly compressed radio
protocol values stored in server side RdBs and no one being able to
see a problem (until the bit representation on the air interface
changed...)
The embedded side runs embedded Linux. Initially my console just supported passing command line arguments to the shell, to be executed via popen().
But that didn't allow for interactive commands, which interleave reads
from stdin with output to stdout.
It turns out that supporting these on Linux is a real hassle, and I ended
up using a pseudo terminal (the program pty). However when you run ls
under pty its behaviour changes, and suddenly it starts outputting the
colour codes.
For development we mostly use Linux machines, but some of the tools work
only under Windows.
 
David Thompson

The corresponding ASCII characters are reasonably well defined.
Although the 'alternate' interpretation of LF as NL was often an issue
-- quite a few terminals needed a switch/jumper/etc. for it.
There is an EBCDIC backspace, so that should also work.

EBCDIC has CR, LF, and NL. I am not sure which one is used for '\n'
for EBCDIC C systems. I don't know of EBCDIC systems with a bell.
(There is a BEL control character, but I don't know what it does
on any system.)

The IBM 2741, the non-ASCII terminal most commonly used with IBM
mainframes, isn't EBCDIC. (It has its own character set based on
the position of the characters on the selectric type ball.)
As I remember, and as Wikipedia says, there is no carriage return
character or bell character. There is BS, LF, and NL.
And HT, although the tab stops were manually set and often wrong.

But I think 327X/328Xs were at least as common as 2741's. And in
practice all EBCDIC, although theoretically there were ASCII models to
match S/360's not-really-usable PSW.ASCII bit.

327X displays use specialized meanings of DC1-4 ESC and I think one or
two other controls. 328X printers could just use 327X-style buffers,
or 'stream' data with IIRC FF CR LF NL but not BS VT. IDRC HT.
 
glen herrmannsfeldt

(snip, someone wrote)
(snip, then I wrote)
Although the 'alternate' interpretation of LF as NL was often an issue
-- quite a few terminals needed a switch/jumper/etc. for it.

I remember the Tektronix scope terminals with many possible combinations
that could be selected by jumper.
And HT, although the tab stops were manually set and often wrong.

WYLBUR knows what to do with tabs, either on a 2741, or an ASCII
terminal with settable tabs. For data entry or normal listing,
there are columns for line numbers, so those aren't counted.
I did at least used to set a column 7 tab for Fortran programs,
and maybe also 73. SHO TABS VERIFY spaces out, printing a '1' at
each tab position, then tabs out and prints a '1'. If they line
up, you know that the tabs are set correctly. Especially nice for
entry or printing of tables. In normal use, WYLBUR doesn't store tabs
in a data set, but instead stores in its special blank compressed form.
But I think 327X/328Xs were at least as common as 2741's. And in
practice all EBCDIC, although theoretically there were ASCII models to
match S/360's not-really-usable PSW.ASCII bit.

Yes, I was thinking about the S/360 and early S/370 days. In later
years, the 327x were more popular. But 3270 and such are very different
from the usual serial terminals. For one, the controller does much of
the work that would, in the case of a usual serial terminal, be done by
the terminal. I believe that the data as sent to the controller is
EBCDIC, I am not so sure about what is actually sent to the 3270 itself,
especially in the case of control characters.

As well as I understand it, when the S/360 ASCII bit was used for the
EC mode bit on S/370, there were no IBM systems that ever used the bit.
It could only be set in supervisor mode. It is possible that some
non-IBM system could have set it.

But even so, it was not for the usual ASCII-7 (seven bit ASCII that we
all know), but a proposed but never used ASCII-8. ASCII-8 is not just
ASCII-7 with additional characters, but half of the characters
(I haven't looked at a table recently) were moved up. ASCII mode changes
the zones generated by UNPK, and the signs generated by decimal
instructions. All sign values are allowed as input to decimal
instructions, independent of the bit. Rather than use the mode bit, it
is usual to just change them after they are generated.

It might be that if ASCII-8 was standardized at the right time, that
IBM would have converted all of OS/360 over. That is, before there was
any installed base outside IBM. There are comments in the source for
the PL/I (F) compiler and library, in each source file, indicating that
it is either independent of the source character set, or if converted
to a different character set, will assemble in that character set.

To get back to C, some years ago someone sent me C source for a S/370
disassembler. It was pretty much impossible to use on an ASCII system,
and I only had ASCII systems available. There were many character
constants that would be used for printing messages, along with data
from the input program. But on an ASCII system, the constants are in
ASCII but the data was still in EBCDIC! (Such as from %c or %s.)

I might have been able to do it by converting all character constants
to EBCDIC using \x escapes, then converting the EBCDIC output file back
to ASCII, but I never did that.
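
Purely as an illustration of the "convert the EBCDIC output back" half of that idea, a translation routine might look roughly like this. It is a partial sketch: only space, digits and upper-case letters are mapped, everything else becomes '?', and the function names are hypothetical. A real converter would use a full 256-entry table for the specific EBCDIC code page involved.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Translate one EBCDIC byte to a character in the host character set.
 * Only space, '0'-'9' and 'A'-'Z' are handled; anything else maps to '?'.
 */
char ebcdic_to_host(unsigned char e)
{
    /* EBCDIC letters come in three non-contiguous runs */
    static const char a_to_i[] = "ABCDEFGHI";   /* 0xC1..0xC9 */
    static const char j_to_r[] = "JKLMNOPQR";   /* 0xD1..0xD9 */
    static const char s_to_z[] = "STUVWXYZ";    /* 0xE2..0xE9 */
    static const char digits[] = "0123456789";  /* 0xF0..0xF9 */

    if (e == 0x40)
        return ' ';
    if (e >= 0xC1 && e <= 0xC9)
        return a_to_i[e - 0xC1];
    if (e >= 0xD1 && e <= 0xD9)
        return j_to_r[e - 0xD1];
    if (e >= 0xE2 && e <= 0xE9)
        return s_to_z[e - 0xE2];
    if (e >= 0xF0 && e <= 0xF9)
        return digits[e - 0xF0];
    return '?';
}

/* Translate len bytes from src into dst and terminate dst with '\0'.
 * dst must have room for len + 1 bytes. */
void ebcdic_to_host_buf(char *dst, const unsigned char *src, size_t len)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < len; i++)
        dst[i] = ebcdic_to_host(src[i]);
    dst[len] = '\0';
}
```

The lookup strings are used instead of arithmetic such as 'A' + n so that the routine does not itself assume the host letters are contiguous.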
327X displays use specialized meanings of DC1-4 ESC and I think one or
two other controls. 328X printers could just use 327X-style buffers,
or 'stream' data with IIRC FF CR LF NL but not BS VT. IDRC HT.

I barely ever used a 327x terminal, and never a printer for one.
I do have some of the manuals describing them, though. As well as I
know it, data is transferred a whole line, or even whole screen, at
a time. There are control characters to describe fields where data
entry is allowed, though, and the terminal (or controller) allows
one to modify such data, and for the host to read it back.

The result is a much lower interrupt overhead than is usual for ASCII
(or 2741) style terminals. It was also somewhat common to have a box
that emulated the 327x data stream on one side, and ASCII on the other,
(especially for remote terminals) such that users could have the 327x
experience with cheaper ASCII terminals.

-- glen
 
Keith Thompson

glen herrmannsfeldt said:
But even so, it was not for the usual ASCII-7 (seven bit ASCII that we
all know), but a proposed but never used ASCII-8. ASCII-8 is not just
ASCII-7 with additional characters, but half of the characters
(I haven't looked at a table recently) were moved up. ASCII mode changes
the zones generated by UNPK, and the signs generated by decimal
instructions. All sign values are allowed as input to decimal
instructions, independent of the bit. Rather than use the mode bit, it
is usual to just change them after they are generated.

It might be that if ASCII-8 was standardized at the right time, that
IBM would have converted all of OS/360 over. That is, before there was
any installed base outside IBM. There are comments in the source for
the PL/I (F) compiler and library, in each source file, indicating that
it is either independent of the source character set, or if converted
to a different character set, will assemble in that character set.
[...]

Interesting. Do you have a reference for this proposed ASCII-8?
It turns out to be difficult to Google; most of the references I've
found incorrectly refer to things like Latin-1 or Windows-1252 as
"8-bit ASCII".

If ASCII-8 had caught on, with some common characters requiring 8
bits, I wonder if UTF-8 would have been possible.
 
glen herrmannsfeldt

(snip, I wrote)
But even so, it was not for the usual ASCII-7 (seven bit ASCII that we
all know), but a proposed but never used ASCII-8. ASCII-8 is not just
ASCII-7 with additional characters, but half of the characters
(I haven't looked at a table recently) were moved up. ASCII mode changes
the zones generated by UNPK, and the signs generated by decimal
instructions. All sign values are allowed as input to decimal
instructions, independent of the bit. Rather than use the mode bit, it
is usual to just change them after they are generated.
It might be that if ASCII-8 was standardized at the right time, that
IBM would have converted all of OS/360 over. That is, before there was
any installed base outside IBM. There are comments in the source for
the PL/I (F) compiler and library, in each source file, indicating that
it is either independent of the source character set, or if converted
to a different character set, will assemble in that character set. [...]

Interesting. Do you have a reference for this proposed ASCII-8?
It turns out to be difficult to Google; most of the references I've
found incorrectly refer to things like Latin-1 or Windows-1252 as
"8-bit ASCII".
If ASCII-8 had caught on, with some common characters requiring 8
bits, I wonder if UTF-8 would have been possible.

Look in the Appendix of the S/360 Principles of Operation. Later
versions have a better description of it, such as the -7 (Dec 1967)
version from bitsavers.

There are still plenty of code points, they just moved them around.

Interesting also in the description is that ASCII bits are numbered
from 7 down to 1 (MSB to LSB) and EBCDIC from 0 to 7 (That is,
big-endian bit numbering for EBCDIC.)

Also interesting, IBM would have replaced the '^' with the "not" sign,
and the '!' with the vertical bar (logical OR) sign. (That is,
instead of the '|' character that most of us think of as a vertical
bar, but which is actually a split vertical bar in ASCII.)
(It is split on the Dell keyboard I am using now, but not on the
screen display.)

-- glen
 
