fgetc - end of line - where is 0xD?

Flash Gordon

Bartc wrote, On 07/12/08 13:19:
I've just remembered the reason: I was calling C's printf() from a language
that expanded "\n" to CR,LF actually in the string literal.

That is a bad design choice for the language. After all, it means it
does not follow the conventions of the system it is running on unless
that happens to be DOS/Windows (well, anything using that convention,
but not any of the many other systems).
Because printf writes to stdout and stdout is in text mode, the LF results
in an extra expansion.

Another argument for why the design of that language was bad.
But the CR,CR,LF is only seen when directed to a
file.

It is probably "seen" otherwise, at least by the OS, just not so visible
to the average user.
So not a C problem other than stdout being awkward to set to binary mode.

I would say it was a problem with the design of the other language. Had
the other language either followed the same convention that C does (as
many languages do) or implemented its own library (as others do), it
would not be a problem. In any case, I don't think stdout was really
provided for binary output; it was (I think) intended to provide the
main textual output to the user of the program (via whatever mechanism
such output might arrive, be it email from a cron job, output on a
teletype or whatever).
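
To make the double expansion concrete, here is a minimal sketch
(assuming a DOS/Windows-style implementation where text-mode streams
translate '\n' to CR,LF on output; the file name is made up):

    #include <stdio.h>

    int main(void)
    {
        /* The literal already contains CR,LF, as the other language
           produced; text-mode stdout then maps the LF to CR,LF again,
           so the bytes reaching the stream are: o n e CR CR LF. */
        printf("one\r\n");

        /* Reopening the stream in binary mode ("wb") suppresses the
           translation, so the bytes go out exactly as written. */
        if (freopen("out.bin", "wb", stdout) != NULL)
            printf("two\r\n");   /* bytes: t w o CR LF */

        return 0;
    }

There is no portable way to flip stdout itself to binary without
reopening it, which is the awkwardness mentioned above.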
 
Guest

Bartc wrote:

...


If there were only a few possible choices, that would make sense. But
what about, for instance, files from systems where end-of-line is
indicated by padding to a fixed block length with '\0'? That's just
one of several real-world options that involve neither CR nor LF.

I think some VMS file formats used a <byte count><data>...<data>
format, i.e. there was no actual EOL character.
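
Something like this could read that kind of format (a sketch only; the
2-byte little-endian count is an assumption for illustration, not the
actual VMS on-disk layout):

    #include <stdio.h>

    /* Print one <byte count><data>...<data> record as a line.
       Returns 0 at end of file, 1 after a complete record. */
    static int print_record(FILE *fp)
    {
        int lo, hi;
        long len, i;

        if ((lo = getc(fp)) == EOF || (hi = getc(fp)) == EOF)
            return 0;
        len = lo | ((long)hi << 8);   /* assumed little-endian count */

        for (i = 0; i < len; i++) {
            int c = getc(fp);
            if (c == EOF)
                return 0;             /* truncated record */
            putchar(c);
        }
        putchar('\n');   /* the count, not a character, ends the line */
        return 1;
    }

    int main(int argc, char **argv)
    {
        FILE *fp = argc > 1 ? fopen(argv[1], "rb") : NULL;
        if (fp == NULL)
            return 1;
        while (print_record(fp))
            ;
        fclose(fp);
        return 0;
    }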
 
James Kuyper

Joe said:
On a C implementation? Which, pray tell?

It was a mainframe implementation in C. If I remember correctly, the
fact that text files were block-oriented was built into the operating
system at a fundamental level, and all text-oriented programs for that
platform expected such a format, whether or not written in C. I believe
that it was the existence of such platforms that was one reason why the
standard was written to be flexible enough to accommodate such an
implementation. It is a platform I've never used, and as a result I
don't remember which one it is. I hope that someone who can speak more
authoritatively than I can will be able to give you more details.

However, I hope it's clear that there's no serious problem with creating
a fully-conforming C implementation for such a platform. When reading in
text mode, the padding is converted into a single '\n' character; when
writing in text mode, '\n' characters are expanded into padding up to
the next multiple of the block size.
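
In outline, the conversion might look like this (a sketch, not any
real implementation's code; the 80-byte record length and '\0' padding
are assumptions for illustration):

    #include <stdio.h>
    #include <string.h>

    #define RECLEN 80   /* assumed block size, for illustration only */

    /* Reading: deliver one fixed-length, padded record as a '\n' line. */
    static int record_to_line(FILE *in, FILE *out)
    {
        char rec[RECLEN];
        size_t len = RECLEN;

        if (fread(rec, 1, RECLEN, in) != RECLEN)
            return 0;                   /* end of file */
        while (len > 0 && rec[len - 1] == '\0')
            len--;                      /* strip the padding */
        fwrite(rec, 1, len, out);
        putc('\n', out);                /* '\n' stands in for the boundary */
        return 1;
    }

    /* Writing: expand one line (without its '\n') back into padding.
       For brevity this handles lines of at most one record. */
    static int line_to_record(const char *line, size_t len, FILE *out)
    {
        char rec[RECLEN] = {0};         /* '\0' padding */

        if (len > RECLEN)
            return 0;
        memcpy(rec, line, len);
        return fwrite(rec, 1, RECLEN, out) == RECLEN;
    }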

As far as I'm concerned, the fact that such an implementation would be
perfectly conforming is more important than the question of whether or
not any such implementation exists. However, I'm pretty sure it does exist.
 
Ben Bacarisse

James Kuyper said:
It was a mainframe implementation in C. If I remember correctly, the
fact that text files were block-oriented was built into the operating
system at a fundamental level, and all text-oriented programs for that
platform expected such a format, whether or not written in C. I
believe that it was the existence of such platforms that was one
reason why the standard was written to be flexible enough to
accommodate such an implementation. It is a platform I've never used,
and as a result I don't remember which one it is. I hope that someone
who can speak more authoritatively than I can will be able to give you
more details.

I hope someone will. In the meantime I can say that I have used
such a mainframe system (an IBM 370) with other languages and I know
that a C implementation became available but only after I stopped
using that system. However, it should have been able to cope with a
file format inherited from punched cards. These have no line ending
characters, and when they were stored on tape or disc it was usually
done as fixed-length, null-padded records.

Someone else talked of VMS and I /have/ used a C compiler on VMS, but
not with its record-oriented files, so I can't swear that it could
open them.

This still has the whiff of an urban legend ("a friend of a friend
actually used such a system") but there must be people who have used a
C compiler with such a system who can say for sure.
However, I hope it's clear that there's no serious problem with
creating a fully-conforming C implementation for such a platform.

This is the key point. At the time of standardising, such file
organisations were not uncommon and C was permitted to deal with them
even if, by some fluke of history, no conforming C library was ever
produced to do so.
 
Richard Tobin

Harald van Dijk said:
Well, I don't know if there are, but according to K&R, there were. It
describes two common conventions: end of file is indicated by -1, or by 0.
The latter was later disallowed by ANSI C, and I have no idea if those
implementations that used it have been changed, and if so, what value for
EOF they have changed to.

In C before "stdio", getchar() returned '\0', but getc() returned -1.

The 7th edition - post stdio - manual page doesn't specify the numeric
value of EOF; it makes getchar() equivalent to getc(stdin), and
documents the end-of-file return value of getchar() as being
incompatible with editions 1-6. The manual page itself doesn't say
that EOF is negative, but it's implied that it's distinct from the
values returned for real characters, and "converting from the 6th
edition" in the introduction says it's -1.

It's possible that there were implementations that returned signed
values from getc(), and might therefore have used a different value
for EOF (e.g. -129), but I'd be surprised if any stdio implementations
used 0. And once the C standard specified that getc() returns an
unsigned char converted to an int, there was no reason for it not to
be -1.
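
That rule is also why the classic reading loop keeps the result in an
int rather than a char: a char cannot distinguish EOF from a real byte
(and under the old 6th-edition convention it could not distinguish end
of file from '\0' either). A minimal sketch:

    #include <stdio.h>

    int main(void)
    {
        int c;   /* int, not char: holds every unsigned char value plus EOF */

        while ((c = getchar()) != EOF)
            putchar(c);

        return 0;
    }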

-- Richard
 
David Thompson

On Mon, 08 Dec 2008 02:17:35 +0000, Ben Bacarisse wrote:
I hope someone will. In the meantime I can say that I have used
such a mainframe system (an IBM 370) with other languages and I know
that a C implementation became available but only after I stopped
using that system. However, it should have been able to cope with a
file format inherited from punched cards. These have no line ending
characters, and when they were stored on tape or disc it was usually
done as fixed-length, null-padded records.
OS/360 et seq (and DOS/360 ditto) padded with blank = EBCDIC 0x40.
Some earlier IBM machines used 6-bit BCDIC in which blank is 0x00, and
I believe did pad with that. Such machines obviously would have had
trouble supporting C, which fortunately hadn't been thought of yet.

I believe some of the competing 8+ bit mainframes, from the so-called
BUNCH (Burroughs, Univac, NCR, CDC, Honeywell), either padded with
0x00 null or used (other) charsets with 0x00 blank, but I don't have
personal experience of them.
 
