Writing files

  • Thread starter Edward Rutherford

Edward Rutherford

Hello

When I call fclose() on a file I've been writing to, does fclose always
write the EOF character before closing the file? Or do I need to write
the EOF myself?

Another question: If I've got a file opened in update (r+) mode, how can
I delete a character in the file? I want to do this without needing to
create a second file and copy all but the characters I want to delete.
Is there some kind of control sequence that will do this?

Any help greatly appreciated.
 
Keith Thompson

Edward Rutherford said:
When I call fclose() on a file I've been writing to, does fclose always
write the EOF character before closing the file? Or do I need to write
the EOF myself?

If the system requires some special character to be written to the end
of a file, fclose() should take care of writing it.

In general, there is no EOF character. EOF is a value returned by
certain input functions to indicate that no character is available.

On most systems, especially for binary files, the end of a file is
simply the end of the file, typically recorded in the directory
entry as the file's size in bytes.

Edward Rutherford said:
Another question: If I've got a file opened in update (r+) mode, how can
I delete a character in the file? I want to do this without needing to
create a second file and copy all but the characters I want to delete.
Is there some kind of control sequence that will do this?

There's no portable way to do this other than re-writing the file,
or at least the portion of it following the deleted character.
There is typically no non-portable way to do it either.
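As a rough sketch of "re-writing the portion of the file following the deleted character": the tail can be shifted left by one byte in place, but shortening the file afterwards still needs a system-specific call. The version below assumes a POSIX system (ftruncate() and fileno() are not ISO C), assumes the file is treated as raw bytes, and uses a hypothetical helper name, delete_byte_in_place.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and ftruncate(); POSIX only */
#include <stdio.h>
#include <unistd.h>

/* Delete the byte at offset pos by shifting everything after it one byte
   towards the start of the file, then truncating the file by one byte.
   Returns 0 on success, -1 on failure.
   delete_byte_in_place is a hypothetical name used for illustration. */
int delete_byte_in_place(const char *path, long pos)
{
    FILE *f = fopen(path, "r+b");
    if (f == NULL)
        return -1;

    int ok = 0;
    long size;
    if (fseek(f, 0L, SEEK_END) == 0 && (size = ftell(f)) >= 0
        && pos >= 0 && pos < size) {
        ok = 1;
        for (long rd = pos + 1; ok && rd < size; rd++) {
            int c;
            /* A seek is required when switching between reading and writing. */
            ok = fseek(f, rd, SEEK_SET) == 0
              && (c = getc(f)) != EOF
              && fseek(f, rd - 1, SEEK_SET) == 0
              && putc(c, f) != EOF;
        }
        /* Flush stdio's buffer, then cut off the now-duplicated last byte. */
        ok = ok && fflush(f) == 0
          && ftruncate(fileno(f), (off_t)(size - 1)) == 0;
    }
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Seeking before every single byte is slow; a real implementation would move the tail in larger blocks. If the process is interrupted part-way through, the file is left half-shifted, which is one reason the copy-and-rename approach is usually preferred.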
 
Ben Pfaff

Edward Rutherford said:
When I call fclose() on a file I've been writing to, does fclose always
write the EOF character before closing the file? Or do I need to write
the EOF myself?

EOF is just a special non-character value returned by certain
standard library functions to indicate error or end-of-file. It
is not a character, so you can't write it to a file. fclose()
will take care of whatever needs to happen here.

Edward Rutherford said:
Another question: If I've got a file opened in update (r+) mode, how can
I delete a character in the file? I want to do this without needing to
create a second file and copy all but the characters I want to delete.
Is there some kind of control sequence that will do this?

There's no standard way to do that, and most operating systems
don't provide any way to do it, other than the technique you
describe.

This is a FAQ:

19.14: How can I insert or delete a line (or record) in the middle of a
file?

A: Short of rewriting the file, you probably can't. The usual
solution is simply to rewrite the file. (Instead of deleting
records, you might consider simply marking them as deleted, to
avoid rewriting.) Another possibility, of course, is to use a
database instead of a flat file. See also questions 12.30 and
19.13.
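The FAQ's "mark them as deleted" idea could look something like the sketch below. It assumes fixed-length records whose first byte is reserved as a status flag; RECORD_SIZE, DELETED_MARK and mark_record_deleted are hypothetical names invented for the example, not anything from the FAQ. It also assumes that seeking to the end of a binary stream reports the file size, which ISO C does not strictly guarantee.

```c
#include <stdio.h>

#define RECORD_SIZE 16L   /* hypothetical fixed record length in bytes */
#define DELETED_MARK '*'  /* hypothetical flag stored in a record's first byte */

/* Overwrite the first byte of record number `index` with DELETED_MARK.
   The file keeps its length; readers are expected to skip flagged records.
   Returns 0 on success, -1 on failure. */
int mark_record_deleted(const char *path, long index)
{
    FILE *f = fopen(path, "r+b");
    if (f == NULL)
        return -1;

    int ok = 0;
    long size;
    /* Refuse to write past the end, which would grow the file. */
    if (index >= 0
        && fseek(f, 0L, SEEK_END) == 0
        && (size = ftell(f)) >= 0
        && index < size / RECORD_SIZE) {
        ok = fseek(f, index * RECORD_SIZE, SEEK_SET) == 0
          && putc(DELETED_MARK, f) != EOF;
    }
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Nothing is moved or truncated, so the operation touches a single byte. The flagged records can be dropped later in one pass that rewrites the whole file.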
 
Jens Thoms Toerring

Edward Rutherford said:
When I call fclose() on a file I've been writing to, does fclose always
write the EOF character before closing the file? Or do I need to write
the EOF myself?

An "EOF character" does not exist. You may expect such a
character to exist since e.g. getc() will return the value
named 'EOF' when the end of the file is reached. But that
doesn't mean that there was a character like that in the
file; it is only a special value returned when the end
of the file has been reached. Incidentally, since getc()
returns the non-character value EOF under this condition,
getc()'s return value is int and not char: since EOF isn't
a char, it otherwise would not be possible to return this
value.
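To illustrate why the return value must be kept in an int, here is a minimal sketch of the usual read loop. The helper name count_bytes is made up for the example. It counts every byte in a file, including any 0xff bytes, which a char variable could mistake for EOF.

```c
#include <stdio.h>

/* Count the bytes in a file by reading until getc() reports EOF.
   Returns -1 if the file cannot be opened or a read error occurs.
   count_bytes is a hypothetical helper name, not from the thread. */
long count_bytes(const char *path)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    long n = 0;
    int c;                      /* int, not char: must be able to hold EOF */
    while ((c = getc(in)) != EOF)
        n++;

    /* EOF is returned for both end-of-file and errors; ferror() tells
       the two apart. */
    if (ferror(in))
        n = -1;
    fclose(in);
    return n;
}
```

Nothing in the file itself ends the loop. getc() returns EOF only once there are no more bytes to read, or when a read fails.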

Some of the confusion may come from the fact that, if I remember
correctly, in DOS there could be a special character ('^Z') that
in text files was treated by DOS as an end of file marker. But
that character doesn't have anything to do with EOF. If you
want that kind of character in the file then you have to write
it in there yourself, it isn't written into the file by itself.

Edward Rutherford said:
Another question: If I've got a file opened in update (r+) mode, how can
I delete a character in the file? I want to do this without needing to
create a second file and copy all but the characters I want to delete.
Is there some kind of control sequence that will do this?

There aren't any control sequences to do that. The only way
to do part of the job is by copying all the stuff that comes
after the character you want removed one place nearer to the
beginning of the file. But then you still need to truncate
the file by one character and, as far as I know, that's only
possible using system-specific functions (e.g. under UNIX with
truncate() or ftruncate(); Windows will probably have something
similar). So creating a copy without the character you want to
get rid of, then deleting the old file with remove() and
finally renaming the new file using rename() to give it the
original name is the only way I am aware of when restricting
yourself to using no system-dependent functions and not
relying on implementation-defined behaviour.
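The copy, remove() and rename() sequence described above might be sketched as follows. The function name delete_byte_by_copy and the caller-supplied temporary file name are assumptions made for the example, and the file is treated as a sequence of bytes (binary mode) rather than as text.

```c
#include <stdio.h>

/* Copy path to tmp_path, skipping the byte at offset pos, then replace the
   original with the copy. Returns 0 on success, -1 on failure.
   delete_byte_by_copy is a hypothetical name used for illustration. */
int delete_byte_by_copy(const char *path, const char *tmp_path, long pos)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    int c, ok = 1;
    long i = 0;
    while ((c = getc(in)) != EOF) {
        /* Write every byte except the one at offset pos. */
        if (i++ != pos && putc(c, out) == EOF) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    if (!ok) {
        remove(tmp_path);
        return -1;
    }
    /* rename() onto an existing file is implementation-defined in ISO C,
       so the original is removed first. */
    if (remove(path) != 0 || rename(tmp_path, path) != 0)
        return -1;
    return 0;
}
```

Between the remove() and the rename() there is a short window in which the original name does not exist. Closing that window requires system-specific facilities.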

Regards, Jens
 
Keith Thompson

Ben Pfaff said:
EOF is just a special non-character value returned by certain
standard library functions to indicate error or end-of-file. It
is not a character, so you can't write it to a file. fclose()
will take care of whatever needs to happen here.

You can't meaningfully write EOF to a file, but if you try the compiler
is likely to let you get away with it (and do something you probably
didn't expect).

For example, this program compiles and runs with no apparent problems
on my system:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *out = fopen("out.txt", "w");
    if (out == NULL) {
        perror("out.txt");
        return EXIT_FAILURE;
    }

    fputs("Hello, world\n", out);
    fputc(EOF, out);    /* writes (unsigned char)EOF, not an end-of-file marker */
    fclose(out);

    return 0;
}

But the resulting out.txt file contains an extra 0xff character at
the end (at least on my system). Why? Because EOF happens to be
(-1) on my system (and most others, but the standard only guarantees
that it's negative), and fputc() converts the value -1 to unsigned
char, giving the byte 0xff (255).

This particular byte value has nothing to do with marking the end of
a file on my system (or, probably, on yours).

[...]
 
Keith Thompson

Jens Thoms Toerring said:
Some of the confusion may come from the fact that, if I remember
correctly, in DOS there could be a special character ('^Z') that
in text files was treated by DOS as an end of file marker. But
that character doesn't have anything to do with EOF.

Right.

Jens Thoms Toerring said:
If you want that kind of character in the file then you have to
write it in there yourself, it isn't written into the file by itself.

I'm not sure that's correct. My understanding is that, if a special
character is required to mark the end of a file, the system is
responsible for writing that character to the file; for example,
it might be done implicitly by fclose(). Which means that, when
reading the same file, the ^Z character (control-Z, ASCII code 26)
will trigger an end-of-file condition, which will cause fgetc()
to return EOF. (In other words, you'll never see the ^Z character,
either on input or on output, assuming you read and write the file
as text.)

On the other hand, I'm not sure whether DOS actually *requires*
the ^Z character. A quick experiment on Windows indicates that a
^Z in a text file acts like the end of the file, but if there's no
^Z you can just read to the physical end of the file.

Bottom line: C defines a model for text files in which there is no
EOF character; rather EOF is an out-of-band signalling mechanism.
The implementation does whatever is necessary to map the underlying
file semantics onto the C model.

[...]
 
Ben Pfaff

Keith Thompson said:
For example, this program compiles and runs with no apparent problems
on my system:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *out = fopen("out.txt", "w");
    if (out == NULL) {
        perror("out.txt");
        return EXIT_FAILURE;
    }

    fputs("Hello, world\n", out);
    fputc(EOF, out);    /* writes (unsigned char)EOF, not an end-of-file marker */
    fclose(out);

    return 0;
}

But the resulting out.txt file contains an extra 0xff character at
the end (at least on my system). Why? Because EOF happens to be
(-1) on my system (and most others, but the standard only guarantees
that it's negative), and fputc() converts the value -1 to unsigned
char, giving the byte 0xff (255).

This particular byte value has nothing to do with marking the end of
a file on my system (or, probably, on yours).

It's kind of interesting that some systems *do* use a particular
byte value to mark the end of a text file (DOS uses byte value
26, for example), but putc(EOF, stream) won't write that value.
I guess they could choose a value that would work out correctly
modulo 256, e.g. -230 (because -230 + 256 = 26). That would be a
great way to confuse people like the OP.
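The modulo arithmetic can be checked with a tiny sketch. It assumes 8-bit bytes (CHAR_BIT == 8), and byte_written is a hypothetical helper that mirrors the conversion to unsigned char that fputc() and putc() apply to their int argument.

```c
#include <stdio.h>

/* The byte that fputc() would write for a given int argument: the value
   converted to unsigned char, i.e. reduced modulo UCHAR_MAX + 1
   (256 when bytes are 8 bits wide).
   byte_written is a hypothetical name used for illustration. */
unsigned char byte_written(int value)
{
    return (unsigned char)value;
}
```

With 8-bit bytes, byte_written(-1) is 255 (0xff) and byte_written(-230) is 26, the control-Z value.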
 
Seebs

Edward Rutherford said:
When I call fclose() on a file I've been writing to, does fclose always
write the EOF character before closing the file? Or do I need to write
the EOF myself?

What is this "EOF character"?

Edward Rutherford said:
Another question: If I've got a file opened in update (r+) mode, how can
I delete a character in the file? I want to do this without needing to
create a second file and copy all but the characters I want to delete.
Is there some kind of control sequence that will do this?

No. "Control sequences" are an unrelated concept.

I would give more detailed answers, but the flood of trolling from
aioe.org recently has left me uninclined to put in the effort.

-s
 
Gene

Ben Pfaff said:
It's kind of interesting that some systems *do* use a particular
byte value to mark the end of a text file (DOS uses byte value
26, for example), but putc(EOF, stream) won't write that value.
I guess they could choose a value that would work out correctly
modulo 256, e.g. -230 (because -230 + 256 = 26). That would be a
great way to confuse people like the OP.

I'm really pushing old memories here, but the Ctl-Z business in
Windows is a legacy of early MSDOS and ultimately CPM, from which many
of the ideas of MSDOS were shamelessly copied (well, MSDOS _did_ use
back- rather than forward slashes...). In those ancient floppy disk-
oriented systems, directory information was maintained in units of 128
byte sectors, so an end of file character was needed to determine
where within the final sector the end of the file actually lay. To
maintain compatibility with such files, much later systems (like
modern C stdio routines for Windows text mode files) still honor the
Ctl-Z. Thus spake Bill Gates...
 
robertwessel2

Gene said:
I'm really pushing old memories here, but the Ctl-Z business in
Windows is a legacy of early MSDOS and ultimately CPM, from which many
of the ideas of MSDOS were shamelessly copied (well, MSDOS _did_ use
back- rather than forward slashes...). In those ancient floppy disk-
oriented systems, directory information was maintained in units of 128
byte sectors, so an end of file character was needed to determine
where within the final sector the end of the file actually lay. To
maintain compatibility with such files, much later systems (like
modern C stdio routines for Windows text mode files) still honor the
Ctl-Z. Thus spake Bill Gates...


Actually that was just CP/M. MS-DOS has tracked file lengths to the
byte from day one, but retained the common use of control-Z as EOF for
compatibility. But for the most part it was things at the application
level (including, of course, things like the OS command line
utilities) that had to interpret the control-Z; the MS-DOS file I/O
functions were mostly oblivious. And somewhat inconsistent control-Z
handling was there in MS-DOS applications from day one, although
almost all applications that read text files would successfully read
one with a "missing" control-Z. Control-Z's in the middle of files
were often issues: many text editors allowed you to enter a control-Z
in the middle of a file, and then your file would get truncated the
next time it was loaded. These days most applications will
accept, and silently ignore, a trailing control-Z on an input text
file, or just treat it as a special character.
 
Malcolm McLean

Gene said:
I'm really pushing old memories here, but the Ctl-Z business in
Windows is a legacy of early MSDOS and ultimately CPM, from which many
of the ideas of MSDOS were shamelessly copied (well, MSDOS _did_ use
back- rather than forward slashes...).

Which must have cost tens of millions of dollars in problems with
software not being compatible between Unix and PCs, purely because of
the path specification issue.
 
