Writing files

Discussion in 'C Programming' started by Edward Rutherford, Dec 6, 2010.

  1. Hello

    When I call fclose() on a file I've been writing to, does fclose always
    write the EOF character before closing the file? Or do I need to write
    the EOF myself?

    Another question: If I've got a file opened in update (r+) mode, how can
    I delete a character in the file? I want to do this without needing to
    create a second file and copy all but the characters I want to delete.
    Is there some kind of control sequence that will do this?

    Any help greatly appreciated.
     
    Edward Rutherford, Dec 6, 2010
    #1

  2. Edward Rutherford <> writes:
    > When I call fclose() on a file I've been writing to, does fclose always
    > write the EOF character before closing the file? Or do I need to write
    > the EOF myself?


    If the system requires some special character to be written to the end
    of a file, fclose() should take care of writing it.

    In general, there is no EOF character. EOF is a value returned by
    certain input functions to indicate that no character is available.

    On most systems, especially for binary files, the end of a file is
    simply the end of the file, typically recorded in the directory
    entry as the file's size in bytes.

    > Another question: If I've got a file opened in update (r+) mode, how can
    > I delete a character in the file? I want to do this without needing to
    > create a second file and copy all but the characters I want to delete.
    > Is there some kind of control sequence that will do this?


    There's no portable way to do this other than re-writing the file,
    or at least the portion of it following the deleted character.
    There is typically no non-portable way to do it either.
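
    To illustrate the "re-write the portion following the deleted
    character" option, here is a minimal sketch that assumes a POSIX
    system. fileno() and ftruncate() are POSIX calls rather than ISO C,
    and delete_byte_in_place is a hypothetical helper name chosen for
    this example. It shifts every later byte one position toward the
    start of the file and then shortens the file by one byte.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX fileno()/ftruncate() */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: removes the byte at offset `pos` from the file at
   `path` without a second file. Returns 0 on success, -1 on failure
   (including `pos` lying at or beyond the end of the file). */
int delete_byte_in_place(const char *path, long pos)
{
    assert(path != NULL && pos >= 0);

    FILE *f = fopen(path, "r+b");
    if (f == NULL)
        return -1;

    /* Find the current size so we know how many bytes must move. */
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    long size = ftell(f);
    if (size < 0 || pos >= size) {
        fclose(f);
        return -1;
    }

    /* Copy each byte after `pos` one place nearer the beginning.
       In update mode a seek is required between reading and writing. */
    int ok = 1;
    for (long i = pos + 1; ok && i < size; i++) {
        int c = 0;
        ok = fseek(f, i, SEEK_SET) == 0
          && (c = getc(f)) != EOF
          && fseek(f, i - 1, SEEK_SET) == 0
          && putc(c, f) != EOF;
    }

    /* Flush stdio's buffer, then cut off the now-duplicated last byte. */
    ok = ok && fflush(f) == 0
            && ftruncate(fileno(f), (off_t)(size - 1)) == 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

    For example, delete_byte_in_place("out.txt", 0) would drop the first
    byte of out.txt. The byte-at-a-time seeks keep the sketch short; a
    practical version would move the data in larger blocks.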

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Dec 6, 2010
    #2

  3. Ben Pfaff

    Edward Rutherford <> writes:

    > When I call fclose() on a file I've been writing to, does fclose always
    > write the EOF character before closing the file? Or do I need to write
    > the EOF myself?


    EOF is just a special non-character value returned by certain
    standard library functions to indicate error or end-of-file. It
    is not a character, so you can't write it to a file. fclose()
    will take care of whatever needs to happen here.

    > Another question: If I've got a file opened in update (r+) mode, how can
    > I delete a character in the file? I want to do this without needing to
    > create a second file and copy all but the characters I want to delete.
    > Is there some kind of control sequence that will do this?


    There's no standard way to do that, and most operating systems
    don't provide any way to do it, other than the technique you
    describe.

    This is a FAQ:

    19.14: How can I insert or delete a line (or record) in the middle of a
    file?

    A: Short of rewriting the file, you probably can't. The usual
    solution is simply to rewrite the file. (Instead of deleting
    records, you might consider simply marking them as deleted, to
    avoid rewriting.) Another possibility, of course, is to use a
    database instead of a flat file. See also questions 12.30 and
    19.13.
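
    The "mark them as deleted" idea from the FAQ answer can be sketched
    as follows. This is only an illustration under stated assumptions:
    the file is assumed to consist of fixed-size records, and
    RECORD_SIZE, DELETED_MARK and mark_record_deleted are hypothetical
    names invented for this example, not part of the FAQ.

```c
#include <assert.h>
#include <stdio.h>

#define RECORD_SIZE  16    /* assumption: fixed record length in bytes */
#define DELETED_MARK '*'   /* assumption: marker stored in a record's byte 0 */

/* Hypothetical helper: marks record number `index` (0-based) as deleted
   by overwriting its first byte in place. The file size never changes.
   Returns 0 on success, -1 on failure or if the record does not exist. */
int mark_record_deleted(const char *path, long index)
{
    assert(path != NULL && index >= 0);

    FILE *f = fopen(path, "r+b");
    if (f == NULL)
        return -1;

    long offset = index * RECORD_SIZE;
    int ok = fseek(f, offset, SEEK_SET) == 0
          && getc(f) != EOF                   /* the record must exist */
          && fseek(f, offset, SEEK_SET) == 0  /* reposition before writing */
          && fputc(DELETED_MARK, f) != EOF;

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

    Code that reads the file would then skip any record whose first byte
    is DELETED_MARK, and the file could be compacted later in one pass.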
    --
    Ben Pfaff
    http://benpfaff.org
     
    Ben Pfaff, Dec 6, 2010
    #3
  4. Edward Rutherford <> wrote:
    > When I call fclose() on a file I've been writing to, does fclose always
    > write the EOF character before closing the file? Or do I need to write
    > the EOF myself?


    An "EOF character" does not exist. You might expect such a
    character to exist because, for example, getc() returns the
    value named EOF when the end of the file is reached. But that
    doesn't mean there was such a character in the file; EOF is
    only a special value returned when the end of the file has
    been reached. Incidentally, because getc() returns the
    non-character value EOF under this condition, its return
    type is int and not char: EOF isn't a char, so it could not
    otherwise be returned.
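
    The point about getc() returning int can be shown with a small
    sketch. count_bytes is a hypothetical helper name chosen for this
    example. The loop variable is an int so that it can hold every
    possible byte value as well as EOF; a byte 0xff in the file comes
    back as 255 and does not end the loop.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes in the file at `path`.
   Returns -1 if the file cannot be opened or a read error occurs. */
long count_bytes(const char *path)
{
    assert(path != NULL);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    long n = 0;
    int c;                          /* int, not char: must also hold EOF */
    while ((c = getc(in)) != EOF)
        n++;

    int failed = ferror(in);        /* EOF can also signal a read error */
    fclose(in);
    return failed ? -1 : n;
}
```

    If c were declared as char, a 0xff byte could compare equal to EOF
    (where char is signed) or EOF could never be detected (where char is
    unsigned), depending on the implementation.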

    Some of the confusion may come from the fact that, if I remember
    correctly, in DOS there could be a special character ('^Z') that
    in text files was treated by DOS as an end of file marker. But
    that character doesn't have anything to do with EOF. If you
    want that kind of character in the file then you have to write
    it in there yourself; it isn't written into the file by itself.

    > Another question: If I've got a file opened in update (r+) mode, how can
    > I delete a character in the file? I want to do this without needing to
    > create a second file and copy all but the characters I want to delete.
    > Is there some kind of control sequence that will do this?


    There aren't any control sequences to do that. The only way
    to do part of the job is to copy everything that comes after
    the character you want removed one place nearer to the
    beginning of the file. But then you still need to truncate
    the file by one character and, as far as I know, that's only
    possible using system-specific functions (e.g. truncate() or
    ftruncate() under UNIX; Windows probably has something
    similar). So creating a copy without the character you want to
    get rid of, then deleting the old file with remove() and
    finally renaming the new file with rename() to give it the
    original name, is the only way I am aware of if you restrict
    yourself to using no system-dependent functions and not
    relying on implementation-defined behaviour.
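
    The copy, remove() and rename() sequence described above might look
    roughly like this. It is a minimal sketch rather than a definitive
    implementation; delete_byte_by_copy and its tmp_path parameter are
    hypothetical names chosen for this example, and the caller is
    assumed to supply a temporary file name that is safe to overwrite.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: removes the byte at offset `pos` from `path` by
   copying every other byte to `tmp_path`, then replacing the original.
   If `pos` is past the end, the file is copied unchanged.
   Returns 0 on success, -1 on failure. */
int delete_byte_by_copy(const char *path, const char *tmp_path, long pos)
{
    assert(path != NULL && tmp_path != NULL && pos >= 0);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    long i = 0;
    int c;
    int ok = 1;
    while ((c = getc(in)) != EOF) {
        /* Skip only the byte at `pos`; copy everything else. */
        if (i++ != pos && putc(c, out) == EOF) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    if (!ok) {
        remove(tmp_path);
        return -1;
    }

    /* ISO C leaves rename() onto an existing file implementation-defined,
       so the original is removed first. */
    if (remove(path) != 0 || rename(tmp_path, path) != 0)
        return -1;
    return 0;
}
```

    One design consequence: between remove() and rename() the original
    name briefly does not exist, so a crash at that moment leaves only
    the temporary file behind.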

    Regards, Jens
    --
    \ Jens Thoms Toerring ___
    \__________________________ http://toerring.de
     
    Jens Thoms Toerring, Dec 6, 2010
    #4
  5. Ben Pfaff <> writes:
    > Edward Rutherford <> writes:
    >
    >> When I call fclose() on a file I've been writing to, does fclose always
    >> write the EOF character before closing the file? Or do I need to write
    >> the EOF myself?

    >
    > EOF is just a special non-character value returned by certain
    > standard library functions to indicate error or end-of-file. It
    > is not a character, so you can't write it to a file. fclose()
    > will take care of whatever needs to happen here.


    You can't meaningfully write EOF to a file, but if you try the compiler
    is likely to let you get away with it (and do something you probably
    didn't expect).

    For example, this program compiles and runs with no apparent problems
    on my system:

    #include <stdio.h>
    #include <stdlib.h>
    int main(void)
    {
        FILE *out = fopen("out.txt", "w");
        /* error checking not shown for now */

        fputs("Hello, world\n", out);
        fputc(EOF, out);
        fclose(out);

        return 0;
    }

    But the resulting out.txt file contains an extra 0xff character at
    the end (at least on my system). Why? Because EOF happens to be
    (-1) on my system (and most others, but the standard only guarantees
    that it's negative), and fputc() converts the value -1 to the byte
    0xff (255).

    This particular byte value has nothing to do with marking the end of
    a file on my system (or, probably, on yours).
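
    The conversion can be checked directly with a small sketch.
    byte_stored_by_fputc is a hypothetical helper name chosen for this
    example. It writes a single value with fputc(), reads the file
    back, and reports which byte actually ended up on disk. The only
    assumption is that the given path may be overwritten.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `value` with fputc() to a fresh file at
   `path`, reads it back, and returns the byte that was stored
   (0..UCHAR_MAX), or -1 on any I/O failure. */
int byte_stored_by_fputc(const char *path, int value)
{
    assert(path != NULL);

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    /* fputc() converts its int argument to unsigned char before writing,
       so even fputc(EOF, f) writes an ordinary byte and does not fail. */
    if (fputc(value, f) == EOF) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    int stored = getc(f);   /* the one byte we wrote */
    int next = getc(f);     /* nothing follows it, so this is EOF */
    fclose(f);

    return (stored != EOF && next == EOF) ? stored : -1;
}
```

    On a system with 8-bit bytes where EOF is -1,
    byte_stored_by_fputc("out.bin", EOF) returns 255, matching the
    extra 0xff byte described above.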

    [...]

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Dec 6, 2010
    #5
  6. (Jens Thoms Toerring) writes:
    [...]
    > Some of the confusion may come from the fact that, if I remember
    > correctly, in DOS there could be a special character ('^Z') that
    > in text files was treated by DOS as an end of file marker. But
    > that character doesn't have anything to do with EOF.


    Right.

    > If you
    > want that kind of character in the file then you have to write
    > it in there yourself; it isn't written into the file by itself.


    I'm not sure that's correct. My understanding is that, if a special
    character is required to mark the end of a file, the system is
    responsible for writing that character to the file; for example,
    it might be done implicitly by fclose(). Which means that, when
    reading the same file, the ^Z character (control-Z, ASCII code 26)
    will trigger an end-of-file condition, which will cause fgetc()
    to return EOF. (In other words, you'll never see the ^Z character,
    either on input or on output, assuming you read and write the file
    as text.)

    On the other hand, I'm not sure whether DOS actually *requires*
    the ^Z character. A quick experiment on Windows indicates that a
    ^Z in a text file acts like the end of the file, but if there's no
    ^Z you can just read to the physical end of the file.

    Bottom line: C defines a model for text files in which there is no
    EOF character; rather EOF is an out-of-band signalling mechanism.
    The implementation does whatever is necessary to map the underlying
    file semantics onto the C model.

    [...]

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Dec 6, 2010
    #6
  7. Ben Pfaff

    Keith Thompson <> writes:

    > For example, this program compiles and runs with no apparent problems
    > on my system:
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    > int main(void)
    > {
    >     FILE *out = fopen("out.txt", "w");
    >     /* error checking not shown for now */
    >
    >     fputs("Hello, world\n", out);
    >     fputc(EOF, out);
    >     fclose(out);
    >
    >     return 0;
    > }
    >
    > But the resulting out.txt file contains an extra 0xff character at
    > the end (at least on my system). Why? Because EOF happens to be
    > (-1) on my system (and most others, but the standard only guarantees
    > that it's negative), and fputc() converts the value -1 to the byte
    > 0xff (255).
    >
    > This particular byte value has nothing to do with marking the end of
    > a file on my system (or, probably, on yours).


    It's kind of interesting that some systems *do* use a particular
    byte value to mark the end of a text file (DOS uses byte value
    26, for example), but putc(EOF, stream) won't write that value.
    I guess they could choose a value that would work out correctly
    modulo 256, e.g. -230 (because -230 + 256 = 26). That would be a
    great way to confuse people like the OP.
    --
    Ben Pfaff
    http://benpfaff.org
     
    Ben Pfaff, Dec 6, 2010
    #7
  8. Seebs

    On 2010-12-06, Edward Rutherford <> wrote:
    > When I call fclose() on a file I've been writing to, does fclose always
    > write the EOF character before closing the file? Or do I need to write
    > the EOF myself?


    What is this "EOF character"?

    > Another question: If I've got a file opened in update (r+) mode, how can
    > I delete a character in the file? I want to do this without needing to
    > create a second file and copy all but the characters I want to delete.
    > Is there some kind of control sequence that will do this?


    No. "Control sequences" are an unrelated concept.

    I would give more detailed answers, but the flood of trolling from
    aioe.org recently has left me uninclined to put in the effort.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    I am not speaking for my employer, although they do rent some of my opinions.
     
    Seebs, Dec 7, 2010
    #8
  9. Gene

    On Dec 6, 5:59 pm, Ben Pfaff <> wrote:
    > Keith Thompson <> writes:
    > > For example, this program compiles and runs with no apparent problems
    > > on my system:

    >
    > > #include <stdio.h>
    > > #include <stdlib.h>
    > > int main(void)
    > > {
    > >     FILE *out = fopen("out.txt", "w");
    > >     /* error checking not shown for now */

    >
    > >     fputs("Hello, world\n", out);
    > >     fputc(EOF, out);
    > >     fclose(out);

    >
    > >     return 0;
    > > }

    >
    > > But the resulting out.txt file contains an extra 0xff character at
    > > the end (at least on my system).  Why?  Because EOF happens to be
    > > (-1) on my system (and most others, but the standard only guarantees
    > > that it's negative), and fputc() converts the value -1 to the byte
    > > 0xff (255).

    >
    > > This particular byte value has nothing to do with marking the end of
    > > a file on my system (or, probably, on yours).

    >
    > It's kind of interesting that some systems *do* use a particular
    > byte value to mark the end of a text file (DOS uses byte value
    > 26, for example), but putc(EOF, stream) won't write that value.
    > I guess they could choose a value that would work out correctly
    > modulo 256, e.g. -230 (because -230 + 256 = 26).  That would be a
    > great way to confuse people like the OP.


    I'm really pushing old memories here, but the Ctl-Z business in
    Windows is a legacy of early MSDOS and ultimately CPM, from which many
    of the ideas of MSDOS were shamelessly copied (well, MSDOS _did_ use
    back- rather than forward slashes...). In those ancient floppy disk-
    oriented systems, directory information was maintained in units of 128
    byte sectors, so an end of file character was needed to determine
    where within the final sector the end of the file actually lay. To
    maintain compatibility with such files, much later systems (like
    modern C stdio routines for Windows text mode files) still honor the
    Ctl-Z. Thus spake Bill Gates...
     
    Gene, Dec 7, 2010
    #9
  10. Guest

    On Dec 6, 6:57 pm, Gene <> wrote:
    > On Dec 6, 5:59 pm, Ben Pfaff <> wrote:
    > > Keith Thompson <> writes:
    > > > For example, this program compiles and runs with no apparent problems
    > > > on my system:

    >
    > > > #include <stdio.h>
    > > > #include <stdlib.h>
    > > > int main(void)
    > > > {
    > > >     FILE *out = fopen("out.txt", "w");
    > > >     /* error checking not shown for now */

    >
    > > >     fputs("Hello, world\n", out);
    > > >     fputc(EOF, out);
    > > >     fclose(out);

    >
    > > >     return 0;
    > > > }

    >
    > > > But the resulting out.txt file contains an extra 0xff character at
    > > > the end (at least on my system).  Why?  Because EOF happens to be
    > > > (-1) on my system (and most others, but the standard only guarantees
    > > > that it's negative), and fputc() converts the value -1 to the byte
    > > > 0xff (255).

    >
    > > > This particular byte value has nothing to do with marking the end of
    > > > a file on my system (or, probably, on yours).

    >
    > > It's kind of interesting that some systems *do* use a particular
    > > byte value to mark the end of a text file (DOS uses byte value
    > > 26, for example), but putc(EOF, stream) won't write that value.
    > > I guess they could choose a value that would work out correctly
    > > modulo 256, e.g. -230 (because -230 + 256 = 26).  That would be a
    > > great way to confuse people like the OP.

    >
    > I'm really pushing old memories here, but the Ctl-Z business in
    > Windows is a legacy of early MSDOS and ultimately CPM, from which many
    > of the ideas of MSDOS were shamelessly copied (well, MSDOS _did_ use
    > back- rather than forward slashes...).  In those ancient floppy disk-
    > oriented systems, directory information was maintained in units of 128
    > byte sectors, so an end of file character was needed to determine
    > where within the final sector the end of the file actually lay.  To
    > maintain compatibility with such files, much later systems (like
    > modern C stdio routines for Windows text mode files) still honor the
    > Ctl-Z.  Thus spake Bill Gates...



    Actually that was just CP/M. MS-DOS has tracked file lengths to the
    byte from day one, but retained the common use of control-Z as EOF for
    compatibility. For the most part, though, it was code at the
    application level (including, of course, things like the OS command
    line utilities) that had to interpret the control-Z; the MS-DOS file
    I/O functions were mostly oblivious to it. Control-Z handling in
    MS-DOS applications was somewhat inconsistent from day one, although
    almost all applications that read text files would successfully read
    one with a "missing" control-Z. Control-Zs in the middle of files
    were often a problem: many text editors allowed you to enter a
    control-Z in the middle of a file, and then your file would get
    truncated the next time it was loaded. These days most applications
    will accept, and silently ignore, a trailing control-Z on an input
    text file, or just treat it as a special character.
     
    , Dec 7, 2010
    #10
  11. On Dec 7, 2:57 am, Gene <> wrote:
    >
    > I'm really pushing old memories here, but the Ctl-Z business in
    > Windows is a legacy of early MSDOS and ultimately CPM, from which many
    > of the ideas of MSDOS were shamelessly copied (well, MSDOS _did_ use
    > back- rather than forward slashes...).
    >

    Which must have cost tens of millions of dollars in problems with
    software not being compatible between Unix and PCs, purely because of
    the path specification issue.
     
    Malcolm McLean, Dec 7, 2010
    #11
