Portable EOL?

Discussion in 'C Programming' started by bwaichu@yahoo.com, Aug 19, 2006.

  1. Guest

    I have written the program below to just create and populate an html
    file. I am running into a problem when viewing the created file in vi.
    I am told by vi that the file does not have an end of line.

    Here's the code:

    #include <err.h>      /* errx (BSD extension) */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>   /* strlcpy, strlcat (BSD extensions) */
    #include <strings.h>  /* bzero */

    int
    main(int argc, char **argv) {

        FILE *fd;
        int x;

        char buff[8196];
        x = 0;

        bzero(buff, sizeof(buff));

        (void)strlcpy(buff,"<HTML>", sizeof(buff));

        while (x < 1024) {
            (void)strlcat(buff,"A",sizeof(buff));
            x++;
        }
        (void)strlcat(buff, "</HTML>", sizeof(buff));

        fd = fopen("test5.html", "w+");
        if (fd == NULL)
            errx(-1, "failed to open");
        (void)fprintf(fd, "%s", buff);
        (void)fclose(fd);
        exit(EXIT_SUCCESS);
    }

    Now, vi will not warn about noeol if I change this line:

    (void)fprintf(fd, "%s", buff);

    to

    (void)fprintf(fd, "%s\n", buff);

    Where I am confused is that I thought streams automatically terminated
    the file with an CR (0a) in a unix environment. What bothers me is
    that the above would in theory make the code less portable. Am I
    overlooking something?
     
    , Aug 19, 2006
    #1

  2. Ben Pfaff Guest

    "" <> writes:

    > Now, vi will not warn about noeol if I change this line:
    >
    > (void)fprintf(fd, "%s", buff);
    >
    > to
    >
    > (void)fprintf(fd, "%s\n", buff);
    >
    > Where I am confused is that I thought streams automatically terminated
    > the file with an CR (0a) in a unix environment.


    No, that's just wrong. The last line in a text stream needs to
    be explicitly terminated with a new-line character.
    --
    "Programmers have the right to be ignorant of many details of your code
    and still make reasonable changes."
    --Kernighan and Plauger, _Software Tools_
     
    Ben Pfaff, Aug 19, 2006
    #2

  3. On 19 Aug 2006 11:08:32 -0700, "" <>
    wrote:

    >I have written the program below to just create and populate an html
    >file. I am running into a problem when viewing the created file in vi.
    > I am told by vi that the file does not have an end of line.
    >
    >Here's the code:
    >
    >int
    >main(int argc, char **argv) {
    >
    > FILE *fd;
    > int x;
    >
    > char buff[8196];
    > x = 0;
    >
    > bzero(buff, sizeof(buff));
    >
    > (void)strlcpy(buff,"<HTML>", sizeof(buff));


    Why a non-standard function? What does it do that you could not do
    with strcpy or strncpy? Since the source string is only seven
    characters (counting the terminating '\0'), do you really want to
    process 8,196?

    >
    > while (x < 1024) {
    > (void)strlcat(buff,"A",sizeof(buff));
    > x++;
    > }
    > (void)strlcat(buff, "</HTML>", sizeof(buff));


    At this point, buff contains a six-character header, 1024 copies of
    the letter A, a seven-character trailer, a string-terminating '\0', and
    some 7000+ characters of no interest. At no point did you ever place
    a '\n' in this array.
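
    To make the contents described above concrete, here is a minimal
    sketch that builds the same buffer using only standard C functions.
    The name build_page and the constant FILL_COUNT are made up for
    illustration; neither appears in the original program.

```c
#include <stddef.h>
#include <string.h>

#define FILL_COUNT 1024  /* number of 'A' characters, as in the original loop */

/*
 * Hypothetical helper: fills buf with "<HTML>", FILL_COUNT copies of 'A',
 * "</HTML>", and a terminating '\0'. Returns the string length, or 0 if
 * buf is too small. No '\n' is ever stored.
 */
size_t build_page(char *buf, size_t size)
{
    static const char head[] = "<HTML>";
    static const char tail[] = "</HTML>";
    size_t head_len = sizeof head - 1;   /* 6 */
    size_t tail_len = sizeof tail - 1;   /* 7 */
    size_t total = head_len + FILL_COUNT + tail_len;

    if (size < total + 1)
        return 0;

    memcpy(buf, head, head_len);
    memset(buf + head_len, 'A', FILL_COUNT);
    /* tail_len + 1 copies the terminating '\0' as well */
    memcpy(buf + head_len + FILL_COUNT, tail, tail_len + 1);
    return total;
}
```

    Called with the 8196-byte array, this should report a length of
    6 + 1024 + 7 = 1037, and strchr(buf, '\n') should find nothing.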

    >
    > fd = fopen("test5.html", "w+");
    > if (fd == NULL)
    > errx(-1, "failed to open");
    > (void)fprintf(fd, "%s", buff);


    Your file now contains the characters described above, up to but not
    including the '\0'.

    > (void)fclose(fd);
    > exit(EXIT_SUCCESS);
    >}
    >
    >Now, vi will not warn about noeol if I change this line:
    >
    >(void)fprintf(fd, "%s", buff);
    >
    >to
    >
    >(void)fprintf(fd, "%s\n", buff);


    Your format string now includes an additional character which will get
    written to the file immediately after your data.

    >
    >Where I am confused is that I thought streams automatically terminated
    >the file with an CR (0a) in a unix environment. What bothers me is


    If that were true, then multiple calls to fprintf could not be used to
    build up a line from separate pieces.

    >that the above would in theory make the code less portable. Am I
    >overlooking something?


    Less portable than what? The \n is C's portable way of indicating
    that the output should contain an end-of-line indicator at this point.
    The run-time library will perform the magic necessary for your system.
    This may be 0a for Unix or 0d0a (CR LF) for Windows. It
    will be something completely different for my IBM mainframe depending
    on whether the RECFM is U, V, or F. The point is it is portable. The
    code doesn't have to care and the compiler doesn't care much, if at
    all.
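
    One way to see what the library actually stores is to write a line
    through a text stream and read it back through a binary stream. This
    is only a sketch; the function name text_line_bytes and the file name
    used to call it are assumptions, not anything from the thread.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes "hi\n" to path through a text stream, then
 * reads the file back through a binary stream so the bytes actually stored
 * are visible. Returns the number of bytes read into out (at most cap),
 * or 0 on failure.
 */
size_t text_line_bytes(const char *path, unsigned char *out, size_t cap)
{
    FILE *fp = fopen(path, "w");          /* text mode */
    size_t n;

    if (fp == NULL)
        return 0;
    if (fputs("hi\n", fp) == EOF) {
        fclose(fp);
        return 0;
    }
    if (fclose(fp) != 0)
        return 0;

    fp = fopen(path, "rb");               /* binary mode: no translation */
    if (fp == NULL)
        return 0;
    n = fread(out, 1, cap, fp);
    fclose(fp);
    return n;
}
```

    Printing each returned byte with "%02x" would be expected to show
    three bytes ending in 0a on a typical Unix system and four bytes
    ending in 0d 0a on Windows, while the C source stays the same.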


    Remove del for email
     
    Barry Schwarz, Aug 19, 2006
    #3
  4. Guest

    Ben Pfaff wrote:

    > No, that's just wrong. The last line in a text stream needs to
    > be explicitly terminated with a new-line character.


    Thanks. That is where I am wrong.

    Are there any good reasons to use streams versus unix I/O besides
    portability? I realize that if I use unix I/O I have to do my own
    buffering, but that's not a big deal.

    Oh, the 8196 buffer was left over from re-writing this from unix I/O to
    streams. And strlcat is vastly superior to strcat and strncat.
     
    , Aug 19, 2006
    #4
  5. Malcolm Guest

    <> wrote in message
    >I have written the program below to just create and populate an html
    > file. I am running into a problem when viewing the created file in vi.
    > I am told by vi that the file does not have an end of line.
    >
    > Here's the code:
    >
    > int
    > main(int argc, char **argv) {
    >
    > FILE *fd;
    > int x;
    >
    > char buff[8196];
    > x = 0;
    >
    > bzero(buff, sizeof(buff));
    >
    > (void)strlcpy(buff,"<HTML>", sizeof(buff));
    >
    > while (x < 1024) {
    > (void)strlcat(buff,"A",sizeof(buff));
    > x++;
    > }
    > (void)strlcat(buff, "</HTML>", sizeof(buff));
    >
    > fd = fopen("test5.html", "w+");
    > if (fd == NULL)
    > errx(-1, "failed to open");
    > (void)fprintf(fd, "%s", buff);
    > (void)fclose(fd);
    > exit(EXIT_SUCCESS);
    > }
    >
    > Now, vi will not warn about noeol if I change this line:
    >
    > (void)fprintf(fd, "%s", buff);
    >
    > to
    >
    > (void)fprintf(fd, "%s\n", buff);
    >
    > Where I am confused is that I thought streams automatically terminated
    > the file with an CR (0a) in a unix environment. What bothers me is
    > that the above would in theory make the code less portable. Am I
    > overlooking something?
    >

    Some OSes might be a bit hazy about text files that don't end with a
    newline character.
    The C standard doesn't specify that fclose() will automatically append a
    newline if missing, so the only sensible thing to do is to add it yourself.
    All file systems will handle

    fp = fopen("temp.txt", "w");
    fprintf(fp, "Hello world\n");
    fclose(fp);

    in a sensible way. If you don't add the newline you may or may not cause
    problems.
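
    A slightly more defensive version of the snippet above might look
    like the following sketch. The function name write_text_file is an
    assumption for illustration; it adds the final new-line only when the
    caller's text lacks one, and it checks every I/O call.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes text to path and makes sure a non-empty
 * text ends with '\n'. Returns 0 on success, -1 on any I/O error.
 */
int write_text_file(const char *path, const char *text)
{
    size_t len = strlen(text);
    FILE *fp = fopen(path, "w");
    int failed;

    if (fp == NULL)
        return -1;

    failed = (fputs(text, fp) == EOF);
    /* Add the final new-line only when the caller did not supply one. */
    if (!failed && len > 0 && text[len - 1] != '\n')
        failed = (fputc('\n', fp) == EOF);

    /* fclose flushes buffered data, so its result matters too. */
    if (fclose(fp) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
```

    With this, write_text_file("temp.txt", "Hello world") and
    write_text_file("temp.txt", "Hello world\n") should produce the same
    file.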
    --
    www.personal.leeds.ac.uk/~bgy1mm
    freeware games to download.
     
    Malcolm, Aug 19, 2006
    #5
  6. Ben Pfaff <> writes:
    > "" <> writes:
    >> Now, vi will not warn about noeol if I change this line:
    >>
    >> (void)fprintf(fd, "%s", buff);
    >>
    >> to
    >>
    >> (void)fprintf(fd, "%s\n", buff);
    >>
    >> Where I am confused is that I thought streams automatically terminated
    >> the file with an CR (0a) in a unix environment.

    >
    > No, that's just wrong. The last line in a text stream needs to
    > be explicitly terminated with a new-line character.


    More precisely, "Whether the last line requires a terminating new-line
    character is implementation-defined." (C99 7.19.2p2)

    <OT>
    In Unix, there's nothing *inherently* wrong with having a text file
    without a terminating new-line, but it's rarely what you want. vi
    will most likely complain about it, emacs can be configured to behave
    in any of several ways, and other utilities might or might not
    misbehave in various ways.
    </OT>
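
    For anyone who wants to check a file rather than rely on an editor's
    warning, here is a small sketch in standard C. The function name
    ends_with_newline is made up for this example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if the last character read from the text
 * file at path is '\n', 0 if it is not (or the file is empty), and -1 if
 * the file cannot be opened or a read error occurs.
 */
int ends_with_newline(const char *path)
{
    FILE *fp = fopen(path, "r");
    int c, last = EOF;
    int result;

    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        last = c;
    result = ferror(fp) ? -1 : (last == '\n');
    fclose(fp);
    return result;
}
```

    Reading the whole file is wasteful for large inputs, but it avoids
    fseek on a text stream, where seeking relative to the end is not
    guaranteed to be meaningful.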

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Aug 19, 2006
    #6
  7. "" <> writes:
    > Ben Pfaff wrote:
    >
    >> No, that's just wrong. The last line in a text stream needs to
    >> be explicitly terminated with a new-line character.

    >
    > Thanks. That is where I am wrong.
    >
    > Are there any good reasons to use streams versus unix I/O besides
    > portability? I realize that if I use unix I/O I have to do my own
    > buffering, but that's not a big deal.


    The question is, are there any good reasons to use Unix I/O rather
    than standard C streams?

    One good reason *not* to use Unix-specific I/O is that it's less
    portable; there are some extra features, but I don't think you're
    using any of them. Another good reason is that we don't discuss
    system-specific features here.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Aug 19, 2006
    #7
  8. Jack Klein Guest

    On Sat, 19 Aug 2006 21:19:19 GMT, Keith Thompson <> wrote
    in comp.lang.c:

    > Ben Pfaff <> writes:
    > > "" <> writes:
    > >> Now, vi will not warn about noeol if I change this line:
    > >>
    > >> (void)fprintf(fd, "%s", buff);
    > >>
    > >> to
    > >>
    > >> (void)fprintf(fd, "%s\n", buff);
    > >>
    > >> Where I am confused is that I thought streams automatically terminated
    > >> the file with an CR (0a) in a unix environment.

    > >
    > > No, that's just wrong. The last line in a text stream needs to
    > > be explicitly terminated with a new-line character.

    >
    > More precisely, "Whether the last line requires a terminating new-line
    > character is implementation-defined." (C99 7.19.2p2)


    Actually, in this particular case, this is not really relevant. It
    has nothing to do with an "ordinary" text file, and everything to do
    with the format of the particular file type he is writing.

    The file he produces might or might not be a valid text file on his
    platform. Even if it is, it is not a valid HTML file because it
    violates the HTML standard. That standard, like the C standard for C
    source files, requires that the last line of a file have a terminating
    newline.

    > <OT>
    > In Unix, there's nothing *inherently* wrong with having a text file
    > without a terminating new-line, but it's rarely what you want. vi
    > will most likely complain about it, emacs can be configured to behave
    > in any of several ways, and other utilities might or might not
    > misbehave in various ways.
    > </OT>


    Right, it is merely some HTML validation tool pointing out a violation
    of the HTML standard.

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://c-faq.com/
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Aug 20, 2006
    #8
  9. Guest

    Jack Klein wrote:

    > Actually, in this particular case, this is not really relevant. It
    > has nothing to do with an "ordinary" text file, and everything to do
    > with the format of the particular file type he is writing.
    >
    > The file he produces might or might not be a valid text file on his
    > platform. Even if it is, it is not a valid HTML file because it
    > violates the HTML standard. That standard, like the C standard for C
    > source files, requires that the last line of a file have a terminating
    > newline.


    My thought that streams automatically terminate with CR's followed by a
    NULL value was incorrect. The file definitely does not follow the
    requirements for the HTTP Protocol.

    Now, since the string is a C string, I would follow the terminating
    new line with \0, right?

    So following the C standard, it would be:

    <text><\n><\0>

    where the C standard just requires the string to be terminated with a
    NULL value, right?
     
    , Aug 20, 2006
    #9
  10. "" <> writes:
    > Jack Klein wrote:
    >
    >> Actually, in this particular case, this is not really relevant. It
    >> has nothing to do with an "ordinary" text file, and everything to do
    >> with the format of the particular file type he is writing.
    >>
    >> The file he produces might or might not be a valid text file on his
    >> platform. Even if it is, it is not a valid HTML file because it
    >> violates the HTML standard. That standard, like the C standard for C
    >> source files, requires that the last line of a file have a terminating
    >> newline.

    >
    > My thought that streams automatically terminate with CR's followed by a
    > NULL value was incorrect. The file definitely does not follow the
    > requirements for the HTTP Protocol.
    >
    > Now, since the string is a C string, I would follow the terminating
    > new line with \0, right?
    >
    > So following the C standard, it would be:
    >
    > <text><\n><\0>
    >
    > where the C standard just requires the string to be terminated with a
    > NULL value, right?


    Um, no.

    First of all, you're misusing the word NULL. NULL is a macro, defined
    in <stddef.h>, that expands to a null pointer constant. It is *not*
    the same thing as a null character, '\0' (sometimes called NUL).

    A C string, stored in memory, is terminated by a trailing '\0'
    character. A text file consists of a sequence of lines, where each
    line is terminated by an end-of-line marker that appears as '\n' when
    you read or write it in a C program. Text files normally do not
    contain null characters.

    For example:

    char s[] = "hello, world";
    /*
    * The compiler implicitly adds a '\0' to the end of s, making it
    * a valid string.
    */
    fprintf(some_file, "%s\n", s);
    /*
    * This writes the characters of s, *not* including the trailing
    * '\0', to some_file. The added '\n' makes it a line.
    */
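
    The same point can be checked by counting what ends up in the file.
    This is a sketch only; the function name count_line_chars is an
    assumption, and it uses tmpfile() so that nothing is left on disk.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes s followed by '\n' to a temporary file, then
 * reads the file back and counts its characters. *nul_count receives the
 * number of '\0' characters found. Returns the total count, or -1 on error.
 */
long count_line_chars(const char *s, long *nul_count)
{
    FILE *fp = tmpfile();   /* binary update stream, removed on close */
    long total = 0;
    int c;

    if (fp == NULL)
        return -1;
    *nul_count = 0;
    if (fprintf(fp, "%s\n", s) < 0) {
        fclose(fp);
        return -1;
    }
    rewind(fp);             /* needed before switching from writing to reading */
    while ((c = getc(fp)) != EOF) {
        total++;
        if (c == '\0')
            (*nul_count)++;
    }
    fclose(fp);
    return total;
}
```

    For "hello, world" the count should be strlen(s) + 1, that is 13:
    twelve characters plus the '\n', with no '\0' among them.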

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Aug 20, 2006
    #10
  11. Guest

    wrote:

    > Where I am confused is that I thought streams automatically terminated
    > the file with an CR (0a) in a unix environment.


    CR short for carriage return is 0c in hexadecimal. NL short
    for newline or LF short for linefeed is 0a. And as others have
    pointed out neither is guaranteed to be the last byte in a Unix
    file. Try typing
    cat > some-file
    type a few characters and press Control-D twice. You'll get a
    file without a newline in the end.

    By the way, all the above is off topic here.
     
    , Aug 20, 2006
    #11
  12. Joe Wright Guest

    wrote:
    > wrote:
    >
    >> Where I am confused is that I thought streams automatically terminated
    >> the file with an CR (0a) in a unix environment.

    >
    > CR short for carriage return is 0c in hexadecimal. NL short
    > for newline or LF short for linefeed is 0a. And as others have
    > pointed out neither is guaranteed to be the last byte in a Unix
    > file. Try typing
    > cat > some-file
    > type a few characters and press Control-D twice. You'll get a
    > file without a newline in the end.
    >
    > By the way, all the above is off topic here.
    >


    Both spibou and bwaichu seem unable to read a simple ASCII code chart.
    The CR character is 0D (13). The Unix newline char NL is (10). It is the
    formfeed character FF which is 0C or (12).
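
    Rather than trusting anyone's memory of the chart, the values can be
    printed from C itself. This sketch assumes an ASCII-based execution
    character set; the function name print_control_codes is made up for
    the example.

```c
#include <stdio.h>

/*
 * Prints the numeric value of a few control characters in decimal,
 * hexadecimal, and octal. The values depend on the execution character
 * set; the thread assumes ASCII.
 */
void print_control_codes(void)
{
    static const struct { const char *name; int value; } codes[] = {
        { "NL/LF '\\n'", '\n' },
        { "FF    '\\f'", '\f' },
        { "CR    '\\r'", '\r' },
    };
    size_t i;

    for (i = 0; i < sizeof codes / sizeof codes[0]; i++)
        printf("%s  dec %2d  hex %02X  oct %03o\n",
               codes[i].name, codes[i].value,
               (unsigned)codes[i].value, (unsigned)codes[i].value);
}
```

    Each row shows one value written three ways, which is all a decimal,
    hexadecimal, or octal constant in C source ever is.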

    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Aug 20, 2006
    #12
  13. Guest

    Joe Wright wrote:

    >
    > Both spibou and bwaichu seem unable to read a simple ASCII code chart.
    > The CR character is 0D (13). The Unix newline char NL is (10). It is the
    > formfeed character FF which is 0C or (12).


    What's the deal with the insults? If we had all the answers, we wouldn't
    be asking questions.
    And NL is 0a.

    Here's the ASCII table:

    http://www.lookuptables.com/

    I made a typo. Just like I wrote NULL instead of NUL. Typos happen.
    That is why editors have jobs.

    Jeez.
     
    , Aug 20, 2006
    #13
  14. Joe Wright Guest

    wrote:
    > Joe Wright wrote:
    >
    >> Both spibou and bwaichu seem unable to read a simple ASCII code chart.
    >> The CR character is 0D (13). The Unix newline char NL is (10). It is the
    >> formfeed character FF which is 0C or (12).

    >
    > What's the deal with the insults? If we had all the answers, we wouldn't
    > be asking questions.
    > And NL is 0a.
    >
    > Here's the ASCII table:
    >
    > http://www.lookuptables.com/
    >
    > I made a typo. Just like I wrote NULL instead of NUL. Typos happen.
    > That is why editors have jobs.
    >


    I didn't mean it as a personal insult. I was correcting errors. I don't
    like to correct careless errors, typographical or otherwise. I expect
    you to proof-read your article and correct your own errors as best you
    can before posting it. Too much to expect?

    I wrote NL is (10) and you correct it to 0a? What's that about? As to
    correctness, 10 is ten. 0x0a is ten. 012 is ten.

    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Aug 21, 2006
    #14