Questions about K&R (Kernighan and Ritchie)

Discussion in 'C Programming' started by sandeep, Apr 22, 2010.

  1. sandeep

    sandeep Guest

    Hello friends ~

    I am learning C from the K&R book. I have questions about Section 8.5
    ("an implementation of Fopen and Getc"). Although this section is UNIX(r)
    specific I think all my questions are really about standard C... so the
    ISO taliban can relax... :D

    1> Look at this Macro
    #define feof(p) (((p)->flag & _EOF) != 0)

    My question is: feof is only specified to return 0 or not 0. There is no
    requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
    to force it to be 0 or 1? This seems very inefficient, after all feof is
    likely to be called many times.

    2> Here is another macro
    #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
    Shouldn't that _fillbuf(p) be _fillbuf((p)), one bracket for the
    function call and one bracket to stop expansion of side effects in p?

    3> In a comment on that getc Macro, K&R say: "The characters are returned
    unsigned, which ensures that all characters will be positive". I don't
    really understand the point of this, I usually use char not unsigned char
    for characters. And in K&R, all strings are of type char* not unsigned
    char*.

    Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    Regards ~
     
    sandeep, Apr 22, 2010
    #1

  2. sandeep <> writes:
    > I am learning C from the K&R book. I have questions about Section 8.5
    > ("an implementation of Fopen and Getc"). Although this section is UNIX(r)
    > specific I think all my questions are really about standard C... so the
    > ISO taliban can relax... :D


    I see the smiley, but referring to those of us who prefer to
    discuss ISO C as "taliban" is a bit insulting, don't you think?
    (And yes, I know the word literally means "students", but I doubt
    that that's what you meant.)

    > 1> Look at this Macro
    > #define feof(p) ((p)->flag & _EOF) != 0)
    >
    > My question is: feof is only specified to return 0 or not 0. There is no
    > requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
    > to force it to be 0 or 1? This seems very inefficient, after all feof is
    > likely to be called many times.


    Yes, the "!= 0" could be omitted, but it's not likely to be a big deal.
    Since it's a macro, a compiler is likely to omit the extra calculation
    anyway.

    And no, feof() isn't likely to be called many times in well written
    code. The way to determine whether you've reached the end of an input
    stream is by checking the result of the reading function (for example,
    getc() returns the value EOF). *After* that's happened, you can call
    feof() to determine whether you reached end-of-file or encountered an
    error.
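
    A minimal sketch of that pattern, assuming a hypothetical helper named
    count_until_eof (it is not from K&R or the standard library): the loop
    tests the value returned by getc(), and the stream indicators are
    consulted only after EOF has been returned.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining in a stream.
 * Returns the count, or -1 if reading stopped because of an error. */
long count_until_eof(FILE *fp)
{
    long n = 0;
    int ch;                      /* int, so EOF stays distinguishable */

    while ((ch = getc(fp)) != EOF)
        n++;

    /* Only now, after getc() has returned EOF, ask why it stopped. */
    if (ferror(fp))
        return -1;               /* read error */
    return n;                    /* otherwise end-of-file was reached */
}
```

    Note that neither feof() nor ferror() appears inside the loop; each is
    needed at most once per stream.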

    > 2> Here is another macro
    > #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
    > Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
    > function call and one bracket to stop expansion of sideeffects in p?


    No, extra parentheses aren't needed. As long as the name of the macro
    parameter is immediately surrounded by parentheses (or brackets),
    there's no problem with operator precedence.

    And it's not about "expansion of side effects", it's about operator
    precedence, i.e., which operators are associated with which operands.
    Any side effects will occur anyway.
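
    To illustrate the precedence point, here is a small sketch. The macro
    and function names (DOUBLE_BAD, DOUBLE_OK, CALL_ID, identity) are
    made up for this example.

```c
/* Hypothetical macros, for illustration only. */
#define DOUBLE_BAD(x)  x + x         /* nothing parenthesized */
#define DOUBLE_OK(x)   ((x) + (x))   /* parameter and whole body parenthesized */

/* The parentheses of a function call already delimit the argument,
 * so identity(p) needs no extra parentheses around p. */
int identity(int v) { return v; }
#define CALL_ID(p)     identity(p)
```

    With these definitions, 3 * DOUBLE_BAD(2) expands to 3 * 2 + 2, which
    is 8, while 3 * DOUBLE_OK(2) is 12. CALL_ID(1 + 2) passes 3 to the
    function, exactly as _fillbuf(p) would receive the whole argument.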

    > 3> In a comment on that getc Macro, K&R say: "The characters are returned
    > unsigned, which ensures that all characters will be positive". I don't
    > really understand the point of this, I usually use char not unsigned char
    > for characters. And in K&R, all strings are of type char* not unsigned
    > char*.
    >
    > Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    > UCHARMAX will clash with EOF == -1 when it gets promoted to int.


    getc() returns a result of type int, not char. For example, if
    UCHAR_MAX is 255, then getc() will return the value 255 if you read a
    '\xff' character, and the value -1 (assuming EOF==-1) if you encounter
    the end of the stream or an error. They clash only if you store the
    result in something smaller than an int. So don't do that.
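
    A sketch of that clash, using two hypothetical helper functions. It
    assumes the common case where EOF is -1 and where converting 255 to
    signed char yields -1 (that conversion is implementation-defined).

```c
#include <stdio.h>

/* Keeps the getc()-style result in an int: 255 and EOF stay distinct. */
int looks_like_eof_as_int(int result)
{
    int stored = result;
    return stored == EOF;
}

/* Narrows the result to signed char first: on typical systems 255
 * becomes -1 and can no longer be told apart from EOF. */
int looks_like_eof_as_schar(int result)
{
    signed char stored = (signed char)result;
    return stored == EOF;
}
```

    Passing 0xFF to the first function gives 0; passing it to the second
    typically gives 1, which is the bug that declaring the variable as
    int avoids.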

    See section 12 of the comp.lang.c FAQ,
    <http://www.c-faq.com/stdio/index.html>, especially the first few
    questions.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 22, 2010
    #2

  3. sandeep

    Seebs Guest

    On 2010-04-22, sandeep <> wrote:
    > 1> Look at this Macro
    > #define feof(p) ((p)->flag & _EOF) != 0)
    >
    > My question is: feof is only specified to return 0 or not 0. There is no
    > requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
    > to force it to be 0 or 1? This seems very inefficient, after all feof is
    > likely to be called many times.


    I've seen code like this written for the same reason that some people
    write
    if (p != NULL)
    instead of
    if (p)

    It's clearer to the user. The compiler may well notice that no one uses
    the specific value and just run past it.
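
    For what the "!= 0" actually changes, a small sketch may help. The
    names here are hypothetical; the flag value 010 is only modelled on
    the _EOF bit in K&R's example.

```c
#define FLAG_EOF_SKETCH 010   /* hypothetical flag bit (octal) */

/* Without "!= 0" the result is the masked bit itself: 0 or 010. */
int eof_bit_raw(int flags)        { return flags & FLAG_EOF_SKETCH; }

/* With "!= 0" the result is normalized to exactly 0 or 1. */
int eof_bit_normalized(int flags) { return (flags & FLAG_EOF_SKETCH) != 0; }
```

    Both forms behave identically inside an if or while condition; they
    differ only when the value itself is stored or compared.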

    > 2> Here is another macro
    > #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
    > Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
    > function call and one bracket to stop expansion of sideeffects in p?


    No, because there's no such thing as "expansion of side effects". Parentheses
    are used *only* to control grouping -- they have no effect on side
    effects. As such, the () around p are sufficient whether or not they're
    also part of the function call.

    > 3> In a comment on that getc Macro, K&R say: "The characters are returned
    > unsigned, which ensures that all characters will be positive". I don't
    > really understand the point of this, I usually use char not unsigned char
    > for characters. And in K&R, all strings are of type char* not unsigned
    > char*.


    char may well be unsigned.

    The point of this is that converting everything to unsigned char means
    that every char value is necessarily non-negative, guaranteeing that no
    value returned which represents a character can compare equal to EOF,
    which is negative.
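
    A short sketch of the conversion, with hypothetical function names.
    It assumes the usual case where int is wider than char, so every
    unsigned char value fits in an int.

```c
#include <stdio.h>

/* Mirrors the cast in K&R's getc macro: the stored char is read as
 * unsigned char before being widened to int. */
int char_as_getc_result(char c)
{
    return (unsigned char)c;
}

/* Nonzero only if the converted character would compare equal to EOF. */
int collides_with_eof(char c)
{
    return char_as_getc_result(c) == EOF;
}
```

    Even for a char holding -1 (possible where plain char is signed), the
    converted result is non-negative, so collides_with_eof() returns 0.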

    > Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    > UCHARMAX will clash with EOF == -1 when it gets promoted to int.


    (Not necessarily, but I see your point.)

    I am not aware of an implementation where this can actually happen;
    specifically, I'm under the impression that such implementations are likely
    to simply only ever yield values in some smaller range from getchar(),
    so that EOF can never occur. A typical choice might be to have a 32-bit
    char object, but to only store 8 bits at a time in files or retrieve
    8 bits at a time when reading files.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
     
    Seebs, Apr 22, 2010
    #3
  4. sandeep

    Alan Curry Guest

    In article <hqqco9$2t1$>,
    sandeep <> wrote:
    |Hello friends ~
    |
    |I am learning C from the K&R book. I have questions about Section 8.5
    |("an implementation of Fopen and Getc"). Although this section is UNIX(r)
    |specific I think all my questions are really about standard C... so the
    |ISO taliban can relax... :D

    Was this sample implementation written from scratch for the 2nd edition, or
    is it just an updated version of some code that predates the C standard?
    That would explain some of the things you're seeing...


    |
    |1> Look at this Macro
    |#define feof(p) ((p)->flag & _EOF) != 0)
    |
    |My question is: feof is only specified to return 0 or not 0. There is no
    |requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
    |to force it to be 0 or 1? This seems very inefficient, after all feof is
    |likely to be called many times.

    If this implementation predates the standard, then what feof was "specified"
    to return might have been less clear, so making it return 0 or 1 would have
    been the safe thing to do.

    |
    |3> In a comment on that getc Macro, K&R say: "The characters are returned
    |unsigned, which ensures that all characters will be positive". I don't
    |really understand the point of this, I usually use char not unsigned char
    |for characters. And in K&R, all strings are of type char* not unsigned
    |char*.
    |
    |Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    |UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    Regardless of whether this implementation predates the standard, I think it's
    safe to say that sizeof(char) == sizeof(int) was not even considered a remote
    possibility when getc was designed.

    --
    Alan Curry
     
    Alan Curry, Apr 22, 2010
    #4
  5. sandeep

    Eric Sosman Guest

    On 4/22/2010 5:26 PM, Keith Thompson wrote:
    > sandeep<> writes:
    >> [...]
    >> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >> UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    >
    > getc() returns a result of type int, not char. For example, if
    > UCHAR_MAX is 255, then getc() will return the value 255 if you read a
    > '\xff' character, and the value -1 (assuming EOF==-1) if you encounter
    > the end of the stream or an error. They clash only if you store the
    > result in something smaller than an int. So don't do that.


    I think you've misunderstood the question. On a system
    where UCHAR_MAX > INT_MAX, getc() et al. have a problem: It
    is possible to read unsigned char values that won't fit in an
    int and hence can't be returned properly. What happens later
    is of little importance, since the damage has been done within
    getc() itself.

    On such a system, I think we can deduce (for hosted
    implementations)

    - Conversion of values in (INT_MAX, UCHAR_MAX] doesn't raise
    a signal or do anything untoward, but instead yields some
    implementation-defined value. (At least, it does so inside
    getc() et al, which need not be written in C.)

    - Each unsigned char value converts to a distinct int value;
    even the out-of-range conversions preserve information.

    - Since there must be as many values in [INT_MIN, -1] as in
    the span of out-of-range values, INT_MIN + INT_MAX == -1.
    That is, two's complement is mandatory.

    To cater to such systems (should one feel it necessary), the
    familiar

    int ch;
    while ((ch = getc(stream)) != EOF) ...

    needs to be rewritten as

    int ch;
    while ((ch = getc(stream)) != EOF
    || !(feof(stream) || ferror(stream))) ...

    because getc() must map one valid input character value to
    the int value EOF.
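
    Wrapped in a complete function, the rewritten loop might look like
    the sketch below. The function name count_chars_paranoid is made up
    for this example.

```c
#include <stdio.h>

/* Counts characters, treating an EOF return value as the end of input
 * only when the end-of-file or error indicator is actually set. */
long count_chars_paranoid(FILE *stream)
{
    long n = 0;
    int ch;

    while ((ch = getc(stream)) != EOF
           || !(feof(stream) || ferror(stream)))
        n++;
    return n;
}
```

    On ordinary systems, where int is wider than char, this behaves the
    same as the familiar loop; the extra test matters only if a valid
    character can convert to the value EOF.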

    Let us now ponder the perils of in-band signalling.

    --
    Eric Sosman
     
    Eric Sosman, Apr 22, 2010
    #5
  6. Eric Sosman <> writes:
    > On 4/22/2010 5:26 PM, Keith Thompson wrote:
    >> sandeep<> writes:
    >>> [...]
    >>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >>> UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    >>
    >> getc() returns a result of type int, not char. For example, if
    >> UCHAR_MAX is 255, then getc() will return the value 255 if you read a
    >> '\xff' character, and the value -1 (assuming EOF==-1) if you encounter
    >> the end of the stream or an error. They clash only if you store the
    >> result in something smaller than an int. So don't do that.

    >
    > I think you've misunderstood the question.


    I think you're right. I managed to miss the "sizeof(char) ==
    sizeof(int)" part of the question.

    Well, I answered *some* question, just not the one the OP asked.

    [snip]

    > Let us now ponder the perils of in-band signalling.


    And of a language design that encourages it (by, for example, not
    providing a decent way for functions to return multiple values).

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 22, 2010
    #6
  7. sandeep <> wrote:
    > I am learning C from the K&R book. I have questions about
    > Section 8.5 ("an implementation of Fopen and Getc").
    > Although this section is UNIX(r) specific I think all my
    > questions are really about standard C... so the
    > ISO taliban can relax... :D


    Ahh, the Jacob Navia school of beginning by insulting the
    very people you're seeking comments from. Sure has worked
    well for him, hasn't it... ;)

    > 1> Look at this Macro
    > #define feof(p) ((p)->flag & _EOF) != 0)
    >
    > My question is: feof is only specified to return 0 or not 0.
    > There is no requirement for it to only return 0 or 1. So why
    > the unnecessary "!= 0" to force it to be 0 or 1? This seems
    > very inefficient, after all feof is likely to be called many
    > times.


    True, but it's most likely to be called in a conditional. Most
    compilers are quite capable of implementing expr != 0 without
    actually evaluating the != operator.

    > 2> Here is another macro
    > #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)
    > ->ptr++ :_fillbuf(p))
    > Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one
    > bracket for the function call and one bracket to stop
    > expansion of sideeffects in p?


    What do you mean by expansion of sideeffects?

    Note that function call parentheses and commas separating
    parameters are syntactical, so there's (generally) no need
    to 'protect' function parameters that represent expressions.

    If someone wants to pass an argument with a comma operator
    they'll have to supply parentheses to avoid a constraint
    violation on calling a function macro with too many
    arguments. [Although C99 now supports variadic macros.]

    > 3> In a comment on that getc Macro, K&R say: "The
    > characters are returned unsigned, which ensures that all
    > characters will be positive". I don't really understand
    > the point of this, I usually use char not unsigned char
    > for characters.


    Character codes are non-negative, hence getc's return.
    Plain char was invented for hysterical reasons.

    > And in K&R, all strings are of type char* not unsigned
    > char*.


    Plain char is a bane of C. It should have had two 'byte'
    types and char should have been a typedef char_t. But it
    isn't...

    > Also if sizeof(char) == sizeof(int)


    Then there are all sorts of problems for hosted
    implementations. Despite what some members of the
    Committee may say, many aspects of the standard
    library were not designed with that implementation
    in mind.

    > then the character (unsigned char) UCHARMAX will clash with EOF
    > == -1 when it gets promoted to int.


    The mapping is implementation defined, but yes, there will be
    overlap with EOF (which needn't be -1 BTW.) General practice
    though is to ignore such systems as hosted environments.

    --
    Peter
     
    Peter Nilsson, Apr 23, 2010
    #7
  8. Seebs <> writes:

    > On 2010-04-22, sandeep <> wrote:

    <snip>
    >> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >> UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    >
    > (Not necessarily, but I see your point.)
    >
    > I am not aware of an implementation where this can actually happen;
    > specifically, I'm under the impression that such implementations are likely
    > to simply only ever yield values in some smaller range from getchar(),
    > so that EOF can never occur. A typical choice might be to have a 32-bit
    > char object, but to only store 8 bits at a time in files or retrieve
    > 8 bits at a time when reading files.


    That may be reasonable from a practical point of view, but I don't think
    it is conforming. In

    int i;
    fread(&i, sizeof i, 1, fp);

    fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
    i times. getchar is also (indirectly) defined in terms of fgetc so I
    don't think there can be any special dispensation for it.

    --
    Ben.
     
    Ben Bacarisse, Apr 23, 2010
    #8
  9. Ben Bacarisse <> writes:
    > Seebs <> writes:
    >> On 2010-04-22, sandeep <> wrote:

    > <snip>
    >>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >>> UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    >>
    >> (Not necessarily, but I see your point.)
    >>
    >> I am not aware of an implementation where this can actually happen;
    >> specifically, I'm under the impression that such implementations are likely
    >> to simply only ever yield values in some smaller range from getchar(),
    >> so that EOF can never occur. A typical choice might be to have a 32-bit
    >> char object, but to only store 8 bits at a time in files or retrieve
    >> 8 bits at a time when reading files.

    >
    > That may be reasonable from a practical point of view, but I don't think
    > it is conforming. In
    >
    > int i;
    > fread(&i, sizeof i, 1, fp);
    >
    > fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
    > i times. getchar is also (indirectly) defined in terms of fgetc so I
    > don't think there can be any special dispensation for it.


    I don't think that by itself makes Seebs's hypothetical implementation
    non-conforming.

    What does make it non-conforming is that you wouldn't be able to
    write a byte with any value in the range 256..UCHAR_MAX to a file
    (in binary mode) and then read it back (also in binary mode) and
    get the same value.
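
    That round-trip property can be checked with a sketch like the one
    below. The function name byte_round_trips is hypothetical; tmpfile()
    is used because it opens a binary update stream.

```c
#include <stdio.h>

/* Writes one byte value to a temporary binary stream and reads it back.
 * Returns 1 if the value survives the round trip, 0 if it does not,
 * and -1 if the temporary file could not be created or written. */
int byte_round_trips(unsigned char value)
{
    FILE *fp = tmpfile();
    int back;
    int ok;

    if (fp == NULL)
        return -1;
    if (fputc(value, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);
    back = fgetc(fp);
    /* A genuine character leaves both stream indicators clear. */
    ok = !feof(fp) && !ferror(fp) && (unsigned char)back == value;
    fclose(fp);
    return ok;
}
```

    On the hypothetical implementation discussed above, calling this with
    a value above 255 would return 0, since only 8 bits reach the file.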

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 23, 2010
    #9
  10. sandeep

    Seebs Guest

    On 2010-04-22, Ben Bacarisse <> wrote:
    > That may be reasonable from a practical point of view, but I don't think
    > it is conforming. In
    >
    > int i;
    > fread(&i, sizeof i, 1, fp);
    >
    > fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
    > i times. getchar is also (indirectly) defined in terms of fgetc so I
    > don't think there can be any special dispensation for it.


    Interesting point. Hadn't thought of that.

    That brings us to the other answer, which is the frequent assertion that
    the requirement for EOF to be a distinct value means that you can't really
    have a fully conforming hosted implementation where sizeof(int) == 1.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
     
    Seebs, Apr 23, 2010
    #10
  11. Seebs <> writes:
    > On 2010-04-22, Ben Bacarisse <> wrote:
    >> That may be reasonable from a practical point of view, but I don't think
    >> it is conforming. In
    >>
    >> int i;
    >> fread(&i, sizeof i, 1, fp);
    >>
    >> fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
    >> i times. getchar is also (indirectly) defined in terms of fgetc so I
    >> don't think there can be any special dispensation for it.

    >
    > Interesting point. Hadn't thought of that.
    >
    > That brings us to the other answer, which is the frequent assertion that
    > the requirement for EOF to be a distinct value means that you can't really
    > have a fully conforming hosted implementation where sizeof(int) == 1.


    I don't think so, but such an implementation would be inconvenient and
    would break some code that's widely assumed to be portable.

    The problem is that fgetc() would return EOF either if the stream is at
    end-of-file, or a read error occurs, *or* the next character happens to
    have a value that converts to the value of EOF.

    I don't see anything in the standard that says this is illegal. But it
    does mean that a program meant to be portable to such a system can't
    just check whether fgetc() returns EOF; it would then also have to check
    both feof() and ferror(). If it doesn't, encountering that character
    (say, '\xFFFFFFFF') in an input file will fool it into thinking it's
    reached the end of the file.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 23, 2010
    #11
  12. Keith Thompson <> writes:

    > Ben Bacarisse <> writes:
    >> Seebs <> writes:
    >>> On 2010-04-22, sandeep <> wrote:

    >> <snip>
    >>>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >>>> UCHARMAX will clash with EOF == -1 when it gets promoted to int.
    >>>
    >>> (Not necessarily, but I see your point.)
    >>>
    >>> I am not aware of an implementation where this can actually happen;
    >>> specifically, I'm under the impression that such implementations are likely
    >>> to simply only ever yield values in some smaller range from getchar(),
    >>> so that EOF can never occur. A typical choice might be to have a 32-bit
    >>> char object, but to only store 8 bits at a time in files or retrieve
    >>> 8 bits at a time when reading files.

    >>
    >> That may be reasonable from a practical point of view, but I don't think
    >> it is conforming. In
    >>
    >> int i;
    >> fread(&i, sizeof i, 1, fp);
    >>
    >> fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
    >> i times. getchar is also (indirectly) defined in terms of fgetc so I
    >> don't think there can be any special dispensation for it.

    >
    > I don't think that by itself makes Seebs's hypothetical implementation
    > non-conforming.


    I can't see how an implementation that does what Seebs suggests can
    conform to 7.19.8.1 p2:

    "The fread function reads, into the array pointed to by ptr, up to
    nmemb elements whose size is specified by size, from the stream
    pointed to by stream. For each object, size calls are made to the
    fgetc function and the results stored, in the order read, in an array
    of unsigned char exactly overlaying the object. The file position
    indicator for the stream (if defined) is advanced by the number of
    characters successfully read. If an error occurs, the resulting value
    of the file position indicator for the stream is indeterminate. If a
    partial element is read, its value is indeterminate."

    That seems to say that when sizeof(int) == sizeof(char)

    fread(&i, sizeof i, 1, fp);

    must be equivalent to

    ((unsigned char *)&i)[0] = fgetc(fp);

    I'd have thought that someone reading 7.19.8.1 p2 would be able to expect
    that this second form is equivalent to the first in a conforming
    implementation.
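
    Read literally, the quoted paragraph describes something like the
    following sketch. This is only an illustration of the wording, not
    any real library's fread; the name fread_sketch is made up.

```c
#include <stdio.h>
#include <stddef.h>

/* Fills each element with `size` calls to fgetc, storing the results
 * through an unsigned char pointer that overlays the destination.
 * Returns the number of complete elements read. */
size_t fread_sketch(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
    unsigned char *out = ptr;
    size_t done, i;

    if (size == 0 || nmemb == 0)
        return 0;
    for (done = 0; done < nmemb; done++) {
        for (i = 0; i < size; i++) {
            int ch = fgetc(stream);
            if (ch == EOF && (feof(stream) || ferror(stream)))
                return done;      /* a partial element is not counted */
            out[done * size + i] = (unsigned char)ch;
        }
    }
    return done;
}
```

    With size equal to 1, each element is filled by exactly one fgetc
    call, which is the equivalence described above.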

    > What does make it non-conforming is that you wouldn't be able to
    > write a byte with any value in the range 256..UCHAR_MAX to a file
    > (in binary mode) and then read it back (also in binary mode) and
    > get the same value.


    That's a simpler argument!

    --
    Ben.
     
    Ben Bacarisse, Apr 23, 2010
    #12
  13. On Thu, 22 Apr 2010 16:49:13 -0400, sandeep <> wrote:

    > 1> Look at this Macro
    > #define feof(p) ((p)->flag & _EOF) != 0)
    >
    > My question is: feof is only specified to return 0 or not 0. There is no
    > requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
    > to force it to be 0 or 1? This seems very inefficient, after all feof is
    > likely to be called many times.


    You're right that the "!= 0" is unnecessary, but in practice it's not
    particularly inefficient, firstly because feof is almost only ever called
    inside a conditional, where its value would be implicitly compared with
    0 anyway, and any decent compiler would generate the same code for

    "if(expr)" and "if(expr != 0)"

    and secondly because when feof is used correctly, it's usually called at
    most once per file, only after fread or fgetc or other input function has
    returned a failure indication.

    --
    Morris Keesan --
     
    Morris Keesan, Apr 23, 2010
    #13
  14. sandeep

    spinoza1111 Guest

    On Apr 23, 4:49 am, sandeep <> wrote:
    > Hello friends ~
    >
    > I am learning C from the K&R book. I have questions about Section 8.5
    > ("an implementation of Fopen and Getc"). Although this section is UNIX(r)
    > specific I think all my questions are really about standard C... so the
    > ISO taliban can relax... :D


    At least the *taliban* of Afghanistan, as they listen to the *imam*
    groan *fatwa* in the *madrassah*, are concerned with the important
    question. Whereas the ISO *taliban* blaspheme for they make things of
    men their God, and show no compassion to others who speak not their
    *shibboleth*.
    >
    > 1> Look at this Macro
    > #define feof(p) ((p)->flag & _EOF) != 0)
    >
    > My question is: feof is only specified to return 0 or not 0. There is no
    > requirement for it to only return 0 or 1. So why the unnecessary "!= 0"
    > to force it to be 0 or 1? This seems very inefficient, after all feof is
    > likely to be called many times.
    >
    > 2> Here is another macro
    > #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
    > Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
    > function call and one bracket to stop expansion of sideeffects in p?
    >
    > 3> In a comment on that getc Macro, K&R say: "The characters are returned
    > unsigned, which ensures that all characters will be positive". I don't
    > really understand the point of this, I usually use char not unsigned char
    > for characters. And in K&R, all strings are of type char* not unsigned
    > char*.
    >
    > Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    > UCHARMAX will clash with EOF == -1 when it gets promoted to int.
    >
    > Regards ~
     
    spinoza1111, Apr 23, 2010
    #14
  15. Ben Bacarisse <> writes:
    > Keith Thompson <> writes:
    >> Ben Bacarisse <> writes:
    >>> Seebs <> writes:
    >>>> On 2010-04-22, sandeep <> wrote:
    >>> <snip>
    >>>>> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >>>>> UCHARMAX will clash with EOF == -1 when it gets promoted to int.
    >>>>
    >>>> (Not necessarily, but I see your point.)
    >>>>
    >>>> I am not aware of an implementation where this can actually happen;
    >>>> specifically, I'm under the impression that such implementations
    >>>> are likely to simply only ever yield values in some smaller range
    >>>> from getchar(), so that EOF can never occur. A typical choice
    >>>> might be to have a 32-bit char object, but to only store 8 bits at
    >>>> a time in files or retrieve 8 bits at a time when reading files.
    >>>
    >>> That may be reasonable from a practical point of view, but I don't think
    >>> it is conforming. In
    >>>
    >>> int i;
    >>> fread(&i, sizeof i, 1, fp);
    >>>
    >>> fread's behaviour is defined in terms of fgetc: fgetc is called sizeof
    >>> i times. getchar is also (indirectly) defined in terms of fgetc so I
    >>> don't think there can be any special dispensation for it.

    >>
    >> I don't think that by itself makes Seebs's hypothetical implementation
    >> non-conforming.

    >
    > I can't see how an implementation that does what Seebs suggests can
    > conform to 7.19.8.1 p2:
    >
    > "The fread function reads, into the array pointed to by ptr, up to
    > nmemb elements whose size is specified by size, from the stream
    > pointed to by stream. For each object, size calls are made to the
    > fgetc function and the results stored, in the order read, in an array
    > of unsigned char exactly overlaying the object. The file position
    > indicator for the stream (if defined) is advanced by the number of
    > characters successfully read. If an error occurs, the resulting value
    > of the file position indicator for the stream is indeterminate. If a
    > partial element is read, its value is indeterminate."
    >
    > That seem so say that when sizeof(int) == sizeof(char)
    >
    > fread(&i, sizeof i, 1, fp);
    >
    > must be equivalent to
    >
    > ((unsigned char *)&i)[0] = fgetc(fp);
    >
    > I'd have though that someone reading 7.19.8.1 p2 would be able to expect
    > that this second form is equivalent to the first in a conforming
    > implementation.


    Sure, they're equivalent. Neither one (in Seebs's hypothetical
    implementation) can ever read a value greater than 255. The standard
    doesn't say that you can necessarily read such a value from a file --
    unless you previously wrote it there, which brings us to:

    >> What does make it non-conforming is that you wouldn't be able to
    >> write a byte with any value in the range 256..UCHAR_MAX to a file
    >> (in binary mode) and then read it back (also in binary mode) and
    >> get the same value.

    >
    > That's a simpler argument!


    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 23, 2010
    #15
  16. On 22 Apr, 22:26, Keith Thompson <> wrote:
    > sandeep <> writes:


    > > [...] Although this section is UNIX(r)
    > > specific I think all my questions are really about standard C... so the
    > > ISO taliban can relax... :D

    >
    > I see the smiley, but referring to those of us who prefer to
    > discuss ISO C as "taliban" is a bit insulting, don't you think?


    I prefer to think of myself as part of the Congregation for the
    Doctrine of the Faith

    <snip>
     
    Nick Keighley, Apr 23, 2010
    #16
  17. Keith Thompson <> writes:

    > Ben Bacarisse <> writes:

    <snip quote of 7.19.8.1 p2>
    >> That seem so say that when sizeof(int) == sizeof(char)
    >>
    >> fread(&i, sizeof i, 1, fp);
    >>
    >> must be equivalent to
    >>
    >> ((unsigned char *)&i)[0] = fgetc(fp);
    >>
    >> I'd have though that someone reading 7.19.8.1 p2 would be able to expect
    >> that this second form is equivalent to the first in a conforming
    >> implementation.

    >
    > Sure, they're equivalent. Neither one (in Seebs's hypothetical
    > implementation) can ever read a value greater than 255. The standard
    > doesn't say that you can necessarily read such a value from a file --
    > unless you previously wrote it there, which brings us to:
    >
    >>> What does make it non-conforming is that you wouldn't be able to
    >>> write a byte with any value in the range 256..UCHAR_MAX to a file
    >>> (in binary mode) and then read it back (also in binary mode) and
    >>> get the same value.


    Right. I see your point. My example simply shows the limited utility
    of such an implementation, not its actual non-conformance.
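
    The round-trip property Keith describes can be checked directly. What
    follows is only a sketch, assuming a hosted implementation where int
    is wider than char (so UCHAR_MAX < UINT_MAX and EOF cannot collide
    with a byte value); the function name bytes_round_trip is invented
    for illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Write every unsigned char value to a binary stream, then read the
   values back with fgetc. Returns 1 if every value survives the round
   trip, 0 on any mismatch or I/O failure (including tmpfile failing).
   Assumes UCHAR_MAX < UINT_MAX, so the loop counter cannot wrap. */
static int bytes_round_trip(void)
{
    FILE *fp = tmpfile();   /* the standard says this opens in "wb+" mode */
    unsigned int v;
    int ok = 1;

    if (fp == NULL)
        return 0;

    for (v = 0; v <= UCHAR_MAX && ok; v++) {
        if (fputc((int)v, fp) == EOF)
            ok = 0;
    }

    rewind(fp);

    for (v = 0; v <= UCHAR_MAX && ok; v++) {
        int c = fgetc(fp);  /* an unsigned char value as int, or EOF */
        if (c == EOF || (unsigned int)c != v)
            ok = 0;
    }

    fclose(fp);
    return ok;
}
```

    On the hypothetical implementation discussed above, the second loop
    would fail for any value in 256..UCHAR_MAX, which is the
    non-conformance Keith points to.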

    --
    Ben.
     
    Ben Bacarisse, Apr 23, 2010
    #17
  18. sandeep

    Eric Sosman Guest

    On 4/22/2010 8:16 PM, Seebs wrote:
    >
    > That brings us to the other answer, which is the frequent assertion that
    > the requirement for EOF to be a distinct value means that you can't really
    > have a fully conforming hosted implementation where sizeof(int) == 1.


    EOF is required to be an int and required to be negative,
    but where is it *required* to be distinct from "legitimate"
    unsigned char values converted to int? 7.4p1 says that the
    argument to a <ctype.h> function must be representable as an
    unsigned char "or" equal to EOF, but unless one takes the "or"
    as "xor" I can't see a requirement for distinctness.
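
    For reference, the usual way to stay inside the 7.4p1 domain looks
    something like the sketch below. The function name count_alpha is
    made up for illustration, and the comment about EOF being distinct
    holds only on the ordinary assumption that int is wider than char.

```c
#include <ctype.h>
#include <stdio.h>

/* Count the alphabetic characters in a string. Plain char may be
   signed, so each element is converted to unsigned char before it
   reaches isalpha: 7.4p1 only defines the <ctype.h> functions for
   arguments representable as unsigned char, or equal to EOF.
   When int is wider than char, the converted values are all
   non-negative and so cannot equal the negative EOF. */
static size_t count_alpha(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

    Nothing in this sketch depends on EOF being distinct from a
    character value; it only avoids passing other negative values.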

    --
    Eric Sosman
    lid
     
    Eric Sosman, Apr 23, 2010
    #18
  19. sandeep

    Phil Carmody Guest

    Keith Thompson <> writes:
    > sandeep <> writes:
    >> I am learning C from the K&R book. I have questions about Section 8.5
    >> ("an implementation of Fopen and Getc"). Although this section is UNIX(r)
    >> specific I think all my questions are really about standard C... so the
    >> ISO taliban can relax... :D

    >
    > I see the smiley, but referring to those of us who prefer to
    > discuss ISO C as "taliban" is a bit insulting, don't you think?
    > (And yes, I know the word literally means "students", but I doubt
    > that that's what you meant.)

    ....
    >> 2> Here is another macro
    >> #define getc(p) (--(p)->cnt>=0 ?(unsigned char)*(p)->ptr++ :_fillbuf(p))
    >> Doesn't that _fillbuf(p) ought to be _fillbuf((p)), one bracket for the
    >> function call and one bracket to stop expansion of sideeffects in p?

    >
    > No, extra parentheses aren't needed. As long as the name of the macro
    > parameter is immediately surrounded by parentheses (or brackets),
    > there's no problem with operator precedence.
    >
    > And it's not about "expansion of side effects", it's about operator
    > precedence, i.e., which operators are associated with which operands.
    > Any side effects will occur anyway.


    That's not quite the whole story, as the comma operator would
    break the attempted _fillbuf call, were it to be able to reach
    it, as it would not be interpreted as a comma operator. However,
    that's not important because the processing of the macro can
    never reach that stage, as it would require something that looks
    like an invocation of the getc with 2 parameters, which wouldn't
    be recognised as an instantiation of the above.

    >> 3> In a comment on that getc Macro, K&R say: "The characters are returned
    >> unsigned, which ensures that all characters will be positive". I don't
    >> really understand the point of this, I usually use char not unsigned char
    >> for characters. And in K&R, all strings are of type char* not unsigned
    >> char*.
    >>
    >> Also if sizeof(char) == sizeof(int) then the character (unsigned char)
    >> UCHARMAX will clash with EOF == -1 when it gets promoted to int.

    >
    > getc() returns a result of type int, not char. For example, if
    > UCHAR_MAX is 255, then


    then sizeof(char) != sizeof(int), which was the predicate that
    we were asked to address.

    Phil
    --
    I find the easiest thing to do is to k/f myself and just troll away
    -- David Melville on r.a.s.f1
     
    Phil Carmody, Apr 23, 2010
    #19
  20. sandeep

    Eric Sosman Guest

    On 4/23/2010 3:17 PM, Kenneth Brody wrote:
    > On 4/23/2010 2:49 PM, Phil Carmody wrote:
    > [...]
    >> That's not quite the whole story, as the comma operator would
    >> break the attempted _fillbuf call, were it to be able to reach
    >> it, as it would not be interpreted as a comma operator. However,
    >> that's not important because the processing of the macro can
    >> never reach that stage, as it would require something that looks
    >> like an invocation of the getc with 2 parameters, which wouldn't
    >> be recognised as an instantiation of the above.

    >
    > Consider: getc((foo,bar))


    Considered, but I don't see your point. What breakage
    do you believe would ensue?
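
    To make the case concrete, here is a toy version of the macro that
    can be compiled in isolation. It is a sketch only: struct toy_file,
    toy_fillbuf, TOY_GETC and read_via_comma_argument are hypothetical
    stand-ins, not the K&R definitions, and the refill routine simply
    reports end of input.

```c
#include <stddef.h>

/* Hypothetical stand-in for the K&R FILE structure; only the two
   fields the macro touches are modelled. */
struct toy_file {
    int cnt;                    /* characters left in the buffer */
    const unsigned char *ptr;   /* next character position */
};

/* Hypothetical refill routine: this sketch has nothing to refill
   from, so it just reports end of input with -1. */
static int toy_fillbuf(struct toy_file *fp)
{
    fp->cnt = 0;
    return -1;
}

/* Same shape as the getc macro quoted earlier in the thread. */
#define TOY_GETC(p) (--(p)->cnt >= 0 \
                     ? (unsigned char)*(p)->ptr++ : toy_fillbuf(p))

/* The inner parentheses hide the comma from the preprocessor, so the
   macro sees ONE argument. After expansion the call reads
   toy_fillbuf(((void)skipped, used)): a comma expression inside
   parentheses, still a single function argument. */
static int read_via_comma_argument(struct toy_file *skipped,
                                   struct toy_file *used)
{
    return TOY_GETC(((void)skipped, used));
}
```

    The argument expression is evaluated more than once per invocation,
    as with any function-like macro that mentions its parameter several
    times, but the comma itself causes no trouble.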

    --
    Eric Sosman
    lid
     
    Eric Sosman, Apr 23, 2010
    #20