cast musings

Discussion in 'C Programming' started by mathog, Mar 19, 2013.

  1. mathog

    mathog Guest

    { Introductory material, skip down to the end bracket if you don't
    care how I got onto this subject.

    I have recently been taking some heat for using constructs like this:

    float some_float;
    /* code sets some_float */
    int i = (int) round(some_float);

    in C++ code. Some programmers claim that if it is not rewritten as

    int i = static_cast<int>(round(some_float));

    the sky will fall. So far none of the folks claiming this
    have actually presented an example where using C style casts on these
    sorts of simple data types actually does something unexpected. They
    have a point where inheritance is involved, but there is none of that in
    the code in question. I like () better here because it doesn't waste 11
    characters on every cast, which becomes a factor if more than one cast
    must fit on the same line.
    }

    (For everything below here assume that float and int are the same size.)

    This got me thinking about C casts, where

    int i = (int) round(some_float); /* 1a */

    specifies a type conversion, whereas this

    int i = *(int *)(&some_float); /* 2a */

    specifies a reinterpretation of the bits in memory. I always thought
    the second form was a pretty awkward way to accomplish this, and that
    there was room for improvement there. It isn't that the meaning is not
    clear, it is just that having to resort to using a pointer to the data
    in order to get its bits reinterpreted (still) feels like a kludge.
    There is also "const" and "volatile" to consider, which are fine in
    declarations but have always seemed too lengthy when placed inside casts.
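
    To make the distinction concrete, here is a minimal sketch of the two
    operations in today's C. It assumes a 32-bit IEEE 754 float, the helper
    names are made up for illustration, and it copies the bytes with memcpy()
    instead of dereferencing a cast pointer as in 2a.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float is exactly 32 bits wide. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Value conversion, as in (int) some_float: truncates toward zero. */
static int convert_to_int(float f)
{
    return (int)f;
}

/* Bit reinterpretation: copy the object representation into an integer. */
static uint32_t bits_of_float(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

    On an IEEE 754 machine convert_to_int(1.0f) gives 1, while
    bits_of_float(1.0f) gives 0x3F800000.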

    So bear with me and consider this more generalized form for a C cast
    (obviously this is entirely hypothetical, there are no compilers that do
    this):

    ([task] [access] type)

    Where type must be specified and task and access are optional,
    and:

    Task specifies the action of the cast, one of:
        to    Short for "convert to". The target is to be
              converted to the type specified.
              Default when no [task] is present.
        as    Short for "use as". The target's bits in memory
              are to be interpreted as specified by this cast.
    Access specifies read/write access to the result of the cast one of:
        rw    read/write access.
              Default when no [access] is specified.
              (Is there currently a keyword to specify this?)
        ro    read only access (equivalent to "const")
        wo    write only access (equivalent to "volatile")
    Type is the "output" type of the cast, like int or double, just
    as in current C casts.

    The general form allows these alternate casts:

    int i = (int) round(some_float); /* 1a */
    int i = (to int) round(some_float); /* 1b */
    int i = (to rw int) round(some_float); /* 1c */

    int i = *(int *)(&some_float); /* 2a */
    int i = (as int) some_float; /* 2b */

    const int i = *(const int *)(&some_float); /* 3a */
    const int i = (as ro int)some_float; /* 3b */

    int i = (int) some_float; /* 4a */
    int i = (to int) some_float; /* 4b */

    function((volatile float *)&some_float); /* 5a */
    function((wo float *)&some_float); /* 5b */

    Not a big win in clarity for 1b or 1c vs. 1a, since the extra
    text is just specifying defaults. 1b might be a bit clearer
    for a beginner, but after a few weeks those training wheels
    would come off. I think 2b is clearer than 2a, and 3b clearer
    than 3a. Incorrect code, with the wrong task employed, might
    be easier to spot if the 4b and 2b forms were uniformly employed, but
    otherwise not a clean win for 4b over 4a. 5b versus 5a feels like a wash.

    Is something along these lines worth incorporating into the C language,
    or no? Clearly we can get along without it - I am just wondering if
    adding this would be beneficial.

    Regards,

    David Mathog
    mathog, Mar 19, 2013
    #1

  2. mathog

    Shao Miller Guest

    On 3/19/2013 16:29, mathog wrote:
    >
    > (For everything below here assume that float and int are the same size.)
    >
    > This got me thinking about C casts, where
    >
    > int i = (int) round(some_float); /* 1a */
    >
    > specifies a type conversion, whereas this
    >
    > int i = *(int *)(&some_float); /* 2a */
    >
    > specifies a reinterpretation of the bits in memory. I always thought
    > the second form was a pretty awkward way to accomplish this, and that
    > there was room for improvement there. It isn't that the meaning is not
    > clear, it is just that having to resort to using a pointer to the data
    > in order to get its bits reinterpreted (still) feels like a kludge.
    > There is also "const" and "volatile" to consider, which are fine in
    > declarations but have always seemed too lengthy when placed inside casts.
    >


    You don't often want to do this. The result of the cast might not
    address properly-aligned memory, which would be undefined behaviour.
    Getting past that, the indirection might break effective type rules,
    which would be undefined behaviour. Sometimes, when you know what
    you're doing, there isn't a risk of undefined behaviour, but better
    constructs are available.

    > So bear me with me and consider this more generalized form for a C cast
    > (obviously this is entirely hypothetical, there are no compilers that do
    > this):
    >
    > ([task] [access] type)
    >
    > Where type must be specified and task and access are optional,
    > and:
    >
    > Task specifies the action of the cast, one of:
    > to Short for "convert to". The target is to be
    > converted to the type specified.
    > Default when no [task] is present.
    > as Short for "use as". The target's bits in memory
    > are to be interpreted as specified by this cast.
    > Access specifies read/write access to the result of the cast one of:
    > rw read/write access.
    > Default when no [access] is specified.
    > (Is there currently a keyword to specify this?)
    > ro read only access (equivalent to "const")
    > wo write only access (equivalent to "volatile")


    "wo" would be a misnomer, since a 'volatile' can be read.

    > Type is the "output" type of the cast, like int or double, just
    > as in current C casts.
    >
    > The general form allows these alternate casts:
    >
    > int i = (int) round(some_float); /* 1a */
    > int i = (to int) round(some_float); /* 1b */
    > int i = (to rw int) round(some_float); /* 1c */
    >
    > int i = *(int *)(&some_float); /* 2a */
    > int i = (as int) some_float; /* 2b */
    >
    > const int i = *(const int *)(&some_float); /* 3a */
    > const int i = (as ro int)some_float; /* 3b */
    >
    > int i = (int) some_float; /* 4a */
    > int i = (to int) some_float; /* 4b */
    >
    > function((volatile float *)&some_float); /* 5a */
    > function((wo float *)&some_float); /* 5b */
    >
    > Not a big win in clarity for 1b or 1c vs. 1a, since the extra
    > text is just specifying defaults. 1b might be a bit clearer
    > for a beginner, but after a few weeks those training wheels
    > would come off. I think 2b is clearer than 2a, and 3b clearer
    > than 3a. Incorrect code, with the wrong task employed, might
    > be easier to spot if the 4b and 2b forms were uniformly employed, but
    > otherwise not a clean win for 4b over 4a. 5b versus 5a feels like a wash.
    >
    > Is something along these lines worth incorporating into the C language,
    > or no? Clearly we can get along without it - I am just wondering if
    > adding this would be beneficial.
    >


    Interesting stuff, but I don't understand what the "access" is supposed
    to specify... We already have 'const' and 'volatile', so...? Also, the
    result of a cast is not an lvalue, so it can't be modified, so...?

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
    Shao Miller, Mar 19, 2013
    #2

  3. mathog <> wrote:
    > { Introductory material, skip down to the end bracket if you don't
    > care how I got onto this subject.


    (snip on C++, casts, and such)

    > (For everything below here assume that float and int are the same size.)


    > This got me thinking about C casts, where


    > int i = (int) round(some_float); /* 1a */


    > specifies a type conversion, whereas this


    > int i = *(int *)(&some_float); /* 2a */


    > specifies a reinterpretation of the bits in memory.


    Well, to me it is that the cast specifies a conversion
    for the pointer. While on most systems the bit representations
    for pointers to different types are the same, on some machines
    they are different. Specifically, on word addressed machines
    where char is smaller than a machine word.

    > I always thought the second form was a pretty awkward way to
    > accomplish this, and that there was room for improvement there.


    I suppose so. In Fortran, we used to do this with EQUIVALENCE,
    but now there is TRANSFER.

    > It isn't that the meaning is not clear, it is just that
    > having to resort to using a pointer to the data in order
    > to get its bits reinterpreted (still) feels like a kludge.


    I suppose so, but in most cases where you use it, it is
    a kludge.

    But OK, PL/I has the UNSPEC function and pseudo-variable so
    you can write:

    UNSPEC(F)=UNSPEC(I);

    as you note, assuming that they are the same size. UNSPEC converts
    to or from a bit string.

    Java has floatToIntBits, doubleToLongBits, and their inverse functions.
    Otherwise, the kludges, by definition, don't work in Java.


    (snip related to const and volatile)

    -- glen
    glen herrmannsfeldt, Mar 19, 2013
    #3
  4. mathog

    James Kuyper Guest

    On 03/19/2013 04:29 PM, mathog wrote:
    > { Introductory material, skip down to the end bracket if you don't
    > care how I got onto this subject.
    >
    > I have recently been taking some heat for using constructs like this:
    >
    > float some_float;
    > /* code sets some_float */
    > int i = (int) round(some_float);
    >
    > in C++ code. Some programmers claim that if it is not rewritten as
    >
    > int i = static_cast<int> round(some_float);
    >
    > the sky will fall. So far none of the folks claiming this
    > have actually presented an example where using C style casts on these
    > sorts of simple data types actually does something unexpected. They
    > have a point where inheritance is involved, but there is none of that in
    > the code in question. I like () better here because it doesn't waste 11
    > characters on every cast, which becomes a factor if more than one cast
    > must fit on the same line.
    > }


    The fundamental problem is that all of the safe conversions occur
    implicitly without requiring a cast. Any conversion for which a cast is
    actually needed is dangerous. Therefore, they should be made easy to
    find and easy to notice, so they can receive the careful attention that
    they require.

    That is part of the reason why C++'s named casts were made as long as
    they are: to make it easier to search for them, and to discourage their
    unnecessary use. It is unfortunately too late to do the same for the
    C-style casts, which had to be allowed for backwards compatibility with
    C (and, at this late date, for backwards compatibility with older
    versions of C++ as well).

    Each named cast can only do certain kinds of conversions, which is
    useful, because if the type of the thing you are converting is
    different from what you thought it was when you wrote the code, the
    cast might have unexpected consequences. The C style cast can do almost
    anything that can be done with a named cast, and a couple of obscure
    additional things as well, and that's part of what makes it so
    dangerous. Consistently using the named casts rather than a C-style cast
    turns much of the otherwise-dangerous code into unexpected constraint
    violations requiring a diagnostic, because that particular named cast
    cannot be used to perform the specified conversion.

    In C++, function overloading and templates make it a commonplace for the
    type of an expression to depend in a complex fashion upon far-distant
    code, which gives this feature greater value than it would have in C.
    However, even in C, typedefs in third-party headers can make it hard to
    be sure of the type of an expression.
    James Kuyper, Mar 19, 2013
    #4
  5. mathog

    mathog Guest

    Shao Miller wrote:
    > On 3/19/2013 16:29, mathog wrote:
    >> int i = *(int *)(&some_float); /* 2a */
    >>
    >> specifies a reinterpretation of the bits in memory. I always thought
    >> the second form was a pretty awkward way to accomplish this, and that
    >> there was room for improvement there. It isn't that the meaning is not
    >> clear, it is just that having to resort to using a pointer to the data
    >> in order to get its bits reinterpreted (still) feels like a kludge.
    >> There is also "const" and "volatile" to consider, which are fine in
    >> declarations but have always seemed too lengthy when placed inside casts.
    >>

    >
    > You don't often want to do this.


    Agreed, but sometimes you have to.

    >> Access specifies read/write access to the result of the cast one of:
    >> rw read/write access.
    >> Default when no [access] is specified.
    >> (Is there currently a keyword to specify this?)
    >> ro read only access (equivalent to "const")
    >> wo write only access (equivalent to "volatile")

    >
    > "wo" would be a misnomer, since a 'volatile' can be read.


    Oops, right, this is not exactly "volatile". The case I was thinking of
    for "wo" was a store to a write only register mapped into memory. It
    might be that a read from that address would return some value, but that
    value is meaningless. The "wo" tells the compiler it must not generate
    any code that would read from that address. The closest thing I could
    think of to that was "volatile", which tells the compiler that the value
    at that address may change, which is not quite the same thing.

    >
    > Interesting stuff, but I don't understand what the "access" is supposed
    > to specify... We already have 'const' and 'volatile', so...? Also, the
    > result of a cast is not an lvalue, so it can't be modified, so...?
    >


    Hmm, the access concept does seem a little underdeveloped. "to" and
    "as" were the starting point. Still, would not the casts below affect
    lvalues?

    char *base; /* base of stack of 3 registers */
    /* something sets base */
    *(rw char *)(base+0) = 1; /* first register is RW, */
    *(wo char *)(base+1) = 2; /* second register is WO */
    *(ro char *)(base+2) = 3; /* third register is RO */

    and the compiler throws an error on the last one.
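
    For comparison, a rough sketch of how far the existing qualifiers get
    with this example. The register bank is faked with an ordinary array so
    the code can run anywhere, and the function names are invented. Standard
    C has no write-only qualifier, so the second register can only be
    treated as write-only by convention; the read-only one is modeled with
    const volatile, which is what makes the compiler reject a store.

```c
#include <assert.h>

/* Hypothetical bank of three one-byte registers, faked with plain memory. */
static unsigned char fake_bank[3];

/* RW register: volatile makes every access actually happen. */
static void reg0_write(unsigned char v)
{
    *(volatile unsigned char *)(fake_bank + 0) = v;
}

static unsigned char reg0_read(void)
{
    return *(volatile unsigned char *)(fake_bank + 0);
}

/* "WO" register: only a writer is provided; the language itself
   does nothing to forbid a read through the same pointer type. */
static void reg1_write(unsigned char v)
{
    *(volatile unsigned char *)(fake_bank + 1) = v;
}

/* RO register: reads only. A store such as
       *(const volatile unsigned char *)(fake_bank + 2) = 3;
   is a constraint violation and must be diagnosed. */
static unsigned char reg2_read(void)
{
    return *(const volatile unsigned char *)(fake_bank + 2);
}
```

    So "ro" is already expressible and enforced, while "wo" has no
    counterpart that the compiler checks.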


    Regards,

    David Mathog
    mathog, Mar 19, 2013
    #5
  6. mathog

    Jorgen Grahn Guest

    On Tue, 2013-03-19, mathog wrote:
    ....
    > I have recently been taking some heat for using constructs like this:
    >
    > float some_float;
    > /* code sets some_float */
    > int i = (int) round(some_float);
    >
    > in C++ code. Some programmers claim that if it is not rewritten as
    >
    > int i = static_cast<int> round(some_float);
    >
    > the sky will fall.


    Maybe I'm ignorant, but I fail to see why a cast is needed at all.

    > So far none of the folks claiming this
    > have actually presented an example where using C style casts on these
    > sorts of simple data types actually does something unexpected.


    It doesn't of course. It's a style issue, and a question of
    maintainability.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Mar 19, 2013
    #6
  7. mathog

    mathog Guest

    James Kuyper wrote:

    > In C++, function overloading and templates make it a commonplace for the
    > type of an expression to depend in a complex fashion upon far-distant
    > code, which gives this feature greater value than it would have in C.
    > However, even in C, typedefs in third-party headers can make it hard to
    > be sure of the type of an expression.


    In general I agree with the preceding - except the last sentence. While
    the programmer might have some difficulty figuring out where something
    is defined, the compiler will do it successfully - or throw an error and
    exit.

    In the specific application a data file consisting of about 100 types of
    records is read into a char buffer in memory, and that is traversed
    record by record using a (char *) pointer. In all instances the
    structure corresponding to each record is known when the compiler does
    its work. The code is basically just this sort of thing:

    char *cptr;
    /* ... processing input in a loop ... */
    COMMON_PREFIX *pre = (COMMON_PREFIX *)cptr;
    switch (pre->type){
      /* ... other cases ... */
      case RECORDTYPE32:
      {
        RTYPE32 *prec = (RTYPE32 *) cptr;
        // access fields via prec structure
        break;
      }
      /* ... other cases ... */
    }
    cptr += pre->recordsize;

    where the structures for COMMON_PREFIX and RTYPE32 and the value for
    RECORDTYPE32 will not change at run time. In this context, I just do
    not see how the C style casts are in any way more dangerous than the
    C++ style casts. The only major complication in this approach concerns
    the proper alignment of the structs, and both types of casts will have
    the same issues with that.

    Regards,

    David Mathog
    mathog, Mar 19, 2013
    #7
  8. mathog <> wrote:
    > James Kuyper wrote:


    >> In C++, function overloading and templates make it a commonplace for the
    >> type of an expression to depend in a complex fashion upon far-distant
    >> code, which gives this feature greater value than it would have in C.
    >> However, even in C, typedefs in third-party headers can make it hard to
    >> be sure of the type of an expression.


    > In general I agree with the preceding - except the last sentence. While
    > the programmer might have some difficulty figuring out where something
    > is defined, the compiler will do it successfully - or throw an error and
    > exit.


    Somewhere you mentioned assuming that sizeof(int)==sizeof(float).

    It might be true on your machine, but not on the machine someone else
    wants to run your program on.

    It might also be dependent on the exact representation of int and
    float on a machine, including endianness.

    > In the specific application a data file consisting of about 100 types of
    > records is read into a char buffer in memory, and that is traversed
    > record by record using a (char *) pointer. In all instances the
    > structure corresponding to each record is known when the compiler does
    > its work. The code is basically just this sort of thing:
    >
    > char *cptr;
    > ...processing input in a loop ...
    > COMMON_PREFIX *pre = (COMMON_PREFIX *)cptr;
    > switch (pre->type){
    > ... other cases ...
    > case RECORDTYPE32:
    > {
    > RTYPE32 *prec = (RTYPE32 *) cptr;
    > // access fields via prec structure
    > break;
    > }
    > ... other cases ...
    > }
    > cptr += pre->recordsize;


    Often enough, I am lazy, know it only needs to work on one specific
    machine, and just do it.

    > where the structures for COMMON_PREFIX and RTYPE32 and the value for
    > RECORDTYPE32 will not change at run time. In this context, I just do
    > not see see how the C style casts are in any way more dangerous than the
    > C++ style casts. The only major complication in this approach concerns
    > the proper alignment of the structs, and both types of casts will have
    > the same issues with that.


    The way to get around alignment is to memcpy() to/from appropriately
    aligned fields. It takes a little more work, but can sometimes be
    automated. There is, for example, XDR, which is designed specifically
    for transferring data between dissimilar machines.
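
    A minimal sketch of the memcpy() approach for the record loop quoted
    above. The layout of the prefix (two 32-bit fields, type then size) is
    invented for illustration, since the real COMMON_PREFIX was not shown,
    and the bytes are assumed to be in host byte order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for COMMON_PREFIX; the real layout was not posted. */
struct common_prefix {
    uint32_t type;
    uint32_t recordsize;
};

/*
 * Copy the prefix out of the byte buffer instead of casting the pointer.
 * The destination is an ordinary object, so it is suitably aligned even
 * when buf + offset is not.
 */
static struct common_prefix read_prefix(const unsigned char *buf, size_t offset)
{
    struct common_prefix pre;
    memcpy(&pre, buf + offset, sizeof pre);
    return pre;
}
```

    The loop would then call read_prefix(buf, offset) for each record and
    advance offset by the returned recordsize.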

    -- glen
    glen herrmannsfeldt, Mar 19, 2013
    #8
  9. mathog

    James Kuyper Guest

    On 03/19/2013 06:02 PM, mathog wrote:
    > James Kuyper wrote:
    >
    >> In C++, function overloading and templates make it a commonplace for the
    >> type of an expression to depend in a complex fashion upon far-distant
    >> code, which gives this feature greater value than it would have in C.
    >> However, even in C, typedefs in third-party headers can make it hard to
    >> be sure of the type of an expression.

    >
    > In general I agree with the preceding - except the last sentence. While
    > the programmer might have some difficulty figuring out where something
    > is defined, the compiler will do it successfully - or throw an error and
    > exit.


    I was thinking primarily about the human difficulties, not the computer
    ones, and not all of the mistakes that can be made by humans due to
    being unsure of a data type result in situations where a diagnostic is
    required. For instance, a typedef for an unsigned type that might or
    might not be smaller than 'int', depending upon which platform it's
    compiled on, could leave a programmer uncertain whether the usual
    arithmetic conversions will cause a given expression to be evaluated
    using signed or unsigned arithmetic. There's no guarantee that making a
    mistake about that issue will result in code for which a diagnostic is
    required. Using the size-named types that were introduced in C99 is
    another potential source for similar uncertainties.
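
    A short sketch of the kind of uncertainty I mean. Whether the first
    subtraction below is carried out in signed or unsigned arithmetic
    depends on whether int can represent every value of the 16-bit type,
    and neither outcome requires a diagnostic. The function names are
    invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * uint16_t operands are promoted to int when int can hold all of their
 * values (the usual case with a 32-bit int), so a - b is computed as a
 * signed int and 0 - 1 gives -1. With a 16-bit int they would promote to
 * unsigned int instead and the result would wrap to a large positive value.
 */
static int u16_difference_is_positive(uint16_t a, uint16_t b)
{
    return (a - b) > 0;
}

/*
 * unsigned int operands are never promoted to a signed type, so a - b
 * wraps modulo UINT_MAX + 1 and 0 - 1 gives UINT_MAX.
 */
static int uint_difference_is_positive(unsigned int a, unsigned int b)
{
    return (a - b) > 0;
}
```

    The two function bodies are textually identical; only the typedef
    behind the parameters decides which arithmetic is used.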

    > In the specific application a data file consisting of about 100 types of
    > records is read into a char buffer in memory, and that is traversed
    > record by record using a (char *) pointer. In all instances the
    > structure corresponding to each record is known when the compiler does
    > its work. The code is basically just this sort of thing:
    >
    > char *cptr;
    > ...processing input in a loop ...
    > COMMON_PREFIX *pre = (COMMON_PREFIX *)cptr;


    Well-motivated and widely followed coding conventions reserve names in
    all caps for macros; but the context suggests that COMMON_PREFIX should
    instead be a typedef.

    > switch (pre->type){
    > ... other cases ...
    > case RECORDTYPE32:
    > {
    > RTYPE32 *prec = (RTYPE32 *) cptr;


    A lot of potential problems with such code become less clear when the
    relationship between COMMON_PREFIX and RTYPE32 is unknown.

    > // access fields via prec structure
    > break;
    > }
    > ... other cases ...
    > }
    > cptr += pre->recordsize;
    >
    > where the structures for COMMON_PREFIX and RTYPE32 and the value for
    > RECORDTYPE32 will not change at run time. In this context, I just do
    > not see see how the C style casts are in any way more dangerous than the
    > C++ style casts. The only major complication in this approach concerns
    > the proper alignment of the structs, and both types of casts will have
    > the same issues with that.


    I'd wondered in your previous message why you were talking so much about
    C++ in a newsgroup devoted to C, but the reason seemed to be that a
    discussion in a C++ context got you thinking about C. That explanation
    is getting harder to accept - it looks to me like your primary interest
    is C-style casts vs. C++ casts, an issue that can only matter in a C++
    context. If that's the case, it really belongs on a C++ newsgroup,
    where more people are better qualified to discuss the issue.

    Without knowing anything about the nature of COMMON_PREFIX and RTYPE32,
    I can't be sure, but this looks to me a lot like something that, in C++,
    would be better handled by type derivation and virtual member functions,
    rather than type codes and switch statements. It's a lot more type safe
    that way, and avoiding that cast is a key part of the reason why.
    James Kuyper, Mar 19, 2013
    #9
  10. mathog

    Tim Rentsch Guest

    mathog <> writes:

    > { Introductory material, skip down to the end bracket if you don't
    > care how I got onto this subject.
    >
    > I have recently been taking some heat for using constructs like this:
    >
    > float some_float;
    > /* code sets some_float */
    > int i = (int) round(some_float);
    >
    > in C++ code. Some programmers claim that if it is not rewritten as
    >
    > int i = static_cast<int> round(some_float);
    >
    > the sky will fall. So far none of the folks claiming this
    > have actually presented an example where using C style casts on these
    > sorts of simple data types actually does something unexpected. They
    > have a point where inheritance is involved, but there is none of that
    > in the code in question. I like () better here because it doesn't
    > waste 11 characters on every cast, which becomes a factor if more than
    > one cast must fit on the same line.
    > }
    >
    > (For everything below here assume that float and int are the same size.)
    >
    > This got me thinking about C casts, where
    >
    > int i = (int) round(some_float); /* 1a */
    >
    > specifies a type conversion, whereas this
    >
    > int i = *(int *)(&some_float); /* 2a */
    >
    > specifies a reinterpretation of the bits in memory. I always thought
    > the second form was a pretty awkward way to accomplish this, and that
    > there was room for improvement there. It isn't that the meaning is
    > not clear, it is just that having to resort to using a pointer to the
    > data in order to get its bits reinterpreted (still) feels like a
    > kludge. There is also "const" and "volatile" to consider, which are
    > fine in declarations but have always seemed too lengthy when placed
    > inside casts.
    >
    > So bear me with me and consider this more generalized form for a C
    > cast (obviously this is entirely hypothetical, there are no compilers
    > that do this):
    >
    > ([task] [access] type)
    >
    > Where type must be specified and task and access are optional,
    > and:
    >
    > Task specifies the action of the cast, one of:
    > to Short for "convert to". The target is to be
    > converted to the type specified.
    > Default when no [task] is present.
    > as Short for "use as". The target's bits in memory
    > are to be interpreted as specified by this cast.
    > Access specifies read/write access to the result of the cast one of:
    > rw read/write access.
    > Default when no [access] is specified.
    > (Is there currently a keyword to specify this?)
    > ro read only access (equivalent to "const")
    > wo write only access (equivalent to "volatile")
    > Type is the "output" type of the cast, like int or double, just
    > as in current C casts.
    >
    > The general form allows these alternate casts:
    >
    > int i = (int) round(some_float); /* 1a */
    > int i = (to int) round(some_float); /* 1b */
    > int i = (to rw int) round(some_float); /* 1c */
    >
    > int i = *(int *)(&some_float); /* 2a */
    > int i = (as int) some_float; /* 2b */
    >
    > const int i = *(const int *)(&some_float); /* 3a */
    > const int i = (as ro int)some_float; /* 3b */
    >
    > int i = (int) some_float; /* 4a */
    > int i = (to int) some_float; /* 4b */
    >
    > function((volatile float *)&some_float); /* 5a */
    > function((wo float *)&some_float); /* 5b */
    >
    > Not a big win in clarity for 1b or 1c vs. 1a, since the extra
    > text is just specifying defaults. 1b might be a bit clearer
    > for a beginner, but after a few weeks those training wheels
    > would come off. I think 2b is clearer than 2a, and 3b clearer
    > than 3a. Incorrect code, with the wrong task employed, might
    > be easier to spot if the 4b and 2b forms were uniformly employed, but
    > otherwise not a clean win for 4b over 4a. 5b versus 5a feels like a
    > wash.
    >
    > Is something along these lines worth incorporating into the C
    > language, or no? Clearly we can get along without it - I am just
    > wondering if adding this would be beneficial.


    Good one!!! But you posted it 13 days early...
    Tim Rentsch, Mar 19, 2013
    #10
  11. mathog

    Jorgen Grahn Guest

    On Tue, 2013-03-19, James Kuyper wrote:
    > On 03/19/2013 06:02 PM, mathog wrote:

    ....
    >> switch (pre->type){
    >> ... other cases ...
    >> case RECORDTYPE32:
    >> {
    >> RTYPE32 *prec = (RTYPE32 *) cptr;

    >


    > Without knowing anything about the nature of COMMON_PREFIX and RTYPE32,
    > I can't be sure, but this looks to me a lot like something that, in C++,
    > would be better handled by type derivation and virtual member functions,
    > rather than type codes and switch statements. It's a lot more type safe
    > that way, and avoiding that cast is a key part of the reason why.


    Well, he said the data came from a file; C++ doesn't help there, at
    least not the run-time polymorphism stuff. IME at that point he has
    the options to:

    (a) Say "I have guarantees that this I/O buffer maps exactly to one of
    these structs. Alignment, padding, sizes, endianness and
    representations of floats etc are guaranteed not to be problems."

    (b) Deserialization of a documented, binary protocol. More work
    short-term, but this can be done as portable C or C++, and without
    casts.

    I prefer (b) because I have been bitten by (a)-style guarantees
    failing in the past. Like when moving to a 64-bit architecture, or
    between big/little endian.
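
    A minimal sketch of what (b) can look like, assuming a made-up protocol
    in which a record starts with a 32-bit type followed by a 32-bit size,
    both little-endian. The struct and function names are invented for the
    example. Each field is assembled from individual bytes, so host
    alignment, padding and byte order do not matter; the only casts left
    are widening integer conversions, not pointer casts.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from four bytes. */
static uint32_t get_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical record header, in host representation. */
struct record_header {
    uint32_t type;
    uint32_t size;
};

/* Fill in the header field by field from the wire format. */
static struct record_header decode_header(const unsigned char *buf)
{
    struct record_header h;
    h.type = get_u32_le(buf);
    h.size = get_u32_le(buf + 4);
    return h;
}
```

    The same source compiles to the same behaviour on 32-bit, 64-bit, big
    and little endian hosts, which is the point of doing the extra work.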

    (Of course, if the program itself generated the data by writing a
    struct to a file, you can be pretty sure the guarantee holds ...)

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Mar 20, 2013
    #11
  12. mathog

    army1987 Guest

    On Tue, 19 Mar 2013 13:29:10 -0700, mathog wrote:

    > [stuff about C++]
    > int i = static_cast<int> round(some_float);


    Meh. I use int(round(some_float)).

    > [stuff about C]
    > int i = *(int *)(&some_float); /* 2a */


    I *think* the only way to do that that's not technically UB is
    assert(sizeof i == sizeof some_float); memcpy(&i, &some_float, sizeof i);



    --
    [ T H I S S P A C E I S F O R R E N T ]
    Troppo poca cultura ci rende ignoranti, troppa ci rende folli.
    -- fathermckenzie di it.cultura.linguistica.italiano
    <http://xkcd.com/397/>
    army1987, Mar 21, 2013
    #12
  13. mathog

    Tim Rentsch Guest

    army1987 <> writes:

    > On Tue, 19 Mar 2013 13:29:10 -0700, mathog wrote:
    >
    >> [stuff about C++]
    >> int i = static_cast<int> round(some_float);

    >
    > Meh. I use int(round(some_float)).
    >
    >> [stuff about C]
    >> int i = *(int *)(&some_float); /* 2a */

    >
    > I *think* the only way to do that that's not technically UB is
    > assert(sizeof i == sizeof some_float); memcpy(&i, &some_float; sizeof i);


    Another way is to use a union.
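
    A minimal sketch, assuming float and uint32_t have the same size and
    that the stored bits form a valid value of the type being read (no trap
    representation). As I read it, C99 as amended by TC3 and C11 say that
    reading a member other than the one last stored reinterprets the bytes;
    C++ gives no such guarantee. The names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

union float_bits {
    float    f;
    uint32_t u;
};

/* Store through one member, read back through the other. */
static uint32_t float_to_bits(float f)
{
    union float_bits pun;
    pun.f = f;
    return pun.u;
}

/* The inverse direction: build a float from a bit pattern. */
static float bits_to_float(uint32_t u)
{
    union float_bits pun;
    pun.u = u;
    return pun.f;
}
```

    With IEEE 754 floats, float_to_bits(1.0f) is 0x3F800000 and
    bits_to_float(0x40000000) is 2.0f.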
    Tim Rentsch, Mar 21, 2013
    #13
  14. mathog

    Shao Miller Guest

    On 3/21/2013 11:50, Tim Rentsch wrote:
    > army1987 <> writes:
    >
    >> On Tue, 19 Mar 2013 13:29:10 -0700, mathog wrote:
    >>
    >>> [stuff about C++]
    >>> int i = static_cast<int> round(some_float);

    >>
    >> Meh. I use int(round(some_float)).
    >>
    >>> [stuff about C]
    >>> int i = *(int *)(&some_float); /* 2a */

    >>
    >> I *think* the only way to do that that's not technically UB is
    >> assert(sizeof i == sizeof some_float); memcpy(&i, &some_float; sizeof i);

    >
    > Another way is to use a union.
    >


    Although Mr. T. Rentsch doesn't appear to read my posts any more, I
    think it's still worth pointing out that even by using a union, you have
    the same risks of undefined behaviour, with regard to "effective type".

    A quirk is that you can get a union value that's no longer tied to a
    stored value:

    union {
    int i;
    float f;
    } u;
    u.i = 42;
    (0, u.i).f;

    The rvalue here doesn't have a stored value, so I think it can be argued
    about whether or not effective type applies, or if this type-punning is
    always safe, if no trap representations are involved.

    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
    Shao Miller, Mar 21, 2013
    #14
  15. mathog

    Shao Miller Guest

    On 3/21/2013 13:39, Shao Miller wrote:
    >
    > Although Mr. T. Rentsch doesn't appear to read my posts any more, I
    > think it's still worth pointing out that even by using a union, you have
    > the same risks of undefined behaviour, with regards to "effective type".
    >
    > A quirk is that you can get a union value that's no longer tied to a
    > stored value:
    >
    > union {
    > int i;
    > float f;
    > } u;
    > u.i = 42;

    /* Erm, rather */
    (0, u).f;

    > The rvalue here doesn't have a stored value, so I think it can be argued
    > about whether or not effective type applies, or if this type-punning is
    > always safe, if no trap representations are involved.
    >



    --
    - Shao Miller
    --
    "Thank you for the kind words; those are the kind of words I like to hear.

    Cheerily," -- Richard Harter
    Shao Miller, Mar 21, 2013
    #15
  16. mathog

    mathog Guest

    Tim Rentsch wrote:
    >
    > Good one!!! But you posted it 13 days early...
    >


    Musings, right?

    Coming back to this after a very busy week.

    Imagine that one wanted to write a C interface for a set of registers
    that are mapped into memory. (This also brings up the memstruct vs.
    struct argument of another thread, since strictly speaking and in the
    general case there may not be a way to do this with a struct without C
    extensions, because of uncontrollable use of padding by the compiler.)

    Anyway, C has always used "naked" type specifications in declarations,
    and the same type keywords in () for casts. Add another () keyword "is"
    and then a () could be used as a declaration and not a cast.

    Add "vo" to mean "read volatile".

    In the following, assume that uint8_t is present on the platform
    and that a struct can actually be used to map onto the registers (which
    might not be OK if the first register is not aligned with a 4 or 8 byte
    boundary, but presumably if these registers exist in real hardware that
    issue would not occur).

    Then a set of registers might be declared like this:

    typedef struct {
    (is vo uint8_t) read_state; // sent, received, data pending, etc.
    (is rw uint8_t) bidirectional_buffer;
    (is ro uint8_t) last_buffer_sent;
    (is wo uint8_t) set_state; // OK to send, OK to receive, etc.
    } *REGISTER_SET;

    This tries to make a distinction between two different types of
    "volatile". "read_state" is truly volatile, because it can change
    values even if nothing in the program accesses these registers. On
    the other hand, "last_buffer_sent", which is a copy the hardware makes
    of the contents of "bidirectional_buffer" when that is sent, is only
    "sort of" or "conditionally" volatile. It can change value, but only
    when the program has sent the appropriate signal to "set_state".
    Otherwise "last_buffer_sent" may be read repeatedly like any normal
    memory location. The same issue applies to "bidirectional_buffer": most
    of the time it may be re-read safely (when "OK to receive" has not been
    set), but at other times it may not (after an "OK to receive" is set).

    It isn't clear to me how to express exactly this state of affairs in
    current C syntax. This is closest, I guess:

    typedef struct {
    volatile const uint8_t read_state;
    volatile uint8_t bidirectional_buffer;
    volatile const uint8_t last_buffer_sent;
    volatile uint8_t set_state; // ????
    } *REGISTER_SET;

    This drops the distinction between vo and ro, but that is probably
    not a problem; either way the compiler knows that every time the program
    specifies a read a different value might come back. However,
    the declaration for set_state is not sufficient. "volatile" is not at
    all "write only", it just says that the value read can change
    unpredictably, not that reads are forbidden. What if the physical
    register is such that a read of "set_state" causes a hardware error,
    or at the very least, will return line noise? Is there some way to
    express that situation in current C syntax, so that if this snippet was
    buried elsewhere in the code the compiler would always flag it as an error:

    REGISTER_SET rset=(REGISTER_SET) pointer_to_a_set_of_registers;
    // next line would crash the machine
    printf("value of set_state is: %d\n",rset->set_state);

    Regards,

    David Mathog
    mathog, Mar 28, 2013
    #16
  17. mathog

    Les Cargill Guest

    mathog wrote:
    > Tim Rentsch wrote:
    >>
    >> Good one!!! But you posted it 13 days early...
    >>

    >
    > Musings, right?
    >
    > Coming back to this after a very busy week.
    >
    > Imagine that one wanted to write a C interface for a set of registers
    > that are mapped into memory. (This also brings up the memstruct vs.
    > struct argument of another thread, since strictly speaking and in the
    > general case there may not be a way to do this with a struct without C
    > extensions, because of uncontrollable use of padding by the compiler.)
    >


    Padding is frequently ( but not always ) controllable. There are
    pragmas and such to influence alignment. "But it isn't portable" - well,
    it doesn't have to be - it's a memory map, right? It's already
    hardware-specific.

    > Anyway, C has always used "naked" type specifications in declarations,
    > and the same type keywords in () for casts. Add another () keyword "is"
    > and then a () could be used as a declaration and not a cast.
    >


    Icky.

    > Add "vo" to mean "read volatile".
    >
    > Assuming in the following that uint8_t is present on the platform
    > and that a struct can actually be used to map onto the registers (which
    > might not be OK if the first register is not aligned with a 4 or 8 byte
    > boundary, but presumably if these registers exist in real hardware that
    > issue would not occur).
    >


    Right. And if they do, there's no sin in adding spacers so it aligns
    correctly.
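
    A sketch of what explicit spacers might look like. The layout (two
    byte-wide registers, a two-byte gap, then a 32-bit register at offset 4)
    and the names dev_regs, status, control and data are invented for
    illustration. The compile-time checks make the build fail if the
    compiler lays the struct out differently from the assumed memory map.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical memory map, not taken from any real device. */
struct dev_regs {
    volatile uint8_t  status;       /* offset 0 */
    volatile uint8_t  control;      /* offset 1 */
    uint8_t           reserved[2];  /* explicit spacer, offsets 2..3 */
    volatile uint32_t data;         /* offset 4 */
};

/* Fail the build, rather than the hardware, if the layout is off. */
static_assert(offsetof(struct dev_regs, data) == 4, "data register misplaced");
static_assert(sizeof(struct dev_regs) == 8, "unexpected padding in dev_regs");
```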


    > Then a set of registers might be declared like this:
    >
    > typedef struct {
    > (is vo uint8_t) read_state; // sent, received, data pending, etc.
    > (is rw uint8_t) bidirectional_buffer;
    > (is ro uint8_t) last_buffer_sent;
    > (is wo uint8_t) set_state; // OK to send, OK to receive, etc.
    > } *REGISTER_SET;
    >
    > This tries to make a distinction between two different types of
    > "volatile". "read_state" is truly volatile, because it can change
    > values even if nothing in the program accesses these registers. On
    > the other hand, "last_buffer_sent", which is a copy the hardware makes
    > of the contents of "bidirectional_buffer" when that is sent, is only
    > "sort of" or "conditionally" volatile.


    So far as 'C' is concerned, if it's a little bit volatile, then you have
    to declare it volatile.

    > It can change value, but only
    > when the program has sent the appropriate signal to "set_state".
    > Otherwise "last_buffer_sent" may be read repeatedly like any normal
    > memory location. The same issue with "bidirectional_buffer" - most of
    > the time it may be re-read safely (when "OK to receive" has not been
    > set) other times not (after a "OK to receive" is set).
    >
    > It isn't clear to me how to express exactly this state of affairs in
    > current C syntax. This is closest, I guess:
    >
    > typedef struct {
    > volatile const uint8_t read_state;
    > volatile uint8_t bidirectional_buffer;
    > volatile const uint8_t last_buffer_sent;
    > volatile uint8_t set_state; // ????
    > } *REGISTER_SET;
    >
    > This drops the distinction between vo and ro, but that is probably
    > not a problem, either way the compiler knows that every time the program
    > specifies a read a different value might come back. However,
    > the declaration for set_state is not sufficient. "volatile" is not at
    > all "write only", it just says that the value read can change
    > unpredictably, not that reads are forbidden.



    This isn't a problem. You can enforce the actual semantics otherwise.

    > What if the physical
    > register is such that a read of "set_state" causes a hardware error,
    > or at the very least, will return line noise?



    Then doctor, doctor it hurts when I do that. I find this sort of FPGA
    behavior nauseating. Yeah, it happens. It completely precludes
    memmove() to copy the struct, a great sin IMO.

    I'd file a bug report on the FPGA.

    > Is there some way to
    > express that situation in current C syntax, so that if this snippet was
    > buried elsewhere in the code the compiler would always flag it as an error:
    >


    I don't think there's a single thing in 'C' that does not have an
    r-value. You can fly the W/Rbar pin all day in 'C', but there's no
    write-only.

    > REGISTER_SET rset=(REGISTER_SET) pointer_to_a_set_of_registers;
    > // next line would crash the machine
    > printf("value of set_state is: %d\n",rset->set_state);
    >


    I'd write getters/setters for it myself, to keep that crash off
    the table.
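
    One way the getter/setter idea could look, reusing the register names
    from the quoted example. The function names and the register_set struct
    tag are assumptions. In a real driver the struct definition would
    presumably live in a single .c file so that other code can only reach
    the registers through these functions. There is deliberately no getter
    for set_state.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Layout from the quoted example; meant to stay private to the driver. */
struct register_set {
    volatile const uint8_t read_state;
    volatile uint8_t       bidirectional_buffer;
    volatile const uint8_t last_buffer_sent;
    volatile uint8_t       set_state;   /* write-only in hardware */
};

static_assert(sizeof(struct register_set) == 4, "unexpected padding");

uint8_t regs_get_read_state(const struct register_set *r)
{
    return r->read_state;
}

uint8_t regs_get_buffer(const struct register_set *r)
{
    return r->bidirectional_buffer;
}

void regs_set_buffer(struct register_set *r, uint8_t v)
{
    r->bidirectional_buffer = v;
}

uint8_t regs_get_last_buffer_sent(const struct register_set *r)
{
    return r->last_buffer_sent;
}

/* The only access offered for set_state is a write. */
void regs_set_state(struct register_set *r, uint8_t v)
{
    r->set_state = v;
}
```

    The compiler still cannot flag a direct read of set_state, but with the
    struct hidden there is no place outside the driver where one could be
    written.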

    > Regards,
    >
    > David Mathog
    >
    >
    >
    >


    --
    Les Cargill
    Les Cargill, Mar 28, 2013
    #17
  18. mathog

    Tim Rentsch Guest

    mathog <> writes:

    > Tim Rentsch wrote:
    >>
    >> Good one!!! But you posted it 13 days early...

    >
    > Musings, right?
    >
    > Coming back to this after a very busy week.
    >
    > Imagine that one wanted to write a C interface for a set of
    > registers that are mapped into memory. [snip example]


    Except for giving an example, you don't really say what
    you're hoping to accomplish, or what the costs or benefits
    are for doing so. It looks like everything you're trying to
    do can be done in standard C simply by applying volatile
    semantics selectively and wrapping accesses inside functions
    (said functions then having the responsibility for using
    volatile appropriately). So to me it looks like an awful lot
    of cost for almost no benefit, especially since the range of
    applicability is so small -- most C code has no need for any
    kind of volatile access, let alone those that are selectively
    volatile.
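
    A rough sketch of what applying volatile selectively inside access
    functions might mean: the struct members carry no qualifiers, and only
    the functions that must reach the hardware go through a
    volatile-qualified lvalue. All names here are hypothetical, and the
    sketch assumes the compiler treats an access through a volatile lvalue
    as a volatile access, which is worth checking for the toolchain in use.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Same four registers as in the example, declared without qualifiers. */
struct plain_regs {
    uint8_t read_state;
    uint8_t bidirectional_buffer;
    uint8_t last_buffer_sent;
    uint8_t set_state;
};

static_assert(sizeof(struct plain_regs) == 4, "unexpected padding");

/* read_state can change at any time, so every read is forced. */
uint8_t poll_read_state(struct plain_regs *r)
{
    return *(volatile uint8_t *)&r->read_state;
}

/* Writing set_state is what makes last_buffer_sent change, so the write
   is forced and the new value is fetched once, right after it. The caller
   may keep using the returned copy until the next call. */
uint8_t send_and_fetch_last(struct plain_regs *r, uint8_t state)
{
    *(volatile uint8_t *)&r->set_state = state;
    return *(volatile uint8_t *)&r->last_buffer_sent;
}
```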
    Tim Rentsch, Mar 29, 2013
    #18
  19. mathog

    mathog Guest

    Jorgen Grahn wrote:
    > On Tue, 2013-03-19, mathog wrote:
    > ...
    >> I have recently been taking some heat for using constructs like this:
    >>
    >> float some_float;
    >> /* code sets some_float */
    >> int i = (int) round(some_float);

    > Maybe I'm ignorant, but I fail to see why a cast is needed at all.


    Normally it is not needed and that was a bad example. This actually
    comes up in association with printf() statements where the arguments
    must match the conversion specifiers or warnings result (and the result
    is not usually what was intended). In the next line "xe" is a double
    variable

    printf("width %d\n",(int) (xe * 64.0));

    and gcc generates this warning

    warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has
    type ‘double’ [-Wformat]

    when the (int) cast is not present.
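
    For illustration, here is the cast in a form whose result can be checked
    without reading the terminal. The helper name format_width is made up,
    and snprintf is used instead of printf only so that the text lands in a
    buffer. Without the cast the call passes a double where %d expects an
    int, which is undefined behaviour and not merely a warning.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format the scaled width into buf. */
int format_width(char *buf, size_t n, double xe)
{
    assert(buf != NULL && n > 0);
    /* (int) converts the double product, truncating toward zero,
       so the argument type matches the %d conversion specifier. */
    return snprintf(buf, n, "width %d\n", (int)(xe * 64.0));
}
```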

    Regards,

    David Mathog
    mathog, Apr 11, 2013
    #19
  20. mathog <> wrote:

    (snip)

    > Normally it is not needed and that was a bad example. This actually
    > comes up in association with printf() statements where the arguments
    > must match the conversion specifiers or warnings result (and the result
    > is not usually what was intended). In the next line "xe" is a double
    > variable


    > printf("width %d\n",(int) (xe * 64.0));


    > and gcc generates this warning


    > warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has
    > type ‘double’ [-Wformat]


    > when the (int) cast is not present.


    Some people have been known to print out the hex representation
    of floating point values by using %x with float or double.
    (Likely two %8.8x in the latter case.)
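
    Passing a float or double where %x expects an unsigned int is undefined
    behaviour, so a sketch of a well-defined way to get the same hex dump
    may be useful. It assumes double is 64 bits wide, and the function name
    format_double_hex is hypothetical.

```c
#include <assert.h>     /* static_assert (C11) */
#include <inttypes.h>   /* uint64_t, PRIx64 */
#include <stdio.h>
#include <string.h>

/* Assumption: double is 64 bits wide on this platform. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical helper: write the 16 hex digits of d's representation. */
int format_double_hex(char *buf, size_t n, double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* copy the bytes, no aliasing issue */
    return snprintf(buf, n, "%016" PRIx64, bits);
}
```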

    -- glen
    glen herrmannsfeldt, Apr 11, 2013
    #20