Typecast clarification

Discussion in 'C Programming' started by syuga2012@gmail.com, Feb 12, 2009.

  1. Guest

    Hi Folks,

    To determine whether a machine is little-endian or big-endian, the
    following code snippet is used...

    int num = 1;

    if (*(char *)&num == 1)
        printf("\n Little Endian");
    else
        printf("\n Big endian");

    I needed a few clarifications regarding this.

    1. Can we use void * instead of char * ?
    2. When do we use void * and when char * ?
    3. Does the above typecast convert an integer to a char (1 byte) in
    memory?
    For example, what if I used a variable ch to store the result of the
    above typecast?

    4. In general, when can we safely do typecasts? Is such code
    portable?

    Thanks a lot for your help. Appreciate it.

    syuga
     
    , Feb 12, 2009
    #1

  2. WANG Cong Guest

    wrote:

    > Hi Folks,
    >
    > To determine if a machine is little endian / big endian the foll. code
    > snippet is used...
    >
    > int num = 1;
    >
    > if( * (char *)&num == 1)
    > printf ("\n Little Endian");
    >
    > else
    > printf("\n Big endian");
    >
    > I needed a few clarifications regarding this.
    >
    > 1. Can we use void * instead of char * ?


    Here? No, you cannot dereference a void* pointer.

    > 2. When do we use void * and when char * ?


    void* is used generically; for example, malloc() returns void*,
    which lets you pass pointers around without casts.

    Here, in your case, char * is used because the code only wants to
    fetch one byte out of a (typically 4-byte) int.

    > 3. Does the above typecast convert an integer to a char (1 byte) in
    > memory?
    > For e.g if I used a variable ch, to store the result of the above
    > typecast


    No, it casts an int pointer to a char pointer.

    >
    > 4. In general, when can we safely do typecasts ? Are such code
    > portable ?


    When you understand what you are doing. :)
     
    WANG Cong, Feb 12, 2009
    #2

  3. Guest

    Subject: "Typecast clarification"

    technically, what you are doing is "casting" not "typecasting"
    [prepare for flamewar]


    On 12 Feb, 09:49, "" <> wrote:
    > Hi Folks,
    >
    > To determine if a machine is little endian / big endian the foll. code
    > snippet is used...
    >
    > int num = 1;
    >
    > if( * (char *)&num == 1)
    >   printf ("\n Little Endian");
    > else
    >   printf("\n Big endian");


    that takes the address of num, casts it to a pointer to char
    (unsigned char might be slightly safer), then dereferences
    it to give a char. Assuming 8-bit chars and 32-bit ints, on a
    little-endian machine the number will be stored as

    lo          hi
    01 00 00 00

    and on a big-endian machine as

    lo          hi
    00 00 00 01

    so the code will do what you expect. Note there
    are more than 2 ways to order 4 objects...
    (and some of them *have* been used)
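
    Putting that together, a complete compilable version might look like
    this (a sketch of mine, using the unsigned char suggestion above):

    #include <stdio.h>

    int main(void)
    {
        int num = 1;

        /* look at the lowest-addressed byte of num's representation */
        if (*(unsigned char *)&num == 1)
            printf("Little Endian\n");
        else
            printf("Big endian\n"); /* or one of the rarer orders */
        return 0;
    }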

    > I needed a few clarifications regarding this.
    >
    > 1. Can we use void * instead of char * ?


    No. You cannot dereference a (void*).
    (Some compilers allow this as an extension, but they are not
    compliant with the standard.)

    > 2. When do we use void * and when char * ?


    (void*) for anonymous or unknown type. (char*) for
    pointers to characters (e.g. strings). (unsigned char*)
    for getting at the representation. It is safe to examine
    any object's bytes as unsigned char.
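
    For instance (my illustration, not part of the original post), you
    can dump an int's representation through unsigned char *:

    #include <stdio.h>

    int main(void)
    {
        int num = 1;
        unsigned char *p = (unsigned char *)&num;

        /* print each byte of the representation, lowest address first */
        for (size_t j = 0; j < sizeof num; j++)
            printf("%02x ", (unsigned)p[j]);
        putchar('\n');
        return 0;
    }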

    > 3. Does the above typecast convert an integer to a char (1 byte) in
    > memory?


    it doesn't actually modify the value in memory, only
    how the program looks at it.

    >     For e.g if I used a variable ch, to store the result of the above
    > typecast


    sorry, lost me. Could you post code?


    > 4. In general, when can we safely do typecasts ?


    when necessary :) There's no short answer to this one.

    > Are such code portable ?


    Sometimes. More often than not, though, no.


    > Thanks a lot for your help. Appreciate it.


    happy coding


    --
    Nick Keighley

    "Half-assed programming was a time-filler that, like knitting,
    must date to the beginning of human experience."
    "A Fire Upon The Deep" by Verne Vinge
     
    , Feb 12, 2009
    #3
  4. James Kuyper Guest

    wrote:
    > Hi Folks,
    >
    > To determine if a machine is little endian / big endian the foll. code
    > snippet is used...
    >
    > int num = 1;
    >
    > if( * (char *)&num == 1)
    > printf ("\n Little Endian");
    >
    > else
    > printf("\n Big endian");


    Note: this code assumes that there are only two possible
    representations. That's a good approximation to reality, but it's
    not the exact truth. If 'int' is a four-byte type (which it is on
    many compilers), there are 24 different byte orders theoretically
    possible, 6 of which would be identified as Little Endian by this
    code, 5 of them incorrectly. 18 of them would be identified as Big
    Endian, 17 of them incorrectly.

    This would all be pure pedantry, if it weren't for one thing: of those
    24 possible byte orders, something like 8 to 11 of them (I can't
    remember the exact number) are in actual use on real world machines.
    Even that would be relatively unimportant if big-endian and
    little-endian were overwhelmingly the most popular choices, but
    that's not even the case: the byte orders 2134 and 3412 have both
    been used in some fairly common machines.

    The really pedantic issue is that the standard doesn't even guarantee
    that 'char' and 'int' number the bits in the same order. A conforming
    implementation of C could use the same bit that is used by an 'int'
    object to store a value of '1' as the sign bit when the byte containing
    that bit is interpreted as a char.

    > I needed a few clarifications regarding this.
    >
    > 1. Can we use void * instead of char * ?


    No, because you cannot dereference a pointer to void.

    > 2. When do we use void * and when char * ?


    The key differences between char* and void* are that
    a) you cannot dereference or perform pointer arithmetic on void*
    b) there are implicit conversions between void* and any other
    pointer to object type.

    The general rule is that you should use void* whenever the implicit
    conversions are sufficiently important. The standard library's
    mem*() functions are a good example where void* is appropriate,
    because they are frequently used on pointers to types other than
    char. You should use char* whenever you're actually accessing the
    object as an array of characters, which requires pointer arithmetic
    and dereferencing. You should use unsigned char* when accessing the
    object as an array of uninterpreted bytes.
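
    A small illustration of that rule (mine, not from the original
    post):

    #include <string.h>   /* declares size_t and memset */

    /* void *: a generic interface; any object pointer converts
       implicitly */
    void zero_out(void *buf, size_t n) { memset(buf, 0, n); }

    /* char *: the object really is accessed as characters */
    size_t count_spaces(const char *s)
    {
        size_t n = 0;
        while (*s)
            if (*s++ == ' ')
                n++;
        return n;
    }

    /* unsigned char *: the object is examined as raw bytes */
    unsigned char first_byte(const void *obj)
    {
        return *(const unsigned char *)obj;
    }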

    > 3. Does the above typecast convert an integer to a char (1 byte) in
    > memory?


    There's no such thing as a typecast in C. There is a type conversion,
    which can occur either implicitly, or explicitly. Explicit conversions
    occur as a result of cast expressions.

    The (char*) cast does not convert an integer into a char. It converts a
    pointer to an int into a pointer to a char. The char object it points at
    is the first byte of 'num'. The * operator interprets that byte as a char.

    > For e.g if I used a variable ch, to store the result of the above
    > typecast


    The result of the cast expression is a pointer to char; it can be
    converted into a char and stored into a char variable, but the result of
    that conversion is probably meaningless unless sizeof(intptr_t) == 1,
    which is pretty unlikely. It would NOT, in general, have anything to do
    with the value stored in the first byte of "num".

    You could write:

    char c = *(char*)&num;

    > 4. In general, when can we safely do typecasts ? Are such code
    > portable ?


    The only type conversions that are reasonably safe in portable code are
    the ones which occur implicitly, without the use of a cast, and even
    those have dangers. Any use of a cast should be treated as a danger
    sign. The pattern *(T*), where T is an arbitrary type, is called type
    punning. In general, this is one of the most dangerous uses of a cast.
    In the case where T is "char", it happens to be relatively safe.

    The best answer to your question is to read section 6.3 of the standard.
    However, it may be hard for someone unfamiliar with standardese to
    translate what section 6.3 says into "safe" or "unsafe", "portable" or
    "unportable". Here's my quick attempt at a translation:

    * Any value may be converted to void; there's nothing that you can do
    with the result. The only use for such a cast would be to shut up the
    diagnostics that some compilers generate when you fail to do anything
    with the value returned by a function. However, it is perfectly safe.

    * Converting any numeric value to a type that is capable of storing
    that value is safe. If the value is currently of a type whose range
    is guaranteed to be a subset of the range of the target type, safety
    is automatic - for instance, when converting "signed char" to
    "int". Otherwise, it's up to your program to make sure that the
    value is within the valid range.

    * Converting a value to a signed or floating point type that is outside
    of the valid range for that type is not safe.

    * Converting a numeric value to an unsigned type that is outside the
    valid range is safe, in the sense that your program will continue
    running; but the resulting value will be different from the original by
    a multiple of the number that is one more than the maximum value which
    can be stored in that type. If that change in value is desired and
    expected (D&E), that's a good thing, otherwise it's bad.
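
    A worked example of that rule (mine; assumes 8-bit unsigned char,
    so values are reduced modulo 256):

    unsigned char uc = 300; /* 300 - 256 = 44 */
    unsigned int ui = -1;   /* becomes UINT_MAX: -1 + (UINT_MAX + 1) */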

    * Converting a floating point value to an integer type will lose
    the fractional part of that value. If this is D&E, good, otherwise,
    bad.

    * Converting a floating point value to a type with lower precision will
    generally lose precision. If this is acceptable and expected, good -
    otherwise, bad.

    * Converting a _Complex value to a real type will cause the imaginary
    part of the value to be discarded. Converting it to an _Imaginary type
    will cause the real part of the value to be discarded. Converting
    between real and _Imaginary types will always result in a value of 0. In
    each of these cases, if the change in value is D&E, good - otherwise, bad.

    * Converting a null pointer constant to a pointer type results in a null
    pointer of that type. Converting a null pointer to a different pointer
    type results in a null pointer of that target type. Both conversions are
    safe.

    * Converting a pointer to an integer type is safe, but unless the target
    type is either an intptr_t or a uintptr_t, the result is
    implementation-defined, rendering it pretty much useless, at least in
    portable code. If the target type is intptr_t or uintptr_t, the result
    may be safely converted back to the original pointer type, and the
    result of that conversion will compare equal to the original pointer.
    You can safely treat that integer value just like any other integer
    value, but conversion back to the original pointer type is the only
    meaningful thing that can be done with it.
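
    A sketch of that round trip (mine; note that uintptr_t is an
    optional type in C99's <stdint.h>):

    #include <stdint.h>

    void demo(void)
    {
        int x;
        uintptr_t n = (uintptr_t)&x; /* implementation-defined value */
        int *p = (int *)n;           /* guaranteed to compare equal
                                        to &x */
        (void)p;
    }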

    * Except as described above, converting an integer value into a pointer
    type is always dangerous. Note: an integer constant expression with a
    value of 0 qualifies as a null pointer constant. Therefore, it qualifies
    as one of the cases "described above".

    * Any pointer to a function type may be safely converted into a
    pointer to a different function type. The result may be converted
    back to the original pointer type, in which case it will compare
    equal to the original pointer. However, you can only safely call
    through a function pointer if it points at a function whose actual
    type is compatible with the type that the function pointer points
    at.

    * Conversions which add a qualifier to a pointer type (such as int* =>
    const int*) are safe.

    * Conversions which remove a qualifier from a pointer type (such as
    volatile double * => double *) are safe in themselves, but are
    invariably needed only to perform operations that can be dangerous
    unless you know precisely what the relevant rules are.
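
    For example (my sketch):

    const int ci = 42;
    const int *pc = &ci; /* adding const: safe and implicit */
    int *pm = (int *)pc; /* removing const: the cast itself is safe... */
    /* *pm = 7;             ...but writing through it would be undefined
                            behaviour, because ci really is const */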

    * A pointer to any object can be safely converted into a pointer to a
    character type. The result points at the first byte of that object.

    * Conversion of a pointer to an object or incomplete type into a pointer
    to a different object or incomplete type is safe, but only if it is
    correctly aligned for that type. There are only a few cases where you
    can be portably certain that the alignment is correct, which limits the
    usefulness of this case.
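
    For instance (my sketch of the alignment hazard):

    char buf[sizeof(int) + 1];
    int *p = (int *)(buf + 1); /* buf + 1 may be misaligned for int;
                                  dereferencing p would then be
                                  undefined behaviour */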

    Except as indicated above, the standard says absolutely nothing about
    WHERE the resulting pointer points at, which in principle even more
    seriously restricts the usefulness of the result of such a conversion.
    However, in practice, on most real systems the resulting pointer will
    point at the same location in memory as the original pointer.

    However, it is only safe to dereference such a pointer if you do so
    in a way that conforms to the aliasing rules (6.5p7). And that is
    what makes type punning so dangerous.
     
    James Kuyper, Feb 12, 2009
    #4
  5. Boon Guest

    syuga wrote:

    > To determine if a machine is little endian or big endian, the
    > following code snippet is used...
    >
    > int num = 1;
    >
    > if( * (char *)&num == 1)
    > printf ("\n Little Endian");
    >
    > else
    > printf("\n Big endian");


    You don't need casts if you use memcmp.

    $ cat endian.c
    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t i = 0x12345678;
        uint8_t msb_first[4] = { 0x12, 0x34, 0x56, 0x78 };
        uint8_t lsb_first[4] = { 0x78, 0x56, 0x34, 0x12 };

        if (memcmp(&i, msb_first, 4) == 0) puts("BIG ENDIAN");
        else if (memcmp(&i, lsb_first, 4) == 0) puts("LITTLE ENDIAN");
        else puts("SOMETHING ELSE");
        return 0;
    }
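
    On a typical x86 machine this prints LITTLE ENDIAN; on a big-endian
    machine such as SPARC or classic 68k it prints BIG ENDIAN.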
     
    Boon, Feb 12, 2009
    #5
  6. Bruce Cook Guest

    James Kuyper wrote:

    > wrote:
    >> Hi Folks,
    >>
    >> To determine if a machine is little endian / big endian the foll. code
    >> snippet is used...
    >>
    >> int num = 1;
    >>
    >> if( * (char *)&num == 1)
    >> printf ("\n Little Endian");
    >>
    >> else
    >> printf("\n Big endian");

    >
    > Note: this code assumes that there are only two possible
    > representations. That's a good approximation to reality, but it's not
    > the exact truth. If 'int' is a four-byte type (which it is on many
    > compilers), there's 24 different byte orders theoretically possible, 6
    > of which would be identified as Little Endian by this code, 5 of them
    > incorrectly. 18 of them would be identified as Big Endian, 17 of them
    > incorrectly.
    >
    > This would all be pure pedantry, if it weren't for one thing: of those
    > 24 possible byte orders, something like 8 to 11 of them (I can't
    > remember the exact number) are in actual use on real world machines.
    > Even that would be relatively unimportant if bigendian and littlendian
    > were overwhelmingly the most popular choices, but that's not even the
    > case: the byte orders 2134 and 3412 have both been used in some fairly
    > common machines.


    And there are arguments as to whether 2143, 3412 or 4321 is the
    "real" big-endian. Once it jumped from 16 bits to 32 bits,
    endianness became a bit complicated. Its original intent was to
    enable short-word and word fetches to fetch the same value,
    assuming the word contained a small value. This came about because
    processors often had octet as well as word instructions.

    Once 32 bits came about and instructions had 8-, 16- and 32-bit
    word operand sizes, the question was whether to optimize for 8-bit
    or 16-bit fetches. Different processor designers came up with
    different solutions to this, which led to all the differing
    endians.

    Then when you get to 64-bit native machines such as the Alpha,
    there are even more combinations (8 octets per word instead of just
    4).

    The Alpha is interesting because its endianness is controllable,
    although in practice you'd have it fixed for a particular operating
    system, so testing for it would still be valid.

    [...]

    Bruce
     
    Bruce Cook, Feb 12, 2009
    #6
  7. James Kuyper <> writes:
    [...]
    > Note: this code assumes that there are only two possible
    > representations. That's a good approximation to reality, but it's not
    > the exact truth. If 'int' is a four-byte type (which it is on many
    > compilers), there's 24 different byte orders theoretically possible, 6
    > of which would be identified as Little Endian by this code, 5 of them
    > incorrectly. 18 of them would be identified as Big Endian, 17 of them
    > incorrectly.
    >
    > This would all be pure pedantry, if it weren't for one thing: of those
    > 24 possible byte orders, something like 8 to 11 of them (I can't
    > remember the exact number) are in actual use on real world
    > machines. Even that would be relatively unimportant if bigendian and
    > littlendian were overwhelmingly the most popular choices, but that's
    > not even the case: the byte orders 2134 and 3412 have both been used
    > in some fairly common machines.


    Really? I've only heard of 1234, 4321, 2143, and 3412 being used in
    real life. In fact, I've only heard of one of the last two (whichever
    one the PDP-11 used). What other orders have been used, and *why*?

    [...]

    > * Converting a numeric value to an unsigned type that is outside the
    > valid range is safe, in the sense that your program will continue
    > running; but the resulting value will be different from the original
    > by a multiple of the number that is one more than the maximum value
    > which can be stored in that type. If that change in value is desired
    > and expected (D&E), that's a good thing, otherwise it's bad.


    Almost. Converting a *signed or unsigned* value to an unsigned type
    is safe, as you describe. Converting a floating-point value to
    unsigned, if the value is outside the range of the unsigned type,
    invokes undefined behavior.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 12, 2009
    #7
  8. jameskuyper Guest

    Keith Thompson wrote:
    > James Kuyper <> writes:
    > [...]
    > > Note: this code assumes that there are only two possible
    > > representations. That's a good approximation to reality, but it's not
    > > the exact truth. If 'int' is a four-byte type (which it is on many
    > > compilers), there's 24 different byte orders theoretically possible, 6
    > > of which would be identified as Little Endian by this code, 5 of them
    > > incorrectly. 18 of them would be identified as Big Endian, 17 of them
    > > incorrectly.
    > >
    > > This would all be pure pedantry, if it weren't for one thing: of those
    > > 24 possible byte orders, something like 8 to 11 of them (I can't
    > > remember the exact number) are in actual use on real world
    > > machines. Even that would be relatively unimportant if bigendian and
    > > littlendian were overwhelmingly the most popular choices, but that's
    > > not even the case: the byte orders 2134 and 3412 have both been used
    > > in some fairly common machines.

    >
    > Really? I've only heard of 1234, 4321, 2143, and 3412 being used in


    My reference to 2134 was a typo - I meant 2143.

    > real life. In fact, I've only heard of one of the last two (whichever
    > one the PDP-11 used). What other orders have been used, and *why*?


    I remember seeing a web site that listed a large number of byte
    orders in current use, and cited specific machines for each byte
    order. Unfortunately, I did not save the URL, so I can't cite it.
    Sorry!
    However, it is sufficient for my purposes that 2143 and 3412 are in
    use, and all you have to do to verify that is a web search for
    "middle endian".

    > > * Converting a numeric value to an unsigned type that is outside the
    > > valid range is safe, in the sense that your program will continue
    > > running; but the resulting value will be different from the original
    > > by a multiple of the number that is one more than the maximum value
    > > which can be stored in that type. If that change in value is desired
    > > and expected (D&E), that's a good thing, otherwise it's bad.

    >
    > Almost. Converting a *signed or unsigned* value to an unsigned type
    > is safe, as you describe. Converting a floating-point value to
    > unsigned, if the value is outside the range of the unsigned type,
    > invokes undefined behavior.


    You're right. It's not an issue I've had to worry about very often,
    and I remembered it incorrectly. I did the first 7 items on my list
    straight from memory, and I should have double-checked them against
    the standard before posting.
     
    jameskuyper, Feb 12, 2009
    #8
  9. LL Guest

    "WANG Cong" <> wrote in message
    news:gn125p$14h$99.com...
    > wrote:
    >
    >> Hi Folks,
    >>
    >> To determine if a machine is little endian / big endian the foll. code
    >> snippet is used...
    >>
    >> int num = 1;
    >>
    >> if( * (char *)&num == 1)
    >> printf ("\n Little Endian");

    I'm a novice at C too, but this makes no sense to me.
    Refer to the C precedence table
    (http://isthe.com/chongo/tech/comp/c/c-precedence.html). There
    unary * has the highest precedence, then comes ==, then &. So
    what's this supposed to mean? Dereferencing what?

    >>
    >> else
    >> printf("\n Big endian");
    >>
    >> I needed a few clarifications regarding this.
    >>
    >> 1. Can we use void * instead of char * ?

    >
    > Here? No, you can not dereference a void* pointer.
    >
    >> 2. When do we use void * and when char * ?

    >
    > void* is used generally, for example, malloc(), it can help
    > you to pass pointers without castings.
    >
    > Here, in your case, using char * is because it only wants to
    > fetch one byte from a 4-byte int.
    >
    >> 3. Does the above typecast convert an integer to a char (1 byte) in
    >> memory?
    >> For e.g if I used a variable ch, to store the result of the above
    >> typecast

    >
    > No, it casts an int pointer to char pointer.
    >
    >>
    >> 4. In general, when can we safely do typecasts ? Are such code
    >> portable ?

    >
    > When you understand what you are doing. :)

    Could someone tell me how this tests for endianness?
     
    LL, Feb 12, 2009
    #9
  10. jameskuyper Guest

    LL wrote:
    > "WANG Cong" <> wrote in message
    > news:gn125p$14h$99.com...
    > > wrote:

    .....
    > >> int num = 1;
    > >>
    > >> if( * (char *)&num == 1)
    > >> printf ("\n Little Endian");

    > I'm a novice on C too but here makes no sense.
    > Refer to C Precedence Table
    > (http://isthe.com/chongo/tech/comp/c/c-precedence.html). Here unary * has
    > the highest precedence, then comes == then &. So what's this supposed to
    > mean? Dereferencing what?


    It's a mistake to pay too much attention to precedence tables. The C
    standard defines things in terms of grammar, not in terms of
    precedence, and the relevant grammar rule is 6.5.3p1:

    "unary-expression:
    ...
    unary-operator cast-expression

    unary-operator: one of
    & * + - ~ !
    "

    Thus, & and * have the same "precedence"; the key issue is whether
    or not the thing to the right of the operator can be parsed as a
    cast-expression. You can't parse anything to the right of the '*'
    operator as a cast-expression that is shorter than "(char*)&num".
    Therefore the &num has to be evaluated first, giving a pointer to
    'num'. Then (char*) is applied to that pointer, converting it to a
    pointer to char. Finally, the unary '*' is evaluated, returning the
    value of the byte at that location, interpreted as a char. That
    value is then compared to 1 for equality.
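
    In other words, the test parses as if it had been written with full
    parentheses:

    if (*((char *)(&num)) == 1)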

    > Could someone tell me how does this test for endianness?


    If 'int' is a little-endian type, the bit that will be set is in the
    first byte; if it's a big-endian type, the bit that will be set is in
    the last byte. If those were the only two possibilities, this would be
    a good way to find out which one it is.
     
    jameskuyper, Feb 12, 2009
    #10
  11. CBFalconer Guest

    LL wrote:
    >

    .... snip ...
    >
    > I'm a novice on C too but here makes no sense.
    > Refer to C Precedence Table
    > (http://isthe.com/chongo/tech/comp/c/c-precedence.html). Here
    > unary * has the highest precedence, then comes == then &. So
    > what's this supposed to mean? Dereferencing what?


    C doesn't define precedences. It defines the parsing grammar. See
    the things marked C99 below, which refer to the C standard.
    n869_txt.bz2 is a bzipped text file. I advise you to mistrust such
    things as your reference (I haven't looked at it).

    Some useful references about C:
    <http://www.ungerhu.com/jxh/clc.welcome.txt>
    <http://c-faq.com/> (C-faq)
    <http://benpfaff.org/writings/clc/off-topic.html>
    <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> (C99)
    <http://cbfalconer.home.att.net/download/n869_txt.bz2> (pre-C99)
    <http://www.dinkumware.com/c99.aspx> (C library)
    <http://gcc.gnu.org/onlinedocs/> (GNU docs)
    <http://clc-wiki.net/wiki/C_community:comp.lang.c:Introduction>
    <http://clc-wiki.net/wiki/Introduction_to_comp.lang.c>

    --
    [mail]: Chuck F (cbfalconer at maineline dot net)
    [page]: <http://cbfalconer.home.att.net>
    Try the download section.
     
    CBFalconer, Feb 13, 2009
    #11
  12. CBFalconer Guest

    Keith Thompson wrote:
    >

    .... snip ...
    >
    > Almost. Converting a *signed or unsigned* value to an unsigned
    > type is safe, as you describe. Converting a floating-point value
    > to unsigned, if the value is outside the range of the unsigned
    > type, invokes undefined behavior.


    People should also be aware that converting unsigned to signed
    can invoke UB.

    3.18
    [#1] undefined behavior
    behavior, upon use of a nonportable or erroneous program
    construct, of erroneous data, or of indeterminately valued
    objects, for which this International Standard imposes no
    requirements

    [#2] NOTE Possible undefined behavior ranges from ignoring
    the situation completely with unpredictable results, to
    behaving during translation or program execution in a
    documented manner characteristic of the environment (with or
    without the issuance of a diagnostic message), to
    terminating a translation or execution (with the issuance of
    a diagnostic message).

    [#3] EXAMPLE An example of undefined behavior is the
    behavior on integer overflow.

    --
    [mail]: Chuck F (cbfalconer at maineline dot net)
    [page]: <http://cbfalconer.home.att.net>
    Try the download section.
     
    CBFalconer, Feb 13, 2009
    #12
  13. On Thu, 12 Feb 2009 21:43:46 -0500, CBFalconer wrote:
    > Keith Thompson wrote:
    >>

    > ... snip ...
    >>
    >> Almost. Converting a *signed or unsigned* value to an unsigned type is
    >> safe, as you describe. Converting a floating-point value to unsigned,
    >> if the value is outside the range of the unsigned type, invokes
    >> undefined behavior.

    >
    > And people should also be aware that converting unsigned to signed can
    > also invoke UB.


    This is misleading. The result of an unsigned-to-signed conversion, when
    the value is not representable in the signed type, is implementation-
    defined (or in C99, an implementation-defined signal may be raised). It's
    not considered an overflow, and it can be useful in correct programs.
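
    For instance (my example, not from the thread):

    #include <limits.h>

    unsigned int u = UINT_MAX;
    int i = (int)u; /* implementation-defined result; -1 on the common
                       two's complement implementations, but not
                       guaranteed by the standard */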
     
    Harald van Dijk, Feb 13, 2009
    #13
  14. CBFalconer <> writes:
    > Keith Thompson wrote:
    >>

    > ... snip ...
    >>
    >> Almost. Converting a *signed or unsigned* value to an unsigned
    >> type is safe, as you describe. Converting a floating-point value
    >> to unsigned, if the value is outside the range of the unsigned
    >> type, invokes undefined behavior.

    >
    > And people should also be aware that converting unsigned to signed
    > can also invoke UB.

    [snip definition of undefined behavior]

    As Harald pointed out, the result of such a conversion is
    implementation-defined, or it can raise an implementation-defined
    signal.

    But of course James Kuyper already covered that case (though he said
    it's "not safe", which was appropriate wording for the level at which
    he was aiming). I only commented on the floating-point-to-unsigned
    issue because James got the rest of it right.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 13, 2009
    #14
  15. Guest

    On 12 Feb, 13:41, James Kuyper <> wrote:


    > The really pedantic issue is that the standard doesn't even guarantee
    > that 'char' and 'int' number the bits in the same order.


    what!?

    So this gives implementation-defined behaviour:

    int i = 1;
    printf("%d\n", (unsigned char)(i & 1));

    [ignoring the fact that printf() itself might do something strange
    if a Sanskrit font were selected]


    > A conforming
    > implementation of C could use the same bit that is used by an 'int'
    > object to store a value of '1' as the sign bit when the byte containing
    > that bit is interpreted as a char.


    I read that three times and didn't understand it.
    Assuming (for concreteness) that we have 16-bit ints
    and 8-bit chars, and the ints are lsb at lower address:

    int i = 0x0080;

    lo hi
    80 00

    the one bit is not in a sign position, but with

    char c = *(char *)&i;

    c will equal 0x80, so the one bit *is* in a sign
    position. Is that what you were talking about?


    The idea of the bits being renumbered is... unsettling
    [Darth Vader's first C programming class]
     
    , Feb 13, 2009
    #15
  16. James Kuyper Guest

    Richard Heathfield wrote:
    > CBFalconer said:
    >
    >> LL wrote:
    >> ... snip ...
    >>> I'm a novice on C too but here makes no sense.
    >>> Refer to C Precedence Table
    >>> (http://isthe.com/chongo/tech/comp/c/c-precedence.html). Here
    >>> unary * has the highest precedence, then comes == then &. So
    >>> what's this supposed to mean? Dereferencing what?

    >> C doesn't define precedences. It defines the parsing grammar.
    >> See the things marked C99 below, which refer to the C standard.
    >> n869_txt.bz2 is a bzipped text file. I advise you to mistrust
    >> such things as your reference (I haven't looked at it).

    >
    > I /have/ looked at it, and it doesn't contain any mistakes that p53
    > of K&R2 doesn't contain. It's actually slightly more useful than
    > the K&R2 version, since it labels ambiguous operators as being
    > either the unary or binary version (and does so correctly).
    >
    (But then I would expect nothing less from a page written by Landon
    Curt Noll.)


    Like Chuck, I didn't look at that page before posting my earlier
    response. When I did look just now, I found that RH is right;
    insofar as the grammar of C can be approximately described in terms
    of precedence and associativity, that page does so reasonably well.

    LL was misinterpreting it when he assumed that it was telling him that
    unary & has lower precedence than ==. I think he missed the distinction
    between unary '&' and binary '&', and didn't notice the unary '&' right
    next to the unary '*'.
     
    James Kuyper, Feb 13, 2009
    #16
  17. James Kuyper Guest

    wrote:
    > On 12 Feb, 13:41, James Kuyper <> wrote:
    >
    >
    >> The really pedantic issue is that the standard doesn't even guarantee
    >> that 'char' and 'int' number the bits in the same order.

    >
    > what!?
    >
    > So this gives implementation defined behaviour:
    > int i = 1;
    > printf ("%d\n", (unsigned char)(i & 1));


    No. The binary '&' operator works on the bits of the value, not the bits
    of the representation. The expression 'i&1' returns a value of 1 if the
    bit with a value of 1 is set in the representation of 'i', regardless of
    which bit that is. The value of that expression will therefore be 1, a
    value which will be preserved when converted to unsigned char, and will
    still be preserved when it is promoted to either 'int' or 'unsigned
    int', depending upon whether or not UCHAR_MAX < INT_MAX.

    To test my assertion, you must look at the representation of 'i',
    not just at its value:

    for (char *p = (char *)&i; p < (char *)(&i + 1); p++)
        printf("%d ", *p);
    printf("\n");

    What I am saying is that the standard does not guarantee that any of the
    values printed out by the above code will be '1'. If 'int' doesn't have
    any padding bits, then exactly one of those values will be non-zero, and
    the one that is non-zero will be either a power of two, or (if char is
    signed) whatever value the sign bit represents, which depends upon
    whether it has 2's complement, 1's complement, or sign-magnitude
    representation.

    In practice, I'd be very surprised if the non-zero value was anything
    other than 1, and I think it might be a good thing for the standard to
    be more specific about such things. I don't think it would cause any
    problems if the standard specified that all of the value bits stored
    within a given byte of an integer type must have values that represent
    consecutive powers of 2, in the same order for all integer types, with
    the sign bit being adjacent to the value bit with the highest value.
    Does anyone know of an implementation for which that is not true?

    >> A conforming
    >> implementation of C could use the same bit that is used by an 'int'
    >> object to store a value of '1' as the sign bit when the byte containing
    >> that bit is interpreted as a char.

    >
    > I read that three times and didn't understand it.
    > Assuming (for concreteness) that we have 16-bit ints
    > and 8-bit chars, and the ints are lsb at lower address:
    >
    > int i = 0x0080;
    >
    > lo hi
    > 80 00
    >
    > the one bit is not in a sign position, but with
    >
    > char c = *(char *)&i;
    >
    > c will equal 0x80, so the one bit *is* in a sign
    > position. Is that what you were talking about?


    No. It will be clearer if we use a different value for i, and talk about
    unsigned char, rather than char, because my point really wasn't specific
    to the sign bit, it applies equally well to any bit, and is easier to
    explain without the distraction of the possible existence of a sign bit
    in char.

    Assume:
    sizeof(int) = 2
    UCHAR_MAX = 255
    i == 0x0040
    *(1 + (unsigned char*)&i) == 0.

    I am saying that the standard allows the existence of an
    implementation on which those things could be true, while the value
    of *(unsigned char*)&i could be any of the following: 0x01, 0x02,
    0x04, 0x08, 0x10, 0x20, 0x40, or 0x80.

    > The idea of the bits being renumbered is... unsettling
    > [Darth Vader's first C programming class]


    The C standard fails to say a great many things that most C programmers
    take for granted. In some cases, there's a good reason for it. I'm not
    sure this is one of those cases.
     
    James Kuyper, Feb 13, 2009
    #17
  18. Bruce Cook Guest

    Keith Thompson wrote:

    > James Kuyper <> writes:

    [...]
    > Really? I've only heard of 1234, 4321, 2143, and 3412 being used in
    > real life. In fact, I've only heard of one of the last two (whichever
    > one the PDP-11 used). What other orders have been used, and *why*?


    The PDP-11 was originally big-endian, 21 - only 16 bits.

    There were no 32-bit instructions, so the endianness was restricted
    to the two options, big or little.

    Bruce
     
    Bruce Cook, Feb 13, 2009
    #18
  19. Bruce Cook <> writes:
    > Keith Thompson wrote:
    >> James Kuyper <> writes:

    > [...]
    >> Really? I've only heard of 1234, 4321, 2143, and 3412 being used in
    >> real life. In fact, I've only heard of one of the last two (whichever
    >> one the PDP-11 used). What other orders have been used, and *why*?

    >
    > The PDP-11 was originally big-endian, 21 - only 16 bits.
    >
    > There were no 32-bit instructions, so the endianness was
    > restricted to the two options, big or little.


    Right, but when 32-bit operations were implemented in software, they
    used the ordering 2143 (two adjacent 16-bit integers).

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 13, 2009
    #19
  20. Kaz Kylheku Guest

    On 2009-02-13, CBFalconer <> wrote:
    > LL wrote:
    >>

    > ... snip ...
    >>
    >> I'm a novice on C too but here makes no sense.
    >> Refer to C Precedence Table
    >> (http://isthe.com/chongo/tech/comp/c/c-precedence.html). Here
    >> unary * has the highest precedence, then comes == then &. So
    >> what's this supposed to mean? Dereferencing what?

    >
    > C doesn't define precedences. It defines the parsing grammar.


    The C grammar does in fact define precedences, implicitly. The grammar can be
    shown to exhibit precedence, by reduction to an equivalent grammar which uses
    precedence to generate the same language. The concepts of associativity
    and precedence are mathematically precise.

    Most C programmers in fact work with these concepts, rather than the factored
    grammar. The K&R2 gives an associativity and precedence table on page 53;
    even Dennis Ritchie thinks of C expression grammar in terms of associativity
    and precedence.

    So it's a perfectly accurate remark to say that in the C expression a * b + c,
    the * operator has a higher precedence than the + operator.
     
    Kaz Kylheku, Feb 13, 2009
    #20
