Richard Heathfield's setbits()

Discussion in 'C Programming' started by Alex Vinokur, May 2, 2010.

  1. Alex Vinokur

    Alex Vinokur Guest

    Hi,

    Here is Richard Heathfield's function from http://users.powernet.co.uk/eton/kandr2/krx206.html

    #include <stdio.h>
    unsigned setbits(unsigned x, int p, int n, unsigned y)
    {
    return (x & ((~0 << (p + 1)) | (~(~0 << (p + 1 - n))))) | ((y & ~(~0
    << n)) << (p + 1 - n));
    }

    setbits(x,p,n,y) returns x with the n bits that begin at position p
    set to the rightmost n bits of y, leaving the other bits unchanged.

    It seems that setbits (x, 31, n, y) may produce undefined behavior.

    According to the C++ standard:
    Behavior of shift operators << and >> is undefined if the right
    operand is negative, or
    greater than or equal to the length in bits of the promoted left
    operand.

    So, result ((~0 << (p + 1)) may be undefined.

    Regards,

    Alex Vinokur
     
    Alex Vinokur, May 2, 2010
    #1
    1. Advertising

  2. Alex Vinokur <> writes:

    > Here is Richard Heathfield's function from http://users.powernet.co.uk/eton/kandr2/krx206.html
    >
    > #include <stdio.h>
    > unsigned setbits(unsigned x, int p, int n, unsigned y)
    > {
    > return (x & ((~0 << (p + 1)) | (~(~0 << (p + 1 - n))))) | ((y & ~(~0
    > << n)) << (p + 1 - n));
    > }
    >
    > setbits(x,p,n,y) returns x with the n bits that begin at position p
    > set to the rightmost n bits of y, leaving the other bits unchanged.
    >
    > It seems that setbits (x, 31, n, y) may produce undefined behavior.


    Yes, it does.

    > According to the C++ standard:


    K&R is about C so you should probably quote the C standard. The intent
    is that C and C++ remain synchronised about this sort of thing so won't
    matter much, but a solution to a K&R exercise must be assumed to be C.
    As it happens, modern C (C99) has diverged from C++ in the case of left
    shifting negative numbers but the above is, presumably, not C99.

    > Behavior of shift operators << and >> is undefined if the right
    > operand is negative, or
    > greater than or equal to the length in bits of the promoted left
    > operand.
    >
    > So, result ((~0 << (p + 1)) may be undefined.


    It can be undefined for other reasons, though none that matter in the
    context of K&R2. ~0 is often signed and negative in which case it is
    undefined in C99 but not in C90 or in C++ (at least up to and including
    2003). It's safer to shift unsigned types so I'd suggest ~0u.

    I think it's hard to do this without knowing the width of the type. I'd
    probably write:

    unsigned width = CHAR_BIT * sizeof x;
    unsigned mask = ~0u >> width - n << p - n + 1;
    return x & ~mask | y & mask;

    This goes wrong if there are padding bits, but at least we can check for
    that (UINT_MAX will be == ~0u if there are none).

    There is a bit-twiddling version that is one operation shorter:

    return x ^ (x ^ a) & mask;

    but you'd need to justify the "eh?" this might prompt in the reader.

    --
    Ben.
     
    Ben Bacarisse, May 2, 2010
    #2
    1. Advertising

  3. Alex Vinokur

    Jonathan Lee Guest

    On May 2, 7:46 am, Alex Vinokur <> wrote:
    > Here is Richard Heathfield's function
    > ...
    > setbits(x,p,n,y) returns x with the n bits that begin at position p
    > set to the rightmost n bits of y, leaving the other bits unchanged.
    >
    > It seems that setbits (x, 31, n, y) may produce undefined behavior.


    Assuming p, n are in [0, INT_BIT)

    unsigned mask = (~((~0U) << n)) << p;
    return (x & ~mask) + ((y << p) & mask);
    or
    return x ^ ((x ^ (y << p)) & mask)

    --Jonathan
     
    Jonathan Lee, May 2, 2010
    #3
  4. Alex Vinokur

    Eric Sosman Guest

    On 5/2/2010 11:39 AM, Ben Bacarisse wrote:
    > [...]
    > ~0 is often signed and negative [...]


    In C, s/often/always/.

    > [...]
    > This goes wrong if there are padding bits, but at least we can check for
    > that (UINT_MAX will be == ~0u if there are none).


    Also if there are twenty. 6.5.3.3p4: "[...] If the promoted
    type is an unsigned type, the expression ~E is equivalent to the
    maximum value representable in that type minus E." As always, the
    settings of any padding bits in the result of ~E (of any arithmetic
    operation) are unspecified.

    --
    Eric Sosman
    lid
     
    Eric Sosman, May 2, 2010
    #4
  5. Eric Sosman <> writes:

    > On 5/2/2010 11:39 AM, Ben Bacarisse wrote:
    >> [...]
    >> ~0 is often signed and negative [...]

    >
    > In C, s/often/always/.


    I'll take your word for it! I was not sure about ~0 on a 1's complement
    machine that supports negative zero. It's called "negative" but is it
    less than zero for the purposes of a shift operation? I was not sure.

    >> [...]
    >> This goes wrong if there are padding bits, but at least we can check for
    >> that (UINT_MAX will be == ~0u if there are none).

    >
    > Also if there are twenty. 6.5.3.3p4: "[...] If the promoted
    > type is an unsigned type, the expression ~E is equivalent to the
    > maximum value representable in that type minus E." As always, the
    > settings of any padding bits in the result of ~E (of any arithmetic
    > operation) are unspecified.


    Ah, yes, of course. Do you know a good way to determine the width of an
    unsigned type? By "good" I probably mean "other than the obvious"
    iterative one.

    --
    Ben.
     
    Ben Bacarisse, May 2, 2010
    #5
  6. Alex Vinokur

    Eric Sosman Guest

    On 5/2/2010 2:15 PM, Ben Bacarisse wrote:
    > Eric Sosman<> writes:
    >
    >> On 5/2/2010 11:39 AM, Ben Bacarisse wrote:
    >>> [...]
    >>> ~0 is often signed and negative [...]

    >>
    >> In C, s/often/always/.

    >
    > I'll take your word for it! I was not sure about ~0 on a 1's complement
    > machine that supports negative zero. It's called "negative" but is it
    > less than zero for the purposes of a shift operation? I was not sure.


    Ah! Okay, "negative zero" might not be "negative" (since it
    compares equal to "positive zero"). Point taken.

    >>> [...]
    >>> This goes wrong if there are padding bits, but at least we can check for
    >>> that (UINT_MAX will be == ~0u if there are none).

    >>
    >> Also if there are twenty. 6.5.3.3p4: "[...] If the promoted
    >> type is an unsigned type, the expression ~E is equivalent to the
    >> maximum value representable in that type minus E." As always, the
    >> settings of any padding bits in the result of ~E (of any arithmetic
    >> operation) are unspecified.

    >
    > Ah, yes, of course. Do you know a good way to determine the width of an
    > unsigned type? By "good" I probably mean "other than the obvious"
    > iterative one.


    Hallvard B. Furuseth came up with

    "

    /* Number of bits in inttype_MAX, or in any (1<<k)-1 where
    * 0 <= k < 3.2E+10 */
    #define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL
    *30 \
    + (m)%0x3fffffffL /((m)%31+1)/31%31*5 +
    4-12/((m)%31+3))

    Or if you are less paranoid about how large UINTMAX_MAX can get:

    /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
    #define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))

    .." (Sorry about the line-wrapping.) Dunno whether you'd deem this
    good, but it's certainly a jaw-dropper.

    --
    Eric Sosman
    lid
     
    Eric Sosman, May 2, 2010
    #6
  7. On Sun, 2 May 2010, Ben Bacarisse wrote:

    > Do you know a good way to determine the width of an unsigned type? By
    > "good" I probably mean "other than the obvious" iterative one.


    You can count the number of set bits in logarithmic time. (*)

    Counting the bits set in (uintmax_t)(unsigned_type)-1 should give the
    number of value bits in "unsigned_type". It should cover all unsigned
    types, and it should require not more steps than log2(sizeof(uintmax_t) *
    (size_t)CHAR_BIT) rounded up or so.

    (*) To see this, divide the entire bit-string in adjacent bit-pairs. A
    bit-pair can have 0, 1, or 2 bits set. This sum can be represented in two
    bits as 00_b, 01_b, or 10_b; there is no carry from this addition. Then
    you can add adjacent bit-pairs representing the sums, in order to extend
    the sum to the originally adjacent bit-pair. The highest value will be
    4_d, or 0100_b, so it will fit into each four-bit nibble. And so on. You
    basically do multiple additions at once. Disjunct bit-segments can be used
    in parallel, because there is never a carry. As you double the segment
    length in each step, the representible range is squared per segment, but
    the maximum sum is only doubled; double exponential vs. exponential.

    For uin32_t, it should look something like this:

    unsigned
    countbits32(uint32_t u32)
    {
    /*
    Compute initial sums. A single bit is identical to the number of bits
    set in it.
    */

    /* 10101010 ... 01010101 ... */
    u32 = (u32 & 0xAAAAAAAAlu) >> 1 + (u32 & 0x55555555lu);

    /* 11001100 ... 00110011 ... */
    u32 = (u32 & 0xCCCCCCCClu) >> 2 + (u32 & 0x33333333lu);

    /* 11110000 ... 00001111 ... */
    u32 = (u32 & 0xF0F0F0F0lu) >> 4 + (u32 & 0x0F0F0F0Flu);

    u32 = (u32 & 0xFF00FF00lu) >> 8 + (u32 & 0x00FF00FFlu);
    u32 = (u32 & 0xFFFF0000lu) >> 16 + (u32 & 0x0000FFFFlu);
    }

    (I probably broke this, so caveat emptor.)

    Another way to look at this is a complete binary tree of six levels (five
    levels plus root). The leaves are the individual original bits (also
    representing the number of bits set in each individual bit). Each internal
    node sums the value of its two direct children, representing the number of
    all nonzero leaves of the subtree rooted at that node. Leveling up happens
    in parallel.

    (So much ranting about such a small thing, and even so, probably
    incorrect. Sorry!)

    Cheers,
    lacos
     
    Ersek, Laszlo, May 2, 2010
    #7
  8. Eric Sosman <> writes:

    > On 5/2/2010 2:15 PM, Ben Bacarisse wrote:

    <snip>
    >> Do you know a good way to determine the width of an
    >> unsigned type? By "good" I probably mean "other than the obvious"
    >> iterative one.

    >
    > Hallvard B. Furuseth came up with
    >
    > "
    >
    > /* Number of bits in inttype_MAX, or in any (1<<k)-1 where
    > * 0 <= k < 3.2E+10 */
    > #define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL
    > %0x3fffffffL *30 \
    > + (m)%0x3fffffffL /((m)%31+1)/31%31*5 +
    > 4-12/((m)%31+3))
    >
    > Or if you are less paranoid about how large UINTMAX_MAX can get:
    >
    > /* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
    > #define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
    >
    > ." (Sorry about the line-wrapping.) Dunno whether you'd deem this
    > good, but it's certainly a jaw-dropper.


    I remember seeing that now... And I was worried about suggesting

    x ^ (x ^ y) & mask
    for
    x & ~mask | y & mask

    !!

    --
    Ben.
     
    Ben Bacarisse, May 2, 2010
    #8
  9. Ben Bacarisse <> writes:

    > Alex Vinokur <> writes:

    <snip>
    >> setbits(x,p,n,y) returns x with the n bits that begin at position p
    >> set to the rightmost n bits of y, leaving the other bits unchanged.

    ^^^^^^^^^^^^^^^^^^^^^
    I missed this bit of the spec. You need y << p - n + 1 in:

    > unsigned width = CHAR_BIT * sizeof x;
    > unsigned mask = ~0u >> width - n << p - n + 1;
    > return x & ~mask | y & mask;


    return x & ~mask | (y << p - n + 1) & mask;

    but you should also use:

    unsigned mask = ~(~0u << n) << p - n + 1;

    as this does not need the width of the type.

    --
    Ben.
     
    Ben Bacarisse, May 2, 2010
    #9
  10. Jonathan Lee <> writes:

    > On May 2, 7:46 am, Alex Vinokur <> wrote:
    >> Here is Richard Heathfield's function
    >> ...
    >> setbits(x,p,n,y) returns x with the n bits that begin at position p
    >> set to the rightmost n bits of y, leaving the other bits unchanged.
    >>
    >> It seems that setbits (x, 31, n, y) may produce undefined behavior.

    >
    > Assuming p, n are in [0, INT_BIT)
    >
    > unsigned mask = (~((~0U) << n)) << p;


    This is a better way to make the mask but I think you've altered what p
    means. Alex (based on Richard's code) seems to take p to be the
    position of the most significant bit of those changed.

    It's simpler (and consistent with the problem wording) to interpret p as
    the least significant bit of the changed bits (as you have done) but
    some people might be confused by this change of meaning.

    It easy to switch to the other interpretation of the problem: substitute
    p - n + 1 for p.

    > return (x & ~mask) + ((y << p) & mask);
    > or
    > return x ^ ((x ^ (y << p)) & mask)


    --
    Ben.
     
    Ben Bacarisse, May 2, 2010
    #10
  11. Alex Vinokur

    Alex Vinokur Guest

    "Alex Vinokur" <> wrote in message
    news:...
    > Hi,
    >
    > Here is Richard Heathfield's function from
    > http://users.powernet.co.uk/eton/kandr2/krx206.html
    >
    > #include <stdio.h>
    > unsigned setbits(unsigned x, int p, int n, unsigned y)
    > {
    > return (x & ((~0 << (p + 1)) | (~(~0 << (p + 1 - n))))) | ((y & ~(~0
    > << n)) << (p + 1 - n));
    > }
    >
    > setbits(x,p,n,y) returns x with the n bits that begin at position p
    > set to the rightmost n bits of y, leaving the other bits unchanged.
    >
    > It seems that setbits (x, 31, n, y) may produce undefined behavior.
    >
    > According to the C++ standard:
    > Behavior of shift operators << and >> is undefined if the right
    > operand is negative, or
    > greater than or equal to the length in bits of the promoted left
    > operand.
    >
    > So, result ((~0 << (p + 1)) may be undefined.


    Replace ((~0 << (p + 1)) with (((~0 << (p)) << 1)

    >
    > Regards,
    >
    > Alex Vinokur
    >
    >
    >
    >
     
    Alex Vinokur, May 3, 2010
    #11
  12. Alex Vinokur

    Phil Carmody Guest

    Ben Bacarisse <> writes:
    > Alex Vinokur <> writes:
    >> Here is Richard Heathfield's function from http://users.powernet.co.uk/eton/kandr2/krx206.html
    >>
    >> #include <stdio.h>
    >> unsigned setbits(unsigned x, int p, int n, unsigned y)
    >> {
    >> return (x & ((~0 << (p + 1)) | (~(~0 << (p + 1 - n))))) | ((y & ~(~0
    >> << n)) << (p + 1 - n));
    >> }
    >>
    >> setbits(x,p,n,y) returns x with the n bits that begin at position p
    >> set to the rightmost n bits of y, leaving the other bits unchanged.
    >>
    >> It seems that setbits (x, 31, n, y) may produce undefined behavior.

    >
    > Yes, it does.
    >
    >> According to the C++ standard:

    >
    > K&R is about C so you should probably quote the C standard. The intent
    > is that C and C++ remain synchronised about this sort of thing so won't
    > matter much, but a solution to a K&R exercise must be assumed to be C.
    > As it happens, modern C (C99) has diverged from C++ in the case of left
    > shifting negative numbers but the above is, presumably, not C99.
    >
    >> Behavior of shift operators << and >> is undefined if the right
    >> operand is negative, or
    >> greater than or equal to the length in bits of the promoted left
    >> operand.
    >>
    >> So, result ((~0 << (p + 1)) may be undefined.

    >
    > It can be undefined for other reasons, though none that matter in the
    > context of K&R2. ~0 is often signed and negative in which case it is
    > undefined in C99 but not in C90 or in C++ (at least up to and including
    > 2003). It's safer to shift unsigned types so I'd suggest ~0u.
    >
    > I think it's hard to do this without knowing the width of the type. I'd
    > probably write:


    Modulo caveats, so would I.

    > unsigned width = CHAR_BIT * sizeof x;
    > unsigned mask = ~0u >> width - n << p - n + 1;


    Alas that can't portably inject a zero-length bitstring.

    > return x & ~mask | y & mask;
    >
    > This goes wrong if there are padding bits, but at least we can check for
    > that (UINT_MAX will be == ~0u if there are none).
    >
    > There is a bit-twiddling version that is one operation shorter:
    >
    > return x ^ (x ^ a) & mask;


    One C operation shorter, yes. One instruction deeper (3 rather than 2)
    on architectures sufficiently rich to parallelise the '&'s in the former.

    > but you'd need to justify the "eh?" this might prompt in the reader.


    Anyone who can't read that expression from left to right as
    "change in x the bits that are different between x and a within the mask"
    on the second reading shouldn't be reading code, but should be clicking
    on buttons in some GUI instead. It's a perfectly standard idiom amongst
    those who have used C as a high level assembler in every-tick-counts
    embedded work. Or at least should be.

    Phil
    --
    I find the easiest thing to do is to k/f myself and just troll away
    -- David Melville on r.a.s.f1
     
    Phil Carmody, May 3, 2010
    #12
  13. Alex Vinokur

    Phil Carmody Guest

    Ben Bacarisse <> writes:
    > context of K&R2. ~0 is often signed and negative in which case it is
    > undefined in C99 but not in C90 or in C++ (at least up to and including
    > 2003). It's safer to shift unsigned types so I'd suggest ~0u.


    ~0u may end up being converted to being signed on sufficiently
    bizarre architectures which permit the usual arithmetic
    conversions (6.3.1.8) to fall through to, and be caught by:

    Otherwise, if the type of the operand with signed
    integer type can represent all of the values of
    the type of the operand with unsigned integer
    type, then the operand with unsigned integer type
    is converted to the type of the operand with
    signed integer type.

    But that probably only affects people in 33-bit la-la-land.

    Phil
    --
    I find the easiest thing to do is to k/f myself and just troll away
    -- David Melville on r.a.s.f1
     
    Phil Carmody, May 3, 2010
    #13
  14. Phil Carmody <> writes:

    > Ben Bacarisse <> writes:
    >> Alex Vinokur <> writes:

    <snip>
    >>> setbits(x,p,n,y) returns x with the n bits that begin at position p
    >>> set to the rightmost n bits of y, leaving the other bits unchanged.

    <snip>
    >> unsigned width = CHAR_BIT * sizeof x;
    >> unsigned mask = ~0u >> width - n << p - n + 1;

    >
    > Alas that can't portably inject a zero-length bitstring.


    True. I don't know if that matters or not (the specification is a
    little loose) but I have already noted that JL's construction of the
    mask is superior and, since it handles 0 naturally, it wins all round:

    unsigned width = ~(~0u << n) << p - n + 1;
    return x & ~mask | (y << p - n + 1) & mask;

    If one goes on to interpret p as the least significant bit of those
    injected then there is no need for p - n + 1; and the solution is
    simpler still:

    unsigned mask = ~(~0u << n) << p;
    return x & ~mask | (y << p) & mask;

    which is what he posted (modulo parentheses).

    [I'm posting just to summarise what I think is the neatest solution.]

    <snip>
    --
    Ben.
     
    Ben Bacarisse, May 3, 2010
    #14
  15. On Mon, 3 May 2010, Phil Carmody wrote:

    > Ben Bacarisse <> writes:


    >> context of K&R2. ~0 is often signed and negative in which case it is
    >> undefined in C99 but not in C90 or in C++ (at least up to and including
    >> 2003). It's safer to shift unsigned types so I'd suggest ~0u.

    >
    > ~0u may end up being converted to being signed


    0u has type "unsigned int", it has the conversion rank of "int", so it is
    no candidate for integer promotion. Thus ~0u is still of type "unsigned
    int" (and has value UINT_MAX, C99 "6.5.3.3 Unary arithmetic operators"
    p4). If you use ~0u as one operand of a binary (as in, taking two
    arguments) operator that implies the usual arithmetic conversions, and the
    other operand is also of integer type, then

    - If the other operand has lesser conversion rank than "int", it is
    converted to "int" or "unsigned int", dependent on whether "int" can
    represent all values of the original type with the small conversion rank.
    If it is converted to "int", then that "int" is converted to "unsigned
    int", and the operator is evaluated.

    - If the other operand has the conversion rank of "int", it is either
    "int" or "unsigned int", and then see the last sentence the previous
    paragraph.

    - If the other operand has greater rank than that of "int", eg. it's a
    "long" (long long) or "long unsigned" (long long unsigned), or some other
    extended integer type, then ~0u is converted. If the other operand is
    "long", and "long" can represent all values of "unsigned int", then ~0u
    will be converted to "long", but the value won't change: ((long)UINT_MAX).
    If the other operand is "unsigned long", or it's "long" but "long" can't
    represent all "unsigned int" values, then the other operand will be
    converted to (or stay) "unsigned long", ~0u will be also converted to
    "unsigned long" (-> ((unsigned long)UINT_MAX)), and the operator will be
    evaluated.

    In short, ~0u always has type "unsigned int" in itself.

    Shifting doesn't imply the usual arithmetic conversions, see C99 6.5.7
    "Bitwise shift operators" p2-3: "Each of the operands shall have integer
    type. [...] The integer promotions are performed on each of the operands.
    The type of the result is that of the promoted left operand."

    Cheers,
    lacos
     
    Ersek, Laszlo, May 3, 2010
    #15
  16. "Ersek, Laszlo" <> writes:

    > On Mon, 3 May 2010, Phil Carmody wrote:
    >
    >> Ben Bacarisse <> writes:

    >
    >>> context of K&R2. ~0 is often signed and negative in which case it
    >>> is undefined in C99 but not in C90 or in C++ (at least up to and
    >>> including 2003). It's safer to shift unsigned types so I'd suggest
    >>> ~0u.

    >>
    >> ~0u may end up being converted to being signed

    >
    > 0u has type "unsigned int", it has the conversion rank of "int", so it
    > is no candidate for integer promotion.


    n1256.pdf 6.3.1.1 p2 says:

    An object or expression with an integer type whose integer conversion
    rank is less than /or equal to/ the rank of int and unsigned int

    (my emphasis). This is part of the paragraph that determines what types
    are candidates for integer promotion.

    This may be a change from the published C99 causes by one of the
    Technical Corrigenda because my copy of n1256.pdf has change bars here.
    None the less, it is still official "C".

    I was aware of the odd possibility that ~0u could be a signed int but it
    seemed way to fussy to bring it up! For one thing, there is nothing you
    can do about except hope that the implementation does the shifts as
    you'd expect anyway.

    BTW, I'd like to be wrong about this.

    <snip>
    --
    Ben.
     
    Ben Bacarisse, May 3, 2010
    #16
  17. On Mon, 3 May 2010, Ben Bacarisse wrote:

    > "Ersek, Laszlo" <> writes:
    >
    >> On Mon, 3 May 2010, Phil Carmody wrote:
    >>
    >>> Ben Bacarisse <> writes:

    >>
    >>>> context of K&R2. ~0 is often signed and negative in which case it
    >>>> is undefined in C99 but not in C90 or in C++ (at least up to and
    >>>> including 2003). It's safer to shift unsigned types so I'd suggest
    >>>> ~0u.
    >>>
    >>> ~0u may end up being converted to being signed

    >>
    >> 0u has type "unsigned int", it has the conversion rank of "int", so it
    >> is no candidate for integer promotion.

    >
    > n1256.pdf 6.3.1.1 p2 says:
    >
    > An object or expression with an integer type whose integer conversion
    > rank is less than /or equal to/ the rank of int and unsigned int
    >
    > (my emphasis). This is part of the paragraph that determines what types
    > are candidates for integer promotion.
    >
    > This may be a change from the published C99 causes by one of the
    > Technical Corrigenda because my copy of n1256.pdf has change bars here.
    > None the less, it is still official "C".


    Yes, my copy of C99 doesn't contain the emphasized words. TC2 entry 10
    adds them.

    Even so, 0u could be promoted to "int" only, and only if INT_MAX >=
    UINT_MAX. Considering sizeof(int) == sizeof(unsigned), this would require
    strictly more padding bits in "unsigned int" than in "signed int". That
    would fly in the face of the conversion rules:

    ----v----
    Otherwise, if the operand that has unsigned integer type has rank greater
    or equal to the rank of the type of the other operand, then the operand
    with signed integer type is converted to the type of the operand with
    unsigned integer type.
    ----^----

    and the candidate types for integer constants suffixed with "u" (unsigned,
    unsigned long, unsigned long long).

    Furthermore, INT_MAX > UINT_MAX is impossible as per C99 6.2.5 "Types" p9.
    Hence, for such a promotion to be possible, the implementation must make
    UINT_MAX = INT_MAX, and place strictly more padding bits in "unsigned int"
    than in "int", and the conversion from "int" to "unsigned int" would be
    lossy.

    I have no idea why TC2 entry 10 came to be.


    > I was aware of the odd possibility that ~0u could be a signed int but it
    > seemed way to fussy to bring it up! For one thing, there is nothing you
    > can do about except hope that the implementation does the shifts as
    > you'd expect anyway.
    >
    > BTW, I'd like to be wrong about this.


    Me too, absolutely. My C89 unsigned's suddenly silently turning into int's
    would be catastrophic. "unsigned" was the safe haven you once reached
    (after considering range) and stayed in.

    .... Ah, I think it's a wording bug in TC2 entry 10 (or not, if I also
    misunderstand 6.3.1.1 p3 "The integer promotions preserve value including
    sign").

    6.3.1.1 p2 has two sections. The first section says what may be used "in
    an expression wherever an int or unsigned int may be used". The strict
    less-than rank restriction might have exluded "int" and "unsigned"
    themselves, which is (considered very formally) nonsensical: "an int may
    not be used where an int may be used". TC2 might have fixed this.

    The second section describes the circumstances of the promotion, eg. the
    target type.

    The question is whether section 1 *implies* section 2, that is, if an
    object or expression with integer type may be used in place of an int or
    unsigned int, does that necessarily invoke promotion? This didn't matter
    with pre-TC2 C99, because even if there has been such an implication,
    "int" and "unsigned int" falsified the premise.

    I think the intent of the second section could be reformulated like this:

    ----v----
    /Except when the original type is unsigned int,/ f an int can represent
    all values of the original type, the value is converted to an int;
    otherwise, it is converted to an unsigned int. These are called the
    integer promotions. 48) All other types are unchanged by the integer
    promotions.
    ----^----

    This introduces some ambiguity in interpreting the "otherwise" branch, but
    I don't think it matters. There should be no difference between "an
    unsigned int is not promoted" and "an unsigned int is promoted to an
    unsigned int".

    Cheers,
    lacos
     
    Ersek, Laszlo, May 3, 2010
    #17
  18. Alex Vinokur

    Tim Rentsch Guest

    Eric Sosman <> writes:

    > On 5/2/2010 11:39 AM, Ben Bacarisse wrote:
    >> [...]
    >> ~0 is often signed and negative [...]

    >
    > In C, s/often/always/.


    Trivia question 1: Under what circumstances may the C expression

    ~0 > 0

    evaluate to 1?


    >> [...]
    >> This goes wrong if there are padding bits, but at least we can check for
    >> that (UINT_MAX will be == ~0u if there are none).

    >
    > Also if there are twenty. 6.5.3.3p4: "[...] If the promoted
    > type is an unsigned type, the expression ~E is equivalent to the
    > maximum value representable in that type minus E." As always, the
    > settings of any padding bits in the result of ~E (of any arithmetic
    > operation) are unspecified.


    Trivia question 2: Under what circumstances may the C expression

    UINT_MAX == (unsigned) -1 && UINT_MAX != ~0u

    evaluate to 1?
     
    Tim Rentsch, May 3, 2010
    #18
  19. Alex Vinokur

    Tim Rentsch Guest

    Phil Carmody <> writes:

    > Ben Bacarisse <> writes:
    >> context of K&R2. ~0 is often signed and negative in which case it is
    >> undefined in C99 but not in C90 or in C++ (at least up to and including
    >> 2003). It's safer to shift unsigned types so I'd suggest ~0u.

    >
    > ~0u may end up being converted to being signed on sufficiently
    > bizarre architectures which permit the usual arithmetic
    > conversions (6.3.1.8) to fall through to, and be caught by:
    >
    > Otherwise, if the type of the operand with signed
    > integer type can represent all of the values of
    > the type of the operand with unsigned integer
    > type, then the operand with unsigned integer type
    > is converted to the type of the operand with
    > signed integer type.
    >


    Right idea but not quite the right section; the usual
    arithmetic conversions don't apply to operands of ~.
     
    Tim Rentsch, May 3, 2010
    #19
  20. Alex Vinokur

    Eric Sosman Guest

    On 5/3/2010 2:12 PM, Tim Rentsch wrote:
    > Eric Sosman<> writes:
    >
    >> On 5/2/2010 11:39 AM, Ben Bacarisse wrote:
    >>> [...]
    >>> ~0 is often signed and negative [...]

    >>
    >> In C, s/often/always/.

    >
    > Trivia question 1: Under what circumstances may the C expression
    >
    > ~0> 0
    >
    > evaluate to 1?


    Ben Bacarisse gave the necessary hint in a response yesterday
    (which I acknowledged as correct yesterday). If ~0 produces a
    "minus zero" trap representation, the behavior is undefined -- in
    which case the `>' can yield 0, 1, or even -98.6 in blinking lights.

    > Trivia question 2: Under what circumstances may the C expression
    >
    > UINT_MAX == (unsigned) -1&& UINT_MAX != ~0u
    >
    > evaluate to 1?


    Three that come to mind:

    - When `UINT_MAX' is a user-defined macro or variable, not
    the macro defined by <limits.h>

    - When the code has (invalidly) #define'd the `unsigned'
    keyword to something screwy

    - When the implementation is non-conforming

    Can't think of any others, though.

    --
    Eric Sosman
    lid
     
    Eric Sosman, May 3, 2010
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    12
    Views:
    660
  2. Malcolm McLean

    Richard Heathfield's personal threads.

    Malcolm McLean, Nov 9, 2007, in forum: C Programming
    Replies:
    52
    Views:
    1,211
    Christopher Benson-Manica
    Nov 21, 2007
  3. Tomás Ó hÉilidhe

    [OFF-TOPIC] Animosity on the part of Mr Richard Heathfield

    Tomás Ó hÉilidhe, Dec 13, 2008, in forum: C Programming
    Replies:
    113
    Views:
    2,065
    Richard Bos
    Dec 22, 2008
  4. Anonymous

    Animosity on the part of Mr Richard Heathfield

    Anonymous, Dec 14, 2008, in forum: C Programming
    Replies:
    6
    Views:
    407
    CBFalconer
    Dec 21, 2008
  5. Alex Vinokur

    Richard Heathfield's setbits()

    Alex Vinokur, May 2, 2010, in forum: C++
    Replies:
    13
    Views:
    817
    Ben Bacarisse
    May 3, 2010
Loading...

Share This Page