Trap representations for unsigned integers

Discussion in 'C Programming' started by Army1987, Apr 21, 2007.

  1. Army1987

    Army1987 Guest

    If uMyInt_t is an unsigned integral type, is the following a
    necessary and sufficient condition that uMyInt_t has no trap
    representation?

    (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1

    That is, I'm asking wheter it equals 0 whenever uMyInt_t has trap
    representations, equals a nonzero value whenever uMyInt_t has no
    trap representation, and never triggers undefined behaviour.

    --
    #include <stdio.h>
    char s[]="\16Jsa ukenethr ,cto haCr\n";int main(void){*s*=5;*
    s%=23;putchar(s[0]);return*s-14?main():!putchar(9[s+*s]);}
     
    Army1987, Apr 21, 2007
    #1
    1. Advertising

  2. Army1987 wrote:
    > If uMyInt_t is an unsigned integral type, is the following a
    > necessary and sufficient condition that uMyInt_t has no trap
    > representation?
    >
    > (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1


    No. If uMyInt_t has padding bits, you will right-shift by a number
    greater than (or equal to) the number of value bits, and for that the
    behaviour is undefined.
     
    =?utf-8?B?SGFyYWxkIHZhbiBExLNr?=, Apr 21, 2007
    #2
    1. Advertising

  3. Army1987

    Eric Sosman Guest

    Army1987 wrote:
    > If uMyInt_t is an unsigned integral type, is the following a
    > necessary and sufficient condition that uMyInt_t has no trap
    > representation?
    >
    > (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1
    >
    > That is, I'm asking wheter it equals 0 whenever uMyInt_t has trap
    > representations, equals a nonzero value whenever uMyInt_t has no
    > trap representation, and never triggers undefined behaviour.


    I think there are at least two problems with this test.
    First, if uMyInt_t has padding bits the shift count may be
    too large and lead to undefined behavior ("may" because of
    possible promotion to int or unsigned int). Second, the
    presence of padding bits does not imply the existence of trap
    representations: the extra bits may just be along for the ride.

    The best way to detect padding bits may be to count the
    number of 1's in (uMyInt_t)-1, or to compare the numeric value
    of (uMyInt_t)-1 to the "expected" quantity. The first test is
    easy at run time but difficult or impossible at preprocessing
    time; the second has problems, too (what type should you use
    to form the expected value?). I can think of no reliable way
    to determine whether trap representations exist; if you find
    there are no padding bits you can deduce that there are no
    traps, but that's as far as I think you can get.

    --
    Eric Sosman
    lid
     
    Eric Sosman, Apr 21, 2007
    #3
  4. Army1987

    Army1987 Guest

    "Harald van D?k" <> ha scritto nel messaggio
    news:...
    > Army1987 wrote:
    >> If uMyInt_t is an unsigned integral type, is the following a
    >> necessary and sufficient condition that uMyInt_t has no trap
    >> representation?
    >>
    >> (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1

    >
    > No. If uMyInt_t has padding bits, you will right-shift by a number
    > greater than (or equal to) the number of value bits, and for that the
    > behaviour is undefined.


    Is

    ( DBL_MAX >= (uMyInt_t)(-1) || (puts("Your DS9K is about to self-\
    destruct. Get a real computer."), exit(1)),
    /* On the DeathStation 9000 exit(1) activates self-destruction */
    ceil(log2((uMyInt_t)(-1))) >= CHAR_BIT*sizeof(uMyInt_t) )

    any better?
     
    Army1987, Apr 21, 2007
    #4
  5. Army1987 wrote:
    > "Harald van D?k" <> ha scritto nel messaggio
    > news:...
    > > Army1987 wrote:
    > >> If uMyInt_t is an unsigned integral type, is the following a
    > >> necessary and sufficient condition that uMyInt_t has no trap
    > >> representation?
    > >>
    > >> (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1

    > >
    > > No. If uMyInt_t has padding bits, you will right-shift by a number
    > > greater than (or equal to) the number of value bits, and for that the
    > > behaviour is undefined.

    >
    > Is
    >
    > ( DBL_MAX >= (uMyInt_t)(-1) || (puts("Your DS9K is about to self-\
    > destruct. Get a real computer."), exit(1)),
    > /* On the DeathStation 9000 exit(1) activates self-destruction */
    > ceil(log2((uMyInt_t)(-1))) >= CHAR_BIT*sizeof(uMyInt_t) )
    >
    > any better?


    On the DS9K, DBL_MAX would be large enough, but CHAR_BIT *
    sizeof(uMuInt_t) would give the wrong result because SIZE_MAX is too
    small. :)

    Seriously though, your current expression is already no longer an
    integer constant expression, so at this point there's no downside to
    just writing a function.
     
    =?utf-8?B?SGFyYWxkIHZhbiBExLNr?=, Apr 21, 2007
    #5
  6. Army1987

    Army1987 Guest

    "Eric Sosman" <> ha scritto nel messaggio
    news:...
    > Army1987 wrote:
    >> If uMyInt_t is an unsigned integral type, is the following a
    >> necessary and sufficient condition that uMyInt_t has no trap
    >> representation?
    >>
    >> (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1
    >>
    >> That is, I'm asking wheter it equals 0 whenever uMyInt_t has trap
    >> representations, equals a nonzero value whenever uMyInt_t has no
    >> trap representation, and never triggers undefined behaviour.


    [snip] Second, the
    > presence of padding bits does not imply the existence of trap
    > representations: the extra bits may just be along for the ride.


    So I'll replace "necessary and sufficient condition" with
    "sufficient condition".

    What I was thinking of is something like:

    #include <string.h>
    unsigned char randchar();
    /* Get a random integer from 0 to UCHAR_MAX */

    unsigned long longrand()
    {
    unsigned long result;
    if (NO_TRAP(unsigned long)) {
    int i;
    unsigned char res[sizeof result];
    for (i=0; i<sizeof result; i++)
    res = randchar();
    memcpy(&result, res, sizeof result);
    } else {
    /* invent something else */
    }
    return result;
    }

    I would still be able to use the algorithm with the right result (a
    uniformly distributed random integer from 0 to UINT_MAX) if there
    are padding bits but they are ignored.
     
    Army1987, Apr 21, 2007
    #6
  7. Army1987

    Army1987 Guest

    "Harald van D?k" <> ha scritto nel messaggio
    news:...
    > Army1987 wrote:


    >> Is
    >>
    >> ( DBL_MAX >= (uMyInt_t)(-1) || (puts("Your DS9K is about to self-\
    >> destruct. Get a real computer."), exit(1)),
    >> /* On the DeathStation 9000 exit(1) activates self-destruction */
    >> ceil(log2((uMyInt_t)(-1))) >= CHAR_BIT*sizeof(uMyInt_t) )
    >>
    >> any better?

    >
    > On the DS9K, DBL_MAX would be large enough, but CHAR_BIT *
    > sizeof(uMuInt_t) would give the wrong result because SIZE_MAX is too
    > small. :)

    On C99 I might use (uintmax_t)CHAR_BIT*sizeof(uMyInt_t)...


    > Seriously though, your current expression is already no longer an
    > integer constant expression, so at this point there's no downside to
    > just writing a function.


    I was thinking of using a macro, so I could write
    NO_PADDING(size_t), NO_PADDING(unsigned int), etc. There's no way
    to do that with a function.
     
    Army1987, Apr 21, 2007
    #7
  8. Army1987 wrote:
    > "Eric Sosman" <> ha scritto nel messaggio
    > news:...
    > > Army1987 wrote:
    > >> If uMyInt_t is an unsigned integral type, is the following a
    > >> necessary and sufficient condition that uMyInt_t has no trap
    > >> representation?
    > >>
    > >> (uMyInt_t)(-1) >> CHAR_BIT*sizeof(uMyInt_t)-1
    > >>
    > >> That is, I'm asking wheter it equals 0 whenever uMyInt_t has trap
    > >> representations, equals a nonzero value whenever uMyInt_t has no
    > >> trap representation, and never triggers undefined behaviour.

    >
    > [snip] Second, the
    > > presence of padding bits does not imply the existence of trap
    > > representations: the extra bits may just be along for the ride.

    >
    > So I'll replace "necessary and sufficient condition" with
    > "sufficient condition".
    >
    > What I was thinking of is something like:
    >
    > #include <string.h>
    > unsigned char randchar();
    > /* Get a random integer from 0 to UCHAR_MAX */
    >
    > unsigned long longrand()
    > {
    > unsigned long result;
    > if (NO_TRAP(unsigned long)) {
    > int i;
    > unsigned char res[sizeof result];
    > for (i=0; i<sizeof result; i++)
    > res = randchar();
    > memcpy(&result, res, sizeof result);
    > } else {
    > /* invent something else */
    > }
    > return result;
    > }
    >
    > I would still be able to use the algorithm with the right result (a
    > uniformly distributed random integer from 0 to UINT_MAX) if there
    > are padding bits but they are ignored.


    Why not

    unsigned long result = randchar();
    if ((unsigned long) -1 > UCHAR_MAX)
    {
    size_t i;
    for (i = sizeof result - 1; i != 0; i--)
    result = (result << CHAR_BIT) | randchar();
    }

    It works regardless of any padding bits.
     
    =?utf-8?B?SGFyYWxkIHZhbiBExLNr?=, Apr 21, 2007
    #8
  9. Army1987 wrote:
    > "Harald van D?k" <> ha scritto nel messaggio
    > news:...
    > > Army1987 wrote:

    >
    > >> Is
    > >>
    > >> ( DBL_MAX >= (uMyInt_t)(-1) || (puts("Your DS9K is about to self-\
    > >> destruct. Get a real computer."), exit(1)),
    > >> /* On the DeathStation 9000 exit(1) activates self-destruction */
    > >> ceil(log2((uMyInt_t)(-1))) >= CHAR_BIT*sizeof(uMyInt_t) )
    > >>
    > >> any better?

    > >
    > > On the DS9K, DBL_MAX would be large enough, but CHAR_BIT *
    > > sizeof(uMuInt_t) would give the wrong result because SIZE_MAX is too
    > > small. :)

    > On C99 I might use (uintmax_t)CHAR_BIT*sizeof(uMyInt_t)...


    And on C90, you can use a cast to unsigned long.

    > > Seriously though, your current expression is already no longer an
    > > integer constant expression, so at this point there's no downside to
    > > just writing a function.

    >
    > I was thinking of using a macro, so I could write
    > NO_PADDING(size_t), NO_PADDING(unsigned int), etc. There's no way
    > to do that with a function.


    You could do

    #define NO_PADDING(type) (count_bits((type) -1) == sizeof(type))

    where count_bits accepts an unsigned long / uintmax_t.
     
    =?utf-8?B?SGFyYWxkIHZhbiBExLNr?=, Apr 21, 2007
    #9
  10. Harald van Dijk wrote:
    > You could do
    >
    > #define NO_PADDING(type) (count_bits((type) -1) == sizeof(type))


    .... == sizeof(type) * CHAR_BIT
     
    =?utf-8?B?SGFyYWxkIHZhbiBExLNr?=, Apr 21, 2007
    #10
  11. Army1987

    Army1987 Guest

    "Harald van D?k" <> ha scritto nel messaggio
    news:...
    > Why not
    >
    > unsigned long result = randchar();
    > if ((unsigned long) -1 > UCHAR_MAX)
    > {
    > size_t i;
    > for (i = sizeof result - 1; i != 0; i--)
    > result = (result << CHAR_BIT) | randchar();
    > }
    >
    > It works regardless of any padding bits.


    I hadn't thought of that before... (Mathematically x << y << z << t
    is equivalent to x << (y+z+t), but I hadn't thought to the fact
    that the former might be valid C even when the latter may be UB.)
    Thanks.
     
    Army1987, Apr 21, 2007
    #11
  12. Harald van Dijk <> wrote:
    > > unsigned char randchar();

    >
    > unsigned long result = randchar();
    > if ((unsigned long) -1 > UCHAR_MAX)
    > {
    >     size_t i;
    >     for (i = sizeof result - 1; i != 0; i--)
    >         result = (result << CHAR_BIT) | randchar();
    > }
    >
    > It works regardless of any padding bits.


    True, but it may call randchar() more times than needed
    on a hypothetical implementation where there are LOTS
    of padding bits.

    unsigned long m = -1, result = randchar();
    while ((m = m >> (CHAR_BIT - 1) >> 1) != 0)
    result = (result << CHAR_BIT) | randchar();

    --
    Peter
     
    Peter Nilsson, Apr 23, 2007
    #12
  13. Army1987

    Army1987 Guest

    "Peter Nilsson" <> ha scritto nel messaggio
    news:...
    >Harald van D?k <> wrote:
    >> > unsigned char randchar();

    >>
    >> unsigned long result = randchar();
    >> if ((unsigned long) -1 > UCHAR_MAX)
    >> {
    >> size_t i;
    >> for (i = sizeof result - 1; i != 0; i--)
    >> result = (result << CHAR_BIT) | randchar();
    >> }
    >>
    >> It works regardless of any padding bits.

    >
    > True, but it may call randchar() more times than needed
    > on a hypothetical implementation where there are LOTS
    > of padding bits.
    >
    > unsigned long m = -1, result = randchar();
    > while ((m = m >> (CHAR_BIT - 1) >> 1) != 0)
    > result = (result << CHAR_BIT) | randchar();


    Or something more readable:

    #include <limits.h>
    #if __STDC_VERSION__ >= 199901L
    #include <stdint.h>
    #else
    #define uintmax_t unsigned long
    #endif

    int count_bits(uintmax_t n)
    {
    int result = 1;
    while (n/=2)
    result++;
    return result;
    }

    #define PADDING(t) ( CHAR_BIT*sizeof(t) - count_bits((t)(-1)) )

    unsigned long longrand()
    {
    size_t i;
    unsigned long result = 0;
    size_t bytes = sizeof result - PADDING(unsigned long)/CHAR_BIT;
    for (i=0; i<bytes; i++) {
    result <<= CHAR_BIT;
    result |= randchar();
    }
    return result;
    }

    (Did I get this right?)
     
    Army1987, Apr 24, 2007
    #13
  14. "Army1987" <> writes:
    [...]
    > #include <limits.h>
    > #if __STDC_VERSION__ >= 199901L
    > #include <stdint.h>
    > #else
    > #define uintmax_t unsigned long
    > #endif


    Why are you using #define rather than typedef?

    The test for __STDC_VERSION__ *should* work, but unfortunately in real
    life it may not be reliable. The __STDC_VERSION__ macro typically
    tells you whether the compiler claims to support C99; this may or may
    not tell you whether the <stdint.h> header exists. You can also get a
    false negative, if the compiler doesn't claim to support C99, but it
    supports unsigned long long as an extension. (Most such compilers
    support a mode in which they conform to C90 without supporting
    C99-specific features; the trick is to make sure the compiler is
    actually invoked in such a mode.)

    A conditional #include would be handy here:

    #if header_exists <stdint.h>
    #include <stdint.h>
    #else
    typedef unsigned long uintmax_t
    #endif

    but, alas, that doesn't exist.

    Configuration systems like GNU autoconf can be helpful, but the
    details are off-topic.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 24, 2007
    #14
  15. Army1987

    pete Guest

    Peter Nilsson wrote:
    >
    > Harald van Dijk <> wrote:
    > > > unsigned char randchar();

    > >
    > > unsigned long result = randchar();
    > > if ((unsigned long) -1 > UCHAR_MAX)
    > > {
    > > Â Â size_t i;
    > > Â Â for (i = sizeof result - 1; i != 0; i--)
    > > Â Â Â Â result = (result << CHAR_BIT) | randchar();
    > > }
    > >
    > > It works regardless of any padding bits.

    >
    > True, but it may call randchar() more times than needed
    > on a hypothetical implementation where there are LOTS
    > of padding bits.
    >
    > unsigned long m = -1, result = randchar();
    > while ((m = m >> (CHAR_BIT - 1) >> 1) != 0)
    > result = (result << CHAR_BIT) | randchar();


    (result << CHAR_BIT) is undefined if sizeof(result) equals one.

    --
    pete
     
    pete, Apr 28, 2007
    #15
  16. Army1987

    Army1987 Guest

    "pete" <> ha scritto nel messaggio
    news:...
    > Peter Nilsson wrote:
    >>
    >> Harald van Dijk <> wrote:
    >> > > unsigned char randchar();
    >> >
    >> > unsigned long result = randchar();
    >> > if ((unsigned long) -1 > UCHAR_MAX)
    >> > {
    >> > Â Â size_t i;
    >> > Â Â for (i = sizeof result - 1; i != 0; i--)
    >> > Â Â Â Â result = (result << CHAR_BIT) | randchar();
    >> > }
    >> >
    >> > It works regardless of any padding bits.

    >>
    >> True, but it may call randchar() more times than needed
    >> on a hypothetical implementation where there are LOTS
    >> of padding bits.
    >>
    >> unsigned long m = -1, result = randchar();
    >> while ((m = m >> (CHAR_BIT - 1) >> 1) != 0)
    >> result = (result << CHAR_BIT) | randchar();

    >
    > (result << CHAR_BIT) is undefined if sizeof(result) equals one.


    if sizeof result is 1, then m is UCHAR_MAX, and
    m >> (CHAR_BIT - 1) >> 1 is 0.
    So the body loop is never done then.
    (I still wonder why uns_var >> value_bits and uns_var << value_bits
    are undefined. uns_var * 2**value_bits modulo 2**value_bits is 0,
    and so is floor(uns_var / 2**value_bits). That's what
    uns_var >> value_bits-1 >> 1 and uns_var << value_bits-1 << 1 do.
    Maybe I'll post that to comp.std.c.)
     
    Army1987, Apr 28, 2007
    #16
  17. Army1987

    Flash Gordon Guest

    Army1987 wrote, On 28/04/07 10:28:

    <snip>

    > (I still wonder why uns_var >> value_bits and uns_var << value_bits
    > are undefined. uns_var * 2**value_bits modulo 2**value_bits is 0,
    > and so is floor(uns_var / 2**value_bits). That's what
    > uns_var >> value_bits-1 >> 1 and uns_var << value_bits-1 << 1 do.
    > Maybe I'll post that to comp.std.c.)


    The most probable reason for it being undefined is because the way
    processors will treat it if you use assembler instructions to do it varies.
    --
    Flash Gordon
     
    Flash Gordon, Apr 28, 2007
    #17
  18. "Army1987" <> wrote:
    > "Peter Nilsson" <> :
    > > > > unsigned char randchar();

    > > unsigned long m = -1, result = randchar();
    > > while ((m = m >> (CHAR_BIT - 1) >> 1) != 0)
    > > result = (result << CHAR_BIT) | randchar();

    >
    > Or something more readable:


    But less correct. ;-)

    > #include <limits.h>

    <snip>
    > #define PADDING(t) ( CHAR_BIT*sizeof(t) - count_bits((t)(-1)))


    Why are people so hung up on calculating the number of
    padding bits?

    > unsigned long longrand()
    > {
    > size_t i;
    > unsigned long result = 0;
    > size_t bytes = sizeof result - PADDING(unsigned long)
    > /CHAR_BIT;
    > for (i=0; i<bytes; i++) {
    > result <<= CHAR_BIT;
    > result |= randchar();
    > }
    > return result;
    >
    > }
    >
    > (Did I get this right?)


    No. If unsigned long is exactly 1 byte (ergo unpadded),
    then the left shifting of result by CHAR_BIT bits invokes
    undefined behaviour.

    --
    Peter
     
    Peter Nilsson, Apr 30, 2007
    #18
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Peter Ammon
    Replies:
    5
    Views:
    2,801
    Old Wolf
    Aug 26, 2004
  2. pemo

    Trap Representations - c99 [again]

    pemo, Mar 15, 2006, in forum: C Programming
    Replies:
    10
    Views:
    448
    Chris Torek
    Mar 19, 2006
  3. Replies:
    7
    Views:
    300
    Branimir Maksimovic
    May 13, 2007
  4. Spiros Bousbouras

    Trap representations

    Spiros Bousbouras, Jan 22, 2009, in forum: C Programming
    Replies:
    23
    Views:
    833
    Richard Tobin
    Jan 29, 2009
  5. Ian Collins

    Re: C++ trap representations

    Ian Collins, Feb 2, 2009, in forum: C Programming
    Replies:
    3
    Views:
    390
Loading...

Share This Page