bitwise & operator and counting set bits

Discussion in 'C Programming' started by sathyashrayan, Apr 9, 2005.

  1. Group,
    Following function will check weather a bit
    is set in the given variouble x.

    int bit_count(long x)
    {
    int n = 0;
    /*
    ** The loop will execute once for each bit of x
    set,
    */
    if (x)
    do
    n++;
    while (0 != (x = x&(x-1)));
    return (n);
    }

    But the same action in the following function is
    confusing me a lot.

    int bitcount(long i)
    {
    i = ((i & 0xAAAAAAAA) >> 1) + (i &
    0x55555555);
    i = ((i & 0xCCCCCCCC) >> 2) + (i &
    0x33333333);
    i = ((i & 0xF0F0F0F0) >> 4) + (i &
    0x0F0F0F0F);
    i = ((i & 0xFF00FF00) >> 8) + (i &
    0x00FF00FF);
    i = ((i & 0xFFFF0000) >> 16) + (i &
    0x0000FFFF);
    return (int)i;
    }

    All those hex number are magic numbers? Is the
    above an implementation defined?
    Please clear my doubt. I am posting all the two
    codes which I have taken from C snippet
    archives.

    -------------------code-1------------------

    #include <stdio.h>
    #include <stdlib.h>
    #define plural_text(n) &"s"[(1 == (n))]
    int bitcount(long i)
    {
    i = ((i & 0xAAAAAAAA) >> 1) + (i &
    0x55555555);
    i = ((i & 0xCCCCCCCC) >> 2) + (i &
    0x33333333);
    i = ((i & 0xF0F0F0F0) >> 4) + (i &
    0x0F0F0F0F);
    i = ((i & 0xFF00FF00) >> 8) + (i &
    0x00FF00FF);
    i = ((i & 0xFFFF0000) >> 16) + (i &
    0x0000FFFF);
    return (int)i;
    }
    int main(int argc, char *argv[])
    {
    long n;
    while(--argc)
    {
    int i;
    n = atol(*++argv);
    i = bitcount(n);
    printf("%ld contains %d bit%s
    set\n",n, i, plural_text(i));
    }
    return 0;
    }

    ----------------code-2----------------------------
    -
    #include <stdio.h>
    #include <stdlib.h>

    #define plural_text(n) &"s"[(1 == (n))]

    int bit_count(long x)
    {
    int n = 0;
    /*
    ** The loop will execute once for each bit of x
    set, this is in average
    ** twice as fast as the shift/test method.
    */
    if (x)
    do
    n++;
    while (0 != (x = x&(x-1)));
    return (n);
    }
    int main(int argc, char *argv[])
    {
    long n;
    while(--argc)
    {
    int i;
    n = atol(*++argv);
    i = bit_count(n);
    printf("%ld contains %d bit%s
    set\n",n, i, plural_text(i));
    }
    return 0;
    }
     
    sathyashrayan, Apr 9, 2005
    #1
    1. Advertising

  2. In article <425778fe$0$13276$>,
    sathyashrayan <> wrote:
    >But the same action in the following function is
    >confusing me a lot.


    >int bitcount(long i)
    >{
    > i = ((i & 0xAAAAAAAA) >> 1) + (i &
    >0x55555555);
    > i = ((i & 0xCCCCCCCC) >> 2) + (i &
    >0x33333333);


    >All those hex number are magic numbers?


    Yes, they assume 32 bit longs, which is implementation dependant.
    --
    Are we *there* yet??
     
    Walter Roberson, Apr 9, 2005
    #2
    1. Advertising

  3. sathyashrayan

    Chris Torek Guest

    In article <425778fe$0$13276$>,
    sathyashrayan <> wrote:
    > Group,
    > Following function will check weather a bit
    >is set in the given variouble x.
    > int bit_count(long x)

    [snippage]

    Actually, it (non-portably -- x should have type "unsigned long")
    counts the *number* of bits set.

    I think it is time to re-post this Moldy Oldie :) The stuff at
    the top is not Standard C, and the timing code as well, but the
    bit-counting functions are at least commented. Note that I wrote
    this shortly after the 1989 C Standard came out, before it was
    widely available -- we were not even using GCC yet on most machines.


    #ifndef lint
    static char rcsid[] = "$Id: bct.c,v 1.5 90/10/13 08:44:12 chris Exp $";
    #endif

    /*
    * bct - bitcount timing
    *
    * Runs a bunch of different functions-to-count-bits-in-a-longword
    * (many assume 32 bits per long) with a timer around each loop, and
    * tries to calcuate the time used.
    */
    #include <stdio.h>
    #include <sys/types.h>

    #ifdef FD_SETSIZE
    # define USE_GETRUSAGE
    # include <sys/time.h>
    # include <sys/resource.h>
    #else
    # include <sys/times.h>
    #endif

    #define SIZEOF(a) (sizeof(a)/sizeof(a[0]))


    /*
    * This function is used to calibrate the timing mechanism.
    * This way we can subtract the loop and call overheads.
    */
    int
    calibrate(n)
    register unsigned long n;
    {
    return 0;
    }


    /*
    * This function counts the number of bits in a long.
    * It is limited to 63 bit longs, but a minor mod can cope with 511 bits.
    *
    * It is so magic, an explanation is required:
    * Consider a 3 bit number as being
    * 4a+2b+c
    * if we shift it right 1 bit, we have
    * 2a+b
    * subtracting this from the original gives
    * 2a+b+c
    * if we shift the original 2 bits right we get
    * a
    * and so with another subtraction we have
    * a+b+c
    * which is the number of bits in the original number.
    * Suitable masking allows the sums of the octal digits in a
    * 32 bit number to appear in each octal digit. This isn't much help
    * unless we can get all of them summed together.
    * This can be done by modulo arithmetic (sum the digits in a number by molulo
    * the base of the number minus one) the old "casting out nines" trick they
    * taught in school before calculators were invented.
    * Now, using mod 7 wont help us, because our number will very likely have
    * more than 7 bits set. So add the octal digits together to get base64
    * digits, and use modulo 63.
    * (Those of you with 64 bit machines need to add 3 octal digits together to
    * get base512 digits, and use mod 511.)
    *
    * This is HACKMEM 169, as used in X11 sources.
    */
    int
    t0_hackmemmod(n)
    register unsigned long n;
    {
    register unsigned long tmp;

    tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
    return ((tmp + (tmp >> 3)) & 030707070707) % 63;
    }


    /*
    * This is the same as the above, but does not use the % operator.
    * Most modern machines have clockless division, and so the modulo is as
    * expensive as, say, an addition.
    */
    int
    t1_hackmemloop(n)
    register unsigned long n;
    {
    register unsigned long tmp;

    tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
    tmp = (tmp + (tmp >> 3)) & 030707070707;
    while (tmp > 63)
    tmp = (tmp & 63) + (tmp >> 6);
    return tmp;
    }

    /*
    * Above, without using while loop. It takes at most 5 iterations of the
    * loop, so we just do all 5 in-line. The final result is never 63
    * (this is assumed above as well).
    */
    int
    t2_hackmemunrolled(n)
    register unsigned long n;
    {
    register unsigned long tmp;

    tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
    tmp = (tmp + (tmp >> 3)) & 030707070707;
    tmp = (tmp & 63) + (tmp >> 6);
    tmp = (tmp & 63) + (tmp >> 6);
    tmp = (tmp & 63) + (tmp >> 6);
    tmp = (tmp & 63) + (tmp >> 6);
    tmp = (tmp & 63) + (tmp >> 6);
    return (tmp);
    }


    /*
    * This function counts the bits in a long.
    *
    * It removes the lsb and counting the number of times round the loop.
    * The expression (n & -n) yields the lsb of a number,
    * but it only works on 2's compliment machines.
    */
    int
    t3_rmlsbsub(n)
    register unsigned long n;
    {
    register int count;

    for (count = 0; n; n -= (n & -n))
    count++;
    return count;
    }

    int
    t4_rmlsbmask(n)
    register unsigned long n;
    {
    register int count;

    for (count = 0; n; count++)
    n &= n - 1; /* take away lsb */
    return (count);
    }

    /*
    * This function counts the bits in a long.
    *
    * It works by shifting the number down and testing the bottom bit.
    */
    int
    t5_testlsb(n)
    register unsigned long n;
    {
    register int count;

    for (count = 0; n; n >>= 1)
    if (n & 1)
    count++;
    return count;
    }

    /*
    * This function counts the bits in a long.
    *
    * It works by shifting the number left and testing the top bit.
    * On many machines shift is expensive, so it uses a cheap addition instead.
    */
    int
    t6_testmsb(n)
    register unsigned long n;
    {
    register int count;

    for (count = 0; n; n += n)
    if (n & ~(~(unsigned long)0 >> 1))
    count++;
    return count;
    }

    int
    t7_testsignandshift(n)
    register unsigned long n;
    {
    register int count;

    for (count = 0; n; n <<= 1)
    if ((long)n < 0)
    count++;
    return (count);
    }


    /*
    * This function counts the bits in a long.
    *
    * It works by masking each bit.
    * This is the second most intuitively obvious method,
    * and is independent of the number of bits in the long.
    */
    int
    t8_testeachbit(n)
    register unsigned long n;
    {
    register int count;
    register unsigned long mask;

    count = 0;
    for (mask = 1; mask; mask += mask)
    if (n & mask)
    count++;
    return count;
    }


    /*
    * This function counts the bits in a long.
    *
    * It works by masking each bit.
    * This is the most intuitively obvious method,
    * but how do you a priori know how many bits in the long?
    * (except for ''sizeof(long) * CHAR_BITS'' expression)
    */
    int
    t9_testeachbit1shl(n)
    register unsigned long n;
    {
    register int count;
    register int bit;

    count = 0;
    for (bit = 0; bit < 32; ++bit)
    if (n & ((unsigned long)1 << bit))
    count++;
    return count;
    }

    static char nbits[256] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
    };

    static int inbits[256] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
    };

    int
    tA_tableshift(n)
    register unsigned long n;
    {

    return (nbits[n & 0xff] + nbits[(n >> 8) & 0xff] +
    nbits[(n >> 16) & 0xff] + nbits[n >> 24]);
    }

    int
    tB_tableuchar(n)
    unsigned long n;
    {
    register unsigned char *p = (unsigned char *)&n;

    return (nbits[p[0]] + nbits[p[1]] + nbits[p[2]] + nbits[p[3]]);
    }

    int
    tC_tableshiftcast(n)
    register unsigned long n;
    {

    return nbits[(unsigned char) n] +
    nbits[(unsigned char) (n >> 8)] +
    nbits[(unsigned char) (n >> 16)] +
    nbits[(unsigned char) (n >> 24)];
    }

    int
    tD_itableshift(n)
    register unsigned long n;
    {

    return (inbits[n & 0xff] + inbits[(n >> 8) & 0xff] +
    inbits[(n >> 16) & 0xff] + inbits[n >> 24]);
    }

    int
    tE_itableuchar(n)
    unsigned long n;
    {
    register unsigned char *p = (unsigned char *)&n;

    return (inbits[p[0]] + inbits[p[1]] + inbits[p[2]] + inbits[p[3]]);
    }

    int
    tF_itableshiftcast(n)
    register unsigned long n;
    {

    return inbits[(unsigned char) n] +
    inbits[(unsigned char) (n >> 8)] +
    inbits[(unsigned char) (n >> 16)] +
    inbits[(unsigned char) (n >> 24)];
    }

    /*
    * Explanation:
    * First we add 32 1-bit fields to get 16 2-bit fields.
    * Each 2-bit field is one of 00, 01, or 10 (binary).
    * We then add all the two-bit fields to get 8 4-bit fields.
    * These are all one of 0000, 0001, 0010, 0011, or 0100.
    *
    * Now we can do something different, becuase for the first
    * time the value in each k-bit field (k now being 4) is small
    * enough that adding two k-bit fields results in a value that
    * still fits in the k-bit field. The result is four 4-bit
    * fields containing one of {0000,0001,...,0111,1000} and four
    * more 4-bit fields containing junk (sums that are uninteresting).
    * Pictorially:
    * n = 0aaa0bbb0ccc0ddd0eee0fff0ggg0hhh
    * n>>4 = 00000aaa0bbb0ccc0ddd0eee0fff0ggg
    * sum = 0aaaWWWWiiiiXXXXjjjjYYYYkkkkZZZZ
    * where W, X, Y, and Z are the interesting sums (each at most 1000,
    * or 8 decimal). Masking with 0x0f0f0f0f extracts these.
    *
    * Now we can change tactics yet again, because now we have:
    * n = 0000WWWW0000XXXX0000YYYY0000ZZZZ
    * n>>8 = 000000000000WWWW0000XXXX0000YYYY
    * so sum = 0000WWWW000ppppp000qqqqq000rrrrr
    * where p and r are the interesting sums (and each is at most
    * 10000, or 16 decimal). The sum `q' is junk, like i, j, and
    * k above; but it is not necessarry to discard it this time.
    * One more fold, this time by sixteen bits, gives
    * n = 0000WWWW000ppppp000qqqqq000rrrrr
    * n>>16 = 00000000000000000000WWWW000ppppp
    * so sum = 0000WWWW000ppppp000sssss00tttttt
    * where s is at most 11000 and t is it most 100000 (32 decimal).
    *
    * Now we have t = r+p = (Z+Y)+(X+W) = ((h+g)+(f+e))+((d+c)+(b+a)),
    * or in other words, t is the number of bits set in the original
    * 32-bit longword. So all we have to do is return the low byte
    * (or low 6 bits, but `low byte' is typically just as easy if not
    * easier).
    *
    * This technique is also applicable to 64 and 128 bit words, but
    * 256 bit or larger word sizes require at least one more masking
    * step.
    */
    int
    tG_sumbits(n)
    register unsigned long n;
    {

    n = (n & 0x55555555) + ((n >> 1) & 0x55555555);
    n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
    n = (n + (n >> 4)) & 0x0f0f0f0f;
    n += n >> 8;
    n += n >> 16;
    return (n & 0xff);
    }

    /*
    * build a long random number.
    * The standard rand() returns at least a 15 bit number.
    * We use the top 9 of 15 (since the lower N bits of the usual rand()
    * repeat with a period of 2^N).
    */
    unsigned long
    bigrand()
    {
    #define randbits() ((unsigned long)((rand() >> 6) & 0777))
    register unsigned long r;

    r = randbits() << 27;
    r |= randbits() << 18;
    r |= randbits() << 9;
    r |= randbits();
    return (r);
    }

    /*
    * Run the test many times.
    * You will need a running time in the 10s of seconds
    * for an accurate result.
    *
    * To give a fair comparison, the random number generator
    * is seeded the same for each measurement.
    *
    * Return value is seconds per iteration.
    */
    #ifndef REPEAT
    #if defined(mips) || defined(sparc)
    #define REPEAT (1L<<20)
    #else
    #define REPEAT (1L<<18)
    #endif
    #endif

    double
    measure(func)
    register int (*func)();
    {
    #ifdef USE_GETRUSAGE
    struct rusage ru0, ru1;
    register long j;

    srand(1);
    (void) getrusage(RUSAGE_SELF, &ru0);
    for (j = 0; j < REPEAT; ++j)
    func(bigrand());
    (void) getrusage(RUSAGE_SELF, &ru1);
    ru1.ru_utime.tv_sec -= ru0.ru_utime.tv_sec;
    if ((ru1.ru_utime.tv_usec -= ru0.ru_utime.tv_usec) < 0) {
    ru1.ru_utime.tv_usec += 1000000;
    ru1.ru_utime.tv_sec--;
    }
    return ((ru1.ru_utime.tv_sec + (ru1.ru_utime.tv_usec / 1000000.0)) /
    (double)REPEAT);
    #else
    register long j;
    struct tms start;
    struct tms finish;

    srand(1);
    times(&start);
    for (j = 0; j < REPEAT; ++j)
    func(bigrand());
    times(&finish);
    return ((finish.tms_utime - start.tms_utime) / (double)REPEAT);
    #endif
    }

    struct table {
    char *name;
    int (*func)();
    double measurement;
    int trial;
    };

    struct table table[] =
    {
    { "hackmemmod", t0_hackmemmod },
    { "hackmemloop", t1_hackmemloop },
    { "hackmemunrolled", t2_hackmemunrolled },
    { "rmlsbsub", t3_rmlsbsub },
    { "rmlsbmask", t4_rmlsbmask },
    { "testlsb", t5_testlsb },
    { "testmsb", t6_testmsb },
    { "testsignandshift", t7_testsignandshift },
    { "testeachbit", t8_testeachbit },
    { "testeachbit1shl", t9_testeachbit1shl },
    { "tableshift", tA_tableshift },
    { "tableuchar", tB_tableuchar },
    { "tableshiftcast", tC_tableshiftcast },
    { "itableshift", tD_itableshift },
    { "itableuchar", tE_itableuchar },
    { "itableshiftcast", tF_itableshiftcast },
    { "sumbits", tG_sumbits },
    };

    main(argc, argv)
    int argc;
    char **argv;
    {
    double calibration, v, best;
    int j, k, m, verbose;

    verbose = argc > 1;
    /*
    * run a few tests to make sure they all agree
    */
    srand(getpid());
    for (j = 0; j < 10; ++j) {
    unsigned long n;
    int bad;

    n = bigrand();
    for (k = 0; k < SIZEOF(table); ++k)
    table[k].trial = table[k].func(n);
    bad = 0;
    for (k = 0; k < SIZEOF(table); ++k)
    for (m = 0; m < SIZEOF(table); ++m)
    if (table[k].trial != table[m].trial)
    bad = 1;
    if (bad) {
    printf("wrong: %08lX", n);
    for (k = 0; k < SIZEOF(table); ++k)
    printf(" %3d", table[k].trial);
    printf("\n");
    }
    }

    /*
    * calibrate the timing mechanism
    */
    calibration = measure(calibrate);
    if (verbose)
    printf("calibration: %g\n", calibration);

    /*
    * time them all, keeping track of the best (smallest)
    */
    for (j = 0; j < SIZEOF(table); ++j) {
    v = measure(table[j].func) - calibration;
    if (verbose)
    printf("%s: %g\n", table[j].name, v);
    table[j].measurement = v;
    if (!j || v < best)
    best = v;
    }
    (void) printf("%-24s %-14sratio\n", "function", "time");
    for (j = 0; j < SIZEOF(table); ++j) {
    (void) printf("%-24s %#10.8g %6.3f\n",
    table[j].name,
    table[j].measurement,
    table[j].measurement / best);
    }
    exit(0);
    }
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Apr 9, 2005
    #3
  4. sathyashrayan

    Guest

    lets understand the magic numbers first
    AAAAAAAA = 1010 1010 1010 1010
    55555555 = 0101 0101 0101 0101
    alternative 1s and zeros
    CCCCCCCC = 1100 1100 1100 1100
    33333333 = 0011 0011 0011 0011
    2 ones and 2 zeros
    then 4 ones 4 zeros
    then 8 ones 8 zeros
    then 16 ones 16 zeros
    so lets go with
    dec 99 -> hex 63 -> bin 0110 0011


    now we can see that number of high bits is the sum of all the bit
    fields
    0+1+1+0+0+0+1+1 = 4
    this is same as adding consecutive bits and then addin them together
    (0+1)+(1+0)+(0+0)+(1+1)=0101 0010 and this is wat you get after first
    step.
    0110 0011 & 1010 1010 1010 1010 = 0010 0010 > 1= 0001 0001
    0110 0011 & 0101 0101 0101 0101 = 0100 0001
    add 0101 0010 :) (2 bits can have maximum two high bits and their sum
    is 2(10) that can be accomodated in 2 bits)

    now take 4 bits together
    ((0+1)+(1+0))+((0+0)+(1+1))=(01+01)+(00+10)=0010 0010 (4 bits can have
    maximum four high bits and their sum is 4(100) that can be accomodated
    in 4 bits)
    same as wat you get after step 2
    now take 8 bits together
    (((0+1)+(1+0))+((0+0)+(1+1)))=((01+01)+(00+10))=0010+0010=0000 0100
    wat you get after step 3
    this is the answer 0000 0100 = 4

    hope that clears yr doubt
     
    , Apr 10, 2005
    #4
  5. <> wrote in message news:...
    > lets understand the magic numbers first
    > AAAAAAAA = 1010 1010 1010 1010
    > 55555555 = 0101 0101 0101 0101
    > alternative 1s and zeros
    > CCCCCCCC = 1100 1100 1100 1100
    > 33333333 = 0011 0011 0011 0011
    > 2 ones and 2 zeros
    > then 4 ones 4 zeros
    > then 8 ones 8 zeros
    > then 16 ones 16 zeros
    > so lets go with
    > dec 99 -> hex 63 -> bin 0110 0011
    >
    >
    > now we can see that number of high bits is the sum of all the bit
    > fields
    > 0+1+1+0+0+0+1+1 = 4
    > this is same as adding consecutive bits and then addin them together
    > (0+1)+(1+0)+(0+0)+(1+1)=0101 0010 and this is wat you get after first
    > step.
    > 0110 0011 & 1010 1010 1010 1010 = 0010 0010 > 1= 0001 0001
    > 0110 0011 & 0101 0101 0101 0101 = 0100 0001
    > add 0101 0010 :) (2 bits can have maximum two high bits and their sum
    > is 2(10) that can be accomodated in 2 bits)
    >
    > now take 4 bits together
    > ((0+1)+(1+0))+((0+0)+(1+1))=(01+01)+(00+10)=0010 0010 (4 bits can have
    > maximum four high bits and their sum is 4(100) that can be accomodated
    > in 4 bits)
    > same as wat you get after step 2
    > now take 8 bits together
    > (((0+1)+(1+0))+((0+0)+(1+1)))=((01+01)+(00+10))=0010+0010=0000 0100
    > wat you get after step 3
    > this is the answer 0000 0100 = 4
    >
    > hope that clears yr doubt
    >


    Thanks a lot. It does help


    --
    "combination is the heart of chess"

    A.Alekhine

    Mail to:
    sathyashrayan AT gmail DOT com
     
    sathyashrayan, Apr 10, 2005
    #5
  6. "sathyashrayan" <> wrote in message
    news:425778fe$0$13276$...
    > Group,
    > Following function will check weather a bit
    > is set in the given variouble x.
    >
    > int bit_count(long x)
    > {
    > int n = 0;
    > /*
    > ** The loop will execute once for each bit of x
    > set,
    > */
    > if (x)
    > do
    > n++;
    > while (0 != (x = x&(x-1)));
    > return (n);
    > }
    >
    > But the same action in the following function is
    > confusing me a lot.
    >
    > int bitcount(long i)
    > {
    > i = ((i & 0xAAAAAAAA) >> 1) + (i &
    > 0x55555555);
    > i = ((i & 0xCCCCCCCC) >> 2) + (i &
    > 0x33333333);
    > i = ((i & 0xF0F0F0F0) >> 4) + (i &
    > 0x0F0F0F0F);
    > i = ((i & 0xFF00FF00) >> 8) + (i &
    > 0x00FF00FF);
    > i = ((i & 0xFFFF0000) >> 16) + (i &
    > 0x0000FFFF);
    > return (int)i;
    > }
    >
    > All those hex number are magic numbers? Is the
    > above an implementation defined?
    > Please clear my doubt. I am posting all the two
    > codes which I have taken from C snippet
    > archives.


    They are not magic numbers. They are bitmasks, progressively masking
    off consecutive bits, 1, 2, 4, 8, 16.

    It may have been clearer had it been written:

    i = ((i >> 1) & 0x55555555)
    + (i & 0x55555555);

    i = ((i >> 2) & 0x33333333)
    + (i & 0x33333333);

    i = ((i >> 4) & 0x0F0F0F0F)
    + (i & 0x0F0F0F0F);

    i = ((i >> 8) & 0x00FF00FF)
    + (i & 0x00FF00FF);

    i = ((i >> 16)& 0x0000FFFF) >> 16)
    + (i & 0x0000FFFF);
    return (int)i;

    This way you can see that the first expression is 16 parallel adds of single
    bits into 2-bit result fields. (The bit counts of 2 bit fields)

    The next is 8 parallel adds of 2 bits into a 4 bit result field.
    (the sums of adjacent prior bit counts)

    etc...

    The final result being the sum of sums (total bit count for a 32 bit number)

    Rufus
     
    Rufus V. Smith, Apr 12, 2005
    #6
  7. "Rufus V. Smith" <> wrote in message
    news:1113314507.697052106a46bcd7df71daf9003ca3cf@teranews...
    >

    <snip>
    > They are not magic numbers. They are bitmasks, progressively masking
    > off consecutive bits, 1, 2, 4, 8, 16.
    >
    > It may have been clearer had it been written:
    >

    <snip>
    >
    > i = ((i >> 16)& 0x0000FFFF) >> 16)
    > + (i & 0x0000FFFF);
    > return (int)i;


    Oops, forgot to get rid of that second shift during cut and paste.
    Should be:

    > i = ((i >> 16)& 0x0000FFFF)
    > + (i & 0x0000FFFF);
    > return (int)i;


    Sorry about that.

    Rufus
     
    Rufus V. Smith, Apr 12, 2005
    #7
  8. "Chris Torek" <> wrote in message
    news:...
    > In article <425778fe$0$13276$>,
    > /*
    > * Explanation:
    > * First we add 32 1-bit fields to get 16 2-bit fields.
    > * Each 2-bit field is one of 00, 01, or 10 (binary).
    > * We then add all the two-bit fields to get 8 4-bit fields.
    > * These are all one of 0000, 0001, 0010, 0011, or 0100.
    > *
    > * Now we can do something different, becuase for the first
    > * time the value in each k-bit field (k now being 4) is small
    > * enough that adding two k-bit fields results in a value that
    > * still fits in the k-bit field. The result is four 4-bit
    > * fields containing one of {0000,0001,...,0111,1000} and four
    > * more 4-bit fields containing junk (sums that are uninteresting).
    > * Pictorially:
    > * n = 0aaa0bbb0ccc0ddd0eee0fff0ggg0hhh
    > * n>>4 = 00000aaa0bbb0ccc0ddd0eee0fff0ggg
    > * sum = 0aaaWWWWiiiiXXXXjjjjYYYYkkkkZZZZ
    > * where W, X, Y, and Z are the interesting sums (each at most 1000,
    > * or 8 decimal). Masking with 0x0f0f0f0f extracts these.
    > *
    > * Now we can change tactics yet again, because now we have:
    > * n = 0000WWWW0000XXXX0000YYYY0000ZZZZ
    > * n>>8 = 000000000000WWWW0000XXXX0000YYYY
    > * so sum = 0000WWWW000ppppp000qqqqq000rrrrr
    > * where p and r are the interesting sums (and each is at most
    > * 10000, or 16 decimal). The sum `q' is junk, like i, j, and
    > * k above; but it is not necessarry to discard it this time.
    > * One more fold, this time by sixteen bits, gives
    > * n = 0000WWWW000ppppp000qqqqq000rrrrr
    > * n>>16 = 00000000000000000000WWWW000ppppp
    > * so sum = 0000WWWW000ppppp000sssss00tttttt
    > * where s is at most 11000 and t is it most 100000 (32 decimal).
    > *
    > * Now we have t = r+p = (Z+Y)+(X+W) = ((h+g)+(f+e))+((d+c)+(b+a)),
    > * or in other words, t is the number of bits set in the original
    > * 32-bit longword. So all we have to do is return the low byte
    > * (or low 6 bits, but `low byte' is typically just as easy if not
    > * easier).
    > *
    > * This technique is also applicable to 64 and 128 bit words, but
    > * 256 bit or larger word sizes require at least one more masking
    > * step.
    > */
    > int
    > tG_sumbits(n)
    > register unsigned long n;
    > {
    >
    > n = (n & 0x55555555) + ((n >> 1) & 0x55555555);
    > n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
    > n = (n + (n >> 4)) & 0x0f0f0f0f;
    > n += n >> 8;
    > n += n >> 16;
    > return (n & 0xff);
    > }
    >


    Nice optimization job on that parallel adder, Chris!

    I'm sure I'll find other gems as I read that posting!

    For myself, and on behalf of others, thanks for
    that post!

    Rufus
     
    Rufus V. Smith, Apr 14, 2005
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. sarmin kho
    Replies:
    2
    Views:
    855
    A. Lloyd Flanagan
    Jun 15, 2004
  2. Manish_Ganvir
    Replies:
    13
    Views:
    1,630
    Keith Thompson
    Feb 14, 2005
  3. Ioannis Vranos
    Replies:
    8
    Views:
    356
    James Kanze
    Nov 14, 2008
  4. Shao Miller

    Bitwise Operator Effects on Padding Bits

    Shao Miller, Sep 6, 2010, in forum: C Programming
    Replies:
    61
    Views:
    1,911
    Ben Bacarisse
    Oct 21, 2010
  5. Hannes
    Replies:
    6
    Views:
    8,582
    Paul Uiterlinden
    Feb 16, 2011
Loading...

Share This Page