function to count number of set bits

G

G Patel

Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it
on). Is it portable as far as C is concerned?

And also, if I declare a signed int in the main program and want to set
the msb to 1, can I do this (32bit ints)?:

int b = 0x8000000;
/* is the 0x80000000 taken as an unsigned long constant or a signed
int? */

int count = bitcount(b);
/* is this undefined- trying to send a negative int to bitcount
function? */
 
E

Eric Sosman

G said:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }

I can't think of a reason why this would fail on a 1s complement or
sign-mag machine (and can't find a non 2s compliment machine to try it
on). Is it portable as far as C is concerned?

It makes no difference what representation the machine
uses for negative integers, because `x' cannot be negative!
I see no portability or conformance problems in the code.
And also, if I declare a signed int in the main program and want to set
the msb to 1, can I do this (32bit ints)?:

int b = 0x8000000;

You're missing a zero here, I believe.
/* is the 0x80000000 taken as an unsigned long constant or a signed
int? */

Neither: Given the assumption of 32 bits, the constant
(with all seven zeroes) will be an `unsigned int'. Since the
value of this constant is too large for plain `int', trying to
convert it to `int' either produces an implementation-defined
result or raises an implementation-defined signal. On most 32-bit
implementations, `b' will be initialized to INT_MIN, -2147483648 --
but the Standard doesn't actually guarantee this.
int count = bitcount(b);
/* is this undefined- trying to send a negative int to bitcount
function? */

No: conversion the other way, from signed to unsigned,
is well-defined. However, the conversion might change the
number of one-bits in the representation. For example, the
representation of -1 on a signed magnitude machine has two
one-bits. Converting this to `unsigned' produces the value
UINT_MAX, which has at least sixteen one-bits. Avoiding such
surprises is one of the reasons to stick to `unsigned' types
for bit-fiddling.
 
G

Gregory Toomey

G said:
Hi, I'm wondering if anyone knows if the following function will
function properly as a set-bit counter on non 2s complement machines
(as K&R2 implies).

| int bitcount(unsigned x)
| {
| int count;
|
| for(count = 0; x != 0; count++, x &= (x-1))
| ;
|
| return count;
| }
Yes it will work.

Heres two counters for 32 bit integers:

int Count (unsigned int w)
{
w = (0x55555555 & w) + (0x55555555 & (w>> 1));
w = (0x33333333 & w) + (0x33333333 & (w>> 2));
w = (0x0f0f0f0f & w) + (0x0f0f0f0f & (w>> 4));
w = (0x00ff00ff & w) + (0x00ff00ff & (w>> 8));
w = (0x0000ffff & w) + (0x0000ffff & (w>>16));
return w;
}

/* slightly faster */
int Count (unsigned int w)
{
const unsigned int all1 = ~0;
const unsigned int mask1h = all1 / 3 << 1;
const unsigned int mask2l = all1 / 5;
const unsigned int mask4l = all1 / 17;
w -= (mask1h & w) >> 1;
w = (w & mask2l) + ((w>>2) & mask2l);
w = w + (w >> 4) & mask4l;
w += w >> 8;
w += w >> 16;
return w & 0xff;
}

gtoomey
 
G

G Patel

Eric said:
It makes no difference what representation the machine
uses for negative integers, because `x' cannot be negative!
I see no portability or conformance problems in the code.


You're missing a zero here, I believe.


Neither: Given the assumption of 32 bits, the constant
(with all seven zeroes) will be an `unsigned int'. Since the
value of this constant is too large for plain `int', trying to
convert it to `int' either produces an implementation-defined
result or raises an implementation-defined signal. On most 32-bit
implementations, `b' will be initialized to INT_MIN, -2147483648 --
but the Standard doesn't actually guarantee this.

Thank you. Now I get it, I went and checked the C89 draft and found:

The type of an integer constant is the first of the corresponding list
in which its value can be represented. Unsuffixed decimal: int, long
int, unsigned long int; unsuffixed octal or hexadecimal: int, unsigned
int, long int, unsigned long int; suffixed by the letter u or U:
unsigned int, unsigned long int; suffixed by the letter l or L: long
int, unsigned long int; suffixed by both the letters u or U and l or L:
unsigned long int .

But I'm wondering how the "-" plays into any of this (for negative
constants). Is the "-" part of the constant or is it just the constant
with the - unary operator imparted on it.

How would one explain these constants:

-0xF , -077, -56


No mention of "-" that go in front of constants in the standard. How
does the "-" affect the constant (I imagine it just takes all the
signed types in each list out of the picture).


[I wonder what the authors of my C textbook was thinking when he named
it: Teach yourself C in 24 hours ... they couldn't be more wrong]
 
B

Ben Pfaff

G Patel said:
But I'm wondering how the "-" plays into any of this (for negative
constants). Is the "-" part of the constant or is it just the constant
with the - unary operator imparted on it.

The latter.
 
G

G Patel

Ben said:
The latter.

Ok, so the - isn't part of the actual constant, only acts as a - unary
operator. But what about constants that "fall" into a unsigned type.
Wouldn't the whole expression be a sort of negative unsigned constant?
 
B

Ben Pfaff

G Patel said:
Ok, so the - isn't part of the actual constant, only acts as a - unary
operator. But what about constants that "fall" into a unsigned type.
Wouldn't the whole expression be a sort of negative unsigned constant?

Arithmetic in unsigned types "wraps around". A negative value in
an unsigned type is changed into a positive one by adding the
type's maximum value plus one until it is in range. Thus, -1u is
equal to UINT_MAX.
 
K

Kobu

Ben said:
The latter.

I don't think that is true. I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?
 
B

Ben Pfaff

Kobu said:
I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?

Where did you find anything in the standard that talks about
negative constants? The grammar shows `-' as a unary operator,
not as part of a primary expression.

I have no idea why you think this has anything to do with
assignment.
 
K

Keith Thompson

Kobu said:
I don't think that is true. I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant.

You're mistaken; see below.
If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).

Do you have an example where this matters? (Note that there are no
integer constants for types shorter than signed or unsigned int.)
Where in the standard did you read this?

C99 6.4.4.1 defines the syntax of an integer constant; it doesn't
allow for a leading sign.
 
K

Kobu

Keith said:
You're mistaken; see below.


Do you have an example where this matters? (Note that there are no
integer constants for types shorter than signed or unsigned int.)

Yeap, you're right. Your comments in parenthesis directed me to
textbook.
I've been enlightened.
 
C

CBFalconer

Keith said:
.... snip ...


Do you have an example where this matters? (Note that there are no
integer constants for types shorter than signed or unsigned int.)

Look at the way INT_MIN is usually defined in limits.h. For a 16
bit 2's complement int machine, it would normally be (-INT_MAX -
1). If it were written as -32768 it would create an overflow on
input.
 
L

Luke Wu

Kobu said:
I don't think that is true. I think the "-" sign in front of an
otherwise unadorned arithmetic constant is inherently part of the
constant. If we considered the "-" to be an "add on" that works just
as a unary - operator, then we would have to perform UAC/Promotion on
assignments (where otherwise, an assignment has no promotion/UAC, just
straight conversion to the type of the left operands).
Where in the standard did you read this?


Actually it is treated as a unary - operator. In terms of C, the
constant is just the arithmetic term without a sign. But of course,
with the - in front of it, the negation is calculated at compile time,
and the post-negated constant sits in memory along with corresponding
assembly opcode(s) (calculation is not done at run time, there will be
no evidence of the original positive constant within the instructions).
But run-time instructions at assembly level are not a C issue.
 
E

Eric Sosman

CBFalconer said:
... snip ...



Look at the way INT_MIN is usually defined in limits.h. For a 16
bit 2's complement int machine, it would normally be (-INT_MAX -
1). If it were written as -32768 it would create an overflow on
input.

Not an overflow, exactly, but a result with the wrong
type and the wrong value. Since the constant is too large
for `int' but within range for an `unsigned int' it would
have the latter type; in effect it would be 32768u. The
"negation" would, under the rules of unsigned arithmetic,
reduce modulo 65536u to yield the value 32768u again. Thus
you'd have the, er, "anomalous" condition INT_MIN > INT_MAX!

Note that `(int)-32768' wouldn't work, since the value
must be a constant expression and constant expressions can't
contain cast operators.
 
R

Richard Bos

Luke Wu said:
Actually it is treated as a unary - operator. In terms of C, the
constant is just the arithmetic term without a sign. But of course,
with the - in front of it, the negation is calculated at compile time,
and the post-negated constant sits in memory along with corresponding
assembly opcode(s) (calculation is not done at run time, there will be
no evidence of the original positive constant within the instructions).

Not necessarily. Nothing in the Standard forbids implementations from
computing the negation at run-time. Of course, nobody does so because
it's simple to do at compile-time and potentially a great win, but this
is not required.

Richard
 
L

Lawrence Kirby

CBFalconer wrote:
....


Not an overflow, exactly, but a result with the wrong
type and the wrong value. Since the constant is too large
for `int' but within range for an `unsigned int' it would
have the latter type; in effect it would be 32768u. The
"negation" would, under the rules of unsigned arithmetic,
reduce modulo 65536u to yield the value 32768u again. Thus
you'd have the, er, "anomalous" condition INT_MIN > INT_MAX!

Note that `(int)-32768' wouldn't work, since the value
must be a constant expression and constant expressions can't
contain cast operators.

Casts are allowed in constant expressions although there are some
restrictions which don't apply here. The problem with (int)-32768 where
INT_MAX is 32767 and UINT_MAX is 65535 is that it is equivalent to
(int)(32768U) i.e. you are trying to convert to a signed integer type a
value that is not representable in that type. That's undefined in C90 and
in C99 you get an implementation-defined value or signal.

Lawrence
 
K

Keith Thompson

Lawrence Kirby said:
Casts are allowed in constant expressions although there are some
restrictions which don't apply here. The problem with (int)-32768 where
INT_MAX is 32767 and UINT_MAX is 65535 is that it is equivalent to
(int)(32768U) i.e. you are trying to convert to a signed integer type a
value that is not representable in that type. That's undefined in C90 and
in C99 you get an implementation-defined value or signal.

And if the implementation-defined value happens to be -32768, the
implementation is free to use it as the definition of INT_MIN. (There
are good reasons not to do so. Keeping <limits.h> in sync with the
compiler might be non-trivial, and you might as well have a portable
definition of INT_MIN if you can.)
 
J

Joe Wright

Ben said:
Where did you find anything in the standard that talks about
negative constants? The grammar shows `-' as a unary operator,
not as part of a primary expression.
#define EOF (-1)

Is '-' part of a primary expression?

#define LONG_MIN (-2147483647L-1L)

How about here?

I must be missing something. Help me here. Buy me a beer in Menlo Park.
 
B

Ben Pfaff

Joe Wright said:
#define EOF (-1)

Is '-' part of a primary expression?

#define LONG_MIN (-2147483647L-1L)

How about here?

1 primary-expression:
identifier
constant
string-literal
( expression )

You're trying to make some kind of pedantic point by saying that
you parenthesized it, therefore it's a primary expression. But
you know what my point is: there is no alternative listed as
`- constant', nor is the negative sign part of `constant'
itself (except perhaps as part of an exponent, etc.). There is
no such thing as a negative constant, only a constant to which
you apply a negating unary operator.
 
K

Keith Thompson

Joe Wright said:
#define EOF (-1)

Is '-' part of a primary expression?

#define LONG_MIN (-2147483647L-1L)

How about here?

Both of these are primary expressions because they're parenthesized.

Without the parentheses, neither -1 nor -2147483647L-1L is a primary
expression.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,901
Latest member
Noble71S45

Latest Threads

Top