clearing low order bits / making a multiple of power of 2

J

John Devereux

Hi,

What is the best way to ensure an integer is a multiple of a given
power of 2?

How about

int size;

...

size ^= (size & (4-1)); /* ensure multiple of 4 */

or

size &= -4; /* ensure multiple of 4 */

?

Thanks,
 
K

Keith Thompson

John Devereux said:
What is the best way to ensure an integer is a multiple of a given
power of 2?

i = 0;

If that doesn't do what you want, you should be more explicit about
your requirements. Defining the problem precisely often gets you at
least halfway to solving it.
 
T

Tim Rentsch

John Devereux said:
Hi,

What is the best way to ensure an integer is a multiple of a given
power of 2?

How about

int size;

...

size ^= (size & (4-1)); /* ensure multiple of 4 */

or

size &= -4; /* ensure multiple of 4 */

size -= size % 4;
 
J

John Devereux

Keith Thompson said:
i = 0;

If that doesn't do what you want, you should be more explicit about
your requirements. Defining the problem precisely often gets you at
least halfway to solving it.

Well, the examples you snipped were supposed to do that! :) But I will
try to say it in words. I also now realise I should have used unsigned
ints.

i is an unsigned integer (0,1,2,3,4,5 etc)

n is a compile-time unsigned constant integer (or literal) that is an
unsigned integer power of 2 (1,2,4,8 etc).

What I need is to round i down to the nearest multiple of n, or leave
unchanged if already a multiple of n. (The application is to generate
aligned addresses for low-level programming).

I know it's fairly trivial, I was just wondering what the "standard"
idiom was for this.
 
J

John Devereux

Tim Rentsch said:
size -= size % 4;

That does express what I want more directly. I would worry about
generating a call to a divide function. It doesn't on my gcc, even
with no optimisation. (With optimisation on, it reduces to 2
instructions on arm).

But my ones are better, both reducing to one instruction :)

-O0 -O2

size -= size % 4; 5 2

size ^= (size & (4-1)); 5 1

size &= -4; 3 1


Apologies if this sort of detail is off-topic for the group.
 
T

Tim Rentsch

John Devereux said:
That does express what I want more directly. I would worry about
generating a call to a divide function. It doesn't on my gcc, even
with no optimisation. (With optimisation on, it reduces to 2
instructions on arm).

But my ones are better, both reducing to one instruction :)

-O0 -O2

size -= size % 4; 5 2

size ^= (size & (4-1)); 5 1

size &= -4; 3 1


Apologies if this sort of detail is off-topic for the group.

An advantage of 'size -= size % 4;' is that it works for the case
you stated, where the variable type is 'int'. It also works for
values other than powers of two. And, what it's doing is obvious;
there's no need to decipher bit values or operations, or wonder if
on some platform they might blow up in your face after promoting to
an overly long int type. The generated code will most likely be
good or even very good when the variable is unsigned and the
rounding down is done to a power of two.

If it were important for some reason to use one of the bitwise
forms, I would normally recommend

size &= ADDRESS_MASK; /* or some other appropriate name */

to show what's going on. Meaning clear, and easy to generate good
code. Good practice would define ADDRESS_MASK with a type that
would guarantee that the AND'ing operation were done with unsigned
types rather than signed types - something like

#define ADDRESS_MASK ((uintmax_t)-4)

This definition prevents surprises that might arise from C's rules
for how different types of values are converted in expressions.
 
J

John Devereux

Tim Rentsch said:
An advantage of 'size -= size % 4;' is that it works for the case
you stated, where the variable type is 'int'. It also works for
values other than powers of two. And, what it's doing is obvious;
there's no need to decipher bit values or operations, or wonder if
on some platform they might blow up in your face after promoting to
an overly long int type. The generated code will most likely be
good or even very good when the variable is unsigned and the
rounding down is done to a power of two.

All agreed.
If it were important for some reason to use one of the bitwise
forms, I would normally recommend

size &= ADDRESS_MASK; /* or some other appropriate name */

to show what's going on. Meaning clear, and easy to generate good
code. Good practice would define ADDRESS_MASK with a type that
would guarantee that the AND'ing operation were done with unsigned
types rather than signed types - something like

#define ADDRESS_MASK ((uintmax_t)-4)

This definition prevents surprises that might arise from C's rules
for how different types of values are converted in expressions.

The issue of "surprises" is why I posted here; If it turns out that

size &= -4;

always in fact works, then I think I would prefer it. It is concise
and easy to remember. But if there is doubt, then your solutions are
obviously much better.
 
A

Alex Fraser

[snip]
The issue of "surprises" is why I posted here; If it turns out that

size &= -4;

always in fact works, then I think I would prefer it. It is concise
and easy to remember. But if there is doubt, then your solutions are
obviously much better.

If the type of "size" is unsigned int or a larger unsigned type, the above
will always work (ie round down to a multiple of four).

With two's complement representation, I think it will always round towards
negative infinity if "size" is any integer type.

Alex
 
T

Tim Rentsch

John Devereux said:
The issue of "surprises" is why I posted here; If it turns out that

size &= -4;

always in fact works, then I think I would prefer it. It is concise
and easy to remember. But if there is doubt, then your solutions are
obviously much better.

There are two reasons to prefer 'size & ADDRESS_MASK' to
'size & -4'. The first is just good development practice:
normally it's good to avoid "magic constants" in open code,
and -4 is a "magic constant". I know this factor isn't what
you asked about, but even so I think it's worth mentioning.

The other reason has to do with correctness. The expression
'size & -4' may work or it may not, depending on the type of
size. As a reader, I have to wonder about the type of 'size'
in evaluating what it does. The rules of type promotion in
C aren't understood exactly by many developers, and even
those who understand them pretty well can be caught off
guard at times. For example, if we have:

uint32_t size = initializing_expression();
size &= -4;

it's possible that the second line won't do the right thing.
The reason is that, on a platform where 'int' is 64 bits,
the operation is done using 'int' operands rather than
'unsigned int' operands. Granted, the likelihood of getting
unexpected behavior may be small; but still it is possible,
and developers who are careful concern themselves with such
things.

Let me also mention another approach that you might find
useful. If we define:

#define ROUND_MOD(v,m) \
( CAN_USE_ANDING(m) ? (v) & -(uintmax_t)(m) : (v) - (v)%(m) )

#define CAN_USE_ANDING(m) ( (m) > 0 && ((m) & (m)-1) == 0 )

then we can write:

size = ROUND_MOD(size,4);

Of course, the expressions involved look horrendous. But
even compilers that only fair at optimizing should be able
to fold all the constants and generate code that's just as
good as if the AND operation had been written directly.
Moreover, this definition of ROUND_MOD will work regardless
of whether 'm' is a power of two or not: if it is, AND'ing
is used; if it isn't, the more general expression using '%'
is used instead.
 
T

Tim Rentsch

Alex Fraser said:
[snip]
The issue of "surprises" is why I posted here; If it turns
out that

size &= -4;

always in fact works, then I think I would prefer it. It is
concise and easy to remember. But if there is doubt, then
your solutions are obviously much better.

If the type of "size" is unsigned int or a larger unsigned
type, the above will always work (ie round down to a
multiple of four).
Right.


With two's complement representation, I think it will always
round towards negative infinity if "size" is any [signed]
integer type.

Not quite. It's possible in this case that the '&' with a
negative operand will yield a trap representation. (Trap
representations are possible even when the implementation
uses a two's complement representation.) That means the
possibility of undefined behavior rather than a valid value.
 
J

John Devereux

Tim Rentsch said:
....

There are two reasons to prefer 'size & ADDRESS_MASK' to
'size & -4'. The first is just good development practice:
normally it's good to avoid "magic constants" in open code,
and -4 is a "magic constant". I know this factor isn't what
you asked about, but even so I think it's worth mentioning.

The other reason has to do with correctness. The expression
'size & -4' may work or it may not, depending on the type of
size. As a reader, I have to wonder about the type of 'size'
in evaluating what it does. The rules of type promotion in
C aren't understood exactly by many developers, and even
those who understand them pretty well can be caught off
guard at times. For example, if we have:

uint32_t size = initializing_expression();
size &= -4;

it's possible that the second line won't do the right thing.
The reason is that, on a platform where 'int' is 64 bits,
the operation is done using 'int' operands rather than
'unsigned int' operands. Granted, the likelihood of getting
unexpected behavior may be small; but still it is possible,
and developers who are careful concern themselves with such
things.


OK, but I wonder does it in fact ever do the wrong thing, even for the
case you mention (i.e. using int operands)? Or any other integral
operands?
Let me also mention another approach that you might find
useful. If we define:

#define ROUND_MOD(v,m) \
( CAN_USE_ANDING(m) ? (v) & -(uintmax_t)(m) : (v) - (v)%(m) )

#define CAN_USE_ANDING(m) ( (m) > 0 && ((m) & (m)-1) == 0 )

then we can write:

size = ROUND_MOD(size,4);

Of course, the expressions involved look horrendous. But
even compilers that only fair at optimizing should be able
to fold all the constants and generate code that's just as
good as if the AND operation had been written directly.
Moreover, this definition of ROUND_MOD will work regardless
of whether 'm' is a power of two or not: if it is, AND'ing
is used; if it isn't, the more general expression using '%'
is used instead.

Interesting, thanks.
 
T

Tim Rentsch

John Devereux said:
Tim Rentsch said:
[snip]
The expression
'size & -4' may work or it may not, depending on the type of
size. As a reader, I have to wonder about the type of 'size'
in evaluating what it does. The rules of type promotion in
C aren't understood exactly by many developers, and even
those who understand them pretty well can be caught off
guard at times. For example, if we have:

uint32_t size = initializing_expression();
size &= -4;

it's possible that the second line won't do the right thing.
The reason is that, on a platform where 'int' is 64 bits,
the operation is done using 'int' operands rather than
'unsigned int' operands. Granted, the likelihood of getting
unexpected behavior may be small; but still it is possible,
and developers who are careful concern themselves with such
things.


OK, but I wonder does it in fact ever do the wrong thing, even
for the case you mention (i.e. using int operands)? Or any other
integral operands?

Getting a question like this one, especially in comp.lang.c,
it's usually good to give three answers.

Seen in the light of what the standard says, it's certainly
possible that there will be undefined behavior; there are
some grey areas in the standard, but this isn't one of them.
That settles the issue as far as comp.lang.c is concerned.

At the other end of the spectrum, practically speaking,
most developers simply won't encounter an implementation
where 'size &= -4;' misbehaves. If you had to bet, unless
there is some reason to believe that something about the
situation is unusual, the right bet is that 'size &= -4;'
will behave as desired.

In between, my guess is there are some implementations --
even if you and I are unlikely to encounter them -- where
'size &= -4;' doesn't do what you want (for some values of
signed integer types). Because I don't know and because I
think it's likely that such implmentations are out there
_somewhere_, I think it's wise to steer clear of potential
undefined behavior. This question is one where it's very
hard to be sure. So, it seems better to choose the safer
path.

Furthermore it's good to develop the habit. Even if this
particular case never causes you a problem, having the habit
of writing code that is safe across widely different
implementations will probably spare you some later grief.
 
A

Alex Fraser

Tim Rentsch said:
Alex Fraser said:
[snip]
The issue of "surprises" is why I posted here; If it turns
out that

size &= -4;

always in fact works, then I think I would prefer it. It is
concise and easy to remember. But if there is doubt, then
your solutions are obviously much better.

If the type of "size" is unsigned int or a larger unsigned
type, the above will always work (ie round down to a
multiple of four).
Right.

With two's complement representation, I think it will always
round towards negative infinity if "size" is any [signed]
integer type.

I wrote "any integer type" - I meant precisely that, signed or unsigned.
Not quite. It's possible in this case that the '&' with a
negative operand will yield a trap representation.

I can't find any explicit indication of this, but in any case it is highly
unlikely in practice.

Alex
 
J

John Devereux

Tim Rentsch said:
John Devereux said:
Tim Rentsch said:
[snip]
The expression
'size & -4' may work or it may not, depending on the type of
size. As a reader, I have to wonder about the type of 'size'
in evaluating what it does. The rules of type promotion in
C aren't understood exactly by many developers, and even
those who understand them pretty well can be caught off
guard at times. For example, if we have:

uint32_t size = initializing_expression();
size &= -4;

it's possible that the second line won't do the right thing.
The reason is that, on a platform where 'int' is 64 bits,
the operation is done using 'int' operands rather than
'unsigned int' operands. Granted, the likelihood of getting
unexpected behavior may be small; but still it is possible,
and developers who are careful concern themselves with such
things.


OK, but I wonder does it in fact ever do the wrong thing, even
for the case you mention (i.e. using int operands)? Or any other
integral operands?

Getting a question like this one, especially in comp.lang.c,
it's usually good to give three answers.

Seen in the light of what the standard says, it's certainly
possible that there will be undefined behavior; there are
some grey areas in the standard, but this isn't one of them.
That settles the issue as far as comp.lang.c is concerned.

At the other end of the spectrum, practically speaking,
most developers simply won't encounter an implementation
where 'size &= -4;' misbehaves. If you had to bet, unless
there is some reason to believe that something about the
situation is unusual, the right bet is that 'size &= -4;'
will behave as desired.

In between, my guess is there are some implementations --
even if you and I are unlikely to encounter them -- where
'size &= -4;' doesn't do what you want (for some values of
signed integer types). Because I don't know and because I
think it's likely that such implmentations are out there
_somewhere_, I think it's wise to steer clear of potential
undefined behavior. This question is one where it's very
hard to be sure. So, it seems better to choose the safer
path.

Furthermore it's good to develop the habit. Even if this
particular case never causes you a problem, having the habit
of writing code that is safe across widely different
implementations will probably spare you some later grief.

Thanks for the thoughtful response.

I think I will use your original

size -= size%4;

If execution time is absolutely critical, I will use

size &= -4;

and check things carefully.

I will steer clear of the "macro magic", although I appreciate the
technique.

Thanks,
 
J

John Devereux

Alexei A. Frounze said:
...

Good optimizer will optimize size%4 to faster and simpler analog (to bitwise
AND).

Yes, I reported before my results on gcc-arm. (3.3.2).
-O0 -O2

size -= size % 4; 5 2

size ^= (size & (4-1)); 5 1

size &= -4; 3 1

But, I just tried with the latest gcc (4.1) and it now compiles all
three methods down to the same single machine instruction!

bic r0, r0, #3

So there is no excuse for me not to use the correct method.
 
O

Old Wolf

Keith said:
i = 0;

If that doesn't do what you want, you should be more explicit about
your requirements. Defining the problem precisely often gets you at
least halfway to solving it.

I hate to be the bearer of bad news, but 0 is not a power of 2. :)
 
T

Tim Rentsch

Alex Fraser said:
Tim Rentsch said:
Alex Fraser said:
[snip]
The issue of "surprises" is why I posted here; If it turns
out that

size &= -4;

always in fact works, then I think I would prefer it. It is
concise and easy to remember. But if there is doubt, then
your solutions are obviously much better.

If the type of "size" is unsigned int or a larger unsigned
type, the above will always work (ie round down to a
multiple of four).
Right.

With two's complement representation, I think it will always
round towards negative infinity if "size" is any [signed]
integer type.

I wrote "any integer type" - I meant precisely that, signed or unsigned.

Ok. I guessed you were talking about signed because of the
description of rounding toward negative infinity. For
unsigned there is no difference between rounding toward
negative infinity and rounding toward zero.

I can't find any explicit indication of this, but in any case it is highly
unlikely in practice.

I agree it's unlikely to encounter such an implementation.
I don't know of one. However, the standard allows them, and
they certainly could be useful in ways that implementations
without trap representations aren't, so I wouldn't be
surprised to find that there are some.

The explicit indication is in 6.2.6.2 p2, together with the
description of how '&' works.
 
A

Alex Fraser

Tim Rentsch said:
With two's complement representation, I think it will always
round towards negative infinity if "size" is any [signed]
integer type.

I wrote "any integer type" - I meant precisely that, signed or
unsigned.

Ok. I guessed you were talking about signed because of the
description of rounding toward negative infinity. For
unsigned there is no difference between rounding toward
negative infinity and rounding toward zero.

Exactly; I was too lazy to be explicit. False economy :).
I can't find any explicit indication of this, but in any case it is
highly unlikely in practice.
[snip]
The explicit indication is in 6.2.6.2 p2, together with the
description of how '&' works.

Because the bitwise AND of the padding bits in the (promoted) operands may
generate a trap representation?

Alex
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top