Example of the optimiser recognising a pattern

Tomás Ó hÉilidhe · May 2, 2008

I'm working with a microcontroller at the moment that has a single
instruction for clearing a bit in a byte.

I started off with the following line of code:

x &= ~0x8u; /* Clear the 4th bit */

But then I changed it to the following because I thought I might get
more efficient assembler out of it:

x &= 0xF7u; /* Clear the 4th bit */

Surprisingly, the compiler produced more efficient code for the latter,
presumably because it recognises the pattern of "x &= ~y" for
clearing a single bit.

Anyway just thought I'd give an example of someone winding up with
less efficient code when their aim was to make the code more
efficient :-D

Tomás Ó hÉilidhe · May 2, 2008

Surprisingly, the compiler produced more efficient code for the latter,

Obviously that should be "former". That's what I get for writing
constructs that I don't use in my everyday speech.

Ian Collins · May 3, 2008

Tomás Ó hÉilidhe said:
I'm working with a microcontroller at the moment that has a single
instruction for clearing a bit in a byte.

I started off with the following line of code:

x &= ~0x8u; /* Clear the 4th bit */

But then I changed it to the following because I thought I might get
more efficient assembler out of it:

x &= 0xF7u; /* Clear the 4th bit */

Surprisingly, the compiler produced more efficient code for the latter,
presumably because it recognises the pattern of "x &= ~y" for
clearing a single bit.

Odd, is x an unsigned 8-bit type? If so, the two expressions should
generate identical code.
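
The claim that the two expressions are equivalent for an unsigned 8-bit x can be checked exhaustively. The following is a minimal sketch, assuming a hosted C99 compiler that provides uint8_t; the function names are hypothetical and are not from the thread. It only shows that the results are the same, not that any given compiler emits the same instructions for both.

```c
#include <assert.h>
#include <stdint.h>

/* Clear bit 3 using the complemented single-bit constant.
   ~0x8u is evaluated as unsigned int; the result is narrowed on assignment. */
uint8_t clear_bit3_not(uint8_t x)
{
    x &= ~0x8u;
    return x;
}

/* Clear bit 3 using the hand-written 8-bit mask. */
uint8_t clear_bit3_mask(uint8_t x)
{
    x &= 0xF7u;
    return x;
}

/* Count the 8-bit inputs for which the two forms disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned v = 0; v <= 0xFFu; ++v) {
        if (clear_bit3_not((uint8_t)v) != clear_bit3_mask((uint8_t)v))
            ++mismatches;
    }
    return mismatches;
}

/* Hypothetical self-check: the two forms agree on every 8-bit value. */
void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_equivalence() from a test program should pass silently, because the upper bits of ~0x8u are discarded when the result is stored back into an 8-bit object.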

Tomás Ó hÉilidhe · May 3, 2008

Ian Collins:

Odd, is x an unsigned 8 bit type?

Yes, it is.

If so, the two expressions should
generate identical code.

If I do:

y &= ~0x08u;

then I get the following assembler:

BCF y, 0x3 /* Clear the 4th bit of y */

If I do:

y &= 0x7Fu;

then I get the following assembler:

MOVLW 0x7f /* Load the accumulator with 0x7f */
ANDWF y, F /* AND y with the accumulator
and store the result in y */

The former is one instruction, while the latter is two, and as all
instructions take the same number of CPU cycles, the latter version
takes exactly twice as long.

Not only that, but things get even worse if you do the following:

if (whatever) y &= ~0x08u;

versus:

if (whatever) y &= 0x7Fu;

On the PIC microcontroller, there's an instruction that does the
following: "Check whether the last arithmetic operation resulted in
zero, and if so, skip the next instruction". Since the former version
consists of a single instruction, this single instruction can be
skipped by the conditional. However, in the case of the latter form,
which consists of two instructions, there has to be an intervening
goto statement. Result: WAY slower.

Bartc · May 3, 2008

Tomás Ó hÉilidhe said:
Ian Collins:

Yes, it is.

If I do:

y &= ~0x08u;

then I get the following assembler:

BCF y, 0x3 /* Clear the 4th bit of y */

If I do:

y &= 0x7Fu;

then I get the following assembler:

MOVLW 0x7f /* Load the accumulator with 0x7f */
ANDWF y, F /* AND y with the accumulator
and store the result in y */

(You meant 0xF7 here?)

Typically a compiler will reduce ~0x8u down to 0xF7u anyway, so there
shouldn't be a difference.

Unless ~0x8u actually generates 0xFFF7u? What's the default uint size on
this compiler? What does y &= 0xFFF7u compile to, if anything? What about
y &= 0x0A?

Or possibly it's just a quirk in the compiler's optimiser. File a bug
report.

-- Bartc
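
The questions about the width of unsigned int and the value of ~0x8u can be answered with a few lines of C. This is a minimal sketch with hypothetical function names, assuming only a conforming compiler with limits.h; it says nothing about what the PIC compiler in question actually does.

```c
#include <assert.h>
#include <limits.h>

/* Number of value bits in unsigned int, counted from UINT_MAX. */
int uint_value_bits(void)
{
    int bits = 0;
    for (unsigned v = UINT_MAX; v != 0; v >>= 1)
        ++bits;
    return bits;
}

/* The constant as the compiler sees it, before any narrowing to the type of y. */
unsigned not_mask(void)
{
    return ~0x8u;
}

/* The low eight bits of that constant, i.e. what matters for an 8-bit y. */
unsigned not_mask_low_byte(void)
{
    return not_mask() & 0xFFu;
}

/* Hypothetical self-check of the facts discussed in the thread. */
void check_mask_width(void)
{
    assert(uint_value_bits() >= 16);          /* guaranteed by the C standard */
    assert(not_mask() == UINT_MAX - 0x8u);    /* every bit set except bit 3 */
    assert(not_mask_low_byte() == 0xF7u);     /* identical to the 8-bit mask */
}
```

With a 16-bit unsigned int, not_mask() is 0xFFF7u, which matches the value suggested in the post above; with wider types it simply has more leading F digits.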

vippstar · May 3, 2008

(You meant 0xF7 here?)

Typically a compiler will reduce ~0x8u down to 0xF7u anyway, so there
shouldn't be a difference.

Unless ~0x8u actually generates 0xFFF7u? What's the default uint size on

Yes, ~0x8u would be 0xF...7 and not 0xF7. (I chose to put an
ellipsis and not a number of F's because it's not possible to know how
many F's there are.)
In the latter, 0xF7 would be int, and thus 0x00F7 and not 0xFFF7. Most
likely what the optimizer actually recognizes is a constant with all
bits set except one. In the latter case it's not clear whether you're
trying to clear the 4th bit only or the other bits too (9th to 16th bits).

To understand,

unsigned int c;
c = UINT_MAX; /* all bits 1 */
/* different output */
printf("unsigned int context: %u, %u\n", c & ~0x8u, c & 0xF7u);
/* same output */
printf("unsigned char context: %hhu, %hhu\n",
       (unsigned char)(c & ~0x8u), (unsigned char)(c & 0xF7u));

So they are different, depending on type context. The compiler's
optimizer just isn't advanced enough to recognize that.
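
The guess that the optimizer looks for a constant with "all bits except one" set can be sketched as a small predicate. Everything here is an assumption made for illustration: the helper name clears_exactly_one_bit is hypothetical, a 32-bit working width stands in for the compiler's unsigned int, and the thread does not reveal how the real compiler performs the match.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if ANDing with `mask` clears exactly one bit within the
   low `width` bits (width is expected to be between 1 and 32). */
int clears_exactly_one_bit(uint32_t mask, unsigned width)
{
    uint32_t field = (width >= 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << width) - 1);
    uint32_t zeros = ~mask & field;   /* the bits the AND would clear */

    /* A nonzero value with a single set bit is a power of two. */
    return zeros != 0 && (zeros & (zeros - 1)) == 0;
}

/* Hypothetical self-check contrasting the two widths. */
void check_pattern(void)
{
    assert(clears_exactly_one_bit(0xFFFFFFF7u, 32));  /* ~0x8u at 32 bits */
    assert(!clears_exactly_one_bit(0xF7u, 32));       /* 0xF7u at 32 bits */
    assert(clears_exactly_one_bit(0xF7u, 8));         /* 0xF7u at 8 bits  */
}
```

Under these assumptions, a matcher that works at the width of unsigned int would accept ~0x8u and reject 0xF7u, while one that works at the width of the 8-bit destination would accept both. That would be consistent with the behaviour reported in the thread, but it remains a guess.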

vippstar · May 3, 2008

What the others are saying here is that if the size of 'int' on your platform is
greater than 1 byte, then these two pieces of code are not equivalent.

Actually that's not the case.
It doesn't matter whether int is 1 byte or more, since int is at least
16 bits, the operators are well-defined, et cetera.
