Defeating Optimisation for memcmp()

Martin · Nov 10, 2007

Please consider the following code fragment. Assume UINT32 is a typedef
suitable for defining variables of 32 bits, and that ui32 is initialised.

UINT32 ui32;
/* ... */
/* assume ui32 now is holding a value in uc's range */
unsigned char uc = (unsigned char) ui32; /* tell Lint you know target
type is smaller */
cmos->uc = uc;

/* Check the CMOS write was successful */
/* NOTE: I know third arg evals to 1, but sizeof is used for readability
*/
if (memcmp(cmos->uc, &uc, sizeof(unsigned char)) != 0)
/* CMOS write failed */

I would expect a decent compiler to optimize away the memcmp(). Would you
agree that this change will ensure the memcmp is not optimized away?:

UINT32 ui32;
/* ... */
/* assume ui32 now is holding a value in uc's range */
volatile unsigned char uc = (unsigned char) ui32; /* tell Lint you know
target type is smaller */
cmos->uc = uc;

/* Check the CMOS write was successful */
if (memcmp(cmos->uc, &uc, sizeof(unsigned char)) != 0)
/* CMOS write failed */

Is a call to a library function (memcmp) less likely to be optimised away
than use of relational operators (a) when one of the operands is volatile,
and (b) when neither are volatile?

For example,

if ( cmos->uc != uc )
/* CMOS write failed */

compared to

if (memcmp(cmos->uc, &uc, sizeof(unsigned char)) != 0)
/* CMOS write failed */

Thank-you in advance.

Ben Bacarisse · Nov 10, 2007

Martin said:
Please consider the following code fragment. Assume UINT32 is a typedef
suitable for defining variables of 32 bits, and that ui32 is initialised.

UINT32 ui32;
/* ... */
/* assume ui32 now is holding a value in uc's range */
unsigned char uc = (unsigned char) ui32; /* tell Lint you know target
type is smaller */
cmos->uc = uc;

/* Check the CMOS write was successful */
/* NOTE: I know third arg evals to 1, but sizeof is used for readability
*/
if (memcmp(cmos->uc, &uc, sizeof(unsigned char)) != 0)

Presumably you mean '&cmos->uc'? The same typo appears several times
so I am not sure. BTW, if you want to document the size, I'd prefer
to write:

if (memcmp(&cmos->uc, &uc, sizeof uc) != 0)

I would expect a decent compiler to optimize away the memcmp(). Would you
agree that this change will ensure the memcmp is not optimized away?:

<snip code that makes uc volatile>

Yes, the compiler must generate code to examine the value in the
volatile variable.

Is a call to a library function (memcmp) less likely to be optimised away
than use of relational operators (a) when one of the operands is volatile,
and (b) when neither are volatile?

For example,

if ( cmos->uc != uc )
/* CMOS write failed */

compared to

if (memcmp(cmos->uc, &uc, sizeof(unsigned char)) != 0)
/* CMOS write failed */

Both must test uc. Of course, this is a bit of a lie. The volatile
object is the structure pointed to by 'cmos' and I'd rather write it
that way:

volatile struct Something *cmos;

Martin · Nov 10, 2007

Ben Bacarisse said:
Presumably you mean '&cmos->uc'? The same typo appears several times
so I am not sure.

It's my mistake Ben - I did miss the ampersand for memcmp()'s first
argument. It should of course be as you say.

Eric Sosman · Nov 10, 2007

Martin said:
[...]
Is a call to a library function (memcmp) less likely to be optimised away
than use of relational operators (a) when one of the operands is volatile,
and (b) when neither are volatile?

The call has undefined behavior if *either* operand is
volatile. 6.7.3p5:

[...] If an attempt is made to refer to an object
defined with a volatile-qualified type through use of
an lvalue with non-volatile-qualified type, the
behavior is undefined.

There's a quibble: If memcmp is not implemented in C, it does
not use C lvalues to access anything at all and thus might
skirt the prohibition. But then there's 7.1.4p1:

If an argument to a function has an invalid value (such
as a value outside the domain of the function, or a
pointer outside the address space of the program, or a
null pointer, or a pointer to non-modifiable storage when
the corresponding parameter is not const-qualified) [...]
the behavior is undefined.

True, volatile is not mentioned. However, the list of ways in
which an argument can be invalid is prefaced by "such as," a
construct suggestive of a non-exhaustive list.

Finally, there's 6.3.2.3p2:

For any qualifier q, a pointer to a non-q-qualified type
may be converted to a pointer to the q-qualified version
of the type [...]

The point is that the conversion in the other direction is not
described as legal: You can add qualifiers in a conversion, but
you cannot subtract them. If you hand a pointer-to-volatile to
memcmp (which expects a pointer-to-const), I think the compiler
is required to issue a diagnostic -- and if it then accepts and
runs the program anyhow, all bets are off.

The solution to your problem is to qualify the "CMOS" thing
as volatile, and use an ordinary comparison:

struct {
...
volatile unsigned char uc;
...
} *cmos = ...;

cmos->uc = uc;
if (cmos->uc == uc) ...

The compiler is not permitted to optimize away any of the accesses
to cmos->uc, because the volatile qualifier declares that those
accesses have side-effects, just like calls to putchar(). The
compiler cannot "just know" that the comparison will turn out
true; it must actually perform it. That's what volatile does.
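
A minimal sketch of that suggestion, assuming a hypothetical
`struct cmos_regs` layout and a helper named `cmos_write_verified`
(neither name appears in the original code; only the `uc` member does).
With an ordinary object standing in for the real CMOS the read-back
always matches; on real hardware the read-back is where a failed write
would show up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register layout; only the uc member comes from the thread. */
struct cmos_regs {
    volatile unsigned char uc;
};

/* Write value, then read it back. Both accesses go through a
   volatile-qualified lvalue, so the compiler has to perform them.
   Returns 1 if the read-back matches, 0 otherwise. */
int cmos_write_verified(struct cmos_regs *cmos, unsigned char value)
{
    assert(cmos != NULL);
    cmos->uc = value;          /* volatile write */
    return cmos->uc == value;  /* volatile read, then an ordinary compare */
}
```

No memcmp() is involved, so no qualifier is discarded anywhere.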

Ben Bacarisse · Nov 10, 2007

Eric Sosman said:
Martin said:

[...]
Is a call to a library function (memcmp) less likely to be optimised
away than use of relational operators (a) when one of the operands
is volatile, and (b) when neither are volatile?

The call has undefined behavior if *either* operand is
volatile. 6.7.3p5:

[...]

The point is that the conversion in the other direction is not
described as legal: You can add qualifiers in a conversion, but
you cannot subtract them.

I stand (implicitly) corrected. I should have seen this. This is why
c.l.c is so worth reading. To the OP: ignore (most of) what I wrote!

Martin · Nov 11, 2007

Eric Sosman said:
The solution to your problem is to qualify the "CMOS" thing
as volatile, and use an ordinary comparison:

struct {
...
volatile unsigned char uc;
...
} *cmos = ...;

cmos->uc = uc;
if (cmos->uc == uc) ...

The compiler is not permitted to optimize away any of the accesses
to cmos->uc, because the volatile qualifier declares that those
accesses have side-effects, just like calls to putchar(). The
compiler cannot "just know" that the comparison will turn out
true; it must actually perform it. That's what volatile does.

Thanks for your comments. I don't think making the cmos variable volatile is
an option.

Why can't I make uc volatile?

volatile unsigned char uc = (unsigned char) ui32;
/* ... init uc ... */
cmos->uc = uc;
if ( cmos->uc != uc )
/* error writing to CMOS */

Ben Bacarisse · Nov 11, 2007

Martin said:
Thanks for your comments. I don't think making the cmos variable volatile is
an option.

Why can't I make uc volatile?

volatile unsigned char uc = (unsigned char) ui32;
/* ... init uc ... */
cmos->uc = uc;
if ( cmos->uc != uc )
/* error writing to CMOS */

Eric Sosman's comment was about using memcmp -- specifically that
passing memcmp a pointer to a volatile object is not permitted. You
can use a != test but...

Making uc volatile won't work. The compiler may assume that cmos->uc
is set as per the assignment (it need not access the object again). I
can't see a way round this other than making the object that is
actually volatile, volatile. Forcing the compiler to re-access uc to
compare it against the value that it may have squirreled away as the
assumed contents of cmos->uc will not help you.

CBFalconer · Nov 11, 2007

Eric said:
... snip ...

The compiler is not permitted to optimize away any of the accesses
to cmos->uc, because the volatile qualifier declares that those
accesses have side-effects, just like calls to putchar(). The
compiler cannot "just know" that the comparison will turn out
true; it must actually perform it. That's what volatile does.

Not side-effects. The variable may 'spontaneously' change between
reads. There is no reason to insist on a write between reads.

Chris Torek · Nov 11, 2007

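Eric Sosman said:
The solution to your problem is to qualify the "CMOS" thing
as volatile, and use an ordinary comparison:

(Or, as I tend to prefer, make the structure type's elements
ordinary, non-"volatile"-qualified, but use a volatile qualifier
on the pointer itself: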

struct whatever { ... unsigned char uc; ... };
volatile struct whatever *cmos = ...;

This allows one to copy the entire data structure into ordinary
RAM, then manipulate it there without defeating optimization.)
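
A rough sketch of this arrangement, assuming a hypothetical two-member
layout for `struct whatever` and made-up helper names `snapshot` and
`write_uc_verified`. How a whole-structure copy touches the device
(byte by byte, word by word) is up to the implementation, so treat the
snapshot as an illustration rather than a recipe for any particular
hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the members are ordinary, unqualified types. */
struct whatever {
    unsigned char status;
    unsigned char uc;
};

/* Copy the device structure into ordinary RAM. The read of *dev is a
   volatile access; the returned copy is an ordinary object that the
   compiler may optimise freely. */
struct whatever snapshot(volatile struct whatever *dev)
{
    assert(dev != NULL);
    struct whatever copy = *dev;
    return copy;
}

/* Write through the pointer-to-volatile and verify by reading back. */
int write_uc_verified(volatile struct whatever *dev, unsigned char value)
{
    assert(dev != NULL);
    dev->uc = value;
    return dev->uc == value;
}
```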

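Eric Sosman said:
The compiler is not permitted to optimize away any of the accesses
to cmos->uc, because the volatile qualifier declares that those
accesses have side-effects ...

(Or rather, that they "may" have side effects, and the compiler
should assume the worst.)

Martin said:
I don't think making the cmos variable volatile is an option.

Why not? (Neither Eric Sosman's method nor mine generally requires
much in the way of code changes.)

Martin said:
Why can't I make uc volatile?

You can; it just is silly, and may well not work. (In fact, it is
only likely to work if the code happens to work with no "volatile"
qualifiers anyway. That is, adding the volatile qualifier in the
wrong place is extremely unlikely to help.)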

[I am going to make a name change below, so that "uc" unambiguously
refers to cmos->uc, and use "temp_v" for the local variable.]

Ben Bacarisse said:
Making temp_v volatile won't work.

Well, it *might* work, all depending on details about the
compiler's internal workings.

Ben Bacarisse said:
The compiler may assume that cmos->uc is set as per the assignment
(it need not access the object again).

Right -- for instance, it might generate code of the form:

ldw cmos_, a3 # so that register a3 = cmos
ldb -12(sp), d1 # so that register d1 = temp_v
stb d1, 48(a3) /* cmos->uc = temp_v; */

ldb -12(sp), d2 # so that register d2 = temp_v
cmp d1, d2 /* see if cmos->uc == temp_v */
...

Note that temp_v was loaded twice, in case it changed; but the
compiler could see that 48(a3), which refers to cmos->uc, was set
from register d1, and -- since it is "ordinary RAM" (even though
it is not!) it must not have changed, so there was no need to load
*that* again.
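
The same transformation can be written out at the C level. This is only
a sketch: `struct regs` and both function names are hypothetical, and
the second function shows one thing a compiler is allowed to do, not
what any particular compiler does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; note that uc is NOT volatile here. */
struct regs {
    unsigned char uc;
};

/* As written in the source: only temp_v is volatile. */
int check_as_written(struct regs *cmos, volatile unsigned char *temp_v)
{
    assert(cmos != NULL && temp_v != NULL);
    cmos->uc = *temp_v;
    return cmos->uc != *temp_v;   /* non-zero would mean "write failed" */
}

/* What the compiler may legitimately turn it into: temp_v is read
   twice, but cmos->uc is never read back. */
int check_as_optimised(struct regs *cmos, volatile unsigned char *temp_v)
{
    assert(cmos != NULL && temp_v != NULL);
    unsigned char d1 = *temp_v;   /* first load of temp_v */
    cmos->uc = d1;                /* the store still happens */
    unsigned char d2 = *temp_v;   /* second load of temp_v */
    return d1 != d2;              /* compares temp_v with itself */
}
```

With ordinary memory behind `cmos` the two are indistinguishable, which
is exactly why the compiler may substitute one for the other.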

Ben Bacarisse said:
I can't see a way round this other than making the object that is
actually volatile, volatile.

Indeed.

Eric Sosman · Nov 11, 2007

Chris said:
[...]
The compiler is not permitted to optimize away any of the accesses
to cmos->uc, because the volatile qualifier declares that those
accesses have side-effects ...

(Or rather, that they "may" have side effects, and the compiler
should assume the worst.)

5.1.2.3p2:

Accessing a volatile object, modifying an object,
modifying a file, or calling a function that does any
of those operations are all side effects, which are
changes in the state of the execution environment.

That is, a read or write of a volatile object is *defined* as a
side effect, whether or not it "does anything" from the point of
view of the programmer.

Of course, there's still the issue of 6.7.3p6:

[...] What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

... so the fact that a volatile object is assigned to or has
its value used might not mean that it is "accessed," and so
might not imply a side effect. But it's always been my belief
that 6.7.3p6 isn't a license to omit the access, but rather the
Standard declining to try to define things like memory cache
architectures: An access to a volatile object might end up in
the L4 cache rather than going "all the way to RAM" on some
system, but I think the compiled code must initiate an access
of some kind anyhow -- and that access, whatever it is, is a
side effect by definition.

Martin · Nov 20, 2007

My apologies - I had meant to submit a thank-you message to all your
responses before now.

So, thanks for the helpful advice.

I have two related questions. According to K&R2:

"Except that it should diagnose explicit attempts to change 'const'
objects, a compiler may ignore these qualifiers."

"These qualifiers" are 'const' and 'volatile'. It seems that this could be
an issue. I can carefully implement Eric Sosman's suggestion regarding using
a volatile pointer to the structure, but it seems an ANSI/ISO conformant
compiler is free to ignore it, and the compiler then could optimise away the
following memcmp(). Could someone clarify this for me?

Also, along those lines of making a pointer volatile, I would like some
clarification. Consider this code abstract:

volatile char arr[10];
char arr2[10];
/* ... code that initialises both arrays ... */
if ( memcmp(&arr[3], &arr2[3], 1) ... )
/* etc. */

Does the dereferencing of the first argument, arr, mean that memcmp is being
handed a non-volatile type (which is, I believe, the principle behind Chris
Torek's suggestion)?

Eric Sosman · Nov 20, 2007

Martin wrote on 11/20/07 13:52:

My apologies - I had meant to submit a thank-you message to all your
responses before now.

So, thanks for the helpful advice.

I have two related questions. According to K&R2:

"Except that it should diagnose explicit attempts to change 'const'
objects, a compiler may ignore these qualifiers."

"These qualifiers" are 'const' and 'volatile'. It seems that this could be
an issue. I can carefully implement Eric Sosman's suggestion regarding using
a volatile pointer to the structure, but it seems an ANSI/ISO conformant
compiler is free to ignore it, and the compiler then could optimise away the
following memcmp(). Could someone clarify this for me?

(1) The compiler can ignore volatile only by resorting
to an obstructionist interpretation of the "what constitutes
an access is implementation-defined" clause. Such a compiler
is not trying to be useful in its target environment, and
will find few adopters. If you've got a compiler that's
actually trying to support your environment, it will do the
Right Thing with volatile -- all you need to worry about is
the precise, implementation-defined nature of "Right."

(2) It's still an error to pass a pointer to a volatile
object to a function that's not expecting it. Not a problem
of optimization, but an error on the programmer's part.

Also, along those lines of making a pointer volatile, I would like some
clarification. Consider this code abstract:

volatile char arr[10];
char arr2[10];
/* ... code that initialises both arrays ... */
if ( memcmp(&arr[3], &arr2[3], 1) ... )
/* etc. */

Does the dereferencing of the first argument, arr, mean that memcmp is being
handed a non-volatile type (which is, I believe, the principle behind Chris
Torek's suggestion)?

Looks like an error to me. Internally, memcmp() doesn't
"know" it's dealing with volatile characters, so it won't
necessarily "access" them in the right order or the right
way. (For example, it may discover that it can fetch them
four at a time or eight at a time, completely unaware that
the "off-the-end" read might tickle a nearby volatile object
that you weren't expecting it to touch. Your fault, not
memcmp()'s.)

The straightforward `if (arr[3] != arr2[3]) ...' says
what you intend, says it correctly, and works (barring an
intentionally obstreperous compiler).
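
If more than one byte ever has to be compared, the same idea extends to
a hand-written loop. The following is a sketch only: `volatile_memcmp`
is a hypothetical helper, not a library function, and it assumes that
byte-at-a-time reads in ascending order are what the hardware wants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compare n bytes one at a time through
   volatile-qualified pointers. Returns 0 if all bytes match, -1 if the
   first differing byte in a is smaller, 1 if it is larger. */
int volatile_memcmp(const volatile void *a, const volatile void *b, size_t n)
{
    const volatile unsigned char *pa = a;
    const volatile unsigned char *pb = b;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        unsigned char ca = pa[i];   /* one read of each byte */
        unsigned char cb = pb[i];
        if (ca != cb)
            return ca < cb ? -1 : 1;   /* stops at the first mismatch */
    }
    return 0;
}
```

Because the parameters are volatile-qualified, passing `&arr[3]` adds
qualifiers rather than discarding them, and no byte beyond the n
requested is ever read.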

Ben Pfaff · Nov 20, 2007

Martin said:
I have two related questions. According to K&R2:

"Except that it should diagnose explicit attempts to change 'const'
objects, a compiler may ignore these qualifiers."

This statement is given in the context of qualifiers on types,
not qualifiers on pointers. I think that this is intended to
mean that the compiler is not obligated to store const objects in
read-only memory, and that it is not obligated to put volatile
objects in a special section of memory either.

Martin · Nov 20, 2007

Eric Sosman said:
Looks like an error to me. Internally, memcmp() doesn't
"know" it's dealing with volatile characters, so it won't
necessarily "access" them in the right order or the right
way. (For example, it may discover that it can fetch them
four at a time or eight at a time, completely unaware that
the "off-the-end" read might tickle a nearby volatile object
that you weren't expecting it to touch. Your fault, not
memcmp()'s.)

I think my example was wrong. Of course memcmp() is being passed a pointer
to a volatile char. This is more what I meant:

char arr[10];
char arr2[10];
volatile char *parr = arr;
/* ... code that initialises both arrays ... */
if ( memcmp(&parr[3], &arr2[3], 1) ... )
/* etc. */

Which is analogous to Chris Torek's use of the volatile pointer to
structure. I think that is OK because the first argument now is not pointing
to a volatile once dereferenced, but parr itself is volatile, so the call
to memcmp() cannot be optimised away. *However*, in the case above, where one
element of an array is being compared with another, I can see that my
original example in tandem with the != operator you suggest is a
solution I can adopt.

Thanks for your help.

Martin · Nov 20, 2007

Ben Pfaff said:
This statement is given in the context of qualifiers on types,
not qualifiers on pointers. I think that this is intended to
mean that the compiler is not obligated to store const objects in
read-only memory, and that it is not obligated to put volatile
objects in a special section of memory either.

Are you saying that the quote from K&R2 above does not apply in these
instances?

int i, *const cpi = &i;
const int *pci;

Ben Pfaff · Nov 21, 2007

Martin said:
Are you saying that the quote from K&R2 above does not apply in these
instances?

int i, *const cpi = &i;

cpi is a const pointer to a non-const int. The compiler is not
obligated to put cpi into a read-only section.

const int *pci;

pci is a non-const pointer to a const int. The compiler may not
put pci into a read-only section.
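
A small sketch of what each declaration permits. The function name
`qualifier_demo` and the second variable `j` are made up for
illustration; the two commented-out lines are the ones a compiler must
diagnose.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of i after writing through cpi, and stores in
   *seen the value read through pci after it has been re-pointed. */
int qualifier_demo(int *seen)
{
    int i = 1, j = 2;
    int *const cpi = &i;    /* const pointer to non-const int */
    const int *pci = &i;    /* non-const pointer to const int */

    assert(seen != NULL);

    *cpi = 10;              /* allowed: the int is not const */
    pci = &j;               /* allowed: the pointer is not const */
    /* cpi = &j;               constraint violation: cpi itself is const */
    /* *pci = 5;               constraint violation: *pci is const */

    *seen = *pci;
    return i;
}
```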

Ben Bacarisse · Nov 21, 2007

Martin said:
Eric Sosman said:

Looks like an error to me. Internally, memcmp() doesn't
"know" it's dealing with volatile characters, so it won't
necessarily "access" them in the right order or the right
way. (For example, it may discover that it can fetch them
four at a time or eight at a time, completely unaware that
the "off-the-end" read might tickle a nearby volatile object
that you weren't expecting it to touch. Your fault, not
memcmp()'s.)

I think my example was wrong. Of course memcmp() is being passed a pointer
to a volatile char. This is more what I meant:

char arr[10];
char arr2[10];
volatile char *parr = arr;
/* ... code that initialises both arrays ... */
if ( memcmp(&parr[3], &arr2[3], 1) ... )
/* etc. */

You haven't changed anything as far as the constraint violation on the
call to memcmp is concerned. Both

volatile char x[10];
memcmp(x, ...);
memcmp(&x[3], ...);

and

volatile char *x;
memcmp(x, ...);
memcmp(&x[3], ...);

all pass a 'char pointer to volatile' where 'void pointer to const' is
expected.
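
One way to see that the argument type is the same in both cases is to
ask the compiler. This is a sketch that assumes a C11 compiler (for
`_Generic` and `static_assert`); the macro and function names are
hypothetical.

```c
#include <assert.h>

/* Evaluates to 1 if the expression has type "pointer to volatile char",
   0 for any other type. The operand is not evaluated. */
#define IS_PTR_TO_VOLATILE_CHAR(p) \
    _Generic((p), volatile char *: 1, default: 0)

/* Compile-time sanity check of the macro itself. */
static_assert(IS_PTR_TO_VOLATILE_CHAR((volatile char *)0) == 1,
              "macro should recognise volatile char *");

/* Element of an array of volatile char. */
int array_element_case(void)
{
    volatile char x[10];
    return IS_PTR_TO_VOLATILE_CHAR(&x[3]);
}

/* Element reached through a pointer to volatile char. */
int pointer_case(void)
{
    char storage[10];
    volatile char *x = storage;
    return IS_PTR_TO_VOLATILE_CHAR(&x[3]);
}

/* Plain char array, for contrast. */
int plain_case(void)
{
    char y[10];
    return IS_PTR_TO_VOLATILE_CHAR(&y[3]);
}
```

The first two functions report the same type, so both forms hand
memcmp() a pointer to volatile; only the plain array does not.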

CBFalconer · Nov 21, 2007

Ben said:
This statement is given in the context of qualifiers on types,
not qualifiers on pointers. I think that this is intended to
mean that the compiler is not obligated to store const objects
in read-only memory, and that it is not obligated to put
volatile objects in a special section of memory either.

The following was my attempt to test incrementing of a void*. I
think I can imagine situations where this ability would be useful
to bypass pointer incrementation. BTW, cc is shorthand for:
gcc -W -Wall -ansi -pedantic
and accesses gcc 3.2.1.

[1] c:\c\junk>cat junk.c
#include <stdio.h>

int main(void) {
void *p, *pb;
char by;

p = &by;

pb = p++;
if (pb == p) puts("++ uses sizeof void* == 0");
else puts("No luck here");
return 0;
}

[1] c:\c\junk>cc junk.c
junk.c: In function `main':
junk.c:9: warning: wrong type argument to increment

[1] c:\c\junk>a
No luck here

Ben Pfaff · Nov 21, 2007

CBFalconer said:
Ben said:

This statement is given in the context of qualifiers on types,
not qualifiers on pointers. I think that this is intended to
mean that the compiler is not obligated to store const objects
in read-only memory, and that it is not obligated to put
volatile objects in a special section of memory either.

The following was my attempt to test incrementing of a void*. I
think I can imagine situations where this ability would be useful
to bypass pointer incrementation. [...]

I am struggling to understand how this is anything but a non
sequitur. Can you explain?

Dann Corbit · Nov 21, 2007

Ben Pfaff said:
CBFalconer said:

Ben said:

I have two related questions. According to K&R2:

"Except that it should diagnose explicit attempts to change
'const' objects, a compiler may ignore these qualifiers."

This statement is given in the context of qualifiers on types,
not qualifiers on pointers. I think that this is intended to
mean that the compiler is not obligated to store const objects
in read-only memory, and that it is not obligated to put
volatile objects in a special section of memory either.

The following was my attempt to test incrementing of a void*. I
think I can imagine situations where this ability would be useful
to bypass pointer incrementation. [...]

I am struggling to understand how this is anything but a non
sequitur. Can you explain?

A void pointer has no stride and cannot be incremented. I find the sentence
very difficult to parse.
