compiler optimizing wrong?

D

Dirk Zabel

Hi,
I get unexpected behaviour from some code which was part of a
manufacturer-supplied library for hardware-related functions
(programming internal flash memory). I would like to get your opinion if
compiler or the code is the culprit.

The compiler is gcc-arm 4.5.1. When no optimization is selected, the
generated code looks as expected (surprise!). If size optimization is
selected with argument "-Os", the getstatus() call inside the loop is
removed.
I stripped down the code as far as possible, so it is more or less
abstract now.

--------------- snip --------------------------------
typedef enum {
st_busy, st_complete, st_timeout
} status_t;

typedef struct
{
volatile uint32_t CR;
volatile uint32_t SR;
} hw_reg_t;


extern hw_reg_t statusreg;
/*
* Get status (busy or complete)
* from inspection of hardware status register
*/
status_t getstatus(void)
{
status_t rval;

if ((statusreg.SR & 1) == 1) {
rval = st_busy;
} else {
rval = st_complete;
}
return rval;
}

/*
* wait until hardware operation is complete or
* timeout occurred
*/
status_t wait_for_operation(unsigned long Timeout)
{
status_t status;

status = getstatus();
while((status == st_busy) && (Timeout != 0)) {
status = getstatus(); /* <- optimized out! */
Timeout--;
}
if(Timeout == 0 && status == st_busy ) {
status = st_timeout;
}
return status;
}
--------------- snip --------------------------------

If I remove the volatile qualifiers from the struct members and instead
put volatile in front of the variable declaration, the problem vanishes:

--------------- snip --------------------------------
typedef struct
{
uint32_t CR;
uint32_t SR;
} hw_reg_t;

extern volatile hw_reg_t statusreg;
--------------- snip --------------------------------

Btw, it also does not occur if I access the status register inline in
wait_for_operation() instead of calling getstatus().

Is this a compiler error or are the volatile declarations inside the
typedef unsufficient?

Regards
Dirk
 
N

Noob

Dirk said:
I get unexpected behaviour from some code which was part of a
manufacturer-supplied library for hardware-related functions
(programming internal flash memory). I would like to get your opinion if
compiler or the code is the culprit.

The compiler is gcc-arm 4.5.1. When no optimization is selected, the
generated code looks as expected (surprise!). If size optimization is
selected with argument "-Os", the getstatus() call inside the loop is
removed.
I stripped down the code as far as possible, so it is more or less
abstract now.

On a related note, you might be interested in the following
article by John Regehr.

Volatile Bugs, Three Years Later
http://blog.regehr.org/archives/503

Regards.
 
T

Thad Smith

Hi,
I get unexpected behaviour from some code which was part of a
manufacturer-supplied library for hardware-related functions (programming
internal flash memory). I would like to get your opinion if compiler or the code
is the culprit. ....
status = getstatus();
while((status == st_busy) && (Timeout != 0)) {
status = getstatus(); /* <- optimized out! */
Timeout--;

This is a compiler bug. Since getstatus() accesses a volatile object, it must
be done each time through the loop.
 
J

jstrap42

On a related note, you might be interested in the following
article by John Regehr.

Volatile Bugs, Three Years Laterhttp://blog.regehr.org/archives/503

Regards.

The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):

volatile int buffer_ready;
char buffer[BUF_SIZE];

void buffer_init() {
int i;
for(i=0; i<BUF_SIZE; i++)
buffer=0;
buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?
 
C

Chris H

On a related note, you might be interested in the following
article by John Regehr.

Volatile Bugs, Three Years Later
http://blog.regehr.org/archives/503

The article is misleading as it talks about "embedded compilers" by
which they mean GCC. 11 of the 14 were GCC and two of the other three
were hardly mainstream embedded.

On the other hand I know a lot of embedded compilers they did not look
at that work correctly. (I know about two of them myself we tested for
safety critical use)
 
N

Noob

Chris said:
The article is misleading as it talks about "embedded compilers" by
which they mean GCC.

You're confusing the article and the paper.
I counted 0 occurence of the string "embedded compiler"
in Regehr's article.
11 of the 14 were GCC and two of the other three
were hardly mainstream embedded.

IIUC, you dispute the conclusion of their 2008 paper, because
their study didn't include several widely-used compilers?

Can you name these compilers? Are they available at no cost
for academic research?
On the other hand I know a lot of embedded compilers they did not look
at that work correctly. (I know about two of them myself we tested for
safety critical use)

What point are you're trying to make.
 
S

Shao Miller

On a related note, you might be interested in the following
article by John Regehr.

Volatile Bugs, Three Years Laterhttp://blog.regehr.org/archives/503

Regards.

The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):

volatile int buffer_ready;
char buffer[BUF_SIZE];

void buffer_init() {
int i;
for(i=0; i<BUF_SIZE; i++)
buffer=0;
buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?


I think at each semi-colon, except for the first three. Also in the
evaluation of the size specification in the declaration of 'buffer'.
Also after 'i++'.
 
C

Chris H

You're confusing the article and the paper.
I counted 0 occurence of the string "embedded compiler"
in Regehr's article.

The introduction says
//////////////////////////
We tested thirteen production-quality C compilers
and, for each, found situations in which the compiler generated
incorrect code for accessing volatile variables. This result is
disturbing because it implies that embedded software and operating
systems both typically coded in C, both being bases for many
mission-critical and safety-critical applications, and both relying
on the correct translation of volatiles—may be being miscompiled.
///////////////

and the targets were IA32, Coldfire, MSP430 and AVR

Though you are not likely to be using GCC for safety critical or mission
critical code. Certainly not on the higher SIL levels.

But I do continually hear, due to this report, that "embedded compilers
don't handle volatile correctly" normally when referring to compilers
that were not referenced in the report and do handle volatile correctly.
IIUC, you dispute the conclusion of their 2008 paper, because
their study didn't include several widely-used compilers?

Well.... To be blunt it only really tested one. Variants of GCC.
Can you name these compilers?

No I was/am under NDA when we did the testing but it was for safety
critical use.
Are they available at no cost
for academic research?

So it is research on compilers they "can get for free?" Though I do
take your point that it is difficult to test compilers when you have to
buy them as they are not cheap.

On the other hand as most of the commercial compilers are heavily and
rigorously tested internally what is the advantage in letting a bunch of
academics have them?

In which case the report should have been entitled "GCC Volatiles Are
Miscompiled, and What to Do about It"
 
T

Tim Rentsch

Please excuse my following up in the wrong place in the thread
and with no attributions. Newsreader glitch.
The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):

volatile int buffer_ready;
char buffer[BUF_SIZE];

void buffer_init() {
int i;
for(i=0; i<BUF_SIZE; i++)
buffer=0;
buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?


After each full declarator:

buffer_ready in 'volatile int buffer_ready;'
buffer[BUF_SIZE] in 'char buffer[BUF_SIZE];'
i in 'int i;'

After each full expression:

i=0
i<BUF_SIZE
i++
buffer=0
buffer_ready=1

(Of course the middle three of these potentially occur multiple
times because of the 'for' loop.)


It doesn't make much sense to talk about sequence points
in (or after) declarators at file scope, but these points
are what's specified in the Standard.
 
J

jstrap42

Please excuse my following up in the wrong place in the thread
and with no attributions.  Newsreader glitch.
The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):
volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
    int i;
    for(i=0; i<BUF_SIZE; i++)
       buffer=0;
    buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?

After each full declarator:

   buffer_ready           in  'volatile int buffer_ready;'
   buffer[BUF_SIZE]       in  'char buffer[BUF_SIZE];'
   i                      in  'int i;'

After each full expression:

   i=0
   i<BUF_SIZE
   i++
   buffer=0
   buffer_ready=1

(Of course the middle three of these potentially occur multiple
times because of the 'for' loop.)

It doesn't make much sense to talk about sequence points
in (or after) declarators at file scope, but these points
are what's specified in the Standard.


Thanks. Again with reference to section 2.1 in this paper:

http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

The author states that "a compiler may not move acceses to volatile
variables across sequence points", yet he also states that "the
compiler is free to move the loop [to] below the store to
buffer_ready". I can't reconcile these two statements - it seems to me
that moving the loop to a position below the store to buffer_ready is
the same as moving the store to buffer_ready to a position above the
loop. The effect is to move the store across sequence points.
 
S

Shao Miller

Please excuse my following up in the wrong place in the thread
and with no attributions. Newsreader glitch.
The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):
volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
int i;
for(i=0; i<BUF_SIZE; i++)
buffer=0;
buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?

After each full declarator:

buffer_ready in 'volatile int buffer_ready;'
buffer[BUF_SIZE] in 'char buffer[BUF_SIZE];'
i in 'int i;'

After each full expression:

i=0
i<BUF_SIZE
i++
buffer=0
buffer_ready=1

(Of course the middle three of these potentially occur multiple
times because of the 'for' loop.)

It doesn't make much sense to talk about sequence points
in (or after) declarators at file scope, but these points
are what's specified in the Standard.


Thanks. Again with reference to section 2.1 in this paper:

http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

The author states that "a compiler may not move acceses to volatile
variables across sequence points", yet he also states that "the
compiler is free to move the loop [to] below the store to
buffer_ready". I can't reconcile these two statements - it seems to me
that moving the loop to a position below the store to buffer_ready is
the same as moving the store to buffer_ready to a position above the
loop. The effect is to move the store across sequence points.


If you treat sequence points as linear, then I can understand your
confusion.

Please consider:

int i, j;

i = 5;
j = 6;

Since it's known that these objects are distinct and that the side
effects of setting their stored values are independent, I belive that
the compiler is free to choose any order, including "simultaneously."

But sequence points aren't exactly linear... C1X introduced a more
formal sequencing model include "before," "after," "unsequenced,"
"indeterminately sequenced."

Please consider:

int x, y = 3, z = 5;

x = (y, 7) + (z, 11);

There is a sequence point after 'y' is accessed and a sequence point
after 'z' is accessed, but those accesses are "unsequenced" relative to
one another; no order is imposed whatsoever (even "simultaneously").
But both accesses are "sequenced before" the computation of 7 + 11 and,
of course, before the access to 'x'.

So your example's 'for' loop contains side effects which are independent
of 'buffer_ready'. The compiler can move the loop the same as the 'i,
j' example, above. The access to the volatile 'buffer_ready' is still
bounded by the sequence point before (whatever that might be) and by the
sequence point at its semi-colon (at the end of the expression statement).
 
T

Tim Rentsch

Please excuse my following up in the wrong place in the thread
and with no attributions. Newsreader glitch.
The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):
volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
int i;
for(i=0; i<BUF_SIZE; i++)
buffer=0;
buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?

After each full declarator:

buffer_ready in 'volatile int buffer_ready;'
buffer[BUF_SIZE] in 'char buffer[BUF_SIZE];'
i in 'int i;'

After each full expression:

i=0
i<BUF_SIZE
i++
buffer=0
buffer_ready=1

(Of course the middle three of these potentially occur multiple
times because of the 'for' loop.)

It doesn't make much sense to talk about sequence points
in (or after) declarators at file scope, but these points
are what's specified in the Standard.


Thanks. Again with reference to section 2.1 in this paper:

http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf


I know this paper! I've read this paper. In fact a couple
of years ago I seem to remember posting a series of articles
in response to this paper. I think it's a good paper,
although it does have some stuff in it that is (take your
pick) somewhere between controversial and wrong.

The author states that "a compiler may not move acceses to volatile
variables across sequence points", yet he also states that "the
compiler is free to move the loop [to] below the store to
buffer_ready". I can't reconcile these two statements - it seems to me
that moving the loop to a position below the store to buffer_ready is
the same as moving the store to buffer_ready to a position above the
loop. The effect is to move the store across sequence points.

IMO the comment about moving the loop after the volatile store
is, let me put it diplomatically, not exactly right. In fairness
I should add that it isn't absolutely wrong either. There is a
subtlety in the Standard because what constitutes an access to a
volatile object is implementation-defined. If you do a google
groups search for "volatile" "memory" "regime" I think you'll
be able to find my earlier articles on all this. Might also
try google groups search on the url, that might help pin down
the exact time frame.
 
N

Noob

Tim said:
http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

I know this paper! I've read this paper. In fact a couple
of years ago I seem to remember posting a series of articles
in response to this paper.

Did you have the following thread in mind?

From: Tim Rentsch
Newsgroups: comp.lang.c, comp.arch.embedded
Subject: The Semantics of 'volatile'
Date: 01 Jun 2009 16:43:11 -0700

http://groups.google.com/group/comp.lang.c/browse_frm/thread/9c05dfae38459f2c/556cd6a41357b79f
 
J

jstrap42

Please excuse my following up in the wrong place in the thread
and with no attributions.  Newsreader glitch.
The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):
volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
     int i;
     for(i=0; i<BUF_SIZE; i++)
        buffer=0;
     buffer_ready=1;
}
Can anyone point out exactly where the sequence points are in that
code snippet?
After each full declarator:
    buffer_ready           in  'volatile int buffer_ready;'
    buffer[BUF_SIZE]       in  'char buffer[BUF_SIZE];'
    i                      in  'int i;'
After each full expression:
    i=0
    i<BUF_SIZE
    i++
    buffer=0
    buffer_ready=1
(Of course the middle three of these potentially occur multiple
times because of the 'for' loop.)
It doesn't make much sense to talk about sequence points
in (or after) declarators at file scope, but these points
are what's specified in the Standard.

Thanks. Again with reference to section 2.1 in this paper:

The author states that "a compiler may not move acceses to volatile
variables across sequence points", yet he also states that "the
compiler is free to move the loop [to] below the store to
buffer_ready". I can't reconcile these two statements - it seems to me
that moving the loop to a position below the store to buffer_ready is
the same as moving the store to buffer_ready to a position above the
loop. The effect is to move the store across sequence points.

If you treat sequence points as linear, then I can understand your
confusion.

Please consider:

   int i, j;

   i = 5;
   j = 6;

Since it's known that these objects are distinct and that the side
effects of setting their stored values are independent, I belive that
the compiler is free to choose any order, including "simultaneously."

But sequence points aren't exactly linear...  C1X introduced a more
formal sequencing model include "before," "after," "unsequenced,"
"indeterminately sequenced."

Please consider:

   int x, y = 3, z = 5;

   x = (y, 7) + (z, 11);

There is a sequence point after 'y' is accessed and a sequence point
after 'z' is accessed, but those accesses are "unsequenced" relative to
one another; no order is imposed whatsoever (even "simultaneously").
But both accesses are "sequenced before" the computation of 7 + 11 and,
of course, before the access to 'x'.

So your example's 'for' loop contains side effects which are independent
of 'buffer_ready'.  The compiler can move the loop the same as the 'i,
j' example, above.  The access to the volatile 'buffer_ready' is still
bounded by the sequence point before (whatever that might be) and by the
sequence point at its semi-colon (at the end of the expression statement)..- Hide quoted text -

- Show quoted text -


I think I have to view sequence points as linear. If they weren't then
how would one interpret the words 'previous' and 'subsequent' in this:

N869 5.1.2.3 paragraph 2

"Accessing a volatile object, modifying an object,
modifying a file, or calling a function that does any of

those operations are all side effects, which are changes
in the state of the execution environment. Evaluation of an
expression may produce side effects. At certain specified
points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken
place."
 
S

Shao Miller

Please excuse my following up in the wrong place in the thread
and with no attributions. Newsreader glitch.
The example given in section 2.1 of that paper is this (I've had to
retype it so there could be mistakes):
volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
int i;
for(i=0; i<BUF_SIZE; i++)
buffer=0;
buffer_ready=1;
}

Can anyone point out exactly where the sequence points are in that
code snippet?
After each full declarator:
buffer_ready in 'volatile int buffer_ready;'
buffer[BUF_SIZE] in 'char buffer[BUF_SIZE];'
i in 'int i;'
After each full expression:
i=0
i<BUF_SIZE
i++
buffer=0
buffer_ready=1


(Of course the middle three of these potentially occur multiple
times because of the 'for' loop.)

It doesn't make much sense to talk about sequence points
in (or after) declarators at file scope, but these points
are what's specified in the Standard.
Thanks. Again with reference to section 2.1 in this paper:

The author states that "a compiler may not move acceses to volatile
variables across sequence points", yet he also states that "the
compiler is free to move the loop [to] below the store to
buffer_ready". I can't reconcile these two statements - it seems to me
that moving the loop to a position below the store to buffer_ready is
the same as moving the store to buffer_ready to a position above the
loop. The effect is to move the store across sequence points.

If you treat sequence points as linear, then I can understand your
confusion.

Please consider:

int i, j;

i = 5;
j = 6;

Since it's known that these objects are distinct and that the side
effects of setting their stored values are independent, I belive that
the compiler is free to choose any order, including "simultaneously."

But sequence points aren't exactly linear... C1X introduced a more
formal sequencing model include "before," "after," "unsequenced,"
"indeterminately sequenced."

Please consider:

int x, y = 3, z = 5;

x = (y, 7) + (z, 11);

There is a sequence point after 'y' is accessed and a sequence point
after 'z' is accessed, but those accesses are "unsequenced" relative to
one another; no order is imposed whatsoever (even "simultaneously").
But both accesses are "sequenced before" the computation of 7 + 11 and,
of course, before the access to 'x'.

So your example's 'for' loop contains side effects which are independent
of 'buffer_ready'. The compiler can move the loop the same as the 'i,
j' example, above. The access to the volatile 'buffer_ready' is still
bounded by the sequence point before (whatever that might be) and by the
sequence point at its semi-colon (at the end of the expression statement).


I think I have to view sequence points as linear. If they weren't then
how would one interpret the words 'previous' and 'subsequent' in this:

N869 5.1.2.3 paragraph 2

"Accessing a volatile object, modifying an object,
modifying a file, or calling a function that does any of
those operations are all side effects, which are changes
in the state of the execution environment. Evaluation of an
expression may produce side effects. At certain specified
points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken
place."


"Previous" and "subsequent" relative to data dependencies, perhaps?
Further to your point, we have:

"A statement specifies an action to be performed. Except as
indicated, statements are executed in sequence."

But that's the abstract machine and not the actual machine.

My understanding is that sequence points are not _strictly_ linear. In
the 'x = ...' example above, which of the two comma operator sequence
points comes first? Which is the subsequent sequence point?

Underneath "The least requirements on a conforming implementation are",
we can note a change in wording, from the C89:

"At sequence points, volatile objects are stable in the sense that
previous evaluations are complete and subsequent evaluations have not
yet occurred."

to the N1256 (C99):

"At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not yet
occurred."

The former _might_ be interpreted to mean "previous evaluations
involving the object". The latter certainly seems clearer about that.

Then it's changed again in N1494 (C1X) to:

"Accesses to volatile objects are evaluated strictly according to the
rules of the abstract machine."

How about that.
 
S

Seebs

I think I have to view sequence points as linear. If they weren't then
how would one interpret the words 'previous' and 'subsequent' in this:

Shao was apparently talking about C1X, which introduced a new and fancier
model for this.

That said... In plain C, you can end up with several sequence points
which have no obvious ordering with respect to each other, at which point
I think "previous" and "subsequent" are perhaps best understood in terms of
execution time.

Consider something like:
static int x, y, z;

(x++ || x++) | (y++ || y++) | (z++ || z++);
or
(x++ || x++) | (x++ || x++) | (x++ || x++);

It doesn't matter WHICH of the sequence points introduced by the ||s
happens first or second or third. What matters is that those sequence
points definitely occur in *some* order. Unfortunately for the second
line of code, there's nothing that prevents the compiler from picking:
x++
x++
x++
sequence-point
sequence-point
sequence-point
[...]

At which point it's pretty clear that we have undefined behavior. But
there is still a middle sequence point, compared to which one of the others
is previous and the other is next. But the first one, so far as I can
tell, is fine, and sets everything to 2. No matter how you shuffle things
around, the two modifications of any *specific* object are always separated
by a sequence point.

-s
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,062
Latest member
OrderKetozenseACV

Latest Threads

Top