> A coworker and I have been debating the 'correct' expected
> evaluation of the expression a = b = c. One version of GCC
> compiled this as b = c; a = b and another compiled it as
> a = c; b = c (both with no optimizations enabled). How did we
> notice, you may ask? Well, in our case 'b' was a memory-mapped
> register that has only a select number of writable bits. ...
Well, first, if you have not applied the "volatile" qualifier, you
have no right to expect anything in particular, because a compiler
(even without optimization turned on) is allowed to believe that
non-"volatile" objects behave as if they are ordinary RAM. That is,
if you store 3 in some "int", and do not store another value in that
"int", it will still have 3 in it later. So:
    void f(void) {
        int i = 3;
        ... /* code that does not modify i */
        printf("i is %d\n", i);
        ...
    }
could be compiled to the same object code as:
    void f(void) {
        ... /* code that does not modify i */
        puts("i is 3");
        ...
    }
(note the removal of the newline, since puts() adds one; and yes,
gcc really will turn printf() calls into puts() calls in some
cases).
> He claims it has been a 'C Standard' that it will be evaluated as
> a = c; b = c. I personally believe that it would make more sense for
> it to be evaluated as b = c; a = b, although I would never write code
> that has a questionable operation. Can anyone settle this debate?
The C89 and C99 standards both use similar (maybe even identical)
wording:
    [#3] An assignment operator stores a value in the object
    designated by the left operand. An assignment expression
    has the value of the left operand after the assignment, but
    is not an lvalue. The type of an assignment expression is
    the type of the left operand unless the left operand has
    qualified type, in which case it is the unqualified version
    of the type of the left operand. The side effect of
    updating the stored value of the left operand shall occur
    between the previous and the next sequence point.
The second sentence is particularly important here. It tells us
that, for instance, if we write:
    unsigned int a;
    unsigned char b;
    unsigned int c = 12345;
    a = b = c;
the result stored in "a" is almost certainly *not* 12345, as it is
actually (12345 % (UCHAR_MAX + 1)). Typically UCHAR_MAX is 255 so
this is (12345 % 256), or 57. Of course, the compiler can use the
"as-if" rules to compile this as:
    a = 57;
    b = 57;
    c = 12345;
(with the stores to a, b, and c happening in any order in the
actual underlying machine code -- or even being combined, if
there is some quick way to write to two or all three variables).
In the more difficult example where "b" is somehow mapped to a
hardware register, you -- the programmer -- *must* declare b using
the "volatile" qualifier, to tell the compiler "this thing is *not*
ordinary RAM, so you, Mr Compiler, must not play games with
optimizations that assume it *is* ordinary RAM." More typically,
you might replace b with *p where p has type "volatile T *" for
some type T. Consider the example of a memory-mapped device
that has a control-and-status register, in which writes to the
location cause the device to take actions, and reads from the
location return the device's status:
    #define CSR_W_DMA 0x01 /* start DMA */
    #define CSR_W_LDA 0x02 /* load DMA address from addr reg */
    #define CSR_W_EI  0x04 /* enable interrupts */
    #define CSR_W_RD  0x08 /* read from device (write to RAM) */
    #define CSR_W_WR  0x00 /* write to device (read from RAM) - pseudo */
    ...
    #define CSR_S_DMA 0x01 /* DMA is occurring now */
    #define CSR_S_ERR 0x02 /* error occurred doing DMA */
    #define CSR_S_IP  0x04 /* interrupt pending / op done-or-failed */
    ...
    volatile int *csr;
    volatile int *addr;
    ...
    *addr = dma_addr; /* tell device where to do the op */
    *csr = CSR_W_LDA;
    /* do read from device, without using interrupts (poll for done) */
    *csr = CSR_W_RD | CSR_W_DMA; /* read op with DMA */
    while ((*csr & CSR_S_IP) == 0)
        continue;
Without the "volatile" qualifier, the compiler can remove the first
write to *csr entirely (because the next write to *csr clobbers
the previous one), and replace the while loop with one that never
terminates (because we did not set CSR_W_EI, and CSR_S_IP is the
same bit, so as far as the compiler can tell that bit will never
turn on).
Note that on some hardware (SPARC V9 for instance), the CPU may
need special instructions inserted at various points. These
instructions may even depend on the memory model set in the CPU.
(In this case, if I remember right, at least one, maybe two, "membar
#StoreStore"s and/or a "membar #MemIssue" in RMO, nothing at all
in TSO, and just one "membar #StoreStore" in PSO; this depends on the fact
that the CPU recognizes the same address for the reads and writes
of the CSR. If the status register had a different address, one
more membar would sometimes be required.)
See also the ongoing thread "Volatiles in assignment expressions".