The Semantics of 'volatile'
===========================
I've been meaning to get to this for a while, finally there's a
suitable chunk of free time available to do so.
To explain the semantics of 'volatile', we consider several
questions about the concept and how volatile variables behave,
etc. The questions are:
1. What does volatile do?
2. What guarantees does using volatile provide? (What memory
regimes must be affected by using volatile?)
3. What limits does the Standard set on how using volatile
can affect program behavior?
4. When is it necessary to use volatile?
We will take up each question in the order above. The comments
are intended to address both developers (those who write C code)
and implementors (those who write C compilers and libraries).
What does volatile do?
----------------------
This question is easy to answer if we're willing to accept an
answer that may seem somewhat nebulous. Volatile allows contact
between execution internals, which are completely under control
of the implementation, and external regimes (processes or other
agents) not under control of the implementation. To provide such
contact, and provide it in a well-defined way, using volatile
must ensure a common model for how memory is accessed by the
implementation and by the external regime(s) in question.
Subsequent answers will fill in the details around this more
high level one.
What guarantees does using volatile provide?
--------------------------------------------
The short answer is "None." That deserves some elaboration.
Another way of asking this question is, "What memory regimes must
be affected by using volatile?" Let's consider some possibilities.
One: accesses occur not just to registers but to process virtual
memory (which might be just cache); threads running in the same
process affect and are affected by these accesses. Two: accesses
occur not just to cache but are forced out into the inter-process
memory (or "RAM"); other processes running on the same CPU core
affect and are affected by these accesses. Three: accesses occur
not just to memory belonging to the one core but to memory shared
by all the cores on a die; other processes running on the same CPU
(but not necessarily the same core) affect and are affected by
these accesses. Four: accesses occur not just to memory belonging
to one CPU but to memory shared by all the CPUs on the motherboard;
processes running on the same motherboard (even if on another CPU
on that motherboard) affect and are affected by these accesses.
Five: accesses occur not just to fast memory but also to some slow
more permanent memory (such as a "swap file"); other agents that
access the "swap file" affect and are affected by these accesses.
The different examples are intended informally, and in many cases
there is no distinction between several of the different layers.
The point is that different choices of regime are possible (and
I'm sure many readers can provide others, such as not only which
memory is affected but what ordering guarantees are provided).
Now the question again: which (if any) of these different
regimes are /guaranteed/ to be included by a 'volatile' access?
The answer is none of the above. More specifically, the Standard
leaves the choice completely up to the implementation. This
specification is given in one sentence in 6.7.3 p 6, namely:
What constitutes an access to an object that has
volatile-qualified type is implementation-defined.
So a volatile access could be defined as coordinating with any of
the different memory regime alternatives listed above, or other,
more exotic, memory regimes, or even (in the claims of some ISO
committee participants) no particular other memory regimes at all
(so a compiler would be free to ignore volatile completely)[*].
How extreme this range is may be open to debate, but I note that
Larry Jones, for one, has stated unequivocally that the possibility
of ignoring volatile completely is allowed under the proviso given
above. The key point is that the Standard does not identify which
memory regimes must be affected by using volatile, but leaves that
decision to the implementation.
A corollary to the above that any volatile-qualified access
automatically introduces an implementation-defined aspect to a
program.
[*] Possibly not counting the specific uses of 'volatile' as it
pertains to setjmp/longjmp and signals that the Standard
identifies, but these are side issues.
What limits are there on how volatile access can affect program behavior?
-------------------------------------------------------------------------
More properly this question is "What limits does the Standard
impose on how volatile access can affect program behavior?".
Again the short answer is None. The first sentence in 6.7.3 p 6
says:
An object that has volatile-qualified type may be modified
in ways unknown to the implementation or have other unknown
side effects.
Nowhere in the Standard are any limitations stated as to what
such side effects might be. Since they aren't defined, the
rules of the Standard identify the consequences as "undefined
behavior". Any volatile-qualified access results in undefined
behavior (in the sense that the Standard uses the term).
Some people are bothered by the idea that using volatile produces
undefined behavior, but there really isn't any reason to be. At
some level any C statement (or variable access) might behave in
ways we don't expect or want. Program execution can always be
affected by peculiar hardware, or a buggy OS, or cosmic rays, or
anything else outside the realm of what the implementation knows
about. It's always possible that there will be unexpected
changes or side effects, in the sense that they are unexpected by
the implementation, whether volatile is used or not. The
difference is, using volatile interacts with these external
forces in a more well-defined way; if volatile is omitted, there
is no guarantee as to how external forces on particular parts
of the physical machine might affect (or be affected by) changes
in the abstract machine.
Somewhat more succinctly: using volatile doesn't affect the
semantics of the abtract machine; it admits undefined behavior
by unknown external forces, which isn't any different from the
non-volatile case, except that using volatile adds some
(implementation-defined) requirements about how the abstract
machine maps onto the physical machine in the external forces'
universe. However, since the Standard mentions unknown side
effects explicitly, such things seem more "expectable" when
volatile is used. (volatile == Expect the unexected?)
When is it necessary to use volatile?
-------------------------------------
In terms of pragmatics this question is the most interesting of
the four. Of course, as phrased the question asked is more of a
developer question; for implementors, the phrasing would be
something more like "What requirements must my implementation
meet to satisfy developers who are using 'volatile' as the
Standard expects?"
To get some details out of the way, there are two specific cases
where it's necessary to use volatile, called out explicitly in
the Standard, namely setjmp/longjmp (in 7.13.2.1 p 3) and
accessing static objects in a signal handler (in 7.14.1.1 p 5).
If you're a developer writing code for one of these situations,
either use volatile, code around it so volatile isn't needed
(this can be done for setjmp), or be sure that the particular
code you're writing is covered by some implementation-defined
guarantees (extensions or whatever). Similarly, if you're an
implementor, be sure that using volatile in the specific cases
mentioned produces code that works; what this means is that the
volatile-using code should behave just like it would under
regular, non-exotic control structures. Of course, it's even
better if the implementation can do more than the minimum, such
as: define and document some additional cases for signal
handling code; make variable access in setjmp functions work
without having to use volatile, or give warnings for potential
transgressions (or both).
The two specific cases are easy to identify, but of course the
interesting cases are everything else! This area is one of the
murkiest in C programming, and it's useful to take a moment to
understand why. For implementors, there is a tension between
code generation and what semantic interpretation the Standard
requires, mostly because of optimization concerns. Nowhere is
this tension felt more keenly than in translating 'volatile'
references faithfully, because volatile exists to make actions in
the abstract machine align with those occurring in the physical
machine, and such alignment prevents many kinds of optimization.
To appreciate the delicacy of the question, let's look at some
different models for how implementations might behave.
The first model is given as an Example in 5.1.2.3 p 8:
EXAMPLE 1 An implementation might define a one-to-one
correspondence between abstract and actual semantics: at
every sequence point, the values of the actual objects would
agree with those specified by the abstract semantics.
We call this the "White Box model". When using implementations
that follow the White Box model, it's never necessary to use
volatile (as the Standard itself points out: "The keyword
volatile would then be redundant.").
At the other end of the spectrum, a "Black Box model" can be
inferred based on the statements in 5.1.2.3 p 5. Consider an
implementation that secretly maintains "shadow memory" for all
objects in a program execution. Regular memory addresses are
used for address-taking or index calculation, but any actual
memory accesses would access only the shadow memory (which is at
a different location), except for volatile-qualified accesses
which would load or store objects in the regular object memory
(ie, at the machine addresses ...
read more »