Non-volatile compiler optimizations

Discussion in 'C Programming' started by Noob, May 12, 2010.

  1. Noob

    Noob Guest

    Hello,

    I'm trying to understand volatile. I have written trivial code
    where a variable is tested to decide whether to break early out
    of a loop.

    I've considered 4 different cases.
    auto variable
    static with function scope
    static with file scope
    external linkage

    AFAIU, since the abstract machine model is single threaded,
    if a variable is not volatile, then its value cannot change
    inside the loop. (Signals might be an exception?)

    Thus, in every case, the compiler is allowed to remove the
    test, because it may assume that the variable cannot change.

    I tested with gcc -O3
    In autovar, staticfuncvar, and filescopevar gcc removed the test.
    In globalvar, gcc did not remove the test.

    Is it unsafe for the compiler to assume that global's value
    cannot change between two iterations of the loop?

    (Signals might play a role here. A different "module" might
    catch a signal, and change the value of the variable inside
    the signal handler?)

    If so, does that mean that volatile is not needed in the case
    of a variable with external linkage?

    Regards.


    void foo(void);

    void autovar(void)
    {
        int i;
        int local = 0;
        for (i = 0; i < 1000; ++i)
        {
            if (local) break;
            foo();
        }
    }

    void staticfuncvar(void)
    {
        int i;
        static int local = 0;
        for (i = 0; i < 1000; ++i)
        {
            if (local) break;
            foo();
        }
    }

    static int flocal = 0;
    void filescopevar(void)
    {
        int i;
        for (i = 0; i < 1000; ++i)
        {
            if (flocal) break;
            foo();
        }
    }

    int global = 0;
    void globalvar(void)
    {
        int i;
        for (i = 0; i < 1000; ++i)
        {
            if (global) break;
            foo();
        }
    }
     
    Noob, May 12, 2010
    #1

  2. Tom St Denis

    Tom St Denis Guest

    On May 12, 12:01 pm, Noob <r...@127.0.0.1> wrote:
    > Hello,
    >
    > I'm trying to understand volatile. I have written trivial code
    > where a variable is tested to decide whether to break early out
    > of a loop.
    >
    > I've considered 4 different cases.
    > auto variable
    > static with function scope
    > static with file scope
    > external linkage
    >
    > AFAIU, since the abstract machine model is single threaded,
    > if a variable is not volatile, then its value cannot change
    > inside the loop. (Signals might be an exception?)


    The C spec doesn't talk about threads, so the compiler only needs to
    read the object when it's logically required from a single-thread point
    of view. 'volatile' forces the compiler to read the object [or write it]
    whenever such expressions occur in the program.

    e.g.

    int a = 4;
    if (a == 4) { ... }

    The compiler doesn't have to read 'a' a second time (during the
    test), because it already knows the value, but it could if it wants
    to. So the fact that you see different behaviour is not surprising.

    Tom
     
    Tom St Denis, May 12, 2010
    #2

  3. John Regehr

    John Regehr Guest

    > int global = 0;
    > void globalvar(void)
    > {
    >    int i;
    >    for (i = 0; i < 1000; ++i)
    >    {
    >      if (global) break;
    >      foo();
    >    }
    >
    > }


    The issue here is that global could be changed after it is initialized
    but before the function executes. That is why the loop is not
    optimized away. It has nothing to do with signals. If you want to
    use a variable for communication between signals/threads it must be
    volatile, global is not enough.
     
    John Regehr, May 12, 2010
    #3
  4. Ben Bacarisse

    Ben Bacarisse Guest

    John Regehr writes:

    >> int global = 0;
    >> void globalvar(void)
    >> {
    >>    int i;
    >>    for (i = 0; i < 1000; ++i)
    >>    {
    >>      if (global) break;
    >>      foo();
    >>    }
    >>
    >> }

    >
    > The issue here is that global could be changed after it is initialized
    > but before the function executes. That is why the loop is not
    > optimized away. It has nothing to do with signals. If you want to
    > use a variable for communication between signals/threads it must be
    > volatile, global is not enough.


    Even volatile is probably not enough. In the presence of interrupts, C
    only makes sufficient guarantees about objects of type sig_atomic_t. As
    for threads, I think the combined wisdom over on
    comp.programming.threads is that, on modern hardware, the semantics of
    volatile are not enough to do most things that you might want it to do
    (obviously it does something, just not enough to be useful).

    A volatile shared object *may* be enough, but that will be because of
    the way some particular system works.

    The new standard (C1x) intends to address these deficiencies.
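
    For reference, the standard that C1x became (C11) added <stdatomic.h>.
    A minimal sketch of a stop flag written with it follows. The names
    stop_flag, request_stop and count_until_stopped are made up for
    illustration, and the thread that would call request_stop is left
    out, so this only shows the shape of the accesses:

```c
#include <stdatomic.h>

/* Hypothetical flag shared between threads; zero-initialized. */
static atomic_int stop_flag;

/* Would be called from another thread to ask the loop to stop. */
void request_stop(void)
{
    atomic_store(&stop_flag, 1);
}

/* Counts up to limit, re-reading the flag atomically on every pass. */
int count_until_stopped(int limit)
{
    int n = 0;
    while (n < limit && !atomic_load(&stop_flag))
        ++n;
    return n;
}
```

    Unlike a plain int, the atomic load cannot be hoisted out of the loop
    on the assumption that no other thread writes the flag.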

    --
    Ben.
     
    Ben Bacarisse, May 13, 2010
    #4
  5. Noob

    Noob Guest

    John Regehr wrote:

    > Noob wrote:
    >
    >> int global = 0;
    >> void globalvar(void)
    >> {
    >> int i;
    >> for (i = 0; i< 1000; ++i)
    >> {
    >> if (global) break;
    >> foo();
    >> }
    >> }

    >
    > The issue here is that global could be changed after it is initialized
    > but before the function executes. That is why the loop is not
    > optimized away.


    Thanks to all for nudging me in the right direction.

    AFAIU, gcc does not optimize the if-statement because foo
    might modify global.

    If foo did not modify global, IPA might enable gcc to
    optimize the if-statement away.
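
    A minimal sketch of that case, assuming foo is defined in the same
    translation unit so the compiler can see that it never writes
    global. The counter calls and the body of foo are made up for
    illustration; whether a given gcc version actually drops the test
    depends on the version and flags:

```c
int global;          /* external linkage, as in global.c */
static int calls;    /* hypothetical counter, for illustration only */

/* Hypothetical foo: visible to the compiler, never writes global. */
static void foo(void)
{
    ++calls;
}

/* Same loop as globalvar; returns how many times foo was called. */
int globalvar(void)
{
    int i;
    calls = 0;
    global = 0;
    for (i = 0; i < 1000; ++i)
    {
        if (global) break;   /* the compiler may prove this is never taken */
        foo();
    }
    return calls;
}
```

    With the body of foo in view, nothing in the loop can change global,
    so the test is a candidate for removal even though global has
    external linkage.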

    http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto-812

    $ cat global.c
    extern void foo(void);
    extern int global;
    void globalvar(void)
    {
        int i;
        global = 0;
        for (i = 0; i < 1000; ++i)
        {
            if (global) break;
            foo();
        }
    }

    NB: global is initialized just before entering the loop.

    $ gcc -O3 -fomit-frame-pointer -S global.c

    _globalvar:
            pushl   %ebx
            xorl    %ebx, %ebx
            subl    $8, %esp
            movl    $0, _global
    L2:
            call    _foo
            cmpl    $999, %ebx
            je      L5
            movl    _global, %eax
            addl    $1, %ebx
            testl   %eax, %eax
            je      L2
    L5:
            addl    $8, %esp
            popl    %ebx
            ret

    Two unrelated comments.

    1) The register allocation seems sub-optimal. I would have
    used ecx (scratch register) instead of ebx.

    2) Why is gcc allocating 8 octets on the stack?

    > It has nothing to do with signals.


    It seems the compiler would have to consider this possibility?
    An (asynchronous) signal handler might modify global at any
    time during the execution of the loop?

    > If you want to use a variable for communication between
    > signals/threads it must be volatile, global is not enough.


    volatile, yes.

    And sig_atomic_t (which is an int on my platform).
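
    A minimal sketch of that combination, assuming a flag named stop and
    a handler named on_signal (both made up here). The hypothetical foo
    raises SIGINT itself on its third call, so the handler runs
    synchronously; a real asynchronous signal would arrive from outside:

```c
#include <signal.h>

/* Flag written by the handler and read by the loop. */
static volatile sig_atomic_t stop;
static int calls;    /* hypothetical counter, for illustration only */

static void on_signal(int sig)
{
    (void)sig;
    stop = 1;        /* the only thing the handler does */
}

/* Hypothetical foo: raises SIGINT on its third call. */
static void foo(void)
{
    if (++calls == 3)
        raise(SIGINT);
}

/* Returns the number of calls to foo before the flag stopped the loop. */
int run_loop(void)
{
    int i;
    calls = 0;
    stop = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    for (i = 0; i < 1000; ++i)
    {
        if (stop) break;   /* volatile: re-read on every iteration */
        foo();
    }
    signal(SIGINT, SIG_DFL);
    return calls;
}
```

    Because stop is volatile, the compiler has to read it on each pass
    and cannot assume it is still 0 after the assignment before the loop.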

    Regards.
     
    Noob, May 17, 2010
    #5
  6. Alan Curry

    Alan Curry Guest

    In article <hsr4at$njv$>, Noob <root@127.0.0.1> wrote:
    >
    >_globalvar:
    > pushl %ebx
    > xorl %ebx, %ebx
    > subl $8, %esp
    > movl $0, _global
    >L2:
    > call _foo
    > cmpl $999, %ebx
    > je L5
    > movl _global, %eax
    > addl $1, %ebx
    > testl %eax, %eax
    > je L2
    >L5:
    > addl $8, %esp
    > popl %ebx
    > ret
    >
    >Two unrelated comments.
    >
    >1) The register allocation seems sub-optimal. I would have
    >used ecx (scratch register) instead of ebx.


    But the function foo is allowed to clobber %ecx, so you'd have to save %ecx
    before calling foo, and restore it afterward. That would be 1000 saves and
    restores instead of 1.

    >
    >2) Why is gcc allocating 8 octets on the stack?


    Stack alignment. Together with the 4 bytes of %ebx and the 4-byte return
    address already on the stack, that makes 16 bytes, which is a nice round
    number. It's considered polite (a.k.a. "an ABI requirement") to make sure
    the offset from the stack pointer you received at the start of your
    function to the stack pointer you give to a called function is a nice
    round number.

    If all functions in a program obey that rule, and the initial stack pointer
    was a nice round number, then every function gets an aligned stack pointer,
    which is easier than making each function responsible for checking the
    incoming stack pointer's low bits before using it as a base for local
    variables.
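
    The arithmetic can be written out as a small helper. This is only a
    sketch: pad_for_alignment is a made-up name, and the 16-byte
    alignment is what the numbers in this listing imply, not something
    every ABI requires:

```c
/* Bytes of padding needed so that used + padding is a multiple of align. */
unsigned pad_for_alignment(unsigned used, unsigned align)
{
    return (align - used % align) % align;
}
```

    With 4 bytes of return address plus 4 bytes of saved %ebx,
    pad_for_alignment(8, 16) gives 8, which matches the subl $8, %esp in
    the listing.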

    --
    Alan Curry
     
    Alan Curry, May 17, 2010
    #6
  7. Noob

    Noob Guest

    Alan Curry wrote:

    > Noob wrote:
    >
    >> _globalvar:
    >> pushl %ebx
    >> xorl %ebx, %ebx
    >> subl $8, %esp
    >> movl $0, _global
    >> L2:
    >> call _foo
    >> cmpl $999, %ebx
    >> je L5
    >> movl _global, %eax
    >> addl $1, %ebx
    >> testl %eax, %eax
    >> je L2
    >> L5:
    >> addl $8, %esp
    >> popl %ebx
    >> ret
    >>
    >> Two unrelated comments.
    >>
    >> 1) The register allocation seems sub-optimal. I would have
    >> used ecx (scratch register) instead of ebx.

    >
    > But the function foo is allowed to clobber %ecx, so you'd have to save %ecx
    > before calling foo, and restore it afterward. That would be 1000 saves and
    > restores instead of 1.


    Your explanation makes perfect sense.

    >> 2) Why is gcc allocating 8 octets on the stack?

    >
    > stack alignment. Together with the 4 bytes of %ebx and 4 byte return address
    > already on the stack, that makes 16 bytes, which is a nice round number. It's
    > considered polite (a.k.a. "an ABI requirement") to make sure the offset from
    > the stack pointer you received at the start of your function to the stack
    > pointer you give to a called function is a nice round number.


    I had forgotten about stack alignment.

    Thanks for the insight.
     
    Noob, May 17, 2010
    #7
