What's an "atomic" operation (in a threaded context)?

Discussion in 'Python' started by Paul Moore, Nov 12, 2003.

  1. Paul Moore

    Paul Moore Guest

    I can't find anything which spells this out in the manuals. I guess
    that, at some level, the answer is "a single bytecode operation", but
    I'm not sure that explains it for me.

    This thought was triggered by a comment on the Python Cookbook site,
    which basically said that it was OK to do
    tss = {}
    ...
    id = thread.get_ident()
    tss[id] = {}

    (where tss is a global) without a lock, because id is unique to the
    thread.

    But couldn't this result in 2 threads allocating a new entry in tss at
    the same time, and so get tss in an inconsistent state?

    I tried to understand this with the dis module:

    >>> import dis
    >>> d = {}
    >>> def f(n):
    ...     d[n] = {}
    ...
    >>> dis.dis(f)
      2           0 BUILD_MAP                0
                  3 LOAD_GLOBAL              0 (d)
                  6 LOAD_FAST                0 (n)
                  9 STORE_SUBSCR
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE

    But I'm not sure what to make of that.

    Can anyone clarify this for me? I'd like to avoid littering my
    threaded code with huge numbers of unnecessary locks...

    Thanks,
    Paul.
    --
    This signature intentionally left blank
     
    Paul Moore, Nov 12, 2003
    #1

  2. Peter Hansen

    Peter Hansen Guest

    Paul Moore wrote:
    >
    > I can't find anything which spells this out in the manuals. I guess
    > that, at some level, the answer is "a single bytecode operation", but
    > I'm not sure that explains it for me.
    >
    > This thought was triggered by a comment on the Python Cookbook site,
    > which basically said that it was OK to do
    > tss = {}
    > ...
    > id = thread.get_ident()
    > tss[id] = {}
    >
    > (where tss is a global) without a lock, because id is unique to the
    > thread.
    >
    > But couldn't this result in 2 threads allocating a new entry in tss at
    > the same time, and so get tss in an inconsistent state?
    >
    > I tried to understand this with the dis module:
    >
    > >>> import dis
    > >>> d = {}
    > >>> def f(n):
    > ...     d[n] = {}
    > ...
    > >>> dis.dis(f)

    >  2           0 BUILD_MAP                0
    >              3 LOAD_GLOBAL              0 (d)
    >              6 LOAD_FAST                0 (n)
    >              9 STORE_SUBSCR
    >             10 LOAD_CONST               0 (None)
    >             13 RETURN_VALUE


    In a nutshell, your "single bytecode operation" theory is correct,
    and each of the above lines represents a single bytecode. Therefore
    the only line which matters is STORE_SUBSCR (the actual assignment)
    and it is atomic, so you can have any number of threads running but
    no matter how they try, they won't be able to interrupt each other's
    STORE_SUBSCR operations and the dictionary will stay in a consistent
    state.

    Much or most of the time, basic operations such as this on the standard
    mutable primitive types (e.g. lists, dictionaries) are atomic/safe for
    use in the way you wish to use them, without needing locks.
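
    For illustration, a minimal sketch of that pattern, assuming the modern
    threading module (threading.get_ident() is the Python 3 spelling of
    thread.get_ident()); the per-thread assignment is the single STORE_SUBSCR
    shown in the disassembly above:

    import threading

    tss = {}  # shared, global thread-specific storage

    def worker():
        # Each thread writes under its own unique key, so the plain dict
        # assignment below needs no explicit lock (in CPython, under the GIL).
        ident = threading.get_ident()
        tss[ident] = {}
        tss[ident]['started'] = True

    threads = [threading.Thread(target=worker) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(len(tss))  # one entry per distinct thread ident (idents can be reused)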

    -Peter
     
    Peter Hansen, Nov 12, 2003
    #2

  3. Dave Brueck

    Dave Brueck Guest

    Paul Moore wrote:
    > I can't find anything which spells this out in the manuals. I guess
    > that, at some level, the answer is "a single bytecode operation", but


    Bytecode instructions are atomic in that you won't switch to another Python
    thread "mid-bytecode", but that's only sorta interesting.
    ;-)

    > This thought was triggered by a comment on the Python Cookbook site,
    > which basically said that it was OK to do
    > tss = {}
    > ...
    > id = thread.get_ident()
    > tss[id] = {}
    >
    > (where tss is a global) without a lock, because id is unique to the
    > thread.
    >
    > But couldn't this result in 2 threads allocating a new entry in tss at
    > the same time, and so get tss in an inconsistent state?


    This is what's more interesting: object access is threadsafe in the sense that
    you can't get Python's internal data structures into an inconsistent state via
    pure Python code (or, for that matter, well-behaved C extensions). IOW, the
    above code won't ever crash the interpreter or screw up Python's internal
    state. Even if two different threads did tss[5] = <somevalue> (both setting a
    value using the same key) you wouldn't get an inconsistent internal state,
    although you would obviously have no guarantees about the ordering of the
    operations - you'd never know if thread A or thread B changed it last.
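
    For example (a hypothetical sketch along those lines), two threads hammering
    the same key never corrupt the dict; only the final value is unpredictable:

    import threading

    tss = {}

    def writer(value, rounds=100000):
        # Both threads assign to the same key; CPython's dict stays internally
        # consistent, but which write lands last is a matter of scheduling.
        for _ in range(rounds):
            tss[5] = value

    a = threading.Thread(target=writer, args=('from thread A',))
    b = threading.Thread(target=writer, args=('from thread B',))
    a.start(); b.start()
    a.join(); b.join()

    print(tss[5])  # either 'from thread A' or 'from thread B' -- no guarantee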

    > Can anyone clarify this for me? I'd like to avoid littering my
    > threaded code with huge numbers of unnecessary locks...


    The above applies to CPython - I'm not too sure about Jython. If you're pretty
    familiar with multithreading in other languages and view Python's built-in data
    types as fundamental types like integers, then the rules for when to use
    locking are pretty much the same as in other languages.
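
    A rough sketch of where that rule bites (names here are hypothetical):
    a read-modify-write like counter += 1 compiles to several bytecodes, so it
    needs a lock just as it would in C or Java, while a single dict assignment
    does not:

    import threading

    counter = 0
    lock = threading.Lock()

    def unsafe_increment(n):
        global counter
        for _ in range(n):
            counter += 1      # load, add, store: several bytecodes, not atomic

    def safe_increment(n):
        global counter
        for _ in range(n):
            with lock:        # same rule as any other language:
                counter += 1  # guard compound updates to shared state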

    -Dave
     
    Dave Brueck, Nov 12, 2003
    #3
  4. Paul Moore

    Paul Moore Guest

    Peter Hansen <> writes:

    > In a nutshell, your "single bytecode operation" theory is correct,
    > and each of the above lines represents a single bytecode. Therefore
    > the only line which matters is STORE_SUBSCR (the actual assignment)
    > and it is atomic, so you can have any number of threads running but
    > no matter how they try, they won't be able to interrupt each other's
    > STORE_SUBSCR operations and the dictionary will stay in a consistent
    > state.


    Thanks. I think I'd got bogged down in C-level thinking ("STORE_SUBSCR
    can reallocate the dictionary, so its address changes, so doesn't
    that mean another thread might have loaded its address and then it
    moves...")

    > Much or most of the time, basic operation such as this on the standard
    > mutable primitive types (i.e. lists, dictionaries) are atomic/safe for
    > use in the way you wish to use them, without needing locks.


    That's good to know. And the trick of using dis to analyze it will
    help as well.
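
    As a quick sketch of that trick (exact opcodes vary across Python versions),
    a compound update disassembles into several opcodes, unlike the single
    STORE_SUBSCR above:

    import dis

    d = {}

    def bump(n):
        d[n] = d.get(n, 0) + 1   # read, add, store

    # The listing shows separate load/call/add/store opcodes, so another
    # thread can be scheduled between the read and the store -- unlike the
    # single STORE_SUBSCR of a plain d[n] = {} assignment.
    dis.dis(bump)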

    Thanks for the explanation,
    Paul
    --
    This signature intentionally left blank
     
    Paul Moore, Nov 12, 2003
    #4
  5. Dieter Maurer

    Dieter Maurer Guest

    Peter Hansen <> writes on Wed, 12 Nov 2003 17:49:10 -0500:
    > ...
    > > 9 STORE_SUBSCR

    > ...
    > In a nutshell, your "single bytecode operation" theory is correct,
    > and each of the above lines represents a single bytecode. Therefore
    > the only line which matters is STORE_SUBSCR (the actual assignment)
    > and it is atomic


    Are you sure? "STORE_SUBSCR" may dereference an object causing
    the object to be deleted. Its destructor may execute arbitrary
    code.

    Almost surely (I trust the Python developers), this happens
    after the assignment has been made (but before "STORE_SUBSCR" has finished).
    Maybe this is enough "atomicity" for you.
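
    A small sketch of the point (the class name here is hypothetical): replacing
    a value in a dict can run arbitrary destructor code as part of that one
    bytecode:

    class Noisy:
        def __del__(self):
            # Arbitrary code runs here once the last reference disappears.
            print('destructor running')

    d = {}
    d['k'] = Noisy()
    d['k'] = {}   # STORE_SUBSCR drops the old value's last reference,
                  # so __del__ runs as part of that single bytecode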


    Dieter
     
    Dieter Maurer, Nov 14, 2003
    #5
  6. Parzival

    Parzival Guest

    Dieter Maurer wrote:

    > Peter Hansen <> writes on Wed, 12 Nov 2003 17:49:10 -0500:
    >> > 9 STORE_SUBSCR

    >> In a nutshell, your "single bytecode operation" theory is correct,
    >> and each of the above lines represents a single bytecode. Therefore
    >> the only line which matters is STORE_SUBSCR (the actual assignment)
    >> and it is atomic

    >
    > Are you sure? "STORE_SUBSCR" may drop the last reference to an object,

    <snip>
    > Almost surely (I trust the Python developers), this happens
    > after the assignment has been made (but before "STORE_SUBSCR" has finished).
    > Maybe this is enough "atomicity" for you.


    Do you really think this is a sound approach? You are obviously concerned
    about correct concurrency, yet you are prepared to disassemble Python byte
    code and to speculate about the actions, and then you are going to risk the
    correctness of your program on what you observe and deduce?

    I'll wager that nobody who is somebody in the Python world is going to offer
    *any* guarantees about what source code generates what bytecode. AFAIK,
    the Python GIL guarantees at least that a single bytecode operation is atomic,
    but the _language_ makes absolutely no commitment as to which source-code
    constructs are guaranteed to be (now and forever) atomic. If I am wrong
    about this, I want to see it in writing in the official documentation.

    Until then, for correct concurrent operation, you *must* lock around all
    shared concurrent data accesses.
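
    In concrete terms, a conservative sketch that relies on no bytecode-level
    detail at all might look like this:

    import threading

    tss = {}
    tss_lock = threading.Lock()

    def register():
        # Take the lock for every access to the shared dict, rather than
        # reasoning about which individual dict operations happen to be atomic.
        ident = threading.get_ident()
        with tss_lock:
            tss[ident] = {}

    def lookup(ident):
        with tss_lock:
            return tss.get(ident)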

    (-: Stop, smile, and enjoy your breathing :)
    -- Parzival
    -- Reply-to is confuggled: parzp (@) shaw (.) ca
     
    Parzival, Nov 15, 2003
    #6
