locks

Discussion in 'Python' started by Ajay, Oct 13, 2004.

  1. Ajay

    hi!

    what would happen if i try to access a variable locked by another thread? i
    am not trying to obtain a lock on it, just trying to access it.

    cheers





    ----------------------------------------------------------------
    This message was sent using IMP, the Internet Messaging Program.
    Ajay, Oct 13, 2004
    #1

  2. Ajay wrote:

    > what would happen if i try to access a variable locked by another thread?
    > i am not trying to obtain a lock on it, just trying to access it.


    If you first tell us how you actually lock a variable, we then might be able
    to tell you what happens if you access it....

    And in general: python has the PIL - Python Interpreter Lock - that
    "brutally" serializes (hopefully) all accesses to python data-structures -
    so e.g. running several threads, appending to the same list, won't result
    in messing up the internal list structure causing segfaults or the like.
    That makes programming pretty easy, at the cost of lots of waiting for the
    individual threads.

    --
    Regards,

    Diez B. Roggisch
    Diez B. Roggisch, Oct 13, 2004
    #2

  3. Cliff Wells

    On Wed, 2004-10-13 at 14:11 +0200, Diez B. Roggisch wrote:
    > Ajay wrote:
    >
    > > what would happen if i try to access a variable locked by another thread?
    > > i am not trying to obtain a lock on it, just trying to access it.

    >
    > If you first tell us how you actually lock a variable, we then might be able
    > to tell you what happens if you access it....
    >
    > And in general: python has the PIL - Python Interpreter Lock - that


    I think you mean the GIL (Global Interpreter Lock). PIL is the
    excellent Python Imaging Library.

    > "brutally" serializes (hopefully) all accesses to python data-structures -


    Nope. It doesn't do this. For access to items such as integers you are
    probably fine, but for things like lists, dictionaries, class
    attributes, etc, you're on your own. The GIL only ensures that two
    threads won't be executing Python bytecode simultaneously. It locks the
    Python *interpreter*, not your program or data structures.
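
    To make that concrete, here is a small sketch (modern Python, not from the
    original discussion; the names are made up) of a compound operation the GIL
    does not protect, next to the same update guarded by a lock:

```python
import threading

counter = 0        # shared, unprotected
safe_counter = 0   # shared, guarded by a lock
lock = threading.Lock()

def bump(n):
    global counter, safe_counter
    for _ in range(n):
        counter += 1           # read-modify-write: several bytecodes, not atomic
        with lock:
            safe_counter += 1  # the lock serializes the whole update

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(safe_counter)  # always 400000
print(counter)       # at most 400000; updates can be lost without the lock
```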

    > so e.g. running several threads, appending to the same list, won't result
    > in messing up the internal list structure causing segfaults or the like.


    True, you won't get segfaults. However, you may very well get a
    traceback or mangled data.

    > That makes programming pretty easy, at the cost of lots of waiting for the
    > individual threads.


    Threading in Python is pretty easy, but certainly not *that* easy. And
    just to be certain, importing PIL won't help you here either <wink>.

    Regards,
    Cliff

    --
    Cliff Wells <>
    Cliff Wells, Oct 13, 2004
    #3
  4. Fredrik Lundh, Oct 13, 2004
    #4
  5. > I think you mean the GIL (Global Interpreter Lock). PIL is the
    > excellent Python Imaging Library.


    I certainly did - too little caffeine in the system yet...


    > Nope. It doesn't do this. For access to items such as integers you are
    > probably fine, but for things like lists, dictionaries, class
    > attributes, etc, you're on your own. The GIL only ensures that two
    > threads won't be executing Python bytecode simultaneously. It locks the
    > Python *interpreter*, not your program or data structures.

    <snip>
    > True, you won't get segfaults. However, you may very well get a
    > traceback or mangled data.


    I thought that e.g. list manipulations are single bytecodes, thus atomic.
    So far, I never ran into serious problems giving me garbage lists or
    stack traces.

    Nevertheless, I of course used queues and locked access to certain
    data structures when critical sections had to be entered - but in
    comparison to Java, I never had to ask for a specially thread-hardened
    variant of a collection.


    > Threading in Python is pretty easy, but certainly not *that* easy. And
    > just to be certain, importing PIL won't help you here either <wink>.


    Unless you plan to do some nifty image manipulation work multithreaded....
    --
    Regards,

    Diez B. Roggisch
    Diez B. Roggisch, Oct 13, 2004
    #5
  6. Cliff Wells wrote:
    > On Wed, 2004-10-13 at 14:11 +0200, Diez B. Roggisch wrote:
    >>"brutally" serializes (hopefully) all accesses to python data-structures -

    >
    > Nope. It doesn't do this. For access to items such as integers you are
    > probably fine, but for things like lists, dictionaries, class
    > attributes, etc, you're on your own. The GIL only ensures that two
    > threads won't be executing Python bytecode simultaneously. It locks the
    > Python *interpreter*, not your program or data structures.
    >
    >>so e.g. running several threads, appending to the same list, won't result
    >>in messing up the internal list structure causing segfaults or the like.

    >
    > True, you won't get segfaults. However, you may very well get a
    > traceback or mangled data.
    >
    >>That makes programming pretty easy, at the cost of lots of waiting for the
    >>individual threads.

    >
    > Threading in Python is pretty easy, but certainly not *that* easy.


    Cliff, do you have any references, or even personal experience to
    relate about anything on which you comment above?

    In my experience, and to my knowledge, Python threading *is*
    that easy (ignoring higher level issues such as race conditions
    and deadlocks and such), and the GIL *does* do exactly what Diez
    suggests, and you will *not* get tracebacks nor (again, ignoring
    higher level issues) mangled data.

    You've tentatively upset my entire picture of the CPython (note,
    CPython only) interpreter's structure and concept. Please tell
    me you were going a little overboard to protect a possible
    newbie from himself or something.

    -Peter
    Peter L Hansen, Oct 13, 2004
    #6
  7. Cliff Wells

    On Wed, 2004-10-13 at 08:52 -0400, Peter L Hansen wrote:
    > Cliff Wells wrote:
    > > On Wed, 2004-10-13 at 14:11 +0200, Diez B. Roggisch wrote:
    > >>"brutally" serializes (hopefully) all accesses to python data-structures -

    > >
    > > Nope. It doesn't do this. For access to items such as integers you are
    > > probably fine, but for things like lists, dictionaries, class
    > > attributes, etc, you're on your own. The GIL only ensures that two
    > > threads won't be executing Python bytecode simultaneously. It locks the
    > > Python *interpreter*, not your program or data structures.
    > >
    > >>so e.g. running several threads, appending to the same list, won't result
    > >>in messing up the internal list structure causing segfaults or the like.

    > >
    > > True, you won't get segfaults. However, you may very well get a
    > > traceback or mangled data.
    > >
    > >>That makes programming pretty easy, at the cost of lots of waiting for the
    > >>individual threads.

    > >
    > > Threading in Python is pretty easy, but certainly not *that* easy.

    >
    > Cliff, do you have any references, or even personal experience to
    > relate about anything on which you comment above?


    I'm no expert on Python internals, but it seems clear that an operation
    such as [].append() is going to span multiple bytecode instructions. It
    seems to me that if those instructions span the boundary defined by
    sys.getcheckinterval(), the operation won't happen within a single
    thread context switch (unless the interpreter has explicit code to keep
    the entire operation within a single context).

    I'm no expert at dis nor Python bytecode, but I'll give it a shot :)

    >>> l = []
    >>> dis.dis(l.append(1))

    134 0 LOAD_GLOBAL 0 (findlabels)
    3 LOAD_FAST 0 (code)
    6 CALL_FUNCTION 1
    9 STORE_FAST 5 (labels)


    ....
    <snip dis spitting out over 500 lines of bytecode>
    ....

    172 >> 503 PRINT_NEWLINE
    504 JUMP_ABSOLUTE 33
    >> 507 POP_TOP

    508 POP_BLOCK
    >> 509 LOAD_CONST 0 (None)

    512 RETURN_VALUE
    >>>



    It looks fairly non-atomic to me. It's certainly smaller than the
    default value for sys.getcheckinterval() (which defaults to 100, iirc),
    but that's hardly a guarantee that the operation won't cross the
    boundary for a context switch (unless, as I mentioned above, the
    interpreter has specific code to prevent the switch until the operation
    is complete <shrug>).

    I recall a similar discussion about three years ago on this list about
    this very thing where people who know far more about it than I do flamed
    it out a bit, but damned if I recall the outcome :p I do recall that it
    didn't convince me to alter the approach I recommended to the OP.

    > In my experience, and to my knowledge, Python threading *is*
    > that easy (ignoring higher level issues such as race conditions
    > and deadlocks and such), and the GIL *does* do exactly what Diez
    > suggests, and you will *not* get tracebacks nor (again, ignoring
    > higher level issues) mangled data.


    Okay, to clarify, for the most part I *was* in fact referring to "higher
    level issues". I doubt tracebacks or mangled data would occur simply
    due to the operation's being non-atomic. However, if you have code that
    say, checks for an item's existence in a list and then appends it if it
    isn't there, it may cause the program to fail if another thread adds
    that item between the time of the check and the time of the append.
    This is what I was referring to by potential for mangled data and/or
    tracebacks.
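
    A hedged sketch of that race and the usual fix (modern Python; the
    function and names are illustrative, not from the thread) - the point is
    that the membership test and the append must happen under one lock:

```python
import threading

items = []
items_lock = threading.Lock()

def add_unique(item):
    # Without the lock, another thread could append the same item between
    # the membership test and the append() - the race described above.
    with items_lock:
        if item not in items:
            items.append(item)

threads = [threading.Thread(target=add_unique, args=(42,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(items)  # [42] - exactly one copy, regardless of scheduling
```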

    > You've tentatively upset my entire picture of the CPython (note,
    > CPython only) interpreter's structure and concept. Please tell
    > me you were going a little overboard to protect a possible
    > newbie from himself or something.


    Certainly protecting the newbie, but not going overboard, IMHO. I've
    written quite a number of threaded Python apps and I religiously
    acquire/release whenever dealing with mutable data structures (lists,
    etc). To date this approach has served me well. I code fairly
    conservatively when it comes to threads as I am *absolutely* certain
    that debugging a broken threaded application is very near the bottom of
    my list of favorite things ;)

    Regards,
    Cliff

    --
    Cliff Wells <>
    Cliff Wells, Oct 13, 2004
    #7
  8. Duncan Booth

    Cliff Wells wrote:

    > I'm no expert at dis nor Python bytecode, but I'll give it a shot :)
    >
    >>>> l = []
    >>>> dis.dis(l.append(1))

    > 134 0 LOAD_GLOBAL 0 (findlabels)
    > 3 LOAD_FAST 0 (code)
    > 6 CALL_FUNCTION 1
    > 9 STORE_FAST 5 (labels)
    >
    >
    > ...
    ><snip dis spitting out over 500 lines of bytecode>
    > ...
    >
    > 172 >> 503 PRINT_NEWLINE
    > 504 JUMP_ABSOLUTE 33
    > >> 507 POP_TOP

    > 508 POP_BLOCK
    > >> 509 LOAD_CONST 0 (None)

    > 512 RETURN_VALUE
    >>>>

    >
    >
    > It looks fairly non-atomic to me.


    The append method of a list returns None. dis.dis(None) disassembles the
    code from the last traceback object, nothing at all to do with your
    l.append(1) code.

    Try this instead:

    >>> def f():
    ...     l.append(1)
    ...
    >>> dis.dis(f)

      2           0 LOAD_GLOBAL              0 (l)
                  3 LOAD_ATTR                1 (append)
                  6 LOAD_CONST               1 (1)
                  9 CALL_FUNCTION            1
                 12 POP_TOP
                 13 LOAD_CONST               0 (None)
                 16 RETURN_VALUE
    Duncan Booth, Oct 13, 2004
    #8
  9. > Okay, to clarify, for the most part I *was* in fact referring to "higher
    > level issues". I doubt tracebacks or mangled data would occur simply
    > due to the operation's being non-atomic. However, if you have code that
    > say, checks for an item's existence in a list and then appends it if it
    > isn't there, it may cause the program to fail if another thread adds
    > that item between the time of the check and the time of the append.
    > This is what I was referring to by potential for mangled data and/or
    > tracebacks.


    _That_ of course I'm very well aware of - but in my experience, with several
    dozen threads appending to one list I never encountered an interpreter
    failure. That is in contrast to Java, where you get a
    "ConcurrentModificationException" unless you specifically ask for a
    synchronized variant of your collection.


    --
    Regards,

    Diez B. Roggisch
    Diez B. Roggisch, Oct 13, 2004
    #9
  10. Cliff Wells

    On Wed, 2004-10-13 at 14:03 +0000, Duncan Booth wrote:
    > Cliff Wells wrote:
    >
    > > I'm no expert at dis nor Python bytecode, but I'll give it a shot :)
    > >
    > >>>> l = []
    > >>>> dis.dis(l.append(1))

    > > 134 0 LOAD_GLOBAL 0 (findlabels)
    > > 3 LOAD_FAST 0 (code)
    > > 6 CALL_FUNCTION 1
    > > 9 STORE_FAST 5 (labels)
    > >
    > >
    > > ...
    > ><snip dis spitting out over 500 lines of bytecode>
    > > ...
    > >
    > > 172 >> 503 PRINT_NEWLINE
    > > 504 JUMP_ABSOLUTE 33
    > > >> 507 POP_TOP

    > > 508 POP_BLOCK
    > > >> 509 LOAD_CONST 0 (None)

    > > 512 RETURN_VALUE
    > >>>>

    > >
    > >
    > > It looks fairly non-atomic to me.

    >
    > The append method of a list returns None. dis.dis(None) disassembles the
    > code from the last traceback object, nothing at all to do with your
    > l.append(1) code.


    Ah, thanks. I thought 500+ lines of bytecode was a bit excessive for a
    simple append(), but didn't see any reason why. I saw the comment in
    the docs about dis returning the last traceback if no argument was
    provided but didn't see how that applied here.

    >
    > Try this instead:
    >
    > >>> def f():
    > ...     l.append(1)
    > ...
    > >>> dis.dis(f)

    >   2           0 LOAD_GLOBAL              0 (l)
    >               3 LOAD_ATTR                1 (append)
    >               6 LOAD_CONST               1 (1)
    >               9 CALL_FUNCTION            1
    >              12 POP_TOP
    >              13 LOAD_CONST               0 (None)
    >              16 RETURN_VALUE


    Much more reasonable. Still, I think my argument stands since this
    appears non-atomic as well, although I do note this:

    >>> l = []
    >>> dis.dis(l.append)

    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/usr/lib/python2.3/dis.py", line 46, in dis
    raise TypeError, \
    TypeError: don't know how to disassemble builtin_function_or_method
    objects
    >>>


    This suddenly gave me a smack on the head that list.append is
    undoubtedly written in C and might, in fact, retain the GIL for the
    duration of the function in which case the operation might, in fact, be
    atomic (yes, I know that isn't necessarily what the above traceback was
    saying, but it served as a clue-stick).

    Still, despite the probability of being quite mistaken about the low-level
    internals of the operation, I stand by my assertion that not using locks
    for mutable data is ill-advised at best, for the reasons I outlined in my
    previous post (aside from the poorly executed disassembly).

    Regards,
    Cliff

    --
    Cliff Wells <>
    Cliff Wells, Oct 13, 2004
    #10
  11. Cliff Wells

    On Wed, 2004-10-13 at 16:10 +0200, Diez B. Roggisch wrote:
    > > Okay, to clarify, for the most part I *was* in fact referring to "higher
    > > level issues". I doubt tracebacks or mangled data would occur simply
    > > due to the operation's being non-atomic. However, if you have code that
    > > say, checks for an item's existence in a list and then appends it if it
    > > isn't there, it may cause the program to fail if another thread adds
    > > that item between the time of the check and the time of the append.
    > > This is what I was referring to by potential for mangled data and/or
    > > tracebacks.

    >
    > _That_ of course I'm very well aware of - but in my experience, with several
    > dozen threads appending to one list I never encountered an interpreter
    > failure. That is in contrast to Java, where you get a
    > "ConcurrentModificationException" unless you specifically ask for a
    > synchronized variant of your collection.


    Have you looked at the Queue module? It was explicitly designed for
    this sort of thing and removes all doubt about thread-safety.
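
    For readers following along today, a minimal producer/consumer sketch (the
    module is spelled `Queue` in the Python 2 of this thread, `queue` in
    Python 3; the sentinel convention here is just one common idiom):

```python
import queue
import threading

q = queue.Queue()   # put() and get() do all the locking internally
results = []

def producer(n):
    for i in range(n):
        q.put(i)
    q.put(None)     # sentinel tells the consumer to stop

def consumer():
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item)  # only this thread touches results

t1 = threading.Thread(target=producer, args=(5,))
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()

print(results)  # [0, 1, 2, 3, 4]
```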

    Regards,
    Cliff

    --
    Cliff Wells <>
    Cliff Wells, Oct 13, 2004
    #11
  12. >
    > Have you looked at the Queue module? It was explicitly designed for
    > this sort of thing and removes all doubt about thread-safety.


    Sure, and I use it when appropriate, as said in response to another post of
    yours.

    But so far in my experience, explicit serialization of access to certain
    data structures was only necessary when more complicated structural
    modifications were under way - but the usual suspects, such as appending to
    lists or inserting values into dicts, never needed this. And that is what I
    wanted to point out.

    --
    Regards,

    Diez B. Roggisch
    Diez B. Roggisch, Oct 13, 2004
    #12
  13. Duncan Booth

    Cliff Wells wrote:

    > This suddenly gave me a smack on the head that list.append is
    > undoubtedly written in C and might, in fact, retain the GIL for the
    > duration of the function in which case the operation might, in fact, be
    > atomic (yes, I know that isn't necessarily what the above traceback was
    > saying, but it served as a clue-stick).
    >


    Roughly correct. list.append is written in C and therefore you might assume
    it is atomic. However, it increases the size of a list which means it may
    allocate memory which could cause the garbage collector to kick in which in
    turn might free up cycles which could release references to objects with
    __del__ methods which could execute other byte code at which point all bets
    are off.
    Duncan Booth, Oct 13, 2004
    #13
  14. Tim Peters

    [Cliff Wells]
    >> This suddenly gave me a smack on the head that list.append is
    >> undoubtedly written in C and might, in fact, retain the GIL for the
    >> duration of the function in which case the operation might, in fact, be
    >> atomic (yes, I know that isn't necessarily what the above traceback was
    >> saying, but it served as a clue-stick).


    [Duncan Booth]
    > Roughly correct. list.append is written in C and therefore you might assume
    > it is atomic. However, it increases the size of a list which means it may
    > allocate memory which could cause the garbage collector to kick in which in
    > turn might free up cycles which could release references to objects with
    > __del__ methods which could execute other byte code at which point all bets
    > are off.


    Not in CPython, no. The only things that can trigger CPython's cyclic
    gc are calling gc.collect() explicitly, or (from time to time)
    creating a new container object (a new object that participates in
    cyclic gc). If list.append() can't get enough memory "on the first
    try" to extend the existing list, it raises MemoryError. And if a
    thread is in the bowels of list.append, the GIL prevents any other
    thread from triggering cyclic GC for the duration.
    Tim Peters, Oct 13, 2004
    #14
  15. Jeff Shannon

    Diez B. Roggisch wrote:

    >But so far in my experience, explicit serialization of access to certain
    >data structures was only necessary when more complicated structural
    >modifications were under way - but the usual suspects, such as appending to
    >lists or inserting values into dicts, never needed this. And that is what
    >I wanted to point out.
    >
    >


    ... provided, of course, that no thread expects to maintain internal
    consistency. For example, a thread doing something like:

    foo_list.append(foo)
    assert(foo == foo_list[-1])

    The assertion here is *not* guaranteed to be true, and if one is
    modifying and then reading a mutable object, it can be somewhat tricky
    to ensure that no such assumptions creep into code.

    Similarly, in

    if foo_list[-1] is not None:
        foo = foo_list.pop()

    Here, foo may indeed be None, because another thread may have appended
    to the list in between the test and the call to pop() -- but this is
    getting into the "more complicated structural modifications" that you
    mention.

    A slightly trickier example, though:

    foo_list[-1] == foo_list[-1]

    I believe that this can't be guaranteed to always evaluate to True in a
    (non-locking) multithreaded case, because foo_list can be modified in
    between the two lookups.
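
    The usual cure for both examples is to hold one lock across the test and
    the mutation; a sketch (modern Python; the names are invented for
    illustration, not part of the original post):

```python
import threading

foo_list = []
foo_lock = threading.Lock()

def pop_if_not_none():
    # Holding the lock across the test *and* the pop closes the window in
    # which another thread could mutate the list between the two operations.
    with foo_lock:
        if foo_list and foo_list[-1] is not None:
            return foo_list.pop()
        return None

foo_list.extend([1, 2, 3])
print(pop_if_not_none())  # 3
print(foo_list)           # [1, 2]
```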

    Thus, while it's pretty safe to assume that accessing shared objects
    won't, in and of itself, cause an exception, the case about "mangled
    data" is hazier and depends on how exactly you mean "mangled".

    I expect that most of us know this and would never assume otherwise, I
    just wanted to make that explicit for the benefit of the O.P. and others
    who're unfamiliar with threading, as it seems to me that this point
    might've gotten a bit confused for those who didn't already understand
    it well. :)

    Jeff Shannon
    Technician/Programmer
    Credit International
    Jeff Shannon, Oct 13, 2004
    #15
