Queue can result in nested monitor deadlock

Jonathan Amsterdam

I think there's a slight design flaw in the Queue class that makes it
hard to avoid nested monitor deadlock. The problem is that the mutex
used by the Queue is not easy to change. You can then easily get
yourself into the following situation (nested monitor deadlock):

Say we have a class that contains a Queue and some other things. The
class's internals are protected by a mutex M. Initially the Queue is
empty. The only access to the queue is through this class.

Thread 1 locks M, then calls Queue.get(). It blocks. At this point the
Queue's mutex is released, but M is still held.

Thread 2 tries to put something in the Queue by calling the enclosing
class's methods, but it blocks on M. Deadlock.
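
A minimal sketch of the scenario (not from the original post; the Holder class and its names are illustrative, written against the Python 2 Queue/threading modules used in this thread):

import threading
import Queue    # "queue" in Python 3

class Holder:
    def __init__(self):
        self.lock = threading.Lock()    # M, protecting all internals
        self.queue = Queue.Queue()      # initially empty

    def get_item(self):
        with self.lock:                 # Thread 1 takes M ...
            return self.queue.get()     # ... and blocks here; only the Queue's own mutex is released

    def put_item(self, item):
        with self.lock:                 # Thread 2 blocks here forever
            self.queue.put(item)

holder = Holder()
threading.Thread(target=holder.get_item).start()
# Any subsequent holder.put_item(42) from another thread now waits on M,
# so the get can never complete: nested monitor deadlock.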

The solution would be to expose the Queue's mutex via a constructor
argument and/or method, so that, to take the above example, one could
create M and give it to the Queue as its mutex. Of course this is
doable now, as the code below demonstrates. But it should be documented
and exposed.

As I'm new to the Python community, I'm not sure that this is the right
forum for this suggestion. Is it the sort of thing one would put on the
SourceForge bug list? Advice appreciated.

"Fixed" Queue class:

import Queue      # renamed to "queue" in Python 3
import threading

class LQueue(Queue.Queue):
    """
    Fixes a problem with the standard Queue implementation: you can't
    use your own mutex, so you are subject to nested monitor deadlock.
    This version lets you pass in your own lock.
    """

    def __init__(self, maxsize=0, lock=None):
        """Optionally takes a lock (threading.Lock or threading.RLock)
        to use for the queue's lock."""
        Queue.Queue.__init__(self, maxsize)
        if lock:
            self.mutex = lock
            # Re-create the condition objects so they wait on the new mutex.
            self.not_empty = threading.Condition(self.mutex)
            self.not_full = threading.Condition(self.mutex)
            # Newer Queue versions also have an all_tasks_done condition
            # (used by join()/task_done()); rebind it too if present.
            if hasattr(self, 'all_tasks_done'):
                self.all_tasks_done = threading.Condition(self.mutex)
 
Alex Martelli

Jonathan Amsterdam said:
As I'm new to the Python community, I'm not sure that this is the right
forum for this suggestion. Is it the sort of thing one would put on the
SourceForge bug list? Advice appreciated.

Posting a patch and/or bug to Sourceforge is probably the best approach.


Alex
 
Paul McGuire

Jonathan Amsterdam said:
I think there's a slight design flaw in the Queue class that makes it
hard to avoid nested monitor deadlock. The problem is that the mutex
used by the Queue is not easy to change. You can then easily get
yourself into the following situation (nested monitor deadlock):

Say we have a class that contains a Queue and some other things. The
class's internals are protected by a mutex M. Initially the Queue is
empty. The only access to the queue is through this class.

Thread 1 locks M, then calls Queue.get(). It blocks. At this point the
Queue's mutex is released, but M is still held.

Thread 2 tries to put something in the Queue by calling the enclosing
class's methods, but it blocks on M. Deadlock.

This isn't really a deadlock; it is just a blocking condition until Thread
1 releases M.

There are 3 requirements for deadlock:
1. Multiple resources/locks
2. Multiple threads
3. Out-of-order access

You have definitely created the first two conditions, but the deadlock only
occurs if Thread 2 first acquires the Queue lock and then needs to get M before
continuing. By forcing all access to the Queue through the class's methods,
which presumably lock and unlock M as part of their implementation, you
have successfully prevented 3. No deadlocks, just proper
blocking/synchronization.

-- Paul
 
Jonathan Amsterdam

If you don't want to call it deadlock, fine, but the program execution
I describe will make no progress to the end of time. Thread 2 can never
put anything in the queue, because Thread 1 holds M, and Thread 1 will
never release M because that can only happen if someone puts something
on the queue.
 
Jack Diederich

Jonathan Amsterdam said:
If you don't want to call it deadlock, fine, but the program execution
I describe will make no progress to the end of time. Thread 2 can never
put anything in the queue, because Thread 1 holds M, and Thread 1 will
never release M because that can only happen if someone puts something
on the queue.

Right, the problem isn't with Queue; it is with your global lock M.
Here is the pseudocode for your program:

acquire_lock("No_one_else_can_do_anything")
wait_for_someone_else_to_do_something() # waits forever

-Jack
 
Duncan Booth

Jonathan said:
If you don't want to call it deadlock, fine, but the program execution
I describe will make no progress to the end of time. Thread 2 can never
put anything in the queue, because Thread 1 holds M, and Thread 1 will
never release M because that can only happen if someone puts something
on the queue.

If your class is using a queue to handle the inter-thread communication, why
is it also using a semaphore?

If your methods are already locked so that only one thread runs at a time, why
are you then using a queue to provide another layer of locking? A simple
list should suffice.
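
One way to read this suggestion (a sketch, not from the original thread; the Mailbox class and its names are illustrative): since the enclosing class already has its own lock, a blocking get can be built directly on a Condition tied to that lock, with a plain list as storage, so no separate Queue mutex ever comes into play.

import threading

class Mailbox:
    def __init__(self):
        self.lock = threading.RLock()               # M, protecting all internal state
        self.not_empty = threading.Condition(self.lock)
        self.items = []                             # plain list instead of Queue.Queue

    def get(self):
        with self.not_empty:                        # acquires M
            while not self.items:
                self.not_empty.wait()               # releases M while blocked
            return self.items.pop(0)

    def put(self, item):
        with self.not_empty:
            self.items.append(item)
            self.not_empty.notify()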
 
Paul McGuire

Jonathan Amsterdam said:
If you don't want to call it deadlock, fine, but the program execution
I describe will make no progress to the end of time. Thread 2 can never
put anything in the queue, because Thread 1 holds M, and Thread 1 will
never release M because that can only happen if someone puts something
on the queue.

Ah, I get it now, and yes, as you describe it, this is a deadlock. But why
do you need both mutex M and a Queue, then? You should release M before
calling Queue.get(). Then if you need to, you should reacquire it
afterward.

-- Paul
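
What Paul is suggesting would look roughly like this (a sketch; the Holder class and its fields are illustrative, not from the thread). The other shared state is touched only while M is held, and the blocking Queue.get() happens with M released; Jonathan's objection below is that another thread can slip in between the two locked regions.

import threading
import Queue    # "queue" in Python 3

class Holder:
    def __init__(self):
        self.lock = threading.Lock()    # M, protecting the other fields
        self.queue = Queue.Queue()
        self.waiters = 0                # some other shared state

    def get_item(self):
        with self.lock:                 # touch shared state under M ...
            self.waiters += 1
        item = self.queue.get()         # ... then block with M released
        with self.lock:                 # reacquire M afterward if needed
            self.waiters -= 1
        return item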
 
Terry Reedy

Jonathan Amsterdam said:
As I'm new to the Python community, I'm not sure that this is the right
forum for this suggestion. Is it the sort of thing one would put on the
SourceForge bug list? Advice appreciated.

As a sometimes bug reviewer who has noticed that a substantial fraction of
'bug' reports do not actually report CPython bugs, I wish more people would
do as you did and post here first and get opinions from the wider variety
of readers here.

A CPython bug is a discrepancy between a reasonable interpretation of the
doc and the behavior of the CPython implementation. Your post did not even
claim such a discrepancy that I could see.

A request for a design change can be put on the RFE (request for
enhancement) list. Patches, whether to fix bugs or implement enhancements,
usually go on the patch list. Note that you do not necessarily need to
convince people that the current design is 'flawed' to demonstrate that a
change would be good.

Perhaps your recipe should be added to the O'Reilly online Python Cookbook.

Terry Jan Reedy
 
Alan Morgan

Jonathan Amsterdam said:
If you don't want to call it deadlock, fine, but the program execution
I describe will make no progress to the end of time. Thread 2 can never
put anything in the queue, because Thread 1 holds M, and Thread 1 will
never release M because that can only happen if someone puts something
on the queue.

That's not a problem in the design of the queue class; it is a problem
in how you are using it. Two possible solutions are:

1. Don't have the global lock on the object (or, at the very least,
don't have that global lock taken when you read from the queue).
2. Don't use a synchronized queue. If the only access to the queue is
through the object and the object is protected, then you don't need
a synchronized queue.

Alan
 
Jonathan Amsterdam

This is a reply to Alan Morgan, Paul McGuire and Duncan Booth.

I need mutex M because I have other fields in my class that need to be
thread-safe.

The reason I want to use a Queue and not a list is that a Queue has
additional synchronization besides the mutex. For instance, Queue.get()
will block the caller until the queue is non-empty. Of course I could
build this behavior on top of a list, but then I'm reinventing the
wheel: Queue.Queue is the perfect thing, except for the problem I
describe.

I can't release my mutex M and then call Queue.get(). That could be a
race condition: between the time M is released and Queue's mutex is
acquired, another thread could get into my object and mess things up.
(We'd need a specific example to see this, and there may be many cases
where I could safely release M before calling Queue.get(), but in
general it won't work.)
 
Paul McGuire

Jonathan Amsterdam said:
This is a reply to Alan Morgan, Paul McGuire and Duncan Booth.

I need mutex M because I have other fields in my class that need to be
thread-safe.

The reason I want to use a Queue and not a list is that a Queue has
additional synchronization besides the mutex. For instance, Queue.get()
will block the caller until the queue is non-empty. Of course I could
build this behavior on top of a list, but then I'm reinventing the
wheel: Queue.Queue is the perfect thing, except for the problem I
describe.

I can't release my mutex M and then call Queue.get(). That could be a
race condition: between the time M is released and Queue's mutex is
acquired, another thread could get into my object and mess things up.
(We'd need a specific example to see this, and there may be many cases
where I could safely release M before calling Queue.get(), but in
general it won't work.)

What if you expose an unguarded method on your class that allows external
put()s to the Queue without first locking M?

-- Paul
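
A sketch of what Paul is describing (the class and method names are illustrative, not from the thread): the producer-facing method bypasses M entirely and relies only on the Queue's internal locking, so a producer can never end up waiting on M.

import threading
import Queue    # "queue" in Python 3

class Holder:
    def __init__(self):
        self.lock = threading.Lock()    # M, guarding the other fields only
        self.queue = Queue.Queue()

    def put_unguarded(self, item):
        # No M here: this relies solely on the Queue's own locking,
        # so whoever feeds the queue cannot block on M.
        self.queue.put(item)

    def get_item(self):
        with self.lock:                 # consumer may still hold M while blocking,
            return self.queue.get()     # but producers no longer need M to unblock it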
 
Dennis Lee Bieber

Jonathan Amsterdam said:
This is a reply to Alan Morgan, Paul McGuire and Duncan Booth.

I need mutex M because I have other fields in my class that need to be
thread-safe.

Unfortunately, since you haven't shown us /your/ code (or a minimal
subset that represents your problem) it is not all that easy to propose
valid solutions...

I would suggest, for the above "other fields", that you release M
prior to doing the Q.get(), and reacquire M whenever Q.get() returns.

Jonathan Amsterdam said:
I can't release my mutex M and then call Queue.get(). That could be a
race condition: between the time M is released and Queue's mutex is
acquired, another thread could get into my object and mess things up.
(We'd need a specific example to see this, and there may be many cases
where I could safely release M before calling Queue.get(), but in
general it won't work.)

Well -- since you say that won't work... You appear to have a major
redesign effort ahead, since you need to decouple blocking on Q.get()
from your other lock (or move Q.put() outside of the other
potentially blocking lock calls).


http://greenteapress.com/semaphores/ might have something applicable.
 
Piet van Oostrum

PM> Ah, I get it now, and yes as you describe it, this is a deadlock. But why
PM> do you need both mutex M and a Queue, then? You should release M before
PM> calling Queue.get(). Then if you need to, you should reacquire it
PM> afterward.

What Jonathan says is that in his example M and the mutex for the Queue
should be the same to solve the problem. And that the Queue API should
allow you to do that. I think he is right.
 
Jonathan Amsterdam

No redesign necessary. I simply make M be the Queue's mutex, via the
LQueue class I posted. I am making the modest suggestion that this
feature be documented and exposed in the Queue class.
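
Usage would then look something like this (a sketch, assuming the LQueue class from the first post; the Holder class is illustrative). The lock must be an RLock, because the Queue's internal acquire of the same mutex has to nest inside the method's own acquire; while get() waits on its Condition, M is released entirely, so a putter can get in.

import threading

class Holder:
    def __init__(self):
        self.lock = threading.RLock()          # M; RLock so the Queue can re-enter it
        self.queue = LQueue(lock=self.lock)    # the Queue shares M as its mutex
        self.requests = 0                      # other state protected by M

    def get_item(self):
        with self.lock:
            self.requests += 1
            # get() waits on a Condition built on M, so M is released
            # while blocking and reacquired before get() returns.
            return self.queue.get()

    def put_item(self, item):
        with self.lock:
            self.queue.put(item)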
 
Ove Svensson

Jonathan Amsterdam said:
No redesign necessary. I simply make M be the Queue's mutex, via the
LQueue class I posted. I am making the modest suggestion that this
feature be documented and exposed in the Queue class.

Even though LQueue is the correct solution to the problem, why not fix
Queue.Queue? Fixing Queue.Queue would not break any existing code, and we
don't have to pollute the namespace with yet another class.
 
