Deadlock detection

Duncan Grisby · Dec 6, 2004

Hi,

Does anyone know of a deadlock detector for Python? I don't think it
would be too hard to hook into the threading module and instrument
mutexes so they can be tested for deadlocks. I've googled around but I
haven't found anything.

Cheers,

Duncan.

Stephen Kellett · Dec 6, 2004

Does anyone know of a deadlock detector for Python? I don't think it

would be too hard to hook into the threading module and instrument
mutexes so they can be tested for deadlocks. I've googled around but I
haven't found anything.

Software Verification have Python Thread Validator. Its not publicly
advertised on the beta part of the website, but it is available.

http://www.softwareverify.com/publicBeta.html

Take a look at Thread Validator or Java Thread Validator to see what the
tool looks like (the Python tool is similar). If you are interested send
an email to support asking about the Python port.

Stephen

Josiah Carlson · Dec 6, 2004

Duncan Grisby said:
Hi,

Does anyone know of a deadlock detector for Python? I don't think it
would be too hard to hook into the threading module and instrument
mutexes so they can be tested for deadlocks. I've googled around but I
haven't found anything.

I'm not aware of a deadlock dector for any language. I propose that a
completely working deadlock detector is NP-Complete, as it would, I
believe, necessitate solving the halting problem.

- Josiah

Paul Du Bois · Dec 7, 2004

(warning: pedantic and off-topic response) NP-Complete does not mean
"equivalent to the halting problem." It means "poly-time equivalent to
any other NP-Complete problem".

NP-Complete problems are "only" exponential-time. The halting problem
is much harder! And of course, just the fact that a problem is
NP-complete doesn't mean that you can't construct algorithms that do a
pretty good job a pretty good fraction of the time.

Duncan Grisby · Dec 7, 2004

I'm not aware of a deadlock dector for any language. I propose that a
completely working deadlock detector is NP-Complete, as it would, I
believe, necessitate solving the halting problem.

As Paul Du Bois has pointed out, NP complete is much easier than the
halting problem (i.e. the halting problem simply can't be done in
general, no matter how much time you have).

Aside from that, you're right that statically analysing code for
deadlocks is equivalent to the halting problem, and therefore
impossible. What I'm looking for, though, is a dynamic detector that
detects deadlocks at run-time when they happen. Doing that is well
understood, and there are plenty of systems that do it. I just haven't
been able to find one for Python.

Cheers,

Duncan.

Stephen Kellett · Dec 7, 2004

Duncan said:
understood, and there are plenty of systems that do it. I just haven't
been able to find one for Python.

There is one at http://www.softwareverify.com as I mentioned in a
previous posting.

Stephen

Adam DePrince · Dec 11, 2004

Hi,

Does anyone know of a deadlock detector for Python? I don't think it
would be too hard to hook into the threading module and instrument
mutexes so they can be tested for deadlocks. I've googled around but I
haven't found anything.

In general, accurate deadlock detection is intractable. Like many
problems of this class you have two choices - either reduce the scope of
the problem to a non-general one or try a heuristic to guess.

As I recall, deadlock prevention is similar to the halting problem; the
only question you can answer is "which category am I in:"

A) I know for sure there are no deadlocks
B) I don't know, maybe there are, maybe there arn't.

In the halting problem, the answer to your question is B until you
actually halt, in which case the answer to your problem is obvious.

Here is a quick and dirty heuristics to filter some programs as being in
bin A or bin B.

First, for bin B. Instrument your mutex so that every time you lock, it
creates a directed edge in a global system wide graph from your current
mutex (mutex_N) to the next to most recently locked mutex you are
currently holding for the locking thread. If your graph becomes
cyclic, you might be in bin B (well, you were never in bin A to being
with.) Throw a "I've lost faith in my inability to deadlock" exception.

If you can *prove* that there is a strict topological order between
nested mutex invocations, then you are in bin A. The degree of rigor in
your proof is up to you, your level of comfort and how many people die
if your code deadlocks (your personal website and medical instruments
have different standards.)

Not everybody trusts themselves. There are a number of alternative
approaches, including having a single global critical section lock
(ancient linux smp code) or designing your mutex operation to imply the
release of all held locks. Of course, if you must hold more than one
lock at a time, your mutex function can take a list of locks that it
will atomically apply. The proper design of this is left as an exercise
for the reader.

Unless you thought of this from the beginning, retrofitting safe locks
into your existing large project will be expensive. The next
possibility won't work for python, but it is useful to keep in mind.

The halting problem has a small caveat, it is applicable to "general"
Turing machines. Place restrictions on your Turing machine that makes
it not a Turing machine and the problem goes away. In real time systems
(oh my, cat < aborted.phd.thesis.tex > mail python-list@... ) where you
have to compute the longest time any piece of code will take to execute
this sort of analysis is common place. Get rid of function pointers
(including weakly typed OOPLs like Python.) You don't have to worry
about limiting loop counts like in a RTL, because we arn't interested in
timing information. Oh, and ditch recursion. Maybe you don't have to,
but I'm not sure.

Now walk through your code taking each possible path. You can collapse
loops to each meaningful combination (depends on the nature of your
languages loop construct), collapse paths that don't have any mutex
operations. You get the idea.

Unless mutex calls are rare, or your code simple, you might spend a
while. Largely this problem is intractable, even with simplifications,
but it is done which is why safety critical programs are (well, should
be) small and their languages not very expressive (as in finite state
machine, and not in the "but my computer is a FSM sense.")

Adam DePrince

Mike Meyer · Dec 11, 2004

Adam DePrince said:
In general, accurate deadlock detection is intractable. Like many
problems of this class you have two choices - either reduce the scope of
the problem to a non-general one or try a heuristic to guess.

I've recently been involved in a thread on another list that dealt
with the problems of programming with threads. There are people
approaching the problems (including deadlocks) by providing constructs
for threading that mean you can't write such code. At least, that's
the idea.

http://www.jot.fm/issues/issue_2004_04/article6 was posted as one
such solution. I'm not convinced it works, but the paper only
discusses half the solution.

<mike

Python deadlock using subprocess.popen and communicate	18	Sep 22, 2011
How to debug deadlock?	1	Feb 9, 2009
avoid import short-circuiting	0	Mar 16, 2012
read / write locks and deadlock detection	2	Nov 29, 2005
Queue can result in nested monitor deadlock	14	Apr 17, 2006
Condition.wait() behavior with timeout	0	Jan 30, 2012
safely returning objects from a Process to the parent through aQueue()	0	Nov 6, 2009
collision detection and penetration depth	3	Oct 25, 2004

Deadlock detection

Duncan Grisby

Stephen Kellett

Josiah Carlson

Paul Du Bois

Duncan Grisby

Stephen Kellett

Adam DePrince

Mike Meyer

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads