Is it possible to stop corrupt threads?

Morten Jensen

Hello,

We are writing a big program with dozens of threads and there is constantly
a danger that a programming error will result in an unforeseen deadlock. If
this occurs we would like to just kill the deadlocked threads, write out an
error message and restart them if possible. For some reason even the
deprecated stop method can't kill threads that are in a deadlock. Is there
anything other than System.exit() that can do this?

Thanks

Morten
 
Michael Borgwardt

Morten said:
We are writing a big program with dozens of threads and there is constantly
a danger that a programming error will result in an unforeseen deadlock. If
this occurs we would like to just kill the deadlocked threads, write out an
error message and restart them if possible. For some reason even the
deprecated stop method can't kill threads that are in a deadlock.

It can't, in fact, do anything at all. It's not a case of "deprecated but still
working". This method and some others were made nonfunctional because they
could not be implemented in a way that guarantees data consistency.

Is there anything other than System.exit() that can do this?

No. Except writing your stuff so that it *doesn't* deadlock.
 
NOBODY

If they are monitor-waiting (blocked trying to enter a synchronized block),
there is nothing to do.

If they are stalled in a wait(), you could try the "violent" approach by
calling interrupt() on them, but that means you were prepared for that
(you had a tool ready to list threads and grab one to interrupt() it.)

That rarely works with cheap third-party modules, because they tend to
catch every Exception wholesale and fail to treat InterruptedException
specially. So you end up raising an exception, but the thread just
loops again. If you are lucky, you will have killed the thread
without leaving garbage behind.
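
To make the interrupt() point concrete, here is a minimal sketch of a worker that stops when interrupted instead of swallowing the exception and looping. It assumes the thread is stalled in wait(); interrupt() does not release a thread that is blocked entering a synchronized block. InterruptibleWorker, mailbox and demo() are made-up names for illustration only.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical worker that treats interrupt() as a request to stop.
class InterruptibleWorker implements Runnable {
    private final Object mailbox = new Object();
    private final CountDownLatch started = new CountDownLatch(1);
    private volatile boolean stoppedByInterrupt = false;

    @Override
    public void run() {
        started.countDown();
        try {
            synchronized (mailbox) {
                while (true) {
                    // Nobody ever calls notify(): this stands in for a stalled wait().
                    mailbox.wait();
                }
            }
        } catch (InterruptedException e) {
            // Do not swallow and loop again: restore the flag and leave run().
            Thread.currentThread().interrupt();
            stoppedByInterrupt = true;
        }
    }

    // Starts a worker, interrupts it, and reports whether it ended because of the interrupt.
    static boolean demo() {
        InterruptibleWorker worker = new InterruptibleWorker();
        Thread t = new Thread(worker, "stalled-worker");
        t.setDaemon(true);
        t.start();
        try {
            worker.started.await();
            t.interrupt();
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive() && worker.stoppedByInterrupt;
    }

    public static void main(String[] args) {
        System.out.println("worker stopped by interrupt: " + demo());
    }
}
```

A module that catches Exception inside the while loop and carries on would never reach the outer catch, which is the failure mode described above.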

Like the other poster said, don't code deadlocks...
That means doing the homework of:
-Listing ALL synchronized zones.
-Graphing ALL the lock acquisition paths.
-Resolving lock ordering, and reducing critical zones.

-Rule of thumb: get observer/listeners events firing out of the critical
zone.
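
As an illustration of the lock-ordering item in the list above, here is a rough sketch in which every thread takes the two locks in the same global order, whichever direction it is moving data. RankedResource, its rank field and demo() are hypothetical names; the assumption is that each lockable object can be given a unique rank.

```java
// Hypothetical resource with a fixed, unique rank used to order lock acquisition.
class RankedResource {
    private final int rank;
    private int value;

    RankedResource(int rank, int value) {
        this.rank = rank;
        this.value = value;
    }

    synchronized int value() {
        return value;
    }

    // Moves amount between two resources, always locking the lower rank first.
    // Two threads moving in opposite directions therefore cannot each hold one
    // lock while waiting for the other.
    static void move(RankedResource from, RankedResource to, int amount) {
        if (from.rank == to.rank) {
            throw new IllegalArgumentException("resources need distinct ranks");
        }
        RankedResource first = from.rank < to.rank ? from : to;
        RankedResource second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.value -= amount;
                to.value += amount;
            }
        }
    }

    // Runs opposite-direction moves on two threads and reports whether both
    // finished in time with the totals intact.
    static boolean demo() {
        RankedResource a = new RankedResource(1, 1000);
        RankedResource b = new RankedResource(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) move(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) move(b, a, 1); });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(10_000);
            t2.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive() && a.value() == 1000 && b.value() == 1000;
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + demo());
    }
}
```

If move() locked `from` first and `to` second instead, the same two threads could deadlock.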

And if you don't know what a "critical zone" is, then do your boss a
favor and quit thread programming immediately.

Welcome to major league...
 
Chris Uppal

Morten said:
We are writing a big program with dozens of threads and there is
constantly a danger that a programming error will result in an unforeseen
deadlock.

Then you are totally buggered. Sorry, but it is that simple.

Is there anything other than System.exit() that can do
this?

No. In fact until you can unbugger yourselves, probably the second best thing
you can do is exit() and restart the program. Something like:

tell all threads to accept no more work and die when they've
finished their current task.

start no more threads.

wait for a decent period for non-deadlocked threads to finish

exit(), thus killing any that were deadlocked, and losing any
work that wasn't deadlocked, but was taking longer than
expected.

Not nice.
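
The steps above can be sketched roughly as follows, assuming the work runs on a java.util.concurrent ExecutorService (an assumption; the original program may manage its threads by hand). DrainAndExit and drain() are hypothetical names, and the 30 second grace period is arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: stop taking work, wait a while, then exit regardless.
class DrainAndExit {
    // Stops accepting new tasks and waits up to the grace period for running
    // tasks to finish. Returns true if everything finished, false if something
    // was still running (slow, or possibly deadlocked).
    static boolean drain(ExecutorService pool, long grace, TimeUnit unit) {
        pool.shutdown(); // submitted tasks still run; new submissions are rejected
        try {
            return pool.awaitTermination(grace, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 8; i++) {
            final int id = i;
            pool.execute(() -> System.out.println("task " + id + " done"));
        }
        boolean clean = drain(pool, 30, TimeUnit.SECONDS);
        if (!clean) {
            System.err.println("Some tasks did not finish; exiting anyway.");
        }
        // System.exit() ends the JVM even if some threads are deadlocked.
        System.exit(clean ? 0 : 1);
    }
}
```

An outer script or service manager would then restart the program.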

-- chris
 
Morten Jensen

Thanks for the answers.

NOBODY said:
If they are monitor-waiting, there is nothing to do.

If they are stalled in a wait(), you could try the "violent" approach by
calling interrupt() on them, but that means you were prepared for that
(you had a tool ready to list threads and grab one to interrupt() it.)

Hmmm, I hadn't thought of this possibility - I guess it would be worth
calling interrupt() just in case the deadlock is one of these easy ones.
Luckily, losing some threads is often not catastrophic in our case because
we have already been forced to have many fallback strategies to handle
other inconveniences that we must be prepared for.

That rarely works in cheap 3rd party modules because they just
catch-wide all Exceptions and stupidly don't recognize the
InterruptedException. So you end up creating an exception, but the
thread loops again. If you are lucky, you would have killed the thread
without leaving garbage behind.

Fortunately this doesn't apply to us - the only third party software we use
is the JVM and a tiny bit of SWT.

Like the other one said, don't code deadlocks...
That means, do the homework of:
-Listing ALL synchronized zones.
-Graph ALL the lock acquisition paths.
-Resolve lock ordering, and reduce critical zones.

-Rule of thumb: get observer/listeners events firing out of the critical
zone.

Yes, these things seem like the best advice - is there any software (other
than grep) that could help with this?

Morten
 
Josef Garvi

NOBODY said:
-Rule of thumb: get observer/listeners events firing out of the critical
zone.

Do you mean by extending the Observable class and implementing Observer
interfaces, or by some other mechanism?

Will the notifyObservers() method of the Observable object place a message
on the stack of the thread holding the Observer, so that the Observer's
update(...) is called from the other thread, or will it jump in using the
Observable's thread and execute the code? (in which case I guess it will be
inadvisable to use...)

--
Josef Garvi

"Reversing desertification through drought tolerant trees"
http://www.eden-foundation.org/

new income - better environment - more food - less poverty
 
NOBODY

-Listing ALL synchronized zones.
Yes, these things seem like the best advice - is there any software
(other than grep) that could help with this?

Short answer: no. It is very runtime specific, especially with factory
patterns and interfaced implementations.
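
For the runtime side, as opposed to static analysis, the JVM can at least report a deadlock once it has happened. Below is a minimal sketch using the standard java.lang.management API (ThreadMXBean.findDeadlockedThreads() exists from Java 6 on). DeadlockReporter and report() are hypothetical names, and the sketch only reports; it does not try to break the deadlock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical watchdog helper that lists threads the JVM considers deadlocked.
class DeadlockReporter {
    // Returns one line per deadlocked thread, or an empty list if none are reported.
    static List<String> report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add(info.getThreadName()
                    + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = report();
        if (lines.isEmpty()) {
            System.out.println("no deadlock reported");
        } else {
            lines.forEach(System.out::println);
        }
    }
}
```

Calling report() periodically from a timer thread gives you the error message; the names of the locks involved tell you which boxes to look at in the graph.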


The Eclipse IDE can take you halfway:
-Find the synchronized keyword everywhere. I note every monitor down as a
box.
-Note the name of each method that is synchronized or uses a synchronized
block.
-Right-click the method name and choose 'Open Call Hierarchy'. You will find
who calls this method. But that is only good for going up. You have to read
the code yourself to drill down into synchronized zones. This is tedious when
you have a lot of interfaces. You must fetch ALL implementations.

For every zone that acquires another zone, I draw an arrow from one box to
the other. Following the arrows, if there is any path from one box to another
and back, you have a potential deadlock, and you must confirm that 2 threads
can actually take each of those paths.

A -> B
B -> C
C -> D

D -> A

You have a potential deadlock between A and D.
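
The arrow-following can be done mechanically once the boxes and arrows are written down. Here is a small sketch of a cycle check by depth-first search, fed with the A/B/C/D example above. LockGraph, addEdge() and hasCycle() are made-up names; the edges still have to be collected by hand as described.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical lock-acquisition graph: an edge X -> Y means "some code path
// acquires Y while already holding X".
class LockGraph {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addEdge(String outer, String inner) {
        edges.computeIfAbsent(outer, k -> new ArrayList<>()).add(inner);
        edges.computeIfAbsent(inner, k -> new ArrayList<>());
    }

    // Depth-first search: reaching a node that is still on the current path
    // means the arrows lead back to where they started.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (visit(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (done.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : edges.get(node)) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        LockGraph g = new LockGraph();
        g.addEdge("A", "B");
        g.addEdge("B", "C");
        g.addEdge("C", "D");
        System.out.println("cycle without D -> A: " + g.hasCycle());
        g.addEdge("D", "A");
        System.out.println("cycle with D -> A: " + g.hasCycle());
    }
}
```

A reported cycle is only a candidate; you still have to confirm that two threads can really walk those paths at the same time.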
 
NOBODY

Josef Garvi said:
-Rule of thumb: get observer/listeners events firing out of the critical
zone.

Do you mean by extending the Observable class and implementing
Observer interfaces, or by some other mechanism?

Will the notifyObservers() method of the Observable object place a
message on the stack of the thread holding the Observer, so that the
Observer's update(...) is called from the other thread, or will it
jump in using the Observable's thread and execute the code? (in which
case I guess it will be inadvisable to use...)


Not even close.
Frankly, I have never used the JDK Observer stuff.

It is about implementing the event source. When some client class adds a
listener to you, you will fire events to it upon some state change or
whatever.

Usually, state changes are critical and protected by a synchronized block.
People make the mistake of making the whole method synchronized.

Example: synchronized SomeConnection.disconnect()

Sure, no 2 disconnect() calls should run at the same time.
But if disconnect() is entirely synchronized and you fire some
disconnected events within it, the zone is locked the whole time your
listeners are performing their event handling.

That, by itself, is not so bad, but there is no justification for locking
the object during event firing. It is not changing state.

If another thread ever reacts to the disconnect event by calling, let's
say, a synchronized SomeConnection.x() method, it will stay monitor-waiting
until ALL listeners have received their event and disconnect() has exited.
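
A rough sketch of the alternative, using the SomeConnection example: only the state change sits inside the synchronized block, and the listeners are called after the lock is released. The Listener interface, addListener() and isConnected() are hypothetical additions for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical event source modeled on the SomeConnection example.
class SomeConnection {
    interface Listener {
        void disconnected(SomeConnection source);
    }

    // Copy-on-write list: safe to iterate while listeners are added elsewhere.
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    private boolean connected = true;

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    synchronized boolean isConnected() {
        return connected;
    }

    void disconnect() {
        synchronized (this) {
            if (!connected) {
                return; // a second disconnect() is a no-op
            }
            connected = false; // the state change is the only part that needs the lock
        }
        // Lock released: listeners, or threads they hand off to, can call
        // synchronized methods on this object without waiting for disconnect().
        for (Listener listener : listeners) {
            listener.disconnected(this);
        }
    }

    public static void main(String[] args) {
        SomeConnection connection = new SomeConnection();
        connection.addListener(source ->
                System.out.println("lock held during event: " + Thread.holdsLock(source)
                        + ", connected: " + source.isConnected()));
        connection.disconnect();
    }
}
```

The trade-off is that listeners now observe the event slightly after the state has changed, possibly interleaved with other calls, so they should query the current state rather than assume it.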
 
