Directly calling threaded class instance methods and attributes

Matthew Bell

Hi,

I've got a question about whether there are any issues with directly
calling attributes and/or methods of a threaded class instance. I
wonder if someone could give me some advice on this.

Generally, the documentation suggests that queues or similar constructs
should be used for inter-thread communication. I've had a lot of
success in doing that (generally by passing in the queue during the
__init__ of the thread) and I can see situations where it's pretty much
the only way of doing it. But, there are other situations -
particularly where you've got a "main" thread that creates, and is the
sole communicator with, one or more "worker" threads - where keeping
track of the different queues can get a bit unwieldy.

So I thought about it some more, and came to the conclusion that -
again, in some situations - it could be a lot cleaner if you could call
methods and/or access attributes of a threaded class instance directly.
Here's some example code to show what I'm talking about:

-----------------------------------------------------------------
import threading
from time import sleep

class SimpleThread(threading.Thread):
    def __init__(self):
        self.total = 0
        threading.Thread.__init__(self)

    def add(self, number):
        self.total += number

    def run(self):
        while True:
            # In reality, there'd be much more here
            sleep(1)

adder = SimpleThread()
adder.start()
for i in range(20):
    adder.add(1)
print adder.total
-------------------------------------------------------------------

This example code works. Well, it does for me, anyway :)

My question is simply, can anyone see any issues with calling methods
and/or attributes of a threaded class instance like this? It looks ok
to me but, as the docs never seem to mention using threads like this,
I'm wondering if I've missed something important. If it helps, I
know the basic considerations of threading, such as locking, exception
handling and so on; I only really need advice on whether there could
be issues with directly calling class instance attributes of a
running thread.

If anyone could let me know either "Yeah, that's fine" or "NO!!! That
can break <foo> / cause a deadlock in <bar> / etc!!!" I'd be much
obliged.

Thanks,
Matthew.
 
Peter Otten

Matthew said:
import threading
from time import sleep

class SimpleThread(threading.Thread):
    def __init__(self):
        self.total = 0
        threading.Thread.__init__(self)

    def add(self, number):
        self.total += number

    def run(self):
        while True:
            # In reality, there'd be much more here
            sleep(1)

adder = SimpleThread()
adder.start()
for i in range(20):
    adder.add(1)
print adder.total
-------------------------------------------------------------------

This example code works.  Well, it does for me, anyway :)

My question is simply, can anyone see any issues with calling methods
and/or attributes of a threaded class instance like this?  It looks ok
to me but, as the docs never seem to mention using threads like this,
I'm wondering if I've missed something important. [...]

I know _very_ little about threads, so forgive me if my conclusion that you
know even less is wrong. From what I see in your example you do not have
any data that is shared by multiple threads - total just happens to be
stored in a SimpleThread object but is never accessed by it.

I have tried to desimplify your code a bit

from time import sleep
import threading


class SimpleThread(threading.Thread):
    def __init__(self):
        self.total = 0
        threading.Thread.__init__(self)

    def add(self, number):
        total = self.total
        sleep(.3)
        self.total = total + number

    def run(self):
        for i in range(10):
            sleep(.1)
            self.add(1)

adder = SimpleThread()
adder.start()
for i in range(10):
    adder.add(1)
adder.join()
print "total:", adder.total

and here's the output:

$ python testthread.py
total: 12
$

I don't know whether

self.total += number

is atomic, but even if it were, I wouldn't rely on it.
Conclusion: stick with queues, or wait for an expert's advice - or both :)
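
One way to look into the atomicity question is to inspect the bytecode that CPython compiles the statement to. The following is only a sketch, assuming Python 3 (3.4 or later, for dis.get_instructions); the Counter class is a made-up stand-in for SimpleThread, and opcode names differ between interpreter versions, so none are listed here.

```python
import dis


class Counter:
    """Hypothetical stand-in for SimpleThread, without the threading."""

    def __init__(self):
        self.total = 0

    def add(self, number):
        self.total += number


# Collect the opcode names that make up the body of add().
opnames = [ins.opname for ins in dis.get_instructions(Counter.add)]

# Reading self.total and writing it back are separate instructions, so
# nothing in the language promises the update is one indivisible step.
loads = [name for name in opnames if name.startswith("LOAD_ATTR")]
stores = [name for name in opnames if name.startswith("STORE_ATTR")]
print(len(opnames), "instructions;", len(loads), "attribute load(s),",
      len(stores), "attribute store(s)")
```

Whether a thread switch actually happens between the load and the store depends on the interpreter version, which is one more reason not to rely on it.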

Peter
 
Diez B. Roggisch

My question is simply, can anyone see any issues with calling methods
and/or attributes of a threaded class instance like this? It looks ok
to me but, as the docs never seem to mention using threads like this,
I'm wondering if I've missed something important. If it helps, I
know the basic considerations of threading, such as locking, exception
handling and so on; I only really need advice on whether there could
be issues with directly calling class instance attributes of a
running thread.

If anyone could let me know either "Yeah, that's fine" or "NO!!! That
can break <foo> / cause a deadlock in <bar> / etc!!!" I'd be much
obliged.

Python is very thread-friendly, in that you don't get SIGSEGVs for
doing this - that means that at least the internal data structures are
always consistent. As a rule of thumb one can say that every expression is
atomic, and thus leaves the interpreter in a consistent state. But beware!
This is more than one expression:

a = b + c * d

It could be rewritten like this:

h = c * d
a = b + h

which makes it at least two - maybe there are even more. But

l.append(10)

on a list will at least be atomic when the actual appending occurs - thus
it's perfectly OK to have 10 worker threads appending to one list, and one
consumer thread popping values from it.

So your code is perfectly legal, as long as you are aware that more
complex operations on objects can be interrupted at any time, possibly
leaving data inconsistent _from an application's POV_ - in that case
you'll need explicit syncing, by locks, queues or whatever...
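
The list-append case can be tried out with a small script. This is a sketch under assumptions, not a guarantee: it is written for Python 3 on CPython, where a single append() call is one list operation; the names worker, shared and ITEMS_PER_WORKER are invented for the illustration.

```python
import threading

N_WORKERS = 10           # the "10 worker threads" from the text
ITEMS_PER_WORKER = 1000  # arbitrary figure chosen for the sketch

shared = []


def worker(worker_id):
    # Each append() is a single list operation, so no explicit lock is
    # taken here. A compound update (read, modify, write) would need one.
    for i in range(ITEMS_PER_WORKER):
        shared.append((worker_id, i))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all workers before inspecting the list

print(len(shared))
```

The order of the items is not predictable, only that none of them go missing.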
 
Josiah Carlson

Hi,

I've got a question about whether there are any issues with directly
calling attributes and/or methods of a threaded class instance. I
wonder if someone could give me some advice on this.

No problem.

[Snip code and text]
If anyone could let me know either "Yeah, that's fine" or "NO!!! That
can break <foo> / cause a deadlock in <bar> / etc!!!" I'd be much
obliged.

With what you offered, it would not cause a deadlock, though it would
cause what is known as a race condition, where two threads are trying to
modify the same variable at the same time. Note that attributes of a
thread object are merely attributes of an arbitrary Python object, so
nothing special happens with them.


Here is a far more telling example...

>>> import threading
>>> val = 0
>>> def foo(n):
...     global val
...     for i in xrange(n):
...         val += 1
...
>>> for i in xrange(10):
...     threading.Thread(target=foo, args=(100000,)).start()
...

If there were no race condition, the final value of val should be
1000000. Let us use locks to fix it.

>>> val2 = 0
>>> lock = threading.Lock()
>>> def goo(n):
...     global val2, lock
...     for i in xrange(n):
...         lock.acquire()
...         val2 += 1
...         lock.release()
...
>>> for i in xrange(10):
...     threading.Thread(target=goo, args=(100000,)).start()
...
>>> val2
1000000
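
For readers on a current interpreter, the lock-based fix might look like the sketch below. It assumes Python 3, uses the with-statement instead of explicit acquire()/release() calls, and joins the threads before reading the result; the constants N_THREADS and N_INCREMENTS are introduced only for this illustration.

```python
import threading

N_THREADS = 10
N_INCREMENTS = 10000  # smaller than the 100000 above, to keep the run short

val2 = 0
lock = threading.Lock()


def goo(n):
    global val2
    for _ in range(n):
        # The with-statement acquires the lock and releases it again,
        # even if the body raises an exception.
        with lock:
            val2 += 1


threads = [threading.Thread(target=goo, args=(N_INCREMENTS,))
           for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread before reading the result

print(val2)
```

Joining the threads replaces "wait a few seconds and look", so the value read at the end is the final one.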


- Josiah
 
Just

>>> def goo(n):
...     global val2, lock
...     for i in xrange(n):
...         lock.acquire()
...         val2 += 1
...         lock.release()
...
>>> for i in xrange(10):
...     threading.Thread(target=goo, args=(100000,)).start()
...
>>> val2
1000000

FWIW, you don't need a global statement for globals you don't assign to,
so you don't need to declare lock global.
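
A short sketch of that rule, assuming Python 3; the names settings, read_only, rebinding and forgot_global are made up for the illustration.

```python
counter = 0
settings = {"step": 2}  # hypothetical module-level object, only read below


def read_only():
    # No global statement needed: settings is looked up, never rebound.
    return settings["step"]


def rebinding():
    # counter is assigned to, so it has to be declared global.
    global counter
    counter += settings["step"]


def forgot_global():
    # Without the declaration the assignment makes counter a local name,
    # and reading it before it has a value raises UnboundLocalError.
    counter += 1


rebinding()
print(read_only(), counter)

try:
    forgot_global()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```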

Just
 
Josiah Carlson

Just said:
FWIW, you don't need a global statement for globals you don't assign to,
so you don't need to declare lock global.

I was going to say that I did it for speed, and I could have sworn that
declaring something global resulted in fewer namespace lookups,
but testing does not confirm this (it actually refutes it). I guess this
says that I should be aliasing globals when I really care about speed,
making it...
>>> def goo(n):
...     global val2
...     _lock = lock
...     for i in xrange(n):
...         _lock.acquire()
...         val2 += 1
...         _lock.release()
...


...for a little bit more speed (though the cost of locking and unlocking
will dwarf the time saved).

- Josiah
 
David Bolen

(...)
My question is simply, can anyone see any issues with calling methods
and/or attributes of a threaded class instance like this? It looks ok
to me but, as the docs never seem to mention using threads like this,
I'm wondering if I've missed something important. If it helps, I
know the basic considerations of threading, such as locking, exception
handling and so on; I only really need advice on whether there could
be issues with directly calling class instance attributes of a
running thread.

Others have focused more on the locking issues if the methods you use
access data that the separate thread is also accessing, so I'll try to
hit the general question of just sharing the instance itself.

Clearly, directly accessing a non-callable attribute may require
locking and can be subject to race conditions. But for callables,
if I understand what you're getting at, the answer is that it's
definitely fine. There's absolutely no problem calling methods on a
thread object from separate threads, or even having methods used from
multiple threads simultaneously. The execution flow itself is fine,
but as you note, you have to handle shared data access issues
yourself, to the extent that it applies.

I do think this could simplify your thread communication in some cases
because you can export a more typical "object" interface from your
thread object (at least in the forward direction) rather than having
the user of the thread have to handle queue management.

For example, it's very common for me to have thread objects structured
like (to extend your example):

import threading
import Queue

class SimpleThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        # I often handle the communication queue to the background
        # thread internally, so callers need not be aware of it.
        self.queue = Queue.Queue()
        # Often my thread objects are self starting (so a user is just
        # instantiating the object and not knowing a thread is involved)
        self.start()

    #
    # Typically several methods used by the background thread during
    # processing.
    #
    def _spam(self):
        pass
    def _eggs(self):
        pass
    def _ham(self):
        pass

    #
    # The main thread itself - processes queue requests
    #
    def run(self):
        while 1:
            operation = self.queue.get()
            if operation is None:
                return

            # Perform operation


    #
    # "Public" (not background thread) operations
    #
    def shutdown(self):
        self.queue.put(None)

    def someoperation(self, args):
        self.queue.put(("dosomething", args))


So to the outside party, they are just creating an instance of my
object, and using methods on it. The methods happen to then use a queue
internally to get to the processing portion of the object which is in
a background thread, but you don't need to expose that to the caller.
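
A runnable variant of this structure is sketched below. It assumes Python 3 (where the module is called queue rather than Queue), and the dispatch step is one possible way to fill in the "Perform operation" placeholder, not the original author's code; Worker, _dosomething and log are names invented for the sketch.

```python
import queue
import threading


class Worker(threading.Thread):
    """Hypothetical worker that hides its request queue from callers."""

    def __init__(self):
        super().__init__()
        self._requests = queue.Queue()
        self.log = []  # touched only by the background thread while it runs
        self.start()   # self-starting, as in the outline above

    def run(self):
        # Runs in the background thread: handle requests until None arrives.
        while True:
            request = self._requests.get()
            if request is None:
                return
            name, args = request
            # Dispatch e.g. "dosomething" to the private method _dosomething.
            getattr(self, "_" + name)(*args)

    def _dosomething(self, value):
        self.log.append(value)

    # "Public" methods: called from other threads, they only enqueue work.
    def someoperation(self, value):
        self._requests.put(("dosomething", (value,)))

    def shutdown(self):
        self._requests.put(None)


w = Worker()
for n in range(3):
    w.someoperation(n)
w.shutdown()
w.join()       # after join() it is safe to read log from this thread
print(w.log)   # prints [0, 1, 2]
```

The caller never sees the queue; it only calls someoperation() and shutdown().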

Where this falls down a little is in the result of the processing.
Generally you need to provide for a query mechanism on your object
(which itself might be using an internal queue, or you could just
permit the caller to access an attribute which is the queue), or a
callback system, in which case the caller should clearly be made aware
that the callback will be executing in a separate thread.
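
One possible shape for the query mechanism is to send a private reply queue along with each request, so the public method can block until the background thread answers. This is a sketch under that assumption, in Python 3; the Squarer class and its square() method are hypothetical.

```python
import queue
import threading


class Squarer(threading.Thread):
    """Hypothetical worker whose public method waits for a result."""

    def __init__(self):
        super().__init__(daemon=True)
        self._requests = queue.Queue()
        self.start()

    def run(self):
        # Background thread: compute each answer and hand it back.
        while True:
            item = self._requests.get()
            if item is None:
                return
            value, reply = item
            reply.put(value * value)

    def square(self, value):
        # Called from any thread: enqueue the work together with a
        # private reply queue, then block until the result arrives.
        reply = queue.Queue(maxsize=1)
        self._requests.put((value, reply))
        return reply.get()

    def shutdown(self):
        self._requests.put(None)


s = Squarer()
print(s.square(7))  # prints 49
s.shutdown()
s.join()
```

Because the result is read in the calling thread, no callback runs in the background thread; the price is that square() blocks until the work is done.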

Or, if you're using some async, event-driven approach, even the thread
itself can probably be hidden completely. For example, with Twisted, your
public thread instance methods can just appear as deferrable methods, using
standard deferreds as return values. Then when the result is ready in the
background thread, you have Twisted fire the deferred in the main reactor
loop.

-- David
 
Matthew Bell

So your code is perfectly legal, as long as you are aware that more
complex operations on objects can be interrupted at any time, possibly
leaving data inconsistent _from an application's POV_ - in that case
you'll need explicit syncing, by locks, queues or whatever...

Diez,

Thanks for your reply. Basically, then, it seems that there aren't
any special considerations required for directly calling threaded
class instance methods / attributes. I'll still need to take care
of ensuring that the application's data structures are kept
internally consistent (which is fair enough - I was expecting to
have to do that anyway) but it looks like I'm not going to cause
any unusual issues.

If so, that's great - it'll help tidy things up quite nicely in
areas where I've got way too many queues to easily keep track of.

Thanks for your help Diez, and thanks also to everyone else who's
commented. It's always an education!

Matthew.
 
Peter Hansen

Josiah said:
>>> def goo(n):
...     global val2
...     _lock = lock
...     for i in xrange(n):
...         _lock.acquire()
...         val2 += 1
...         _lock.release()
...

...for a little bit more speed (though the cost of locking and unlocking
will dwarf the time saved).

If you're intent on making the code less maintainable in
order to achieve tiny improvements in speed, at least
store local references to the entire method, not just
to the object:

def goo(n):
    global val2
    acq = lock.acquire
    rel = lock.release
    for i in xrange(n):
        acq()
        val2 += 1
        rel()
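
Anyone curious about the actual difference can measure it with timeit. The sketch below assumes Python 3 and runs in a single thread, so it times only lookups and lock calls, not contention; the function names plain, aliased_object and aliased_methods are made up here, and the numbers will vary by machine and interpreter version.

```python
import threading
import timeit

lock = threading.Lock()
val2 = 0


def plain(n):
    global val2
    for _ in range(n):
        lock.acquire()
        val2 += 1
        lock.release()


def aliased_object(n):
    global val2
    _lock = lock  # local alias of the lock object
    for _ in range(n):
        _lock.acquire()
        val2 += 1
        _lock.release()


def aliased_methods(n):
    global val2
    acq = lock.acquire  # local references to the bound methods
    rel = lock.release
    for _ in range(n):
        acq()
        val2 += 1
        rel()


# Time five runs of 100000 iterations for each variant.
for func in (plain, aliased_object, aliased_methods):
    seconds = timeit.timeit(lambda: func(100000), number=5)
    print(func.__name__, round(seconds, 3))
```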

-Peter
 
