returning a value from a thread

Ken Godee

module1 calls a function in module2

module2 starts a thread that calls a function in module3
and then returns to module1

thread finishes and I need the return value from the thread
to use in module1 where program flow is continuing.

err, I hope I explained that well enough?

In other words need value from module3 passed
back to module1 (global variable)

I thought there was some sort of memory type container
I could use...

ie. store value in memory from module3 and be able to
read it from module1.

The closest I can think of is to pass a queue reference
and store values from module3 in the queue and then read them
back while in module1.

Maybe I'm missing something?
 
Christopher T King

I thought there was some sort of memory type container
I could use...

ie. store value in memory from module3 and be able to
read it from module1.

The closest I can think of is to pass a queue reference
and store values from module3 in the queue and then read them
back while in module1.

Actually, any kind of container object would do: you could pass a list,
dictionary, or class to the thread, and the thread could store its results
in it.

A hackish method would be to pass the dictionary returned by locals() in
module1 to the thread. This way the thread could set a value in module1
directly.

Just don't forget to .join() on the thread in module1 before accessing the
container object! (Assuming you're using threading) I just wish Python's
.join() could return values, like pthread_join can.
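A minimal sketch of the container approach, assuming the threading module. The names compute and results are hypothetical and only there for illustration; the worker stores its answer in a dictionary it was handed, and the caller reads it after join().

```python
import threading

def compute(results):
    # Hypothetical worker: store the answer in the container it was given.
    results["answer"] = sum(range(10))

results = {}
worker = threading.Thread(target=compute, args=(results,))
worker.start()
# ... module1 carries on with other work here ...
worker.join()                 # block until the worker has finished
answer = results["answer"]    # safe to read: the worker is no longer running
print(answer)
```

Reading the dictionary only after join() means there is no concurrent access to worry about.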
 
David Bolen

Christopher T King said:
Just don't forget to .join() on the thread in module1 before accessing the
container object! (Assuming you're using threading) I just wish Python's
.join() could return values, like pthread_join can.

Although if you can join on the thread, then you have to have a
reference to the thread object, at which point you can do anything you
want in terms of permitting state to be interrogated on that object
(direct attribute access, getters, etc...), which is even better than
having join return an object.
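One way this might look, as a sketch only: a hypothetical Thread subclass (ResultThread is not a standard class) that keeps the return value of its function as an attribute, so whoever holds the thread object can read it after join().

```python
import threading

class ResultThread(threading.Thread):
    """Hypothetical Thread subclass that remembers its function's return value."""

    def __init__(self, work_func, *work_args):
        super().__init__()
        self.work_func = work_func
        self.work_args = work_args
        self.result = None          # filled in by run() when the work is done

    def run(self):
        # Runs in the new thread once start() has been called.
        self.result = self.work_func(*self.work_args)

def slow_square(x):
    return x * x

t = ResultThread(slow_square, 7)
t.start()
t.join()
print(t.result)
```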

-- David
 
Jeff Shannon

Christopher said:
Actually, any kind of container object would do: you could pass a list,
dictionary, or class to the thread, and the thread could store its results
in it.

A hackish method would be to pass the dictionary returned by locals() in
module1 to the thread. This way the thread could set a value in module1
directly.

Just don't forget to .join() on the thread in module1 before accessing the
container object! (Assuming you're using threading) I just wish Python's
.join() could return values, like pthread_join can.

This may work if the worker thread will perform a relatively short task
and then die *before* you access the result. But lists and dictionaries
are not thread-safe -- if they are potentially accessed by multiple
threads concurrently, then the behavior will be unpredictable. (Think
of a case where thread B starts to update a dictionary, inserts a key
but is interrupted before it can attach a value to that key, and then
while thread B is interrupted thread A looks at the dictionary and finds
the key it's looking for, but with no valid reference as its value...
[Disclaimer: I don't know the inner workings of dictionaries well enough
to know if this exact situation is possible, but I do know that dicts
are not threadsafe, so something *similar* is possible...])

Unless you're certain that your worker thread can die before you get the
results, and you're able to have your main thread potentially sit around
doing nothing until that happens (which is what join() does), you need
to use a threadsafe method of passing data back and forth. The simplest
such method is indeed to use a queue. You *can* use some other
container, and protect access to that container through the use of
locks.... but that's exactly what a queue does, so why reinvent the wheel?
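A minimal sketch of the queue approach applied to the original three-module layout. The function names are hypothetical stand-ins for the real modules, and the module is spelled queue here as in Python 3 (it was Queue in Python 2).

```python
import queue
import threading

def module3_work(out_q):
    # Stand-in for the function in module3: put the result on the queue.
    out_q.put(6 * 7)

def module2_start(out_q):
    # Stand-in for module2: start the worker and return immediately.
    threading.Thread(target=module3_work, args=(out_q,)).start()

# "module1": create the queue, hand it down, pick the value up later.
result_q = queue.Queue(maxsize=1)
module2_start(result_q)
# ... other work continues here ...
value = result_q.get(timeout=5)   # waits only until the value is available
print(value)
```

The queue does its own locking, so module1 never needs a reference to the thread itself.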

Jeff Shannon
Technician/Programmer
Credit International
 
Christopher T King

This may work if the worker thread will perform a relatively short task
and then die *before* you access the result. But lists and dictionaries
are not thread-safe -- if they are potentially accessed by multiple
threads concurrently, then the behavior will be unpredictable.

That's where the .join() comes in handy (it blocks until the thread dies)
;)
 
Peter Hansen

Ken Godee wrote:

[detailed version of the question in the subject line]

This question is asked often enough that it maybe should
become a FAQ, but in any case it was asked a few weeks
ago and I posted sample code with the idiomatic approach,
as I recall. A Google Groups search with your keywords
and my name should find it pretty quick.

-Peter
 
Peter Hansen

Peter said:
Ken Godee wrote:

[detailed version of the question in the subject line]

This question is asked often enough that it maybe should
become a FAQ, but in any case it was asked a few weeks
ago and I posted sample code with the idiomatic approach,
as I recall. A Google Groups search with your keywords
and my name should find it pretty quick.

Oops, doesn't look like I'm remembering the right thread,
since I didn't post in it. Anand Pillai, however, did
post sample code:

http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&th=8b49efa529006da0&rnum=1

-Peter
 
Peter Hansen

Peter said:
Peter said:
Ken Godee wrote:

[detailed version of the question in the subject line]

This question is asked often enough that it maybe should
become a FAQ, but in any case it was asked a few weeks
ago and I posted sample code with the idiomatic approach,
as I recall. A Google Groups search with your keywords
and my name should find it pretty quick.

Oops, doesn't look like I'm remembering the right thread,
since I didn't post in it. Anand Pillai, however, did
post sample code:

Actually, my memory is better than my search skills today:

http://groups.google.ca/[email protected]
 
Ken Godee

Christopher said:
Actually, any kind of container object would do: you could pass a list,
dictionary, or class to the thread, and the thread could store its results
in it.

This was my original problem: the middle man (module2) that spins up
the thread (calling a function in module3) has returned to module1 before
the thread result is available. So I couldn't figure out a way to get the
result from module3 back to a global var in module1, since the middle man
was gone.

Christopher said:
A hackish method would be to pass the dictionary returned by locals() in
module1 to the thread. This way the thread could set a value in module1
directly.

Hhmmm, interesting, didn't know you could do that. Might have to try it.

Christopher said:
Just don't forget to .join() on the thread in module1 before accessing the
container object! (Assuming you're using threading) I just wish Python's
.join() could return values, like pthread_join can.

Using thread here.
But I really have nothing against using a queue. I've used queues before,
but more for a kind of stream of data, and was thinking I was missing
something because I only want to pass back a single value, so Queue(1)
would probably be all right.

Thanks for all the responses.

Ken
 
Jeff Shannon

Christopher said:
That's where the .join() comes in handy (it blocks until the thread dies)
;)

True, but quite frankly, I don't see much value in starting a new thread
if you're only going to be sitting around waiting for that thread to
finish. At that point, where's the gain over simply calling a
function? This may be a matter of style, but I see two major uses for
worker threads. In one case, you're doing a little bit of work here, a
little bit of work there, a little bit here again... back and forth.
The other is the case where you have a long-running task, but want your
user interface to remain responsive while that task is running. (For
instance, most GUI frameworks will have problems if their event queues
aren't serviced regularly, and a lengthy calculation can prevent that
unless you put it in another thread.) This would also include the case
of a service/daemon which must remain responsive to new requests, so it
fires up a separate thread to handle each incoming request. In the
first case, you need to synchronize execution between the threads at
multiple points; in the second, the threads each go about their own
thing until the worker is finished, at which point it must notify the
main thread. In neither case is it practical to join() the worker
thread, because the whole point is that the main thread can't just sit
and do nothing.

If you *are* going to join(), then the advantages of concurrent
execution get thrown away while sitting at that join(). I see no
practical advantage of this:

workerthread = threading.Thread(target=somefunc)
workerthread.start()
somestuff()
someotherstuff()
workerthread.join()

vs this:

somestuff()
someotherstuff()
somefunc()

The only way you could possibly get a speed advantage from using a
thread is on a multiprocessor machine, and I don't believe that Python
currently makes use of multiple processors anyhow (though I think that
it's possible for C extension modules to do so). The flow-control
advantage of using a thread, that your program flow doesn't have to wait
for the task to finish, is lost when you are waiting to join(). All
you're left with is the added complexity. Now, if the worker thread
needs to finish some subtask before, say, someotherstuff() is called,
then you've got a potential for flow-control advantage... but you also
need synchronization that's a lot more advanced than just waiting for
the thread to die. If you don't need thread-safe communication between
the threads, then you don't need threads, at least IMO.

Jeff Shannon
Technician/Programmer
Credit International
 
Christopher T King

If you *are* going to join(), then the advantages of concurrent
execution get thrown away while sitting at that join(). I see no
practical advantage of this:

workerthread = threading.Thread(target=somefunc)
workerthread.start()
somestuff()
someotherstuff()
workerthread.join()

vs this:

somestuff()
someotherstuff()
somefunc()

The only way you could possibly get a speed advantage from using a
thread is on a multiprocessor machine, and I don't believe that Python
currently makes use of multiple processors anyhow (though I think that
it's possible for C extension modules to do so). The flow-control
advantage of using a thread, that your program flow doesn't have to wait
for the task to finish, is lost when you are waiting to join(). All
you're left with is the added complexity.

How about:

blocking_IO_thread = threading.Thread(target=blocking_IO_func)
blocking_IO_thread.start()
somestuff()
someotherstuff()
blocking_IO_thread.join()

or:

workerthread = threading.Thread(target=somefunc)
workerthread.start()
do_screen_updates()
wait_for_user_request()
workerthread.join()

Neither of those needs multiple processors to show (possibly huge)
performance gains. Either way, I didn't make this problem up: the OP
asked how to get a value back from a thread (a reasonable thing to do),
not whether doing such a thing was the correct way to code his problem
(which I have no reason to doubt).
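A rough sketch of the blocking I/O case on a single processor. Here time.sleep stands in for a blocking read (an assumption for illustration; like real blocking I/O, it releases the GIL while waiting), and the function names are hypothetical.

```python
import threading
import time

def blocking_io():
    # Stand-in for a blocking read on a socket or file.
    time.sleep(0.3)

def other_work():
    # Stand-in for whatever the main thread does in the meantime.
    time.sleep(0.3)

start = time.monotonic()
io_thread = threading.Thread(target=blocking_io)
io_thread.start()      # start(), not run(): run() would execute in this thread
other_work()
io_thread.join()
elapsed = time.monotonic() - start
print(f"elapsed: {elapsed:.2f}s")
```

Run one after the other, the two waits would take about 0.6 seconds; overlapped, the total should stay close to 0.3.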
 
Jeff Shannon

Christopher said:
How about:

blocking_IO_thread = threading.Thread(target=blocking_IO_func)
blocking_IO_thread.start()
somestuff()
someotherstuff()
blocking_IO_thread.join()

or:

workerthread = threading.Thread(target=somefunc)
workerthread.start()
do_screen_updates()
wait_for_user_request()
workerthread.join()

Neither of those needs multiple processors to show (possibly huge)
performance gains.

Hmmm... I suppose so, although it seems to me that in most cases, you
will be waiting for multiple chunks of blocking IO (in which case the
worker thread should feed them through a queue for the main thread to
process as they arrive), or needing to do an arbitrary number of screen
updates / other user requests before the thread is finished (i.e., the
"responsive UI" case I mentioned). The blocking IO case does have some
merit, I'll admit -- I may be somewhat biased by the problems I've been
working on, where blocking IO hasn't been an issue, so I didn't consider
this. (I still have to say that I can't imagine *many* cases where one
would have background processing where a specific number of UI
interactions is appropriate -- if you need UI updates, then you're
probably going to need an arbitrary number of them, in which case you
need to check whether the thread is done rather than wait for it -- but
I suppose that such cases may exist.)
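To illustrate checking rather than waiting, here is a small sketch with no GUI toolkit involved. The loop body is a hypothetical stand-in for servicing UI events, and the worker reports back through a queue that the main thread polls without blocking.

```python
import queue
import threading
import time

def worker(out_q):
    time.sleep(0.2)            # stand-in for a long-running task
    out_q.put("done")

results = queue.Queue()
threading.Thread(target=worker, args=(results,)).start()

ticks = 0
result = None
while result is None:
    try:
        result = results.get_nowait()   # never blocks
    except queue.Empty:
        ticks += 1                      # stand-in for one round of UI updates
        time.sleep(0.01)
print(result, "after", ticks, "UI ticks")
```

The number of UI ticks is not fixed in advance; the loop simply runs until the result shows up.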
Christopher said:
Either way, I didn't make this problem up: the OP
asked how to get a value back from a thread (a reasonable thing to do),
not whether doing such a thing was the correct way to code his problem
(which I have no reason to doubt).

You did, however, give an answer that will only work effectively in a
particular subset of cases that fit the OP's description, without being
specific about those limitations. It may well be that your solution
will work for the OP, but since this list is archived and frequently
searched, it's always good to explain *why* a given solution works or
doesn't work. The OP is not the only person who'll be reading, so it is
(at least IMO) beneficial to speak to a somewhat more general case, or
at least to be clear about what cases one *is* speaking to.

Jeff Shannon
Technician/Programmer
Credit International
 
David Bolen

(...)
This was my original problem, being the middle man (module2) that spins
the thread(calling function in module3) has returned to module1 before
the thread result was available. So I couldn't figure a way to get the
result from module3 back to a global var in module1, since middle man
was gone. (...)
Using thread here.
But I really have nothing against using queue, I've used queues before
but more for a kind of stream of data and was thinking I was missing
something because I only want to pass back a single value, so Queue(1)
would probably be alright.

Have you considered simply making the function in module2 return the
thread object that it has started as its response? That thread object
could easily support an interface to interrogate its status or result
(when available).

That's not very different from returning a Queue object to which the
thread will write its result, but by returning the thread object
directly you can support a much richer interface for the original
caller, whether a blocking call to wait for a result, or a way to
interrogate if a result is available yet. It would also work with
more than a single result or action to perform upon completion if you
needed that.
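A sketch of what such an interface might look like. Task, ready(), result() and start_task() are hypothetical names chosen for illustration, not part of the threading module; the only library pieces used are threading.Thread and threading.Event.

```python
import threading

class Task(threading.Thread):
    """Hypothetical thread object that module2 could hand back to module1."""

    def __init__(self, work_func):
        super().__init__()
        self.work_func = work_func
        self.finished_event = threading.Event()
        self.value = None

    def run(self):
        self.value = self.work_func()
        self.finished_event.set()       # signal that the value is now valid

    def ready(self):
        # Non-blocking check: has a result been stored yet?
        return self.finished_event.is_set()

    def result(self, timeout=None):
        # Blocking call: wait up to `timeout` seconds for the result.
        if not self.finished_event.wait(timeout):
            raise TimeoutError("result not available yet")
        return self.value

def start_task(work_func):
    # Stand-in for the function in module2: start the thread and return it.
    task = Task(work_func)
    task.start()
    return task

task = start_task(lambda: "value from module3")
value = task.result(timeout=5)
print(value, task.ready())
```

The caller can poll ready() from a UI loop or call result() when it is prepared to wait. This sketch does not handle an exception raised inside work_func.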

-- David
 
Ken Godee

Using thread here.
Have you considered simply making the function in module2 return the
thread object that it has started as its response? That thread object
could easily support an interface to interrogate its status or result
(when available).

I'm weaving quite a snake here already; there are even more modules
in the flow than what I've mentioned, including GUI stuff.
Didn't want to rewrite or make it tougher than need be just
to get a single value back to module1.

The returned value is not very time sensitive, just needed later
in program execution. So far this is the only thread I've needed
to retain UI response.

Although I'm learning a lot more about threading and would set things
up quite differently if I were starting this program from scratch.
 
Jeff Shannon

Ken said:
I'm weaving quite a snake here already; there are even more modules
in the flow than what I've mentioned, including GUI stuff.
Didn't want to rewrite or make it tougher than need be just
to get a single value back to module1.


You may want to look further into your GUI toolkit, as well. I know
that wxPython has a thread-safe way of sending messages to a particular
window (wxPostEvent), which can be used to signal task completion. I
don't specifically know about other toolkits, but I wouldn't be at all
surprised if this is a common feature, since signalling worker-thread
completion *is* a fairly common task in GUI programs.

Jeff Shannon
Technician/Programmer
Credit International
 
Ken Godee

Jeff said:
You may want to look further into your GUI toolkit, as well. I know
that wxPython has a thread-safe way of sending messages to a particular
window (wxPostEvent), which can be used to signal task completion. I
don't specifically know about other toolkits, but I wouldn't be at all
surprised if this is a common feature, since signalling worker-thread
completion *is* a fairly common task in GUI programs.

Using PyQt here. Looked into QThreads already.
Also already using python threads and don't want to mix.
Been testing out using a Queue this evening and it is working
perfectly.

Thanks,
Ken
 
Antoon Pardon

Jeff Shannon said:
This may work if the worker thread will perform a relatively short task
and then die *before* you access the result. But lists and dictionaries
are not thread-safe -- if they are potentially accessed by multiple
threads concurrently, then the behavior will be unpredictable.

I thought the GIL was supposed to take care of that.

Jeff Shannon said:
(Think
of a case where thread B starts to update a dictionary, inserts a key
but is interrupted before it can attach a value to that key, and then
while thread B is interrupted thread A looks at the dictionary and finds
the key it's looking for, but with no valid reference as its value...
[Disclaimer: I don't know the inner workings of dictionaries well enough
to know if this exact situation is possible, but I do know that dicts
are not threadsafe, so something *similar* is possible...])

My understanding was that the GIL is there to guarantee that python
statements are atomic. Now if your statements here are correct that
is not the case. So what is the GIL supposed to do?
 
David Bolen

Antoon Pardon said:
I thought the GIL was supposed to take care of that.

It takes care of basic integrity of the interpreter, but depending on
your definition of "thread-safe" (which is a fairly ambiguous term)
you will likely need additional synchronization controls above and
beyond the GIL. There have been a number of discussions relating to
this in the past.

Jeff Shannon said:
(Think
of a case where thread B starts to update a dictionary, inserts a key
but is interrupted before it can attach a value to that key, and then
while thread B is interrupted thread A looks at the dictionary and finds
the key it's looking for, but with no valid reference as its value...
[Disclaimer: I don't know the inner workings of dictionaries well enough
to know if this exact situation is possible, but I do know that dicts
are not threadsafe, so something *similar* is possible...])

Antoon Pardon said:
My understanding was that the GIL is there to guarantee that python
statements are atomic. Now if your statements here are correct that
is not the case. So what is the GIL supposed to do?

Yes, I do believe the GIL will protect against the specific example
cited, at least for the built-in dict type (no guarantees against
Python level subclasses). That is, the act of inserting a key/value
pair into a built-in dictionary is atomic with respect to the Python
byte code interpreter since it occurs within the C core under control
of the GIL.

To the extent that you only care about the physical integrity of a
dictionary (e.g., the sort of internal state mismatch discussed
above), a dictionary can be considered thread-safe. There's no way
(from Python code) to create a key in a built-in dictionary without
some sort of value, nor for that operation to be interrupted (at the
Python bytecode level) once begun.

Likewise, a list is thread-safe to the point that there is no way to
create an "inconsistent" list from Python code - it may or may not
have the precise values you expect depending on sequence of execution,
but it'll have or not have the values, nothing in between.

The risk is in thinking that the above makes any general use of a
mutable container or other state objects within a multi-threaded
application thread-safe. At that point you need to consider
thread-safety at the appropriate level of granularity, which is
typically more than a single object or container, but often the
interaction of multiple state elements within the thread object that
need to remain consistent. Or even the need to have multiple elements
of a container kept in sync.

In other cases, thread objects may be using a mutable container object
(perhaps a Python class that works just like a dictionary) that itself
has imposed additional state information above and beyond the built-in
object it emulates or subclasses. In such cases the above guarantee
no longer holds since there is Python code handling state that can be
interrupted and result in an inconsistency.

The way I tend to think of it is that Python's job (from the
perspective of supporting a multi-threaded application) is to ensure
that its native data types remain internally consistent, in terms of
providing their proper functionality, in the presence of multiple
threads, but nothing more. Anything above that level is the
application's responsibility, and in almost all cases means you need
your own synchronization control to manage any state information.
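As a small illustration of application-level synchronization, assuming a shared counter kept in a built-in dict (the names counts, counts_lock and record are hypothetical): each individual dict operation leaves the dict consistent, but the read-modify-write sequence as a whole is not atomic, so it is guarded with a lock.

```python
import threading

counts = {}
counts_lock = threading.Lock()

def record(key, times):
    for _ in range(times):
        # get() followed by item assignment is a multi-step update, so
        # another thread could run in between; the lock prevents that.
        with counts_lock:
            counts[key] = counts.get(key, 0) + 1

threads = [threading.Thread(target=record, args=("hits", 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts["hits"])
```

Without the lock the dict itself would still be intact, but updates could be lost and the final count could come out lower than expected.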

Lastly, while this discussion has been, I believe, CPython specific
due to the GIL, for my part I have believed the prior point (basic
built-in object internal consistency) to be true in any Python
implementation, including Jython, but can't recall if I saw that
stated anywhere, so I suppose it's possible there may be some
additional risk in Jython or other implementations.

-- David
 
Antoon Pardon

David Bolen said:
[ ... ]

The way I tend to think of it is that Python's job (from the
perspective of supporting a multi-threaded application) is to ensure
that its native data types remain internally consistent, in terms of
providing their proper functionality, in the presence of multiple
threads, but nothing more. Anything above that level is the
application's responsibility, and in almost all cases means you need
your own synchronization control to manage any state information.

Lastly, while this discussion has been, I believe, CPython specific
due to the GIL, for my part I have believed the prior point (basic
built-in object internal consistency) to be true in any Python
implementation, including Jython, but can't recall if I saw that
stated anywhere, so I suppose it's possible there may be some
additional risk in Jython or other implementations.

Thanks for the explanation.

I still have a question. Is the GIL necessary even if there is
no sharing of data between threads?
 
David Bolen

Antoon Pardon said:
Thanks for the explanation.

I still have a question. Is the GIL necessary even if there is
no sharing of data between threads?

Yes, because the GIL's primary purpose is to protect the Python
interpreter itself, and not the application. That is, the GIL ensures
the integrity of the Python interpreter state, and the state of
individual objects managed by the C core code.

Without the GIL, problems such as that envisioned by the prior poster
(a dictionary becoming internally inconsistent) could certainly arise.

You can approach this problem in other ways than a GIL (e.g.,
fine-grained object resource locks), but as many historical threads
have discussed, the GIL has proven to be a reasonable trade-off in
reliability, including maintainability, and performance - albeit with
a significant con with respect to SMP. Attempts to implement
fine-grained locking haven't generally panned out to the point where
it's worth switching.

-- David
 
