Threading a lengthy C function


Leo Breebaart

Hi all,

In a GUI program I'm currently developing (wxWindows, if that
matters), I am trying to get some time-consuming tasks (triggered
by e.g. the user choosing a menu item) out of the interface
thread and into 'worker' threads of their own, in order to stop
them from locking up the GUI.


If I subclass threading.Thread, and I override run() as follows:

    def run(self):
        retvalue = time.sleep(50)

then everything works perfectly fine when I start up this Thread
in response to the user trigger -- the GUI unlocks and the
application windows will repaint correctly, etc.
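
For anyone who wants to reproduce the working case, here is a minimal, self-contained sketch. The class name SleepWorker, the shortened sleep, and the polling loop that stands in for the GUI event loop are illustrative assumptions, not code from the actual application.

```python
import threading
import time


class SleepWorker(threading.Thread):
    """Worker thread whose run() blocks in time.sleep()."""

    def __init__(self, seconds):
        super().__init__()
        self.seconds = seconds
        self.retvalue = "not finished"

    def run(self):
        # time.sleep() releases the GIL while it waits, so other
        # Python threads keep running. It returns None.
        self.retvalue = time.sleep(self.seconds)


def gui_stand_in(seconds=0.5):
    """Count how often the 'GUI' loop gets to run while the worker sleeps."""
    worker = SleepWorker(seconds)
    worker.start()
    repaints = 0
    while worker.is_alive():
        repaints += 1          # a real GUI would process events here
        worker.join(0.01)      # wait briefly, then loop again
    return repaints
```

If the main loop keeps counting while the worker sleeps, the worker is not holding the GIL.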

If, however, I use the kind of run() method that I *really* need:

    def run(self):
        retvalue = mypackage.sleep(50)

where mypackage.sleep() is in fact a SWIG-generated wrapper
around the C library sleep() function, I have no such luck: the
entire application will *still* lock up completely until this
task is done.

My guess would be (but I'm new to Python, so I may be entirely
wrong) that this has something to do with that Global Interpreter
Lock thing I keep reading about. I can imagine that if my lengthy
task is a Python task, the interpreter will get the chance to
switch between threads, but that if the task is an external C
function, this will be executed 'atomically' with no chance for
the interpreter to release the GIL to other threads.

I'd like to know if this interpretation of what's happening is
correct (or if it isn't, then what is?), but most importantly I
was wondering if anybody could point me in the direction of a
solution or workaround to my actual problem.

Many thanks in advance,
 

Duncan Booth

Leo Breebaart said:
I'd like to know if this interpretation of what's happening is
correct (or if it isn't, then what is?), but most importantly I
was wondering if anybody could point me in the direction of a
solution or workaround to my actual problem.

Your interpretation sounds spot on.

What you need to do is to release the GIL as the last thing that happens in
your C code before you call the lengthy function, and reclaim it as the
first thing you do when the function returns.

It may be easiest simply to wrap the library function in another C function
that does this, and then use SWIG to wrap the new function instead of the
original. e.g.

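    #include <Python.h>  /* capital P: the header name is case-sensitive */

    int thefunction(char *arg, int arg2);  /* the original library function */

    int safe_thefunction(char *arg, int arg2)
    {
        int res;
        Py_BEGIN_ALLOW_THREADS          /* release the GIL */
        res = thefunction(arg, arg2);   /* no Python API calls in here */
        Py_END_ALLOW_THREADS            /* reacquire the GIL */
        return res;
    }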

and then wrap safe_thefunction instead of thefunction.

If the function you are wrapping can call back into Python then you must
also be sure to reclaim the GIL during each callback. This is easy enough
to do if the callbacks all happen on the same thread, but can be a pain if
there are asynchronous callbacks on other threads.
 

sdd

Leo said:
...
If, however, I use the kind of run() method that I *really* need:

    def run(self):
        retvalue = mypackage.sleep(50)

where mypackage.sleep() is in fact a SWIG-generated wrapper
around the C library sleep() function the entire application
will *still* lock up completely until this task is done.

As you suspect, this is the GIL at work.

I was wondering if anybody could point me in the direction of
a solution or workaround to my actual problem.

The question is what you want to do. It seems you want the
python code to run in a thread in parallel with the mypackage.
Do you want the code in mypackage to be able to run in parallel
with itself? Lots of C code assumes it is the only thread
manipulating its variables, but some doesn't. Typically
you will need a "mypackage" lock. Your C interface code will
have to do something conceptually like:

    release_the_GIL();               /*A*/
    acquire_the_mypackage_lock();    /*B*/
    perform mypackage.whatever();    /*C*/
    release_the_mypackage_lock();    /*D*/
    acquire_the_GIL();               /*E*/

_BUT_ after you "release the GIL" (between points A and E) you
may no longer talk to python. You cannot call conversion
functions, python memory allocators, etc. because you might be
running during the garbage collector, or an awkwardly timed
call to the same or a similar memory allocator.

For similar reasons, you cannot even read memory inside python
objects that is mutable, nor should you access memory inside
immutable python objects where you don't "hold a refcnt."

There are similar constraints on data from mypackage. You
need to think long and hard about the orders of both the
"drop the GIL" and "grab the mypackage lock" prelude to
calling whatever, as well as a long similar thought about
"grab the GIL" and "drop the mypackage lock" postlude to
the call.

The simplest way to think about this stuff is that with GIL
you may read/write python memory. With the whatever lock,
you may read/write whatever memory safely. Only if you
hold both locks can you move information between the two.
Think of acquire as a wait, and release as a potentially
very slow operation (it may immediately go do the work
that the lock was holding up). The potentially slow nature
of both acquire and release is what makes these decisions
tough.

If you didn't really care about the latencies at the lock
operations, this would be easiest:

    acquire_the_mypackage_lock();    /*AA*/
    release_the_GIL();               /*BB*/
    perform mypackage.whatever();    /*CC*/
    acquire_the_GIL();               /*DD*/
    release_the_mypackage_lock();    /*EE*/

At point AA, you move data (inputs) from python to mypackage
memory. At point DD, you move data (outputs) from mypackage
to python memory.
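
The AA to EE ordering can be tried out from pure Python. The sketch below makes several assumptions: it uses ctypes (which releases the GIL around each foreign call) rather than hand-written C, it assumes a POSIX system, the C library's usleep() stands in for a non-thread-safe mypackage.whatever(), and mypackage_lock, whatever and run_workers are made-up names.

```python
import ctypes
import ctypes.util
import threading

# Assumption: POSIX system; usleep() stands in for the real library call.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)
libc.usleep.argtypes = [ctypes.c_uint]
libc.usleep.restype = ctypes.c_int

mypackage_lock = threading.Lock()   # hypothetical lock guarding the C library
_active = 0                         # threads currently inside the library
max_active = 0                      # highest value _active ever reached


def whatever(microseconds):
    global _active, max_active
    with mypackage_lock:                      # AA: take the library lock
        _active += 1
        max_active = max(max_active, _active)
        # BB..DD: ctypes drops the GIL before the call and takes it
        # back after the call returns.
        libc.usleep(microseconds)
        _active -= 1
    # EE: the library lock is released when the with-block exits.


def run_workers(n=4, microseconds=20000):
    threads = [threading.Thread(target=whatever, args=(microseconds,))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active
```

Because every call goes through mypackage_lock, only one thread is ever inside the library at a time, while the rest of the Python program keeps running.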

With all of the above in mind, the python version you are
working with makes a big difference; the GIL re-acquire
is being simplified. 2.3 is simpler than earlier versions.
A good strategy is to grab recent python source that behaves
in a way you like, understand how it does its locks, and copy
it.

-Scott David Daniels
 

Thomas Heller

sdd said:
As you suspect, this is the GIL at work.

I always wondered why (apparently) SWIG does not release the lock before
calling into C code and acquire it back afterwards.

If the original poster's intent is simply to call functions in a dll, he
should probably try out ctypes - ctypes handles the GIL automatically
(release before calling the function, acquire it back after return of
the function, and even grabbing the GIL if a callback into Python is
done). And it avoids SWIG completely.
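
A rough sketch of what that could look like for the sleep() case, written for a current Python 3 where ctypes is in the standard library. It assumes a POSIX system where the C library can be loaded as shown; the names Worker and count_ticks_while_sleeping are made up for the example.

```python
import ctypes
import ctypes.util
import threading
import time

# Assumption: POSIX system whose C library exports sleep().
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)
libc.sleep.argtypes = [ctypes.c_uint]
libc.sleep.restype = ctypes.c_uint


class Worker(threading.Thread):
    def __init__(self, seconds):
        super().__init__()
        self.seconds = seconds
        self.retvalue = None

    def run(self):
        # ctypes releases the GIL before entering the C function
        # and reacquires it when the function returns.
        self.retvalue = libc.sleep(self.seconds)


def count_ticks_while_sleeping(seconds=1):
    """Run C sleep() in a worker and count main-thread iterations meanwhile."""
    worker = Worker(seconds)
    worker.start()
    ticks = 0
    while worker.is_alive():
        ticks += 1             # stands in for GUI event processing
        time.sleep(0.01)
    worker.join()
    return ticks, worker.retvalue
```

If the main thread keeps ticking while the C sleep() runs, the GIL was released for the duration of the call.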

Thomas
 

Skip Montanaro

Thomas> I always wondered why (apparently) SWIG does not release the
Thomas> lock before calling into C code and acquire it back afterwards.

Because it can't tell if the C code it's going to call will or won't make a
call back to Python. In most cases that would work, but in a few cases it
could lead to nasty bugs.

Skip
 

Robin Becker

..
......
If the original poster's intent is simply to call functions in a dll, he
should probably try out ctypes - ctypes handles the GIL automatically
(release before calling the function, acquire it back after return of
the function, and even grabbing the GIL if a callback into Python is
done). And it avoids SWIG completely.

Thomas

I'm not sure I understand how the last part is done unless all callback
functions are known to ctypes and even then I'm not sure how ctypes
would know how to associate a particular call back event with the owner
thread. Are ctypes callbacks somehow defined uniquely for each callout?

The other thing I never quite get is what part of the C api I'm allowed
to use without having the GIL. I think it safest to assume I may not call
any python api without having it.

Wrapping completely independent DLLs seems reasonable, but how does
ctypes know that a particular extension/dll actually assumes that it has
the GIL at entry which is mostly what extensions need.
 

Thomas Heller

Skip Montanaro said:
Thomas> I always wondered why (apparently) SWIG does not release the
Thomas> lock before calling into C code and acquire it back afterwards.

Because it can't tell if the C code it's going to call will or won't make a
call back to Python. In most cases that would work, but in a few cases it
could lead to nasty bugs.

Hm, which kind of bug could that be? If the function doesn't call back
into python, everything's fine. And if it does call back, it has to
acquire the lock again...

Thomas
 

Thomas Heller

[Thomas]
If the original poster's intent is simply to call functions in a dll, he
should probably try out ctypes - ctypes handles the GIL automatically
(release before calling the function, acquire it back after return of
the function, and even grabbing the GIL if a callback into Python is
done). And it avoids SWIG completely.
[Robin]

I'm not sure I understand how the last part is done unless all callback
functions are known to ctypes and even then I'm not sure how ctypes
would know how to associate a particular call back event with the owner
thread. Are ctypes callbacks somehow defined uniquely for each callout?

No. In Python 2.2, ctypes creates a new thread state for each callback
into Python. In 2.3, the mechanism from PEP 311 is used, which
magically gets the thread state.
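
To make the callback case concrete, here is a small sketch that hands a Python comparison function to the C library's qsort(). It assumes a POSIX-like system and a current Python 3; py_compare and sort_with_callback are illustrative names. The user code never touches the GIL or the thread state itself.

```python
import ctypes
import ctypes.util

# Assumption: POSIX-like system whose C library exports qsort().
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# C type of the comparison callback: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None


def py_compare(a, b):
    # Called from inside qsort(); ctypes acquires the GIL before
    # running this Python code and releases it again afterwards.
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_with_callback(values):
    array = (ctypes.c_int * len(values))(*values)
    callback = CMPFUNC(py_compare)   # keep a reference while C may call it
    libc.qsort(array, len(array), ctypes.sizeof(ctypes.c_int), callback)
    return list(array)
```

For example, sort_with_callback([5, 1, 7, 33, 99]) sorts the values in C while calling back into Python for every comparison.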

The other thing I never quite get is what part of the C api I'm allowed
to use without having the GIL. I think it safest to assume I may not call
any python api without having it.

*Very* few functions; the docs mention which ones.

Wrapping completely independent DLLs seems reasonable, but how does
ctypes know that a particular extension/dll actually assumes that it has
the GIL at entry which is mostly what extensions need.

ctypes isn't (or shouldn't be) used to call Python C api functions, only
independent DLLs. (Hm, didn't I post sick examples calling Python's C
api in the past myself <wink>).

Thomas
 

Leo Breebaart

My thanks to everyone who participated in this, ahem, thread.
Your responses have been very helpful -- this newsgroup is a
wonderful resource for the beginning Python programmer.


Duncan Booth said:
It may be easiest simply to wrap the library function in another C function
that does this, and then use SWIG to wrap the new function instead of the
original. e.g.

    #include <Python.h>
    int safe_thefunction(char *arg, int arg2)
    {
        int res;
        Py_BEGIN_ALLOW_THREADS
        res = thefunction(arg, arg2);
        Py_END_ALLOW_THREADS
        return res;
    }

This is more or less exactly what I ended up doing, and
everything works like a charm now.
 
