Multiple independent Python interpreters in a C/C++ program?

S

skip

This question was posed to me today. Given a C/C++ program we can clearly
embed a Python interpreter in it. Is it possible to fire up multiple
interpreters in multiple threads? For example:

C++ main
thread 1
Py_Initialize()
thread 2
Py_Initialize()

Do I wind up with two completely independent interpreters, one per thread?
I'm thinking this doesn't work (there are bits which aren't thread-safe and
are only protected by the GIL), but wanted to double-check to be sure.

Thanks,

Skip
 
D

Diez B. Roggisch

This question was posed to me today. Given a C/C++ program we can clearly
embed a Python interpreter in it. Is it possible to fire up multiple
interpreters in multiple threads? For example:

C++ main
thread 1
Py_Initialize()
thread 2
Py_Initialize()

Do I wind up with two completely independent interpreters, one per thread?
I'm thinking this doesn't work (there are bits which aren't thread-safe and
are only protected by the GIL), but wanted to double-check to be sure.

AFAIK there was a thread a few month ago that stated that this is
actually possible - but mostly 3rd-party-c-extension aren't capable of
supporting that. Martin von Loewis had a word in that, maybe googling
with that in mind reveals the discussion.

And of course its a *bad* idea to pass objects between threads...

Diez
 
M

Matimus

This question was posed to me today. Given a C/C++ program we can clearly
embed a Python interpreter in it. Is it possible to fire up multiple
interpreters in multiple threads? For example:

C++ main
thread 1
Py_Initialize()
thread 2
Py_Initialize()

Do I wind up with two completely independent interpreters, one per thread?
I'm thinking this doesn't work (there are bits which aren't thread-safe and
are only protected by the GIL), but wanted to double-check to be sure.

Thanks,

Skip

AFAIK it is only `sort of' possible. But not like that. See the API
documentation on Py_NewInterpreter http://www.python.org/doc/api/initialization.html#l2h-825

As you will see from the documentation it creates an interpreter that
isn't 100% separate. It may work for your application though.

Also, I don't think using the new interpreter is as simple as just
instantiating the new interpreter in a separate thread. But there is
much relevant information on the page that I linked to.

Matt
 
C

Christian Heimes

Diez said:
AFAIK there was a thread a few month ago that stated that this is
actually possible - but mostly 3rd-party-c-extension aren't capable of
supporting that. Martin von Loewis had a word in that, maybe googling
with that in mind reveals the discussion.

And of course its a *bad* idea to pass objects between threads...

http://www.python.org/dev/peps/pep-3121/

Christian
 
R

Rhamphoryncus

This question was posed to me today. Given a C/C++ program we can clearly
embed a Python interpreter in it. Is it possible to fire up multiple
interpreters in multiple threads? For example:

C++ main
thread 1
Py_Initialize()
thread 2
Py_Initialize()

Do I wind up with two completely independent interpreters, one per thread?
I'm thinking this doesn't work (there are bits which aren't thread-safe and
are only protected by the GIL), but wanted to double-check to be sure.

Since they interpreters would never truly be separate, I think it's
best to bite the bullet and use multiple threads within one
interpreter. The only really difference is pure python modules aren't
duplicated, but why would you need that? Surely not for any kind of
security.
 
S

sturlamolden

Do I wind up with two completely independent interpreters, one per thread?
I'm thinking this doesn't work (there are bits which aren't thread-safe and
are only protected by the GIL), but wanted to double-check to be sure.

You can create a new interpreter with a call to Py_NewInterpreter.

However, the GIL is a global object for the process. If you have more
than two interpreters in the process, they share the same GIL.

In tcl, each thread has its own interpreter instance and no GIL is
shared. This circumvents most of the problems with a global GIL.

In theory, a GIL private to each interpreter would make Python more
scalable. The current GIL behaves like the BKL in earlier Linux
kernels. However, some third-party software, notably Apache's
mod_python, is claimed to depend on this behaviour.
 
S

sturlamolden

Do I wind up with two completely independent interpreters, one per thread?
I'm thinking this doesn't work (there are bits which aren't thread-safe and
are only protected by the GIL), but wanted to double-check to be sure.

You can create a new subinterpreter with a call to Py_NewInterpreter.
You get a nwe interpreter, but not an independent one. The GIL is a
global object for the process. If you have more than one interpreter
in the process, they share the same GIL.

In tcl, each thread has its own interpreter instance and no GIL is
shared. This circumvents most of the problems with a global GIL.

In theory, a GIL private to each (sub)interpreter would make Python
more scalable. The current GIL behaves like the BKL in earlier Linux
kernels. However, some third-party software, notably Apache's
mod_python, is claimed to depend on this behaviour.
 
S

sturlamolden

In theory, a GIL private to each (sub)interpreter would make Python
more scalable. The current GIL behaves like the BKL in earlier Linux
kernels. However, some third-party software, notably Apache's
mod_python, is claimed to depend on this behaviour.

I just looked into the reason why ctypes, mod_python, etc. depend on a
shared GIL. Well ... PyGILState_Ensure() does not take an argument, so
it does not know which interpreter's GIL to acquire. Duh!

The sad fact is, the functions in Python's C API does not take the
interpreter as an argument, so they default to the one that is
currently active (i.e. the one that PyThreadState_Get()->interp points
to). This is unlike Java's JNI, in which all functions take the
"environment" (a Java VM instance) as the first argument.

IMHO, I consider this a major design flaw in Python's C API. In a more
well thought API, PyGILState_Ensure would take the interpreter
returned by Py_NewInterpreter as an argument, and thus know the
interpreter with which to synchronize.

I complained about this before, but the answer I got was that the
simplified GIL API would not work if interpreters has a separate GIL.
Obviously it would not work as long as the PyGILState_Ensure does not
take any arguments. The lack of a leading VM argument in
PyGILState_Ensure, and almost every function in Python's C API, is the
heart of the problem.
 
G

Graham Dumpleton

You can create a new subinterpreter with a call to Py_NewInterpreter.
You get a nwe interpreter, but not an independent one. The GIL is a
global object for the process. If you have more than one interpreter
in the process, they share the same GIL.

In tcl, each thread has its own interpreter instance and no GIL is
shared. This circumvents most of the problems with a global GIL.

In theory, a GIL private to each (sub)interpreter would make Python
more scalable. The current GIL behaves like the BKL in earlier Linux
kernels. However, some third-party software, notably Apache'smod_python, is claimed to depend on this behaviour.

I wouldn't use mod_python as a good guide on how to do this as it
doesn't properly use PyGILState_Ensure() for main interpreter like it
should. If you want an example of how to do this, have a look at code
for mod_wsgi instead. If you want it to work for Python 3.0 as well as
Python 2.X, make sure you look at mod_wsgi source code from subversion
repository trunk as tweak had to be made to source to support Python
3.0. This is because in Python 3.0 it is no longer sufficient to hold
only the GIL when using string/unicode functions, you also need a
proper thread state to be active now.

Do note that although multiple sub interpreters can be made to work,
destroying sub interpreters within the context of a running process,
ie., before the process ends, can be a cause for various problems with
third party C extension modules and thus would advise that once a sub
interpreter has been created, you keep it and use it for the life of
the process.

Graham
 
R

Rhamphoryncus

I just looked into the reason why ctypes, mod_python, etc. depend on a
shared GIL. Well ... PyGILState_Ensure() does not take an argument, so
it does not know which interpreter's GIL to acquire. Duh!

The sad fact is, the functions in Python's C API does not take the
interpreter as an argument, so they default to the one that is
currently active (i.e. the one that PyThreadState_Get()->interp points
to). This is unlike Java's JNI, in which all functions take the
"environment" (a Java VM instance) as the first argument.

IMHO, I consider this a major design flaw in Python's C API. In a more
well thought API, PyGILState_Ensure would take the interpreter
returned by Py_NewInterpreter as an argument, and thus know the
interpreter with which to synchronize.

I complained about this before, but the answer I got was that the
simplified GIL API would not work if interpreters has a separate GIL.
Obviously it would not work as long as the PyGILState_Ensure does not
take any arguments. The lack of a leading VM argument in
PyGILState_Ensure, and almost every function in Python's C API, is the
heart of the problem.

Alternatively, the multiple interpreter API could be ripped out, and
you could just use separate threads. Do you have a use case that
requires it?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,901
Latest member
Noble71S45

Latest Threads

Top