I just killed GIL!!!


sturlamolden

Have you tackled the communication problem? The way I see it, one
interpreter cannot "see" objects created in the other because they
have separate pools of ... everything. They can communicate by
passing serialized objects through ctypes, but that sounds like the
solutions that use processes.

I see two solutions to that.

It is possible to use fine-grained locking on the objects that need to
be communicated. ctypes can pass PyObject* pointers
(ctypes.py_object), or we could resort to some C. When interpreter A
gets a PyObject* pointer from B (a total of four bytes on my computer),
interpreter A can:

(1) Acquire B's GIL
(2) Increment the refcount of the PyObject*
(3) Release B's GIL
(4) Use the object as its own (synchronization is required)

Sharing PyObject* pointers has the potential to shoot your leg
off. But I believe the small quirks can be worked out.
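
The refcount step can be tried from pure Python inside a single interpreter. This is only a sketch: it assumes CPython (where id() happens to be the object's address), the helper names take_reference, object_from_address and release_reference are made up for illustration, and it does not touch a second interpreter's GIL, which is the hard part of the scheme.

```python
import ctypes
import sys

# Size of a PyObject* on this machine (4 on 32-bit builds, 8 on 64-bit).
POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)

# Calls made through ctypes.pythonapi run with the GIL held, which plays
# the role of steps (1) and (3) for the one interpreter we have here.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def take_reference(obj):
    """Step (2): bump the refcount so the owner cannot free the object."""
    _incref(obj)
    return id(obj)  # CPython detail: id() is the object's address


def object_from_address(address):
    """Turn a raw address back into an object (unsafe if the address is stale)."""
    return ctypes.cast(address, ctypes.py_object).value


def release_reference(obj):
    """Give back the reference taken by take_reference()."""
    _decref(obj)


shared = ["payload"]
address = take_reference(shared)          # what B would hand to A
same_object = object_from_address(address)
release_reference(shared)                 # A is done with the object
```

Step (4) is not covered: once two interpreters hold the same object, every access still needs its own synchronization.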

The other option is to use serialized Queues like the processing
module in the Cheese Shop. I don't think it will be faster than similar
mechanisms based on IPC, because object serialization and
deserialization are the more expensive parts.
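
A minimal sketch of the serialized-queue option, using two threads as stand-ins for the two interpreters (an assumption made to keep the example self-contained). Only pickled bytes cross the queue, so the receiver always works on a private copy, and the pickle.dumps/pickle.loads calls are where the cost mentioned above is paid. The function names are hypothetical.

```python
import pickle
import queue
import threading


def producer(channel, items):
    """Serialize each item before it crosses the boundary."""
    for item in items:
        channel.put(pickle.dumps(item))
    channel.put(None)  # sentinel: no more items


def consumer(channel, received):
    """Rebuild a private copy of every item on the other side."""
    while True:
        blob = channel.get()
        if blob is None:
            break
        received.append(pickle.loads(blob))


def transfer(items):
    """Send items through a byte-only queue and return the received copies."""
    channel = queue.Queue()
    received = []
    workers = [
        threading.Thread(target=producer, args=(channel, items)),
        threading.Thread(target=consumer, args=(channel, received)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return received
```

The copies compare equal to the originals but are distinct objects, so no locking is needed on the payload itself.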
 

Jonathan Gardner

I'm sorry, but I don't like being told to use C. Perhaps I would like
the expressiveness of Python, am willing to accept the cost in
performance, but would also like to take advantage of technology to
get performance gains when I can? What's so unreasonable about that?

If you're satisfied then don't take the next optimization step.
 

sturlamolden

Interesting. Windows specific, but there are other ways to do the same
thing more portably.

I believe you can compile Python as a shared object (.so) on Linux as
well, and thus loadable by ctypes.

The bigger issue is that you can't share any objects.

Why not?

This
effectively gives you a multiprocess model - a bit cheaper than that,
but not enough to really supply GIL-free threading.

That solution is safe. But I am looking into sharing objects. I don't
think it's impossible.

PyObject* pointers can be passed around. GILs can be acquired and
released, refcounts increased and decreased, etc. but we have to sort
out some synchronization details for the shared objects. For one
thing, we have to make sure that a garbage collector does not try to
reclaim a PyObject* belonging to another interpreter. But here we are
talking about minor changes to CPython's source, or perhaps none at
all.
 

Jonathan Gardner

I'd love to be wrong about that, but the GIL *has* been the subject of
extensive efforts to kill it over the last five years, and it has
survived despite the best efforts of the developers.

To add to that...

In my mind, I see three options for multi-process systems:

(1) Locks.
(2) A global lock (GIL)
(3) Learning to live with the possibility of things disappearing out
from under you.

In the SQL world, they chose (3). In the Java/C++/C# world, they chose
(1). I like Python's compromise a lot, even though it means in a
single process, you can only have one thread doing Python at a time.
Usually the bits I want to parallelize on are blocking system calls to
the network or disk anyway, or the result of a long calculation that
updates its result all at once. So having the OS handle the tough bits
while I program in a fantasy world where threads are an illusion is
fine with me.
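
A small sketch of why the GIL rarely hurts this kind of workload: blocking calls release it while they wait. Here time.sleep is an assumed stand-in for a network or disk wait, and the function names are made up for the example.

```python
import threading
import time


def blocking_call(seconds):
    """Stand-in for a network or disk wait; time.sleep releases the GIL."""
    time.sleep(seconds)


def run_concurrently(count, seconds):
    """Run `count` blocking calls in threads and return the wall-clock time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

Four calls of 0.2 seconds each should finish in roughly 0.2 seconds rather than 0.8, because the waits overlap. A pure-Python calculation in place of the sleep would not overlap this way.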

Discovering a way to get rid of the GIL and not have to do (1) or (3)
is truly exciting, but I lost hope a long time ago.

Besides, if it gets in the way I can always do something novel like, I
don't know, spawn another Python process?
 

Jonathan Gardner

And there you have the answer. It's really very simple :)

That's an interesting hack.

Now, how do the processes communicate with each other without stepping
on each other's work and without using a lock?

Once you get that solved, I am sure the entire computing world will
beat a path to your door. (Hint: This is the kind of thing Stroustrup,
Guido, and Alonzo Church have thought a lot about, just to name a
few.)

By the way, you do know that you can recompile Python from source
code, and that you have the freedom to modify that source code. If you
want to remove the GIL and see what happens, just make the calls to
acquire and release the GIL do nothing. See how far that will get you.
 

Jonathan Gardner

I see two solutions to that.

It is possible to use fine-grained locking on the objects that need to
be communicated.

And you'll pay a price for every lock/unlock operation, in addition to
the added complexity of the code (which you are already beginning to
see.) That's been tried in Python, and everyone agreed that the GIL
was the better compromise.

The other option is to use serialized Queues like the processing
module in the Cheese Shop.

You mean pipes, files, and sockets?

You should check out Stackless's channels. Not a new or unique
concept, but a very powerful one that everyone should be familiar with.
 

Rhamphoryncus

I believe you can compile Python as a shared object (.so) on Linux as
well, and thus loadable by ctypes.

Python is compiled as a .so, but I don't believe that makes everything
private to that .so. Other means may be necessary.

Not that important though.

That solution is safe. But I am looking into sharing objects. I don't
think it's impossible.

PyObject* pointers can be passed around. GILs can be acquired and
released, refcounts increased and decreased, etc. but we have to sort
out some synchronization details for the shared objects. For one
thing, we have to make sure that a garbage collector does not try to
reclaim a PyObject* belonging to another interpreter. But here we are
talking about minor changes to CPython's source, or perhaps none at
all.

But can you automatically manage the reference count? That is, does your
interpreter have a proxy to the other interpreter's object, or does
the object itself gain a field indicating who owns it?

Either way you'll need to keep the number of shared objects to a
minimum, as the use of locking creates a bottleneck - only one thread
can run at a time for a given object.
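
The proxy variant can be sketched in plain Python. LockedProxy, Counter and hammer are hypothetical names, and ordinary threads in one interpreter stand in for separate interpreters; the point is only to show where the per-object bottleneck sits.

```python
import threading


class LockedProxy:
    """Hypothetical proxy: every access to the target goes through one lock."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()

    def call(self, method_name, *args):
        # Only one thread at a time gets past this line for a given object,
        # which is the bottleneck described above.
        with self._lock:
            return getattr(self._target, method_name)(*args)


class Counter:
    """Example shared object with a non-atomic update."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value


def hammer(proxy, times):
    """Call increment() through the proxy `times` times."""
    for _ in range(times):
        proxy.call("increment")
```

Objects that are never wrapped in a proxy pay nothing, which is why keeping the shared set small matters.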
 

sturlamolden

That's an interesting hack.

Now, how do the processes communicate with each other without stepping
on each other's work and without using a lock?

Why can't I use a lock?

There is a big difference between fine-grained locking on each object
(cf. Java) and a global lock for everything (cf. CPython's GIL).
Fine-grained locking for each object has been tried, and was found to be
a significant slowdown in the single-threaded case.

What if we just do fine-grained locking on objects that need to be
shared?

What if we accept that "shared" objects are volatile and may suddenly
disappear (like a weakref), and trap that as an exception?
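
The weakref behaviour referred to here already exists within one interpreter, and a sketch of it shows what "trap that as an exception" could look like. Shared and read_payload are hypothetical names; the immediate disappearance after del relies on CPython's reference counting.

```python
import weakref


class Shared:
    """Instances of user-defined classes can be weakly referenced."""

    def __init__(self, payload):
        self.payload = payload


def read_payload(handle, default=None):
    """Use a volatile handle, trapping the case where the owner dropped it."""
    try:
        return handle.payload
    except ReferenceError:
        # The referent was reclaimed; the proxy raises instead of crashing.
        return default


owner_copy = Shared("data")
handle = weakref.proxy(owner_copy)  # does not keep the object alive
```

Once the owner deletes its last strong reference, any further use of the handle raises ReferenceError, which read_payload turns into a default value.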
 

Martin v. Löwis

And there you have the answer. It's really very simple :)

I'm fairly skeptical that it actually works. If the different
Python interpreters all import the same extension module
(say, _socket.pyd), windows won't load the DLL twice, but
only one time. So you end up with a single copy of _socket,
which will have a single copy of socket.error, socket.socket,
and so on.

The single copy of socket.error will inherit from one specific
copy of IOError. So if you import socket in a different
interpreter, and raise socket.error there, and try to catch
IOError, the exception won't be caught - because *that*
IOError is then not a base class of socket.error.
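
The failure mode can be imitated in a single interpreter. In this sketch two separately created classes stand in for the two interpreters' copies of IOError (an assumption; no real second interpreter or DLL is involved), and SocketError stands in for the single shared socket.error.

```python
def make_interpreter_namespace():
    """Build a fresh base exception class, as each interpreter would have."""
    class IOErrorCopy(Exception):
        pass
    return IOErrorCopy


IOError_A = make_interpreter_namespace()  # interpreter A's IOError
IOError_B = make_interpreter_namespace()  # interpreter B's IOError


class SocketError(IOError_A):
    """The single shared extension type, bound to interpreter A's base."""


def caught_by(base):
    """Raise SocketError and report whether `except base` catches it."""
    try:
        raise SocketError("connection reset")
    except base:
        return True
    except Exception:
        return False
```

caught_by(IOError_A) returns True and caught_by(IOError_B) returns False, even though both base classes have the same name and the same definition.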

A more general approach similar to this one is known
as ".NET appdomains" or "Java isolates". It requires
the VM to refrain from having any global state, and
to associate all "global" state with the appdomain
or isolate. Python can't easily support that kind of
model, because extension modules have global state
all the time.

Even if Python supported appdomains, you still wouldn't
get any object sharing out of it. As soon as you start
to share some object (but not their types), your entire
type hierarchy gets messed up.

FWIW, Tcl implements this model precisely: Tcl is
not thread-safe in itself, but supports an
interpreter-per-thread model. I'm also skeptical that
this is any better than the GIL.

Regards,
Martin
 

Rhamphoryncus

Sounds very interesting. I particularly liked this bit from the web
page - an excellent solution to fine grained locking. Sending only
immutable objects between threads is very like the functional approach
used by Erlang which is extremely good at concurrency.

Although superficially similar, the details of Erlang are actually
pretty different. It copies all objects passed between threads - it
was originally designed for fault tolerance (entire nodes going down),
not concurrency. If you want a shared mutable object you need to use
a "process" as one, treating it as an actor. Hopefully you can do
most of it in a one-way, message driven style, as otherwise you're
going to be transforming your synchronous calls into a series of
callbacks.
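
A rough Python imitation of that actor style, for readers who have not seen it. This is not Erlang and not safethread; CounterActor is a hypothetical class in which one thread owns the mutable state and everything sent to it is copied first.

```python
import copy
import queue
import threading


class CounterActor:
    """Hypothetical actor: one thread owns the state, others send messages."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._totals = {}
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        # Copy on send, so the sender keeps no reference into actor state.
        self._mailbox.put(copy.deepcopy(message))

    def stop(self):
        """Ask the actor to finish; returns a copy of its final state."""
        self._mailbox.put(None)
        self._thread.join()
        return dict(self._totals)

    def _run(self):
        # Messages are handled one at a time, so no lock guards _totals.
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            key, amount = message
            self._totals[key] = self._totals.get(key, 0) + amount
```

send() is one-way: the caller gets nothing back, which is the message-driven style described above. Getting an answer out of the actor would need a reply queue or a callback.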

If you have a node farm, you want to upgrade it incrementally, and you
want it to be fault tolerant (nodes go down at random without killing the
whole thing), Erlang is much better than safethread. That's at a
significant cost though, as it's only good at the one style.
Safethread is much better at a style useful on a single desktop.
 

sjdevnull

Well... I like the processing module. Except that Wintendo toy OS has
no fork() available for the Win32 subsystem

Passing a NULL SectionHandle to NtCreateProcess/CreateProcessEx
results in a fork-style copy-on-write duplicate of the current process.
 

sturlamolden

Passing a NULL SectionHandle to NtCreateProcess/CreateProcessEx
results in a fork-style copy-on-write duplicate of the current process.

I know about NtCreateProcess and ZwCreateProcess, but they just create
an empty process - no context, no thread(s), no DLLs loaded, etc.
There is even example code showing how to implement fork() with
ZwCreateProcess in Nebbet's book on NT kernel internals, but
apparently it doesn't work quite well. (Right now I cannot even make
it compile, because WDK headers are fubar with invalid C; even
Microsoft's own compiler does not accept them.)

Searching with Google, I find several claims that there is a
"CreateProcessEx", which can do a COW fork of a process in the Win32
subsystem. I cannot find it documented anywhere. It is also not
exported by kernel32.dll. If you know how this function is defined and
which DLL exports it, please post it. But I suspect it does not
exist.
 

sjdevnull

I know about NtCreateProcess and ZwCreateProcess, but they just create
an empty process - no context, no thread(s), no DLLs loaded, etc.
There is even example code showing how to implement fork() with
ZwCreateProcess in Nebbet's book on NT kernel internals, but
apparently it doesn't work quite well.

It works fine for a copy-on-write process creation. It doesn't work
100% compatibly to fork. Nebbet is the best reference out there on
the method.

FWIW, NT's POSIX subsystem fork() uses (or used to use) the NULL
SectionHandle method and was POSIX certified, so it's certainly
possible.

Searching with Google, I find several claims that there is a
"CreateProcessEx"

Yeah my bad, I meant ZwCreateProcess. It's been almost a decade now
since I used it.
 

sturlamolden

FWIW, NT's POSIX subsystem fork() uses (or used to use) the NULL
SectionHandle method and was POSIX certified, so it's certainly
possible.

Windows Vista Ultimate comes with Interix integrated, renamed
'Subsystem for Unix based Applications' or SUA for short. Interix is
even UNIX certified when a C compiler is installed. Windows also has an
OS/2 subsystem which has a COW fork. Yes it is possible. One may
wonder why the Win32 subsystem doesn't have this feature. Perhaps fork()
is unfriendly to threads, like fork on Linux used to be (or is?)
pthread unfriendly. Or perhaps M$ (MegaDollar) just did this to be
mean. I don't know. I see the lack of fork() in Win32 as one of the
major shortcomings of Windows.
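
For reference, this is the fork() behaviour being missed, as a POSIX-only sketch (os.fork does not exist on Win32 Python). fork_and_mutate is a hypothetical helper; the copy-on-write itself is done by the kernel and is not visible from Python, only its effect is.

```python
import os


def fork_and_mutate(data):
    """Fork, let the child change its copy, and report what the parent sees.

    Returns (child_exit_status, parent_view). POSIX only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's private copy of the memory.
        data.append("child was here")
        os._exit(0 if len(data) == 2 else 1)
    # Parent: wait for the child, then look at our own, untouched list.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status), list(data)
```

The child starts with everything the parent had at the moment of the fork, and its changes never reach the parent. That is why the processing module can hand work to children so cheaply on POSIX systems.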

Anyhow, I just downloaded the WDK which supersedes the DDK. The
examples in Nebbet's book do not build anymore, as there is invalid C
in the WDK header files. :-(
 

Aahz

No it isn't. That idea is born of the narrowmindedness of people who
write server-like network apps. What's true for web servers isn't
true for every application.

Only when you have only one application running on a machine.
 

Carl Banks

Only when you have only one application running on a machine.

Needless pedantry.

"Using 100% of the CPU time an OS allows a process to have is not
necessarily a bug." Happy?


Carl Banks
 

Aahz

Needless pedantry.

"Using 100% of the CPU time an OS allows a process to have is not
necessarily a bug." Happy?

Not really; my comment is about the same level of pedantry as yours.
Jonathan's comment was clearly in the context of inappropriate CPU usage
(e.g. spin-polling). Obviously, there are cases where hammering on the
CPU for doing a complex calculation may be appropriate, but in those
cases, you will want to ensure that your application gets as much CPU as
possible by removing all unnecessary CPU usage by other apps.
 

Carl Banks

Not really; my comment is about the same level of pedantry as yours.
Jonathan's comment was clearly in the context of inappropriate CPU usage
(e.g. spin-polling).

That's far from evident. Jonathan's logic went from "I'm using 100%
CPU" to "You must be spin-polling". At best, Jonathan was making some
unsupported assumptions about the type of program sturlamolden had in
mind, and criticized him based on it. But frankly, I've seen enough
people who seem to have no conception that anyone could write a useful
program without an I/O loop that it wouldn't surprise me if he meant
it generally.

Obviously, there are cases where hammering on the
CPU for doing a complex calculation may be appropriate, but in those
cases, you will want to ensure that your application gets as much CPU as
possible by removing all unnecessary CPU usage by other apps.

Nonsense. If I'm running a background task on my desktop, say
formatting a complex document for printing, I would like it to take up
as much CPU as possible, but still have secondary priority to user
interface processes so that latency is low.
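
On POSIX systems one way to get that behaviour is to raise the niceness of the background job, so the scheduler favours interactive processes but still gives the job every otherwise idle cycle. A sketch, assuming the heavy work runs in a child Python process; CHILD_CODE and run_niced_child are made-up names and the increment of 10 is arbitrary.

```python
import os
import subprocess
import sys

# The child lowers its own priority first; the real work would follow.
# os.nice(increment) returns the new niceness; higher means lower priority.
CHILD_CODE = "import os; print(os.nice(10))"


def run_niced_child():
    """Start a helper process at reduced priority; return its niceness."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)
```

os.nice is not available on Windows, where process priority classes serve the same purpose.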


Carl Banks
 
