The future of "frozen" types as the number of CPU cores increases

sjdevnull

    Forking in multithreaded programs is iffy.

One more thing: the above statement ("forking in multithreaded
programs is iffy") is absolutely true, but it's also completely
meaningless in modern multiprocessing programs--it's like saying
"gotos in structured programs are iffy". That's true, but it also has
almost no bearing on decently constructed modern programs.
 
mk

Of course. Multithreading also fails miserably if the threads all try
to call exec() or the equivalent.
It works fine if you use os.fork().

What about just using the subprocess module to run system commands in worker
threads? Is this likely to have problems?
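For concreteness, here is a minimal sketch of the pattern being asked about, assuming a current Python 3 where `subprocess.run` and `concurrent.futures` are available. The helper `run_command` and the placeholder commands are made up for illustration. Each worker thread simply blocks while its child process runs.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(argv):
    """Run one command and return (exit status, captured stdout)."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.returncode, done.stdout.strip()


# Placeholder commands (an assumption): each one starts a fresh
# interpreter that prints a square.
commands = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]

# One worker thread per command; pool.map returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_command, commands))
```

Passing the command as an argument list rather than a shell string avoids quoting problems when several threads build commands at once.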

Regards,
mk
 
John Nagle

mk said:

    What about just using the subprocess module to run system commands in worker
    threads? Is this likely to have problems?

    Regards,
    mk

The discussion above was about using "fork" to avoid duplicating the
entire Python environment for each subprocess. If you use the subprocess
module, you load a new program, so you don't get much memory sharing.
This increases the number of cache misses, which is a major bottleneck
in many-CPU systems with shared caches.
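A rough sketch of the fork side of that comparison, assuming a POSIX system where `os.fork()` exists. The table `LOOKUP` and the helper `forked_lookup` are hypothetical names. The child reads data the parent already built, with no new program load and no re-import.

```python
import os

# Hypothetical read-only table, built once in the parent process.
LOOKUP = {n: n * n for n in range(100_000)}


def forked_lookup(key):
    """Fork a child that reads LOOKUP without rebuilding or reloading it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the parent's memory pages are inherited copy-on-write.
        os.close(read_fd)
        os.write(write_fd, str(LOOKUP[key]).encode())
        os._exit(0)  # leave immediately, skipping the parent's cleanup
    # Parent: read the child's answer from the pipe, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        answer = reader.read()
    os.waitpid(pid, 0)
    return int(answer)
```

One caveat: in CPython even a read updates an object's reference count, which writes to the page holding that object, so some of the inherited pages get copied anyway.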

The issue being discussed was scaling Python for CPUs with many cores.
With Intel shipping 4 cores/8 hyperthread CPUs, the 6/12 part working,
and the 8/16 part coming along, this is more than a theoretical
issue.

John Nagle
 
Nobody

    Basically, multiprocessing is always hard--but it's less hard to start
    without shared everything. Going with the special case (sharing
    everything, aka threading) is by far the stupider and more complex way
    to approach multiprocessing.

Multi-threading hardly shares anything (only dynamically-allocated
and global data), while everything else (the entire stack) is per-thread.

Yeah, I'm being facetious. Slightly. If you're going to write
multi-threaded code, it's a good idea to wean yourself off of using global
variables and shared mutable state.
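As a small sketch of that advice (the names `partial_sum`, `chunks`, and `out` are invented for the example), each thread below works only on local variables and hands its result back through a `queue.Queue` instead of updating a shared global:

```python
import queue
import threading


def partial_sum(chunk, out):
    """Sum one chunk using only local state, then report the result."""
    total = 0  # local variable: private to the thread running this call
    for value in chunk:
        total += value
    out.put(total)  # the queue is the only object the threads share


chunks = [range(0, 100), range(100, 200), range(200, 300)]
out = queue.Queue()

threads = [threading.Thread(target=partial_sum, args=(chunk, out))
           for chunk in chunks]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Collect one result per thread; the order of arrival does not matter.
grand_total = sum(out.get() for _ in threads)
```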
 
sjdevnull

    Multi-threading hardly shares anything (only dynamically-allocated
    and global data), while everything else (the entire stack) is per-thread.

    Yeah, I'm being facetious. Slightly.

I'm afraid I may be missing the facetiousness here.

The only definitional difference between threads and processes is that
threads share memory, while processes don't.

There are often other important practical implementation details, but
sharing memory vs not sharing memory is the key theoretical
distinction between threads and processes. On most platforms, whether
or not you want to share memory (and abandon memory protection wrt the
rest of the program) is the key factor a programmer should consider
when deciding between threads and processes--the only time that's not
true is when the implementation forces ancillary details upon you.
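A quick way to see the distinction, sketched for a POSIX system where `os.fork()` is available (`shared` and `append_marker` are just illustrative names): a write made by a thread is visible to the rest of the program, while the same write made by a forked child is not.

```python
import os
import threading

shared = []


def append_marker(marker):
    """Mutate the module-level list from whatever context calls this."""
    shared.append(marker)


# Thread: runs in the same address space, so the append is visible here.
worker = threading.Thread(target=append_marker, args=("thread",))
worker.start()
worker.join()

# Process: the child appends to its own copy-on-write copy of the list,
# and that copy disappears when the child exits.
pid = os.fork()
if pid == 0:
    append_marker("process")
    os._exit(0)
os.waitpid(pid, 0)

print(shared)  # only the thread's write is visible in the parent
```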
 
