Threads vs Processes

  • Thread starter Carl J. Van Arsdall

sjdevnull

John said:
If you're sharing things, I would thread. I would not want to pay the
expense of a process.

This is generally a false cost. There are very few applications where
thread/process startup time is at all a fast path, and there are
likewise few where the difference in context switching time matters at
all. Indeed, in a Python program on a multiprocessor system, processes
are potentially faster than threads, not slower.

Moreover, to get at best a small performance gain you pay a huge cost
by sacrificing memory protection within the threaded process.

You can share things between processes, but you can't memory protect
things between threads. So if you need some of each (some things
shared and others protected), processes are the clear choice.

Now, for a few applications threads make sense. Usually that means
applications that have to share a great number of complex data
structures (and normally, making the choice for performance reasons
means your design is flawed and you could help performance greatly by
reworking it--though there may be some exceptions). But the general
rule when choosing between them should be "use processes when you can,
and threads when you must".

Sadly, too many programmers greatly overuse threading. That problem is
exacerbated by the number of beginner-level programming books that talk
about how to use threads without ever mentioning processes (and without
going into the design of multi-execution apps).
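For a concrete picture of "sharing things between processes", here is a
minimal sketch using the multiprocessing module, which entered the standard
library in Python 2.6, after this thread was written; the worker function and
queue layout are illustrative choices, not anything from the original posts:

# Minimal sketch: processes sharing work through IPC queues.
from multiprocessing import Process, Queue

def worker(tasks, results):
    # Each process has its own memory; only queue traffic is shared.
    for task in iter(tasks.get, None):   # None is our stop sentinel
        results.put(task * task)

if __name__ == '__main__':
    tasks, results = Queue(), Queue()
    procs = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for p in procs:
        p.start()
    for n in range(20):
        tasks.put(n)
    for _ in procs:
        tasks.put(None)                  # one sentinel per worker
    print(sorted(results.get() for _ in range(20)))
    for p in procs:
        p.join()

Each worker runs in its own address space, so a bug in one cannot scribble
over another's objects; only what travels through the queues is shared.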
 

John Henry

Nick said:
Here is test prog...

<snip>

Here's a more true-to-life program done in both single-threaded mode
and multi-threaded mode. You'll need PythonCard to try this. Just to
make the point, you will notice that the core code is identical between
the two (method on_menuFileStart_exe). The only difference is in the
setup code. I wanted to dismiss the myth that multi-threaded programs
are inherently *evil*, or that they're difficult to code, or
unsafe..... (whatever dirty water people wish to throw at them).

Don't ask me to try this with processes!

To have fun, first run it in single-threaded mode (change the main
program to invoke the MyBackground class instead of the
MyBackgroundThreaded class):

Change:

app = model.Application(MyBackgroundThreaded)

to:

app = model.Application(MyBackground)

Start the process by selecting File->Start, and then try to stop the
program by clicking File->Stop. Note the performance of the program.

Now, run it in multi-threaded mode. Click File->Start several times
(up to 4) and then try to stop the program by clicking File->Stop.

If you want to show off, add several more StaticText items in the
resource file, add them to the textAreas list in MyBackgroundThreaded
class and let it rip!

BTW: This app also demonstrates a weakness in Python threads - the
threads don't get preempted equally (not even close).

:)

Two files follow (test.py and test.rsrc.py):

#!/usr/bin/python

"""
__version__ = "$Revision: 1.1 $"
__date__ = "$Date: 2004/10/24 19:21:46 $"
"""

import wx
import threading
import thread
import time

from PythonCard import model

class MyBackground(model.Background):

    def on_initialize(self, event):
        # if you have any initialization
        # including sizer setup, do it here
        self.running(False)
        self.textAreas = (self.components.TextArea1,)
        return

    def on_menuFileStart_select(self, event):
        self.on_menuFileStart_exe(self.textAreas[0])
        return

    def on_menuFileStart_exe(self, textArea):
        textArea.visible = True
        self.running(True)
        for i in range(10000000):
            textArea.text = "Got up to %d" % i
            ## print i
            for j in range(i):
                k = 0
            time.sleep(0)
            if not self.running(): break
            try:
                # give the GUI a chance to process events (e.g. File->Stop)
                wx.SafeYield(self)
            except:
                pass
            if not self.running(): break
        textArea.text = "Finished at %d" % i
        return

    def on_menuFileStop_select(self, event):
        self.running(False)

    def on_Stop_mouseClick(self, event):
        self.on_menuFileStop_select(event)
        return

    def running(self, flag=None):
        if flag != None:
            self.runningFlag = flag
        return self.runningFlag


class MyBackgroundThreaded(MyBackground):

    def on_initialize(self, event):
        # if you have any initialization
        # including sizer setup, do it here
        self.myLock = thread.allocate_lock()
        self.myThreadCount = 0
        self.running(False)
        self.textAreas = [self.components.TextArea1, self.components.TextArea2,
                          self.components.TextArea3, self.components.TextArea4]
        return

    def on_menuFileStart_select(self, event):
        res = MyBackgroundWorker(self).start()

    def on_menuFileStop_select(self, event):
        self.running(False)
        self.menuBar.setEnabled("menuFileStart", True)

    def on_Stop_mouseClick(self, event):
        self.on_menuFileStop_select(event)

    def running(self, flag=None):
        # lock-protected accessor for the flag shared between threads
        self.myLock.acquire()
        if flag != None:
            self.runningFlag = flag
        flag = self.runningFlag
        self.myLock.release()
        return flag


class MyBackgroundWorker(threading.Thread):
    def __init__(self, parent):
        threading.Thread.__init__(self)
        self.parent = parent
        self.parent.myLock.acquire()
        threadCount = self.parent.myThreadCount
        self.parent.myLock.release()
        self.textArea = self.parent.textAreas[threadCount]

    def run(self):
        self.parent.myLock.acquire()
        self.parent.myThreadCount += 1
        if self.parent.myThreadCount == len(self.parent.textAreas):
            self.parent.menuBar.setEnabled("menuFileStart", False)
        self.parent.myLock.release()

        self.parent.on_menuFileStart_exe(self.textArea)

        self.parent.myLock.acquire()
        self.parent.myThreadCount -= 1
        if self.parent.myThreadCount == 0:
            self.parent.menuBar.setEnabled("menuFileStart", True)
        self.parent.myLock.release()

        return


if __name__ == '__main__':
    app = model.Application(MyBackgroundThreaded)
    app.MainLoop()



Here's the associated resource file:

{'application':{'type':'Application',
    'name':'Template',
    'backgrounds': [
        {'type':'Background',
         'name':'bgTemplate',
         'title':'Standard Template with File->Exit menu',
         'size':(400, 300),
         'style':['resizeable'],

         'menubar': {'type':'MenuBar',
             'menus': [
                 {'type':'Menu',
                  'name':'menuFile',
                  'label':'&File',
                  'items': [
                      {'type':'MenuItem',
                       'name':'menuFileStart',
                       'label':u'&Start',
                      },
                      {'type':'MenuItem',
                       'name':'menuFileStop',
                       'label':u'Sto&p',
                      },
                      {'type':'MenuItem',
                       'name':'menuFile--',
                       'label':u'--',
                      },
                      {'type':'MenuItem',
                       'name':'menuFileExit',
                       'label':'E&xit',
                       'command':'exit',
                      },
                  ]
                 },
             ]
         },

         'components': [

             {'type':'StaticText',
              'name':'TextArea1',
              'position':(10, 100),
              'text':u'This is a test',
              'visible':False,
             },

             {'type':'StaticText',
              'name':'TextArea2',
              'position':(160, 100),
              'text':u'This is a test',
              'visible':False,
             },

             {'type':'StaticText',
              'name':'TextArea3',
              'position':(10, 150),
              'text':u'This is a test',
              'visible':False,
             },

             {'type':'StaticText',
              'name':'TextArea4',
              'position':(160, 150),
              'text':u'This is a test',
              'visible':False,
             },

         ] # end components
        } # end background
    ] # end backgrounds
} }
 

Dennis Lee Bieber

Carl J. Van Arsdall said:
Ah, alright, I think I understand, so threading works well for sharing
Python objects. Would a scenario for this be something like a job
queue (say Queue.Queue), for example? This is a situation in which each
process/thread needs access to the Queue to get the next task it must
work on. Does that sound right? Would the same apply to multiple
threads needing access to a dictionary? A list?

Python's Queue module is only (to my knowledge) an internal
(thread-shared) communication channel; you'd need something else for
IPC -- VMS mailboxes, for example (more general than UNIX pipes with
their single reader/writer concept).

Carl J. Van Arsdall said:
Or does the term
"shared memory" mean something more low-level, like some bits that don't
necessarily mean anything to Python but might mean something to your
application?

Most OSs support creation and allocation of memory blocks with an
attached name; this allows multiple processes to map that block of
memory into their address space. The contents of said memory block are
totally up to application agreements (won't work well with Python native
objects).

mmap()

is one such system. By rough description, it maps a disk file into a
block of memory, so the OS handles loading the data (instead of, say,
file.seek(somewhere_long) followed by file.read(some_data_type), you
treat the mapped memory as an array and use x = mapped[somewhere_long];
if somewhere_long is not yet in memory, the OS will page-swap that part
of the file into place). The "file" can be shared, so different
processes can map the same file, and thereby, the same memory contents.
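A rough sketch of that idea with Python's mmap module follows; the file
name, slot count, and record format here are invented for illustration:

# Rough sketch of file-backed shared memory with Python's mmap module.
# PATH, SLOTS and SLOT_SIZE are arbitrary choices for illustration.
import mmap
import struct

PATH, SLOTS, SLOT_SIZE = "telemetry.dat", 4, 8

# Create the backing file once, sized for SLOTS fixed-width integer slots.
with open(PATH, "wb") as f:
    f.write(b"\x00" * SLOTS * SLOT_SIZE)

def write_slot(slot_id, value):
    # A "feeder" process maps the file and stores a value into its own slot.
    with open(PATH, "r+b") as f:
        m = mmap.mmap(f.fileno(), SLOTS * SLOT_SIZE)
        m[slot_id * SLOT_SIZE:(slot_id + 1) * SLOT_SIZE] = struct.pack("q", value)
        m.close()

def read_all():
    # The monitor process maps the same file and scans every slot.
    with open(PATH, "r+b") as f:
        m = mmap.mmap(f.fileno(), SLOTS * SLOT_SIZE)
        values = [struct.unpack("q", m[i * SLOT_SIZE:(i + 1) * SLOT_SIZE])[0]
                  for i in range(SLOTS)]
        m.close()
        return values

write_slot(2, 42)
print(read_all())   # [0, 0, 42, 0]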

This can be useful, for example, with multiple identical processes
feeding status telemetry. Each process is started with some ID, and the
ID determines which section of mapped memory it is to store its status
into. The controller program can just do a loop over all the mapped
memory, updating a display with whatever is current -- doesn't matter if
process_N manages to update a field twice while the monitor is
scanning... The display always shows the data that was current at the
time of the scan.

Carried further -- special memory cards can be obtained (at least they
could be where I work). These cards have fiber-optic connections. In
a closely distributed system, each computer has one of these cards, and
the fiber-optics link them in a cycle. Each process (on each computer)
maps the memory of the card -- the cards then have logic to relay all
memory changes, via fiber, to the next card in the link... Thus, all the
closely linked computers "share" this block of memory.
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 

bryanjugglercryptographer

Carl said:
Ah, alright, I think I understand, so threading works well for sharing
Python objects. Would a scenario for this be something like a job
queue (say Queue.Queue), for example? This is a situation in which each
process/thread needs access to the Queue to get the next task it must
work on. Does that sound right?

That's a reasonable and popular technique. I'm not sure what "this"
refers to in your question, so I can't say if it solves the
problem of which you are thinking.
Would the same apply to multiple
threads needing access to a dictionary? A list?

The Queue class is popular with threads because it already has
locking around its basic methods. You'll need to serialize your
operations when sharing most kinds of objects.
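Here is a small sketch of that job-queue pattern; it uses Python 3 names
(queue.Queue is the modern spelling of the Queue.Queue discussed here),
and the task payloads are made up:

# Small sketch of the shared job-queue pattern described above.
import queue
import threading

jobs = queue.Queue()

def worker():
    while True:
        item = jobs.get()          # get() is internally locked
        if item is None:           # sentinel tells the thread to exit
            break
        print("processed", item * 2)
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for n in range(10):
    jobs.put(n)
for _ in threads:
    jobs.put(None)
for t in threads:
    t.join()

The point is that put() and get() do their own locking, so no explicit
mutex is needed around the queue itself.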
Now if you are just passing ints and strings around, use processes with
some type of IPC -- does that sound right as well?

Also reasonable and popular. You can even pass many Python objects
by value using pickle, though you lose some safety.
Or does the term
"shared memory" mean something more low-level like some bits that don't
necessarily mean anything to python but might mean something to your
application?

Shared memory means the same memory appears in multiple processes,
possibly at different address ranges. What any of them writes to
the memory, they can all read. The standard Python distribution
now offers shared memory via the mmap module, but lacks cross-process
locks.

Python doesn't support allocating objects in shared memory, and
doing so would be difficult. That's what the POSH project is
about, but it looks stuck in alpha.
 

Grant Edwards

This is generally a false cost. There are very few
applications where thread/process startup time is at all a
fast path,

Even if it were, on any sanely designed OS, there really isn't
any extra expense for a process over a thread.
Moreover, to get at best a small performance gain you pay a
huge cost by sacrificing memory protection within the threaded
process.

Threading most certainly shouldn't be done in some attempt to
improve performance over a multi-process model. It should be
done because it fits the algorithm better. If the execution
contexts don't need to share data and can communicate in a
simple manner, then processes probably make more sense. If the
contexts need to operate jointly on complex shared data, then
threads are usually easier.
 

Carl J. Van Arsdall

That's a reasonable and popular technique. I'm not sure what "this"
refers to in your question, so I can't say if it solves the
problem of which you are thinking.



The Queue class is popular with threads because it already has
locking around its basic methods. You'll need to serialize your
operations when sharing most kinds of objects.
Yes yes, of course. I was just making sure we are on the same page, and
I think I'm finally getting there.

Also reasonable and popular. You can even pass many Python objects
by value using pickle, though you lose some safety.
I actually do use pickle (not for this, but for other things); could you
elaborate on the safety issue?

Shared memory means the same memory appears in multiple processes,
possibly at different address ranges. What any of them writes to
the memory, they can all read. The standard Python distribution
now offers shared memory via the mmap module, but lacks cross-process
locks.

Python doesn't support allocating objects in shared memory, and
doing so would be difficult. That's what the POSH project is
about, but it looks stuck in alpha.


--

Carl J. Van Arsdall
Build and Release
MontaVista Software
 

bryanjugglercryptographer

Carl J. Van Arsdall wrote:
[...]
I actually do use pickle (not for this, but for other things), could you
elaborate on the safety issue?

Warning: The pickle module is not intended to be secure
against erroneous or maliciously constructed data. Never
unpickle data received from an untrusted or unauthenticated
source.

A corrupted pickle can crash Python. An evil pickle could probably
hijack your process.
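A common partial mitigation, sketched here with Python 3 spellings and an
invented allow-list, is to restrict which globals a pickle may resolve by
overriding Unpickler.find_class; this narrows the attack surface but is not
a guarantee of safety:

# Sketch of a restricted unpickler: refuse to resolve any global except
# an explicit allow-list. Narrows the attack surface; not a guarantee.
import io
import pickle

ALLOWED = {("builtins", "list"), ("builtins", "dict")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError("forbidden global: %s.%s" % (module, name))

def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()

print(restricted_loads(pickle.dumps([1, 2, 3])))  # fine
# restricted_loads(pickle.dumps(print))           # raises UnpicklingError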
 

Carl J. Van Arsdall

bryanjugglercryptographer wrote:
<snip>
A corrupted pickle can crash Python. An evil pickle could probably
hijack your process.
Ah, if the data is coming from someone else. I understand. Thanks.

--

Carl J. Van Arsdall
Build and Release
MontaVista Software
 

mark

Alright, based on a discussion on this mailing list, I've started to
wonder why use threads vs processes.

The debate should not be about "threads vs processes", it should be
about "threads vs events". Dr. John Ousterhout (creator of Tcl,
Professor of Comp Sci at UC Berkeley, etc), started a famous debate
about this 10 years ago with the following simple presentation.

http://home.pacbell.net/ouster/threads.pdf

That sentiment has largely been ignored and thread usage dominates, but
if you have been programming for as long as I have, and have used both
thread-based architectures AND event/reactor/callback-based
architectures, then that simple presentation above should ring very
true. The problem is, young people merely equate newer == better.

On large systems and over time, thread-based architectures often tend
towards chaos. I have seen a few thread-based systems where the
programmers became so frustrated with subtle timing issues, etc., that
they eventually overlaid so many mutexes that the implementation became
single-threaded in practice anyhow(!), and very inefficient.

BTW, I am fairly new to Python, but I have seen that the Python Twisted
framework is a good example of the event/reactor design alternative to
threads. See

http://twistedmatrix.com/projects/core/documentation/howto/async.html .

Douglas Schmidt is a famous designer and author (ACE, CORBA's TAO, etc)
who has written much about reactor design patterns; see
"Pattern-Oriented Software Architecture, Vol 2", Wiley 2000, amongst
many other references of his.
 

bryanjugglercryptographer

mark said:
The debate should not be about "threads vs processes", it should be
about "threads vs events".

We are so lucky as to have both debates.
Dr. John Ousterhout (creator of Tcl,
Professor of Comp Sci at UC Berkeley, etc), started a famous debate
about this 10 years ago with the following simple presentation.

http://home.pacbell.net/ouster/threads.pdf

The Ousterhout school finds multiple lines of execution
unmanageable, while the Tanenbaum school finds asynchronous I/O
unmanageable.

What's so hard about single-line-of-control (SLOC) event-driven
programming? You can't call anything that might block. You have to
initiate the operation, store all the state you'll need in order
to pick up where you left off, then return all the way back to the
event dispatcher.
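To make that concrete, here is a toy single-line-of-control echo server
built on select(); the port and the echo behavior are invented for
illustration. Note how the per-connection state must live in an explicit
table (conns) rather than in local variables, precisely because every
handler has to return to the dispatcher:

# Toy single-line-of-control event loop with select(): every handler must
# return quickly to the dispatcher, carrying its state in the conns dict.
import select
import socket

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 9000))
server.listen(5)
server.setblocking(False)

conns = {}          # socket -> per-connection state we must carry ourselves

while True:
    readable, _, _ = select.select([server] + list(conns), [], [])
    for sock in readable:
        if sock is server:
            client, _ = server.accept()
            client.setblocking(False)
            conns[client] = b""              # saved state: bytes seen so far
        else:
            data = sock.recv(4096)           # won't block: select said ready
            if not data:
                del conns[sock]
                sock.close()
            else:
                conns[sock] += data
                sock.send(data)              # echo back (may short-write; toy)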
That sentiment has largely been ignored and thread usage dominates but,
if you have been programming for as long as I have, and have used both
thread based architectures AND event/reactor/callback based
architectures, then that simple presentation above should ring very
true. Problem is, young people merely equate newer == better.

Newer? They're both as old as the trees. That can't be why the whiz
kids like them. Threads and processes rule because of their success.
On large systems and over time, thread based architectures often tend
towards chaos.

While large SLOC event-driven systems surely tend to chaos. Why?
Because they *must* be structured around where blocking operations
can happen, and that is not the structure anyone would choose for
clarity, maintainability and general chaos avoidance.

Even the simplest of modular structures, the procedure, gets
broken. Whether you can encapsulate a sequence of operations in a
procedure depends upon whether it might need to do an operation
that could block.

Going farther, consider writing a class supporting overriding of
some method. Easy; we Pythoneers do it all the time; that's what
O.O. inheritance is all about. Now what if the subclass's version
of the method needs to look up external data, and thus might
block? How does a method override arrange for the call chain to
return all the way back to the event loop, and then pick up
again with the same call chain when the I/O comes in?
I have seen a few thread based systems where the
programmers become so frustrated with subtle timing issues etc, and they
eventually overlay so many mutexes etc, that the implementation becomes
single threaded in practice anyhow(!), and very inefficient.

While we simply do not see systems as complex as modern DBMSs
written in the SLOC event-driven style.
BTW, I am fairly new to python but I have seen that the python Twisted
framework is a good example of the event/reactor design alternative to
threads. See

http://twistedmatrix.com/projects/core/documentation/howto/async.html .

And consequently, to use Twisted you rewrite all your code as
those 'deferred' things.
 

Nick Vatamaniuc

It seems that both ways are here to stay. If one were so inferior and
problem-prone, we wouldn't be talking about it now; it would have been
forgotten on the same shelf as a stack of punch cards.

The rule of thumb is 'the right tool for the right job.'

The threading model is very useful for long CPU-bound processing, as it
can potentially take advantage of multiple CPUs/cores (alas, not in
Python right now, because of the GIL). Events will not work as well
here. But note: if there is not much sharing of resources between the
threads, processes could be used instead! It turns out that there are
very few cases where threads are simply indispensable.

The event model is usually well suited for I/O, or for situations where
a large number of shared resources would require lots of
synchronization if threads were used.

DBMSs are not a good example of a typical large system, so saying 'see,
DBMSs use threads -- therefore threads are better' doesn't make a good
argument. DBMSs are highly optimized, and only a few of them actually
manage to take advantage of multiple execution units successfully. One
could just as well cite a hundred other projects and say 'see, it uses
an event model -- therefore event models are better', and so on. Again,
"the right tool for the right job". A good programmer should know both...
And consequently, to use Twisted you rewrite all your code as
those 'deferred' things.
Then try re-writing Twisted using threads, in the same number of lines,
with the same or better performance. I bet you'll end up with a whole
bunch of 'locks', 'waits' and 'notify's instead of a bunch of "those
'deferred' things." Debugging all those threads should be a project in
and of itself.

-Nick
 

H J van Rooyen

| On Thu, 27 Jul 2006 09:17:56 -0700, "Carl J. Van Arsdall"
| <[email protected]> declaimed the following in comp.lang.python:
|
| <snip>
|
| Carried further -- special memory cards can be obtained (at least they
| could be where I work). These cards have fiber-optic connections. In
| a closely distributed system, each computer has one of these cards, and
| the fiber-optics link them in a cycle. Each process (on each computer)
| maps the memory of the card -- the cards then have logic to relay all
| memory changes, via fiber, to the next card in the link... Thus, all the
| closely linked computers "share" this block of memory.

This is nice for sharing inputs from the real world - but there are some hairy
issues if it is to be used for general-purpose consumption - unless there are
hardware restrictions to stop machines stomping on each other's memories - i.e.
the machines have to be *polite* and *well behaved* - or you can easily have a
major smash...
A structure has to be agreed on, and respected...

- Hendrik
 

mark

Debugging all those threads should be a project in and of itself.

Ahh, debugging - I forgot to bring that one up in my argument! Thanks,
Nick ;)

Certainly I agree, of course, that there are many applications which
suit a threaded design. I just think there is a general over-emphasis on
using threads, and I see it applied very often where an event-based
approach would be cleaner and more efficient. Thanks for your comments,
Bryan and Nick - an interesting debate.
 

Dennis Lee Bieber

H J van Rooyen said:
<snip>
This is nice for sharing inputs from the real world - but there are some hairy
issues if it is to be used for general-purpose consumption - unless there are
hardware restrictions to stop machines stomping on each other's memories - i.e.
the machines have to be *polite* and *well behaved* - or you can easily have a
major smash...
A structure has to be agreed on, and respected...

As I'd mentioned in the prior paragraph, in one form...

But yes, anything mapping into that memory should be using a common
memory layout and a protocol...
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 

sturlamolden

Chance said:
Not quite that simple. In most modern OSs there is something
called COW - copy on write. What happens is that when you fork a
process, it makes an identical copy. Only when the forked process
does a write does it make its own copy of that memory. So it isn't
quite as bad.

A notable exception is a toy OS from a manufacturer in Redmond,
Washington. It does not do COW fork. It does not even fork.

To make a server system scale well on Windows you need to use threads,
not processes. That is why the global interpreter lock sucks so badly
on Windows.
 

bryanjugglercryptographer

sturlamolden said:
A notable exception is a toy OS from a manufacturer in Redmond,
Washington. It does not do COW fork. It does not even fork.

To make a server system scale well on Windows you need to use threads,
not processes.

Here's one to think about: if you have a bunch of threads running,
and you fork, should the child process be born running all the
threads? Neither answer is very attractive. It's a matter of which
will probably do the least damage in most cases (and the answer
the popular threading systems choose is 'no'; the child process
runs only the thread that called fork).
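A tiny POSIX-only sketch of that rule, assuming current CPython behavior
(the exact thread bookkeeping after fork is an implementation detail):

# POSIX-only sketch: after fork(), the child runs only the forking thread.
# (Exact active_count() numbers are a CPython detail; treat as illustrative.)
import os
import threading
import time

def spin():
    time.sleep(5)                 # keep a second thread alive in the parent

threading.Thread(target=spin).start()
print("parent before fork:", threading.active_count())  # 2

pid = os.fork()
if pid == 0:
    # Child: the sleeping thread was not duplicated here.
    print("child after fork:", threading.active_count())  # 1
    os._exit(0)
else:
    os.waitpid(pid, 0)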

MS-Windows is more thread-oriented than *nix, and it avoids this
particular problem by not using fork() to create new processes.
That is why the global interpreter lock sucks so badly
on Windows.

It sucks about the same on Windows and *nix: hardly at all on
single-processors, moderately on multi-processors.
 

sjdevnull

sturlamolden said:
A notable exception is a toy OS from a manufacturer in Redmond,
Washington. It does not do COW fork. It does not even fork.

That's only true for Windows 98/95/Windows 3.x and other DOS-based
Windows versions.

NtCreateProcess with SectionHandle=NULL creates a new process with a
COW version of the parent process's address space.

It's not called "fork", but it does the same thing. There's a new name
for it in Win2K or XP (maybe CreateProcessEx?) but the functionality
has been there since the NT 3.x days at least and is in all modern
Windows versions.
 

sjdevnull

mark said:
The debate should not be about "threads vs processes", it should be
about "threads vs events".

Events serve a separate problem space.

Use event-driven state machine models for efficient multiplexing and
fast network I/O (e.g. writing an efficient static HTTP server)

Use multi-execution models for efficient multiprocessing. No matter
how scalable your event-driven app is it's not going to take advantage
of multi-CPU systems, or modern multi-core processors.

Event-driven state machines can be harder to program and maintain than
multi-process solutions, but they are usually easier than
multi-threaded solutions.

On-topic: If your problem is one where event-driven state machines are
a good solution, Python generators can be a _huge_ help.
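As a taste of that, here is a toy protocol parser written as a generator;
the length-prefixed framing is invented for illustration. The "where was
I?" state lives in the generator's frame instead of explicit state
variables:

# Toy example of a generator as an event-driven state machine: the parser
# resumes exactly where it left off each time new bytes arrive.
def frame_parser():
    buf = b""
    frame = None
    while True:
        chunk = yield frame          # hand out a finished frame (or None)...
        frame = None
        buf += chunk                 # ...and take in the next chunk of bytes
        # emit at most one frame per chunk, for simplicity
        if len(buf) >= 1:
            length = buf[0]          # 1-byte length prefix
            if len(buf) >= 1 + length:
                frame, buf = buf[1:1 + length], buf[1 + length:]

parser = frame_parser()
next(parser)                         # prime the generator
for chunk in (b"\x03ab", b"c\x02", b"hi"):
    frame = parser.send(chunk)
    if frame is not None:
        print("frame:", frame)       # b'abc', then b'hi'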
 
