threads and sleep?


Jeffrey Maitland

Hello all,

I am in the process of writing a multithreaded program, and what I was
wondering is: will a sleep call in an executing function affect the
threads started from it? Here is a basic example of what I mean.

import threading
import time

def main():
    t = None
    keep_going = True
    while keep_going:
        if t is None:
            # note: args must be a tuple; simplified example for the
            # question only
            t = threading.Thread(target=func1, args=("String",))
            t.start()
        else:
            print "This line should display a lot"
            time.sleep(2)
        keep_going = t.isAlive()

def func1(s):
    print s

main()

So the question I was wondering about is: will the sleep pause the t
thread as well as the main function, or is the thread independent of
the main function's sleep?

Thanks in advance.
 

Peter Hansen

Jeffrey said:
I am in the process of writing a multithreaded program, and what I was
wondering is: will a sleep call in an executing function affect the
threads started from it?

Note that the "executing function" is itself running inside a thread,
specifically the "main thread". The other threads are not "below" it in
any particular sense, more like beside it. (Or, to put it another way,
there is no such concept of relative position with threads in Python.)

Here is a basic example of what I mean.
[snip code]
So the question I was wondering about is: will the sleep pause the t
thread as well as the main function, or is the thread independent of
the main function's sleep?

Your code is very close to working already... why don't you just run it
and observe how it works?

In any case, the answer is "no, time.sleep() affects only the current
thread".

-Peter
 

Dennis Lee Bieber

In any case, the answer is "no, time.sleep() affects only the current
thread".
Heck, it is one of the means of getting number-crunchers to give
up the CPU to allow other threads/processes to run <G>
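
For instance, a minimal sketch of that idiom (the chunk size of 10000
is arbitrary):

import time

def crunch(data):
    total = 0
    for i, x in enumerate(data):
        total += x * x
        if i % 10000 == 0:
            time.sleep(0)  # give up the CPU so other threads can run
    return total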

--
 

Jeffrey Maitland

Hello all,

First off, thanks for the responses on the sleep() part; I suspected as
much but wasn't 100% sure.

To continue this, I just want to pre-thank anyone who contributes to
this thread (lol).

OK, here goes. The problem I have is an application I wrote/co-wrote
that has a long run time, dependent on some variables passed to it
(mainly accuracy variables: the more accurate, the longer the run time,
which makes sense). In the hope of speeding it up, I decided to write a
threaded version of the program. However, what I am noticing is that
the threaded version is taking as long as, possibly longer than, the
original. The thing is, the threaded version is running on an
8-processor IA-64 system, and it seems to only be using 2 or 3
processors at about 30% each (it fluctuates). My guess is that the 6
running threads are each using approximately 30% of 2 given CPUs.

What I would like is to have, say, one thread use as much of a given
CPU as possible, and when a new thread is started, have it use another
CPU if one is available instead of the same one. That way it should
speed the application up. The standalone (non-threaded) app uses 90+%
of a single CPU when in this part of the algorithm; that is why I
split and threaded that part of the algorithm, since it is repeated
several times. I can post generic code of what I am doing, but I can't
post the actual code because of its confidentiality. Thanks again in
advance for any and all comments (even any spiteful ones).

Jeff
 

Jonathan Ellis

Jeffrey said:
The problem I have is an application I wrote/co-wrote that has a long
run time, dependent on some variables passed to it (mainly accuracy
variables: the more accurate, the longer the run time, which makes
sense). In the hope of speeding it up, I decided to write a threaded
version of the program. However, what I am noticing is that the
threaded version is taking as long as, possibly longer than, the
original. The thing is, the threaded version is running on an
8-processor IA-64 system, and it seems to only be using 2 or 3
processors at about 30% each (it fluctuates). My guess is that the 6
running threads are each using approximately 30% of 2 given CPUs.

In many ways, Python is an incredibly bad choice for deeply
multithreaded applications. One big problem is the global interpreter
lock; no matter how many CPUs you have, only one will run python code
at a time. (Many people who don't run on multiple CPUs anyway try to
wave this off as a non-problem, or at least worth the tradeoff in terms
of a simpler C API, but with multicore processors coming from every
direction I think the "let's pretend we don't have a problem" approach
may not work much longer.)

Even if the GIL weren't an issue (and in your case it clearly is),
you'd quickly find that there's little support for debugging
multithreaded applications, and even less for profiling.

Sometimes running multiple processes is an acceptable workaround; if
not, good luck with the rewrite in Java or something else with real
thread support. (IIRC Jython doesn't have a GIL; that might be an
option too.)

Python is a great tool but if you really need good threading support
you will have to look elsewhere.

-Jonathan
 

Grant Edwards

OK, here goes. The problem I have is an application I wrote/co-wrote
that has a long run time, dependent on some variables passed to it
(mainly accuracy variables: the more accurate, the longer the run time,
which makes sense). In the hope of speeding it up, I decided to write a
threaded version of the program. However, what I am noticing is that
the threaded version is taking as long as, possibly longer than, the
original. The thing is, the threaded version is running on an
8-processor IA-64 system, and it seems to only be using 2 or 3
processors at about 30% each (it fluctuates). My guess is that the 6
running threads are each using approximately 30% of 2 given CPUs.

Because of the global interpreter lock, a multi-threaded python
program does not take advantage of multiple processors. No
matter how many CPUs you have, only one thread is allowed to
run at any point in time.

Multi-threading in Python is useful for simplifying the
architecture of a program that has to do multiple independent
tasks, but it isn't useful for actually running multiple
threads in parallel.
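
A quick, unscientific way to convince yourself of this is to time a
CPU-bound job run serially and then split across two threads; on
CPython the threaded run is no faster (a sketch, with an arbitrary
loop count):

import threading
import time

def crunch(n):
    while n:
        n -= 1

N = 10000000

start = time.time()
crunch(N); crunch(N)
print "sequential:", time.time() - start

start = time.time()
threads = [threading.Thread(target=crunch, args=(N,)) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print "threaded:  ", time.time() - start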
 

Dennis Lee Bieber

However, what I am noticing is that the threaded version is taking as
long as, possibly longer than, the original.

Well, threading does add some overhead in terms of the task swap
time.
The thing is, the threaded version is running on an 8-processor IA-64
system, and it seems to only be using 2 or 3 processors at about 30%
each (it fluctuates). My guess is that the 6 running threads are each
using approximately 30% of 2 given CPUs.

What I would like is to have, say, one thread use as much of a given
CPU as possible, and when a new thread is started, have it use another
CPU if one is available instead of the same one.

Don't think you can do that with Python... The Python runtime
interpreter itself is running on a single processor. The second thing is
the infamous "global interpreter lock" (pull up the Python documentation
and do a search for that phrase). Basically, even if the threads could
be assigned to processors, this lock means only one thread can be
performing Python operations at a time -- a C-language number crunching
module /could/ release the lock, then do its number crunching in
parallel, reacquiring the lock when it finishes so it can return its
result(s) as Python objects.

You might get the results you want by not using threads, instead
spawning off completely new Python invocations assigned to other
processors.
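
A rough sketch of that approach using the standard subprocess module
(the crunch_worker.py script and the result-passing are assumptions
for illustration):

import subprocess

# one independent Python interpreter per chunk of work; each has its
# own GIL, so the OS can schedule them on separate CPUs
procs = [subprocess.Popen(["python", "crunch_worker.py", str(chunk)])
         for chunk in range(4)]
for p in procs:
    p.wait()  # results could come back via files, pipes, or sockets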

--
 

Mike Meyer

Jeffrey Maitland said:
What I would like is to have, say, one thread use as much of a given
CPU as possible, and when a new thread is started, have it use another
CPU if one is available instead of the same one. That way it should
speed the application up. The standalone (non-threaded) app uses 90+%
of a single CPU when in this part of the algorithm; that is why I
split and threaded that part of the algorithm, since it is repeated
several times. I can post generic code of what I am doing, but I can't
post the actual code because of its confidentiality. Thanks again in
advance for any and all comments (even any spiteful ones).

This kind of fine control over CPU allocation is very
platform-specific. Some platforms don't allow it at all. If your
platform does, the details of how you go about it will vary depending
on the platform.
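
On Linux, for example, you could pin each spawned process to a CPU
with the taskset utility (a sketch, assuming taskset is installed and
reusing the hypothetical crunch_worker.py from earlier in the thread):

import subprocess

# pin worker number n to CPU number n
procs = [subprocess.Popen(["taskset", "-c", str(n),
                           "python", "crunch_worker.py", str(n)])
         for n in range(4)]
for p in procs:
    p.wait()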

<mike
 

Jeffrey Maitland

Thanks.

I was hoping that Python would allow for CPU threading like Java etc.,
but I guess not (from the answers and other findings). I guess I will
have to write this part of the code in something such as Java or C
that allows for it; then I can either wrap it in Python or avoid
Python for this part of the app.

Thanks for all the help.
 

Grant Edwards

Don't think you can do that with Python... The Python runtime
interpreter itself is running on a single processor.

I don't see how that can be. Under Linux at least, the Python
threading module uses "real" OS threads, so there are multiple
instances of the interpreter, right? Generally all but one of
them will be blocked on the GIL, but there are still multiple
interpreter threads (which can be on multiple different CPUs).

Or is the Python interpreter actually doing the context
switches itself?
The second thing is the infamous "global interpreter lock"
(pull up the Python documentation and do a search for that
phrase).

The GIL is the issue.
Basically, even if the threads could be assigned to
processors,

Can somebody explain why they can't?
this lock means only one thread can be performing Python
operations at a time -- a C-language number crunching module
/could/ release the lock, then do its number crunching in
parallel, reacquiring the lock when it finishes so it can
return its result(s) as Python objects.

True. Python can execute C code in parallel, but not Python
code.
You might get the results you want by not using threads,
instead spawning off completely new Python invocations
assigned to other processors.

That should work, but managing the inter-process communication
and synchronization is a pain.
 

Grant Edwards

I don't see how that can be. Under Linux at least, the Python
threading module uses "real" OS threads, so there are multiple
instances of the interpreter, right? Generally all but one of
them will be blocked on the GIL, but there are still multiple
interpreter threads (which can be on multiple different CPUs).

Or is the Python interpreter actually doing the context
switches itself?

Upon further thought, that just can't be the case. There have
to be multiple instances of the interpreter because the
interpreter can make C system calls that block (thus blocking
that instance of the interpreter). Other Python threads within
the program continue to run, so there must be multiple Python
interpreters.
 

Peter Hansen

Jeffrey said:
I was hoping that Python would allow for CPU threading like Java etc.,
but I guess not (from the answers and other findings). I guess I will
have to write this part of the code in something such as Java or C
that allows for it; then I can either wrap it in Python or avoid
Python for this part of the app.

Or investigate the use of Irmen's Pyro package and how it could let you
almost transparently move your code to a *multi-process* architecture
which would then let you take advantage of all the CPUs you have
available (even if they _aren't_ on the same machine!).
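
To give a flavor of it, here is a minimal sketch using the Pyro4 API
(note: this is the later Pyro 4 interface, not the Pyro 3 release
current at the time, and the Cruncher class is purely illustrative):

import Pyro4

@Pyro4.expose
class Cruncher(object):
    def crunch(self, data):
        return sum(x * x for x in data)

# server side: publish an instance and wait for calls
daemon = Pyro4.Daemon()
uri = daemon.register(Cruncher())
print uri              # hand this URI to the client process
daemon.requestLoop()

# client side (separate process, possibly a separate machine):
#   proxy = Pyro4.Proxy(uri)
#   print proxy.crunch(range(1000))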

-Peter
 

Dennis Lee Bieber

Or is the Python interpreter actually doing the context
switches itself?
It would seem to be close to doing that, if it has that internal
"quantum" of releasing the GIL every 100 bytecodes... At the least, the
GIL release/reacquire would be similar to having a C-language program
doing sleep() to let other tasks run. I'll admit that I don't know if
creating a Python thread also creates a new interpreter from scratch
(after all, Windows doesn't have a fork() operation). It may be that the
GIL toggle is part of a thread state save/restore operation, and could
thereby be looked on as a high-level context switch with the OS-level
context switch basically selecting from the threads blocked on the GIL.
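
That quantum is actually visible (and tunable) from Python itself in
CPython 2.x:

import sys

print sys.getcheckinterval()  # 100 by default: bytecodes between GIL releases
sys.setcheckinterval(1000)    # switch less often; less overhead for crunching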


{I'm going to louse up the message tracking here by pasting part of your
follow-up into one response}

2> Upon further thought, that just can't be the case. There have
2> to be multiple instances of the interpreter because the
2> interpreter can make C system calls that block (thus blocking
2> that instance of the interpreter). Other Python threads within
2> the program continue to run, so there must be multiple Python
2> interpreters.

From the documentation:

"""
The lock is also released and reacquired around potentially blocking I/O
operations like reading or writing a file, so that other threads can run
while the thread that requests the I/O is waiting for the I/O operation
to complete.
"""

It will take someone who's actually worked on the runtime
interpreter, or studied the code, to, uhm, "interpret" all the above
tidbits...

That should work, but managing the inter-process communication
and synchronization is a pain.

No argument there... If processor affinities and heavy cpu usage
are involved, with no pre-existing language bias, I'd suggest Ada (GNAT
or current derivative) with maybe the distributed systems annex (I'm not
sure if that is just inter-box distributed, or applicable to
multi-processor boxes).

Otherwise, I'd say convert the number cruncher to a compiled
module that can be started as a Python thread, drop into the compiled
code, give up the GIL, and crunch away -- only acquiring the GIL when it
has results to give back.

--
 

Jonathan Ellis

Peter said:
Or investigate the use of Irmen's Pyro package and how it could let you
almost transparently move your code to a *multi-process* architecture

Unless you're doing anything that would require distributed locking.
Many if not most such projects do, which is why almost everyone prefers
to use threads on an SMP machine instead of splitting it across
multiple smaller boxes.

-Jonathan
 

Peter Hansen

Jonathan said:
Unless you're doing anything that would require distributed locking.
Many if not most such projects do, which is why almost everyone prefers
to use threads on an SMP machine instead of splitting it across
multiple smaller boxes.

I can't address the issue of whether or not "most" such projects require
distributed locking, because I'm not familiar with more than half of
such projects, as you appear to be. <wink>

On the other hand, I am (somewhat) familiar with Jeffrey's stated
problem area (two postings of his earlier in the thread) and it really
doesn't sound like he needs such a thing. Would you normally expect to
need distributed locking for a simple system where you had long-running
computations and wanted to improve performance by using multiple CPUs?

Of course, only he can tell for sure.

-Peter
 

Peter Hansen

Grant said:
Upon further thought, that just can't be the case. There have
to be multiple instances of the interpreter because the
interpreter can make C system calls that block (thus blocking
that instance of the interpreter). Other Python threads within
the program continue to run, so there must be multiple Python
interpreters.

Maybe you should consider and explain what you mean by "multiple
interpreters"? As I understand the concept, and based on my several
years' old reading of the virtual machine code, I wouldn't say there are
multiple interpreters.

There's a reason the GIL is the *global* interpreter lock...

-Peter
 

Grant Edwards

It would seem to be close to doing that, if it has that internal
"quantum" of releasing the GIL every 100 bytecodes...

Right, but I think that's just _allowing_ a context switch
rather than performing one. The other interpreters are blocked
waiting for the GIL and releasing it lets one of them run.
Though the effect is pretty much the same, it's quite different
than having a single interpreter that does the scheduling and
context switching itself within a single OS process/thread.
At the least, the GIL release/reacquire would be similar to
having a C-language program doing sleep() to let other tasks
run. I'll admit that I don't know if creating a Python thread
also creates a new interpreter from scratch (after all,
Windows doesn't have a fork() operation).

If I were doing it, I don't think I'd use fork() and create a
second address space. I'd use a "lightweight" thread. All of
the interpreter instances would share a single address space.
I know that Win32 has threads.
It may be that the GIL toggle is part of a thread state
save/restore operation, and could thereby be looked on as a
high-level context switch with the OS-level context switch
basically selecting from the threads blocked on the GIL.

Right -- I think that's what's happening. I really ought to go
look at the CPython source code instead of just spouting
conjecture.
{I'm going to louse up the message tracking here by pasting part of your
follow-up into one response}

2> Upon further thought, that just can't be the case. There have
2> to be multiple instances of the interpreter because the
2> interpreter can make C system calls that block (thus blocking
2> that instance of the interpreter). Other Python threads within
2> the program continue to run, so there must be multiple Python
2> interpreters.

From the documentation:

"""
The lock is also released and reacquired around potentially blocking I/O
operations like reading or writing a file, so that other threads can run
while the thread that requests the I/O is waiting for the I/O operation
to complete.
"""

I know. I've worked on modules that release the GIL and call
blocking operations. My point is that when an interpreter
calls a blocking operation, the interpreter itself blocks. It
stops running. It goes to sleep. But, other Python threads
keep running, so there must be other interpreters running those
threads.
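
That's easy to demonstrate: park one thread in a blocking C-level call
and watch the main thread keep going (a minimal sketch; the port
number is arbitrary):

import socket
import threading
import time

def waiter():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 9999))
    s.listen(1)
    s.accept()  # blocks inside the C library until a client connects

t = threading.Thread(target=waiter)
t.setDaemon(True)  # let the process exit despite the blocked thread
t.start()
for i in range(3):
    print "main thread still running:", i
    time.sleep(1)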
Otherwise, I'd say convert the number cruncher to a compiled
module that can be started as a Python thread, drop into the
compiled code, give up the GIL, and crunch away -- only
acquiring the GIL when it has results to give back.

Unfortunately that means you've got to debug a number cruncher
that's written in C.
 

Grant Edwards

Maybe you should consider and explain what you mean by
"multiple interpreters"?

That in a multi-threaded Python program, the code that
implements the Python VM is executing "simultaneously" in
multiple contexts: one for each thread (and possibly one master
thread).

I was responding to somebody who said that there were two issues
with using multiple CPUs:

1) the interpreter (singular) only ran on one CPU.

2) the GIL.

My point was that 1) couldn't be true. There must be multiple
instances of the interpreter, since in a multi-threaded Python
program the interpreter blocks when making libc calls like
read(), write(), recv(), and send(), and yet other Python threads
continue to run. If there _were_ only a single interpreter, and
it ran only on a single CPU, then the GIL wouldn't be needed.
As I understand the concept, and based on my several years'
old reading of the virtual machine code, I wouldn't say there
are multiple interpreters.

There's a reason the GIL is the *global* interpreter lock...

Exactly.
 

Jeffrey Maitland

Thanks for the info.

I was doing some more digging and I came across a module called POSH
which should allow me to do what I want. My question now is: has
anyone here used it, and if so, was it as easy to implement as what I
am reading suggests? (I have to wait for the sysadmin to install the
module on that server, but I have already adapted a copy of the code
to its syntax so I can test it as soon as the admin installs it.)

Once again thanks for the information that you all have shared.

Jeff
 

Jonathan Ellis

Peter said:
I can't address the issue of whether or not "most" such projects require
distributed locking, because I'm not familiar with more than half of
such projects, as you appear to be. <wink>

Your sarcasm is cute, I suppose, but think about it for a minute. If
the opposite of what I assert is true, why would even the mainstream
press be running articles along the lines of "multicore CPUs mean
programming will get tougher because locking is hard to get right and
you can't just scale by relying on the cpu to run your one
thread/process really fast anymore."

http://www.gotw.ca/publications/concurrency-ddj.htm for one example.

-Jonathan
 
