subprocess call is not waiting.

paulstaten

I have a subprocess.call which tries to download data from a remote server using HTAR. I put the call in a while loop, which tests whether the download was successful and, if not, loops back around up to five times, just in case my internet connection has a hiccup.

Subprocess.call is supposed to wait.

But it doesn't work as intended. The loop quickly runs 5 times, starting a new htar command each time. After five times around, my program tells me my download failed, because the target file doesn't yet exist. But it turns out that the download is still happening, five times over.

When I run htar from the shell, I don't get a shell prompt again until after the download is complete. How come control is returned to python before the htar command is through?

I've tried using Popen with wait and/or communicate, but no waiting ever happens. This is troublesome not only because I don't get to post-process my data, but because when I run this script for multiple datasets (checking to see whether I have local copies), I quickly get a "Too many open files" error. (I began working on that by trying to use Popen with close_fds, etc.)

Should I just go back to os.system?
 
MRAB

Which OS? Is there some documentation somewhere?
 
woooee

It possibly requires shell=True, but without any code or any way to test, we cannot say.
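
For reference, here is a minimal sketch of the kind of retry loop described in the original post. The real htar command line was never posted, so a tiny Python child process stands in for it; `fetch_with_retries`, `max_attempts`, `ok_cmd` and `bad_cmd` are hypothetical names used only for illustration.

```python
import subprocess
import sys


def fetch_with_retries(cmd, max_attempts=5):
    """Run cmd up to max_attempts times.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        # With a list argument and no shell, call() blocks until the child
        # exits and returns the child's exit status.
        returncode = subprocess.call(cmd)
        if returncode == 0:
            return attempt
    return None


# Stand-ins for the real htar command (an assumption, for illustration only).
ok_cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
bad_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]

first_result = fetch_with_retries(ok_cmd)
second_result = fetch_with_retries(bad_cmd)
print(first_result, second_result)
```

If a loop like this still races ahead, one thing worth checking is whether the command puts its real work in the background (for example, a shell command line ending in `&`), because call() only waits for the process it started.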
 
Chris Rebert

> I have a subprocess.call
> But it doesn't work as intended.
> Should I just go back to os.system?

Did the os.system() version work?

os.system() just hands the command string to the system shell, which is
also what the `subprocess` module does when given shell=True, so if it
does work, then it assuredly can be made to work using the `subprocess`
module instead.
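
To illustrate the point that a command which works with os.system() can also be run through `subprocess`, here is a small sketch. The command is a harmless stand-in (an assumption, since the real htar command line was not posted), and the variable names are made up for the example.

```python
import os
import subprocess
import sys

# Stand-in command string: a child Python process that exits with status 3.
cmd = '"{}" -c "import sys; sys.exit(3)"'.format(sys.executable)

# Both calls pass the string to the shell and block until it finishes.
status_system = os.system(cmd)                  # on POSIX: an encoded wait status
status_call = subprocess.call(cmd, shell=True)  # the plain exit code

print(status_system, status_call)
```

The two return values are not directly comparable on POSIX systems, where os.system() reports the status in the encoding used by wait(), while subprocess.call() returns the exit code itself.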

Cheers,
Chris
 
paulstaten

os.system worked fine, and I found something in another section of code that was causing the "Too many open files" errors. (I was fooled, because output from subprocess.call didn't seem to be coming out until the open files error.)

I'll go back and play with subprocess.call more, since os.system works. That's interesting about using shlex at run time. Is that just for the sake of computational cost?
 
Wanderer

I never got the hang of subprocess, either. I ended up wrapping os.system in a python file and using subprocess to call that with:

subprocess.Popen([sys.executable, 'Wrapper.py'])

This works for me. I'm using Windows 7.
 
Chris Rebert

> os.system worked fine, and I found something in another section of code that was causing the "Too many open files" errors. (I was fooled, because output from subprocess.call didn't seem to be coming out until the open files error.)
>
> I'll go back and play with subprocess.call more, since os.system works. That's interesting about using shlex at run time. Is that just for the sake of computational cost?

No, like I said, you'll also get incorrect results. shlex isn't magic.
If the exact command line it's given wouldn't work in the shell, then
it won't magically fix things. Many (most?) dynamic invocations of
shlex.split() are naive and flawed:
>>> import shlex
>>> filename = "my summer vacation.txt"
>>> # the following error is less obvious when the command is more complex
>>> # (and when the filename isn't hardcoded)
>>> cmd = "cat " + filename
>>> shlex.split(cmd)
['cat', 'my', 'summer', 'vacation.txt']
>>> # that's wrong; the entire filename should be a single list element

Equivalent bash error:
chris@mbp ~ $ cat my summer vacation.txt
cat: my: No such file or directory
cat: summer: No such file or directory
cat: vacation.txt: No such file or directory

The right way, in bash:
chris@mbp ~ $ cat my\ summer\ vacation.txt
Last summer, I interned at a tech company and...
chris@mbp ~ $ cat 'my summer vacation.txt'
Last summer, I interned at a tech company and…

And indeed, shlex will get that right too:
>>> shlex.split(r"cat my\ summer\ vacation.txt")
['cat', 'my summer vacation.txt']
>>> shlex.split("cat 'my summer vacation.txt'")
['cat', 'my summer vacation.txt']

BUT that presumes that your filenames are already pre-quoted or have
had backslashes added, which very seldom is the case in reality. So,
you can either find an escaping function and hope you never forget to
invoke it (cf. SQL injection), or you can figure out the general
tokenization and let `subprocess` handle the rest (cf. prepared
statements):
>>> shlex.split('cat examplesimplefilename')
['cat', 'examplesimplefilename']
>>> # Therefore...
>>> from subprocess import call
>>> def do_cat(filename):
...     cmd = ['cat', filename]  # less trivial cases would be more interesting
...     call(cmd)
...
Generally, use (a) deliberately simple test filename(s) with shlex,
then take the resulting list and replace the filename(s) with (a)
variable(s).

Or, just figure out the tokenization without recourse to shlex; it's
not difficult in most cases!
The Note in the Popen docs covers some common tokenization mistakes people make:
http://docs.python.org/library/subprocess.html#subprocess.Popen
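
As a side-by-side sketch of the two routes described above (escaping versus passing a list), assuming Python 3.3 or later, where `shlex.quote()` is available as the escaping function. The variable names are made up for the example.

```python
import shlex

filename = "my summer vacation.txt"

# Naive concatenation: the filename falls apart into three tokens.
naive = shlex.split("cat " + filename)

# Escaping route: quote the filename before building the command string.
quoted = shlex.split("cat " + shlex.quote(filename))

# List route: nothing to escape, because no shell-style parsing happens.
as_list = ["cat", filename]

print(naive)
print(quoted)
print(as_list)
```

The list route is the one that cannot be forgotten: there is no escaping step to leave out.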

Cheers,
Chris
 
andrea crotti

I have a similar problem, something which I've never quite understood
about subprocess...
Suppose I do this:

proc = subprocess.Popen(['ls', '-lR'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

Now I've created a process, which has a PID, but apparently it's not running.
It only seems to run when I actually do the wait.

I don't want to wait for it, so an easy solution is just to use a
thread, but is there a way to do it with subprocess?
 
Dennis Lee Bieber

Unless you have a really massive result set from that "ls", that
command probably ran so fast that it is blocked waiting for someone to
read the PIPE.
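
A small sketch of that situation, using a Python child process as a stand-in for "ls -lR" (an assumption, so the example runs anywhere). The child writes about 1 MB, which is more than a pipe usually buffers, so it cannot finish until the parent reads.

```python
import subprocess
import sys
import time

# Stand-in for "ls -lR /": a child that writes about 1 MB to stdout.
child_code = "import sys; sys.stdout.write('x' * 1000000)"
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE)

time.sleep(0.5)
# The child was started by Popen() itself, not by communicate(). Since
# nobody has read the pipe yet, it cannot have written everything, so
# poll() reports None ("still running") while it sits idle in write().
status_before_read = proc.poll()

out, _ = proc.communicate()  # drain the pipe and wait for the child to exit
print(status_before_read, len(out), proc.returncode)
```

A process blocked like this uses no CPU, which is why it would not stand out in a tool that sorts by activity.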
 
andrea crotti

2012/9/18 Dennis Lee Bieber said:
> Unless you have a really massive result set from that "ls", that
> command probably ran so fast that it is blocked waiting for someone to
> read the PIPE.

I also tried with "ls -lR /", which definitely takes a while to run.
When I do this:

proc = subprocess.Popen(['ls', '-lR', '/'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

nothing is running. Only when I actually call
proc.communicate()

do I see the process running in top.
Is it still an observation problem?

Anyway, I also need to know when the process is over while waiting, so
probably a thread is the only way.
 
Hans Mulder

Yes: using "top" is an observation problem.

"Top", as the name suggests, shows only the most active processes.

It's quite possible that your 'ls' process is not active, because
it's waiting for your Python process to read some data from the pipe.

Try using "ps" instead. Look in the man page for the correct
options (they differ between platforms). The default options do
not show all processes, so they may not show the process you're
looking for.

> Anyway I also need to know when the process is over while waiting, so
> probably a thread is the only way..

This sounds confused.

You don't need threads. When 'ls' finishes, you'll read end-of-file
on the proc.stdout pipe. You should then call proc.wait() to reap
its exit status (if you don't, you'll leave a zombie process).
Since the process has already finished, the proc.wait() call will
not actually do any waiting.
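
A minimal sketch of that pattern: read the pipe until end-of-file, then call wait() to collect the exit status. A small Python child stands in for 'ls' (an assumption), and only stdout is piped here.

```python
import subprocess
import sys

# Stand-in for 'ls': a child that prints three lines and exits.
child_code = "for i in range(3): print('line', i)"
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, text=True)

lines = []
for line in proc.stdout:  # the loop ends when end-of-file is reached
    lines.append(line.rstrip("\n"))
proc.stdout.close()

# The child has closed its stdout by now, so wait() returns almost at once;
# its job here is to reap the exit status and avoid leaving a zombie.
returncode = proc.wait()
print(lines, returncode)
```

When both stdout and stderr are pipes, reading only one of them can stall if the other fills up; communicate() reads both and is the simpler choice in that case.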


Hope this helps,

-- HansM
 
Gene Heskett

> "Top", as the name suggests, shows only the most active processes.

Which is why I run htop in a shell 100% of the time. With htop, you can
scroll down and see everything.

Cheers, Gene
 
andrea crotti

2012/9/19 Hans Mulder said:
> Yes: using "top" is an observation problem.
>
> "Top", as the name suggests, shows only the most active processes.

Sure, but "ls -lR /" is a very active process if you try to run it.
Anyway, as written below, I don't need this anymore.

> You don't need threads. When 'ls' finishes, you'll read end-of-file
> on the proc.stdout pipe. You should then call proc.wait() to reap
> its exit status (if you don't, you'll leave a zombie process).
> Since the process has already finished, the proc.wait() call will
> not actually do any waiting.

Well, there is a process which has to do two things: periodically
monitor some external conditions (filesystem / db), and launch a
process that can take a very long time.

So I can't put a wait anywhere, or I'll stop everything else. But at
the same time I need to know when the process is finished, which I
could do but without a wait might get hacky.

So I'm quite sure I just need to run the subprocess in a subthread
unless I'm missing something obvious..
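
For what it's worth, this monitor-and-launch combination can be sketched without a thread by using Popen.poll(), which returns None while the child is still running and never blocks. Here `check_external_conditions` is a hypothetical placeholder for the filesystem / db monitoring, and a sleeping Python child stands in for the long-running job.

```python
import subprocess
import sys
import time


def check_external_conditions():
    """Hypothetical placeholder for the periodic filesystem / db checks."""
    return True


# Stand-in for the long-running job (an assumption, for illustration).
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])

checks = 0
while proc.poll() is None:  # None means the child is still running
    check_external_conditions()
    checks += 1
    time.sleep(0.1)

# Once poll() returns a value, the exit status has already been collected.
print("job finished with exit code", proc.returncode, "after", checks, "checks")
```

A thread is still a reasonable choice if the monitoring loop is structured differently; this only shows that subprocess alone is enough for the simple case.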
 
harish.barvekar

subprocess.call(tempFileName, shell=True).communicate()

This process is not blocking. I want to make a blocking call to it. Please help.
 
Terry Reedy

> subprocess.call(tempFileName, shell=True).communicate()

That should raise an AttributeError, as the int returned by subprocess.call
does not have a .communicate method.

> this process is not blocking.

Why do you think that? All function calls block until the function
returns, at which point blocking ceases. If you call
Popen(someprog).communicate() and someprog runs quickly, you will hardly
notice the blocking time.
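
A short sketch of the difference, with a trivial Python child standing in for the real program (an assumption; the contents of tempFileName are not known):

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('done')"]

# call() waits for the child itself and returns its exit code, a plain int,
# so there is nothing to call .communicate() on.
rc = subprocess.call(cmd)

# Popen() returns immediately; communicate() is what blocks until exit.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out, _ = proc.communicate()

print(type(rc).__name__, hasattr(rc, "communicate"), proc.returncode)
```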
 
