A tricky problem about Process.wait and popen

U

uncutstone

My OS is Windows 98, and my Ruby version is ruby 1.8.2 (2004-12-25)
[i386-mswin32].

The following is the code:

main.rb:

pipe = IO.popen("ruby saygoodbye.rb", "w+")
sleep 3
cpid = Process.wait
lines = pipe.readlines
puts lines

saygoodbye.rb:

34.times do
  puts "Please take a look at what will happen, it's really tricky"
end
$stderr.puts "child process #{Process.pid} exits"

The above code does not work as expected: the parent process waits
forever. When I use Ctrl-C to break it, I get

main.rb:3:in `wait': Interrupt
from main.rb:3

But if I change 34.times in saygoodbye.rb to 33.times, it works fine.
So it seems there is a limit on the pipe's capacity.

Can somebody give me a clear explanation? Many thanks.
 
U

uncutstone

It seems that when the child process writes too much to the pipe, the
exit signal the child sends to the parent process gets lost.

Is it a bug?
 
G

Gonald

uncutstone said:
It seems that when the child process writes too much to the pipe, the
exit signal the child sends to the parent process gets lost.

Is it a bug?
It's not a bug. The child process just never exits. This is because it
blocks while the pipe is full. Since the parent waits for the child to
exit before it reads from the pipe, a deadlock occurs after a certain
amount of data.

Try this, instead:

pipe = IO.popen("ruby saygoodbye.rb", "w+")
sleep 3
lines = pipe.readlines
cpid = Process.wait
puts lines
 
U

uncutstone

Thanks.
But it seems that the child process has exited.
Actually, the result is:

child process 577605 exits

And main.rb is waiting. After I use Ctrl-C to break it, I get:

main.rb:2:in `wait': Interrupt
from main.rb:2
 
R

Robert Klemme

Gonald said:
It's not a bug. The child process just never exits. This is because it
blocks while the pipe is full. Since the parent waits for the child to
exit before it reads from the pipe, a deadlock occurs after a certain
amount of data.

Try this, instead:

pipe = IO.popen("ruby saygoodbye.rb", "w+")
sleep 3
lines = pipe.readlines
cpid = Process.wait
puts lines
The pipe is not closed properly. I'd use the block form

IO.popen("ruby saygoodbye.rb", "w+") do |pipe|
....
end

Also, I'm not sure I get the point of the "sleep 3" in there. IMHO it's
superfluous, since readlines won't return before the process has died
anyway. In the worst case it slows things down; otherwise it does
nothing.

Also, I think Process.wait won't work, because readlines won't return
before the child has terminated.

If you want more direct output then it's better to do the iteration and
printing directly, like

IO.popen("ruby saygoodbye.rb", "w+") do |pipe|
  pipe.each {|line| puts line}
end
puts "Exit status: #{$?.exitstatus}"  # $? is set once the pipe is closed

Kind regards

robert
 
U

uncutstone

Also, I'm not sure I get the point of the "sleep 3" in there.

It has nothing to do with "sleep 3"; I just forgot to remove it. Sorry
that it may have blurred the real problem.
The pipe is not closed properly. I'd use the block form
IO.popen("ruby saygoodbye.rb", "w+") do |pipe|
...
end

Yes, that's better. But my real problem is the lost signal that the
child process sends to the parent when it exits.
Also I think Process.wait won't work because readlines won't return before the child terminated.
Sorry, I don't think I get your point. Process.wait is executed before
readlines, so why would readlines affect Process.wait?
 
U

uncutstone

I ran more tests of the code and eventually found you are right, and I
got a clear understanding of it. Yes, the child process never exits and
sends no signal.

It is really tricky and interesting to see the way the child process
gets blocked.

When the code is "34.times do", the result is:

"child process 577605 exits" is displayed on the screen,
and then deadlock happens.

And after I use Ctrl-C to break it, I get:

main.rb:2:in `wait': Interrupt
from main.rb:2

This shows that the child process gets blocked before it exits: it
waits for the parent process to read the pipe before exiting.

But when I change the code to "68.times do", the result is:

Nothing is displayed on the screen, and deadlock happens.

And after I use Ctrl-C to break it, I get:

main.rb:2:in `wait': Interrupt
from main.rb:2
saygoodbye.rb:3:in `write': Bad file descriptor (Errno::EBADF)
from saygoodbye.rb:3:in `puts'
from saygoodbye.rb:3
from saygoodbye.rb:1:in `times'
from saygoodbye.rb:1

This shows that the child process gets blocked immediately while it is
writing to the pipe.

Let me make a simple summary:
1. If the child process writes too much to the pipe, it gets blocked;
it does not exit and sends no exit signal.
2. How the child process gets blocked depends on how much it wants to
write to the pipe.
2.1 If the child process makes the pipe full but not overly so, it gets
blocked just before it exits.
2.2 If the child process makes the pipe full and then some, it gets
blocked immediately at the pipe write.

Do you agree? Many thanks.
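The pipe-capacity limit described above can be probed directly. The
following sketch is mine, not from the thread: it fills a pipe with
non-blocking writes until the kernel refuses more data, which is the
point where a plain blocking write would stall the writer.

```ruby
# Sketch (not from the thread): fill a pipe with non-blocking writes
# until the kernel refuses more data. The capacity is platform-dependent
# (small on Windows 98, typically 64 KB on modern Linux), which is why
# the 33-vs-34-line threshold shows up in the original example.
reader, writer = IO.pipe
written = 0
begin
  loop { written += writer.write_nonblock("x" * 1024) }
rescue Errno::EAGAIN, Errno::EWOULDBLOCK, IO::WaitWritable
  # The pipe is now full; a plain blocking write here would stall the
  # writer -- exactly how the child process in this thread deadlocks.
end
puts "this pipe buffers roughly #{written} bytes"
reader.close
writer.close
```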
 
U

uncutstone

I think I already understand all of this. But an understanding doesn't
mean a solution.

Let me describe the problem I want to solve.

There is a parent process and a fixed number of child processes. The
parent process is responsible for forking a fixed number of child
processes using IO.popen and then waiting for any child process to
exit. Whenever a child process completes its task, it writes the result
to its pipe, signals the parent process "mission complete", and then
exits. When the parent gets the signal, it reads the corresponding pipe
to get the result and then creates a new child process.

If the result returned by the child process is small, the pipe doesn't
get full, and it works well.

But if the result is big and the pipe gets full, the child gets blocked
and the signal cannot be sent to the parent process.

Let me make it simple:
1. Multiple child processes want to return results to the parent
process via pipes.
2. The parent needs a mechanism similar to Unix select() to get a
notification from a child process when it completes its task.

main.rb:

processNum = 10
pipes = Hash.new
numOfProcessRunning = 0
loop do
  apipe = IO.popen("ruby child.rb", "w+")
  cpid = apipe.gets.strip.to_i
  pipes[cpid] = apipe
  numOfProcessRunning += 1
  if numOfProcessRunning >= processNum
    cpid = Process.wait
    apipe = pipes[cpid]
    apipe.each { |line|
      # ..., get the result
    }
    apipe.close
    pipes.delete(cpid)
    numOfProcessRunning -= 1
  end
end

child.rb:
puts Process.pid
$stdout.flush
#..... , do something, then
puts result
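The select()-style mechanism asked for in point 2 can be sketched as
follows. This is my code, not from the thread; it is POSIX-only
(IO.select on pipes is unreliable on Windows), and the inline child
command is a stand-in for child.rb that follows the same pid-first
protocol.

```ruby
# Sketch: the parent uses IO.select to drain whichever pipe has data,
# so no child ever blocks on a full pipe buffer. The inline child
# mimics child.rb: it prints its pid first, then its result.
CHILD_CODE = 'puts Process.pid; $stdout.flush; puts "result-#{Process.pid}"'

pipes = {}
3.times do
  io = IO.popen(["ruby", "-e", CHILD_CODE])
  cpid = io.gets.strip.to_i          # child announces its pid first
  pipes[cpid] = io
end

results = Hash.new { |h, k| h[k] = [] }
until pipes.empty?
  ready, = IO.select(pipes.values)   # block until some child has output
  ready.each do |io|
    cpid = pipes.key(io)
    if (line = io.gets)
      results[cpid] << line.chomp
    else                             # EOF: this child has exited
      io.close                       # IO.popen's close also reaps the child
      pipes.delete(cpid)
    end
  end
end

results.each { |cpid, lines| puts "child #{cpid}: #{lines.join(' ')}" }
```

Because every ready pipe is drained as soon as data arrives, a child
can write arbitrarily large results without ever filling its pipe.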
 
R

Robert Klemme

First of all please don't post the same message several times.
I think I already have understanded all of these. But a understanding
doesn't mean a solution.

Let me give the problem I want to solve.

There is a parent process and a fixed number of child processes. Parent
process is responsible for forking a fixed number of child processes
using IO.popen and then wait for any child process exits. Whenever a
child process complete its task , it will write the result in pipe and
signal the parent process "mission complete" and then exits. When
parent get the signal, it will read correspondant pipe to get result
and then creating a new child process .

Try the implementation I suggested. The IO created by IO.popen will not
signal EOF before the child has terminated. So just use read or
readlines to read the pipe's contents until you get everything (= the
process has terminated). You can then evaluate it as you see fit (see
the suggested code I posted earlier).
If the result returned by the child process is small, the pipe doesn't get
full, and it works well.

You must not rely on this. Increasing the pipe's size is not an option,
as you still have a limit. Your process is responsible for reading the
pipe in order to prevent the writer from blocking.
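For instance (my illustration of that point, not Robert's code), a
background thread can drain the pipe continuously so the writer can
never block, no matter how much the child produces:

```ruby
# Sketch: a background thread drains the child's stdout continuously,
# so the child can never block on a full pipe even though it writes
# far more (~1.2 MB here) than any pipe buffer holds.
output = nil
IO.popen(["ruby", "-e", "20_000.times { puts 'x' * 60 }"]) do |pipe|
  drainer = Thread.new { pipe.read }  # keeps the pipe empty
  # ... the parent is free to do other work here ...
  output = drainer.value              # returns once the child hits EOF
end
puts "read #{output.bytesize} bytes from the child"
```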
But if the result is big and the pipe gets full, the child gets blocked
and the signal cannot be sent to the parent process.

Let me make it simple:
1. Multiple child processes want to return results to the parent
process via pipes.
2. The parent needs a mechanism similar to Unix select() to get a
notification from a child process when it completes its task.

The code is something like this:

main.rb
processNum = 10
pipes = Hash.new
numOfProcessRunning = 0
loop do
  apipe = IO.popen("ruby child.rb", "w+")
  cpid = apipe.gets.strip.to_i
  pipes[cpid] = apipe
  numOfProcessRunning += 1
  if numOfProcessRunning >= processNum
    cpid = Process.wait
    apipe = pipes[cpid]
    apipe.each { |line|
      # ..., get the result
    }
    apipe.close
    pipes.delete(cpid)
    numOfProcessRunning -= 1
  end
end

This looks overly complex. You could do something like this:


PARALLEL = 10

threads = (1..PARALLEL).map do |i|
  Thread.new(i) do |count|
    IO.popen("ruby -e 'puts \"child #{count}\"'") do |io|
      res = io.read
      puts "Child #{count} returned #{res.inspect}"
    end
  end
end

puts "Started"

threads.each {|th| th.join}

puts "Terminated"

child.rb:
puts Process.pid
$stdout.flush
#..... , do something, then
puts result

Btw, if your child processes are Ruby processes you can use fork with a
block:


PARALLEL = 10

threads = (1..PARALLEL).map do |i|
  Thread.new(i) do |count|
    read, write = IO.pipe

    cpid = fork do
      read.close
      write.puts "I'm child #{count}"
      write.close
    end

    write.close
    res = read.read
    puts "Child #{count} with PID #{cpid} returned #{res.inspect}"
  end
end

puts "Started"

threads.each {|th| th.join}

puts "Terminated"


If you need to communicate more complex information between child and
parent you can use Marshal on the pipe:


PARALLEL = 10

threads = (1..PARALLEL).map do |i|
  Thread.new(i) do |count|
    read, write = IO.pipe

    cpid = fork do
      # in child process
      read.close
      Marshal.dump({:result => "I'm child #{count}"}, write)
      write.close
    end

    # in parent process
    write.close
    res = Marshal.load(read)
    puts "Child #{count} with PID #{cpid} returned #{res.inspect}"
  end
end

puts "Started"

threads.each {|th| th.join}

puts "Terminated"


These examples with an explicitly created pipe work the same way as the
popen approach, i.e. the parent thread blocks until the pipe reaches
EOF, which in turn happens when the child exits.

Kind regards

robert
 
U

uncutstone

Many thanks for your patience in reading my long post and giving me a
good answer.

But there is still one thing untackled: I want a new child process to
be forked immediately whenever a child process exits, without waiting
for the other child processes to exit.

But "threads.each {|th| th.join}" makes the main thread wait until all
child processes exit. I want it to wait for just one child, not for all
of them.

I have got an idea using DRb (no pipes anymore) to implement this. I
really like it; hope you can appreciate it. :)
The following is the code:

taskmanager.rb:

require 'drb'

# task manager is responsible for centralized task assignment and
# centralized task result process
class TaskManager

  def TaskManager.main(processNum, filename)
    aTaskManager = TaskManager.new(processNum, filename)
    aTaskManager.start
  end

  def initialize(processnum, filename)
    @processnum = processnum
    @filename = filename
  end

  def start
    @processnum.times do
      forkChild
    end
    DRb.start_service('druby://localhost:9000', self)
    DRb.thread.join
  end

  def requestTask # centralized task assignment
    assignATask
  end

  def finishTask(*result)
    puts result.join(" ")
    resultProcess(*result)
  end

  def stop
  end

  def forkChild
    Thread.new { system("ruby #{@filename}") }
  end

  def resultProcess(*result)
  end

  def assignATask
    rand(200)
  end

  private :forkChild, :assignATask, :resultProcess
end

TaskManager.main(5, "worker.rb")


worker.rb:

require 'drb'

# worker is responsible for executing specific tasks

class Worker

  def Worker.main
    puts "Worker #{Process.pid} starts"
    aworker = Worker.new
    aworker.work
  end

  def initialize
    DRb.start_service
    @taskManager = DRbObject.new(nil, 'druby://localhost:9000')
  end

  def work
    aTask = @taskManager.requestTask
    $stderr.puts "Worker #{Process.pid} got task #{aTask}"
    loop do
      sleep(10)
      @taskManager.finishTask("Task", aTask.to_s, "finished")
      aTask = @taskManager.requestTask
      $stderr.puts "Worker #{Process.pid} got task #{aTask}"
    end
  end

end

Worker.main


I still have one question: how can I stop or close a DRb service?
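For reference on that last question (this is not answered later in the
thread): DRb provides DRb.stop_service, which shuts down the server
started by DRb.start_service. A minimal sketch, with a class name and
port choice of my own:

```ruby
# Sketch: DRb.stop_service shuts down the server that DRb.start_service
# started; after that, clients can no longer call in.
require 'drb'

class Front
  def ping
    "pong"
  end
end

DRb.start_service('druby://localhost:0', Front.new)  # port 0: pick any free port
client = DRbObject.new_with_uri(DRb.uri)             # DRb.uri has the real port
reply = client.ping                                  # round-trip over dRuby

DRb.stop_service                                     # close the service down
puts reply
```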
 
R

Robert Klemme

uncutstone said:
Many thanks for your patience in reading my long post and giving me a
good answer.

But there is still one thing untackled: I want a new child process to
be forked immediately whenever a child process exits, without waiting
for the other child processes to exit.

Then place a loop into the thread block.
But "threads.each {|th| th.join}" makes the main thread wait until all
child processes exit. I want it to wait for just one child, not for all
of them.

No, you probably did not understand my code fully: it waits for all
*threads* to terminate! The script creates PARALLEL (i.e. 10) threads,
and each of them forks a ruby process and terminates once the child
process has terminated. So you get 10 processes executing in parallel,
and the script waits for all of them to terminate.
I have got an idea using DRb (no pipes anymore) to implement this. I
really like it; hope you can appreciate it. :)

I don't think you need DRb for this. It seems the number of child
processes is an input parameter; you just need to add that to my
script. Then decrement a counter thread-safely and have each thread
start a new process as long as you are not done. It's an easy change to
the script I provided earlier.
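That suggestion (a loop in each thread plus a mutex-protected counter)
can be sketched like this; the counter, result queue, and inline child
command are my own stand-ins:

```ruby
# Sketch of the suggestion above: each of PARALLEL threads loops,
# spawning a fresh child as soon as its previous one exits, until a
# mutex-protected counter of remaining tasks runs out. At most PARALLEL
# children run at any moment, and a new one starts immediately when any
# child finishes, without waiting for the others.
PARALLEL = 3
TASKS    = 10

remaining = TASKS
mutex     = Mutex.new
results   = Queue.new

threads = (1..PARALLEL).map do
  Thread.new do
    loop do
      claimed = mutex.synchronize do   # claim one task thread-safely
        if remaining > 0
          remaining -= 1
          true
        end
      end
      break unless claimed
      IO.popen(["ruby", "-e", "puts Process.pid"]) do |io|
        results << io.read.strip       # read blocks until this child exits
      end
    end
  end
end

threads.each(&:join)
puts "ran #{results.size} children, #{PARALLEL} at a time"
```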

<snip/>

Kind regards

robert
 
