Optimization tweak. Using fork as a "mark" and "release" heap manager.

John Carter · Mar 28, 2006

Some old Pascal implementations had (and I think some still do) a
facility to "mark" the heap, and then at some point "release" all
items allocated after that mark.

Here is a nifty way of doing the same (and more!) in Ruby...

==========================try.rb======================================
pid = Process.fork do
  # Load any modules we need
  require 'find'

  a = 'x' * 100*1024*1024

end

pid, result = Process.waitpid2(pid)
======================================================================

Here is an edited version of the result of running (as root)
strace -v -f -o strace.log ruby try.rb

======================================================================
1597 execve("/usr/local/bin/ruby", ["ruby", "try.rb"], ["HZ=100", "SHELL=/bin/bash", "TERM=xterm", "OLDPWD=/root", "USER=root", "MAIL=/var/mail/root", "PATH=/usr/local/sbin:/usr/local/"..., "PWD=/home/johnc/tmp", "PS1=\\h:\\w\\$ ", "SHLVL=1", "HOME=/root", "LOGNAME=root", "DISPLAY=:0.0", "_=/usr/bin/strace"]) = 0

....all the start up cost of invoking ruby paid once and only once....

Here is the OS level call to fork...
1597 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7d57708) = 1598

# Note that this is very fast: Unix does not copy the process's memory up
# front. Parent and child share pages, which are copied only when one of
# them writes to a page (COW, copy on write).

# Parent proc hangs waiting for child...
1597 waitpid(1598, <unfinished ...>

# Child proc loads and evals find.rb
1598 open("/usr/local/lib/ruby/1.9/find.rb", O_RDONLY|O_LARGEFILE) = 3
1598 close(3) = 0
1598 open("/usr/local/lib/ruby/1.9/find.rb", O_RDONLY|O_LARGEFILE) = 3
1598 ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfd1baa8) = -1 ENOTTY (Inappropriate ioctl for device)
1598 read(3, "#\n# find.rb: the Find module for"..., 8192) = 1922

# Child proc grabs a huge chunk more memory

1598 brk(0x81c1000) = 0x81c1000
1598 mmap2(NULL, 104861696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb1925000

# Child exits...
1598 exit_group(0) = ?
1597 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 1598
1597 --- SIGCHLD (Child exited) @ 0 (0) ---

# WHEE! Bang! All the memory and resources associated with the child
# are reclaimed completely and instantly by the OS.

# Parent continues on its merry way, light and free....
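
The pattern traced above can be wrapped in a small helper. This is only a minimal sketch: the name in_released_heap and the $loaded_in_parent flag are made up for illustration, and the allocation is cut to 10 MB to keep the run light. It assumes a platform where Process.fork is available.

```ruby
# Hypothetical helper (name is illustrative, not from the original post):
# run a block in a forked child and wait for it, so that everything the
# block loads or allocates is released by the OS when the child exits.
def in_released_heap
  pid = Process.fork do
    yield
    exit!(0) # leave immediately, skipping at_exit handlers copied from the parent
  end
  _, status = Process.waitpid2(pid)
  status
end

$loaded_in_parent = false

status = in_released_heap do
  require 'find'                  # loaded in the child only
  $loaded_in_parent = true        # changes the child's copy, not the parent's
  big = 'x' * (10 * 1024 * 1024)  # memory that dies with the child
  big.size
end

puts "child exited ok: #{status.success?}"
puts "parent saw the change: #{$loaded_in_parent}"
```

The flag stays false in the parent because the child only ever writes to its own copy of the page; that isolation is the "and more" part of the trick.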

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.

ara.t.howard · Mar 28, 2006

John said:
Some old Pascal implementations had (and I think some still do) a
facility to "mark" the heap, and then at some point "release" all
items allocated after that mark.

Here is a nifty way of doing the same (and more!) in ruby....

==========================try.rb======================================
pid = Process.fork do
# Load any modules we need
require 'find'

a = 'x' * 100*1024*1024

end

pid, result = Process.waitpid2( pid)
======================================================================

you can get an object back from the child using:

harp:~ > cat a.rb
def child
  r, w = IO.pipe
  IO.popen('-') do |pipe|
    if pipe
      w.close
      buf = pipe.read
      pipe.close
      raise Marshal.load(r.read) unless $? == 0
      Marshal.load(buf)
    else
      r.close
      begin
        print(Marshal.dump(yield))
      rescue Exception => e
        w.print(Marshal.dump(e))
        exit! 42
      end
    end
  end
ensure
  r.close
end

emsg = lambda{|e| STDERR.puts %Q[#{ e.message } (#{ e.class })\n#{ e.backtrace.join "\n" }]}

p child{ 'value from child' } rescue emsg[$!]

p child{ error_from_child } rescue emsg[$!]

p 'but the parent lives'

harp:~ > ruby a.rb
"value from child"
undefined local variable or method `error_from_child' for main:Object (NameError)
a.rb:29
a.rb:14:in `child'
a.rb:4:in `child'
a.rb:29
"but the parent lives"

with all the same memory preserving side effects.
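
the same idea can also be sketched with a plain fork and a single pipe instead of IO.popen('-'). this is a hypothetical variant, not the code above: the name run_in_child and the [:ok, value] / [:error, exception] tagging are assumptions made for the example, and it only works for results and exceptions that Marshal can dump.

```ruby
# Hypothetical variant (run_in_child is an illustrative name): run the block
# in a forked child and send either its result or its exception back to the
# parent over one pipe, tagged so the parent can tell them apart.
def run_in_child
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = Process.fork do
    reader.close
    payload = begin
      Marshal.dump([:ok, yield])
    rescue Exception => e
      Marshal.dump([:error, e])
    end
    writer.write(payload)
    writer.close
    exit!(0)
  end

  writer.close
  data = reader.read # read before waiting, so a large result cannot fill the pipe
  Process.waitpid(pid)

  tag, value = Marshal.load(data)
  raise value if tag == :error
  value
ensure
  reader.close if reader && !reader.closed?
end

p run_in_child { 'value from child' }
```

an exception raised inside the block is re-raised in the parent, so it can be rescued there just like in the popen version.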

regards.

-a

Joel VanderWerf · Mar 29, 2006

John said:
Some old Pascal implementations had (and I think some still do) a
facility to "mark" the heap, and then at some point "release" all
items allocated after that mark.

Here is a nifty way of doing the same (and more!) in ruby....

==========================try.rb======================================
pid = Process.fork do
# Load any modules we need
require 'find'

a = 'x' * 100*1024*1024

end

pid, result = Process.waitpid2( pid)
======================================================================

If possible, disable GC in the fork. That can greatly reduce memory
usage because the GC mark algorithm has to touch every reachable block
of allocated heap memory. So the memory manager has to copy most of the
original process anyway--the COW advantage is lost. This is especially
true if the parent process has a lot of objects. Example:

a = (1..2_000_000).map {[]} # emulate a big ObjectSpace

10.times do
  pid = fork do
    GC.disable if ARGV[0] == "nogc"
    a = 'x' * 10*1024*1024 # trigger GC, if enabled
    puts `free`[/Swap.*/]
  end
end

Process.waitall

$ time ruby fork-gc.rb nogc
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
Swap: 489940 137340 352600
ruby fork-gc.rb nogc 5.29s user 0.62s system 97% cpu 6.049 total
$ time ruby fork-gc.rb
Swap: 489940 326976 162964
Swap: 489940 327100 162840
Swap: 489940 327336 162604
Swap: 489940 330228 159712
Swap: 489940 334664 155276
Swap: 489940 330456 159484
Swap: 489940 329060 160880
Swap: 489940 328124 161816
Swap: 489940 327148 162792
Swap: 489940 327072 162868
ruby fork-gc.rb 8.82s user 2.97s system 28% cpu 40.712 total

Note the big increase in swap used (second column of numbers).

** Caution: on my 512MB system this can thrash for a while. If you have
less memory, change the parameters.
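
A rough way to check that the collector really stays idle in the child is to compare GC.count before and after the work. The sketch below is only an illustration: fork_without_gc is a made-up name, and the count is passed back as plain text over a pipe. Also note that Ruby 2.0 and later keep the GC mark bits in a separate bitmap rather than inside the objects, so the copy-on-write penalty described above may be much smaller on current interpreters; it is worth measuring on your own version.

```ruby
# Hypothetical helper (fork_without_gc is an illustrative name): run the
# block in a forked child with the collector disabled, and report how many
# GC runs happened in the child while the block executed.
def fork_without_gc
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    GC.disable          # no mark phase, so shared pages are not touched by GC
    before = GC.count
    yield
    writer.write((GC.count - before).to_s)
    writer.close
    exit!(0)
  end
  writer.close
  gc_runs = Integer(reader.read)
  Process.waitpid(pid)
  gc_runs
ensure
  reader.close if reader && !reader.closed?
end

gc_runs = fork_without_gc { Array.new(200_000) { [] } }
puts "GC runs in child: #{gc_runs}"
```

Because the child exits as soon as the block is done, the garbage it never collected is reclaimed by the OS along with everything else.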
