[BUG] Segmentation fault with a threads/forks script

Lucas Nussbaum

Hi,

I experience a reproducible crash (every time) with prunner.rb at
http://blop.info/bazaar/prunner.rb. The script starts several commands
at the same time. When running with a large number of commands, it
exits with:

./prunner.rb:51: [BUG] Segmentation fault
ruby 1.8.2 (2005-04-11) [i386-linux]

Aborted

To reproduce, create a large file with one command per line:
for i in $(seq 1 2000); do echo hostname; done > cmds
Then run prunner.rb like this:
cat cmds | head -n 1500 | ./prunner.rb
Increase 1500 if it doesn't crash for you.

Of course, I expect it to fail at some point, but it could probably
fail more cleanly.

Can somebody confirm the bug? Or better, fix it? :)
 
Lucas Nussbaum

With Ubuntu Breezy's ruby 1.9 package (version 1.9.0+20050623-2), it
crashes with:

*** glibc detected *** free(): invalid pointer: 0x08a9db38 ***
Aborted
 
Ara.T.Howard

seems to work on 1.8.2 for values around 1000:

[ahoward@localhost ~]$ wget http://blop.info/bazaar/prunner.rb
--07:25:29-- http://blop.info/bazaar/prunner.rb
=> `prunner.rb'
Resolving blop.info... 85.68.8.93
Connecting to blop.info[85.68.8.93]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,125 [text/plain]

100%[======================================================================================================================================>] 2,125 --.--K/s

07:25:29 (42.22 MB/s) - `prunner.rb' saved [2,125/2,125]

[ahoward@localhost ~]$ for i in $(seq 1 2000);do echo 'date'; done > cmds
[ahoward@localhost ~]$ wc -l cmds
2000 cmds
[ahoward@localhost ~]$ ruby prunner.rb < cmds >/dev/null
prunner.rb:47:in `popen': Too many open files - date 2>&1 (Errno::EMFILE)
from prunner.rb:47
from prunner.rb:47:in `each'
from prunner.rb:47
[ahoward@localhost ~]$ head -1000 cmds |ruby prunner.rb >/dev/null
[ahoward@localhost ~]$ echo $?
0
[ahoward@localhost ~]$ ruby -v
ruby 1.8.2 (2005-02-12) [i686-linux]
[ahoward@localhost ~]$ uname -srm
Linux 2.6.12-1.1372_FC3 i686
[ahoward@localhost ~]$ cat /etc/redhat-release
Fedora Core release 3 (Heidelberg)


and on 1.9:

harp:~ > wget http://blop.info/bazaar/prunner.rb
--07:42:37-- http://blop.info/bazaar/prunner.rb
=> `prunner.rb.1'
Resolving blop.info... 85.68.8.93
Connecting to blop.info[85.68.8.93]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2,125 [text/plain]

100%[======================================================================================================================================>] 2,125 --.--K/s

07:42:37 (343.01 KB/s) - `prunner.rb.1' saved [2,125/2,125]

harp:~ > for i in $(seq 1 2000);do echo 'date'; done > cmds
harp:~ > wc -l cmds
2000 cmds
harp:~ > ruby prunner.rb < cmds >/dev/null
prunner.rb:47:in `popen': Too many open files - date 2>&1 (Errno::EMFILE)
from prunner.rb:47
from prunner.rb:47:in `each'
from prunner.rb:47
harp:~ > head -1000 cmds|ruby prunner.rb >/dev/null
harp:~ > echo $?
0
harp:~ > ruby -v
ruby 1.9.0 (2005-05-16) [i686-linux]
harp:~ > uname -srm
Linux 2.4.21-32.0.1.EL i686
harp:~ > cat /etc/redhat-release
Red Hat Enterprise Linux WS release 3 (Taroon Update 5)


did you compile ruby yourself or use some installer/package-manager?

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================
 
Lucas Nussbaum

> seems to work on 1.8.2 for values around 1000:
>
> [...]
>
> did you compile ruby yourself or use some installer/package-manager?

I tested using Debian's and Ubuntu's packages.
What if you run ulimit -n 16384 first?
Does it still work?
It worked for me with values around 1000 too, but after increasing the
ulimit for open files, it started crashing.
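
For readers who want to check the limit in question from Ruby itself, here is a minimal sketch. It assumes a Ruby that provides Process.getrlimit, and it assumes that select's fd_set holds 1024 descriptors, which is typical on Linux with glibc. The constant ASSUMED_FD_SETSIZE and the helper fds_can_exceed_fd_setsize? are illustrative names, not part of prunner.rb or of Ruby.

```ruby
# Assumed value of FD_SETSIZE (typical on Linux/glibc); Ruby does not expose it.
ASSUMED_FD_SETSIZE = 1024

# Returns true when the soft limit on open files lets a process obtain
# descriptors numbered at or above the assumed fd_set capacity.
def fds_can_exceed_fd_setsize?(soft_limit, fd_setsize = ASSUMED_FD_SETSIZE)
  soft_limit > fd_setsize
end

soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "soft limit: #{soft}, hard limit: #{hard}"
puts "descriptors above #{ASSUMED_FD_SETSIZE} possible: #{fds_can_exceed_fd_setsize?(soft)}"
```

With the default limit of 1024 the script in this thread stops with Errno::EMFILE before any descriptor can outgrow the set; after ulimit -n 16384 that guard is gone, which matches the point at which the crash appears.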
 
Ara.T.Howard

o.k. - now it crashed on 1.8.2. i can't ulimit on the 1.9 box.

-a
 
nobu.nokada

Hi,

At Thu, 21 Jul 2005 23:09:51 +0900,
Lucas Nussbaum wrote in [ruby-talk:149072]:
> I tested using Debian's and Ubuntu's packages.
> What if you ulimit -n 16384 first ?
> Does it still work ?
> It worked for me with values around 1000 too. But after increasing the
> ulimit for open files, it started crashing.

I think it has been fixed in CVS trunk.

Fri Jun  3 23:23:02 2005  Nobuyoshi Nakada

    * intern.h (rb_fdset_t): deal with fd bit sets over FD_SETSIZE.
      fixed: [ruby-dev:26187]
 
Lucas Nussbaum

Could somebody confirm? The Debian/Ubuntu package is versioned
1.9.0+20050623-2, so it *might* be based on the ruby 1.9 CVS after that
bug was fixed, and it still crashes for me. (Is the CVS HEAD ruby 1.9?
I'm not familiar with Ruby development.)
 
Ara.T.Howard

looks like you can work around it by just closing everything as you use it:

harp:~ > curl http://fortytwo.merseine.nu/prunner.rb > prunner.rb

harp:~ > for i in $(seq 1 2000);do echo 'date'; done > cmds

harp:~ > wc -l cmds
2000 cmds

harp:~ > ruby prunner.rb < cmds > /dev/null

harp:~ > echo $?
0

harp:~ > ls prunner.*out* | wc -l
2000

harp:~ > cat prunner.out-.0
# 0 : date
Thu Jul 21 10:49:07 MDT 2005

harp:~ > cat prunner.out-.1999
# 1999 : date
Thu Jul 21 10:49:21 MDT 2005

(prunner.rb inlined below)

hth.

-a
===============================================================================
file: prunner.rb
===============================================================================
#!/usr/bin/ruby -w

require 'optparse'
require 'thread'

#
# prunner : read commands from stdin, and execute each of them in parallel
#

class Hash
  # look up k as given, then as a string, then as a symbol
  def getopt k, default = nil
    return self[k] if self.has_key? k
    k = "#{ k }"
    return self[k] if self.has_key? k
    k = k.intern
    return self[k] if self.has_key? k
    return default
  end
end

class Command
  class << self
    # hand out sequential command ids, starting at 0
    def gen_cid
      @cid = defined?(@cid) ? (@cid + 1) : 0
    end
  end

  attr_accessor :command, :cid, :prefix, :path, :exit_status

  def initialize command, opts = {}
    @command = command.strip
    @cid = self.class.gen_cid
    @prefix = opts.getopt 'prefix', "#{ $$ }_command.out-"
    @path = "#{ @prefix }.#{ @cid }"
    @header = opts.getopt 'header'
    @mutex = Mutex::new
    @lines = []
    @update_idx = 0
    @thread = nil
    @exit_status = -1
  end
  def start
    @thread =
      Thread::new(@command, Thread::current) do |cmd, cur|
        begin
          # the block forms close the pipe and the file as soon as they finish
          IO::popen("{ #{ cmd } ;} 2>&1") do |pipe|
            File::open(@path, 'w') do |f|
              f << self if @header
              while((line = pipe.gets))
                synchronize{ @lines << line }
                f << line
              end
            end
          end
          @exit_status = $?.exitstatus
        rescue Exception => e
          cur.raise e
        end
      end
  end
  def synchronize(*a, &b)
    @mutex.synchronize(*a, &b)
  end
  def join(*a, &b)
    @thread.join(*a, &b)
  end
  # return the lines collected since the last call
  def update
    report = nil
    synchronize do
      report = @lines[@update_idx .. -1]
      @update_idx = @lines.size
    end
    report
  end
  def update?
    @update_idx < @lines.size
  end
  def to_s
    "# #{ @cid } : #{ @command }\n"
  end
  alias label to_s
end

class Main
  def initialize env = ENV.to_hash, argv = ARGV.clone
    @env, @argv = env, argv
    @cmds = []
    @header = true
    @verbose = true
    @interval = 1
    @prefix = 'prunner.out-'
    @viewthread = nil
    parse_options
  end
  def parse_options
    OptionParser::new do |opts|
      opts.banner = "echo command | prunner.rb [options]"
      opts.separator ''
      opts.on('-h', '--suppress-header', 'suppress header in output files'){
        @header = false
      }
      opts.on('-q', '--quiet', 'run quietly'){
        @verbose = false
      }
      # the switch needs an argument placeholder for the block to receive a value
      opts.on('-i', '--interval SECONDS', 'output interval'){|i|
        @interval = Float(i)
      }
      opts.on('-p', '--prefix PREFIX', "prefix for output files (default #{ @prefix })"){|p|
        @prefix = p
      }
      # -h is already taken by --suppress-header, so help is long-form only
      opts.on_tail('--help', 'Show this message') {
        puts opts
        exit
      }
      opts.parse!(@argv)
    end
  end
  def main
    STDIN.each do |line|
      line.strip!
      next if line.empty?
      c =
        Command::new(line,
          :verbose => @verbose,
          :header => @header,
          :prefix => @prefix
        )
      c.start
      @cmds << c
    end

    if @verbose
      @viewthread =
        Thread::new do
          loop do
            reports = @cmds.map{|c| [c.label, c.update] if c.update?}.compact
            exit if reports.empty?
            reports.each do |label, report|
              print label
              report.each{|line| print line}
            end
            sleep @interval
          end
        end
    end

    @cmds.each{|c| c.join}
    @viewthread.join if @viewthread
    exit
  end
end

if $0 == __FILE__
  STDOUT.sync = true
  Main::new(ENV, ARGV).main
end
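
The script above still starts one thread and one pipe per input line, so the number of descriptors open at once depends on how quickly the commands finish. A different technique, not the one used in the script, is to cap the number of pipes with a small worker pool. The sketch below is only an illustration under that assumption; run_bounded and its workers parameter are hypothetical names.

```ruby
require 'thread'

# Hypothetical helper (not part of prunner.rb): run each command through
# IO.popen's block form, with at most `workers` pipes open at once.
def run_bounded(commands, workers = 4)
  queue = Queue.new
  commands.each_with_index { |cmd, i| queue << [i, cmd] }
  results = Array.new(commands.size)

  pool = Array.new(workers) do
    Thread.new do
      loop do
        begin
          # Non-blocking pop raises ThreadError once the queue is empty.
          i, cmd = queue.pop(true)
        rescue ThreadError
          break
        end
        # The block form closes the pipe as soon as the block returns.
        results[i] = IO.popen("{ #{cmd} ;} 2>&1") { |pipe| pipe.read }
      end
    end
  end

  pool.each { |t| t.join }
  results
end
```

Calling run_bounded(['date'] * 2000, 16) would keep roughly 16 pipes open at any moment, well below the default descriptor limit, at the cost of less parallelism than the original script.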
 
nobuyoshi nakada

Hi,

At Thu, 21 Jul 2005 23:45:34 +0900,
Lucas Nussbaum wrote in [ruby-talk:149077]:
> Could somebody confirm ? The Debian/Ubuntu package is versioned
> 1.9.0+20050623-2, so it *might* be based on the ruby 1.9 CVS after that
> bug was fixed, and it still crashes for me. (is the CVS HEAD ruby1.9 ?
> I'm not familiar with Ruby development)

It seems to have new related bug(s), I'll investigate it more.
 
Tanaka Akira

nobuyoshi nakada said:
> It seems to have new related bug(s), I'll investigate it more.

All three fd_sets must be long enough for select.

Index: eval.c
===================================================================
RCS file: /src/ruby/eval.c,v
retrieving revision 1.803
diff -u -r1.803 eval.c
--- eval.c 19 Jul 2005 14:57:47 -0000 1.803
+++ eval.c 22 Jul 2005 15:49:31 -0000
@@ -9880,10 +9880,8 @@
     }
 }

-void
-rb_fd_set(n, fds)
-    int n;
-    rb_fdset_t *fds;
+static void
+rb_fd_resize(int n, rb_fdset_t *fds)
 {
     int m = howmany(n + 1, NFDBITS) * sizeof(fd_mask);
     int o = howmany(fds->maxfd, NFDBITS) * sizeof(fd_mask);
@@ -9896,6 +9894,14 @@
         memset((char *)fds->fdset + o, 0, m - o);
     }
     if (n >= fds->maxfd) fds->maxfd = n + 1;
+}
+
+void
+rb_fd_set(n, fds)
+    int n;
+    rb_fdset_t *fds;
+{
+    rb_fd_resize(n, fds);
     FD_SET(n, fds->fdset);
 }

@@ -9931,6 +9937,15 @@
     memcpy(dst->fdset, src, size);
 }

+int
+rb_fd_select(int n, rb_fdset_t *readfds, rb_fdset_t *writefds, rb_fdset_t *exceptfds, struct timeval *timeout)
+{
+    rb_fd_resize(n-1, readfds);
+    rb_fd_resize(n-1, writefds);
+    rb_fd_resize(n-1, exceptfds);
+    return select(n, rb_fd_ptr(readfds), rb_fd_ptr(writefds), rb_fd_ptr(exceptfds), timeout);
+}
+
 #undef FD_ZERO
 #undef FD_SET
 #undef FD_CLR
@@ -10795,7 +10810,7 @@
             delay_ptr = &delay_tv;
         }

-        n = select(max+1, rb_fd_ptr(&readfds), rb_fd_ptr(&writefds), rb_fd_ptr(&exceptfds), delay_ptr);
+        n = rb_fd_select(max+1, &readfds, &writefds, &exceptfds, delay_ptr);
         if (n < 0) {
             int e = errno;

Index: intern.h
===================================================================
RCS file: /src/ruby/intern.h,v
retrieving revision 1.172
diff -u -r1.172 intern.h
--- intern.h 14 Jul 2005 15:11:52 -0000 1.172
+++ intern.h 22 Jul 2005 15:49:32 -0000
@@ -162,6 +162,7 @@
 void rb_fd_clr _((int, rb_fdset_t *));
 int rb_fd_isset _((int, const rb_fdset_t *));
 void rb_fd_copy _((rb_fdset_t *, const fd_set *, int));
+int rb_fd_select(int, rb_fdset_t *, rb_fdset_t *, rb_fdset_t *, struct timeval *);

 #define rb_fd_ptr(f) ((f)->fdset)
 #define rb_fd_max(f) ((f)->maxfd)
@@ -178,6 +179,7 @@
 #define rb_fd_init(f) FD_ZERO(f)
 #define rb_fd_term(f) (f)
 #define rb_fd_max(f) FD_SETSIZE
+#define rb_fd_select(n, rfds, wfds, efds, timeout) select(n, rfds, wfds, efds, timeout)

 #endif
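
The sizing arithmetic in rb_fd_resize can be checked by hand. The following Ruby sketch mirrors the howmany(n + 1, NFDBITS) * sizeof(fd_mask) expression from the patch. It assumes an 8-byte fd_mask (so NFDBITS is 64), which is an assumption about the platform and not something stated in the thread; on the i386 systems discussed here fd_mask would more likely be 4 bytes. FD_MASK_BYTES and fd_set_bytes are illustrative names.

```ruby
# Assumed platform constants: an 8-byte fd_mask gives 64 bits per mask word.
FD_MASK_BYTES = 8
NFDBITS = FD_MASK_BYTES * 8

# Same rounding-up division as the C howmany() macro.
def howmany(x, y)
  (x + y - 1) / y
end

# Bytes a bit set needs in order to hold descriptor number n.
def fd_set_bytes(n)
  howmany(n + 1, NFDBITS) * FD_MASK_BYTES
end

[1023, 1024, 1500].each do |fd|
  puts "fd #{fd}: #{fd_set_bytes(fd)} bytes"
end
```

Because select(n, ...) examines bits 0 through n-1 in every set it is given, a set that was only grown for its own highest descriptor can be shorter than what select reads or writes. That is why the patch resizes all three sets to n-1 before the call.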
 
nobu.nokada

Hi,

At Sat, 23 Jul 2005 00:55:09 +0900,
Tanaka Akira wrote in [ruby-talk:149199]:
> All three fd_sets must be long enough for select.

What about making them one struct? I guess it would be nice
also for absorbing difference between select and poll.
 
Tanaka Akira

>> All three fd_sets must be long enough for select.
>
> What about making them one struct? I guess it would be nice
> also for absorbing difference between select and poll.

It may be a step toward supporting kqueue, epoll, /dev/poll, etc.

I feel it's a good idea.
 
