WEBrick deadlock under Cygwin

Mark Probert

Hi, all.

I am using IOWA / WEBrick on Cygwin (DLL version: 1.5.10) under Win2k and
I have just started experiencing some problems (hung sessions, pages not
displaying, that kind of thing).

One of the sessions crashed with the following:


[2004-08-17 12:28:14] INFO WEBrick 1.3.1
[2004-08-17 12:28:14] INFO ruby 1.8.1 (2003-12-25) [i386-cygwin]
[2004-08-17 12:28:14] INFO Iowa::HTTPServer#start: pid=2436 port=8080
probertm-3.corp.nortel.com - - [17/Aug/2004:12:28:19 GMT-8:00] "GET / HTTP/1.1" 200 1656
- -> /
deadlock 0x103c7c68: sleep:S - /usr/lib/ruby/1.8/webrick/httpresponse.rb:303
deadlock 0x103cde00: sleep:S - /usr/lib/ruby/1.8/webrick/httpresponse.rb:303
deadlock 0x100f5a18: sleep:F(55) (main) - /usr/lib/ruby/1.8/webrick/server.rb:91
[2004-08-17 12:29:38] ERROR fatal: Thread(0x100f5a18): deadlock
        /usr/lib/ruby/1.8/webrick/server.rb:91:in `accept'


Any ideas?

-mark.
 
Kirk Haines

Hmm. That is interesting. Is it something that only happens sporadically?
Is it load related? i.e. if you go and use a load testing tool to hammer
the snot out of it, can you cause it to happen?

I don't have the Cygwin Ruby. I'm running the one click installer Ruby on a
standard XP dist, and I haven't seen that, but I'd love to try to reproduce
it. I'm going to go start a process to just hammer the crap out of it so
that I have a lot of threads going and see if I can get that to happen.
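
For anyone who wants to try the same kind of load test, here is a minimal sketch of one way to do it. It assumes a current Ruby with net/http rather than the 1.8.1 from the log, and `hammer` is a hypothetical helper name, not part of IOWA or WEBrick. It fires a batch of simultaneous GET requests and reports, per request, either the HTTP status code or the class of the exception that was raised.

```ruby
require 'net/http'

# Hypothetical helper: issue `concurrency` simultaneous GET requests and
# return one entry per request. Each entry is the HTTP status code as a
# string, or the exception class name if the request failed.
def hammer(host, port, path = '/', concurrency = 10)
  threads = Array.new(concurrency) do
    Thread.new do
      begin
        # Short timeouts so a hung server shows up as an error
        # instead of blocking the test forever.
        Net::HTTP.start(host, port, open_timeout: 5, read_timeout: 5) do |http|
          http.get(path).code
        end
      rescue StandardError => e
        e.class.name
      end
    end
  end
  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value)
end
```

Calling something like `hammer('localhost', 8080, '/', 10)` against the server from the log and counting the distinct results should show whether failures start to appear as the concurrency goes up. The host, port and path here are placeholders.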


Kirk
 
Florian Gross

Kirk said:
I don't have the Cygwin Ruby. I'm running the one click installer Ruby on a
standard XP dist, and I haven't seen that, but I'd love to try to reproduce
it. I'm going to go start a process to just hammer the crap out of it so
that I have a lot of threads going and see if I can get that to happen.

I was once able to produce a deadlock using Sockets directly with the
one click installer (on Ruby 1.8.2), but it wasn't reproducible.

Regards,
Florian Gross
 
Kirk Haines

Florian Gross wrote:
I was once able to produce a deadlock using Sockets directly with
the one click installer (on Ruby 1.8.2), but it wasn't reproducible.

So, that would indicate that it is a problem with sockets on Windows? I
have not been able to reproduce the original error, but I have found that if
I have more than 5 concurrent requests, the code fails. It looks like the
first socket opened in the concurrent group gets closed, and that causes
WEBrick to throw an exception: an EINVAL from line 303 of httpresponse.rb.


Kirk Haines
 
Florian Gross

Kirk said:
So, that would indicate that it is a problem with sockets on Windows? I
have not been able to reproduce the original error, but I have found that if
I have more than 5 concurrent requests, the code fails. It looks like the
first socket opened in the concurrent group gets closed, and that causes
WEBrick to throw an exception: an EINVAL from line 303 of httpresponse.rb.

Hm, an EINVAL exception shouldn't cause Thread deadlocks AFAIK. (Thread
deadlocks are caused by the fatal exception, I think.)

But I think it is related to the implementation of socket.so on Windows.

Regards,
Florian Gross
 
Kirk Haines

On Wed, 18 Aug 2004 08:10:56 +0900, Florian Gross wrote
Hm, an EINVAL exception shouldn't cause Thread deadlocks AFAIK.
(Thread deadlocks are caused by the fatal exception, I think.)

I should have been clearer. I have not been able to get a thread deadlock
to occur. So far the only error I have been able to trigger is the EINVAL
error that occurs when the socket goes away before WEBrick can return a
response. It's not a fatal error. It is simply caught and written to the
logs, but it does cause a connection to effectively be dropped.
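
To illustrate the "caught and logged, connection dropped" behaviour in general terms, here is a rough sketch. It is not WEBrick's actual code; `safe_write` is a hypothetical helper written for a current Ruby. It treats any system-call error on write (Errno::EINVAL is one of them) or a closed stream as a dropped connection rather than a fatal error.

```ruby
# Hypothetical helper, not part of WEBrick: write `data` to `io` and
# report failure instead of letting the exception propagate.
def safe_write(io, data, log = $stderr)
  io.write(data)
  true
rescue SystemCallError, IOError => e
  # SystemCallError is the parent of the Errno::* classes (EINVAL, EPIPE, ...);
  # IOError covers writing to a stream that has already been closed.
  log.puts "write failed: #{e.class}: #{e.message}"
  false
end
```

A caller that gets `false` back can simply give up on that one response while the server keeps running, which matches the symptom described here: no crash, but the client never receives its page.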


Kirk
 
Eric Hodel

Kirk said:
I should have been clearer. I have not been able to get a thread
deadlock to occur. So far the only error I have been able to cause to
occur is the EINVAL error that occurs when the socket goes away before
WEBrick can return a response. It's not a fatal error. It is simply
caught and delivered to the logs, but it does cause a connection to
effectively be dropped.

The right (wrong) sequence of locking can easily create a deadlock in
the right circumstances. I can generate a deadlock easily in Borges due
to an incorrect sequence of locks after an exception kills a thread.
(The exception is only important because it allows the threads to
deadlock.)

Here's a really simple example:

$ cat x.rb
Thread.start do Thread.stop end

Thread.stop

$ ruby x.rb
deadlock 0x806ccb0: sleep:- - x.rb:1
deadlock 0x80788f8: sleep:- (main) - x.rb:3
x.rb:3:in `stop': Thread(0x80788f8): deadlock (fatal)
from x.rb:3

(Mutex#lock calls Thread.stop)
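
As a hedged sketch of the usual way to keep an exception from leaving a lock in a bad state (written for a current Ruby, with hypothetical names, and not taken from Borges or WEBrick): take the lock with Mutex#synchronize, which releases it even when the block raises.

```ruby
# Hypothetical demo method: a worker thread raises while holding a lock,
# then the calling thread checks that it can still acquire that lock.
def lock_survives_worker_failure?
  lock = Mutex.new

  worker = Thread.new do
    Thread.current.report_on_exception = false # keep stderr quiet for the demo
    # synchronize releases the lock when the block exits, including
    # when it exits by raising.
    lock.synchronize { raise 'simulated failure while holding the lock' }
  end

  begin
    worker.join # re-raises the worker's exception in the calling thread
  rescue RuntimeError
    # expected: the worker died on purpose
  end

  acquired = lock.try_lock # non-blocking, so the demo itself cannot hang
  lock.unlock if acquired
  acquired
end
```

If the method returns true, the worker's death did not leave the lock held, so other threads waiting on it are not stuck asleep with nobody left to wake them.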

-- 
Eric Hodel - (e-mail address removed) - http://segment7.net
All messages signed with fingerprint:
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04


 
Mark Probert

Kirk Haines said:
Hmm. That is interesting. Is it something that only happens
sporadically? Is it load related? i.e. if you go and use a load
testing tool to hammer the snot out of it, can you cause it to happen?

It seems to be sporadic and not load related. Perhaps related to changes
in the underlying pages, though I am not sure of that, either.

Everything seems to be working okay, then a previously working page
request, say to the main index page, will hang. Killing the server with
a ^C then results in the following:

...
http://localhost:8080/admin/nodelist.html/5c81c3bb-369ffdff-abduZYCN54vqQ.e.1.7
-> /admin/nodelist.html/5c81c3bb-369ffdff-abduZYCN54vqQ.f.1.9
[2004-08-17 18:46:10] INFO going to shutdown ...
deadlock 0x103d60b0: sleep:S - /usr/lib/ruby/1.8/webrick/httpresponse.rb:303
deadlock 0x103f7380: sleep:S - /usr/lib/ruby/1.8/webrick/httpresponse.rb:303
deadlock 0x10400428: sleep:S - /usr/lib/ruby/1.8/webrick/httpresponse.rb:303
deadlock 0x100f5a00: sleep:J(0x103d60b0) (main) - /usr/lib/ruby/1.8/webrick/server.rb:112
/usr/lib/ruby/1.8/webrick/server.rb:112:in `join': Thread(0x100f5a00): deadlock (fatal)
        from /usr/lib/ruby/1.8/webrick/server.rb:112:in `start'
        from /usr/lib/ruby/1.8/webrick/server.rb:112:in `each'
        from /usr/lib/ruby/1.8/webrick/server.rb:112:in `start'
        from /usr/lib/ruby/1.8/webrick/server.rb:79:in `start'
        from /usr/lib/ruby/1.8/webrick/server.rb:79:in `start'
        from ./hcweb.rb:78

I have not tried this under Win32. Perhaps it is a Cygwin / Win32
threading issue.

-mark.
 
Florian Gross

Eric said:
The right (wrong) sequence of locking can easily create a deadlock in
the right circumstances. I can generate a deadlock easily in Borges due
to an incorrect sequence of locks after an exception kills a thread.
(The exception is only important because it allows the threads to
deadlock.)

But isn't this considered a Ruby bug?

Regards,
Florian Gross
 
Eric Hodel

Florian said:
But isn't this considered a Ruby bug?

No, this is definitely my bug. If a developer can't synchronize threads
properly, how can Ruby determine the correct way?

 
Florian Gross

Eric said:
No, this is definitely my bug. If a developer can't synchronize threads
properly, how can Ruby determine the correct way?

I just think that a developer shouldn't be able to raise the fatal
exception which leads to deadlocks. Maybe there should be better
checking in Thread#stop?

Regards,
Florian Gross
 
Eric Hodel

Florian said:
I just think that a developer shouldn't be able to raise the fatal
exception which leads to deadlocks. Maybe there should be better
checking in Thread#stop?

In a non-threaded app a developer has the ability to raise a fatal
exception that leads to early termination.

 
