[ANN] Mongrel 0.1.1 -- A Fast Ruby Web Server (It Works Now, Maybe)

gabriele renzi

PA wrote:
Being a sucker for meaningless benchmarks I had to run this as well :))

<snip>
Hey, that was cool. Any chance you could see how they would run with -c 10?
(and I wonder how fast twisted.web would be :)
 
Jonathan Leighton

Noob question here.

No intent to impugn Zed's mad skilz or the need for something like
Mongrel. I'm just confused by why it would be common to develop or
deploy a ruby on rails app with something other than production servers
like Apache.

So far, all of the rails demos I have seen are using webrick. This has
been true even for setups like macosx that come with apache already set
up and running.

Does apache not come standard with everything needed to serve a rails
app? If not, is there an add-on module for apache that makes it
rails-savvy?

Or is it the case that all rails apps have to be served by a special
rails server like mongrel or webrick?

Rails can be run through flat CGI with Apache, but that's really slow
because *every single time* you make a request, the code has to be
reloaded into memory from disk. The step up from this is something like
FastCGI or SCGI, which keep the code in memory between requests.
The performance is then WAY better, but in development, changes you make
to the code won't take effect until you reload the server. Obviously that's
no good, so WEBrick is a lightweight server intended for development use,
which will just reload the parts of the code you change between
requests. It's not recommended for deployment, though, because it isn't
fast enough.

This is a very good article to read:
http://duncandavidson.com/essay/2005/12/railsdeployment

Hope that helps

Jon
 
Zed Shaw

Noob question here.
I like noobs. Especially with BBQ sauce. :)
No intent to impugn Zed's mad skilz or the need for something like
Mongrel. I'm just confused by why it would be common to develop or
deploy a ruby on rails app with something other than production servers
like Apache.
Good question. It really comes down to nothing more than the fastest,
simplest way to serve up a Rails (or Nitro, Camping, IOWA, etc.)
application. You've currently got various options:

* CGI -- slow, resource hogging, but works everywhere.
* FastCGI -- Fast, current best practice, a pain in the ass to
install and real painful for win32 people.
* SCGI -- Fast, pure ruby (runs everywhere Ruby does), works with a
few servers, very simple to install, use, and cluster, good
monitoring (warning, I wrote this).
* mod_ruby -- Works but haven't heard of a lot of success with it,
couples your app to your web server making upgrades difficult.
* WEBrick -- Runs in pure ruby, easy to deploy, you can put it behind
any web server supporting something like mod_proxy. Fairly slow.

Now, the sweet spot would be something that was kind of at the
optimal axis of FastCGI, SCGI, and WEBrick:

* Runs everywhere Ruby does and is easy to install and use.
* Fast as hell with very little overhead above the web app framework.
* Uses plain HTTP so that it can sit behind anything that can proxy
HTTP: apache, lighttpd, IIS, squid. A huge number of deployment
options open up.
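As a sketch of what "sits behind anything that can proxy HTTP" looks like in practice, here is a hypothetical Apache mod_proxy fragment forwarding a path to a Ruby backend process. The path and port are illustrative, not from the thread:

```apache
# Hypothetical Apache 2.x fragment; /app and localhost:3000 are
# placeholders for wherever the Ruby backend is actually mounted.
ProxyPass        /app http://localhost:3000/app
ProxyPassReverse /app http://localhost:3000/app
```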

This would be where I'm trying to place Mongrel. It's not intended
as a replacement for a full web server like Apache, but rather just
enough web server to run the app frameworks efficiently as backend
processes. Based on my work with SCGI (which will inherit some stuff
from Mongrel soon), it will hopefully fill a niche that's not being
filled right now by the current options.
So far, all of the rails demos I have seen are using webrick. This has
been true even for setups like macosx that come with apache already
set up and running.

Does apache not come standard with everything needed to serve a rails
app? If not, is there an add-on module for apache that makes it
rails-savvy?
Apache or lighttpd are the big ones on Unix systems. When you get
over to the win32 camp, though, lighttpd just doesn't work, and many
people insist on using IIS. In my own experience, if you can't hook
it into a portal or Apache without installing any software then
you're dead. Sure, this is probably an attempt to stop a disruptive
technology, but if there's a solid, fast way to deploy using HTTP then
that's one more chink in the armor sealed up.

Go talk to someone who's forced to use IIS and you'll see why something
other than WEBrick is really needed. Actually, WEBrick would be fine
if it weren't so damn slow.
Or is it the case that all rails apps have to be served by a special
rails server like mongrel or webrick?
Well, they have to be served by something running Ruby. I know
there are people who have tried with mod_ruby, but I haven't heard of a
lot of success. I could be wrong on that. Also, many people don't
like tightly coupling their applications into their web server.
 
Amr Malik

Zed said:
On Jan 22, 2006, at 12:42 PM, Jeff Pritchard wrote:
snip..
Go talk to someone who's forced to use IIS and you'll see why something
other than WEBrick is really needed. Actually, WEBrick would be fine
if it weren't so damn slow.
snip..

Thanks for your work on this. Can you elaborate on what makes Mongrel so
much faster than Webrick? What kind of optimization techniques did you
use to make it faster? Are you using C extensions, etc., in part to speed
things up? (I guess I'm looking for a bit of an architectural overview,
with a Webrick arch. comparison to boot if you used that as inspiration.)

Just curious! :)

-Amr
 
Sascha Ebach

Thanks for your work on this. Can you elaborate on what makes Mongrel so
much faster than Webrick? What kind of optimization techniques did you
use to make it faster? Are you using C extensions, etc., in part to speed
things up? (I guess I'm looking for a bit of an architectural overview,
with a Webrick arch. comparison to boot if you used that as inspiration.)

Just curious! :)

Yeah, me too. What I wonder about specifically is why not just rewrite the
performance-critical parts of webrick in C. That way you would already have
the massive amount of features webrick offers without having to duplicate
all of this. I wonder if you have been thinking about that and the reason
you might have decided against doing it this way.

Just curious, too! :)

-Sascha
 
Zed Shaw

Thanks for your work on this. Can you elaborate on what makes Mongrel so
much faster than Webrick? What kind of optimization techniques did you
use to make it faster? Are you using C extensions, etc., in part to speed
things up? (I guess I'm looking for a bit of an architectural overview,
with a Webrick arch. comparison to boot if you used that as inspiration.)

You're going to laugh but right now it's down to a bit of Ruby and a
nifty C extension. Seriously. No need yet of much more than some
threads that crank on output, a parser (in C) that makes a hash, and
a way to quickly lookup URI mappings. The rest is done with handlers
that process the result of this. It may get a bit larger than this,
but this core will probably be more than enough to at least service
basic requests. I'm currently testing out a way to drop the threads
in favor of IO.select, but it looks like that messes with threads in
some weird ways.

Once I figure out all the nooks and crannies of the thing then I'll
do a more formal design, but even then it's going to be ruthlessly
simplistic.

Zed A. Shaw
http://www.zedshaw.com/
 
Zed Shaw

Yeah, me too. What I wonder about specifically is why not just
rewrite the performance-critical parts of webrick in C. That way
you would already have the massive amount of features webrick
offers without having to duplicate all of this. I wonder if you
have been thinking about that and the reason you might have decided
against doing it this way.

Well, this may be mean, but have you ever considered that "the
massive amount of features webrick offers" is part of the problem?
It's difficult to go into a large (or even medium) code base,
profile it, and then add bolt-on performance improvements. It can be
done, but it usually ends up as a wart on the system.

So, rather than try to "fix" WEBrick I'm just considering it a
different solution to a different set of problems. Mongrel may pick
up all the features WEBrick has, but right now it's targeted at just
serving Ruby web apps as fast as possible.

Zed A. Shaw
http://www.zedshaw.com/
 
Sascha Ebach

Well, this may be mean, but have you ever considered that "the massive
amount of features webrick offers" is part of the problem? It's
difficult to go into a large (or even medium) code base, profile it,
and then add bolt-on performance improvements. It can be done, but it
usually ends up as a wart on the system.

So, rather than try to "fix" WEBrick I'm just considering it a different
solution to a different set of problems. Mongrel may pick up all the
features WEBrick has, but right now it's targeted at just serving Ruby
web apps as fast as possible.

I suspected something along those lines :) I would probably do the same
because starting from the beginning is always more fun than trying to
understand a large code base. Although I personally think that the latter
doesn't have to be slower. How long could it take to find a dozen slow
spots in webrick? Maybe 2-3 days? Another 2-3 days to tune them?

Anyway, I was just curious, and I am looking forward to following along
and learning from the C code. I personally never had the need for
anything to be faster than Ruby *except* the http stuff. But since I have
never actually written more than a couple of lines of C, I shied away from
starting such a thing.

Another tip: Maybe you want to look at Will Glozer's Cerise.

http://rubyforge.org/projects/cerise/

It has a minimal, bare-bones http server written entirely in Ruby. Maybe it
is of help. Just a thought.

-Sascha
 
Toby DiPasquale

Zed said:
You're going to laugh but right now it's down to a bit of Ruby and a
nifty C extension. Seriously. No need yet of much more than some
threads that crank on output, a parser (in C) that makes a hash, and
a way to quickly lookup URI mappings. The rest is done with handlers
that process the result of this.

That's a PATRICIA trie for URL lookup, a finite state machine compiled
Ragel->C->binary for HTTP protocol parsing and an implicit use of
select(2) (via Thread), for the even-more-curious out there ;) (first
hit on Google for "Ragel" will tell you what you need to know about
that)
It may get a bit larger than this,
but this core will probably be more than enough to at least service
basic requests. I'm currently testing out a way to drop the threads
in favor of IO.select, but it looks like that messes with threads in
some weird ways.

Ok, so here's where I fell off your train. On your Ruby/Event page, you
said that you killed the project b/c Ruby's Thread class multiplexes via
the use of select(2), which undermines libevent's ability to effectively
manage events (which I had discovered while writing some extensions a
while back and thought "how unfortunate"). But I have some questions
about the above:

1. As above, the Thread class uses select(2) (or poll(2)) internally;
what would be the difference in using IO::select explicitly besides more
code to write to manage it all?

2. What are these "weird ways" you keep referring to? I got the
select-hogging-the-event-party thing, but what else?

I am interested b/c I am currently trying to write a microthreading
library for Ruby based on some of the more performant event multiplexing
techniques (kqueue, port_create, epoll, etc.) so I can use it for other
stuff I want to write (^_^)
Once I figure out all the nooks and crannies of the thing then I'll
do a more formal design, but even then it's going to be ruthlessly
simplistic.

Simple is good, m'kay? ;-) Great show in any case! I know I'll be using
this for my next internal Rails app.
 
Zed Shaw

That's a PATRICIA trie for URL lookup, a finite state machine compiled
Ragel->C->binary for HTTP protocol parsing and an implicit use of
select(2) (via Thread), for the even-more-curious out there ;) (first
hit on Google for "Ragel" will tell you what you need to know about
that)

Ooohh, *that's* what people want to know. You're right. Here's the
main gear involved in the process:

1) Basic Ruby TCPServer is used to create the server socket. No
magic here. A thread then just runs in a loop accepting connections.
2) When a client is accepted it's passed to a "client processor".
This processor is a single function that runs in a loop doing a
readpartial on the socket to get a chunk of data.
3) That chunk's passed to an HTTP parser which makes a Ruby Hash with
the CGI vars in it. The parser is written with Ragel 5.2 (which has
problems compiling on some systems). This parser is the first key to
Mongrel's speed.
4) With a completed HTTP parse, and the body of the request waiting
to be processed, Mongrel tries to find the handler for the URI. It
does this with a modified trie that returns the handler as well as
breaking the prefix and suffix of the URI into SCRIPT_NAME and
PATH_INFO components.
5) Once I've got the handler, the request hash variables, and a
request object, I just call the "process" method and it does its work.
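The steps above can be sketched in a few lines of Ruby. This is a toy, not Mongrel's code: the Ragel parser is stubbed with a naive split, `HelloHandler` is an invented handler, and step 4's trie lookup is skipped so everything routes to one handler.

```ruby
require 'socket'

# Invented stand-in for a real handler object with a "process" method (step 5).
class HelloHandler
  def process(params, client)
    body = "hi from #{params['PATH_INFO']}"
    client.write("HTTP/1.1 200 OK\r\nContent-Length: #{body.size}\r\n\r\n#{body}")
  end
end

# Stand-in for the Ragel-generated C parser (step 3): pull the method
# and URI out of the raw chunk into a CGI-style hash.
def parse_request(chunk)
  m, uri, = chunk.split(' ', 3)
  { 'REQUEST_METHOD' => m, 'PATH_INFO' => uri }
end

def serve(port, handler)
  server = TCPServer.new('127.0.0.1', port)   # step 1: plain TCPServer
  Thread.new do
    loop do
      client = server.accept                  # step 1: accept loop in a thread
      Thread.new(client) do |c|
        chunk  = c.readpartial(2048)          # step 2: read a 2k chunk
        params = parse_request(chunk)         # step 3: parse into a Hash
        handler.process(params, c)            # step 5: hand off to the handler
        c.close
      end
    end
  end
  server
end
```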

Unhandled issues are:

* The trie was written in ruby and isn't all that fast. A trie might
also be overkill for what will typically be a few URIs. I was
thinking though that the trie would be great for storing cached
results and looking them up really fast.
* The thread handling has limitations that make it not quite as
efficient as I'd like. For example, I read 2k chunks off the wire
and parse them. If the request doesn't fit in the 2k then I have to
reset the parser, keep the data, and parse it again. I'd really much
rather use a nice ring buffer for this.
* The threads create a ton of objects which can make the GC cause
large pauses. I've tried a group of threads waiting on a queue of
requests, but that's not much faster or better. So far the fastest
is using IO.select (see below).
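The URI lookup in step 4 can be sketched with a toy longest-prefix registry. The class and method names here are invented, and a Hash scan stands in for the real structure, which Zed describes as a modified trie:

```ruby
# Toy longest-prefix registry: finds the handler whose mount point
# prefixes the URI and splits the URI around it, the way Mongrel's
# trie yields SCRIPT_NAME and PATH_INFO.
class URIRegistry
  def initialize
    @routes = {}
  end

  def register(prefix, handler)
    @routes[prefix] = handler
  end

  # Returns [handler, script_name, path_info] for the longest matching
  # prefix, or nil when nothing matches.
  def resolve(uri)
    prefix = @routes.keys.select { |p| uri.start_with?(p) }.max_by(&:length)
    return nil unless prefix
    [@routes[prefix], prefix, uri[prefix.length..-1]]
  end
end
```

A real trie does this split in one walk down the URI instead of scanning every registered prefix, which is where the speed comes from.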

Ok, so here's where I fell off your train. On your Ruby/Event page, you
said that you killed the project b/c Ruby's Thread class multiplexes via
the use of select(2), which undermines libevent's ability to effectively
manage events (which I had discovered while writing some extensions a
while back and thought "how unfortunate"). But I have some questions
about the above:

Yes, that's still true, since Ruby and libevent don't know about each
other. They fight like twenty rabid cats in a pillowcase. The main
difference is that IO.select knows about Ruby's threads, so it's
supposed to be safe to use.
1. As above, the Thread class uses select(2) (or poll(2)) internally;
what would be the difference in using IO::select explicitly besides
more code to write to manage it all?
It does use select transparently, but it seems to add a bunch of
overhead to the select processing it uses. I'm sorting out the
IO.select and thread relationship.
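For the curious, a bare single-threaded IO.select loop of the kind being weighed against one-thread-per-client might look like this. The echo "work" and the `stop` pipe are invented for illustration; a real server would parse and dispatch instead of upcasing a line:

```ruby
require 'socket'

# One thread, one IO.select call multiplexing the listening socket and
# all connected clients. `stop` is an optional IO: anything readable on
# it ends the loop (handy for tests and clean shutdown).
def select_loop(server, stop = nil)
  clients = []
  loop do
    watched = [server] + clients
    watched << stop if stop
    readable, = IO.select(watched)
    readable.each do |io|
      if io == server
        clients << server.accept        # new connection: start watching it
      elsif stop && io == stop
        return                          # shutdown requested
      else
        line = io.gets
        if line
          io.write(line.upcase)         # trivial "work": echo upcased
        else
          clients.delete(io).close      # EOF: stop watching and close
        end
      end
    end
  end
end
```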
2. What are these "weird ways" you keep referring to? I got the
select-hogging-the-event-party thing, but what else?
Basically select hogs the party, threads just kind of stop for no
reason, select just stops, etc. I really wish they'd just use pth
so I could get on with my life. :) I've been playing with it, and
I think I have something that might work.
I am interested b/c I am currently trying to write a microthreading
library for Ruby based on some of the more performant event multiplexing
techniques (kqueue, port_create, epoll, etc.) so I can use it for other
stuff I want to write (^_^)
You know, having tried this, I have to say you'll be fighting a
losing battle. Ruby's thread implementation just isn't able to work
with external multiplexing methods. I couldn't figure it out, so if
you do then let me know.
Simple is good, m'kay? ;-) Great show in any case! I know I'll be
using this for my next internal Rails app.

Thanks!

Zed A. Shaw
http://www.zedshaw.com/
 
PA

Hey, that was cool. Any chance you could see how they would run with -c 10?

[Mongrel]
% ruby -v
ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]
% ruby simpletest.rb
% ab -n 10000 -c 10 http://localhost:3000/test
Requests per second: 386.31 [#/sec] (mean)

[Webrick]
% ruby -v
ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]
% ruby webrick_compare.rb >& /dev/null
% ab -n 10000 -c 10 http://localhost:3000/test
Requests per second: 27.58 [#/sec] (mean)

[Cherrypy]
% python -V
Python 2.4.2
% python tut01_helloworld.py
% ab -n 10000 -c 10 http://localhost:8080/
Requests per second: 164.77 [#/sec] (mean)

[LuaWeb]
% lua -v
Lua 5.1 Copyright (C) 1994-2006 Lua.org, PUC-Rio
% lua Test.lua
% ab -n 10000 -c 10 http://localhost:1080/hello
Requests per second: 927.04 [#/sec] (mean)

[httpd]
% httpd -v
Server version: Apache/1.3.33 (Darwin)
% ab -n 10000 -c 10 http://localhost/test.txt
Requests per second: 1186.10 [#/sec] (mean)

[lighttpd]
% lighttpd -v
lighttpd-1.4.9 - a light and fast webserver
% ab -n 10000 -c 10 http://localhost:8888/test.txt
Called sick today (fdevent.c.170: aborted)


Cheers
 
kellan

Zed said:
On Jan 23, 2006, at 9:27 PM, Toby DiPasquale wrote:

You know, having tried this, I have to say you'll be fighting a
losing battle. Ruby's thread implementation just isn't able to work
with external multiplexing methods. I couldn't figure it out, so if
you do then let me know.

I've been meaning to ask about this as well, ever since I saw you killed
Ruby/Event. In your experience, is it only a Bad Idea(tm) to use
poll/libevent in your Ruby app if you'll also be using Threads, or is
it always a bad idea, even if you can guarantee that "require 'thread'"
is never issued?

Also, did you ever get a chance to write a post mortem discussing your
findings and the problems you ran into?

This seems like a fairly serious problem with Ruby that should
be addressed. Event-driven programming really enables the whole
"pieces loosely joined" paradigm. I mean, it's kind of embarrassing that
the best way to do async programming with Ruby is to use Rails'
javascript libraries. (Okay, I'm enough of a web geek to think that
that is actually kind of cool, but we'll ignore that.)

Thanks,
kellan
 
Booker C. Bense


I suspected something along those lines :) I would probably do the same
because starting from the beginning is always more fun than trying to
understand a large code base. Although I personally think that the latter
doesn't have to be slower. How long could it take to find a dozen slow
spots in webrick? Maybe 2-3 days? Another 2-3 days to tune them?

If it were that easy, somebody would have already done it.
Profiling and optimizing languages like Ruby can be quite difficult:
if you do it at the C level, you often get results that are very
difficult to interpret or even do anything useful with, i.e. the
profiler shows you spending 80% of your time in some basic underlying
routine of ruby. If you do it at a higher level, the overhead of
benchmarking can often skew the results badly.

So it's hard to get the data, and optimizing w/o real profiling
data is one of the great evils of programming. With simpler
apps you can often make a good guess, but in my experience
guessing where the time is spent in a more complex application
is almost always wrong.
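Booker's point is "measure, don't guess," and the cheapest way to start in Ruby is the standard Benchmark library. A minimal sketch comparing two implementations of the same (invented, purely illustrative) task:

```ruby
require 'benchmark'

# Two ways to build the same string; which is faster is exactly the
# kind of question guessing gets wrong and measuring settles.
def build_with_concat(n)
  s = ''
  n.times { |i| s << i.to_s }
  s
end

def build_with_join(n)
  (0...n).map(&:to_s).join
end

# bm prints user/system/real times for each labeled block.
Benchmark.bm(12) do |x|
  x.report('concat:') { build_with_concat(50_000) }
  x.report('join:')   { build_with_join(50_000) }
end
```

For whole-application hot spots, Ruby's profiler (`ruby -rprofile`) gives per-method timings, with the caveats about overhead and skew that Booker describes.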

- Booker C. Bense


 
