How do I scale large Ruby web applications?

Discussion in 'Ruby' started by Sascha Ebach, Apr 26, 2004.

  1. Sascha Ebach

    Sascha Ebach Guest

    Hi there,

    I plan to do a fairly large (how large depends on the commercial
    success) project that will be programmed completely in Ruby. I can't
    talk about the specifics right now, but to give you an idea it will be
    something like David Heinemeier Hansson's Basecamp except that it will
    not have anything to do with project management, so no direct
    competition (it will be a completely German project anyway). If I
    remember correctly, I read somewhere in his blog that Basecamp is a
    medium-sized project, so when I say -large- project I actually mean a
    project which will start out small and, depending on the commercial
    success, could get a very large
    user base. At least that is what I am hoping for :) It could of course
    completely fail, only time will tell.

    I would like to ask a couple of questions on the scalability of these
    kinds of web projects. I was thinking of writing to David directly but I
    was hoping that if he (and others with that kind of experience) answers
    my questions on the list everybody could profit from that.

    Following are my conceptions of what I heard or have read about Ruby. If
    I am wrong with any of these, don't hesitate to correct me.

    In my understanding the biggest problem in scaling Ruby (cpu wise) is
    that it doesn't have native thread support, yet. What this means in
    terms of a web application is that if you only have let's say 30
    concurrent users on a fairly new piece of Hardware this is not a
    problem. But what happens if your site suddenly gets very popular and
    you jump from 20 to 200 or even 2000 concurrent users? How do you scale
    such a web app? If you were to program this web app in Java or any other
    language which supports native threads you could simply throw more cpus
    and ram at it. I am thinking of a blade server here. As you get more
    users you simply stick another blade in your server and have your peace
    of mind. As I understand it you cannot do something like that with Ruby.
    Enter Distributed Ruby (DRb).

    As Martin Fowler states in his first law of distributed object design:
    Don't distribute your objects!
    <http://c2.com/cgi/wiki?FirstLawOfDistributedObjectDesign>

    Things can really get hairy when getting distributed. All kinds of
    things can go wrong. IMO it is kinda like the step from single threaded
    programming to multi threaded programming or worse. So it is always a
    good thing if you can avoid it. I can't even start to think about how I
    would (unit) test such a beast but as of now I don't see any real
    alternatives for scaling a Ruby app. Of course when done right it has
    many many benefits. One is that you can buy the cheapest hardware and
    plug them together like Google is doing it, but Google seems to have an
    armada of excellent programmers to handle the potentially
    very difficult distributed stuff. I am only a single (maybe a little
    over ambitious) programmer. I have seen a couple of very nice and simple
    examples in drb (from Dave Thomas for example), but would you really
    advise using drb for some big time commercial web app?

    I would be very curious what kind of strategy 37signals has with their
    Basecamp. Maybe David can elaborate on that if it is not a big secret. I
    am very eager to hear about specific choices from anyone who has
    similar experience.

    - What kind of hardware are you using?
    - Where are the biggest performance bottlenecks in your environment?
    - What is your hardware upgrade path in case active user numbers go
    through the roof?
    - What kind of httpd do you use?
    - What kind of framework are you using?

    Thanks
    --
    Sascha Ebach
     
    Sascha Ebach, Apr 26, 2004
    #1

  2. "Sascha Ebach" <> wrote in message
    news:...

    > In my understanding the biggest problem in scaling Ruby (cpu wise) is
    > that it doesn't have native thread support, yet. What this means in
    > terms of a web application is that if you only have let's say 30
    > concurrent users on a fairly new piece of Hardware this is not a
    > problem. But what happens if your site suddenly gets very popular and
    > you jump from 20 to 200 or even 2000 concurrent users? How do you scale
    > such a web app? If you were to program this web app in Java or any other
    > language which supports native threads you could simply throw more cpus
    > and ram at it.


    Not necessarily: typically a queue based approach is used: incoming
    requests go into a queue from which a fixed (or limited) number of
    processors fetch them and process them. That way you prevent the thread
    creation overhead plus you avoid system overload due to too many threads
    (native or not) and congestion due to resource bottlenecks (too many
    clients that want to use the same resource).
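
    A minimal sketch of the queue-based approach described above, assuming
    Ruby's built-in Queue and Thread classes. The names WORKER_COUNT,
    handle_request and process_all are made up for illustration; a real
    web server would feed the queue from its accept loop instead of an array.

```ruby
require 'thread' # Queue is built in on current Rubies; the require is harmless

# Hypothetical sketch: a fixed pool of workers draining one shared queue.
# WORKER_COUNT and handle_request are illustrative names, not from the thread.
WORKER_COUNT = 4

# Stand-in for the real per-request work.
def handle_request(request)
  "handled #{request}"
end

def process_all(requests, worker_count = WORKER_COUNT)
  queue   = Queue.new
  results = Queue.new

  # Start the workers once, up front, so no thread is created per request.
  workers = Array.new(worker_count) do
    Thread.new do
      # A nil entry is the shutdown signal for one worker.
      while (request = queue.pop)
        results << handle_request(request)
      end
    end
  end

  requests.each { |r| queue << r }
  worker_count.times { queue << nil } # one shutdown signal per worker
  workers.each(&:join)

  # Results arrive in completion order, which need not match request order.
  Array.new(results.size) { results.pop }
end
```

    Calling process_all(%w[a b c]) returns three strings; the pool size caps
    how many requests are worked on at once, no matter how many are waiting.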

    The problem with performance is that you normally have to try and measure
    to see whether it's ok or not. And of course it heavily depends on the
    algorithms and kind of application at hand. I'd say without more
    information about these it's impossible to say offhand whether Ruby will
    deliver or not.

    Kind regards

    robert
     
    Robert Klemme, Apr 26, 2004
    #2

  3. Hello Sascha,

    Monday, April 26, 2004, 9:41:20 AM, you wrote:

    SE> Hi there,
    SE> .....

    I don't know what the application you mention in your posting does,
    but the best way to scale Ruby web apps is to use numerous externally
    started FCGI servers and use the session id to bind the same user to
    the same FCGI server. If you don't have extremely high inter-user
    interaction this will simplify your life. It can also reduce your
    database load a lot and make HTML-GUI implementation easier. The most
    complicated thing is to get this configuration up and running - and
    it does not work well with load balancers.
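
    One way to picture the "bind the same user to the same FCGI server"
    idea is a stable hash of the session id. This is only a sketch: the
    backend addresses and the name backend_for are assumptions, and Zlib.crc32
    is used simply because it is a stable hash in the standard library.

```ruby
require 'zlib'

# Hypothetical backend addresses; replace with your own FCGI server sockets.
BACKENDS = ['10.0.0.1:9001', '10.0.0.2:9001', '10.0.0.3:9001'].freeze

# Map a session id to one backend. The same id always yields the same
# backend as long as the list of backends does not change.
def backend_for(session_id, backends = BACKENDS)
  backends[Zlib.crc32(session_id) % backends.size]
end
```

    Note that adding or removing a backend changes the modulus and so remaps
    most sessions, which is part of why this setup is awkward to operate.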

    So I'm afraid that nobody can give you a good hint. You have to find
    the best architecture for your particular application on your own.
    Everything is possible, from DNS round-robin load balancing to one
    high-end multi-CPU server running dozens of FCGI servers.

    The first thing would be to check if ruby is really what you want. Do
    you have the libraries you need for your project, are they stable
    (write some burn-in tests here, for example REXML has some extremely
    annoying performance bugs) and then do some benchmarking. Then think
    about design.

    --
    Best regards,
    Lothar
     
    Lothar Scholz, Apr 26, 2004
    #3
  4. > I would be very curious what kind of strategy 37signals has with their
    > Basecamp. Maybe David can elaborate on that if it is not a big secret.
    > I am very eager to hear about specific choices from anyone who has
    > similiar experience.


    Like Lothar says, it depends on your application. Basecamp has fairly
    complex relations in the domain model, but our load isn't that high
    because each site is only used by a firm and a client, which is usually
    less than 10 people. That would be unlike some high-load e-commerce
    site that needs to be accessible to the world. So we're blessed in that
    way.

    Hence, our current approach is mod_ruby/mysql on a single box. We're
    just in the process of making that single box of stronger iron, but
    that's definitely the easiest way. Stay on a single box as long as you
    can. Next, move the database to a separate machine. Then start thinking
    about scaling the application server. (Or you may start thinking
    beforehand as you do now, just no need to commit the dollars)

    Our strategy for scaling the application server is also to go the FCGI
    route. Basecamp works more like a PHP app than a Java app on that
    point, so state is only kept in the database (and currently
    CGI::Session on the file system, but that's easy to move to either
    database or shared DRb server). That means the database handles all the
    synchronizing and we can just add more application servers (running
    FCGI) as the load increases. When the database needs scaling, I hear
    MySQL is also very much up for that job.
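
    For the "shared DRb server" option mentioned above, a rough sketch of
    what a shared session store could look like. The class name
    SharedSessionStore, the host name and the port are all hypothetical, and
    this is not how Basecamp is implemented; it only illustrates the idea of
    keeping session state outside the application servers.

```ruby
require 'drb/drb'

# Hypothetical session store meant to be exported over DRb so that several
# application servers share one copy of the session data.
class SharedSessionStore
  def initialize
    @sessions = {}
    @lock = Mutex.new # DRb may serve calls from several threads at once
  end

  # Return a copy of the data stored for session_id (empty hash if none).
  def fetch(session_id)
    @lock.synchronize { (@sessions[session_id] || {}).dup }
  end

  # Replace the data stored for session_id.
  def store(session_id, data)
    @lock.synchronize { @sessions[session_id] = data.dup }
    true
  end
end

# On the session machine (the URI is an assumption):
#   DRb.start_service('druby://sessionhost:9000', SharedSessionStore.new)
#   DRb.thread.join
#
# On each application server:
#   DRb.start_service
#   sessions = DRbObject.new_with_uri('druby://sessionhost:9000')
#   sessions.store('abc123', { 'user_id' => 42 })
#   sessions.fetch('abc123')
```

    The session machine then becomes a single point of failure, so this
    trades simplicity against availability compared to the database option.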

    The biggest bottleneck is rendering templates in Ruby. We're currently
    just using vanilla ERb with no template caching and we feel that on
    complex templates.
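
    As an illustration of what template caching could mean here, a small
    sketch that compiles each ERB template once and reuses the compiled
    object. TemplateCache is a made-up name and not part of Rails or ERB.
    It is not synchronized, so it assumes one process handles one request
    at a time.

```ruby
require 'erb'

# Hypothetical sketch: compile each template once and reuse the ERB object.
class TemplateCache
  def initialize
    @compiled = {}
  end

  # name is only a cache key; source is the raw ERB text. The source is
  # parsed on the first call for a given name and ignored on later calls.
  def render(name, source, context)
    template = (@compiled[name] ||= ERB.new(source))
    template.result(context)
  end

  # Number of templates compiled so far.
  def size
    @compiled.size
  end
end
```

    This only skips re-parsing the ERB source; the generated Ruby code is
    still evaluated on every render, so you would have to measure whether it
    helps on the templates that hurt.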

    But all of this ties into your business model as well. It needs to be
    profitable to add additional hardware. In the case of Basecamp, it's
    very much so. More customers directly equals more revenue, which makes
    expanding the server park no problem at all.

    Basecamp is running Ruby on Rails[1] with Apache/mod_ruby. As I said,
    it'll probably be running on FCGI before long as we look at adding
    another application server.

    COMMERCIAL: If you happen to be in Chicago on June 25th, we'll be
    revealing all about Basecamp and Rails in a one-day workshop. There's
    more information on http://www.37signals.com/workshop-062504.php.

    [1] http://www.rubyonrails.org
    --
    David Heinemeier Hansson,
    http://www.instiki.org/ -- A No-Step-Three Wiki in Ruby
    http://www.basecamphq.com/ -- Web-based Project Management
    http://www.loudthinking.com/ -- Broadcasting Brain
     
    David Heinemeier Hansson, Apr 26, 2004
    #4
  5. Dan Janowski

    Dan Janowski Guest

    On Apr 26, 2004, at 03:41, Sascha Ebach wrote:
    >
    > In my understanding the biggest problem in scaling Ruby (cpu wise) is
    > that it doesn't have native thread support, yet. What this means in
    > terms of a web application is that if you only have let's say 30
    > concurrent users on a fairly new piece of Hardware this is not a
    > problem. But what happens if your site suddenly gets very popular and
    > you jump from 20 to 200 or even 2000 concurrent users? How do you
    > scale such a web app? If you were to program this web app in Java or
    > any other language which supports native threads you could simply
    > throw more cpus and ram at it. I am thinking of a blade server here.
    > The more users you get you simply stick another blade in your server
    > and have your piece of mind. As I understand it you cannot do
    > something like that with Ruby. Enter Distributed Ruby (DRb).
    >
    > As Martin Fowler states in his first law of distributed object design:
    > Don't distribute your objects!
    > <http://c2.com/cgi/wiki?FirstLawOfDistributedObjectDesign>
    >


    Distributing objects is one thing, but using drb as a request/response
    and control protocol is what I am considering at the moment. I was
    thinking of FCGI, but it is all or nothing in the sense that it forces
    the whole hit service out of Apache. But a considerable amount of cgi
    processing is not session dependent and is more appropriate in the
    Apache side (I use mod_ruby). Separating non-web application logic from
    cgi/web processing is one way to efficiently distribute the load.

    Dan
     
    Dan Janowski, Apr 27, 2004
    #5
  6. Hello Dan,

    Tuesday, April 27, 2004, 7:19:44 AM, you wrote:


    DJ> On Apr 26, 2004, at 03:41, Sascha Ebach wrote:
    >>
    >> In my understanding the biggest problem in scaling Ruby (cpu wise) is
    >> that it doesn't have native thread support, yet. What this means in
    >> terms of a web application is that if you only have let's say 30
    >> concurrent users on a fairly new piece of Hardware this is not a
    >> problem. But what happens if your site suddenly gets very popular and
    >> you jump from 20 to 200 or even 2000 concurrent users? How do you
    >> scale such a web app? If you were to program this web app in Java or
    >> any other language which supports native threads you could simply
    >> throw more cpus and ram at it. I am thinking of a blade server here.
    >> The more users you get you simply stick another blade in your server
    >> and have your piece of mind. As I understand it you cannot do
    >> something like that with Ruby. Enter Distributed Ruby (DRb).
    >>
    >> As Martin Fowler states in his first law of distributed object design:
    >> Don't distribute your objects!
    >> <http://c2.com/cgi/wiki?FirstLawOfDistributedObjectDesign>
    >>


    DJ> Distributing objects is one thing, but using drb as a request/response
    DJ> and control protocol is what I am considering at the moment. I was
    DJ> thinking of FCGI, but it is all or nothing in the sense that it forces
    DJ> the whole hit service out of Apache. But a considerable amount of cgi
    DJ> processing is not session dependent and is more appropriate in the
    DJ> Apache side (I use mod_ruby). Separating non-web application logic from
    DJ> cgi/web processing is one way to efficiently distribute the load.

    Does Apache never use threading inside one of the preforked Apache
    processes? I thought 2.0 does this by default, even on Unix.

    And you should measure whether enabling threading reduces the overhead
    more than FCGI adds.

    By the way, if you have your own web server (so you don't need all the
    flexible configuration features of Apache) then you should never use
    Apache if performance is important. Apache is not a very fast
    web server. A lot of other servers give you twice the throughput of
    Apache (or even more if Apache is not configured correctly).

    --
    Best regards,
    Lothar
     
    Lothar Scholz, Apr 27, 2004
    #6
  7. Sascha Ebach

    Sascha Ebach Guest

    Thanks for all the comments so far. They are really helping.

    Let me elaborate a little more. In the comments I read interesting ideas
    I want to go into a little more.

    Robert Klemme wrote:

    > The problem with performance is that you normally have to try and
    > measure to see whether it's ok or not. And of course it heavily
    > depends on the algorithms and kind of application at hand. I'd say
    > without more information about these it's impossible to say offhand
    > whether Ruby will deliver or not.


    I know that. You are only going to really know the moment your service gets
    used. Still you have to find a good scalable approach to begin with to
    minimize the chances of rewriting everything. And I also know that
    premature optimization is a bad thing. But blindly doing anything that
    works can be equally problematic in the future.

    We are currently in the planning stages, writing the use cases. If I
    had to compare what the system will have to do, I would probably
    say it will be a lot like eBay, except that it will have nothing
    to do with auctions. We are not that crazy ;) People will be able to put
    their profiles in, upload pictures and descriptions of their objects,
    search the database for these objects, communicate over the system about
    these objects. Those are the basics. So from the viewpoint of what you
    will be able to do I see that it is pretty similar to the basic things
    you can do on eBay. I hope that clears things up a little more.

    Robert also states that typically (in web applications?) a queue based
    approach is used. That sounds very interesting. Not only can you
    circumvent threads, which is always good if possible, but I can imagine
    that such a system would be fairly easy to implement. The thing I am
    wondering about is the response times. While you can have
    progress bars in a desktop app you really can't afford to let the
    user wait in a web app. It would probably take a while before that
    happens on newer hardware, but this was exactly my point. In the event
    you are so lucky that people get crazy about your service and you really
    get more visitors than anticipated, even a queue based solution has to
    scale. I cannot imagine how to do that so I have to give it more
    thought. Maybe such a system can be implemented in such a way that it can
    be easily scaled with new (upgraded) hardware? I could imagine a queue
    based system that forwards its items to a distributed network of
    computers via DRb...

    Lothar Scholz wrote:

    > The best way to scale ruby web apps is to use numerous external
    > started FCGI servers and use the session id to bind to the same user
    > to the same FCGI server. If you don't have extremely high interuser
    > interaction this would simplify your life. It can also reduce your
    > database load a lot and make HTML-GUI implementation easier. The most
    > complicated thing is to get this configuration up and running - and
    > it does not work good with load balancers.


    You seem to speak from experience. Could you elaborate a little more
    or maybe point me in a direction where I can read more about this
    technique? I will read about FCGI because others have mentioned it too.
    The reason I haven't already done so is that from a PHP perspective you
    wouldn't bother to use CGI because of the process duplication which
    slows down things considerably. mod_php is a lot more stable and
    complete than mod_ruby at the moment. I must investigate this option
    more. One other thing I wonder about is when you say that it is
    complicated to set up. It sounds to me that this technique is like
    distributed processes in contrast to distributed services on different
    machines, with the advantage of being able to scale the hardware on one
    machine.

    > The first thing would be to check if ruby is really what you want. Do
    > you have the libraries you need for your project, are they stable


    I don't know if there are any really stable and mature libraries for
    Ruby in comparison to Java libraries. At first I thought it would be a
    good idea to write everything myself so I could keep the system as small
    as possible. But I have come to realize that this may not be the optimal
    approach. At the moment I think it would be a better idea to use one of
    the available frameworks and contribute to it rather than write my own.
    That way I can give something back to the open source community, which is
    long overdue for me (using open source stuff all the time). At the moment
    my favourite framework is Cerise (http://cerise.rubyforge.org/) as it is
    the most elegant solution I have encountered so far (not only for Ruby).
    I have done some preliminary tests with Cerise and the functionality I
    want to implement will be a snap to do with this framework. Will Glozer,
    the project maintainer, seems to be very experienced and is still working
    alone on this wonderful piece of software. I would like to join that
    project (or any other) if it turns out to be the right one for my
    purposes. But I haven't made my final decision yet. I still have to
    really test a couple of the other frameworks that were mentioned on this
    list earlier, including the soon (?) to be released Rails. The project
    will span from June 2004 to March 2005, so there is still some time.

    David Heinemeier Hansson wrote:

    > Stay on a single box as long as you can. Next, move the database to a
    > separate machine. Then start thinking about scaling the application
    > server. (Or you may start thinking beforehand as you do now, just no
    > need to commit the dollars)


    That was the idea. First the database, then the images on another
    machine (maybe with a fast webserver and logging turned off).

    > COMMERCIAL: If you happen to be in Chicago on June 25th, we'll be
    > revealing all about Basecamp and Rails in a one-day workshop. There's
    > more information on http://www.37signals.com/workshop-062504.php.


    Yeah, I read all about that. Two weeks ago our first child was born so I
    am a little tied up at the moment. I would have probably come to Denmark
    for your presentation, but because of these new "circumstances" I really
    cannot. I wonder if anyone would be able to record your session on video
    and offer it for download via torrent or so, maybe you?

    Kirk Haines: I really like the way you described the scaling process.
    That seems like a viable possibility I must investigate. I have just
    started reading about IOWA and have yet to really check it out. I only
    know roughly what it is, but not what it can do for me. Very
    interesting.

    Dan Janowski wrote:

    > Distributing objects is one thing, but using drb as a
    > request/response and control protocol is what I am considering at the
    > moment. I was thinking of FCGI, but it is all or nothing in the sense
    > that it forces the whole hit service out of Apache. But a
    > considerable amount of cgi processing is not session dependent and is
    > more appropriate in the Apache side (I use mod_ruby). Separating
    > non-web application logic from cgi/web processing is one way to
    > efficiently distribute the load.


    It seems like I will have to weigh those two options against each other.
    I think once you have a running DRb system on one machine it should be a
    snap to just add other machines to the system. If you first go the FCGI
    route you will have to leave it sooner or later. The problem is with
    development time. A distributed system can be so much harder to
    implement, remember the HURD? http://www.gnu.org/software/hurd/

    Lothar Scholz added:

    > By the way if you have your own webserver (so you don't need all the
    > flexible configuration features of apache) then you should never use
    > apache if performance is important. Apache is not a very fast
    > webserver. A lot of other servers give you twice the responds then
    > apache (or even more if apache is not configured correctly).


    That sounds good. I was hoping to be able to use a built-in server like
    the one Cerise offers. What would be really nice is if someone with the
    right skills implemented a web server as a barebones C module, or maybe
    somehow integrated one of the available web servers like
    http://www.annexia.org/freeware/rws/ or any other. Ideally it would be
    something that works with WEBrick and makes it faster. Apache is not
    really fast but WEBrick or any other Ruby implementation is by far slower.
    Has anyone heard of such a project? My C skills are really not worth
    mentioning. I never got around to actually writing anything in C although
    I have already read 3 or 4 books about the language (just in case).


    Thanks again for all your replies.
    --
    Sascha Ebach
     
    Sascha Ebach, Apr 27, 2004
    #7
  8. Hello Sascha,


    >> The best way to scale ruby web apps is to use numerous external
    >> started FCGI servers and use the session id to bind to the same user
    >> to the same FCGI server. If you don't have extremely high interuser
    >> interaction this would simplify your life. It can also reduce your
    >> database load a lot and make HTML-GUI implementation easier. The most
    >> complicated thing is to get this configuration up and running - and
    >> it does not work good with load balancers.


    SE> You seem to speak out of experience. Could you elaborate a little more

    I was responsible for a complete crash of AOL Germany and a 3-day
    crash of Compuserve Germany :) Since those early days I have learned a
    lot about how not to write/set up a huge content management system, and
    that a rich company like AOL has only 4 pizza boxes, each with one
    266 MHz CPU, to serve everything.

    SE> or maybe point me into a direction where I can read a more about this
    SE> technique? I will read about FCGI because others have mentioned it too.
    SE> The reason I haven't already done so, is that from a PHP perspective you
    SE> wouldn't bother to use CGI because of the process duplication which
    SE> slows down things considerably. mod_php is a lot more stable and
    SE> complete than mod_ruby at the moment. I must investigate more in this
    SE> option. One other thing I wonder about is when you say that it is
    SE> complicated to set up. It sounds to me that this technique is like
    SE> distributed processes in contrast to distributed services on different
    SE> machines. With the advantage of being able to scale the hardware on one
    SE> machine.

    FCGI has nothing in common with CGI except 3 letters. It is an
    application server: mod_fastcgi sends request data via a socket
    connection to the FCGI server. It can automatically start and stop the
    FCGI application server, but in a lot of cases this is not done very
    gracefully.

    mod_fastcgi + Ruby FCGI is faster than mod_ruby. And it will not
    result in an overflow of Apache's incoming socket connection queue.
    You get the mentioned queue system for free.

    If you move away from Apache, as I mentioned in another thread, you
    can get better performance, especially since the mod_fastcgi module
    does not support everything that is in the specification. There are
    commercial servers that are much better.

    >> The first thing would be to check if ruby is really what you want. Do
    >> you have the libraries you need for your project, are they stable


    SE> I don't know if there are any really stable and mature libraries for
    SE> Ruby in comparison to Java libraries. At first I thought it would be a
    SE> good idea
    SE> to write everything myself so I could keep the system as small as
    SE> possible.

    Every programmer first thinks it is better to do everything on their own.
    Maybe that is a reason why only 16% of all IT projects are successful.


    --
    Best regards,
    Lothar
     
    Lothar Scholz, Apr 27, 2004
    #8
  9. Sascha Ebach

    Sascha Ebach Guest

    Hello Lothar,

    > If you move away from apache, as i mentionend in another thread, you
    > can get better performance, especially the mod_fastcgi module does not
    > support everything that is in the specification. There are commerical
    > servers that are much better.


    I was going to ask you if you would recommend one, but I see there are
    lots of servers mentioned on http://www.fastcgi.com. Nevertheless ...

    Thanks
    --
    Sascha Ebach
     
    Sascha Ebach, Apr 27, 2004
    #9
  10. Ara.T.Howard

    Ara.T.Howard Guest

    On Tue, 27 Apr 2004, Sascha Ebach wrote:

    > You seem to speak out of experience. Could you elaborate a little more or
    > maybe point me into a direction where I can read a more about this
    > technique? I will read about FCGI because others have mentioned it too.


    it's the cat's meow:

    go to

    http://www.rubygarden.org/

    and search for 'fastcgi'. there is some good info there.


    > The reason I haven't already done so, is that from a PHP perspective you
    > wouldn't bother to use CGI because of the process duplication which slows
    > down things considerably. mod_php is a lot more stable and complete than
    > mod_ruby at the moment. I must investigate more in this option. One other
    > thing I wonder about is when you say that it is complicated to set up. It
    > sounds to me that this technique is like distributed processes in contrast
    > to distributed services on different machines. With the advantage of being
    > able to scale the hardware on one machine.


    fcgi is a protocol by which a web server can forward a request to a
    running program via pipes OR sockets - eg. a fcgi process can live on a
    different MACHINE from the server and still answer requests. basically it
    just serializes a web request (env, query, post data, etc.) in a way that
    can be easily demultiplexed by the application (require 'fcgi' does this
    for you and hands you a cgi object populated with this data). going the
    other way is even easier: you simply write to stdout as you would with a
    normal cgi program and the mod_fastcgi process on the other end
    demultiplexes this output as the response for a particular request. so
    you see, it really does NOT HAVE to be like cgi at all; fortunately for
    us though it is, and because it is it's a piece of cake to use existing
    code and libraries:

    cgi program that prints env:

    ~ > cat env.cgi
    #!/usr/local/ruby-1.8.0/bin/ruby
    require 'cgi'
    require 'fcgi'
    FCGI.each_cgi do |cgi|
      content = ''
      env = []
      cgi.env_table.each do |k,v|
        env << [k,v]
      end
      env.sort!
      env.each do |k,v|
        content << %Q(#{k} => #{v}<br>\n)
      end
      cgi.out{content}
    end

    and a fcgi program which does the same:

    ~ > cat env.fcgi
    #!/usr/local/ruby-1.8.0/bin/ruby
    require 'cgi'
    require 'fcgi'
    FCGI.each_cgi do |cgi|
      content = ''
      env = []
      cgi.env_table.each do |k,v|
        env << [k,v]
      end
      env.sort!
      env.each do |k,v|
        content << %Q(#{k} => #{v}<br>\n)
      end
      cgi.out{content}
    end


    you may have noticed that these are the SAME program: that's because they are!
    based on the characteristics of STDIN (pipe/socket or not) a fastcgi process
    can determine whether or not it is running in a fastcgi environment: therefore
    you can write your apps as a cgi program and they will work unmodified as
    fastcgi programs! why would you do this? well, for one:


    [ahoward@www htdocs]$ ab http://127.0.0.1/env.cgi | grep 'Requests per second'
    Requests per second: 9.86 [#/sec] (mean)

    [ahoward@www htdocs]$ ab http://127.0.0.1/env.fcgi | grep 'Requests per second'
    Requests per second: 179.18 [#/sec] (mean)

    more than an order of magnitude (roughly 18x)!
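
    the "characteristics of STDIN" check mentioned above can be pictured
    with a few lines of plain ruby. this is only a sketch: io_kind and
    looks_like_fastcgi? are made-up names, and treating "stdin is a socket"
    as "running under fastcgi" is an assumption for illustration (a plain cgi
    can also get a pipe on stdin, and the real library's check may differ).

```ruby
require 'socket'

# Classify what kind of object an IO is attached to.
def io_kind(io)
  stat = io.stat
  if stat.socket?
    :socket
  elsif stat.pipe?
    :pipe
  elsif stat.chardev?
    :terminal_or_device
  else
    :file_or_other
  end
end

# Assumption for illustration only: a socket on stdin is taken to mean
# that a FastCGI-aware web server started this process.
def looks_like_fastcgi?(io = $stdin)
  io_kind(io) == :socket
end
```

    run from a terminal, looks_like_fastcgi? is false, which matches the
    idea of the same script falling back to ordinary cgi behaviour.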

    as mentioned above, sometimes re-starting a fastcgi process is troublesome
    (mildly) because apache can manage a pool of such processes (configurable) so
    you need to kill all of them to make sure your changes are propagated to all
    clients - what i typically do is simply 'ln -s foo.fcgi foo.cgi'. therefore i
    can view

    http://127.0.0.1/foo.cgi

    while developing and see changes instantly. when i'm ready i redeploy the
    application. one easy way to do this is to code your process so you can

    http://127.0.0.1/foo.cgi?reload=true

    in which case the process simply exits (apache will restart it again for you).
    obviously this should not work from remote machines!
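
    the reload trick above could be guarded roughly like this. a sketch
    only: reload_allowed? and LOCAL_ADDRESSES are made-up names, and the
    commented usage assumes the cgi object's ['reload'] and remote_addr
    accessors from the standard cgi library.

```ruby
# Addresses that count as "this machine" (an assumption; extend as needed).
LOCAL_ADDRESSES = ['127.0.0.1', '::1'].freeze

# True only when the request asks for a reload AND comes from this machine.
def reload_allowed?(reload_param, remote_addr)
  reload_param == 'true' && LOCAL_ADDRESSES.include?(remote_addr)
end

# Inside the request loop this might be used as follows (not run here,
# since it needs the fcgi library and a web server):
#
#   FCGI.each_cgi do |cgi|
#     exit if reload_allowed?(cgi['reload'], cgi.remote_addr)
#     # ... normal request handling ...
#   end
```

    since apache may keep a pool of these processes, one such request only
    restarts whichever process happened to receive it.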


    > I don't know if there are any really stable and mature libraries for Ruby in
    > comparison to Java libraries.


    these seem stable:
    - fcgi
    - amrita
    - kwartz

    > That was the idea. First the database, then the images on a another machine
    > (maybe with a fast webserver and logging turned off).


    with fastcgi you can move image.cgi and db.cgi to another server by themselves
    without even a web server!

    > Yeah, I read all about that. Two weeks ago our first child was born so I am
    > a little tied at the moment.


    congrats.

    -a
    --
    ===============================================================================
    | EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
    | PHONE :: 303.497.6469
    | ADDRESS :: E/GC2 325 Broadway, Boulder, CO 80305-3328
    | URL :: http://www.ngdc.noaa.gov/stp/
    | TRY :: for l in ruby perl;do $l -e "print \"\x3a\x2d\x29\x0a\"";done
    ===============================================================================
     
    Ara.T.Howard, Apr 27, 2004
    #10
  11. (selective reply)

    "Sascha Ebach" <> wrote in message
    news:...
    > Thanks for all the comments so far. They are really helping.
    >
    > Let me elaborate a little more. In the comments I read interesting ideas
    > I want to go into a little more.
    >
    > Robert Klemme wrote:
    >
    > > The problem with performance is that you normally have to try and
    > > measure to see whether it's ok or not. And of course it heavily
    > > depends on the algorithms and kind of application at hand. I'd say
    > > without more information about these it's impossible to say offhand
    > > whether Ruby will deliver or not.

    >
    > I know that. You are only going to really know the moment your service gets
    > used. Still you have to find a good scalable approach to begin with to
    > minimize the chances of rewriting everything. And I also know that
    > premature optimization is a bad thing. But blindly doing anything that
    > works can be equally problematic in the future.
    >
    > We are currently in the planning stages writing the use cases. And if I
    > would have to compare what the system will have to do I would probably
    > have to say it will be a lot like ebay. Except that it will have nothing
    > to do with auctions. We are not that crazy ;) Ppl will be able to put
    > their profiles in, upload pictures and descriptions of their objects,
    > search the database for these objects, communicate over the system about
    > these objects. Those are the basics. So from the viewpoint of what you
    > will be able to do I see that it is pretty similar to the basic things
    > you can do on ebay. I hope that clears things up a little more.
    >
    > Robert also states that typically (in web applications?) a queue based
    > approach is used.


    I wasn't clear enough: I meant to say in webservers. Tomcat, for example,
    has a thread pool (whose min and max size is configurable). The threads
    start at min size, and the number is increased as needed until max is
    reached.

    > That sounds very interesting. Not only that you can
    > circumvent threads


    You don't circumvent threads but you remove the overhead of thread
    creation - which is significant if processing times are low.

    > which is always good if possible but I can imagine
    > that such a system would be fairly easy to implement.


    Yeah, not really a big deal. (see attached example)
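
A minimal sketch of the queue-plus-pool idea discussed here (not the attachment from the original post): a fixed number of worker threads is created once, and all of them pull jobs from one shared queue. `POOL_SIZE`, `run_pool` and the sample jobs are made up for illustration.

```ruby
# Assumed pool size; in practice this is the "max reasonable" for the machine.
POOL_SIZE = 4

# Runs every job (anything responding to #call) on a fixed pool of threads
# and returns the collected results in completion order.
def run_pool(jobs, pool_size: POOL_SIZE)
  queue   = Queue.new
  results = Queue.new

  # Workers are created once and reused, so no thread is spawned per job.
  workers = Array.new(pool_size) do
    Thread.new do
      while (job = queue.pop) != :stop
        results << job.call
      end
    end
  end

  jobs.each { |job| queue << job }
  pool_size.times { queue << :stop }  # one stop marker per worker
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

jobs    = (1..10).map { |n| -> { n * n } }
squares = run_pool(jobs).sort
puts squares.inspect
```

Because the pool never grows beyond `pool_size`, extra requests simply wait in the queue instead of spawning more threads.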

    > The thing I am
    > wondering about is the response times. While you can have
    > progress bars in a desktop app you really can't afford to let the
    > user wait in a web app. It would probably take a while before that
    > happens on newer hardware, but this was exactly my point. In the event
    > you are so lucky that ppl get crazy about your service and you really
    > get more visitors than anticipated even a queue based solution has to
    > scale. I cannot imagine how to do that so I have to give it more
    > thought.


    Well, you have the response time problem either way - i.e. with threads
    per request as well as with tasks distributed via a queue. The advantage
    of the pool based approach is, that you can configure the max number of
    threads to the max reasonable for the system at hand and thus ensure that
    those requests that don't time out are processed with reasonable
    performance.

    > Maybe such a system can be implemented in such a way that it can
    > be easily scaled with new (upgrading) hardware? I could imagine a queue
    > based system that forwards its items to a distributed network of
    > computers via DRb...


    You mean a load balancing frontend that distributes requests among other
    hosts? Yeah, that's possible. And you can buy such solutions out of the
    box without additional ruby programming.
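
For the DRb variant Sascha describes, a rough sketch might look like the following: one process owns the queue and exposes it over DRb, and workers (normally separate processes, possibly on other hosts) pull jobs from it. The `JobBoard` class, its method names and the job strings are hypothetical; the worker runs in a thread here only to keep the example self-contained.

```ruby
require 'drb/drb'

# Hypothetical front object: a thread-safe job queue plus a result queue.
class JobBoard
  def initialize
    @jobs    = Queue.new
    @results = Queue.new
  end

  def submit(job)
    @jobs << job
    true
  end

  # Blocks until a job is available.
  def take
    @jobs.pop
  end

  def report(result)
    @results << result
    true
  end

  def next_result
    @results.pop
  end
end

# What a worker would do; on another machine it only needs the URI.
def run_worker(uri, count)
  board = DRbObject.new_with_uri(uri)
  count.times do
    job = board.take
    board.report("done: #{job}")
  end
end

board = JobBoard.new
DRb.start_service('druby://127.0.0.1:0', board)  # port 0: pick any free port
uri = DRb.uri

board.submit('resize image 1')
board.submit('resize image 2')

Thread.new { run_worker(uri, 2) }.join
results = [board.next_result, board.next_result]
DRb.stop_service

puts results.inspect
```

Scaling out then means starting more worker processes against the same URI, at the cost of one network round trip per job.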

    Regards

    robert
     
    Robert Klemme, Apr 27, 2004
    #11
  12. Hello Ara.T.Howard,

    ATH> [ahoward@www htdocs]$ ab http://127.0.0.1/env.cgi | grep 'Requests per second'
    ATH> Requests per second: 9.86 [#/sec] (mean)

    ATH> [ahoward@www htdocs]$ ab http://127.0.0.1/env.fcgi | grep 'Requests per second'
    ATH> Requests per second: 179.18 [#/sec] (mean)

    ATH> two orders of magnitude!

    Just nitpicking, but this is a little bit more than one order of
    magnitude. But from my tests I can confirm this difference if your
    page is not completely database bound.
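
The arithmetic behind the nitpick, using only the two `ab` figures quoted above (the variable names are just for illustration):

```ruby
cgi_rps  = 9.86    # plain CGI, requests per second
fcgi_rps = 179.18  # FastCGI, requests per second

speedup = fcgi_rps / cgi_rps
orders  = Math.log10(speedup)

puts format('speedup: %.1fx, orders of magnitude: %.2f', speedup, orders)
```

That comes out at roughly an 18x speedup, or about 1.26 orders of magnitude.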


    >> I don't know if there are any really stable and mature libraries for Ruby in
    >> comparison to Java libraries.


    ATH> these seem stable:
    ATH> - fcgi
    ATH> - kwartz
    ATH> - amrita

    Not 100% stable: you have to use some workarounds and, worse, you
    have to find the places that need the workarounds. But it is usable in a
    production environment, if performance is not the primary goal.

    I would add clearsilver to the lists, but i must admit that i never
    got it compiled on my windows machine: http://www.clearsilver.net/


    >> That was the idea. First the database, then the images on a another machine
    >> (maybe with a fast webserver and logging turned off).



    --
    Best regards, mailto: scholz at scriptolutions dot com
    Lothar Scholz http://www.ruby-ide.com
    CTO Scriptolutions Ruby, PHP, Python IDE 's
     
    Lothar Scholz, Apr 27, 2004
    #12
  13. Sascha Ebach

    Sascha Ebach Guest

    thanks again for all these awesome hints.

    I read the docs and whitepapers on FastCGI on fastcgi.com and have a
    better understanding of what FastCGI is. It sounds very interesting even
    for use with php. As I understand it, it allows you to run different
    versions of php (or whatever) side by side. Might be nice to have an
    apache with ruby, php4 and php5 in parallel.
    I didn't even suspect that fcgi processes can even run on different
    machines. Now I have something to really try.

    Thanks Robert that really cleared it up for me. I knew of threadpools
    but I slightly ;) misunderstood you the first time.

    Kirk, that indeed seems like a good starting point for the
    documentation. Whenever I get around to testing IOWA (and I will when I am
    not so pressed for time anymore) I will likely ask questions. Don't
    expect this before June though.


    Lothar, last time I checked your website the Ruby IDE wasn't ready yet.
    I just installed the preview release and I am looking forward to testing
    it. It looks great on first look. I will test if it can replace my
    beloved vim.
    It seems rather confusing though that when you click on download for the
    php or python version you land on the ruby download page.
    http://www.scriptolutions.com/download_php.php. I am looking forward to
    see these editors mature.

    ---
    Sascha Ebach
     
    Sascha Ebach, Apr 27, 2004
    #13
  14. Hello Sascha,


    SE> Lothar, last time I checked your website the Ruby IDE wasn't ready yet.
    SE> I just installed the preview release and I am looking forward to testing
    SE> it. It looks great on first look. I will test if it can replace my
    SE> beloved vim.

    Thanks. I'm glad of every bug report :)

    You should already be able to use the debugger out of the box in CGIs
    if you choose "Ruby CGI based Website" as the project type. An Apache is
    then started in the background, and all preview commands (i.e. F9
    or the web debug icon in the toolbar) go via the browser to a
    proxy, from there to the Apache, and
    from there via a launcher.exe file to the Ruby CGI. A long way,
    but it is for debugging at one request per second. :)

    Apart from that, I have just fixed 2 more fatal bugs in the debugger (you
    know how it is: in the first 2 weeks there are fatal bug fixes every day),
    so if you have TDSL or ISH, you should download again tomorrow.
    From Friday I will be on vacation for 3 weeks and
    won't be able to make uploads of this size from my little mountain village on
    the Thai/Burmese border.

    SE> It seems rather confusing though that when you click on download for the
    SE> php or python version you land on the ruby download page.
    SE> http://www.scriptolutions.com/download_php.php. I am looking forward to
    SE> see these editors mature.

    Unfortunately that will take a while yet. Oh well, from my friends I am
    learning Buddhist serenity - very important not to forget that in
    software development.

    --
    Best regards, emailto: scholz at scriptolutions dot com
    Lothar Scholz http://www.ruby-ide.com
    CTO Scriptolutions Ruby, PHP, Python IDE 's
     
    Lothar Scholz, Apr 27, 2004
    #14
  15. Sascha Ebach

    Sascha Ebach Guest

    Hello Lothar,

    > From Friday I will be on vacation for 3 weeks and
    > won't be able to make uploads of this size from my little mountain village on
    > the Thai/Burmese border.


    Wow, you own a whole mountain village? Maybe I should get into
    IDE development too ;)

    >
    > Unfortunately that will take a while yet. Oh well, from my friends I am
    > learning Buddhist serenity - very important not to forget that in
    > software development.
    >


    I'm looking forward to it. Have a nice vacation.

    --
    Sascha Ebach
     
    Sascha Ebach, Apr 27, 2004
    #15
  16. LS> Thanks. I'm glad of every bug report :)
    LS> .....

    Sorry for this German posting, it was intended to be a private email
    to Sascha.


    --
    Best regards, emailto: scholz at scriptolutions dot com
    Lothar Scholz http://www.ruby-ide.com
    CTO Scriptolutions Ruby, PHP, Python IDE 's
     
    Lothar Scholz, Apr 27, 2004
    #16
  17. This is very common in J2EE architectures, with an Apache/Tomcat web
    app tier and an EJB container business logic tier. While RMI has some
    cost, it's less than expected, especially given that the business logic
    is the main factor for scalability. That said, a poorly designed
    webapp->businesslogic interface would generate too much RMI chatter
    and cause issues. However, a cohesive, high-level business transaction
    interface to the business logic tier makes RMI very workable, and it's
    also good design. And once the business logic is out-of-process, you
    have a lot of scalability/clustering options.
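
To make the "chatter" point concrete, here is a small sketch in Ruby rather than Java/RMI. `CountingStub` is a made-up stand-in for a remote stub that counts every forwarded call as one round trip, and `OrderService` with its prices is a hypothetical business-logic tier.

```ruby
# Stand-in for a remote stub: forwards calls and counts each as a round trip.
class CountingStub
  attr_reader :round_trips

  def initialize(target)
    @target = target
    @round_trips = 0
  end

  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)

    @round_trips += 1
    @target.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Hypothetical business-logic tier.
class OrderService
  PRICES = { 'book' => 12, 'pen' => 3 }.freeze

  # Fine-grained: the web tier needs one call per item.
  def price_of(item)
    PRICES.fetch(item)
  end

  # Coarse-grained: the whole business transaction in one call.
  def order_total(items)
    items.sum { |item| PRICES.fetch(item) }
  end
end

items = %w[book pen pen]

chatty       = CountingStub.new(OrderService.new)
chatty_total = items.sum { |item| chatty.price_of(item) }

coarse       = CountingStub.new(OrderService.new)
coarse_total = coarse.order_total(items)

puts "chatty: total #{chatty_total} in #{chatty.round_trips} round trips"
puts "coarse: total #{coarse_total} in #{coarse.round_trips} round trip"
```

Both variants compute the same total, but the chatty one pays one remote call per item while the coarse one pays a single call per transaction.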

    On Apr 27, 2004, at 1:19 AM, Dan Janowski wrote:

    >
    > On Apr 26, 2004, at 03:41, Sascha Ebach wrote:
    >>
    >> In my understanding the biggest problem in scaling Ruby (cpu wise) is
    >> that it doesn't have native thread support, yet. What this means in
    >> terms of a web application is that if you only have let's say 30
    >> concurrent users on a fairly new piece of Hardware this is not a
    >> problem. But what happens if your site suddenly gets very popular and
    >> you jump from 20 to 200 or even 2000 concurrent users? How do you
    >> scale such a web app? If you were to program this web app in Java or
    >> any other language which supports native threads you could simply
    >> throw more cpus and ram at it. I am thinking of a blade server here.
    >> The more users you get you simply stick another blade in your server
    >> and have your peace of mind. As I understand it you cannot do
    >> something like that with Ruby. Enter Distributed Ruby (DRb).
    >>
    >> As Martin Fowler states in his first law of distributed object
    >> design: Don't distribute your objects!
    >> <http://c2.com/cgi/wiki?FirstLawOfDistributedObjectDesign>
    >>

    >
    > Distributing objects is one thing, but using drb as a request/response
    > and control protocol is what I am considering at the moment. I was
    > thinking of FCGI, but it is all or nothing in the sense that it forces
    > the whole hit service out of Apache. But a considerable amount of cgi
    > processing is not session dependent and is more appropriate in the
    > Apache side (I use mod_ruby). Separating non-web application logic
    > from cgi/web processing is one way to efficiently distribute the load.
    >
    > Dan
    >
    >
     
    Nick Van Weerdenburg, Apr 27, 2004
    #17