Multithreading to handle 7000 to 10000 messages per minute

Discussion in 'Ruby' started by Kaja Mohaideen, Aug 14, 2008.

  1. Hello,

    we want to process 7000 to 10000 messages per minute and store them in the
    database. we are trying to do this with threads. can you please help me
    with how to do it?

    Regards
    Kaja Mohaidee.A
    Trichy
    --
    Posted via http://www.ruby-forum.com/.
     
    Kaja Mohaideen, Aug 14, 2008
    #1

  2. 2008/8/14 Michael T. Richter <>:
    > On Thu, 2008-08-14 at 14:22 +0900, Kaja Mohaideen wrote:
    >
    > we want to process 7000 to 10000 message and store it to the Database
    > perminute. we are trying through threads. can you please help me how we
    > do?
    >
    > Use a quick language that's designed for massively parallel operations, not
    > one of the slower scripting languages without proper support for
    > parallelism.


    There may be a few ways to get this working with Ruby:

    1. Use Ruby to create a CSV file or similar, which is then loaded into
    the database via the database's bulk loader (a sketch of this follows
    below).

    2. If there is support for batch operations in DBD/DBI, these might work
    as well.

    Since this task is mostly IO bound (unless of course there are
    expensive operations to be done on those messages), Ruby threads may
    well work in this scenario.
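
    A minimal sketch of option 1, assuming the messages for one minute have
    already been collected into an array; the file name, table and the psql
    \copy invocation are illustrative, not from this thread:

    require 'csv'

    # messages is assumed to be an Array of message strings for one minute.
    CSV.open("messages.csv", "w") do |csv|
      messages.each { |m| csv << [m] }
    end

    # Hand the file to the database's bulk loader, e.g. for PostgreSQL:
    #   psql mydb -c "\copy messages(content) from 'messages.csv' csv"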

    > I'm not trying to be glib here (although I may well be succeeding anyway).
    > What you are saying here looks an awful lot to me like "I want to hammer in
    > this screw here. Which wrench is best for the job?" Select your tools
    > appropriately for the problem. Don't try to reforge your wrenches into
    > bizarre combination screwdriver-hammers.


    Well, maybe Kaja is just experimenting and trying out how far he / she
    can push Ruby. :)

    Kind regards

    robert


    --
    use.inject do |as, often| as.you_can - without end
     
    Robert Klemme, Aug 14, 2008
    #2

  3. On Thu, 2008-08-14 at 16:52 +0900, Michael T. Richter wrote:
    > On Thu, 2008-08-14 at 14:22 +0900, Kaja Mohaideen wrote:
    > > we want to process 7000 to 10000 message and store it to the Database
    > > perminute. we are trying through threads. can you please help me how we
    > > do?

    >
    > Use a quick language that's designed for massively parallel
    > operations, not one of the slower scripting languages without proper
    > support for parallelism.
    >
    > I'm not trying to be glib here (although I may well be succeeding
    > anyway). What you are saying here looks an awful lot to me like "I
    > want to hammer in this screw here. Which wrench is best for the job?"
    > Select your tools appropriately for the problem. Don't try to reforge
    > your wrenches into bizarre combination screwdriver-hammers.


    Yes ... Erlang / Mnesia should be able to handle this, and then you
    could write some Ruby code to extract the messages from Mnesia to a
    "regular" RDBMS if needed.
    --
    M. Edward (Ed) Borasky
    ruby-perspectives.blogspot.com

    "A mathematician is a machine for turning coffee into theorems." --
    Alfréd Rényi via Paul Erdős
     
    M. Edward (Ed) Borasky, Aug 14, 2008
    #3
  4. On Aug 14, 2008, at 6:28 AM, M. Edward (Ed) Borasky wrote:

    > On Thu, 2008-08-14 at 16:52 +0900, Michael T. Richter wrote:
    >> On Thu, 2008-08-14 at 14:22 +0900, Kaja Mohaideen wrote:
    >>> we want to process 7000 to 10000 message and store it to the
    >>> Database
    >>> perminute. we are trying through threads. can you please help me
    >>> how we
    >>> do?

    >>
    >> Use a quick language that's designed for massively parallel
    >> operations, not one of the slower scripting languages without proper
    >> support for parallelism.
    >>
    >> I'm not trying to be glib here (although I may well be succeeding
    >> anyway). What you are saying here looks an awful lot to me like "I
    >> want to hammer in this screw here. Which wrench is best for the
    >> job?"
    >> Select your tools appropriately for the problem. Don't try to
    >> reforge
    >> your wrenches into bizarre combination screwdriver-hammers.

    >
    > Yes ... Erlang / Mnesia should be able to handle this, and then you
    > could write some Ruby code to extract the messages from Mnesia to a
    > "regular" RDBMS if needed.



    I'd say use RabbitMQ with the Ruby amqp library. This will allow you
    to easily push many thousands of messages/sec onto the RabbitMQ
    message bus, and then you can have a set of fanout workers consuming
    the queues on the other side and putting the items into the database.
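
    A minimal sketch of that split, using the bunny gem as a stand-in AMQP
    client; the queue name, the incoming_messages source and the
    store_in_database call are placeholders, not from this post:

    require 'bunny'

    conn = Bunny.new                    # connects to a local RabbitMQ by default
    conn.start
    channel = conn.create_channel
    queue   = channel.queue("messages", durable: true)

    # producer side: push messages onto the bus as fast as they arrive
    incoming_messages.each do |msg|
      channel.default_exchange.publish(msg, routing_key: queue.name)
    end

    # consumer side (normally a separate worker process): drain the queue
    # and write each message to the database
    queue.subscribe(block: true) do |_delivery_info, _properties, body|
      store_in_database(body)
    end

    conn.close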

    -Ezra
     
    Ezra Zygmuntowicz, Aug 14, 2008
    #4
  5. On Aug 13, 2008, at 11:22 PM, Kaja Mohaideen wrote:

    > Hello,
    >
    > we want to process 7000 to 10000 message and store it to the Database
    > perminute. we are trying through threads. can you please help me how
    > we
    > do?
    >
    > Regards
    > Kaja Mohaidee.A
    > Trichy



    buffer them and insert them in a transaction 1000 at a time. even
    with ruby this should be a piece of cake.
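
    A minimal sketch of that buffer-and-flush pattern with the sqlite3 gem;
    the database file, table and incoming_messages source are placeholders,
    not from this post:

    require 'sqlite3'

    db = SQLite3::Database.new("messages.db")
    db.execute("CREATE TABLE IF NOT EXISTS messages(content)")

    buffer = []
    incoming_messages.each do |msg|
      buffer << msg
      next if buffer.size < 1000
      # one commit per 1000 rows instead of one commit per row
      db.transaction do
        buffer.each { |m| db.execute("INSERT INTO messages(content) VALUES (?)", m) }
      end
      buffer.clear
    end

    # flush whatever is left over
    unless buffer.empty?
      db.transaction do
        buffer.each { |m| db.execute("INSERT INTO messages(content) VALUES (?)", m) }
      end
    end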

    a @ http://codeforpeople.com/
    --
    we can deny everything, except that we have the possibility of being
    better. simply reflect on that.
    h.h. the 14th dalai lama
     
    ara.t.howard, Aug 14, 2008
    #5
  6. On Thu, Aug 14, 2008 at 12:06 PM, ara.t.howard <> wrote:
    >
    > buffer them and insert them in a transaction 1000 at a time. even with ruby
    > this should be a peice of cake.


    Do any of the ruby db libraries offer support for doing this efficiently?

    martin
     
    Martin DeMello, Aug 14, 2008
    #6
  7. On Aug 14, 2008, at 1:10 PM, Martin DeMello wrote:

    > On Thu, Aug 14, 2008 at 12:06 PM, ara.t.howard
    > <> wrote:
    >>
    >> buffer them and insert them in a transaction 1000 at a time. even
    >> with ruby
    >> this should be a peice of cake.

    >
    > Do any of the ruby db libraries offer support for doing this
    > efficiently?
    >
    > martin


    pretty much all of them


    cfp:~/rails_root > cat a.rb
    size = Integer(ARGV.shift || 10_000)

    messages = Array.new(size).map{ rand.to_s }

    Db = "#{ RAILS_ROOT }/db/#{ RAILS_ENV }.sqlite3"


    # using sqlite directly
    #
    Message.delete_all
    sql = messages.map{|message| "insert into messages(content) values(#{ message.inspect });"}.join("\n")

    a = b = response = nil
    IO.popen("sqlite3 #{ Db } 2>&1", "r+") do |sqlite3|
      a = Time.now.to_f
      sqlite3.puts "begin;"
      sqlite3.puts sql
      sqlite3.puts "end;"
      sqlite3.flush
      sqlite3.close_write
      response = sqlite3.read
      b = Time.now.to_f
    end

    abort response unless $?.exitstatus.zero?

    puts "using sqlite3"
    puts "elapsed: #{ b - a }"
    puts "count: #{ Message.count }"

    # using ar
    #
    Message.delete_all
    a = Time.now.to_f

    Message.transaction do
      messages.each{|message| Message.create! :content => message}
    end

    b = Time.now.to_f

    puts "using ar"
    puts "elapsed: #{ b - a }"
    puts "count: #{ Message.count }"



    cfp:~/rails_root > ./script/runner a.rb
    using sqlite3
    elapsed: 0.222311019897461
    count: 10000

    using ar
    elapsed: 7.75591206550598
    count: 10000

    0.2 seconds for 10000 records seems plenty fast to me. 7 seconds not
    so much.



    a @ http://codeforpeople.com/
    --
    we can deny everything, except that we have the possibility of being
    better. simply reflect on that.
    h.h. the 14th dalai lama
     
    ara.t.howard, Aug 14, 2008
    #7
  8. On Fri, Aug 15, 2008 at 05:56:46AM +0900, ara.t.howard wrote:
    >
    > On Aug 14, 2008, at 1:10 PM, Martin DeMello wrote:
    >
    >> On Thu, Aug 14, 2008 at 12:06 PM, ara.t.howard <>
    >> wrote:
    >>>
    >>> buffer them and insert them in a transaction 1000 at a time. even with
    >>> ruby this should be a peice of cake.

    >>
    >> Do any of the ruby db libraries offer support for doing this efficiently?
    >>
    >> martin

    >
    > pretty much all of them
    >


    [...]

    > cfp:~/rails_root > ./script/runner a.rb
    > using sqlite3
    > elapsed: 0.222311019897461
    > count: 10000
    >
    > using ar
    > elapsed: 7.75591206550598
    > count: 10000
    >
    > 0.2 seconds for 10000 records seems plenty fast to me. 7 seconds not so
    > much.


    If your standard of performance is 10,000 records inserted in a minute, any
    database should be able to satisfy your requirements.
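
    (For scale: 10,000 rows per minute is 10,000 / 60, or roughly 167 inserts
    per second, a small fraction of the ~26,000 inserts per second the
    amalgalite timings below achieve inside a single transaction.)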

    And here's the amalgalite version of ara's test... embedded sqlite in a ruby
    extension.

    % cat am_inserts.rb
    #!/usr/bin/env ruby
    require 'rubygems'
    require 'fileutils'
    require 'amalgalite'

    size = Integer(ARGV.shift || 10_000)

    messages = Array.new(size).map{ rand.to_s }

    Db = "speed-test.db"

    FileUtils.rm_f Db if File.exist?( Db )
    db = Amalgalite::Database.new( Db )
    db.execute(" CREATE TABLE messages(content); ")

    before = Time.now.to_f
    db.transaction do |db_in_trans|
      messages.each do |m|
        db_in_trans.execute("insert into messages(content) values( #{m} )")
      end
    end
    after = Time.now.to_f
    elapsed = after - before
    mps = size / elapsed
    puts "#{"%0.2f" % elapsed} seconds to insert #{size} records at #{"%0.2f" % mps} records per second"

    % ruby am_inserts.rb
    0.38 seconds to insert 10000 records at 25999.01 records per second

    % ruby am_inserts.rb 100000
    3.80 seconds to insert 100000 records at 26344.71 records per second

    enjoy,

    -jeremy

    --
    ========================================================================
    Jeremy Hinegardner
     
    Jeremy Hinegardner, Aug 14, 2008
    #8
  9. Hi, I've found that these examples run the batch message insertion in a
    transaction, which is why they achieve quite impressive performance. But
    if I put every message save into its own transaction, i.e. swap the
    messages.each and db.transaction blocks:

    messages.each do |m|
      db.transaction do |db_in_trans|
        db_in_trans.execute("insert into messages(content) values( '#{m}' )")
      end
    end

    ruby speed-test.rb 1
    0.09 seconds to insert 1 records at 10.75 records per second

    ruby speed-test.rb 10
    1.20 seconds to insert 10 records at 8.31 records per second

    ruby speed-test.rb 100
    11.22 seconds to insert 100 records at 8.91 records per second

    ruby speed-test.rb 1000
    132.27 seconds to insert 1000 records at 7.56 records per second
    the performance goes down pretty badly, to just a few records per second.

    On Fri, Aug 15, 2008 at 5:25 AM, Jeremy Hinegardner <> wrote:
    > If your standard of performance is 10,000 records inserted in a minute, any
    > database should be able to satisfy your requirements.
    >
    > And here's the amalgalite version of ara's test... embedded sqlite in a ruby
    > extension.
    >
    > [...]
     
    femto Zheng, Jul 27, 2009
    #9
