Multithreading to insert 7000 to 10000 messages per minute


Kaja Mohaideen

Hello,

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please help
me with how to do it?

Regards
Kaja Mohaidee.A
Trichy
 

Robert Klemme

2008/8/14 Michael T. Richter said:
On Thu, 2008-08-14 at 14:22 +0900, Kaja Mohaideen wrote:

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please help
me with how to do it?

Use a quick language that's designed for massively parallel operations, not
one of the slower scripting languages without proper support for
parallelism.

There may be a few ways to get this working with Ruby:

1. use Ruby to create a CSV file or similar which is then loaded into
the database via the database's bulk loader.

2. if there is support for batch operations in DBD/DBI these might work as well.
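Option 1 might be sketched roughly like this (the table name, file name, and bulk-load commands are made up for illustration; the actual loader syntax depends on your database):

```ruby
require 'csv'

# Rough sketch of option 1: buffer the incoming messages, write them to
# a CSV file, then hand that file to the database's bulk loader.
# These random strings are stand-ins for the real messages.
messages = Array.new(1_000) { rand.to_s }

CSV.open("messages.csv", "w") do |csv|
  messages.each { |m| csv << [m] }
end

# The file can then be loaded in one shot, e.g.:
#   PostgreSQL:  psql -c "\copy messages(content) FROM 'messages.csv' CSV"
#   MySQL:       LOAD DATA INFILE 'messages.csv' INTO TABLE messages;
```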

Since this task is mostly IO-bound (unless, of course, there are
expensive operations to be done on those messages), Ruby threads may
well work in this scenario.
I'm not trying to be glib here (although I may well be succeeding anyway).
What you are saying here looks an awful lot to me like "I want to hammer in
this screw here. Which wrench is best for the job?" Select your tools
appropriately for the problem. Don't try to reforge your wrenches into
bizarre combination screwdriver-hammers.

Well, maybe Kaja is just experimenting and trying out how far he/she
can push Ruby. :)

Kind regards

robert
 

M. Edward (Ed) Borasky

Use a quick language that's designed for massively parallel
operations, not one of the slower scripting languages without proper
support for parallelism.

I'm not trying to be glib here (although I may well be succeeding
anyway). What you are saying here looks an awful lot to me like "I
want to hammer in this screw here. Which wrench is best for the job?"
Select your tools appropriately for the problem. Don't try to reforge
your wrenches into bizarre combination screwdriver-hammers.

Yes ... Erlang / Mnesia should be able to handle this, and then you
could write some Ruby code to extract the messages from Mnesia to a
"regular" RDBMS if needed.
--
M. Edward (Ed) Borasky
ruby-perspectives.blogspot.com

"A mathematician is a machine for turning coffee into theorems." --
Alfréd Rényi via Paul Erdős
 

Ezra Zygmuntowicz

Yes ... Erlang / Mnesia should be able to handle this, and then you
could write some Ruby code to extract the messages from Mnesia to a
"regular" RDBMS if needed.


I'd say use rabbitmq with the ruby amqp library, this will allow you
to easily push many thousands of messages/sec into the rabbitmq
message bus and then you can have a set of fanout workers consuming
the queues on the other side and putting the items in the database.
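As an in-process stand-in for that architecture (this is not the amqp gem API — with RabbitMQ the Queue below would be an AMQP queue and each worker its own process; all names are made up), the fanout-worker pattern looks roughly like this:

```ruby
require 'thread'

# Sketch of the pattern above: a producer pushes messages onto a queue;
# a pool of workers drains it and writes batches to the database
# (simulated here by a second queue collecting the batches).
queue   = Queue.new
batches = Queue.new   # stands in for the database

workers = Array.new(4) do
  Thread.new do
    buffer = []
    while (msg = queue.pop) != :done
      buffer << msg
      if buffer.size >= 100           # one transaction per 100 messages
        batches << buffer.dup
        buffer.clear
      end
    end
    batches << buffer.dup unless buffer.empty?
  end
end

1_000.times { |i| queue << "message-#{i}" }
workers.size.times { queue << :done }   # one poison pill per worker
workers.each(&:join)

total = 0
total += batches.pop.size until batches.empty?
```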

-Ezra
 

ara.t.howard

Hello,

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please help
me with how to do it?

Regards
Kaja Mohaidee.A
Trichy


buffer them and insert them in a transaction 1000 at a time. even
with ruby this should be a piece of cake.

a @ http://codeforpeople.com/
 

Martin DeMello

buffer them and insert them in a transaction 1000 at a time. even with ruby
this should be a piece of cake.

Do any of the ruby db libraries offer support for doing this efficiently?

martin
 

ara.t.howard

Do any of the ruby db libraries offer support for doing this
efficiently?

martin

pretty much all of them


cfp:~/rails_root > cat a.rb
size = Integer(ARGV.shift || 10_000)

messages = Array.new(size).map{ rand.to_s }

Db = "#{ RAILS_ROOT }/db/#{ RAILS_ENV }.sqlite3"


# using sqlite directly
#
Message.delete_all
sql = messages.map{|message| "insert into messages(content) values(#{ message.inspect });"}.join("\n")

a = b = response = nil
IO.popen("sqlite3 #{ Db } 2>&1", "r+") do |sqlite3|
  a = Time.now.to_f
  sqlite3.puts "begin;"
  sqlite3.puts sql
  sqlite3.puts "end;"
  sqlite3.flush
  sqlite3.close_write
  response = sqlite3.read
  b = Time.now.to_f
end

abort response unless $?.exitstatus.zero?

puts "using sqlite3"
puts "elapsed: #{ b - a }"
puts "count: #{ Message.count }"

# using ar
#
Message.delete_all
a = Time.now.to_f

Message.transaction do
  messages.each{|message| Message.create! :content => message}
end

b = Time.now.to_f

puts "using ar"
puts "elapsed: #{ b - a }"
puts "count: #{ Message.count }"



cfp:~/rails_root > ./script/runner a.rb
using sqlite3
elapsed: 0.222311019897461
count: 10000

using ar
elapsed: 7.75591206550598
count: 10000

0.2 seconds for 10000 records seems plenty fast to me. 7 seconds, not
so much.



a @ http://codeforpeople.com/
 

Jeremy Hinegardner

Do any of the ruby db libraries offer support for doing this efficiently?

martin

pretty much all of them

[...]

cfp:~/rails_root > ./script/runner a.rb
using sqlite3
elapsed: 0.222311019897461
count: 10000

using ar
elapsed: 7.75591206550598
count: 10000

0.2 seconds for 10000 records seems plenty fast to me. 7 seconds, not so
much.

If your standard of performance is 10,000 records inserted in a minute, any
database should be able to satisfy your requirements.

And here's the amalgalite version of ara's test... embedded sqlite in a ruby
extension.

% cat am_inserts.rb
#!/usr/bin/env ruby
require 'rubygems'
require 'amalgalite'
require 'fileutils'   # needed for FileUtils.rm_f below

size = Integer(ARGV.shift || 10_000)

messages = Array.new(size).map{ rand.to_s }

Db = "speed-test.db"

FileUtils.rm_f Db if File.exist?( Db )
db = Amalgalite::Database.new( Db )
db.execute(" CREATE TABLE messages(content); ")

before = Time.now.to_f
db.transaction do |db_in_trans|
  messages.each do |m|
    db_in_trans.execute("insert into messages(content) values( #{m} )")
  end
end
after = Time.now.to_f
elapsed = after - before
mps = size / elapsed
puts "#{"%0.2f" % elapsed} seconds to insert #{size} records at #{"%0.2f" % mps} records per second"

% ruby am_inserts.rb
0.38 seconds to insert 10000 records at 25999.01 records per second

% ruby am_inserts.rb 100000
3.80 seconds to insert 100000 records at 26344.71 records per second

enjoy,

-jeremy
 

femto Zheng

Hi, I've found that these examples run batch message insertion in a
transaction, so they achieve quite impressive performance. But if I put
every message save into its own transaction, i.e. swap messages.each do
and db.transaction do:

messages.each do |m|
  db.transaction do |db_in_trans|
    db_in_trans.execute("insert into messages(content) values( '#{m}' )")
  end
end

ruby speed-test.rb 1
0.09 seconds to insert 1 records at 10.75 records per second

ruby speed-test.rb 10
1.20 seconds to insert 10 records at 8.31 records per second

ruby speed-test.rb 100
11.22 seconds to insert 100 records at 8.91 records per second

ruby speed-test.rb 1000
132.27 seconds to insert 1000 records at 7.56 records per second

the performance goes down pretty badly, to just a few records per second.
 
