Memory leak problem


Elias Orozco

I have a .txt file with some data that I need to import into the database.
I'm using Ruby to import the data, but I have a major problem: when I have
a large amount of data I run out of memory.


File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT").each do |line|
  @stringArray = line.split("|")
  @i += 1
  puts @i
  @pid = @stringArray[0]
  @chain_id = @stringArray[1]
  @business = Business.find_by_pid_and_chain_id(@pid, @chain_id)
  # Check PID + CHAIN_ID
  @business.pid = @stringArray[0]
  @business.chain_id = @stringArray[1]
  @business.cityname = @stringArray[17]
  @business.state = @stringArray[18]
  @business.business = Business.find_by_pid_and_chain_id(@pid, @chain_id)
  @business.city = City.new
  @business.business_category = get_category_id(@stringArray[40])
  @business.address = @stringArray[8] + " " + @stringArray[9] + " " +
                      @stringArray[10] + " " + @stringArray[11] + " " +
                      @stringArray[12] + " " + @stringArray[13] + " " +
                      @stringArray[14]
  if @chain_id == nil
    @chain_id = ""
  end
  business.save
end


I believe that on every cycle of the loop Ruby allocates new blocks of memory
for my instances of Business. Can someone help me please?

Thanks,

Elioncho
 

Roger Pack

Elias said:
I have a .txt file with some data that I need to import into the database.
I'm using Ruby to import the data, but I have a major problem: when I have
a large amount of data I run out of memory.


File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT").each do |line|
  @stringArray = line.split("|")
  @i += 1
  puts @i
  @pid = @stringArray[0]
  @chain_id = @stringArray[1]
  @business = Business.find_by_pid_and_chain_id(@pid, @chain_id)
  # Check PID + CHAIN_ID
  @business.pid = @stringArray[0]
  @business.chain_id = @stringArray[1]
  @business.cityname = @stringArray[17]
  @business.state = @stringArray[18]
  @business.business = Business.find_by_pid_and_chain_id(@pid, @chain_id)
  @business.city = City.new
  @business.business_category = get_category_id(@stringArray[40])
  @business.address = @stringArray[8] + " " + @stringArray[9] + " " +
                      @stringArray[10] + " " + @stringArray[11] + " " +
                      @stringArray[12] + " " + @stringArray[13] + " " +
                      @stringArray[14]
  if @chain_id == nil
    @chain_id = ""
  end
  business.save
end


I believe that on every cycle of the loop Ruby allocates new blocks of memory
for my instances of Business.

Yes, that's right. You've possibly run into an infamous "Ruby's GC is
broken!" bug. Then again, I could be wrong.
Question:
don't you want @business.save?

Anyway, some ways to avoid this:
you may be able to use ar-extensions, which allows multiple inserts
into the DB. Oh wait, except that you are doing multiple updates. Never
mind.
In that case, as gross as it seems, you could try forking once per loop
[or once every x lines of the input file]--that way the forked process
will die [with its high RAM consumption], allowing the parent process to
continue [and fork more].

require 'forkoff'

File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT").each do |line|
  [1].forkoff {
    # do your stuff
  }
end
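If you'd rather not depend on the forkoff gem, the same idea works with plain Process.fork (not available on Windows). This is only a sketch: `import_line` is a hypothetical stand-in for the ActiveRecord work in the original post, and a throwaway temp file stands in for Neighborsville.TXT:

```ruby
require 'tempfile'

# Hypothetical per-line import; stands in for the Business lookup/update
# logic in the original post.
def import_line(line)
  line.split("|").first
end

# Throwaway sample data standing in for Neighborsville.TXT.
data = Tempfile.new("neighborsville")
data.puts "1|A|foo"
data.puts "2|B|bar"
data.close

# Fork once per batch so each child's memory is released when it exits;
# the parent only waits, so its own heap never grows.
File.open(data.path) do |f|
  f.each_slice(1) do |batch|     # batch size 1 for the demo; use ~500 in practice
    pid = Process.fork do
      batch.each { |line| import_line(line) }
      exit!(0)                   # skip at_exit handlers in the child
    end
    Process.wait(pid)            # reap the child before the next batch
  end
end
```

Each child exits after its batch, so whatever it allocated goes back to the OS instead of accumulating in one long-lived Ruby process.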

Maybe it won't help. I know with huge data sets it helps me avoid GC
problems. I've never tried it with SQL and Rails.
Good luck.
-=R
 

daniel hoey

I have a .txt file with some data that I need to import into the database.
I'm using Ruby to import the data, but I have a major problem: when I have
a large amount of data I run out of memory.

File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT").each do |line|
  @stringArray = line.split("|")
  @i += 1
  puts @i
  @pid = @stringArray[0]
  @chain_id = @stringArray[1]
  @business = Business.find_by_pid_and_chain_id(@pid, @chain_id)
  # Check PID + CHAIN_ID
  @business.pid = @stringArray[0]
  @business.chain_id = @stringArray[1]
  @business.cityname = @stringArray[17]
  @business.state = @stringArray[18]
  @business.business = Business.find_by_pid_and_chain_id(@pid, @chain_id)
  @business.city = City.new
  @business.business_category = get_category_id(@stringArray[40])
  @business.address = @stringArray[8] + " " + @stringArray[9] + " " +
                      @stringArray[10] + " " + @stringArray[11] + " " +
                      @stringArray[12] + " " + @stringArray[13] + " " +
                      @stringArray[14]
  if @chain_id == nil
    @chain_id = ""
  end
  business.save
end

I believe that on every cycle of the loop Ruby allocates new blocks of memory
for my instances of Business. Can someone help me please?

Thanks,

Elioncho

I would suggest that you don't use instance variables in this situation
if it is possible (i.e. 'business' rather than '@business', 'i' rather
than '@i'). It doesn't look like you need instance variables here, and
using them could mean that the GC won't collect some memory that it
otherwise could.
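A minimal sketch of the difference, with a hard-coded sample string standing in for the real file; the locals (`fields`, `pid`, and so on) go out of scope each iteration and become collectable, whereas an `@stringArray` would stay reachable through the enclosing object:

```ruby
# Sample data standing in for two lines of Neighborsville.TXT.
SAMPLE = "1|A|Springfield\n2|B|Shelbyville\n"

i = 0
records = []
SAMPLE.each_line do |line|
  fields = line.chomp.split("|")   # local, not @stringArray: freed each pass
  i += 1                           # local counter, not @i
  records << { pid: fields[0], chain_id: fields[1], cityname: fields[2] }
end
```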

You could also try a patched version of Ruby with better garbage
collection (http://lloydforge.org/projects/ruby/,
http://blog.pluron.com/2008/01/ruby-on-rails-i/comments/page/2/ or
http://www.rubyenterpriseedition.com/)

Also try explicitly calling GC.start at the beginning of every loop
iteration (or every few iterations). This will slow your code down a lot,
but I've occasionally seen cases where it helped.
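A sketch of that pattern; the per-line work here is just a split standing in for the real import, and GC.count is read before and after to confirm the collector actually ran:

```ruby
count_before = GC.count

# Force a collection every 1_000 lines; slow, but it caps heap growth.
Array.new(3_000) { |n| "#{n}|x|y" }.each_with_index do |line, i|
  line.split("|")                 # stand-in for the per-line import work
  GC.start if (i % 1_000).zero?
end

count_after = GC.count
```

With 3,000 lines the explicit GC.start fires three times (at i = 0, 1000, and 2000), so `count_after - count_before` is at least 3.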

All that said, the Ruby GC is a bit crap and you might just have to use
the fork approach suggested by Roger.

Dan
 
