How do I reduce the memory usage of a script?

Scott Ellsworth · Jul 13, 2005

Hi, all.

Please find attached a simple Ruby script that rummages through my
iTunes files, reads the first half megabyte or so, finds the encoder, and
then prints the encoder and filename. This lets me know which tracks
need re-ripping.

This script blows through half a gig of RAM while running, and I really
do not see why. It should only have perhaps a few megabytes at max in
RAM.

FWIW, the output looks like:
iTunes v4.9, QuickTime 7.0.1 /Users/work/Music/iTunes/iTunes Music/Yellowcard/Ocean Avenue Song1.m4a
iTunes v4.9, QuickTime 7.0.1 /Users/work/Music/iTunes/iTunes Music/Yellowcard/Ocean Avenue Song2.m4a
iTunes v4.9, QuickTime 7.0.1 /Users/work/Music/iTunes/iTunes Music/Yellowcard/Ocean Avenue Song3.m4a

Style and speed optimizations are accepted, but the runtime is under a
minute now for the 5500 files I have in my library, so memory usage is
my real problem.

Help?

#!/usr/bin/env ruby
require 'find'
def procpath(f)
  if File.file?(f) then
    if File.fnmatch("*.m4a",f) then
      found = false
      data = IO.read(f, 65536*8)
      re = /[[:alnum:]_., ]{9,}/
      data.scan(re) do |string|
        if (string =~ /QuickTime/) then
          filename = File.basename(f)
          dirname = File.dirname(f)
          # puts "#{string} #{dirname}"
          puts "#{string} #{dirname} #{filename}"
          found = true
          break
        end
      end
      if (!found) then
        puts "Unknown #{f}"
      end
    end
  elsif File.directory?(f) && !File.fnmatch(".", f) &&
        !File.fnmatch("..", f) then
    Dir.foreach(f) { |subf| procpath(subf) }
  end
end

Find.find("/Users/work/Music/iTunes/iTunes Music/") do |f|
  procpath(f)
end

Scott

John Carter · Jul 13, 2005

Scott said:
[...]
  elsif File.directory?(f) && !File.fnmatch(".", f) &&
        !File.fnmatch("..", f) then
    Dir.foreach(f) { |subf| procpath(subf) }

Why are you recursing here? Find.find does this stuff for you!

  end
end
[...]

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.

Logan Capaldo · Jul 13, 2005

Well, mileage may vary and all that jazz, but on my box it took up
about 30 MB virtual according to top and roughly 1.5 to 2 MB physical.
Have you tried explicitly invoking the GC?

daz · Jul 13, 2005

Scott said:
Hi, all.
[...]

This script blows through half a gig of RAM while running, and I really
do not see why. It should only have perhaps a few megabytes at max in
RAM.
[...]

    if (!found) then
      puts "Unknown #{f}"

    else
      data = nil
      GC.start # garbage collect

    end

Any better with that addition?

daz

daz · Jul 13, 2005

(Called away from keyboard)

Compare last with:

    if (!found) then
      puts "Unknown #{f}"
    end
    data = nil
    GC.start # garbage collect

... which will garbage collect more often.
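
Setting data = nil and calling GC.start releases each buffer after the fact. A complementary approach, sketched here and not taken from the thread, is to pass a reusable buffer as the second argument of IO#read, so that the same String object is refilled for every file instead of a fresh one being allocated each time. The helper name each_head is hypothetical, and whether this helps on a given Ruby version would need measuring.

```ruby
# Hypothetical helper: yields the first `length` bytes of each file,
# refilling a single String instead of allocating one per file.
def each_head(paths, length = 65_536 * 8)
  buffer = String.new              # empty, mutable, binary-encoded string
  paths.each do |path|
    File.open(path, "rb") do |io|
      io.read(length, buffer)      # fills buffer in place; leaves it empty at EOF
      yield path, buffer
    end
  end
end
```

The block receives the same buffer object every time, so it should copy out whatever it wants to keep (for example the matched encoder string) rather than hold on to the buffer itself.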

Best,

daz

Scott Ellsworth · Jul 13, 2005

daz said:
    if (!found) then
      puts "Unknown #{f}"
    end
    data = nil
    GC.start # garbage collect

This did seem to drop the memory usage on my MacOS X 10.4.2 system.

I will investigate the Find.find command next to see if I can get rid of
some recursion. An array of 5500 paths should not be _that_ big, at
least in comparison with four or five levels of directory depth.

Scott

Robert Klemme · Jul 14, 2005

Scott said:
This did seem to drop the memory usage on my MacOS X 10.4.2 system.

I will investigate the Find.find command next to see if I can get rid
of some recursion. An array of 5500 paths should not be _that_ big,
at least in comparison with four or five levels of directory depth.

The problem might be that the data is still around while you enter the
recursion. If you want to verify that this is the case, you can simply set
data = nil after processing. But you definitely need to throw out the
recursion from procpath(), otherwise you'll be processing directories over
and over again (I smell something like O(n*n) here)!
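
To illustrate the point about recursion: Find.find already descends into every subdirectory and yields full paths, whereas Dir.foreach yields bare entry names (including "." and ".."), so the extra recursive call adds work without finding anything new. A minimal sketch, assuming a hypothetical helper name m4a_paths that is not part of the original script:

```ruby
require 'find'

# Hypothetical helper: returns every .m4a file below root.
# Find.find walks the whole tree itself, so no manual recursion is needed.
def m4a_paths(root)
  paths = []
  Find.find(root) do |path|
    # path is a full path here, unlike the bare names Dir.foreach yields
    paths << path if File.file?(path) && File.fnmatch("*.m4a", path)
  end
  paths
end
```

Each file is visited exactly once, so the per-file work (reading and scanning) can go straight into the Find.find block.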

Kind regards

robert

Scott Ellsworth · Jul 18, 2005

Robert Klemme said:
The problem might be that the data is still around while you enter the
recursion. If you want to verify that this is the case, you can simply set
data = nil after processing. But you definitely need to throw out the
recursion from procpath(), otherwise you'll be processing directories over
and over again (I smell something like O(n*n) here)!

I have removed the recursion - see below.

A question, though: is the String.scan method I used the best way to
scan this block of data? Every file is going to contain the string
'QuickTime' somewhere in the first few MB, and I want everything from the
last nonprintable character before it to the next nonprintable character
after it. I only need to read from disk until I find that string, and once
I find it, I need only the bytes before, plus a version number
afterwards. I certainly do not need to manipulate more than a few
hundred characters around that magic string, and once I have read it, I do
not need to go back.

NB - nonprintable here means any character outside [[:alnum:]_., ]
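
For the goal of reading only until the string turns up, one option is to read in chunks and carry a short tail of each chunk into the next search, so that a string straddling a chunk boundary is still seen. This is a sketch under stated assumptions, not the script from the thread: the names ENCODER_RE and find_encoder, the chunk size, and the 256-byte overlap (an assumed upper bound on the length of the encoder string) are all hypothetical.

```ruby
# Run of allowed characters, the literal "QuickTime", then another run.
ENCODER_RE = /[[:alnum:]_., ]*QuickTime[[:alnum:]_., ]*/

# Reads io chunk by chunk and returns the encoder string, or nil if
# nothing is found within the first `limit` bytes.
def find_encoder(io, chunk_size: 65_536, overlap: 256, limit: 65_536 * 8)
  window = String.new   # binary buffer: carried-over tail plus the newest chunk
  consumed = 0
  candidate = nil
  while consumed < limit && (chunk = io.read(chunk_size))
    consumed += chunk.bytesize
    window << chunk
    match = ENCODER_RE.match(window)
    if match
      candidate = match[0]
      # A match that ends before the buffer does cannot grow any further.
      return candidate if match.end(0) < window.bytesize
      # Otherwise it may continue in the next chunk, so keep the buffer as is.
    elsif window.bytesize > overlap
      # No match yet: keep only the tail, in case the string straddles chunks.
      window = window.byteslice(-overlap, overlap)
    end
  end
  candidate
end
```

A call such as File.open(path, "rb") { |io| find_encoder(io) } stops reading as soon as a complete match is available, so at most one chunk plus the overlap is held at a time while searching.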

work@boggle:Desktop$ time ./detectEncoding.rb > songs.txt

real 3m30.563s
user 0m26.229s
sys 0m23.746s

New code:

#!/usr/bin/env ruby
require 'find'
re = /[[:alnum:]_., ]{9,}/
Find.find("/Users/work/Music/iTunes/iTunes Music/") do |f|
  if File.file?(f) && File.fnmatch("*.m4a",f) then
    found = false
    data = IO.read(f, 65536*8)
    data.scan(re) do |string|
      if (string =~ /QuickTime/) then
        filename = File.basename(f)
        dirname = File.dirname(f)
        puts "#{string} #{dirname}"
        # puts "#{string} #{dirname} #{filename}"
        found = true
        break
      end
    end
    if (!found) then
      puts "Unknown #{f}"
    end
    data = nil
    GC.start # garbage collect
  end
end

Scott

Robert Klemme · Aug 3, 2005

Scott Ellsworth said:
I have removed the recursion - see below.

A question, though: is the String.scan method I used the best way to
scan this block of data? Every file is going to contain the string
'QuickTime' somewhere in the first few MB, and I want everything from the
last nonprintable character before it to the next nonprintable
character after it. I only need to read from disk until I find that
string, and once I find it, I need only the bytes before, plus a
version number afterwards. I certainly do not need to manipulate
more than a few hundred characters around that magic string, and once
I have read it, I do not need to go back.

NB - nonprintable here means any character outside [[:alnum:]_., ]

The problem with your script is that it does not find "QuickTime" if your
chunk reading cuts it in half (or "Q" and "uickTime" - whatever). It might
be easier to just slurp in the complete file (depending on size - a few MB
are no problem) and then do the scan on the single string. Also, I don't
understand why you don't put QuickTime into your search RE.
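
Following the suggestion to put QuickTime into the search RE, one possible sketch (not the script from the thread) builds the allowed character class around the literal word, so a single match returns the whole encoder string. The names ENCODER_RE, encoder_string and encoder_for are hypothetical, and the character class is the one used earlier in the thread.

```ruby
# Run of allowed characters, the literal "QuickTime", then another run.
ENCODER_RE = /[[:alnum:]_., ]*QuickTime[[:alnum:]_., ]*/

# Returns the encoder string found in data, or nil when there is none.
def encoder_string(data)
  data[ENCODER_RE]
end

# Reads up to 512 KB from the start of path, as the original script does,
# and returns the encoder string or nil.
def encoder_for(path, length = 65_536 * 8)
  data = File.binread(path, length)
  data && encoder_string(data)
end
```

Compared with scanning for every run of nine or more allowed characters and then testing each one for QuickTime, this leaves the filtering to the regular expression engine and stops at the first hit.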

Kind regards

robert

