Finding Duplicate MP3s

K

Khurrum Ma

I'm a RubyNoobie and I am writing a few random scripts to learn the
language. Would it be alright to post scripts to get them criticized by
pros?

This is supposed to search a folder to find duplicate MP3s and move them
to a destination directory.

*When you search a directory structure with lots of branches then
mp3info gives an error.
*The source and destination folders NEED to have the backslash \ or else
the script does horrible things. So it needs better input checking

I don't always know the best programmer practices and it would help me
if someone can make suggestions.


# Finds duplicate MP3s and move them to stated destination
# usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir\
# the ending \ slash is REQUIRED
require "mp3info"
require "ftools"

class DuplicateMP3
def initialize()
unless ARGV.length == 2
puts "usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir\"
exit
end
@dir_source = ARGV[0]
@dir_destination = ARGV[1]

showallduplicates()
end

def showallduplicates()
directory = @dir_source
duplicates = Hash.new { |h,k| h[k] = [] }
Dir.chdir(@dir_source)

#Check if the file is a file then
#Make a unique ID with the mp3 title and mp3 song length combined
#Enter unique into the hash as a key
#If unique key is already in the hash then move the file to destination
folder

Dir["**/*.mp3"].each do |file|
if File.file?(file)
Mp3Info.open(file) do |mp3|
unique = mp3.tag.title.to_s + mp3.length.to_s
if (duplicates.has_key?(unique))
puts "duplicate: #{file}. Moving..."
movedupes(file)
else
duplicates[unique].push(file)
end
end
end
end
end

def movedupes(file)
source = @dir_source + file.to_s
dest = @dir_destination + File.basename(file.to_s)
File.move(source, dest)
puts "file moved"
end

end

find = DuplicateMP3.new()
 
J

Judson Lester

Note: parts of this message were removed by the gateway to make it a legal Usenet post.

I don't always know the best programmer practices and it would help me

Sure.


class DuplicateMP3
def initialize()
unless ARGV.length == 2
puts "usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir\"
exit
end
@dir_source = ARGV[0]
@dir_destination = ARGV[1]

showallduplicates()


Meh. I'd pass source and dest to initialize directly. I'd call
showallduplicates() on the DuplicateMP3 object directly. If you really
want, add a class method like

def self.do_it(src, dest)
mover = new(src,dest)
mover.showallduplicates()
end

Maybe. I'm shaky on this actually needing to be a class, but I guess it
works okay...

end
def showallduplicates()
directory = @dir_source
duplicates = Hash.new { |h,k| h[k] = [] }
Dir.chdir(@dir_source)

#Check if the file is a file then
#Make a unique ID with the mp3 title and mp3 song length combined
#Enter unique into the hash as a key
#If unique key is already in the hash then move the file to destination
folder

Here's the biggest thing: look into using Find (in the standard library.)
It's almost always superior to Dir, unless you really don't want to recurse
at all.
Dir["**/*.mp3"].each do |file|
if File.file?(file)
Mp3Info.open(file) do |mp3|
unique = mp3.tag.title.to_s + mp3.length.to_s

I'd use [mp3.tag.title.to_s, mp3.length] instead. It's certainly a corner
case, but it'd be a shame if you had "song" that was 210 long treated as a
dupe of "song2", 10 long.

if (duplicates.has_key?(unique))
puts "duplicate: #{file}. Moving..."
movedupes(file)
else
duplicates[unique].push(file)

Meh again. Use a regular hash, and replace this line with
"duplicates[unique] = true" since you don't ever use the values of the
duplicates hash.

end
end
end
end
end

def movedupes(file)


And the solution to your input problem is:
"source = File::join(@dir_source, file.to_s)"
If you use Find, it'll be a little easier here, and tricker in the
destination, because Find will give you the full path anyway.
source = @dir_source + file.to_s


Instead of File.basename, you'll want to use something else. If you're
using Find, you'll get reasonably full paths, so you might be able to do
something like
file.to_s.sub(%r{^#{Regexp::escape(@dir_source)}}, "")
which will find the source path and replace it with an empty string. I
don't know a better way work with the paths, honestly.
And you'd use File::join again here, as well.
dest = @dir_destination + File.basename(file.to_s)
File.move(source, dest)
puts "file moved"
end

end

find = DuplicateMP3.new()

Ultimately, I think you want to have this look like
if($0 == __FILE__)
#Put the ARGV.length check here
find = DuplicateMP3.new(ARGV[0], ARGV[1])
find.showallduplicates()
end
 
J

Judson Lester

Note: parts of this message were removed by the gateway to make it a legal Usenet post.

I don't always know the best programmer practices and it would help me

Sure.


class DuplicateMP3
def initialize()
unless ARGV.length == 2
puts "usage: duplicatemp3.rb c:\source\dir\ c:\destination\dir\"
exit
end
@dir_source = ARGV[0]
@dir_destination = ARGV[1]

showallduplicates()


Meh. I'd pass source and dest to initialize directly. I'd call
showallduplicates() on the DuplicateMP3 object directly. If you really
want, add a class method like

def self.do_it(src, dest)
mover = new(src,dest)
mover.showallduplicates()
end

Maybe. I'm shaky on this actually needing to be a class, but I guess it
works okay...

end
def showallduplicates()
directory = @dir_source
duplicates = Hash.new { |h,k| h[k] = [] }
Dir.chdir(@dir_source)

#Check if the file is a file then
#Make a unique ID with the mp3 title and mp3 song length combined
#Enter unique into the hash as a key
#If unique key is already in the hash then move the file to destination
folder

Here's the biggest thing: look into using Find (in the standard library.)
It's almost always superior to Dir, unless you really don't want to recurse
at all.
Dir["**/*.mp3"].each do |file|
if File.file?(file)
Mp3Info.open(file) do |mp3|
unique = mp3.tag.title.to_s + mp3.length.to_s

I'd use [mp3.tag.title.to_s, mp3.length] instead. It's certainly a corner
case, but it'd be a shame if you had "song" that was 210 long treated as a
dupe of "song2", 10 long.

if (duplicates.has_key?(unique))
puts "duplicate: #{file}. Moving..."
movedupes(file)
else
duplicates[unique].push(file)

Meh again. Use a regular hash, and replace this line with
"duplicates[unique] = true" since you don't ever use the values of the
duplicates hash.

end
end
end
end
end

def movedupes(file)


And the solution to your input problem is:
"source = File::join(@dir_source, file.to_s)"
If you use Find, it'll be a little easier here, and tricker in the
destination, because Find will give you the full path anyway.
source = @dir_source + file.to_s


Instead of File.basename, you'll want to use something else. If you're
using Find, you'll get reasonably full paths, so you might be able to do
something like
file.to_s.sub(%r{^#{Regexp::escape(@dir_source)}}, "")
which will find the source path and replace it with an empty string. I
don't know a better way work with the paths, honestly.
And you'd use File::join again here, as well.
dest = @dir_destination + File.basename(file.to_s)
File.move(source, dest)
puts "file moved"
end

end

find = DuplicateMP3.new()

Ultimately, I think you want to have this look like
if($0 == __FILE__)
#Put the ARGV.length check here
find = DuplicateMP3.new(ARGV[0], ARGV[1])
find.showallduplicates()
end
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,528
Members
45,000
Latest member
MurrayKeync

Latest Threads

Top