Fastest Array Search?

D

Derek Cannon

Hello, everyone! What would be the quickest/best way to check elements
of an array, which themselves are arrays of hashes? Quick example:

mov1 = {"title"=>"Batman Returns", "genre"=>"Action"}
mov2 = {"title"=>"Batman", "genre"=>"Action"}
mov3 = {"title"=>"Batman Begins", "genre"=>"Action"}
mov4 = {"title"=>"The Dark Knight", "genre"=>"Action"}
array = mov1, mov2, mov3, mov4

Right now, to search for "Batman Beings", for instance, I'm using:

puts array.each { |i|
if i["title"] == "Batman Begins"
puts "Found!"
break # ends the searching because item is found
end
}

Is there a better or faster way to do this in Ruby? I'm going to be
looking through much more data then this, so I want the most efficient
code possible. I'm just a beginner, so this is pretty much all I know
how to do in Ruby.
 
R

Robert Klemme

2010/3/30 Derek Cannon said:
Hello, everyone! What would be the quickest/best way to check elements
of an array, which themselves are arrays of hashes? Quick example:

mov1 =3D {"title"=3D>"Batman Returns", "genre"=3D>"Action"}
mov2 =3D {"title"=3D>"Batman", "genre"=3D>"Action"}
mov3 =3D {"title"=3D>"Batman Begins", "genre"=3D>"Action"}
mov4 =3D {"title"=3D>"The Dark Knight", "genre"=3D>"Action"}
array =3D mov1, mov2, mov3, mov4

Right now, to search for "Batman Beings", for instance, I'm using:

puts array.each { |i|
=A0if i["title"] =3D=3D "Batman Begins"
=A0 =A0puts "Found!"
=A0 =A0break # ends the searching because item is found
=A0end
=A0}

Is there a better or faster way to do this in Ruby? I'm going to be
looking through much more data then this, so I want the most efficient
code possible. I'm just a beginner, so this is pretty much all I know
how to do in Ruby.

I'd use #find:

array.find {|i| i["title"] =3D=3D "Batman Begins"}

If you need to sift through large volumes of data it is reasonable to
create an index using one or more Hash instances.

Kind regards

robert


--=20
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
J

Jesús Gabriel y Galán

Hello, everyone! What would be the quickest/best way to check elements
of an array, which themselves are arrays of hashes? Quick example:

mov1 =3D {"title"=3D>"Batman Returns", "genre"=3D>"Action"}
mov2 =3D {"title"=3D>"Batman", "genre"=3D>"Action"}
mov3 =3D {"title"=3D>"Batman Begins", "genre"=3D>"Action"}
mov4 =3D {"title"=3D>"The Dark Knight", "genre"=3D>"Action"}
array =3D mov1, mov2, mov3, mov4

Right now, to search for "Batman Beings", for instance, I'm using:

puts array.each { |i|
=A0if i["title"] =3D=3D "Batman Begins"
=A0 =A0puts "Found!"
=A0 =A0break # ends the searching because item is found
=A0end
=A0}

Is there a better or faster way to do this in Ruby? I'm going to be
looking through much more data then this, so I want the most efficient
code possible. I'm just a beginner, so this is pretty much all I know
how to do in Ruby.

item =3D array.find {|movie| movie["title"] =3D=3D "Batman Begins"}

If the array is not sorted by the criteria you want to search for, you
will have to traverse the array looking at all elements (O(n)). If you
have a relevant amount of data you should look into using a real
database to do the searching for you with appropriate indexes.

Jesus.
 
D

Derek Cannon

If you have a relevant amount of data you should look into using a real
database to do the searching for you with appropriate indexes.

Well this code is going to be in Rails, searching through a SQLite3
database. Basically, I have a bunch of movies and my array will update
as my collection grows, but when I tell it to add more movies to the
database, I don't want it to add movies that already exist there. Is
there an even better way to make sure a database entry doesn't exist?

By the way: Thanks for the quick replies Robert and Jesus.
 
J

Jesús Gabriel y Galán

Well this code is going to be in Rails, searching through a SQLite3
database. Basically, I have a bunch of movies and my array will update
as my collection grows, but when I tell it to add more movies to the
database, I don't want it to add movies that already exist there. Is
there an even better way to make sure a database entry doesn't exist?

Primary keys and unique indexes in the database ensure that you don't
have duplicated rows.
Are you using ActiveRecord? I don't really get what you say about "my
array will update".

In any case, if you have a database holding your data, don't traverse
an array with all the records to search for a duplicate before
inserting, use the DB facilities...

Jesus.
 
D

Derek Cannon

Primary keys and unique indexes in the database ensure that you don't
have duplicated rows.
Are you using ActiveRecord?

I don't even know what that means yet. I just started with Rails a few
days ago. I just created a project, created a database with the fields I
want, and used scaffold. I can add/edit/delete elements of my database,
that's pretty much all. I've already written a class that will parse XML
files for data, load that data into an array of instantiated classes
called "Movie". For example, each Movie class has a title, rating, plot,
etc. I just want to make sure the movie's title and release date don't
already exist in a database before I add them. For example, I was
thinking of something like this:

# @movie_db is the Rails SQLite3 database
# @movies holds the temporary array of all the Movie classes,
#each which have many variables: title, rating, plot, etc

@movies.each { |i|
if @movie_db.find { |x| (x["title"] == i.title) and
x["release_year"] == x.release_year } == false
@movie_db = MovieDb.new("title"=>i.title,"plot"=>i.plot,etc)
@movie_db.save
end
end

^ This is just the basic code. It makes sure @movies elements will only
be added if a movie is unique (based on title and release date (since
there are some movies with the same name, but none that I can think of
with the same name AND release year)).
 
J

Josh Cheek

[Note: parts of this message were removed to make it a legal post.]

Primary keys and unique indexes in the database ensure that you don't
have duplicated rows.
Are you using ActiveRecord?

I don't even know what that means yet. I just started with Rails a few
days ago. I just created a project, created a database with the fields I
want, and used scaffold. I can add/edit/delete elements of my database,
that's pretty much all. I've already written a class that will parse XML
files for data, load that data into an array of instantiated classes
called "Movie". For example, each Movie class has a title, rating, plot,
etc. I just want to make sure the movie's title and release date don't
already exist in a database before I add them. For example, I was
thinking of something like this:

# @movie_db is the Rails SQLite3 database
# @movies holds the temporary array of all the Movie classes,
#each which have many variables: title, rating, plot, etc

@movies.each { |i|
if @movie_db.find { |x| (x["title"] == i.title) and
x["release_year"] == x.release_year } == false
@movie_db = MovieDb.new("title"=>i.title,"plot"=>i.plot,etc)
@movie_db.save
end
end

^ This is just the basic code. It makes sure @movies elements will only
be added if a movie is unique (based on title and release date (since
there are some movies with the same name, but none that I can think of
with the same name AND release year)).
Hi, Derek, there is a ML specifically for Rails
http://groups.google.com/group/rubyonrails-talk

The short answer is that you're doing way too much work (and what a nice
short answer that is :)

You don't need to worry about the DB as much as you are, let ActiveRecord do
that for you. Your model can declare validations that need to be met, such
as "this attribute should be unique", and then AR will not save any data to
the db if that validation does not pass.

In your case, the model might look like this
class Movie < ActiveRecord::Base
validates_uniqueness_of :title , :scope => :genre , :message =>
"title/genre pair must be unique"
end

Here is an example that shows a session, where I create several movies, some
of which are duplicates. You see that it does not save the duplicates
because they fail this validation. Then I query the db for all the movies it
contains, and you will see that there are no duplicate movies listed in
there.

http://img202.imageshack.us/img202/213/picture6os.png



You shouldn't be replicating the db in memory, and you shouldn't be
validating data outside your model like this (I assume this code is in your
controller).

Anyway, hope that helps, and for an exceptionally good resource on Rails,
check out guides.rubyonrails.org In this particular case, the information to
create the validation was in the guide titled "Active Record Validations and
Callbacks". I know there is a lot there, but with all seriousness, the time
you spend reading these, and getting familiar with what is available (even
if it's just to make a mental note of it, and then later look it up when you
need it) will save you hundreds more time and effort (and lets not forget
frustration) in the long run. Because Rails will do things for you that just
make your life so much easier, and that you might spend several days working
on to create a slower buggier solution to something there is already a
method for, or that AR already accounts for. I speak from experience on that
:p
 
D

Derek Cannon

Here is an example that shows a session, where I create several movies,
some
of which are duplicates. You see that it does not save the duplicates
because they fail this validation. Then I query the db for all the
movies it
contains, and you will see that there are no duplicate movies listed in
there.

http://img202.imageshack.us/img202/213/picture6os.png

Oh wow, thanks very much for all this work! This picture has really
given me a better picture of what you mean!
You shouldn't be replicating the db in memory, and you shouldn't be
validating data outside your model like this (I assume this code is in
your controller).

Anyway, hope that helps, and for an exceptionally good resource on
Rails, check out guides.rubyonrails.org In this particular case, the
information to create the validation was in the guide titled "Active Record Validations and Callbacks". I know there is a lot there, but with all seriousness, the time you spend reading these, and getting familiar with what is available (even if it's just to make a mental note of it, and then later look it up when you need it) will save you hundreds more time and effort (and lets not
forget frustration) in the long run. Because Rails will do things for you that
just make your life so much easier, and that you might spend several days
working on to create a slower buggier solution to something there is already a
method for, or that AR already accounts for. I speak from experience on that

I'll definitely check this out. Thanks again for all the work you put
into your answer. It's really helped me to better grasp some of the
great features of Rails!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,535
Members
45,008
Latest member
obedient dusk

Latest Threads

Top