[QUIZ] Index and Query (#54)

Ruby Quiz · Nov 12, 2005

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Lyndon Samson

All this fiddling with bits in the thread "How to get non-unique elements from
an array" got me digressing to search engines and indexing.

So if you have for example:

Doc1=The quick brown fox

Doc2=Jumped over the brown dog

Doc3=Cut him to the quick

You can build a table with bit number and word.

1 the
2 quick
3 brown
4 fox
5 jumped
6 over
7 dog
8 cut
9 him
10 to

To create indices:

Doc1=0000001111
Doc2=0001110101
Doc3=1110000011

You can very quickly return the Docs that contain 'the' [ Doc1,Doc2,Doc3 ], or
brown [ Doc1,Doc2 ] etc.
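
The idea can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not one of the quiz solutions: the names DOCS, WORDS, BITMAPS and docs_containing are made up for the example, and bit 0 here plays the role of bit number 1 in the table.

```ruby
# Hypothetical sketch of the bit-per-word index described above.
DOCS = {
  "Doc1" => "The quick brown fox",
  "Doc2" => "Jumped over the brown dog",
  "Doc3" => "Cut him to the quick"
}

# Assign each distinct word a bit position, in order of first appearance.
WORDS = DOCS.values.flat_map { |text| text.downcase.split }.uniq

# One integer per document; bit n is set when the document contains WORDS[n].
BITMAPS = DOCS.transform_values do |text|
  text.downcase.split.inject(0) { |bits, word| bits | (1 << WORDS.index(word)) }
end

# Returns the names of the documents whose bitmap has the word's bit set.
def docs_containing(word)
  bit = WORDS.index(word.downcase)
  return [] unless bit
  BITMAPS.select { |_name, bits| bits[bit] == 1 }.keys
end

BITMAPS.each { |name, bits| puts "#{name}=#{bits.to_s(2).rjust(WORDS.size, '0')}" }
p docs_containing("the")    # ["Doc1", "Doc2", "Doc3"]
p docs_containing("brown")  # ["Doc1", "Doc2"]
```

A query is a single bit test per document, which is why the lookup is quick.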

This week's Ruby Quiz is to write a simple indexer/query system.

[ Note:

In the spirit of that thread, I think part of the quiz should be to solve the
indexing problem in the shortest, most elegant, yet fastest way possible. Maybe
that goes without saying, but I've seen some pretty long quiz solutions in the
past.

--Ryan Leavengood ]

Bob Showalter · Nov 14, 2005

OK, here's a real down and dirty implementation of the basic bitmap
index. Probably horribly inefficient, but takes advantage of Ruby's
handy Bignum class. I did find myself wishing for a Bignum#[]= method.
It uses Marshal to save the index data between runs. Run with no
arguments for instructions.
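
On the missing Bignum#[]=: Ruby integers are immutable, so a bit cannot be set in place, but Integer#[] reads a bit and the |, & and ~ operators return a new integer with a bit set or cleared. A small sketch of that (the helper names set_bit and clear_bit are invented for this note, they are not part of Ruby):

```ruby
# Hypothetical helpers; Ruby has Integer#[] for reading a bit but no []= for writing one.
def set_bit(mask, i)
  mask | (1 << i)
end

def clear_bit(mask, i)
  mask & ~(1 << i)
end

mask = 0
mask = set_bit(mask, 3)
mask = set_bit(mask, 100)   # large positions work; integers grow as needed
puts mask[3]                # 1
puts mask[4]                # 0
mask = clear_bit(mask, 3)
puts mask[3]                # 0
puts mask == 1 << 100       # true
```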

#!/usr/local/bin/ruby

class Index

  INDEX_FILE = 'index.dat'

  # loads existing index file, if any
  def initialize
    @terms = {}
    @index = {}
    if File.exist? INDEX_FILE
      @terms, @index = Marshal.load(File.open(INDEX_FILE, 'rb') { |f| f.read })
    end
  end

  # sets the current document being indexed
  def document=(name)
    @document = name
  end

  # adds given term to the index under the current document
  def <<(term)
    raise "No document defined" unless defined? @document
    unless @terms.include? term
      @terms[term] = @terms.length
    end
    i = @terms[term]
    @index[@document] ||= 0
    @index[@document] |= 1 << i
  end

  # finds documents containing all of the specified terms.
  # if a block is given, each document is supplied to the
  # block, and nil is returned. Otherwise, an array of
  # documents is returned.
  def find(*terms)
    @index.each do |document, mask|
      if terms.all? { |term| @terms[term] && mask[@terms[term]] != 0 }
        yield document
      end
    end
  end

  # dumps the entire index
  def dump
    @index.each do |document, mask|
      puts "#{document}:"
      @terms.each do |term, value|
        puts " #{term}" if mask & value
      end
    end
  end

  # saves the index data to disk
  def save
    File.open(INDEX_FILE, 'wb') do |f|
      Marshal.dump([@terms, @index], f)
    end
  end

end

idx = Index.new
case ARGV.shift
when 'add'
  ARGV.each do |fname|
    idx.document = fname
    IO.foreach(fname) do |line|
      line.downcase.scan(/\w+/) { |term| idx << term }
    end
  end
  idx.save
when 'find'
  idx.find(*ARGV.collect { |s| s.downcase }) { |document| puts document }
when 'dump'
  idx.dump
else
  print <<-EOS
Usage: #$0 add file [file...]    Adds files to index
       #$0 find term [term...]   Lists files containing all term(s)
       #$0 dump                  Dumps raw index data
  EOS
end

Bob Showalter · Nov 14, 2005

Bob said:
# finds documents containing all of the specified terms.
# if a block is given, each document is supplied to the
# block, and nil is returned. Otherwise, an array of
# documents is returned.
def find(*terms)
  @index.each do |document, mask|
    if terms.all? { |term| @terms[term] && mask[@terms[term]] != 0 }
      yield document
    end
  end
end

Oops, that comment is wrong. You must supply a block. I forgot to go
back and add the support for returning an array.

Brian Schröder · Nov 14, 2005

[snip]

Hello Bob,

thank you for the solution, but please try to respect the non-spoiler
period in the future. From the announcement eMail:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

cheers,

Brian

James Edward Gray II · Nov 14, 2005

[snip]

Hello Bob,

thank you for the solution, but please try to respect the non-spoiler
period in the future. From the announcement eMail:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

I'm pretty sure we're past that now.

Do we need a Ruby Quiz to parse the date out of an email header and
add 48 hours to it? <laughs>

James Edward Gray II

Keith Fahlgren · Nov 14, 2005

Do we need a Ruby Quiz to parse the date out of an email header and
add 48 hours to it? <laughs>

Actually, we need the following, much more practical, quiz:

Write a program that watches a mailing list (or newsgroup, if you
prefer) for a message with a subject line that contains "[QUIZ]". If
the body of that email also contains "Please do not post any solutions
or spoiler discussion for this quiz until 48 hours have passed from the
time on this message.", the program should email the list once per
minute with the number of minutes remaining until the 48 hours have
passed. Sample output should look about like this:

------------------------------
To: "ruby-talk ML" <[email protected]>
From: "Annoying mailer" <[email protected]>
Subject: [QUIZ][COUNTDOWN] Index and Query (#54)

2879 minutes remaining!
------------------------------

------------------------------
To: "ruby-talk ML" <[email protected]>
From: "Annoying mailer" <[email protected]>
Subject: [QUIZ][COUNTDOWN] Index and Query (#54)

2878 minutes remaining!
------------------------------

------------------------------
To: "ruby-talk ML" <[email protected]>
From: "Annoying mailer" <[email protected]>
Subject: [QUIZ][COUNTDOWN] Index and Query (#54)

2877 minutes remaining!
------------------------------
...

Yes, I'm joking.

Keith

Bob Showalter · Nov 14, 2005

Brian said:
thank you for the solution, but please try to respect the non-spoiler
period in the future.

Sorry, I thought 48 hours had passed.

From the announcement:
Date: Sat, 12 Nov 2005 10:27:39 +0900
Posted: Fri, 11 Nov 2005 20:27:28 -0500

My post:
Date: Tue, 15 Nov 2005 00:56:22 +0900
Posted: Mon, 14 Nov 2005 10:50:36 -0500

That's more than 48 hours, no?
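
The arithmetic can be checked with the standard library's Time.rfc2822. A quick sketch using the two Date headers quoted above; the constant and method names (ANNOUNCED, POSTED, hours_between) are invented for the example:

```ruby
require 'time'

# Date headers copied from the two messages quoted above.
ANNOUNCED = Time.rfc2822("Sat, 12 Nov 2005 10:27:39 +0900")
POSTED    = Time.rfc2822("Tue, 15 Nov 2005 00:56:22 +0900")

# Hypothetical helper: hours elapsed between two Time objects.
def hours_between(earlier, later)
  (later - earlier) / 3600.0
end

elapsed = hours_between(ANNOUNCED, POSTED)
puts format("%.1f hours elapsed", elapsed)   # 62.5 hours elapsed
puts elapsed > 48 ? "past the no-spoiler period" : "still inside the no-spoiler period"
```

That is 2 days, 14 hours, 28 minutes and 43 seconds, comfortably past the 48-hour mark.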

Brian Schröder · Nov 14, 2005

Sorry, I thought 48 hours had passed.

From the announcement:
Date: Sat, 12 Nov 2005 10:27:39 +0900
Posted: Fri, 11 Nov 2005 20:27:28 -0500

My post:
Date: Tue, 15 Nov 2005 00:56:22 +0900
Posted: Mon, 14 Nov 2005 10:50:36 -0500

That's more than 48 hours, no?

So sorry for the confusion. I left on Friday and somehow thought the quiz
was sent today. My inner clock is a bit out of sync at the moment.
Maybe I should write a Ruby program to skew my inner clock until it is
in sync again.

Humble apologies again,

brian

Dale Martenson · Nov 14, 2005

When I think of creating an index in Ruby I think of a Hash. So I
decided to code both and see how they compare. Here are my
non-bitmapped and bitmapped solutions:

class IndexHash
  attr_accessor :index

  def initialize( documents=nil )
    @index = Hash.new( [] )
    input( documents ) if documents
  end

  def input( documents )
    documents.each_pair do |symbol, contents|
      contents.split.each { |word| insert( symbol, word) }
    end
  end

  def insert( document_symbol, word )
    @index[word.downcase] = [] unless @index.has_key?( word.downcase )
    @index[word.downcase].push( document_symbol ) unless
      @index[word.downcase].include?( document_symbol )
  end

  def find( word )
    @index[ word.downcase ]
  end

  def words
    @index.keys.sort
  end
end

class IndexBitmap
  attr_accessor :index

  def initialize( documents=nil )
    @index = []
    @documents = {}
    input( documents ) if documents
  end

  def input( documents )
    documents.each_pair do |symbol, contents|
      contents.split.each { |word| insert( symbol, word) }
    end
  end

  def insert( document_symbol, word )
    @index.push( word.downcase ) unless @index.include?( word.downcase )
    @documents[ document_symbol ] = 0 unless @documents.has_key?( document_symbol )
    # OR the bit in rather than adding it, so a repeated word cannot carry
    # into a neighbouring bit
    @documents[ document_symbol ] |= (1<<@index.index( word.downcase ))
  end

  def find( word )
    result = []
    position = @index.index( word.downcase )
    return result unless position # word was never indexed
    @documents.each_pair do |symbol, value|
      # compare against 0 explicitly: in Ruby, 0 is truthy
      result.push( symbol ) if (value & (1<<position)) != 0
    end
    result
  end

  def words
    @index.sort
  end
end

To verify this, I used the following tests. I just had to change which
class was being tested (@test_class defined in 'setup'):

require 'test/unit'
require 'index'

class Array
  # Contents of the two arrays are the same, but the order may be different
  def equivalent(other)
    return false unless size == other.size
    self.each do |item|
      if !other.include?( item ) then
        return false
      end
    end
    return true
  end
end

DOC1 = "The quick brown fox"
INDEX1 = [ 'the', 'quick', 'brown', 'fox' ]
DOC2 = "Jumped over the brown dog"
INDEX2 = [ 'jumped', 'over', 'the', 'brown', 'dog' ]
DOC3 = "Cut him to the quick"
INDEX3 = [ 'cut', 'him', 'to', 'the', 'quick' ]

class TestIndex < Test::Unit::TestCase
  def setup
    @test_class = IndexBitmap
    @i = @test_class.new
  end

  def test_index_single_document
    @i.input( :doc1=>DOC1 )
    assert_equal( INDEX1.sort, @i.words )
  end

  def test_index_multiple_documents_input_one_at_a_time
    @i.input( :doc1=>DOC1 )
    @i.input( :doc2=>DOC2 )
    @i.input( :doc3=>DOC3 )
    assert_equal( (INDEX1+INDEX2+INDEX3).uniq.sort, @i.words )
  end

  def test_index_multiple_documents_input_all_at_one_time
    @i.input( :doc1=>DOC1, :doc2=>DOC2, :doc3=>DOC3 )
    assert_equal( (INDEX1+INDEX2+INDEX3).uniq.sort, @i.words )
  end

  def test_index_single_document_on_new
    j = @test_class.new( :doc1=>DOC1 )
    assert_equal( INDEX1.sort, j.words )
  end

  def test_index_multiple_documents_input_all_at_one_time_on_new
    j = @test_class.new( :doc1=>DOC1, :doc2=>DOC2, :doc3=>DOC3 )
    assert_equal( (INDEX1+INDEX2+INDEX3).uniq.sort, j.words )
  end

  def test_index_find
    @i.input( :doc1=>DOC1, :doc2=>DOC2, :doc3=>DOC3 )
    assert_equal( true, [:doc1,:doc2,:doc3].equivalent( @i.find( 'the' ) ) )
    assert_equal( true, [:doc1,:doc3].equivalent( @i.find( 'quick' ) ) )
    assert_equal( true, [:doc2].equivalent( @i.find( 'dog' ) ) )
    assert_equal( true, [:doc1,:doc2].equivalent( @i.find( 'brown' ) ) )
  end
end

aurelianito · Nov 14, 2005

Hi!
This is my first post to the ruby quiz.

I've read it and remembered a structure that I've studied in college,
the Trie. So I've implemented a very inefficient Trie and tried it.

In a trie, there is a tree of letters. Each word is saved in the tree
(so, words with the same root share a part of the trie, saving space).
I've added to this structure the references for each word.

This is the code:

require "pp"
require "set"

$stdout.sync = true # rubyeclipse requires it

class Trie
  def initialize
    @containers = Set.new
    @tries = Hash.new
  end

  def containers(word)
    if word.length == 0 then
      return @containers
    end
    trie = @tries[ word[0,1] ]
    return trie ? trie.containers(word[1...word.length]) : Set.new
  end

  def add(word, index)
    if word.length == 0 then
      @containers << index
    else
      # word[0,1] returns a String. word[0] returns a number (yack!)
      trie = @tries[ word[0,1] ] ||= Trie.new
      trie.add( word[1...word.length], index )
    end
  end
end

class Indexer
  def initialize( texts )
    @trie = Trie.new
    texts.each do |t|
      t.split.each do |w|
        @trie.add(w.capitalize, t)
      end
    end
  end

  def containers(word)
    @trie.containers(word.capitalize)
  end
end

texts = ["The quick brown fox", "Jumped over the brown dog",
         "Cut him to the quick"]

indexer = Indexer.new(texts)
puts "containers for \"the\""
pp indexer.containers('the')
# -> ["The quick brown fox", "Jumped over the brown dog", "Cut him to the quick"]
puts "containers for \"brown\""
pp indexer.containers('brown')
# -> ["The quick brown fox", "Jumped over the brown dog"]
puts "containers for \"inexistant\""
pp indexer.containers('inexistant') #-> []

Interfecus · Nov 14, 2005

Hi,

This is my first submitted solution. I only started learning ruby in my
free time 4 days ago, so don't hold back the critique. I would really
appreciate comments and suggestions on how I can get more clued in to
the ruby style & conventions. I'm afraid it's a bit longer than the
others, but I'm just learning.

class Catalogue
  def initialize(start_docs=[])
    # Expects an array of [space-delimited keyword list, object to catalogue]
    # arrays for each initial object
    @keywords = Array.new    # Array of used keywords. Position is important.
    @cat_objects = Array.new # Array of [keyword bitfield, stored object] arrays
    start_docs.each do |st_doc|
      self.catalogue!(st_doc[0], st_doc[1])
    end
  end

  def each_under_kw(keyword)
    # Expects a string keyword. Yields objects using that keyword.
    if cindex = @keywords.index(keyword.upcase)
      @cat_objects.each do |cat_obj|
        yield(cat_obj[1]) unless ((cat_obj[0] & (2 ** cindex)) == 0)
      end
    end
  end

  def each
    @cat_objects.each {|obj| yield obj[1]}
  end

  def catalogue!(keyword_list, cat_object)
    # Adds a new object to the catalogue. Expects a space-delimited list of
    # keywords and an object to catalogue.
    key_bitfield = 0
    split_list = keyword_list.upcase.split
    unless split_list.empty?
      split_list.each do |test_keyword|
        cindex = @keywords.index(test_keyword)
        if cindex == nil
          cindex = @keywords.length
          @keywords << test_keyword
        end
        key_bitfield |= 2 ** cindex
      end
      @cat_objects << [key_bitfield , cat_object]
    end
  end

  attr_accessor :cat_objects, :keywords
end

# Begin Demonstration

# For this demonstration, the list of keywords itself is the object stored.
# This does not have to be the case, any object can be stored.

doc1 = "The quick brown fox"
doc2 = "Jumped over the brown dog"
doc3 = "Cut him to the quick"

# Create the catalogue with 2 objects
demo = Catalogue.new([[doc1, doc1], [doc2, doc2]])

demo.catalogue!(doc3, doc3) # Add an object to the catalogue

print "All phrases:\n"

demo.each do |obj|
  print obj + "\n"
end

print "\nList of objects with keyword 'the':\n"

demo.each_under_kw('the') do |obj|
  print obj + "\n"
end

print "\nList of objects with keyword 'brown':\n"

demo.each_under_kw('brown') do |obj|
  print obj + "\n"
end

print "\nList of objects with keyword 'dog':\n"

demo.each_under_kw('dog') do |obj|
  print obj + "\n"
end

print "\nList of objects with keyword 'quick':\n"

demo.each_under_kw('quick') do |obj|
  print obj + "\n"
end

# End Demonstration

Bob Showalter · Nov 14, 2005

Bob said:
OK, here's a real down and dirty implementation of the basic bitmap
index.

Here's an update of my solution that fixes my dump method and makes
the block optional on the find method:

#!/usr/local/bin/ruby

# document indexing/searching class
class Index

  # default index file name
  INDEX_FILE = 'index.dat'

  # loads existing index file, if any
  def initialize(index_file = INDEX_FILE)
    @terms = {}
    @index = {}
    @index_file = index_file
    if File.exist? @index_file
      @terms, @index = Marshal.load(
        File.open(@index_file, 'rb') {|f| f.read})
    end
  end

  # sets the current document being indexed
  def document=(name)
    @document = name
  end

  # adds given term to the index under the current document
  def <<(term)
    raise "No document defined" unless defined? @document
    unless @terms.include? term
      @terms[term] = @terms.length
    end
    i = @terms[term]
    @index[@document] ||= 0
    @index[@document] |= 1 << i
  end

  # finds documents containing all of the specified terms.
  # if a block is given, each document is supplied to the
  # block, and nil is returned. Otherwise, an array of
  # documents is returned.
  def find(*terms)
    results = []
    @index.each do |document, mask|
      if terms.all? { |term| @terms[term] && mask[@terms[term]] != 0 }
        block_given? ? yield(document) : results << document
      end
    end
    block_given? ? nil : results
  end

  # dumps the entire index, showing each term and the documents
  # containing that term
  def dump
    @terms.sort.each do |term, value|
      puts "#{term}:"
      @index.sort.each do |document, mask|
        puts " #{document}" if mask[@terms[term]] != 0
      end
    end
  end

  # saves the index data to disk
  def save
    File.open(@index_file, 'wb') do |f|
      Marshal.dump([@terms, @index], f)
    end
  end

end

if $0 == __FILE__
  idx = Index.new
  case ARGV.shift
  when 'add'
    ARGV.each do |fname|
      idx.document = fname
      IO.foreach(fname) do |line|
        line.downcase.scan(/\w+/) { |term| idx << term }
      end
    end
    idx.save
  when 'find'
    idx.find(*ARGV.collect { |s| s.downcase }) do |document|
      puts document
    end
  when 'dump'
    idx.dump
  else
    print <<-EOS
Usage: #$0 add file [file...]    Adds files to index
       #$0 find term [term...]   Lists files containing all term(s)
       #$0 dump                  Dumps raw index data
    EOS
  end
end

Lyndon Samson · Nov 15, 2005

Hi!
This is my first post to the ruby quiz.

I've read it and remembered a structure that I've studied in college,
the Trie. So I've implemented a very inefficient Trie and tried it.

In a trie, there is a tree of letters. Each word is saved in the tree
(so, words with the same root share a part of the trie, saving space).
I've added to this structure the references for each word.

This sounds similar to how LZW compression works. Cool!

James Edward Gray II · Nov 15, 2005

This is my first submitted solution. I only started learning ruby
in my
free time 4 days ago, so don't hold back the critique.

Here's a tip that leaped out at me just while glancing at your code.
We spell:

print obj + "\n"

as:

puts obj

Hope that helps.

James Edward Gray II

David Balmain · Nov 15, 2005

Here's my solution. It's an inverted index like Dale's solution,
but it uses a Bignum bitmap like Bob's solution. This made it
really easy to add a simple query language, so you can run queries
like:

index.search("+ruby rails -python") {|doc, score| puts "#{score}:#{doc}"}

The results are scored by the number of matching terms. My solution
also allows updates and deletes. The most complicated method in there
is optimize. Basically what this is doing is shortening all the
document bitmaps for each term to remove the deleted document.
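
To illustrate why the bitmaps make the query language cheap, here is a stripped-down sketch of the "+term -term" part only. It is not the code from this post: the data and the names DOC_NAMES, TERM_BITS and matching_docs are hypothetical, and scoring and optional terms are left out.

```ruby
# Hypothetical data: each term maps to an integer whose bit n is set
# when document n contains the term.
DOC_NAMES = ["a.txt", "b.txt", "c.txt"]
TERM_BITS = {
  "ruby"   => 0b111,  # all three documents
  "rails"  => 0b011,  # a.txt and b.txt
  "python" => 0b010   # b.txt only
}

# Documents that contain every term in `must` and none of the terms in `must_not`.
def matching_docs(must, must_not)
  # Start from "all bits set" and AND in each required term.
  bitmap = must.inject(-1) { |bits, term| bits & TERM_BITS.fetch(term, 0) }
  # Clear the bits of every document containing an excluded term.
  bitmap = must_not.inject(bitmap) { |bits, term| bits & ~TERM_BITS.fetch(term, 0) }
  bitmap &= (1 << DOC_NAMES.size) - 1   # keep only bits for real documents
  DOC_NAMES.each_index.select { |n| bitmap[n] == 1 }.map { |n| DOC_NAMES[n] }
end

p matching_docs(["ruby"], ["python"])   # ["a.txt", "c.txt"]
p matching_docs(["ruby", "rails"], [])  # ["a.txt", "b.txt"]
```

Each query term costs one AND (or AND-NOT) on a single integer, regardless of how many documents are indexed.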

Cheers,
Dave

PS: I'm currently benchmarking the solutions with some surprising results.

require 'strscan'

module SimpleFerret

  class Analyzer
    ENGLISH_STOP_WORDS = [
      "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
      "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such",
      "t", "that", "the", "their", "then", "there", "these", "they", "this",
      "to", "was", "will", "with"
    ]

    def initialize(regexp = /[[:alpha:]]+/, stop_words = ENGLISH_STOP_WORDS)
      @regexp = regexp
      @stop_words = stop_words.inject({}) {|h, word| h[word] = true; h}
    end

    def each_token(string)
      ss = StringScanner.new(string)
      while ss.scan_until(@regexp)
        token = ss.matched.downcase
        yield token unless @stop_words[token]
      end
    end
  end

  class Index
    def initialize(analyzer = Analyzer.new())
      @analyzer = analyzer
      @index = Hash.new(0)
      @docs = []
      @doc_map = {}
      @deleted = 0
    end

    def add(id, string)
      delete(id) if @doc_map[id] # clear existing entry using that id
      doc_num = @docs.size
      @docs << id
      @doc_map[id] = doc_num
      doc_mask = 1 << doc_num
      @analyzer.each_token(string) do |token|
        @index[token] |= doc_mask
      end
    end
    alias :[]= :add

    def delete(id)
      @deleted |= 1 << @doc_map[id]
    end

    def search(search_string)
      must = []
      should = []
      must_not = []

      search_string.split.each do |st|
        case st[0]
        when ?+ then @analyzer.each_token(st) {|t| must << t}
        when ?- then @analyzer.each_token(st) {|t| must_not << t}
        else @analyzer.each_token(st) {|t| should << t}
        end
      end
      if not must.empty?
        bitmap = -1 # 0b111111111111....
        must.each {|token| bitmap &= @index[token]}
      else # no point in using should if we have must
        bitmap = 0
        should.each {|token| bitmap |= @index[token]}
      end
      if bitmap > 0
        must_not.each {|token| bitmap &= ~ @index[token]}
      end
      bitmap &= ~ @deleted
      doc_num = 0
      results = []
      while (bitmap > 0)
        if (bitmap & 1) == 1
          results << score_result(doc_num, should, must.size)
        end
        bitmap >>= 1
        doc_num += 1
      end
      results.sort! do |(adoc, ascore), (bdoc, bscore)|
        bscore <=> ascore
      end.each do |(doc, score)|
        yield(doc, score)
      end
    end

    def size
      delete_count = 0
      bitmask = 1
      # <= so that the highest deleted bit is counted too
      while bitmask <= @deleted
        delete_count += 1 if (bitmask & @deleted) > 0
        bitmask <<= 1
      end
      @docs.size - delete_count
    end
    alias :num_docs :size

    def unique_terms
      @index.size
    end

    # will need to give it a name the first time
    def write(fname = @fname)
      @fname = fname
      File.open(fname, "wb") {|f| Marshal.dump(self, f)}
    end

    def Index.read(fname)
      Marshal.load(File.read(fname))
    end

    # removes deleted documents from the index
    def optimize
      masks = []; bitmask = 1;
      mask = 0; bm = 1; last_mask = -1;
      doc_num = 0
      # <= so that the highest deleted bit is processed too
      while (bitmask <= @deleted)
        if (@deleted & bitmask) == 0
          mask |= bm
          bm <<= 1
          last_mask <<= 1
          doc_num += 1
        else
          @docs.delete_at(doc_num)
          masks << mask
          mask = 0
        end
        bitmask <<= 1
      end
      @doc_map = {}
      @docs.each_index {|i| @doc_map[@docs[i]] = i}

      masks << last_mask
      @index.each_pair do |id, bitmap|
        new_bitmap = 0
        masks.each do |mask|
          new_bitmap |= (bitmap & mask)
          bitmap >>= 1
        end
        if new_bitmap > 0
          @index[id] = new_bitmap
        else
          @index.delete(id)
        end
      end
      @deleted = 0
    end

    private

    def score_result(doc_num, should, must_count)
      score = must_count
      should.each do |term|
        score += 1 if (@index[term] & 1 << doc_num) > 0
      end
      return [@docs[doc_num], score]
    end
  end
end

if $0 == __FILE__
  include SimpleFerret
  INDEX_FILE = "simple.idx"
  if File.exist?(INDEX_FILE)
    idx = Index.read(INDEX_FILE)
  else
    idx = Index.new
  end
  case ARGV.shift
  when 'add'
    ARGV.each {|fname| idx.add(fname, File.read(fname))}
    idx.write(INDEX_FILE)
  when 'find'
    idx.search(ARGV.join(" ")) { |doc, score| puts "#{score}:#{doc}" }
  else
    print <<-EOS
Usage: #$0 add file [file...]    Adds files to index
       #$0 find term [term...]   Runs the query on the index
    EOS
  end
end

James Edward Gray II · Nov 15, 2005

PS: I'm currently benchmarking the solutions with some surprising
results.

Please do share...

James Edward Gray II

David Balmain · Nov 15, 2005

Please do share...

Will do. Still got a bit of work to do.

Interfecus · Nov 15, 2005

Thanks, I'll remember that. I was getting rather confused about which
of the print, puts, or p commands to use for any given task.

horndude77 · Nov 15, 2005

Here's my solution. Nothing fancy except for allowing searches for
documents with multiple words. I haven't done much extensive testing
yet. I'd expect it to be slow, but this is a lot better ruby code than
when I started so I'm happy.

class Indexer
  attr_reader :words, :index
  def initialize(docs)
    @words = []
    @index = {}
    docs.each do |key,doc|
      docwords = divide_words(doc)
      @words |= docwords
      @index[key] = 0
      docwords.each do |w|
        n = @words.index(w)
        @index[key] |= 1 << n if n
      end
    end
  end

  def divide_words(words)
    words_list = words.downcase.split(/[^\w']/).uniq - [""]
    words_list.each { |w| w.gsub!(/^\W*|\W*$/, '') }
    words_list.uniq!
    words_list
  end

  def [](word)
    query(word)
  end

  def query(query)
    search_words = divide_words(query)

    bit_mask = 0
    search_words.each do |w|
      word_index = @words.index(w)
      (bit_mask = 0; break) if(!word_index)
      bit_mask |= 1 << word_index
    end
    result = []
    if(bit_mask>0) then
      @index.each do |name,bits|
        (result << name) if(bits & bit_mask == bit_mask)
      end
    end
    result
  end

  def display
    puts "Index #{@words.length} word#{'s' if @words.length > 1}"
    puts "[#{@words.join(', ')}]"
    @index.each do |k,v|
      printf("%s: %b\n", k, v)
    end
  end
end

docs = {
  :doc1 => "The quick brown fox",
  :doc2 => "Jumped over the brown dog",
  :doc3 => "Cut him to the quick",
  :doc4 => "He's got some punctuation.",
  :doc5 => "I just need a lot more different words to put in here",
  :doc6 => "1 2 3 4 5 6 7 8 9 0 a b c d e f g h i j k l m n o p q r",
  :doc7 => "She's going to the 'store' or \"store\""
}

index = Indexer.new(docs)
index.display
puts "[#{index["the"].join(",")}]"
puts "[#{index["quick"].join(",")}]"
puts "[#{index["fox"].join(",")}]"
puts "[#{index["blah"].join(",")}]"
puts "[#{index["fox quick"].join(",")}]"

James Edward Gray II · Nov 15, 2005

Thanks, I'll remember that. I was getting rather confused about which
of the print, puts, or p commands to use for any given task.

print() is for when you want to do all the work yourself:

>> print [1, 2, 3]
123=> nil
>> print "a line"
a line=> nil

(Note the lack of newlines above.)

puts() is when you want Ruby to make pretty human readable output for
you:

>> puts [1, 2, 3]
1
2
3
=> nil
>> puts "a line"
a line
=> nil

p() is for "inspect()ing" objects to see what they look like under
the hood (great for debugging):

>> p [1, 2, 3]
[1, 2, 3]
=> nil
>> p "a line"
"a line"
=> nil

Hope that helps.

James Edward Gray II
