How to make stopwords case insensitive

John Dale

New to Ruby.

I am trying to create a stoplist that is case insensitive. When I run
the code below, the output includes "In", which I do not want. I was
thinking I could use .match(/[A-Z,a-z]/). I did try downcase on the
string text, which worked, but I want to leave the string "text" with
its original content.

Thanks,

John



text = %q{Los Angeles has some of the nicest weather In the country.}
stopwords = %w{the a by on for of are with just but and to the my in I
has some}

#stopwords = stopwords.match(/[A-Z,a-z]/)

words = text.scan(/\w+/)
keywords = words.select { |word| !stopwords.include?(word) }

puts keywords.join(' ')
 
Siep Korteling

You have probably figured this out yourself by now. Anyway, convert the
stopwords array to lowercase, like this:

stopwords.map!{|el| el.downcase}

This turns the troublesome capital "I" in your stopwords into "i", so every entry in the list is lowercase.


"keywords = words.select { |word| !stopwords.include?(word) }"

Almost works. Compare the lowercased word against the list, so that "text" itself is never modified:

keywords = words.select { |word| !stopwords.include?(word.downcase)}

hth,

Siep
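
Putting the two changes from the reply together, a complete version might look like the sketch below. The use of reject instead of select with a negation, and the Set (with the made-up name stopword_set), are optional choices that are not part of the original posts; a plain array with include? works the same way for a list this small.

```ruby
require 'set'

text = %q{Los Angeles has some of the nicest weather In the country.}
stopwords = %w{the a by on for of are with just but and to the my in I has some}

# Normalize the stoplist once. Converting to a Set is an assumption made
# here for faster lookups; it also drops the duplicate "the".
stopword_set = stopwords.map(&:downcase).to_set

words = text.scan(/\w+/)

# Lowercase each word only for the comparison, so neither the words we keep
# nor the original string "text" are changed.
keywords = words.reject { |word| stopword_set.include?(word.downcase) }

puts keywords.join(' ')
```

With the sample sentence this should print "Los Angeles nicest weather country", and text still contains the capitalized "In".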
 