Ruby utilities for part of speech tagging and text categorization

Mark Watson

Released as Free Software on my open source page at www.markwatson.com.

I have started using Ruby for some of my research. I wrote these
utilities for my own use, but I hope that other people find them both
useful and fun.

Enjoy!
 
pat eyler

Mark,


Thanks for writing these, I've been interested in using Ruby for some
(armchair) text analysis myself.

As I read through the tagger.rb script, I saw a lot of non-Ruby idiom (not
a complaint, just an observation). Before doing anything else with it, I
ran ZenTest (http://rubyforge.org/projects/zentest) against your script and
started moving your tests over to Test::Unit. Here's a first shot for you:

require 'test/unit' unless defined? $ZENTEST and $ZENTEST
require 'tagger'

class TestTagger < Test::Unit::TestCase
  def test_getTags
    tt = Tagger.new
    assert_equal(["NN"], tt.getTags("bank"))
    assert_equal(["DT", "NN", "VBZ", "DT", "JJ", "NN", "JJ", "NN"],
                 tt.getTags("The dog bites the black cat last week."))
    assert_equal(["DT", "NN", "VBD", "NNP", "DT", "NN", "JJ", "NN",
                  nil, "PRP", "MD", "NN", "DT", "NN", "RB", "RB"],
                 tt.getTags("The bank gave Sam a loan last week. He can bank an airplane really well."))
    assert_equal([], tt.getTags(""))
  end
end
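
For anyone reading along without tagger.rb at hand, here is a rough sketch of the shape of result the test above checks for: a flat array of tags, an empty array for empty input, and (my reading of the nil in the expected output, which is an assumption) nil where the tagger has no tag for a token. SketchTagger, its get_tags method, and the four-word lexicon are all made up for illustration; this is not how Mark's Tagger is implemented, and it will not reproduce the expected arrays above. The snake_case method name is only there to show the more usual Ruby naming.

```ruby
# Hypothetical stand-in, NOT the real tagger.rb: a tiny lexicon lookup that
# returns one entry per token and nil for tokens it does not know.
class SketchTagger
  # Illustrative lexicon only; the real tagger ships its own data.
  LEXICON = {
    "the"  => "DT",
    "dog"  => "NN",
    "bank" => "NN",
    "he"   => "PRP"
  }.freeze

  # Split the text into words and punctuation marks, then look each one up
  # case-insensitively. Hash#[] returns nil for a missing key.
  def get_tags(text)
    text.scan(/\w+|[^\w\s]/).map { |token| LEXICON[token.downcase] }
  end
end

tagger = SketchTagger.new
p tagger.get_tags("The dog.")  # "." is not in the lexicon, so its slot is nil
p tagger.get_tags("")          # no tokens, so an empty array
```

Keeping the lookup in a single map call means the empty-string case needs no special handling, since scan simply finds no tokens.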


I'll be interested in seeing how these projects grow.
 
