[ANN] Bishop 0.3.0 - bayesian classifier for Ruby ported from Python

Discussion in 'Ruby' started by Matt Mower, Apr 15, 2005.

  1. Matt Mower

    Matt Mower Guest

    Hi folks,

    I've recently released "Bishop", a Ruby port of the "Reverend" Bayesian
    classifier written in Python. Bishop-0.3.0 is available as a Gem and
    from RubyForge:

    http://rubyforge.org/projects/bishop/

    Bishop is a reasonably direct port of the original Python code. Bug
    reports and suggestions for improving the structure of the code would
    be welcomed.

    Bishop includes both the Robinson and Robinson-Fisher algorithms for
    classification. I have assumed that they were correctly implemented
    in Reverend, and I aim to test this in my own use of the code.

    Support is included for saving/loading the trained classifier to/from YAML.

    An example of using Bishop:

    require 'bishop'

    b = Bishop::Bayes.new
    b.train( "ham", "a great message from a close friend" )
    b.train( "spam", "buy viagra here" )
    puts b.guess( "would a friend send you a viagra advert?" )

    => [ [ "ham", <prob> ], [ "spam", <prob> ] ]

    Bishop defaults to using the Robinson algorithm. To use a different
    algorithm, construct the classifier with a block that calls the
    chosen algorithm:

    Bishop::Bayes.new { |probs,ignore| Bishop::robinson_fisher( probs, ignore ) }

    To save to a YAML file:

    b.save "myclassifier.yaml"

    To load from a YAML file:

    b.load "myclassifier.yaml"
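
    For readers curious what a YAML round trip of trained state can look
    like, here is a small sketch using only Ruby's standard yaml library.
    The pools hash and the save_pools/load_pools helpers are hypothetical
    stand-ins, not Bishop's internal format.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical trained state: pool name => { word => count }.
pools = {
  "ham"  => { "friend" => 2, "message" => 1 },
  "spam" => { "viagra" => 1 }
}

# Write the state to disk as YAML.
def save_pools(pools, path)
  File.write(path, YAML.dump(pools))
end

# Read the state back; safe_load is enough for plain hashes,
# strings and integers.
def load_pools(path)
  YAML.safe_load(File.read(path))
end

# Round trip through a temporary directory.
restored = Dir.mktmpdir do |dir|
  path = File.join(dir, "myclassifier.yaml")
  save_pools(pools, path)
  load_pools(path)
end
```

    After the round trip, restored compares equal to pools.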

    You can uniquely identify training items:

    b.train( "ham", "friends don't let friends develop on shared hosting", "<>" )

    And you can untrain items:

    b.untrain( <pool>, <item>[, <uid> ] )

    I'm using this in a project of my own and would welcome any feedback
    or suggested improvements.

    Regards,

    Matt

    --
    Matt Mower :: http://matt.blogs.it/
    Matt Mower, Apr 15, 2005
    #1

  2. On 4/15/05, Matt Mower <> wrote:
    > Hi folks,
    >
    > I've recently released a Ruby port "Bishop" of the "Reverend" bayesian
    > classifier written in Python. Bishop-0.3.0 is available


    Could this be combined with http://rubyforge.org/projects/classifier/ ?

    It looks like they both have a similar syntax:

    classifier.train :symbol, "content"

    Would the method_missing syntax be easy to add to Bishop? Would
    untrain be easy to add to projects/classifier? From what I've seen of
    them so far, it sounds like the answer to both would be yes. If they
    had the same API, they could go in the same module so that swapping
    filter types would be as simple as changing the Classifier::XXX.new
    line.
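
    As a rough sketch of the method_missing idea, the following shows how
    calls such as train_ham "text" could be forwarded to train("ham", "text").
    ToyClassifier is a hypothetical stand-in that only records what it was
    given; it is neither Bishop's nor Classifier's implementation.

```ruby
# Hypothetical stand-in exposing train(pool, text) and untrain(pool, text).
class ToyClassifier
  attr_reader :trained

  def initialize
    # pool name => list of texts trained into that pool
    @trained = Hash.new { |hash, pool| hash[pool] = [] }
  end

  def train(pool, text)
    @trained[pool.to_s] << text
  end

  def untrain(pool, text)
    index = @trained[pool.to_s].index(text)
    @trained[pool.to_s].delete_at(index) if index
  end

  # Forward train_<pool> / untrain_<pool> to train / untrain.
  def method_missing(name, *args, &block)
    match = /\A(train|untrain)_(\w+)\z/.match(name.to_s)
    if match
      send(match[1], match[2], *args)
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    !!(name.to_s =~ /\A(train|untrain)_\w+\z/) || super
  end
end

classifier = ToyClassifier.new
classifier.train_ham "a great message from a close friend"
classifier.train :spam, "buy viagra here"
```

    Both call styles end up in the same train method, so a shared API would
    not have to choose between them.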

    Cheers,
    Douglas
    Douglas Livingstone, Apr 15, 2005
    #2

  3. Re: [ANN] Bishop 0.3.0 - bayesian classifier for Ruby ported from Python

    Douglas Livingstone wrote:
    > On 4/15/05, Matt Mower <> wrote:
    >
    >>Hi folks,
    >>
    >>I've recently released a Ruby port "Bishop" of the "Reverend" bayesian
    >>classifier written in Python. Bishop-0.3.0 is available

    >
    >
    > Could this be combined with http://rubyforge.org/projects/classifier/ ?


    +1 on this question/suggestion.
    There may be reasons to have two different libraries, but IMVHO it
    would be better to have one slightly bigger library sharing APIs,
    services and keeping the useful differences.
    gabriele renzi, Apr 16, 2005
    #3
  4. Jaypee

    Jaypee Guest

    Re: [ANN] Bishop 0.3.0 - bayesian classifier for Ruby ported from Python

    Matt Mower wrote:
    > Hi folks,
    >
    > I've recently released a Ruby port "Bishop" of the "Reverend" bayesian
    > classifier written in Python. Bishop-0.3.0 is available as a Gem and
    > from RubyForge

    ....
    >
    > Regards,
    >
    > Matt
    >

    Hello Matt,

    Thank you for this useful library.
    I am trying to use it to analyse the draft text of the European
    constitution (Is it social? liberal? respectful of human rights?). I am
    doing this for myself, just out of curiosity; no responsibility or
    liability is involved in the use of the classifier or in the result.
    I'd like to know how the training of a classifier behaves when two
    different sets of words are submitted in two successive "train"
    method invocations for a given category. Does the second invocation
    reset the training, or does it accumulate the "experience" progressively?

    Thanks again ...
    Jean-Pierre
    Jaypee, Apr 18, 2005
    #4
  5. Matt Mower

    Matt Mower Guest

    On 4/16/05, gabriele renzi <> wrote:
    > Douglas Livingstone ha scritto:
    > > On 4/15/05, Matt Mower <> wrote:
    > >
    > >>Hi folks,
    > >>
    > >>I've recently released a Ruby port "Bishop" of the "Reverend" bayesian
    > >>classifier written in Python. Bishop-0.3.0 is available

    > >
    > >
    > > Could this be combined with http://rubyforge.org/projects/classifier/ ?

    >
    > +1 on this question/suggestion.
    > There may be reasons to have two different libraries, but IMVHO it
    > would be better to have one slightly bigger library sharing APIs,
    > services and keeping the useful differences.
    >


    I thought it was about time I responded to this.

    If I had known Lucas was working on his classifier library before I
    did the port of Reverend I probably wouldn't have bothered. However I
    have done it and am using it in another project of my own and have had
    some ideas about possible future developments.

    One example is to build a version which runs directly from a SQL
    database (possibly using ActiveRecord). I'm also interested in new
    algorithms and in possible improvements to support classifying RSS
    items within a tag space.

    None of which precludes rolling Bishop and Classifier into one project.

    However right now I'd like to keep control of Bishop and not be
    constrained from making possibly incompatible changes to the API or
    implementation. Similarly Lucas may have his own plans for how he
    wants to see Classifier develop.

    I don't see the harm in having two projects and what I've suggested to
    Lucas is that we should compare notes periodically and see if it makes
    sense to merge the projects. I guess that if a lot of users of the
    libraries made a fuss, this would also affect my opinion.

    Regards,

    Matt

    ---
    Matt Mower :: http://matt.blogs.it/
    Matt Mower, Apr 19, 2005
    #5
  6. Matt Mower

    Matt Mower Guest

    Hi Jean-Pierre,

    On 4/18/05, Jaypee <> wrote:
    > Thank you for this useful library.


    You're welcome.

    > I am trying to use it to analyse the draft text of the European
    > constitution (Is it social? liberal? respectful of human rights?)
    > [..snip..]
    > I'd like to know how the training of a classifier behaves when two
    > different sets of words are submitted in two successive "train"
    > method invocations for a given category. Does the second invocation
    > reset the training, or does it accumulate the "experience" progressively?
    >


    You're right when you say it accumulates. Further training supplies
    more evidence to the classifier about which words are associated with
    which categories. It uses this evidence to work out conditional
    probabilities, which are then combined to make a guess about the
    appropriate category for an item.

    There is an #untrain method if you want to remove previously trained
    information.
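
    The accumulation can be pictured with a toy word counter. This is only
    a sketch under simplifying assumptions: CountingPools and its tokenizer
    are hypothetical, and it tracks raw counts rather than the probabilities
    Bishop derives from them.

```ruby
# Toy model of accumulating training (NOT Bishop's implementation).
class CountingPools
  def initialize
    # pool name => { word => number of times seen in that pool }
    @counts = Hash.new { |hash, pool| hash[pool] = Hash.new(0) }
  end

  # Each call adds evidence on top of what is already there.
  def train(pool, text)
    tokens(text).each { |word| @counts[pool][word] += 1 }
  end

  # Removes the evidence a matching train call would have added.
  def untrain(pool, text)
    tokens(text).each do |word|
      next unless @counts[pool][word] > 0
      @counts[pool][word] -= 1
      @counts[pool].delete(word) if @counts[pool][word].zero?
    end
  end

  def count(pool, word)
    @counts[pool][word]
  end

  private

  # Simplistic tokenizer: lowercase runs of letters and apostrophes.
  def tokens(text)
    text.downcase.scan(/[a-z']+/)
  end
end

pools = CountingPools.new
pools.train("ham", "a great message from a close friend")
pools.train("ham", "lunch with a friend")
pools.count("ham", "friend")   # => 2, the second call added to the first
pools.untrain("ham", "lunch with a friend")
pools.count("ham", "friend")   # => 1
```

    The second train call does not reset anything; only an explicit untrain
    takes evidence away again.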

    Regards,

    Matt

    --
    Matt Mower :: http://matt.blogs.it/
    Matt Mower, Apr 19, 2005
    #6
  7. Re: [ANN] Bishop 0.3.0 - bayesian classifier for Ruby ported from Python

    Matt Mower wrote:

    <snip all>
    Thanks for taking the time to answer. I can understand your reasons, and
    I'm glad to know there is at least some contact between different hackers
    on similar projects. Thanks to both of you :)
    gabriele renzi, Apr 19, 2005
    #7
  8. Jaypee

    Jaypee Guest

    Re: [ANN] Bishop 0.3.0 - bayesian classifier for Ruby ported from Python

    Matt Mower wrote:
    > Hi Jean-Pierre,
    >
    > On 4/18/05, Jaypee <> wrote:
    >
    >>Thank you for this useful library.

    >
    >
    > You're welcome.
    >
    >
    >>I am trying to use it to analyse the draft text of the European
    >>constitution (Is it social? liberal? respectful of human rights?)
    >>[..snip..]
    >>I'd like to know how the training of a classifier behaves when two
    >>different sets of words are submitted in two successive "train"
    >>method invocations for a given category. Does the second invocation
    >>reset the training, or does it accumulate the "experience" progressively?
    >>

    >
    >
    > You're right when you say it accumulates. Further training supplies
    > more evidence to the classifier about which words are associated with
    > which categories. It uses this evidence to work out conditional
    > probabilities, which are then combined to make a guess about the
    > appropriate category for an item.
    >
    > There is an #untrain method if you want to remove previously trained
    > information.
    >
    > Regards,
    >
    > Matt
    >

    Thank you,
    Jean-Pierre
    Jaypee, Apr 19, 2005
    #8
  9. Re: Bishop 0.3.0 - bayesian classifier for Ruby ported from Python

    The subversion trunk of projects/classifier (see
    http://rufy.com/svn/classifier/trunk) has the untrain method in it.
    This will be released soon under Classifier 1.2.
    Lucas Carlson, Apr 21, 2005
    #9
