upping numerical precision

Discussion in 'Ruby' started by rpardee@gmail.com, Mar 16, 2009.

  1. Guest

    Hey All,

    I got me this fancy method for classifying documents that basically
    does this at one point:

    p = 1
    words.each do |w|
      p *= calc_prob(w)
    end
    chi = -2.0 * Math.log(p)

    I'm finding that p is often going to 0.0 b/c the numbers returned by
    calc_prob are sometimes outlandishly small (or there are just so many
    words in the doc that the loop runs long enough to zero out the
    variable p). This causes problems for the call to Math.log of course
    (e.g., Errno::EDOM).
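
    To see the underflow in isolation, here is a minimal sketch. The helper name naive_product, the probability 1.0e-5 and the count of 100 words are made-up values for illustration, not taken from the classifier:

```ruby
# Multiply probabilities the naive way, as in the loop above.
# (naive_product is a hypothetical helper, not part of the original script.)
def naive_product(probs)
  probs.inject(1.0) { |acc, pr| acc * pr }
end

# Made-up input: 100 words, each with probability 1.0e-5.
# The true product is 1e-500, far below what a Float can represent.
probs = Array.new(100, 1.0e-5)

puts naive_product(probs).zero?   # the running product has underflowed to 0.0
puts Float::MIN                   # smallest positive normalized Float, for scale
```

    Once the product is 0.0, no later multiplication can bring it back, so the logarithm is taken of zero.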

    I have tried two things. First, after some desperate flailing on
    google I added:

    require 'rational'
    require 'mathn'

    to my script and hoped that ruby would read my mind WRT using
    rationals where possible & that rationals would extend the reach of
    ruby's arithmetic into the too-outlandishly-small-for-floats range.

    When that did not seem to avail me, I put this in my words.each loop:

    if p == 0.0 then
      p = Float::MIN
    end

    That works, but makes me wonder if there's a smarter thing to do w/
    those rational and mathn libs to really get the effect I hoped for
    just from including them in my script.

    Is there?

    Many thanks!

    -Roy
     
    , Mar 16, 2009
    #1

  2. wrote:
    > Hey All,

    (...)
    > chi = -2.0 * Math.log(p)
    >
    > I'm finding that p is often going to 0.0 b/c the numbers returned by
    > calc_prob are sometimes outlandishly small

    (...)
    > That works, but makes me wonder if there's a smarter thing to do w/
    > those rational and mathn libs to really get the effect I hoped for
    > just from including them in my script.
    >
    > Is there?
    >
    > Many thanks!
    >
    > -Roy


    Are you sure you required 'mathn' before defining your calc_prob method?

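    big = 10**100

    # Without mathn, Integer / Integer truncates, so this is 0.
    small = 1/big
    p small.zero? # true

    # With mathn loaded, Integer / Integer yields an exact Rational.
    require 'mathn'
    small = 1/big
    p small.zero? # false
    p small.class # Rational

    p(-2.0 * Math.log(small))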
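
    The same effect can be had without mathn by building the Rational explicitly. This is only a sketch: as far as I know mathn was later dropped from the standard library, and the variable names simply follow the snippet above.

```ruby
big = 10**100

# Build the exact fraction directly instead of relying on mathn
# to change what Integer / Integer returns.
small = Rational(1, big)

puts small.zero?   # false: the Rational keeps the exact value
puts small.class   # Rational

# Math.log converts its argument to a Float first, so this only works
# while the value is still inside the Float range.
puts(-2.0 * Math.log(small))
```

    Note that the conversion to Float inside Math.log brings the range limit back: a Rational smaller than the tiniest Float would still turn into 0.0 at that point.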

    hth,

    Siep

    --
    Posted via http://www.ruby-forum.com/.
     
    Siep Korteling, Mar 16, 2009
    #2

  3. Roy Pardee Guest


    Thanks for the response! I think the issue may be that I'm not doing
    any division--just multiplication. Check it out:

    irb(main):001:0> require 'mathn'
    => true
    irb(main):002:0> x = 0.5
    => 0.5
    irb(main):003:0> 1000.times do
    irb(main):004:1* x *= x
    irb(main):005:1> end
    => 1000
    irb(main):006:0> x
    => 0.0
    irb(main):007:0> x.class
    => Float
    irb(main):008:0>

    But the more I think about it, the more I think I'm fussing over
    nothing (ha ha!). I think if my p var goes to zero, I should just set
    it = Float::MIN & break out of that loop. My calc_prob method will
    only ever return values <= 1, so there's no sense in letting it
    continue to spin down the value of p (if you can tell what I'm trying
    to say).
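
    A minimal sketch of that clamp-and-break idea, with a hypothetical helper name (clamped_product) and made-up probabilities standing in for calc_prob:

```ruby
# Multiply probabilities, but stop as soon as the product underflows
# and substitute Float::MIN so that Math.log still gets a positive value.
# (clamped_product is a hypothetical name, not from the original script.)
def clamped_product(probs)
  product = 1.0
  probs.each do |pr|
    product *= pr
    if product.zero?
      product = Float::MIN
      break   # every further factor is <= 1, so nothing can raise the product
    end
  end
  product
end

probs = Array.new(100, 1.0e-5)        # made-up values; the true product is 1e-500
chi = -2.0 * Math.log(clamped_product(probs))
puts chi                              # finite, but capped at -2 * log(Float::MIN)
```

    The price is that every document whose product underflows gets the same chi (roughly 1417), so they can no longer be told apart.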

    Thanks!

    -Roy
     
    Roy Pardee, Mar 17, 2009
    #3
  4. t3ch.dude Guest



    Roy,

    It all depends on how much range of data you want. If you need more
    granularity at the tiny end, you can always re-normalize... just
    initialize p to be 1e6 or something, rather than 1. Then after the log
    you can just subtract the constant exponent to get back to your
    original range.
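
    As a rough sketch of that re-normalization (SCALE and chi_with_offset are names made up here, and the probabilities are arbitrary):

```ruby
# Assumed scale factor; "1e6 or something" as suggested above.
SCALE = 1.0e6

# Start the running product at SCALE instead of 1, then subtract
# log(SCALE) afterwards so the result is on the original scale again.
def chi_with_offset(probs)
  scaled = SCALE
  probs.each { |pr| scaled *= pr }
  logp = Math.log(scaled) - Math.log(SCALE)
  -2.0 * logp
end

probs = [0.5, 0.25, 0.125]            # arbitrary example values, product = 1/64
puts chi_with_offset(probs)           # close to -2 * log(1/64)
```

    A fixed constant only buys as many orders of magnitude as the constant itself (six here), so a very long document can still underflow.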

    -t3ch.dude
     
    t3ch.dude, Mar 17, 2009
    #4
  5. Sander Land Guest

    I don't think you need more precision. Basic math can help you here:
    log(a*b) = log(a) + log(b)

    so

    logp = 0
    words.each do |w|
      logp += Math.log( calc_prob(w) )
    end
    chi = -2.0 * logp
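
    Here is a self-contained version of the same idea that can be run as-is. The calc_prob below is only a hypothetical stand-in that returns a fixed tiny probability, and chi_from_logs and the word list are made up for the example:

```ruby
# Hypothetical stand-in for the poster's calc_prob: always returns 1.0e-5.
def calc_prob(_word)
  1.0e-5
end

# Sum the logs instead of multiplying the probabilities,
# using log(a*b) = log(a) + log(b).
def chi_from_logs(words)
  logp = words.inject(0.0) { |sum, w| sum + Math.log(calc_prob(w)) }
  -2.0 * logp
end

words = Array.new(100) { |i| "word#{i}" }   # made-up document of 100 words
puts chi_from_logs(words)   # finite, although the plain product would underflow
```

    If calc_prob can ever return exactly 0, the log of that single value is still a problem, so that case needs its own handling.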

     
    Sander Land, Mar 17, 2009
    #5