Perl numerics: EPSILON and equality between floats

Discussion in 'Perl Misc' started by bill, Aug 11, 2005.

  1. bill

    bill Guest

    I just recently came across this snippet:

    my $EPSILON = 1;
    $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;

    where $EPSILON was later used, e.g., to compare floats for equality:

    sub float_equal {
    my ($x, $y) = @_;
    return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
    }

    (The above is the gist of what I recall, not verbatim production
    code, but I think I got it right.)

    I'm not very knowledgeable about numerics, so I was wondering
    whether this was the best way to compare floats for equality.

    I thought that this would be a FAQ, but I couldn't find it.

    The first edition of The Perl Cookbook recommends converting both
    numbers to the same string format, and then doing a string comparison:

    sub equal {
    my ($A, $B, $dp) = @_;
    return sprintf("%.${dp}g", $A) eq sprintf("%.${dp}g", $B);
    }

    The main reason I don't like this approach is that, in my experience,
    some numerical objects, such as Math::BigFloat and Math::pari,
    don't work well with sprintf.

    Also, I recall that in C there are pre-defined constants (e.g.
    DBL_EPSILON in limits.h) that can be used for this kind of tests.
    Is there something similar in Perl?

    Many thanks in advance!

    bill

    P.S. If someone in Perl publishing reads this: a book on Perl
    numerics is sorely needed (fundamentals, algorithms, recipes, best
    practices) is sorely needed.
    bill, Aug 11, 2005
    #1
    1. Advertising

  2. bill

    Paul Lalli Guest

    bill wrote:
    > I'm not very knowledgeable about numerics, so I was wondering
    > whether this was the best way to compare floats for equality.


    Floats should not be compared for "equality". In any language.

    > I thought that this would be a FAQ, but I couldn't find it.


    perldoc -q 999

    Paul Lalli
    Paul Lalli, Aug 11, 2005
    #2
    1. Advertising

  3. bill

    Guest

    bill <> wrote:
    > I just recently came across this snippet:
    >
    > my $EPSILON = 1;
    > $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
    >
    > where $EPSILON was later used, e.g., to compare floats for equality:
    >
    > sub float_equal {
    > my ($x, $y) = @_;
    > return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
    > }
    >
    > (The above is the gist of what I recall, not verbatim production
    > code, but I think I got it right.)
    >
    > I'm not very knowledgeable about numerics, so I was wondering
    > whether this was the best way to compare floats for equality.


    The best way to compare floats for equality is highly dependent on
    why you want to compare floats for equality.

    There are no magic bullets.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Aug 11, 2005
    #3
  4. bill

    bill Guest

    In <20050811160034.410$> writes:

    >bill <> wrote:
    >> I just recently came across this snippet:
    >>
    >> my $EPSILON = 1;
    >> $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
    >>
    >> where $EPSILON was later used, e.g., to compare floats for equality:
    >>
    >> sub float_equal {
    >> my ($x, $y) = @_;
    >> return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
    >> }
    >>
    >> (The above is the gist of what I recall, not verbatim production
    >> code, but I think I got it right.)
    >>
    >> I'm not very knowledgeable about numerics, so I was wondering
    >> whether this was the best way to compare floats for equality.


    >The best way to compare floats for equality is highly dependent on
    >why you want to compare floats for equality.


    That's interesting. I can't think of any examples in which method
    X would be better than method Y for some purposes but worse for
    others. Can you give an example?

    In recent times, I've needed to compare floats in unit tests
    (expected vs. obtained). I typically use an approach similar to
    the one I described in my OP (i.e. declare two floats as "equal"
    if the difference between them is below some tolerance), except
    that I just pull the tolerance out of thin air (I usually go for
    2**-20, instead of having a halfway defensible procedure for
    determining it.

    bill
    bill, Aug 11, 2005
    #4
  5. bill

    Anno Siegel Guest

    bill <> wrote in comp.lang.perl.misc:
    >
    >
    >
    >
    > I just recently came across this snippet:
    >
    > my $EPSILON = 1;
    > $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
    >
    > where $EPSILON was later used, e.g., to compare floats for equality:
    >
    > sub float_equal {
    > my ($x, $y) = @_;
    > return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
    > }
    >
    > (The above is the gist of what I recall, not verbatim production
    > code, but I think I got it right.)
    >
    > I'm not very knowledgeable about numerics, so I was wondering
    > whether this was the best way to compare floats for equality.
    >
    > I thought that this would be a FAQ, but I couldn't find it.
    >
    > The first edition of The Perl Cookbook recommends converting both
    > numbers to the same string format, and then doing a string comparison:
    >
    > sub equal {
    > my ($A, $B, $dp) = @_;
    > return sprintf("%.${dp}g", $A) eq sprintf("%.${dp}g", $B);
    > }


    For an ad-hoc approach I simply use "eq" on the float-point arguments
    (without sprintf). "If it prints the same, it is the same."

    > The main reason I don't like this approach is that, in my experience,
    > some numerical objects, such as Math::BigFloat and Math::pari,
    > don't work well with sprintf.


    For comparison of non-native number formats you'll have to resort to
    special methods. Unless the package comes with its own sprintf(),
    sprintf-based methods won't work. The epsilon-finding procedure
    would probably loop forever.

    > Also, I recall that in C there are pre-defined constants (e.g.
    > DBL_EPSILON in limits.h) that can be used for this kind of tests.
    > Is there something similar in Perl?


    Those constants are in POSIX.

    > P.S. If someone in Perl publishing reads this: a book on Perl
    > numerics is sorely needed (fundamentals, algorithms, recipes, best
    > practices) is sorely needed.


    Perl doesn't have its own numerics, it uses the native operations
    of the C compiler. A book about Perl numerics would have to be one
    about computer numerics in general, which abound.

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Aug 12, 2005
    #5
  6. bill

    bill Guest

    In <ddhtkk$iuq$-Berlin.DE> -berlin.de (Anno Siegel) writes:

    >For an ad-hoc approach I simply use "eq" on the float-point arguments
    >(without sprintf). "If it prints the same, it is the same."


    Another objection I have against string-based methods (other than
    the fact that it doesn't work well with certain numeric classes)
    is the loss of precision incurred by translating to base 10. If
    there were a way to do a "binary eq", where the string representation
    was in base 2 instead of base 10, your approach would be fine with
    me.

    >> Also, I recall that in C there are pre-defined constants (e.g.
    >> DBL_EPSILON in limits.h) that can be used for this kind of tests.
    >> Is there something similar in Perl?


    >Those constants are in POSIX.


    Thanks!

    >If you want to post a followup via groups.google.com, don't use
    >the broken "Reply" link at the bottom of the article. Click on
    >"show options" at the top of the article, then click on the
    >"Reply" at the bottom of the article headers.


    Is it my imagination or did Google Groups go to the dogs after they
    "overhauled it" about a year ago? It's got to be the buggiest of
    all Google services.

    bill
    bill, Aug 12, 2005
    #6
  7. bill

    Anno Siegel Guest

    bill <> wrote in comp.lang.perl.misc:
    > In <ddhtkk$iuq$-Berlin.DE>
    > -berlin.de (Anno Siegel) writes:
    >
    > >For an ad-hoc approach I simply use "eq" on the float-point arguments
    > >(without sprintf). "If it prints the same, it is the same."

    >
    > Another objection I have against string-based methods (other than
    > the fact that it doesn't work well with certain numeric classes)
    > is the loss of precision incurred by translating to base 10. If
    > there were a way to do a "binary eq", where the string representation
    > was in base 2 instead of base 10, your approach would be fine with
    > me.


    But you use rounding, or epsilon fuzz, to get rid of spurious precision.
    If you round off only a single decimal digit, you're giving up more
    precision than the conversion to decimal costs in the first place.

    Something like that "binary eq" could probably be written (in Perl
    or otherwise), but what would it gain? In effect, it gives you a
    choice of epsilons that are powers of two, instead of powers of ten
    like rounding. The practical difference is small.

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Aug 12, 2005
    #7
  8. bill

    bill Guest

    In <ddi69e$o30$-Berlin.DE> -berlin.de (Anno Siegel) writes:

    >bill <> wrote in comp.lang.perl.misc:
    >> In <ddhtkk$iuq$-Berlin.DE>
    >> -berlin.de (Anno Siegel) writes:
    >>
    >> >For an ad-hoc approach I simply use "eq" on the float-point arguments
    >> >(without sprintf). "If it prints the same, it is the same."

    >>
    >> Another objection I have against string-based methods (other than
    >> the fact that it doesn't work well with certain numeric classes)
    >> is the loss of precision incurred by translating to base 10. If
    >> there were a way to do a "binary eq", where the string representation
    >> was in base 2 instead of base 10, your approach would be fine with
    >> me.


    >But you use rounding, or epsilon fuzz, to get rid of spurious precision.
    >If you round off only a single decimal digit, you're giving up more
    >precision than the conversion to decimal costs in the first place.


    >Something like that "binary eq" could probably be written (in Perl
    >or otherwise), but what would it gain? In effect, it gives you a
    >choice of epsilons that are powers of two, instead of powers of ten
    >like rounding. The practical difference is small.


    The effect is to leave it up to the user to decide, instead of
    forcing him/her to rationalize a suboptimal situation after the
    fact. The phrase "practical difference" makes a huge blanket
    assumption about the task at hand. What's "good enough"? 10%?
    1%? 0.0001%? The answer obviously depends on the application, and
    my philosophy is to give as much power to the user as possible.
    In this case, the best that can be done is to allow a binary
    specification of precision, and let the user avail him/herself of
    this as desired.

    I suppose "binary eq" could be written using pack/unpack, but even
    though I've been programming Perl for 8+ years and consider myself
    pretty proficient at it, I remain a pack/unpack illiterate. For
    some reason, even after I study pack/unpack intensively, its
    operations seem mysterious to me, and therefore my minds retains
    whatever I learn about it for roughly one nanosecond. Maybe I
    should just read the perl source to finally understand it.

    bill
    bill, Aug 12, 2005
    #8
  9. bill

    Anno Siegel Guest

    bill <> wrote in comp.lang.perl.misc:
    > In <ddi69e$o30$-Berlin.DE>
    > -berlin.de (Anno Siegel) writes:
    >
    > >bill <> wrote in comp.lang.perl.misc:
    > >> In <ddhtkk$iuq$-Berlin.DE>
    > >> -berlin.de (Anno Siegel) writes:


    [snip comparison of floats]

    > >Something like that "binary eq" could probably be written (in Perl
    > >or otherwise), but what would it gain? In effect, it gives you a
    > >choice of epsilons that are powers of two, instead of powers of ten
    > >like rounding. The practical difference is small.

    >
    > The effect is to leave it up to the user to decide, instead of
    > forcing him/her to rationalize a suboptimal situation after the
    > fact. The phrase "practical difference" makes a huge blanket
    > assumption about the task at hand. What's "good enough"? 10%?
    > 1%? 0.0001%?


    Hmmm? Even the rounding approach gives you the choice of exactly
    that? I don't get the big diff. The right approach to float
    comparison is an epsilon approach, where (in principle) epsilon
    must be specified for each comparison. The same goes for rounding,
    whether done in decimal or binary. In principle, the number of
    places to round must be specified on each comparison. There's your
    chance to decide what's good enough. Nothing is being forced on you.

    The rounding approach is basically equivalent to epsilon values
    that are powers of ten (or two), just more convenient than a
    specific comparison procedure. You get fewer choices of epsilon,
    but you don't need that many.

    > The answer obviously depends on the application, and
    > my philosophy is to give as much power to the user as possible.
    > In this case, the best that can be done is to allow a binary
    > specification of precision, and let the user avail him/herself of
    > this as desired.


    Nah... that would depend on the odds and ends of the local representation
    of floating point numbers. That's not a good thing for a program to
    depend on. Specify the "range of equality" numerically, explicitly
    or implicitly through rounding.

    > I suppose "binary eq" could be written using pack/unpack, but even
    > though I've been programming Perl for 8+ years and consider myself
    > pretty proficient at it, I remain a pack/unpack illiterate. For
    > some reason, even after I study pack/unpack intensively, its
    > operations seem mysterious to me, and therefore my minds retains
    > whatever I learn about it for roughly one nanosecond. Maybe I
    > should just read the perl source to finally understand it.


    That's not the way to understand pack/unpack. To learn about them,
    play with them. Few people learn all the pack/unpack specifications
    by heart, that's something to look up in each case. But it's useful
    to acquaint oneself with the basic possibilities, so write a few test
    cases and see what you get. It isn't used often, but for some things
    it is just the ticket. One of them is reading of fixed-column data,
    another (somewhat surprisingly) is counting the one-bits in a bit
    string. Pack/unpack might be used in a "binary compare", but not
    before we know what exactly it does.

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Aug 12, 2005
    #9
  10. bill

    Guest

    bill <> wrote:
    > In <20050811160034.410$> writes:
    >
    > >bill <> wrote:
    > >> I just recently came across this snippet:
    > >>
    > >> my $EPSILON = 1;
    > >> $EPSILON /= 2 while 0.5 + $EPSILON/2 > 0.5;
    > >>
    > >> where $EPSILON was later used, e.g., to compare floats for equality:
    > >>
    > >> sub float_equal {
    > >> my ($x, $y) = @_;
    > >> return !($x||$y) || ($x+$y) && abs(($x-$y)/($x+$y)) < $EPSILON;
    > >> }
    > >>
    > >> (The above is the gist of what I recall, not verbatim production
    > >> code, but I think I got it right.)
    > >>
    > >> I'm not very knowledgeable about numerics, so I was wondering
    > >> whether this was the best way to compare floats for equality.

    >
    > >The best way to compare floats for equality is highly dependent on
    > >why you want to compare floats for equality.

    >
    > That's interesting. I can't think of any examples in which method
    > X would be better than method Y for some purposes but worse for
    > others.


    You mean in general, or just in this context?

    > Can you give an example?


    If you are running an optimization, or an interation-until-convergence,
    where the score function loses all physical significance after 3 digits but
    you converge out to 28 digits, merely because your machine supports that
    precision, you are likely to waste hours or decades of computer time for no
    purpose.

    >
    > In recent times, I've needed to compare floats in unit tests
    > (expected vs. obtained). I typically use an approach similar to
    > the one I described in my OP (i.e. declare two floats as "equal"
    > if the difference between them is below some tolerance), except
    > that I just pull the tolerance out of thin air (I usually go for
    > 2**-20, instead of having a halfway defensible procedure for
    > determining it.


    The defensible tolerance level is the one that makes sense given the
    physical interpretation of the floats. Otherwise your algorithm
    which gives a perfectly good answer will be deemed wrong simply because
    your machine precision is high enough to detect a meaningless difference.
    Or an algorithm that gives a fatally wrong answer (on a particular machine)
    will be deemed right merely because your machine precision is too low to
    realize it is wrong.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Aug 13, 2005
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Micah
    Replies:
    2
    Views:
    322
    Jacek Dziedzic
    Apr 28, 2004
  2. Anton81
    Replies:
    7
    Views:
    8,102
    Dan Sommers
    Nov 30, 2005
  3. Kosio

    Floats to chars and chars to floats

    Kosio, Sep 16, 2005, in forum: C Programming
    Replies:
    44
    Views:
    1,285
    Tim Rentsch
    Sep 23, 2005
  4. tahir.rauf
    Replies:
    0
    Views:
    396
    tahir.rauf
    Apr 24, 2011
  5. Une Bévue
    Replies:
    3
    Views:
    131
    Une Bévue
    Nov 23, 2007
Loading...

Share This Page