python blogging software

Carl Banks

Paul said:
I needed this for something I was doing just now. It came out pretty
enough that I thought I'd post it. It's like xrange, except for floating
point values.

def frange(start, stop, step=1.0):
    sign = cmp(0, step)
    while cmp(start, stop) == sign:
        yield start
        start += step

It's a nice little iterator. Three problems, though:

1. It can accumulate rounding errors
2. It can return an integer on the first iteration (minor)
3. The last iteration might not happen because of floating point
uncertainties, even if rounding errors are minimized

A more robust version would look something like this (but it still has
problem #3):

import sys

def frange(start, stop, step=1.0):
    sign = cmp(0, step)
    for i in xrange(sys.maxint):
        # Recompute from start each time instead of accumulating.
        v = start + i*step
        if cmp(v, stop) != sign:
            break
        yield v
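
To make problem #1 concrete, here is a small sketch in Python 3 (where cmp, xrange and sys.maxint no longer exist, so the comparisons are spelled out and only positive steps are handled). The names frange_accum and frange_mult are made up for this illustration and do not come from the posts.

```python
def frange_accum(start, stop, step):
    """Accumulating approach: keep adding step to a running value (positive step only)."""
    while start < stop:
        yield start
        start += step


def frange_mult(start, stop, step):
    """Multiplying approach: recompute each value as start + i*step (positive step only)."""
    i = 0
    while True:
        v = start + i * step
        if v >= stop:
            break
        yield v
        i += 1


accum = list(frange_accum(0.0, 1.0, 0.1))
mult = list(frange_mult(0.0, 1.0, 0.1))
print(len(accum), accum[-1])  # 11 0.9999999999999999
print(len(mult), mult[-1])    # 10 0.9
```

Ten additions of 0.1 land just below 1.0, so the accumulating version yields an eleventh value that looks like the stop value when printed with few digits. For this particular input the multiplying version stops after ten values, but as noted it is not immune to the same effect for other inputs.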

The most robust way to handle this is to interpolate, i.e., instead of
passing start, stop, and step, pass start, stop, and n_intervals:

def interiter(start, stop, n_intervals):
    diff = stop - start
    for i in xrange(n_intervals+1):
        yield start + (i*diff)/n_intervals

An IEEE 754 (whatever) geek can probably point out even more accurate
ways to do these.
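
A Python 3 rendering of interiter, as a sketch only. One caveat worth flagging about the Python 2 original: with all-integer arguments, (i*diff)/n_intervals is integer division there, so this version assumes the difference should be converted to float first.

```python
def interiter(start, stop, n_intervals):
    """Yield n_intervals + 1 evenly spaced values from start to stop inclusive."""
    diff = float(stop) - float(start)
    for i in range(n_intervals + 1):
        # Each value is computed from i directly, so errors do not accumulate.
        yield start + (i * diff) / n_intervals


print(list(interiter(0, 1, 4)))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Because the loop runs over an integer count, the number of values is always n_intervals + 1, whatever the floating point values do.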
 
Carl Banks

Carl said:
3. The last iteration might not happen because of floating point
uncertainties, even if rounding errors are minimized

Just to be clear: what I should have said was that there might be one
extra iteration. The Python convention, of course, is not to include
the final point. However, with floating point, the last value can
slightly undershoot the stop value, which causes an apparent extra
iteration.
 
Neal Holtz

Carl Banks said:
...
A more robust version would look something like this (but it still has
problem #3):

def frange(start, stop, step=1.0):
    sign = cmp(0,step)
    for i in xrange(sys.maxint):
        v = start+i*step
        if cmp(v,stop) != sign:
            break
        yield v

The most robust way to handle this is to interpolate, i.e., instead of
passing start, stop, and step, pass start, stop, and n_intervals:

def interiter(start, stop, n_intervals):
    diff = stop - start
    for i in xrange(n_intervals+1):
        yield start + (i*diff)/n_intervals

An IEEE 754 (whatever) geek can probably point out even more accurate
ways to do these.


You can easily invent examples where the number of elements generated
may (or may not) surprise you. Here is an implementation that attempts
to account for some of the problems of floating point arithmetic:

# (See http://www.python.org/doc/current/tut/node14.html)

import sys

def epsilon():
    """Compute machine epsilon: smallest number, e, s.t. 1 + e > 1"""
    one = 1.0
    eps = 1.0
    while (one + eps) > one:
        eps = eps / 2.0
    return eps*2.0

_Epsilon = epsilon()

def frange2( start, stop, step=1.0, fuzz=_Epsilon ):
    sign = cmp( step, 0.0 )
    # Relative tolerance, scaled by the magnitude of stop.
    _fuzz = sign*fuzz*abs(stop)
    for i in xrange(sys.maxint):
        value = start + i*step
        if cmp( stop, value + _fuzz ) != sign:
            break
        yield value

And we can generate a bunch of test cases to compare this
implementation with the first one above:

for step in frange( 0.0, 1.0, 0.01 ):
    for stop in frange( 0.01, 10.0, 0.01 ):
        l1 = [x for x in frange( 0.0, stop, step )]
        l2 = [x for x in frange2( 0.0, stop, step )]
        if l1 != l2:
            print stop, step, len(l1), len(l2)
            print repr(stop), repr(step)
            print l1
            print l2
            print

The first of the many differences noted is this:

0.06 0.01 7 6
0.060000000000000005 0.01
[0.0, 0.01, 0.02, 0.029999999999999999, 0.040000000000000001,
0.050000000000000003, 0.059999999999999998]
[0.0, 0.01, 0.02, 0.029999999999999999, 0.040000000000000001,
0.050000000000000003]

Under most normal output conditions, you would think that the stop
value was 0.06 and the step was 0.01, so you would expect to get 6
elements in the sequence. Yet the first implementation gives you 7.
This might be surprising, as under most output conditions the last
value would appear to be equal to the stop value. Of course, the
actual stop value is just slightly greater than 0.06. The second
implementation gives you 6 elements in the sequence; though it can be
argued which is "correct", the second implementation is probably less
surprising in most cases.
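
The 0.06 case can be reproduced in Python 3 with the sketch below. It is an approximation of the two implementations, not a port: cmp is re-created as a small helper, sys.float_info.epsilon is assumed to match the value the epsilon() function computes, and the names frange_plain and frange_fuzzy are invented for this example.

```python
import sys


def cmp(a, b):
    """Stand-in for Python 2's built-in cmp: returns -1, 0 or 1."""
    return (a > b) - (a < b)


def frange_plain(start, stop, step=1.0):
    """Multiplication-based frange with no tolerance."""
    if step == 0:
        raise ValueError("step must be non-zero")
    sign = cmp(step, 0.0)
    i = 0
    while True:
        value = start + i * step
        if cmp(stop, value) != sign:
            break
        yield value
        i += 1


def frange_fuzzy(start, stop, step=1.0, fuzz=sys.float_info.epsilon):
    """Same loop, but stop once a value is within a relative fuzz of stop."""
    if step == 0:
        raise ValueError("step must be non-zero")
    sign = cmp(step, 0.0)
    _fuzz = sign * fuzz * abs(stop)
    i = 0
    while True:
        value = start + i * step
        if cmp(stop, value + _fuzz) != sign:
            break
        yield value
        i += 1


stop = 0.060000000000000005  # the stop value from the output above
print(len(list(frange_plain(0.0, stop, 0.01))))  # 7
print(len(list(frange_fuzzy(0.0, stop, 0.01))))  # 6
```

The fuzz here is about two units in the last place of the stop value, which is enough to absorb a stop that sits one unit above 0.06 but would not hide larger accumulated errors.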

This is just another example, of course, of the idea that if it
really, really matters how many elements are in the sequence, don't
use floating point arithmetic to control the number of iterations.
 
Bengt Richter

The most robust way to handle this is to interpolate, i.e., instead of
passing start, stop, and step, pass start, stop, and n_intervals:

def interiter(start, stop, n_intervals):
    diff = stop - start
    for i in xrange(n_intervals+1):
        yield start + (i*diff)/n_intervals

To guarantee the exact end points, maybe:

def interiter(start, stop, n_intervals):
    fn=float(n_intervals)
    for i in xrange(n_intervals+1):
        yield ((n_intervals-i)/fn)*start + (i/fn)*stop

but shouldn't that be xrange(n_intervals) and leave out
the final point (thus exact but invisible ;-).
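
Why the weighted form hits the endpoints exactly can be checked with a short Python 3 sketch. At i = 0 the weights are exactly 1.0 and 0.0, and at i = n_intervals they are exactly 0.0 and 1.0, so no rounding is involved at either end. The sample values 0.1 and 0.7 are arbitrary choices for the demonstration.

```python
def interiter(start, stop, n_intervals):
    """Weighted-average interpolation that includes both endpoints."""
    fn = float(n_intervals)
    for i in range(n_intervals + 1):
        # The two weights always come from the same integer i.
        yield ((n_intervals - i) / fn) * start + (i / fn) * stop


values = list(interiter(0.1, 0.7, 3))
print(values[0] == 0.1, values[-1] == 0.7)  # True True
```

The interior points can still carry ordinary rounding error; only the two ends are guaranteed.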

Regards,
Bengt Richter
 
Carl Banks

Bengt said:
To guarantee the exact end points, maybe:

def interiter(start, stop, n_intervals):
    fn=float(n_intervals)
    for i in xrange(n_intervals+1):
        yield ((n_intervals-i)/fn)*start + (i/fn)*stop

Good.


but shouldn't that be xrange(n_intervals) and leave out
the final point (thus exact but invisible ;-).

Well, I thought about that.

In the interests of practicality beats purity, when you're
interpolating (instead of stepping as in range) you should include the
final endpoint. In my experience, when you divide a segment into
intervals, you almost always want to include both endpoints.

If you use the Python convention of not including the final point in
the iteration, and someone decides they do want the final endpoint,
then it's kind of a pain to calculate what point you should use to get
the final "endpoint", and you lose accuracy on top of it.

So, my suggestion when iterating over an interval is this: If you pass
in a stepsize, then don't include the final point. If you pass in a
number of intervals, then do include the final point. I think the
practicality of it justifies the different conventions.
 
Paul Rubin

Carl Banks said:
In the interests of practicality beats purity, when you're
interpolating (instead of stepping as in range) you should include the
final endpoint. In my experience, when you divide a segment into
intervals, you almost always want to include both endpoints.

I think you're right about this, but it goes against the Python notion
of a "range" function, so an frange that includes both endpoints
should be called something different, maybe "fstep" or something like
that. For what I was doing, the frange that I posted did the job.

I see now that the problem is subtle enough that IMO, a fully
worked out solution should be written and included in the Python
library or at least the Python cookbook.
 
Bengt Richter

I think you're right about this, but it goes against the Python notion
of a "range" function, so an frange that includes both endpoints
should be called something different, maybe "fstep" or something like
that. For what I was doing, the frange that I posted did the job.

I see now that the problem is subtle enough that IMO, a fully
worked out solution should be written and included in the Python
library or at least the Python cookbook.

How about if we indicate the kind of interval, e.g.,
>>> def interiter(start, stop, n_intervals, ends='[]'):
...     fn=float(n_intervals)
...     for i in xrange(ends[0]=='(', n_intervals+(ends[1]==']')):
...         yield ((n_intervals-i)/fn)*start + (i/fn)*stop
...
>>> for f in interiter(0, 8, 8): print f,
...
0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
>>> for f in interiter(0, 8, 8, ends='[)'): print f,
...
0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0
>>> for f in interiter(0, 8, 8, ends='(]'): print f,
...
1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
>>> for f in interiter(0, 8, 8, ends='()'): print f,
...
1.0 2.0 3.0 4.0 5.0 6.0 7.0
>>> for f in interiter(0, 8, 8, ends='[]'): print f,
...
0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
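
For readers on Python 3, a sketch of the same interval-notation idea follows. It spells out the start and end indices rather than relying on booleans acting as integers, and it adds a check on the ends argument; that validation is an assumption added here, not part of the session above.

```python
def interiter(start, stop, n_intervals, ends="[]"):
    """Interpolate from start to stop; ends selects closed or open endpoints."""
    if len(ends) != 2 or ends[0] not in "[(" or ends[1] not in "])":
        raise ValueError("ends must be one of '[]', '[)', '(]', '()'")
    fn = float(n_intervals)
    first = 1 if ends[0] == "(" else 0                         # skip start if open
    last = n_intervals if ends[1] == "]" else n_intervals - 1  # skip stop if open
    for i in range(first, last + 1):
        yield ((n_intervals - i) / fn) * start + (i / fn) * stop


for ends in ("[]", "[)", "(]", "()"):
    print(ends, list(interiter(0, 8, 8, ends=ends)))
```

With ends='[)' this follows the usual Python range convention, and with the default '[]' it follows the include-both-endpoints convention argued for when interpolating.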


Regards,
Bengt Richter
 
