is int(round(val)) safe?


Russell E. Owen

I realize this is probably a stupid question, but... is it safe to
round to the nearest integer by using int(round(val))?

I suspect it is fine, but I wanted to be sure that weird floating point
representation on some platform doesn't make it unsafe there (i.e. give the
wrong value due to the floating point value being an approximation), and
if so, whether there is a Better Way.

-- Russell
 

Peter Hansen

Russell said:
I realize this is probably a stupid question, but... is it safe to
round to the nearest integer by using int(round(val))?

Define "safe".
I suspect it is fine, but I wanted to be sure that weird floating point
representation on some platform doesn't make it unsafe there (i.e. give the
wrong value due to the floating point value being an approximation), and
if so, whether there is a Better Way.

Since round() returns an integer, and since (ignoring really
large integers, since I doubt you're concerned about them
here) floating point values can handle *integer* values
perfectly well, there's no reason it shouldn't do what you
want.

The problem* with floating point is inaccurate representation
of certain _fractional_ values, not integer values.
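
For instance (a quick Python 2.x sanity check; small integer values survive
the float round-trip exactly):

>>> round(2.6)
3.0
>>> int(round(2.6))
3
>>> 3.0 == 3
True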

-Peter

* Well, _one_ of the problems. ;-)
 

Steven Bethard

Peter said:
Since round() returns an integer

Just to clarify in case anyone else misreads this, I believe the intent
here was to say that round(f) -- that is, round with only a single
argument -- returns a floating point number with no fractional component
after the decimal point. The result is still a float, not an int, though:

>>> round(3.14)
3.0
>>> round(3.14, 2)
3.1400000000000001
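
The types confirm it (Python 2.x):

>>> type(round(3.14))
<type 'float'>
>>> type(int(round(3.14)))
<type 'int'>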

I think the intent was clear from the rest of the post, but I figured it
wouldn't hurt to clarify this for any newbies who misread it like I did.

Steve
 

Peter Hansen

Steven said:
Just to clarify in case anyone else misreads this, I believe the intent
here was to say that round(f) -- that is, round with only a single
argument -- returns a floating point number with no fractional component
after the decimal point. The result is still a float, not an int, though:

>>> round(3.14)
3.0
>>> round(3.14, 2)
3.1400000000000001

I think the intent was clear from the rest of the post, but I figured it
wouldn't hurt to clarify this for any newbies who misread it like I did.

All true.

I wonder if it would be appropriate to say something along
the lines of '''round() returns an integer, but not an "int".'''

Obviously the mathematicians will have something to say about
this. In computers, 1.0 may not be an integer data type, but
I think in math it's still considered an integer. I am most
definitely not going to claim authority in this area, however,
since as an engineer I consider 1.0 and 1 merely "equal to a
first approximation". <wink>

-Peter
 

Tim Peters

[Peter Hansen]
....
I wonder if it would be appropriate to say something along
the lines of '''round() returns an integer, but not an "int".'''

Well, round() is a 2-argument function, whose second argument defaults
to 0. It's quite defensible to say that round returns an integer
value when the second argument is 0.
Obviously the mathematicians will have something to say about
this. In computers, 1.0 may not be an integer data type, but
I think in math it's still considered an integer.

Depends on which mathematician you're talking to. The integer 1 is
most often defined as the set containing the empty set, or, in a
suitably restricted set theory, with the set of all sets containing 1
element (which is a proper class in most set theories). A real, OTOH,
is a godawful construction that Americans typically don't learn until
after they've completed "calculus" and gone on to "analysis".

So, instead of talking to mathematicians, I advise talking to me
<wink>. Yes, 1.0 is an integer! In fact, so is 1.9998e143: all
sufficiently large floats are exact integers. That's why, e.g.,
math.ceil() and math.floor() return arguments like 1.9998e143
unchanged -- such inputs are already integers.
I am most definitely not going to claim authority in this area, however,
since as an engineer I consider 1.0 and 1 merely "equal to a
first approximation". <wink>

If they differ at all, then e = 1.0 - 1 must be nonzero. Since it
isn't, they're identical <wink>.
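
For instance, under Python 2.x (math.floor() returns a float, and a large
float passes through unchanged because it is already an integer):

>>> import math
>>> math.floor(1.9998e143)
1.9998e+143
>>> math.floor(1.9998e143) == 1.9998e143
True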
 

Bengt Richter

Peter Hansen said:
Define "safe".


Since round() returns an integer, and since (ignoring really
large integers, since I doubt you're concerned about them
here) floating point values can handle *integer* values
perfectly well, there's no reason it shouldn't do what you
want.

The problem* with floating point is inaccurate representation
of certain _fractional_ values, not integer values.

Well, you mentioned really large integers, and I think it's worth
mentioning that you can get inaccurate representation of certain of those
values too. I.e., what you really have (for IEEE 754 doubles) is 53 bits
to count with, in steps of one weighted unit, where the unit can be 2**0
or 2**otherpower; otherpower has 11 bits to represent it, more or less
+- 2**10, with an offset for the 53. If the unit step is 2**1, you get twice
the range of integers, counting by twos, which doesn't give you a way of
representing the odd numbers in between accurately. So it's not only
fractional values that can get truncated on the right. Try adding 1.0 to
2.0**53 ;-)

>>> 2**53 + 1
9007199254740993L
>>> 2.0**53 + 1.0
9007199254740992.0

The float gets rounded down, but you can count by twos:

>>> 2.0**53 + 2.0
9007199254740994.0

(2.0**53-1.0 is the last number with an LSB value of 1.0, but you can add 1.0
to that and get an exact 2.0**53, because the unit bit that results is 0, so it
doesn't matter that it's to the right of the new LSB of 2.0**53 (which has a
weight of 2.0).)

Another way of thinking about it is that it's not about the accuracy of the
numbers; it's about how far apart the available accurate number representations
are, and the choice you have to make if the value you want to represent falls
between them.
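
A concrete way to see that choice being made (IEEE 754 doubles assumed):

>>> 2.0**52 + 0.5 == 2.0**52    # spacing is 1.0 here, so the 0.5 is dropped
True
>>> 2.0**53 + 1.0 == 2.0**53    # spacing is 2.0 here, so the 1.0 is dropped
True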

Regards,
Bengt Richter
 

Dan Bishop

Russell E. Owen said:
I realize this is probably a stupid question, but... is it safe to
round to the nearest integer by using int(round(val))?

I suspect it is fine, but I wanted to be sure that weird floating point
representation on some platform doesn't make it unsafe there (i.e. give the
wrong value due to the floating point value being an approximation), and
if so, whether there is a Better Way.

Yes, int(round(val)) is safe. The only concern is avoiding errors in
"val" itself. For example (.7/.2 is exactly 3.5 in real arithmetic):

>>> int(round(.7/.2))
3 # Should have been 4
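
The trouble is visible in the value itself before round() ever sees it:

>>> .7/.2
3.4999999999999996
>>> round(.7/.2)
3.0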
 

Steve Holden

Tim said:
[Peter Hansen]
....
I wonder if it would be appropriate to say something along
the lines of '''round() returns an integer, but not an "int".'''


Well, round() is a 2-argument function, whose second argument defaults
to 0. It's quite defensible to say that round returns an integer
value when the second argument is 0.

Obviously the mathematicians will have something to say about
this. In computers, 1.0 may not be an integer data type, but
I think in math it's still considered an integer.


Depends on which mathematician you're talking to. The integer 1 is
most often defined as the set containing the empty set, or, in a
suitably restricted set theory, with the set of all sets containing 1
element (which is a proper class in most set theories). A real, OTOH,
is a godawful construction that Americans typically don't learn until
after they've completed "calculus" and gone on to "analysis".
And most of them don't understand even then. What I want to know is why
doesn't Python 2.4 have a speedy implementation of infinite-dimensional
Banach spaces?
So, instead of talking to mathematicians, I advise talking to me
<wink>. Yes, 1.0 is an integer! In fact, so is 1.9998e143: all
sufficiently large floats are exact integers. That's why, e.g.,
math.ceil() and math.floor() return arguments like 1.9998e143
unchanged -- such inputs are already integers.

I am most definitely not going to claim authority in this area, however,
since as an engineer I consider 1.0 and 1 merely "equal to a
first approximation". <wink>

If they differ at all, then e = 1.0 - 1 must be nonzero. Since it
isn't, they're identical <wink>.

You can tell this by examining the behavior of something as simple (;-)
as a Python dict:
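
For example, something along these lines:

>>> d = {}
>>> d[1] = 'int'
>>> d[1.0] = 'float'   # 1.0 == 1, so the value is replaced but the key 1 stays
>>> d
{1: 'float'}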

Python has known for a long time that 1.0 and 1 are the same thing.
Note, however, that I don't believe it's guaranteed that the contents of
d will turn out the same in different Python versions. I suppose Tim
would be able to quote chapter and verse, given his familiarity with
every little implementation detail of the dict.

guaranteed-to-confuse-the-confusable-ly y'rs - steve
 

Mike Meyer

Bengt Richter said:
Well, you mentioned really large integers, and I think it's worth
mentioning that you can get inaccurate representation of certain of those
values too. I.e., what you really have (for IEEE 754 doubles) is 53 bits
to count with, in steps of one weighted unit, where the unit can be 2**0
or 2**otherpower; otherpower has 11 bits to represent it, more or less
+- 2**10, with an offset for the 53. If the unit step is 2**1, you get twice
the range of integers, counting by twos, which doesn't give you a way of
representing the odd numbers in between accurately. So it's not only
fractional values that can get truncated on the right. Try adding 1.0 to
2.0**53 ;-)

It's much easier than that to get integer floating point numbers that
aren't correct. Consider:

>>> long(1e70)
10000000000000000725314363815292351261583744096465219555182101554790400L

I don't know the details on 754 FP, but the FP I'm used to represents
*all* numbers as a binary fraction times an exponent. Since .1 can't
be represented exactly, 1e<anything> will be wrong if you ask for
enough digits.

This recently caused someone to propose that 1e70 should be a long
instead of a float. No one mentioned the idea of making

[0-9]+[eE]\+?[0-9]+ be of integer type, and

[0-9]*\.[0-9]+[eE][+-]?[0-9]+ be a float. [0-9]+[eE]-[0-9]+ would also
be a float. No simple rule for this, unfortunately.
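
A sketch of that classification as Python regexes (escaping the literal '+'
and '.'; this is the proposal, not Python's actual rule):

>>> import re
>>> int_form = re.compile(r'^[0-9]+[eE]\+?[0-9]+$')
>>> bool(int_form.match('1e70')), bool(int_form.match('1e-70'))
(True, False)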

<mike
 

Bengt Richter

Mike Meyer said:
It's much easier than that to get integer floating point numbers that
aren't correct. Consider:

>>> long(1e70)
10000000000000000725314363815292351261583744096465219555182101554790400L

Yes. I was just trying to identify the exact point where you lose 1.0
granularity. The last number, with all ones in the available significant bits
(including the hidden one), is 2.**53-1.0 -- 53 one-bits:

'11111111111111111111111111111111111111111111111111111'

The last power of 10 that is accurate is 22, and the reason is plain
when you look at the bits: 10**22 has more than 53 bits, but only zeroes
to the right of the 53:

'10000111100001100111100000110010011011101010110010010000000000000000000000'

whereas 10**23 has a bit to the right of the 53, so 1e23 != 10**23:

'10101001011010000001011000111111000010100101011110110000000000000000000000000'

The trailing zeroes are eliminated if you factor out the twos, i.e. use the
corresponding power of 10/2 == 5, which is always odd. Or in decimal terms,
the granularity boundary is:

>>> 2**53
9007199254740992L

so what makes 5.**22 ok is that it still fits below that boundary:

>>> 5**22 < 2**53
True
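
The boundary can be checked directly (Python 2.x; 1e22 is the last power of
ten that is exactly representable as a double):

>>> 1e22 == 10**22
True
>>> 1e23 == 10**23
False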
I don't know the details on 754 FP, but the FP I'm used to represents
*all* numbers as a binary fraction times an exponent. Since .1 can't
be represented exactly, 1e<anything> will be wrong if you ask for
enough digits.
I don't understand the "since .1 ..." logic, but I agree with the second
part. Re *all* numbers: if you multiply the fraction represented by the
53 fractional bits of any number by 2**53, you get an integer, which you
can then consider to be multiplied by 2**(the exponent for the fraction - 53),
which doesn't change anything. I did that so that I could talk about
counting by increments of 1 unit of least precision. But yes, the
usual description is as a fraction times a power of two.
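
In Python terms, math.frexp() gives that fraction-times-power-of-two view, and
scaling by 2**53 recovers the integer count (a sketch assuming IEEE 754 doubles,
where the significand has 53 bits):

>>> import math
>>> m, e = math.frexp(0.1)   # 0.1 == m * 2**e, with 0.5 <= m < 1
>>> m * 2**53, e - 53        # an integer count times a power-of-two unit
(7205759403792794.0, -56)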
This recently caused someone to propose that 1e70 should be a long
instead of a float. No one mentioned the idea of making

[0-9]+[eE]\+?[0-9]+ be of integer type, and

[0-9]*\.[0-9]+[eE][+-]?[0-9]+ be a float. [0-9]+[eE]-[0-9]+ would also
be a float. No simple rule for this, unfortunately.
I wrote a little exact decimal module based on keeping decimal exponents and
a rational numerator/denominator pair, which allows keeping an exact
representation of any reasonable (that you might feel like typing ;-) literal,
like 1e70, etc. E.g., 0.1 is kept as the tuple (1, 1, -1) -- numerator,
denominator, and decimal exponent, i.e. (1/1) * 10**-1.


The reason I mention this is not because I think all floating point constants
should be represented this way in final code, but because maybe they should be
in the compiler AST, before code has been generated. At that point, it seems a
shame to have done a premature lossy conversion to platform floating point,
since one might want to take the AST and generate code with other
representations.
I.e.,

>>> import compiler
>>> compiler.parse('a=0.1')
Module(None, Stmt([Assign([AssName('a', 'OP_ASSIGN')], Const(0.10000000000000001))]))
>>> compiler.parse('0.1')
Module(None, Stmt([Discard(Const(0.10000000000000001))]))

but an exact-decimal Const could carry

(1, 1, -1)

vs what's represented by the actual floating point bits:

(1000000000000000055511151231257827021181583404541015625L, 1L, -55)

Anyway, tuple is an easy exact possibility for intermediate representation of the number.
Of course, I guess you'd have to tag it as being from a floating point literal or else a code
generator would lose that implicit representation directive for ordinary code generation ...
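
A minimal sketch of turning a decimal literal into such a tuple (a hypothetical
helper, not Bengt's actual module; it handles only simple forms like '0.1' and
'1e70'):

def exact_decimal(literal):
    # Return (numerator, denominator, decimal_exponent) such that the literal's
    # exact value is (numerator / denominator) * 10**decimal_exponent.
    mantissa, _, exp = literal.partition('e')
    exp10 = int(exp or '0')
    if '.' in mantissa:
        whole, frac = mantissa.split('.')
        exp10 -= len(frac)          # shift the decimal point into the exponent
        mantissa = whole + frac
    return (int(mantissa), 1, exp10)

>>> exact_decimal('0.1')
(1, 1, -1)
>>> exact_decimal('1e70')
(1, 1, 70)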

Regards,
Bengt Richter
 

Tim Peters

[Steve Holden]
...
Python has known for a long time that 1.0 and 1 are the same thing.
Note, however, that I don't believe it's guaranteed that the contents of
d will turn out the same in different Python versions. I suppose Tim
would be able to quote chapter and verse, given his familiarity with
every little implementation detail of the dict.

Alas, this chapter hasn't been written yet, let alone the verse. If you do

d[k] = v

when d already has a key equal to k, its associated value is replaced,
but it's really not defined whether the old key is replaced or
retained. All known implementations of Python retain the old key in
this case.
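
For example (CPython behavior; the old key object survives even though the
new key tests equal):

>>> d = {1.0: 'a'}
>>> d[1] = 'b'
>>> d
{1.0: 'b'}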

The other *seeming* ambiguity here isn't one: whether, in {a: b, c:
d}, a or c is added first. Python requires "left to right"
evaluation, so that's actually defined -- although this one may be
more clearly defined in Guido's head than by the docs.
 

Peter Hansen

Tim said:
[Steve Holden]
Note, however, that I don't believe it's guaranteed that the contents of
d will turn out the same in different Python versions.
>
If you do

d[k] = v

when d already has a key equal to k, its associated value is replaced,
but it's really not defined whether the old key is replaced or
retained. All known implementations of Python retain the old key in
this case.

The other *seeming* ambiguity here isn't one: whether, in {a: b, c:
d}, a or c is added first. Python requires "left to right"
evaluation, so that's actually defined -- although this one may be
more clearly defined in Guido's head than by the docs.

Leading to very interesting results like this:

>>> {1: 'a', 1.0: 'b'}
{1: 'b'}

Perhaps unexpected, but clearly explained by the comments above...

-Peter
 
