[BUG] IMO, but no opinions? Uncle Tim? was: int(float(sys.maxint)) buglet ?


Bengt Richter

Peculiar boundary cases:
-2147483648

some kind of off-by-one error? I.e., just inside the extremes works:
>>> [int(x) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
[-2147483648L, -2147483647, 2147483646, 2147483647L]

But those longs at the extremes can be converted successfully, so int(int(x)) works ;-/
>>> [int(int(x)) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
[-2147483648, -2147483647, 2147483646, 2147483647]

ISTM this is a buglet, or at least a wartlet for a 32-bit system ;-)

Almost forgot:
Python 2.4b1 (#56, Nov 3 2004, 01:47:27)
[GCC 3.2.3 (mingw special 20030504-1)] on win32

but same thing on 2.3.2:

Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> [int(x) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
[-2147483648L, -2147483647, 2147483646, 2147483647L]
>>> [int(int(x)) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
[-2147483648, -2147483647, 2147483646, 2147483647]

Hm, ... except for the thought that on systems with 64-bit ints a float can't
represent maxint exactly, I might say maybe these reprs should be tested
for equality in the system tests?
('-2147483648L', '-2147483648')

or maybe at least check for equality of these?:
(<type 'long'>, <type 'int'>)
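
Something along these lines is the kind of check I have in mind (just a sketch,
assuming a 32-bit build where sys.maxint == 2**31 - 1; the type comparison is
the part that currently fails):

>>> import sys
>>> f = float(sys.maxint)             # 2147483647.0 is exactly representable in a double
>>> int(f) == sys.maxint              # the value round-trips fine
True
>>> type(int(f)) is type(sys.maxint)  # but the types disagree (long vs. int)
False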

Regards,
Bengt Richter
 

jepler

Python 2.2.2: 2147483647

Python 2.3.2: 2147483647L

If you're looking for a suspicious change, this should narrow it down.
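
A one-liner along these lines shows the difference (a sketch, not the exact
command used):

import sys
print sys.version.split()[0], repr(int(float(sys.maxint)))

On 2.2.2 that prints 2147483647; on 2.3.2 it prints 2147483647L.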

Jeff

 

Tim Peters

[Bengt Richter]
Peculiar boundary cases:

-2147483648

some kind of off-by-one error?

It would help if you were explicit about what you think "the error"
is. I see a correct result in all cases there.

Is it just that sometimes

int(a_float)

returns a Python long when

int(a_long_with_the_same_value_as_that_float)

returns a Python int? If so, that's not a bug -- there's no promise
anywhere, e.g., that Python will return an int whenever it's
physically possible to do so.

Python used to return a (short) int in all cases above, but that led
to problems on some oddball systems. See the comments for float_int()
in floatobject.c for more detail. Slowing float_int() to avoid those
problems while returning a short int whenever physically possible is a
tradeoff I would oppose.
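
Code that cares about the value doesn't notice the difference anyway; for
example (a sketch, with results as on the 32-bit boxes shown earlier):

>>> f = 2.0**31 - 1.0
>>> int(f) == int(long(f))            # equal values, regardless of ...
True
>>> type(int(f)), type(int(long(f)))  # ... which concrete type comes back
(<type 'long'>, <type 'int'>)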
 

Bengt Richter

[Bengt Richter]
Peculiar boundary cases:

-2147483648

some kind of off-by-one error?

It would help if you were explicit about what you think "the error"
is. I see a correct result in all cases there.

Is it just that sometimes

int(a_float)

returns a Python long when

int(a_long_with_the_same_value_as_that_float)

returns a Python int? If so, that's not a bug -- there's no promise
anywhere, e.g., that Python will return an int whenever it's
physically possible to do so.
Ok, I understand the expediency of that policy, but what is now the meaning
of int, in that case? Is it now just a vestigial artifact on the way to
transparent unification of int and long to a single integer type?

Promises or not, ISTM that if int->float succeeds in preserving all significant bits,
then a following float->int should also succeed without converting to long.
Python used to return a (short) int in all cases above, but that led
to problems on some oddball systems. See the comments for float_int()
in floatobject.c for more detail. Slowing float_int() to avoid those
problems while returning a short int whenever physically possible is a
tradeoff I would oppose.

The 2.3.2 source snippet in floatobject.c :
--------------
static PyObject *
float_int(PyObject *v)
{
        double x = PyFloat_AsDouble(v);
        double wholepart;       /* integral portion of x, rounded toward 0 */

        (void)modf(x, &wholepart);
        /* Try to get out cheap if this fits in a Python int.  The attempt
         * to cast to long must be protected, as C doesn't define what
         * happens if the double is too big to fit in a long.  Some rare
         * systems raise an exception then (RISCOS was mentioned as one,
         * and someone using a non-default option on Sun also bumped into
         * that).  Note that checking for >= and <= LONG_{MIN,MAX} would
         * still be vulnerable:  if a long has more bits of precision than
         * a double, casting MIN/MAX to double may yield an approximation,
         * and if that's rounded up, then, e.g., wholepart=LONG_MAX+1 would
         * yield true from the C expression wholepart<=LONG_MAX, despite
         * that wholepart is actually greater than LONG_MAX.
         */
        if (LONG_MIN < wholepart && wholepart < LONG_MAX) {
                const long aslong = (long)wholepart;
                return PyInt_FromLong(aslong);
        }
        return PyLong_FromDouble(wholepart);
}
--------------

But this is apparently accessed through a table of pointers, so would you oppose
an auto-configuration that tested once whether
int(float(sys.maxint))==sys.maxint and int(float(-sys.maxint-1))==-sys.maxint-1
(assuming that's sufficient, of which I'm not 100% sure ;-) and if so switched
the pointer to a version that tested if(LONG_MIN <= wholepart && wholepart<=LONG_MAX)
instead of the safe-for-some-obscure-system version?
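
At the Python level, the one-time check I'm imagining amounts to something like
this (a sketch only; the real switch would of course have to happen in C):

import sys
# true when a double can hold sys.maxint and -sys.maxint-1 exactly,
# i.e. on typical 32-bit builds
roundtrip_exact = (int(float(sys.maxint)) == sys.maxint and
                   int(float(-sys.maxint - 1)) == -sys.maxint - 1)
# if roundtrip_exact, the <= form of the comparison would be safe to install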

Of course, if int isn't all that meaningful any more, I guess the problem can be moved to the
ctypes module, if that gets included amongst the batteries ;-)

Regards,
Bengt Richter
 

Tim Peters

[Tim Peters]
[Bengt Richter]
Ok, I understand the expediency of that policy, but what is now the meaning
of int, in that case? Is it now just a vestigial artifact on the way to
transparent unification of int and long to a single integer type?

I don't really know what you mean by "int". Python isn't C, and the
distinction between Python's historical short integers and unbounded
integers is indeed going away. "int" is the name of a specific Python
type, and the constructor for that type (which old-timers will think
of as the builtin function named "int()") is happy to return unbounded
integers in modern Pythons too. Python-level distinctions here have
become increasingly meaningless over time; I expect that "int" and
"long" will eventually become synonyms for the same type at the Python
level.

The distinction remains very visible at the Python C API level, for
obvious reasons, but even C code has to be prepared to deal with the fact
that a PyIntObject or a PyLongObject may be given in contexts where "an
integer" is required.
Promises or not, ISTM that if int->float succeeds in preserving all significant bits,
then a following float->int should also succeed without converting to long.

Yes, that was obvious <wink>. But you haven't explained why you
*care*, or, more importantly, why someone else should care. It just
as obviously doesn't bother me, and I'm bold enough to claim that it
"shouldn't" bother anyone. This seems as peripheral to me as arguing
that "there's something wrong" about returning "a long" in either of
these cases:
0L

The implementations of getsize() and .tell() certainly could have
endured complications to ensure that "an int", and not "a long", was
returned whenever physically possible to do so -- but why bother?

....
The 2.3.2 source snippet in floatobject.c :
--------------
static PyObject *
float_int(PyObject *v)
{ ....

But this is apparently accessed through a table of pointers, so would you oppose
an auto-configuration that tested once whether
int(float(sys.maxint))==sys.maxint and int(float(-sys.maxint-1))==-sys.maxint-1
(assuming that's sufficient, of which I'm not 100% sure ;-) and if so switched
the pointer to a version that tested if(LONG_MIN <= wholepart && wholepart<=LONG_MAX)
instead of the safe-for-some-obscure-system version?

In the absence of identifying an actual problem this would solve, I
would oppose adding *gratuitous* complication. Abusing your sense of
aesthetics isn't "an actual problem" in this sense to me, although it
may be to you. Of course you're welcome to make any code changes you
like in your own copy of Python <wink>.
Of course, if int isn't all that meaningful any more, I guess the problem can be
moved to the ctypes module, if that gets included amongst the batteries ;-)

What problem? If there's an actual bug here, please open a bug report.
 

Nick Coghlan

In the absence of identifying an actual problem this would solve, I
would oppose adding *gratuitous* complication. Abusing your sense of
aesthetics isn't "an actual problem" in this sense to me, although it
may be to you. Of course you're welcome to make any code changes you
like in your own copy of Python <wink>.

>>> _int = int
>>> def int(*args): return _int(_int(*args))
...
>>> from sys import maxint
>>> int(float(maxint))
2147483647
>>> int(float(-maxint-1))
-2147483648

Pretty! };>

Cheers,
Nick.
 

Bengt Richter

[Tim Peters]
[Bengt Richter]
Ok, I understand the expediency of that policy, but what is now the meaning
of int, in that case? Is it now just a vestigial artifact on the way to
transparent unification of int and long to a single integer type?

I don't really know what you mean by "int".
Me neither, now that I'd been nudged into thinking about it -- that's why I was *asking* ;-)
Python isn't C, and the distinction between Python's historical short integers and unbounded
integers is indeed going away. "int" is the name of a specific Python
type, and the constructor for that type (which old-timers will think
of as the builtin function named "int()") is happy to return unbounded
integers in modern Pythons too. Python-level distinctions here have
become increasingly meaningless over time; I expect that "int" and
"long" will eventually become synonyms for the same type at the Python
level.
I guess the above is a long spelling of "yes" as an answer to my question ;-)

I assumed that Python's int was not indifferent to the underlying platform's
native C integer representations, and that its presence was a compromise
(now being deprecated) to permit arm's-length control over the internal
representation, for whatever reason (most likely to ease interfacing with
something using a fixed representation deriving from a C library).
The distinction remains very visible at the Python C API level, for
obvious reasons, but even C code has to be prepared to deal with the fact
that a PyIntObject or a PyLongObject may be given in contexts where "an
integer" is required.


Yes, that was obvious <wink>. But you haven't explained why you
*care*, or, more importantly, why someone else should care.
IME a corner-case discrepancy in a 1:1 correspondence is a bug waiting to appear
and bite. I admit to an aesthetic component to my unease with it though ;-)
It just as obviously doesn't bother me, and I'm bold enough to claim that it
"shouldn't" bother anyone. This seems as peripheral to me as arguing
that "there's something wrong" about returning "a long" in either of
these cases:

0L

The implementations of getsize() and .tell() certainly could have
endured complications to ensure that "an int", and not "a long", was
returned whenever physically possible to do so -- but why bother?
Those calls have obvious reasons for handling large-file sizes, but
they don't explicitly call for an int representation (whatever that means).

"""
Help on class int in module __builtin__:

class int(object)
| int(x[, base]) -> integer
|
| Convert a string or number to an integer, if possible. A floating point
| argument will be truncated towards zero (this does not include a string
| representation of a floating point number!) When converting a string, use
| the optional base. It is an error to supply a base when converting a
| non-string. If the argument is outside the integer range a long object
| will be returned instead.
"""
Maybe the above should be amended to say that the "integer range"
is [-sys.maxint, sys.maxint-1] when the argument is a float ;-)

Or explain what "an integer" means ;-)
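
A quick way to see the effective boundary for float arguments (a sketch; output
as on the 32-bit builds above):

>>> import sys
>>> [type(int(float(x))) for x in (-sys.maxint-1, -sys.maxint, sys.maxint-1, sys.maxint)]
[<type 'long'>, <type 'int'>, <type 'int'>, <type 'long'>]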
...

In the absence of identifying an actual problem this would solve, I
would oppose adding *gratuitous* complication. Abusing your sense of
aesthetics isn't "an actual problem" in this sense to me, although it
may be to you. Of course you're welcome to make any code changes you
like in your own copy of Python <wink>.
...
What problem? If there's an actual bug here, please open a bug report.
I guess there won't be a bug until it bites, so we have to imagine an app
where it would matter not to be able to get sys.maxint from a float
(except via int(int(floatarg)) ;-)

However unlikely, that seemed to me most likely to happen when interfacing
to some external function requiring a C-derived integer representation,
which is why I thought ctypes might be where a corner-case long that should
have been an int could be detected, and the "problem" solved there with a
corner-case test and conversion, so as not to fail on a legitimate value
whose specific representation is something the Python language per se is
disassociating itself from ;-)

Not saying it's a bug wrt Python's unified integer future, just noting
the slow death of legacy expectations, and I guess tending to view it
as a bug so long as int implies anything at all about representation,
and sys.maxint purports to mean something specific about that ;-)

Regards,
Bengt Richter
 
