A funnily inconsistent behavior of int and float

Lie

I've noticed some oddly inconsistent behavior with int and float:

Python 2.5.1 (r251:54863, Mar 7 2008, 03:39:23)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
>>> int('- 345')
-345

works, but

>>> float('- 345.083')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): - 345.083

The problem seems to be that float can't accept spaces between the
sign and the number while int can. Possibly caused by some missing
regex statement. Minor and harmless (most of the time), but should be
made known.
 
Colin J. Williams

Grant said:
I've noticed some oddly inconsistent behavior with int and float:

Python 2.5.1 (r251:54863, Mar 7 2008, 03:39:23)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
>>> int('- 345')
-345

works,

IMO, it oughtn't.

Agreed, it seems inconsistent with the integer literal syntax
(Python 2.5 docs, section 2.4.4, "Integer and long integer literals").
Python 3.0 doesn't appear to spell out the literal syntax; it would be
helpful if it did.

Colin W.

[snip]
 
Mark Dickinson

I've noticed some oddly inconsistent behavior with int and float:

Python 2.5.1 (r251:54863, Mar  7 2008, 03:39:23)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
>>> int('-          345')
-345

works, but

>>> float('-       345.083')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -       345.083

This is a known issue that has been fixed for Python 3.0.
It was decided not to risk breakage by changing this in
Python 2.x. See:

http://bugs.python.org/issue1779

Mark
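
The Python 3 behaviour Mark describes can be checked directly. The following is a minimal sketch, assuming a Python 3 interpreter; the helper name `accepts` is made up for illustration and is not part of any library.

```python
def accepts(converter, text):
    """Return True if converter(text) succeeds, False if it raises ValueError."""
    try:
        converter(text)
    except ValueError:
        return False
    return True


# Whitespace around the whole number is accepted by both converters.
print(accepts(int, "  -345  "))
print(accepts(float, "  -345.083  "))

# Whitespace between the sign and the digits is rejected by both in Python 3.
print(accepts(int, "- 345"))
print(accepts(float, "- 345.083"))
```

On Python 2.x the third call would still report success, which is the inconsistency the thread started with.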
 
Colin J. Williams

Mark said:
I've noticed some oddly inconsistent behavior with int and float:

Python 2.5.1 (r251:54863, Mar 7 2008, 03:39:23)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
>>> int('- 345')
-345

works, but

>>> float('- 345.083')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): - 345.083

This is a known issue, that has been fixed for Python 3.0.
It was decided not to risk breakage by changing this in
Python 2.x. See:

http://bugs.python.org/issue1779

Mark

This is good but the documentation for
3.0 is missing the syntax documentation
from 2.5

Colin W.
 
Mark Dickinson

This is good but the documentation for
3.0 is missing the syntax documentation
from 2.5

Is

http://docs.python.org/dev/3.0/reference/lexical_analysis.html#integer-literals

the documentation that you're looking for?

But it seems to me that Lie's original point isn't really
about integer *literals* anyway---it's about the behaviour
of the built-in int() function when applied to a string. So

http://docs.python.org/dev/3.0/library/functions.html#int

is probably the appropriate place in the documentation. And
I agree that it could be made clearer exactly what strings are
acceptable here.

Mark
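
The distinction Mark draws between integer literals and int() applied to a string can be made concrete. This is a sketch assuming Python 3 and the standard `ast` module; the variable names are illustrative only.

```python
import ast

# As source code, "- 345" is a unary minus applied to the literal 345.
# The space is handled by the expression grammar, not by the literal.
tree = ast.parse("- 345", mode="eval")
is_unary_minus = (
    isinstance(tree.body, ast.UnaryOp) and isinstance(tree.body.op, ast.USub)
)

# int() parses a string by its own rules and does not evaluate expressions.
try:
    int("- 345")
    int_accepts = True
except ValueError:
    int_accepts = False

print(is_unary_minus, int_accepts)
```

So the sign is never part of a literal in source code, while the string form accepted by int() has to define for itself where a sign may appear.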
 
Terry Reedy

| > This is good but the documentation for
| > 3.0 is missing the syntax documentation
| > from 2.5
|
| Is
|
| http://docs.python.org/dev/3.0/reference/lexical_analysis.html#integer-literals
|
| the documentation that you're looking for?
|
| But it seems to me that Lie's original point isn't really
| about integer *literals* anyway---it's about the behaviour
| of the built-in int() function when applied to a string. So
|
| http://docs.python.org/dev/3.0/library/functions.html#int
|
| is probably the appropriate place in the documentation. And
| I agree that it could be made clearer exactly what strings are
| acceptable here.

Agreed.

It says "If radix is zero, the interpretation is the same as for integer
literals."
But integer literals are unsigned. Is radix 0 any different from the
default of radix 10?

It also says "If the argument is a string, it must contain a possibly
signed number of arbitrary size, possibly embedded in whitespace." But
only integers, not 'numbers' as some would understand that, are accepted.

My suggestions:
1. Change signature to: int([number | string[, radix]]).
This makes it clear that radix can only follow a string without having to
say so in the text.

2. Replace text with:
Convert a number or string to an integer. If no arguments are given,
return 0. If a number is given, return number.__int__(). Conversion of
floating point numbers to integers truncates towards zero. A string must
be a base-radix integer literal optionally preceded by '+' or '-' (with no
space in between) and optionally surrounded by whitespace. A base-n
literal consists of the digits 0 to n-1, with 'a' to 'z' (or 'A' to 'Z')
having values 10 to 35. The default radix is 10. The allowed values are 0
and 2-36, with 0 the same as 10.

If 0 is not the same as 10, the last would have to be changed, but I could
not detect any difference in a quick test.

After looking at any comments here, I will consider submitting these to the
tracker.

Terry Jan Reedy
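
Suggestion 1 rests on the fact that a radix only makes sense together with a string argument. A short sketch of the call shapes involved, assuming Python 3; `try_int` is a hypothetical helper written for this illustration.

```python
def try_int(*args):
    """Call int(*args); return the result, or the exception class name on failure."""
    try:
        return int(*args)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


print(try_int())           # no arguments
print(try_int(3.9))        # a number: truncated towards zero
print(try_int(-3.9))
print(try_int("ff", 16))   # a string with a radix
print(try_int(255, 16))    # a number with a radix is refused
```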
 
Mark Dickinson

My suggestions:
1. Change signature to: int([number | string[, radix]]).
This makes it clear that radix can only follow a string without having to
say so in the text.

2. Replace text with:
Convert a number or string to an integer. If no arguments are given,
return 0. If a number is given, return number.__int__(). Conversion of
floating point numbers to integers truncates towards zero. A string must
be a base-radix integer literal optionally preceded by '+' or '-' (with no
space in between) and optionally surrounded by whitespace. A base-n
literal consists of the digits 0 to n-1, with 'a' to 'z' (or 'A' to 'Z')
having values 10 to 35. The default radix is 10. The allowed values are 0
and 2-36, with 0 the same as 10.

If 0 is not the same as 10, the last would have to be changed, but I could
not detect any difference in a quick test.

After looking at any comments here, I will consider submitting these to the
tracker.

Terry Jan Reedy

Looks good! The description should probably also mention the optional
'0b', '0o' or '0x' (or '0B', '0O', '0X') allowed between the sign and
the digits (or before the digits in the case of a missing sign) when
base=2, base=8 or base=16.

The only base 0 versus base 10 difference I could find was the
following:
>>> int('033', 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 0: '033'
[38720 refs]
>>> int('033')
33

Mark
 
Mark Dickinson

The only base 0 versus base 10 difference I could find was the
following:

>>> int('033', 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 0: '033'
[38720 refs]
>>> int('033')
33

Mark

And also things like:
>>> int('0x33')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0x33'
[38719 refs]
>>> int('0x33', 0)
51
[38719 refs]

Mark
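
The base 0 versus base 10 differences Mark reports can be collected in one place. A minimal sketch, assuming Python 3; `parse` is a hypothetical wrapper, not a standard function.

```python
def parse(text, base=10):
    """Return int(text, base), or None if int() raises ValueError."""
    try:
        return int(text, base)
    except ValueError:
        return None


print(parse("033"))        # base 10 ignores leading zeros
print(parse("033", 0))     # base 0 follows literal syntax, which rejects this form
print(parse("0x33"))       # base 10 does not understand the prefix
print(parse("0x33", 0))    # base 0 infers base 16 from the prefix
print(parse("0x33", 16))   # an explicit base 16 also accepts the prefix
```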
 
Terry Reedy

Thank you for the corrections. Here is my revised proposal:

int([number | string[, radix]])
Convert a number or string to an integer. If no arguments are given,
return 0. If a number is given, return number.__int__(). Conversion of
floating point numbers to integers truncates towards zero. A string must
be a base-radix integer literal optionally preceded by '+' or '-' (with no
space in between) and optionally surrounded by whitespace. A base-n
literal consists of the digits 0 to n-1, with 'a' to 'z' (or 'A' to 'Z')
having values 10 to 35. The default radix is 10. The allowed values are 0
and 2-36. Base-2, -8, and -16 literals can be optionally prefixed with
0b/0B, 0o/0O, or 0x/0X, as with integer literals in code. Radix 0 means to
interpret exactly as a code literal, so that the actual radix is 2, 8, 10,
or 16, and so that int('010',0) is not legal, while int('010') is.

Terry Jan Reedy
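
The revised wording can be turned into a small checker and compared against int() itself. This is only a sketch of the proposal's text for an explicit radix from 2 to 36 (radix 0 is left out); `matches_proposal` and `int_accepts` are hypothetical helper names, and the sketch ignores anything the proposal does not mention.

```python
import string

DIGITS = string.digits + string.ascii_lowercase   # digit values 0..35
PREFIXES = {2: "0b", 8: "0o", 16: "0x"}           # optional literal prefixes


def matches_proposal(text, radix=10):
    """Check text against the proposed wording for an explicit radix (2-36)."""
    body = text.strip().lower()          # optionally surrounded by whitespace
    if body[:1] in ("+", "-"):           # optional sign, no space after it
        body = body[1:]
    prefix = PREFIXES.get(radix)
    if prefix and body.startswith(prefix):
        body = body[2:]
    allowed = DIGITS[:radix]
    return body != "" and all(ch in allowed for ch in body)


def int_accepts(text, radix=10):
    """Return True if int(text, radix) succeeds."""
    try:
        int(text, radix)
    except ValueError:
        return False
    return True


samples = [(" -ff ", 16), ("- 345", 10), ("0x1f", 16), ("0x1f", 10),
           ("z", 36), ("12", 2), ("+7", 10)]
for text, radix in samples:
    print(repr(text), radix, matches_proposal(text, radix), int_accepts(text, radix))
```

For these samples the two columns agree, which is what one would hope for if the wording describes the implementation.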
 
Lie

2. Replace text with:
Convert a number or string to an integer.  If no arguments are given,
return 0.  If a number is given, return number.__int__().  Conversion of
floating point numbers to integers truncates towards zero.  A string must
be a base-radix integer literal optionally preceded by '+' or '-' (with no
space in between) and optionally surrounded by whitespace.  A base-n
literal consists of the digits 0 to n-1, with 'a' to 'z' (or 'A' to 'Z')
having values 10 to 35.  The default radix is 10. The allowed values are 0
and 2-36, with 0 the same as 10.

If 0 is not the same as 10, the last would have to be changed, but I could
not detect any difference in a quick test.

One thing though, I think it should say "may be surrounded by
whitespace" as opposed to "optionally surrounded by whitespace".

I've noticed some oddly inconsistent behavior with int and float:
Python 2.5.1 (r251:54863, Mar 7 2008, 03:39:23)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
>>> int('- 345')
-345

works,

IMO, it oughtn't.

>>> float('- 345.083')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): - 345.083

That's the behavior I'd expect.

Sorry for the confusion: by "works" I mean that the interpreter doesn't
complain at all, not that it behaves as it should.
 
Colin J. Williams

Terry said:
| > This is good but the documentation for
| > 3.0 is missing the syntax documentation
| > from 2.5
|
| Is
|
| http://docs.python.org/dev/3.0/reference/lexical_analysis.html#integer-literals
|
| the documentation that you're looking for?
Yes, thanks. I missed it.

Colin W.
 
Mark Dickinson

Thank you for the corrections. Here is my revised proposal:

int([number | string[, radix]])
...

Excellent!

It looks to me as though this covers everything. I'm tempted to
quibble about exact wordings, but probably the most productive thing
to do would be just to submit this to bugs.python.org and then let
Georg Brandl work his magic on it. :)

Mark
 
Terry Reedy

| Excellent!
| It looks to me as though this covers everything. I'm tempted to
| quibble about exact wordings, but probably the most productive thing to do
| would be just to submit this to bugs.python.org and then let Georg Brandl
| work his magic on it. :)

http://bugs.python.org/issue2580

tjr
 
