int/long unification hides bugs

kartik · Oct 25, 2004

there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
most of the time, i expect my numbers to be small. 2**31 is good
enough for most uses of variables, and when more is needed, 2**63
should do most of the time.

granted, unification allows code to work for larger numbers than
foreseen (as PEP 237 states) but i feel the potential for more
undetected bugs outweighs this benefit.
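
For readers who want to see the behaviour under discussion, here is a minimal sketch. It assumes a current Python 3 interpreter (the thread dates from the Python 2 era), where there is a single arbitrary-precision int type; the variable names are illustrative only.

```python
# Assumes Python 3: one int type, arbitrary precision.
largest_int32 = 2**31 - 1

# Going past the 32-bit boundary raises nothing; the value simply grows.
grown = largest_int32 + 1
print(grown)                  # 2147483648
print(type(grown).__name__)   # int

# Much larger values behave the same way.
huge = 2**200
print(huge.bit_length())      # 201
```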

the other benefit of the unification - portability - can be achieved
by defining int32 & int64 types (or by defining all integers to be
32-bit (or 64-bit))
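
A rough sketch of what such a fixed-width type might look like if written in Python itself. `Int32` and `overflows` are hypothetical names invented for this illustration; Python has no built-in type of this kind, and only addition is covered.

```python
class Int32:
    """Hypothetical fixed-width integer; not a built-in Python type."""

    MIN, MAX = -2**31, 2**31 - 1

    def __init__(self, value):
        # Reject anything that would not fit in a signed 32-bit word.
        if not (self.MIN <= value <= self.MAX):
            raise OverflowError(f"{value} does not fit in 32 bits")
        self.value = value

    def __add__(self, other):
        other_value = other.value if isinstance(other, Int32) else other
        # The range check runs again when the result is constructed.
        return Int32(self.value + other_value)

    def __repr__(self):
        return f"Int32({self.value})"


def overflows(a, b):
    """Return True if a + b raises OverflowError."""
    try:
        a + b
    except OverflowError:
        return True
    return False
```

With this sketch, `Int32(1) + 2` yields `Int32(3)`, while adding 1 to `Int32(2**31 - 1)` raises `OverflowError` on every platform, which is the portability kartik is asking for.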

PEP 237 says, "It will give new Python programmers [...] one less
thing to learn [...]". i feel this is not so important as the quality
of code a programmer writes once he does learn the language.

-kartik

Peter Hansen · Oct 25, 2004

kartik said:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs. [snip]
PEP 237 says, "It will give new Python programmers [...] one less
thing to learn [...]". i feel this is not so important as the quality
of code a programmer writes once he does learn the language.

Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?

I'm not sure what you're asking, because even the exposure of
latent bugs which you are describing can happen only when you
*run* the code. Are you planning to have your users report
that there are bugs when the program crashes in a code path
which you didn't get around to testing manually?

-Peter

Istvan Albert · Oct 25, 2004

kartik said:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.

No it does not.

Just because a runaway program stops sooner by hitting the
integer limit does not mean that having this limit
is a validation method.

> i feel this is not so important as the quality
> of code a programmer writes

Code that relies on hitting the integer limit
is anything but high quality.

If you are worried about some numbers growing too much, then
check them yourself, you'll get much better results that way.
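
A minimal sketch of checking a value yourself. The function name, the setting it guards, and the limit of 10 are all hypothetical, chosen only to show the shape of an explicit check.

```python
def set_retry_count(count, limit=10):
    """Reject values outside the range this (hypothetical) setting allows."""
    if not (0 <= count <= limit):
        raise ValueError(f"retry count {count} outside 0..{limit}")
    return count
```

The bound here comes from what the value means, so a bad value is rejected at 11 rather than somewhere beyond 2**31.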

Istvan.

Rocco Moretti · Oct 25, 2004

kartik said:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.
most of the time, i expect my numbers to be small.

The question is how small is small? Less than 2**7? Less than 2**15?
Less than 2**31? Less than 2**63? And what's the significance of powers
of two? And what happens if you move from a 32 bit machine to a 64 bit
one? (or a 1024 bit one in a hundred years time?)

> PEP 237 says, "It will give new Python programmers [...] one less
> thing to learn [...]". i feel this is not so important as the quality
> of code a programmer writes once he does learn the language.

The thing is, the int/long cutoff is arbitrary, determined solely by
implementation detail. A much better idea is the judicious use of assertions.

assert x < 15000

Not only does it protect you from runaway numbers, it also documents
what the expected range is, resulting in a much better "quality of code".
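
As a sketch of how such an assertion might sit in real code: the function and its altitude data are hypothetical, and only the bound of 15000 comes from the post. Note that assert statements are skipped when Python runs with the -O option.

```python
def average_altitude(samples_m):
    """Mean of altitude samples in metres (hypothetical example)."""
    for x in samples_m:
        # Documents the expected range and fails at the first bad sample.
        assert 0 <= x < 15000, f"altitude out of range: {x}"
    return sum(samples_m) / len(samples_m)
```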

Alex Martelli · Oct 25, 2004

kartik said:
there seems to be a serious problem with allowing numbers to grow in a
nearly unbounded manner, as int/long unification does: it hides bugs.

So does allowing strings to be any length.

> most of the time, i expect my numbers to be small. 2**31 is good
> enough for most uses of variables, and when more is needed, 2**63
> should do most of the time.

Most of the time, I expect my strings to be short. 1000 characters is
good enough for most uses of strings, and when more is needed, a million
should do most of the time.

> granted, unification allows code to work for larger numbers than
> foreseen (as PEP 237 states) but i feel the potential for more
> undetected bugs outweighs this benefit.

Granted, unlimited string length allows code to work for longer strings
than foreseen (as common sense states) but (if you're consistent) you
feel the potential for more undetected bugs outweighs this benefit.

By this parallel, I intend to communicate that (and part of why) I
consider your observations to be totally without merit.

Alex

Michael J. Fromberger · Oct 25, 2004

> there seems to be a serious problem with allowing numbers to grow in a
> nearly unbounded manner, as int/long unification does: it hides bugs.

Can you exhibit any non-trivial examples of the types of bugs you are
talking about?

-M

Jeff Epler · Oct 26, 2004

Here's a bug that passes silently because ints are not limited in range
from 1 to 100:
...

OK, just joking. I couldn't think of one.

Jeff

Cliff Wells · Oct 26, 2004

> Here's a bug that passes silently because ints are not limited in range
> from 1 to 100:
> ...
>
> OK, just joking. I couldn't think of one.

Here's one:

# count how many ferrets I have
ferrets = 0
while 1:
    try:
        ferrets += 1
    except:
        break
print ferrets

As you can clearly see, the answer should have been 3, but due to Python
silently allowing numbers larger than 3 the program gets stuck in an
apparently interminable loop, requiring me to reboot Microsoft Bob.

kartik · Oct 26, 2004

Peter Hansen said:
Do you feel strongly enough about the quality of your code to write
automated tests for it? Or are you just hoping that one tiny class
of potential bugs will be caught for you by this feature of the
language?

1)catching overflow bugs in the language itself frees u from writing
overflow tests.
2)no test (or test suite) can catch all errors, so language support 4
error detection is welcome.
3)overflow detection helps when u dont have automated tests 4 a
particular part of your program.

> I'm not sure what you're asking, because even the exposure of
> latent bugs which you are describing can happen only when you
> *run* the code.

Agreed. i'm saying that without int/long unification, the bugs will b
found sooner & closer to where they occur, rather than propagating
throughout the program's objects & being found far away from the
source, if at all.

-kartik

kartik · Oct 26, 2004

Istvan Albert said:
No it does not.

> Just because a runaway program stops sooner by hitting the
> integer limit does not mean that having this limit
> is a validation method.

i didn't say it is. all i say is that it catches bugs - & that's
valuable.

> Code that relies on hitting the integer limit
> is anything but high quality.

once again, i'm not relying on the integer limit to catch bugs, but
i'd much rather have bugs exposed by an overflow exception than by end
users complaining about wrong data values.

> If you are worried about some numbers growing too much, then
> check them yourself, you'll get much better results that way.

maybe, why not use an automated test built-in 2 the language? i get it
4 free.

-kartik

Cliff Wells · Oct 26, 2004

> i didn't say it is. all i say is that it catches bugs - & that's
> valuable.

You did say it is. And then you said it again right there.

> once again, i'm not relying on the integer limit to catch bugs, but
> i'd much rather have bugs exposed by an overflow exception

Again, that is using the integer limit to catch bugs. Repeated
self-contradiction does little to bolster your argument.

> maybe, why not use an automated test built-in 2 the language? i get it
> 4 free.

Because, strangely enough, most people want limitations *removed* from
the language, not added to it. If you are looking for a language with
arbitrary limits then I think Python isn't quite right for you.

Steve Holden · Oct 26, 2004

kartik said:
1)catching overflow bugs in the language itself frees u from writing
overflow tests.

That seems to me to be a bit like saying you don't need to do any
engineering calculations for your bridge because you'll find out if it's
not strong enough when it falls down.

> 2)no test (or test suite) can catch all errors, so language support 4
> error detection is welcome.

Yes, but you appear to feel that an arbitrary limit on the size of
integers will be helpful, while I feel it's much better to assert that
they are in bounds as necessary. Relying on hardware overflows as error
detection is pretty poor, really.

> 3)overflow detection helps when u dont have automated tests 4 a
> particular part of your program.

But writing such tests would help much more.

> Agreed. i'm saying that without int/long unification, the bugs will b
> found sooner & closer to where they occur, rather than propagating
> throughout the program's objects & being found far away from the
> source, if at all.

Even if we assume that this specious argument is valid, what consolation
would you offer the people who actually did find that huge integers were
helpful and that their programs no longer ran after such a change?

regards
Steve

kartik · Oct 26, 2004

> The question is how small is small? Less than 2**7? Less than 2**15?
> Less than 2**31? Less than 2**63? And what's the significance of powers
> of two? And what happens if you move from a 32 bit machine to a 64 bit
> one? (or a 1024 bit one in a hundred years time?)

less than 2**31 most of the time & hardly ever greater than 2**63 - no
matter if my machine is 32-bit, 64-bit or 1024-bit. the required range
depends on the data u want 2 store in the variable & not on the
hardware.

>> PEP 237 says, "It will give new Python programmers [...] one less
>> thing to learn [...]". i feel this is not so important as the quality
>> of code a programmer writes once he does learn the language.

> The thing is, the int/long cutoff is arbitrary, determined solely by
> implementation detail.

agreed, but it need not be that way. ints can be defined to be 32-bit
(or 64-bit) on all architectures.

> A much better idea is the judicious use of assertions.
>
> assert x < 15000
>
> Not only does it protect you from runaway numbers, it also documents
> what the expected range is, resulting in a much better "quality of code"

such an assertion must be placed before every assignment to the
variable - & that's tedious. moreover, it can give u a false sense of
security when u think u have it wherever needed but u've forgotten it
somewhere.

a 32-bit limit is a crude kind of assertion that u get for free, and
one u expect should hold for most variables. for those few variables
it doesn't, u can use a long.
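
For comparison, one way the repetition could be avoided is to state the bound once and have every assignment checked automatically. This is only a sketch: `Bounded` and `Person` are hypothetical names, the 0..150 range is an assumption, and it relies on the descriptor protocol of a current Python 3 (`__set_name__` did not exist when this thread was written).

```python
class Bounded:
    """Hypothetical descriptor: range-checks every assignment to one attribute."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def __set_name__(self, owner, name):
        # Store the real value under a private name on the instance.
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} outside {self.low}..{self.high}")
        setattr(obj, self.name, value)


class Person:
    age = Bounded(0, 150)   # the bound is stated once, here

    def __init__(self, age):
        self.age = age      # checked by Bounded.__set__
```

Every later `person.age = ...` or `person.age += 1` goes through the same check, so there is no assignment site to forget.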

-kartik

Steve Holden · Oct 26, 2004

kartik said:
[yada yada]

maybe, why not use an automated test built-in 2 the language? i get it
4 free.

-kartik

Perhaps you'd like Intel to produce a CPU where the overflow limit can
be arbitrarily set?

I'm getting a bit 6 of this nonsense. Maybe I 8 something that didn't
agree with me. I of10 do that. You've 1. 4give me.

regards
S3ve

Terry Reedy · Oct 26, 2004

kartik said:
1)catching overflow bugs in the language itself frees u from writing
overflow tests.

It is a fundamental characteristic of counts and integers that adding 1 is
always valid. Given that, raising an overflow exception is itself a bug,
one that Python had and has now eliminated.

If one wishes to work with residue classes mod n, +1 is also still always
valid. It is just that (n-1) + 1 is 0 instead of n. So again, raising an
overflow error is a bug.
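
A small sketch of the residue-class case, with the modulus 12 picked arbitrarily and `succ_mod` a name made up for the illustration:

```python
n = 12  # modulus, chosen arbitrarily for illustration


def succ_mod(x, n):
    """Successor in the residue classes mod n: defined for every x."""
    return (x + 1) % n


print(succ_mod(3, n))       # 4
print(succ_mod(n - 1, n))   # 0, not n and not an error
```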

A number system that prohibits +1 for some fixed number n models, for
instance, packing items into a container. However, the limit n could be
anything, so fixing it at, say, 2**31 - 1 is almost always useless.
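
The container model might be sketched as follows. `Crate` is a hypothetical class; the point it tries to show is that the limit is a parameter of the problem rather than a property of the machine word.

```python
class Crate:
    """Hypothetical container whose capacity comes from the problem domain."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.count = 0

    def add_item(self):
        # Here "+1" really is invalid once the crate is full.
        if self.count >= self.capacity:
            raise OverflowError("crate is full")
        self.count += 1
```

A crate of capacity 3 refuses the fourth item, which a limit of 2**31 - 1 would not notice.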

The use of fixed range ints is a space-time machine performance hack that
has been costly in human thought time.

Terry J. Reedy

Steve Holden · Oct 26, 2004

Cliff said:
Here's one:

# count how many ferrets I have
ferrets = 0
while 1:
    try:
        ferrets += 1
    except:
        break
print ferrets

As you can clearly see, the answer should have been 3, but due to Python
silently allowing numbers larger than 3 the program gets stuck in an
apparently interminable loop, requiring me to reboot Microsoft Bob.

Come on, the answer should clearly have been seven. Don't try your trick
with me, buddy. I know octal is the only true number system.

And heaven knows how all those different characters got encoded in
three-bit bytes. That's got to be tricky.

7-I-can-see-that-ly y'rs - steve

Grant Edwards · Oct 26, 2004

> such an assertion must be placed before every assignment to the
> variable - & that's tedious. moreover, it can give u a false sense of
> security when u think u have it wherever needed but u've forgotten it
> somewhere.

Aargh... The word is "you".

We now return you to an argument over I'm not sure what...

kartik · Oct 26, 2004

> So does allowing strings to be any length.
>
> Most of the time, I expect my strings to be short. 1000 characters is
> good enough for most uses of strings, and when more is needed, a million
> should do most of the time.
>
> Granted, unlimited string length allows code to work for longer strings
> than foreseen (as common sense states) but (if you're consistent) you
> feel the potential for more undetected bugs outweighs this benefit.
>
> By this parallel, I intend to communicate that (and part of why) I
> consider your observations to be totally without merit.

integers are used in different ways from strings. i may expect file
paths to be around 100 characters, and if i get a 500-character path,
i have no problem just because of the length. but if a person's age is
500 where i expect it to be less than 100, then **definitely**
something's wrong.
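
To make the age example concrete, here is a sketch comparing a 32-bit range test with a check on the expected range. Both helper names are hypothetical; the bound of 100 is the one given in the post.

```python
INT32_MAX = 2**31 - 1


def fits_in_int32(value):
    """Would this value survive in a signed 32-bit integer?"""
    return -2**31 <= value <= INT32_MAX


def is_plausible_age(value, upper=100):
    """Check against the range the post says it expects."""
    return 0 <= value <= upper


age = 500
print(fits_in_int32(age))      # True: a 32-bit overflow check stays silent
print(is_plausible_age(age))   # False: the range check flags it
```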

as another example, using too long a string as an index into a
dictionary is not a problem (true, the dictionary may not have a
mapping, but i have the same issue with a short string). but too long
an index into a list rewards me with an exception.
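
The difference described here can be seen in a short sketch; the dictionary and list contents are made up for the illustration.

```python
paths = {"short": 1}
items = ["a", "b", "c"]

# An overlong key is simply absent from the dictionary.
long_key = "x" * 500
lookup = paths.get(long_key)
print(lookup)                 # None

# An index past the end of the list raises an exception.
try:
    items[10]
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__
print(error_name)             # IndexError
```

The exception comes from the length of this particular list, a bound that belongs to the data rather than to the width of the integer type.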

as i look at my code, i rarely have an issue with string sizes, but if
an integer variable gets very large (say > 2**31 or 2**63), it
generally reflects a bug in my code.

i suggest u base your comments on real code, rather than reasoning in
an abstract manner from your ivory tower.

-kartik

Grant Edwards · Oct 26, 2004

> Come on, the answer should clearly have been seven. Don't try your trick
> with me, buddy. I know octal is the only true number system.

Ah, but split octal or, uh, unsplit octal?

377 377 vs. 177777

That is the real question.
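
For anyone who has not met split octal: it writes each byte of a 16-bit word as its own octal number. A quick check of the arithmetic, using Python 3's `0o` literal syntax (the variable names are illustrative):

```python
high_byte = 0o377   # 255
low_byte = 0o377    # 255

# Joining the two bytes gives the same word as the unsplit notation.
word = (high_byte << 8) | low_byte
print(oct(word))           # 0o177777
print(word == 0o177777)    # True
```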

Cliff Wells · Oct 26, 2004

> Come on, the answer should clearly have been seven. Don't try your trick
> with me, buddy. I know octal is the only true number system.

Microsoft Bob assures me that octal isn't a word, but suggested
'octopus' instead. My program was about ferrets, so I think you are
mistaken.

Octopus hugs,
Cliff
