range() is not the best way to check range?

S

Summercoolness

it seems that range() can be really slow:

the following program will run, and the last line shows how long it ran
for:

import time

startTime = time.time()

a = 1.0
for i in range(0, 30000):
if i in range (0, 10000):
a += 1
if not i % 1000: print i

print a, " ", round(time.time() - startTime, 1), "seconds"


---------------------------------
the last line of output is
---------------------------------

10001.0 22.8 seconds

so if i change the line

if i in range (0, 10000):

to

if i >= 0 and i < 10000:

the the last line is

10001.0 0.2 seconds

so approximately, the program ran 100 times faster!

or is there an alternative use of range() or something similar that can
be as fast?
 
D

Dan Bishop

it seems that range() can be really slow: ....
if i in range (0, 10000):

This creates a 10,000-element list and sequentially searches it. Of
course that's gonna be slow.
 
L

Leif K-Brooks

or is there an alternative use of range() or something similar that can
be as fast?

You could use xrange:

leif@ubuntu:~$ python -m timeit -n10000 "1 in range(10000)"
10000 loops, best of 3: 260 usec per loop
leif@ubuntu:~$ python -m timeit -n10000 "1 in xrange(10000)"
10000 loops, best of 3: 0.664 usec per loop
 
D

Dan Bishop

Leif said:
You could use xrange:

leif@ubuntu:~$ python -m timeit -n10000 "1 in range(10000)"
10000 loops, best of 3: 260 usec per loop
leif@ubuntu:~$ python -m timeit -n10000 "1 in xrange(10000)"
10000 loops, best of 3: 0.664 usec per loop

That's only because you're choosing a number that's early in the list.

~$ python -m timeit -n10000 "1 in xrange(10000)"
10000 loops, best of 3: 1.22 usec per loop
~$ python -m timeit -n10000 "9999 in xrange(10000)"
10000 loops, best of 3: 1.24 msec per loop

That's *milliseconds*, not microseconds.
 
J

John Machin

it seems that range() can be really slow:

the following program will run, and the last line shows how long it ran
for:

import time

startTime = time.time()

a = 1.0
for i in range(0, 30000):
if i in range (0, 10000):
a += 1
if not i % 1000: print i

print a, " ", round(time.time() - startTime, 1), "seconds"


---------------------------------
the last line of output is
---------------------------------

10001.0 22.8 seconds

so if i change the line

if i in range (0, 10000):

to

if i >= 0 and i < 10000:

the the last line is

10001.0 0.2 seconds

so approximately, the program ran 100 times faster!

or is there an alternative use of range() or something similar that can
be as fast?

Some things to try:
1a. Read what the manual has to say about the range() function ... what
does it produce?

1b. Read what the manual has to say about time.time() and time.clock().
Change over to using time.clock(). Change the round(...., 1) to (say) 4.

Alternatively, use something like this:

print "%.1f ... %.4f seconds" % (a, time.clock() - startTime)

1c. Repeat the two ways that you tried already.

2. First alternative:

Do this:

test_range = range(10000)

*once*, just after "a = 1.0".

and change your if test to

if i in test_range:

3. Now change that to:

test_range = set(range(10000))

4. Now forget about test_range, and change your if test to this:

if 0 <= i < 10000:

HTH,
John
 
K

K.S.Sreeram

so if i change the line
if i in range (0, 10000):
to
if i >= 0 and i < 10000: [snip;]
is there an alternative use of range() or something similar that can
be as fast?


you've found that alternative yourself! just use the comparison operators...

in fact, you can write a little more compact as:
if 0 <= i < 10000 :


[sreeram;]


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEvFGdrgn0plK5qqURAlXhAJ9mrod2ofLGlSCksnXqjzWDd0Y35wCgtx0f
+Fn3h07cOFT16NvCX+DDqnY=
=Hk3J
-----END PGP SIGNATURE-----
 
G

Grant Edwards

it seems that range() can be really slow:

the following program will run, and the last line shows how long it ran
for:

import time

startTime = time.time()

a = 1.0
for i in range(0, 30000):
if i in range (0, 10000):
a += 1
if not i % 1000: print i

print a, " ", round(time.time() - startTime, 1), "seconds"

or is there an alternative use of range() or something similar that can
be as fast?

Creating and then searching a 10,000 element list to see if a
number is between two other numbers is insane.

Using xrange as somebody else suggested is also insane.

If you want to know if a number is between two other numders,
for pete's sake use the comparison operator like god intended.

if 0 <= i <= 10000:
 
T

tac-tics

Grant said:
for pete's sake use the comparison operator like god intended.

if 0 <= i <= 10000:

I'm assuming you used Python's compound comparison as opposed to the
C-style of and'ing two comparisons together to emphasize the fact it is
god's chosen way of doing this ;-)
 
M

Marc 'BlackJack' Rintsch

tac-tics said:
I'm assuming you used Python's compound comparison as opposed to the
C-style of and'ing two comparisons together to emphasize the fact it is
god's chosen way of doing this ;-)

Pete doesn't like to be called god in public. ;-)

Ciao,
Marc 'BlackJack' Rintsch
 
N

Nick Craig-Wood

Grant Edwards said:
Creating and then searching a 10,000 element list to see if a
number is between two other numbers is insane.
Using xrange as somebody else suggested is also insane.

Aye to both
If you want to know if a number is between two other numders,
for pete's sake use the comparison operator like god intended.

if 0 <= i <= 10000:

Sets are pretty fast too, and have the advantage of flexibility in
that you can put any numbers in you like

$ python2.4 -m timeit -s 's=range(0,10000); i=5000' 'i in s'
1000 loops, best of 3: 228 usec per loop

$ python2.4 -m timeit -s 's=set(range(0,10000)); i=5000' 'i in s'
1000000 loops, best of 3: 0.312 usec per loop

$ python2.4 -m timeit -s 'i=5000' '0 <= i < 10000'
1000000 loops, best of 3: 0.289 usec per loop

The below prints

range) That took 21.512 seconds: result 10001.0
set) That took 0.023 seconds: result 10001.0
comparison) That took 0.024 seconds: result 10001.0

.............................................................

import time

start = time.time()
a = 1.0
for i in range(0, 30000):
if i in range(0, 10000):
a += 1
dt = time.time() - start
print "range) That took %.3f seconds: result %s" % (dt, a)

start = time.time()
a = 1.0
mine = set(range(0, 10000))
for i in range(0, 30000):
if i in mine:
a += 1
dt = time.time() - start
print "set) That took %.3f seconds: result %s" % (dt, a)

start = time.time()
a = 1.0
mine = set(range(0, 10000))
for i in range(0, 30000):
if 0 <= i < 10000:
a += 1
dt = time.time() - start
print "comparison) That took %.3f seconds: result %s" % (dt, a)
 
P

Paul Boddie

John said:
it seems that range() can be really slow:
[...]

Some things to try:
1a. Read what the manual has to say about the range() function ... what
does it produce?

Indeed. Still, the addition of a __contains__ method to range (and
xrange) would permit acceptable performance for the code given. Perhaps
this is a convenience worth considering for future Python releases.

Paul
 
A

Andy Dingley

it seems that range() can be really slow:
if i in range (0, 10000):

RTFM on range()

You're completely mis-using it here, using it with an if ... in ...
test. The purpose of range() in Python is as loop control, not
comparisons! It's not a SQL BETWEEN statement.

Although you _can_ do this (you've done it!) you've also found that
it's slow. Many people would argue that even using range() for loop
control is unusably slow.
 
L

Leif K-Brooks

Grant said:
Using xrange as somebody else suggested is also insane.

Sorry about that, I somehow got the misguided notion that xrange defines
its own __contains__, so that it would be about the same speed as using
comparison operators directly. I figured the OP might have a better
reason for wanting to use range() than his post mentioned -- perhaps the
range to check was being passed from a function, and it would be easier
to pass an object than a tuple of lower and upper bound -- but since
xrange does looping for a membership test, my suggestion was indeed insane.
 
J

John Machin

John said:
it seems that range() can be really slow:
[...]

Some things to try:
1a. Read what the manual has to say about the range() function ... what
does it produce?

Indeed. Still, the addition of a __contains__ method to range (and
xrange) would permit acceptable performance for the code given. Perhaps
this is a convenience worth considering for future Python releases.

range() and xrange() are functions. You are suggesting that 2
*functions* should acquire a __contains__ method each? I trust not.

Perhaps you meant that the acquisitors should be the objects that those
functions return.

range() returns a list. Lists already have a __contains__ method. It's
been getting a fair old exercising in the last few hours, but here we go
again:
>python -mtimeit -s"x=range(30000)" "x.__contains__(40000)"
1000 loops, best of 3: 1.09 msec per loop
>python -mtimeit -s"x=range(30000)" "40000 in x"
1000 loops, best of 3: 1.09 msec per loop
>python -mtimeit -s"x=range(30000)" "x.__contains__(1)"
1000000 loops, best of 3: 0.434 usec per loop
>python -mtimeit -s"x=range(30000)" "1 in x"
1000000 loops, best of 3: 0.137 usec per loop

A __contains__ method for xrange objects? The case of x in xrange(lo,
hi) is met by lo <= x < hi. IMHO, anyone who wants x in xrange(lo, hi,
step) should be encouraged to do the necessary algebra themselves; I
wouldn't suggest that any core dev time be wasted on implementing such
folderol.

In any case, isn't the *whole* xrange gizmoid on GvR's I-wish-I-hadn't list?

Cheers,
John
 
P

Paul Boddie

John said:
range() and xrange() are functions. You are suggesting that 2
*functions* should acquire a __contains__ method each? I trust not.

Well, range is a function in the current implementation, although its
usage is similar to that one would get if it were a class, particularly
a subclass of list or one providing a list-style interface. With such a
class, you could provide a __contains__ method which could answer the
question of what the range contains based on the semantics guaranteed
by a range (in contrast to a normal list).
Perhaps you meant that the acquisitors should be the objects that those
functions return.

The whole point was to avoid expanding the range into a list.

[...]
A __contains__ method for xrange objects? The case of x in xrange(lo,
hi) is met by lo <= x < hi. IMHO, anyone who wants x in xrange(lo, hi,
step) should be encouraged to do the necessary algebra themselves; I
wouldn't suggest that any core dev time be wasted on implementing such
folderol.

Sure, you could always implement your own class to behave like an
existing range and to implement the efficient semantics proposed above.
In any case, isn't the *whole* xrange gizmoid on GvR's I-wish-I-hadn't list?

Yes, he wants range to return an iterator, just like xrange more or
less does now. Given that xrange objects support __getitem__, unlike a
lot of other iterators (and, of course, generators), adding
__contains__ wouldn't be much of a hardship. Certainly, compared to
other notational conveniences bounced around on the various development
lists, this one would probably provide an order of magnitude
improvement on the usual bang per buck development ratio.

Paul
 
D

Dan Bishop

Paul said:
Yes, he wants range to return an iterator, just like xrange more or
less does now. Given that xrange objects support __getitem__, unlike a
lot of other iterators (and, of course, generators), adding
__contains__ wouldn't be much of a hardship. Certainly, compared to
other notational conveniences bounced around on the various development
lists, this one would probably provide an order of magnitude
improvement on the usual bang per buck development ratio.

xrange already has __contains__. The problem is, it's implemented by a
highly-inefficient sequential search. Why not modify it to merely
check the bounds and (value - start) % step == 0?
 
G

Grant Edwards

I'm assuming you used Python's compound comparison as opposed to the
C-style of and'ing two comparisons together to emphasize the fact it is
god's chosen way of doing this ;-)

Exactly!
 
G

Grant Edwards

Pete doesn't like to be called god in public. ;-)

Interesting point. Does the phrase "for pete's sake as god
intended" equate pete with god?

It's possible that pete is not god and yet god's intentions are
in pete's best interest, so something could be "for pete's
sake" and "as god intended" without pete being god.

That said, "for pete's sake" is probably a just an cleaned up
version of "for god's sake", so I probably did call pete god.
 
G

Grant Edwards

Well, range is a function in the current implementation,
although its usage is similar to that one would get if it were
a class, particularly a subclass of list or one providing a
list-style interface. With such a class, you could provide a
__contains__ method which could answer the question of what
the range contains based on the semantics guaranteed by a
range (in contrast to a normal list).


The whole point was to avoid expanding the range into a list.

It's unclear what you're referring to as "the range".

Perhaps you're thinking of a slice? Somethign like

if (0:10000).contains(x):
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,731
Messages
2,569,432
Members
44,832
Latest member
GlennSmall

Latest Threads

Top