__lt__ slowing the "in" operator even if not called

E

Emanuele Aina

I have some code which does a lot of "in" on lists containing objects
with no __eq__ defined.

It all goes fast until I add the __lt__() method: then I have a
slowdown comparable to the one I get using the overridden __eq__, while
the __lt__ method is never called.

Someone can explain me why?


I have here a simple program which shows the behaviour:

*-------------------------------------------------------------------*

#!/usr/bin/env python

import timing

class State(object):
def __init__(self, value):
self.v = value

def generate(self):
return [self.__class__(self.v+1)]

class StateLT(State):
def __lt__ (self, other):
print 'Foo'
return self.v < other.v

class StateEQ(State):
def __eq__ (self, other):
return self.v == other.v


def test(state_cls):
print state_cls

timing.start()
initial = state_cls(0)
l = [initial]
for i in xrange(3000):

successors = l.generate()
successors = [s for s in successors if s not in l]

l += successors
timing.finish()
print timing.milli()

if __name__ == '__main__':
test(State)
print ' '
test(StateLT)
print ' '
test(StateEQ)

*-------------------------------------------------------------------*

On my machine the output is:

<class '__main__.State'>
406

<class '__main__.StateLT'>
4982

<class '__main__.StateEQ'>
5395

Here you can see that even with only the __lt__ method it goes 10x
slower, but __lt__ is never called as "Foo" is not printed.

I use the python 2.3.5-8 package on a debian unstable, but even the
python 2.4.3-5 package shows the same behaviour.
 
M

Maric Michaud

Le Mercredi 14 Juin 2006 22:57, Emanuele Aina a écrit :
Here you can see that even with only the __lt__ method it goes 10x
slower, but __lt__ is never called as "Foo" is not printed.
No, that is not what it shows. The only thing it shows is that in operator is
slow for lists, and as it iterates over the entire list untill it find an
element, it is much slower when it fails a lot.
Use dict in operator instead.

Here is a program to show this in details :

import timing

class State(object):
def __init__(self, value):
self.v = value

class StateEQ(State):
def __eq__ (self, other):
if isinstance(other, State) :
return self.v == other.v
else :
return self.v == other

class StateLT(State) :
def __lt__ (self, other):
if isinstance(other, State) :
return self.v < other.v
else :
return self.v < other

class StateLTEQ(StateEQ, StateLT) : pass

def test(state_cls):
num_elem = 3*10**4
print
print state_cls

l = [state_cls(0)]
for i in xrange(num_elem): l.append(state_cls(i+1))
print
print "in operator at the beginning of list:",
timing.start()
for i in xrange(300): i in l
timing.finish()
print timing.milli()
print "in operator at the end of list:",
timing.start()
for i in xrange(num_elem-300, num_elem): i in l
timing.finish()
print timing.milli()
print
print "converting to dict :",
timing.start()
d = {}.fromkeys(l)
timing.finish()
print timing.milli()
print "in operator for a dict for %s elements:" % (num_elem*2),
timing.start()
for i in xrange(num_elem, num_elem*2): i in d
timing.finish()
print timing.milli()

if __name__ == '__main__':
test(State)
test(StateLT)
test(StateEQ)
test(StateLTEQ)


for which the output is :

<class '__main__.State'>

in operator at the beginning of list: 1983
in operator at the end of list: 2011

converting to dict : 82
in operator for a dict for 60000 elements: 12

<class '__main__.StateLT'>

in operator at the beginning of list: 3866
in operator at the end of list: 3871

converting to dict : 332
in operator for a dict for 60000 elements: 50

<class '__main__.StateEQ'>

in operator at the beginning of list: 173
in operator at the end of list: 28249

converting to dict : 79
in operator for a dict for 60000 elements: 14

<class '__main__.StateLTEQ'>

in operator at the beginning of list: 202
in operator at the end of list: 30472

converting to dict : 68
in operator for a dict for 60000 elements: 50


Every classes that define the __eq__ operator will find quickly the elements
if they are at the beginning of the list.
If they are at the end, the in oprerator is slower in these classes because of
the overhead of function calls (in StateLt there is also an overhead due to
internal lookup, IMO).
Converting to dicts and looking for keys is *really* and equally fast in all
cases, it does not depend on success or failure.






--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
 
E

Emanuele Aina

Maric Michaud spiegò:
Le Mercredi 14 Juin 2006 22:57, Emanuele Aina a écrit :

No, that is not what it shows. The only thing it shows is that in operator is
slow for lists, and as it iterates over the entire list untill it find an
element, it is much slower when it fails a lot.

Yes, I now that "in" is slow for lists and that a dict is preferable,
but what I'm pointing out is that in one case is much slower without a
reason.

In my test, the one without __lt__ runs in ~500ms, while the one with
the __lt__ method runs in ~5000ms.

Just for comparision I've also tested with an overridden __eq__ and it
shows timings similar to the ones in the second test, but this time it
is expected, as "in" has to call __eq__ for every element in the list.

In the first test python uses the native equality (comparing memory
pointer) and it is fast because it is implemented in C in the
interpreter (no python function call).

In the last case python has to call __eq__ for every element, and a
python function call is obviuosly more expensive than the native
comparision.

But the __lt__ case leaves me wondering why it is slow as the __eq__
case, while using the native comparision.

What I asked is what is the difference between the first and the second
case?
Use dict in operator instead.

Yes, I know that a dict is the right datastructure for a fast "in", but
I need to keep track of the ordering, so I cannot use a dict.

Here is a program to show this in details :

import timing

class State(object):
def __init__(self, value):
self.v = value

class StateEQ(State):
def __eq__ (self, other):
if isinstance(other, State) :
return self.v == other.v
else :
return self.v == other

class StateLT(State) :
def __lt__ (self, other):
if isinstance(other, State) :
return self.v < other.v
else :
return self.v < other

class StateLTEQ(StateEQ, StateLT) : pass

def test(state_cls):
num_elem = 3*10**4
print
print state_cls

l = [state_cls(0)]
for i in xrange(num_elem): l.append(state_cls(i+1))
print
print "in operator at the beginning of list:",
timing.start()
for i in xrange(300): i in l
timing.finish()
print timing.milli()
print "in operator at the end of list:",
timing.start()
for i in xrange(num_elem-300, num_elem): i in l
timing.finish()
print timing.milli()

Sorry, but this works only when you override __eq__.

Without it it will alway scan the entire list, so you are comparing the
timings of two different things.
print
print "converting to dict :",
timing.start()
d = {}.fromkeys(l)
timing.finish()
print timing.milli()
print "in operator for a dict for %s elements:" % (num_elem*2),
timing.start()
for i in xrange(num_elem, num_elem*2): i in d
timing.finish()
print timing.milli()

As I said, that was only a test demo exploiting my observation, in my
real program I need to keep track of the order in which I add items in
the list, so I can't use a dict.
for which the output is :

<class '__main__.State'>

in operator at the beginning of list: 1983
in operator at the end of list: 2011

converting to dict : 82
in operator for a dict for 60000 elements: 12

<class '__main__.StateLT'>

in operator at the beginning of list: 3866
in operator at the end of list: 3871

Here python is using the equality comparator from "int", giving those
fast results.
I've substituted the ints with instances of state_cls:
- for i in xrange(300): i in l
+ for i in xrange(300): state_cls(i) in l

- for i in xrange(num_elem-300, num_elem): i in l
+ for i in xrange(num_elem-300, num_elem): state_cls(i) in l

Running these test, in the case of StateLT, now takes ~10000ms and
~9000ms.

[...]
Every classes that define the __eq__ operator will find quickly the elements
if they are at the beginning of the list.
If they are at the end, the in oprerator is slower in these classes because of
the overhead of function calls (in StateLt there is also an overhead due to
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
internal lookup, IMO).
^^^^^^^^^^^^^^^

Yes, that was my question! :)

But I hoped in a more exaustive answer: why python has to do this
lookup when the __lt__ method is not involved at all?
 
M

Maric Michaud

Le Jeudi 15 Juin 2006 14:14, Emanuele Aina a écrit :
But I hoped in a more exaustive answer: why python has to do this
lookup when the __lt__ method is not involved at all?

It is not the case, that's what my program shows :

<class '__main__.StateEQ'>

in operator at the beginning of list: 173
in operator at the end of list: 28249 <- here

converting to dict : 79
in operator for a dict for 60000 elements: 14

<class '__main__.StateLTEQ'>

in operator at the beginning of list: 202
in operator at the end of list: 30472 <- and here


--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
 
E

Emanuele Aina

Maric Michaud continuò:
It is not the case, that's what my program shows :

<class '__main__.StateEQ'>

in operator at the beginning of list: 173
in operator at the end of list: 28249 <- here

converting to dict : 79
in operator for a dict for 60000 elements: 14

<class '__main__.StateLTEQ'>

in operator at the beginning of list: 202
in operator at the end of list: 30472 <- and here

It is very obvious that these two have similiar timings, as both call
__eq__.

I asked why the State and StateLT don't give similar results, but
StateLT is noticeably slower.
 
C

Christophe

Emanuele Aina a écrit :
Maric Michaud continuò:




It is very obvious that these two have similiar timings, as both call
__eq__.

I asked why the State and StateLT don't give similar results, but
StateLT is noticeably slower.

Maybe because when python tries to compare 2 elements, checks if __lt__
is defined, and if it is, it checks for __gt__ ( or maybe __lteq__ ) but
since it isn't defined, it fallbacks on __cmp__. So for each comparison,
if you define only __lt__ you have some additional lookups to go through.
 
M

Maric Michaud

Le Jeudi 15 Juin 2006 14:14, Emanuele Aina a écrit :
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


  ^^^^^^^^^^^^^^^

Yes, that was my question! :)

But I hoped in a more exaustive answer: why python has to do this
lookup when the __lt__ method is not involved at all?

Ah, ok sorry, got your point, I didn't notice it in first time.
There is effectively an overhead when the class defines at less one of __lt__,
__gt__, __neq__ ... special attributes (comparison attributes), it seems in
my example that this overhead is a ratio of 1,5x.

It is related to comparison, as the following code show :


import timing

class State(object):
def __init__(self, value):
self.v = value

class StateFoo(State) :
def foo(self) : pass

class StateNew(State) :
def __nonzero__(self) : return None

class StateGT(State) :
def __gt__(self) : return None

class StateGTLT(StateGT) :
def __lt__(self) : return None

def test(cls) :
print cls
inst = cls(5)
timing.start()
for i in xrange(10**6) : i == inst
timing.finish()
print timing.milli()

test(State)
test(StateFoo)
test(StateNew)
test(StateGT)
test(StateGTLT)

for which ouput is :

<class '__main__.State'>
589
<class '__main__.StateFoo'>
596
<class '__main__.StateNew'>
615
<class '__main__.StateGT'>
815
<class '__main__.StateGTLT'>
836

Can't say more about what in the internals of python produce this non
negligeable overhead, but looking at the source code of builtin comparison
(for new style classes as it seems we have not this overhead with old style
classes) should be the fist step...

--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
 
A

andrewdalke

Emanuele said:
I have some code which does a lot of "in" on lists containing objects
with no __eq__ defined.

It all goes fast until I add the __lt__() method: then I have a
slowdown comparable to the one I get using the overridden __eq__, while
the __lt__ method is never called.

Someone can explain me why?

The list's __contains__ method is very simple

for (i = 0, cmp = 0 ; cmp == 0 && i < a->ob_size; ++i)
cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a,
i),
Py_EQ);

The PyObject_RichCompareBool becomes more complex. The
relevant code occurs after the check for object identity. The next
step is

res = PyObject_RichCompare(v, w, op);

which goes into your case

/* If the types are equal, and not old-style instances, try to
get out cheap (don't bother with coercions etc.). */
if (v->ob_type == w->ob_type && !PyInstance_Check(v)) {
cmpfunc fcmp;
richcmpfunc frich = RICHCOMPARE(v->ob_type);
/* If the type has richcmp, try it first.
try_rich_compare
tries it two-sided, which is not needed since we've
a
single type only. */
if (frich != NULL) {
res = (*frich)(v, w, op);
if (res != Py_NotImplemented)
goto Done;
Py_DECREF(res);
}
/* No richcmp, or this particular richmp not
implemented.
Try 3-way cmp. */
fcmp = v->ob_type->tp_compare;

One new-style class defines '__lt__' while the other does not.

Here's where things get shaky for me. I think new-style classes
are instances of PyInstanceObject (and not PyClassObject). Instances
use 'instance_richcompare' to implement the rich comparison
between two instances. This function does the lookup for '__eq__'
in the two classes.

The tp_richcompare slot is set to instance_richcompare when any
of __lt__, __gt__, __eq_, etc. are defined in a new type. The macro-
based code looks like

TPSLOT("__lt__", tp_richcompare, slot_tp_richcompare,
richcmp_lt,
"x.__lt__(y) <==> x<y"),
TPSLOT("__le__", tp_richcompare, slot_tp_richcompare,
richcmp_le,
"x.__le__(y) <==> x<=y"),
TPSLOT("__eq__", tp_richcompare, slot_tp_richcompare,
richcmp_eq,
"x.__eq__(y) <==> x==y"),
TPSLOT("__ne__", tp_richcompare, slot_tp_richcompare,
richcmp_ne,
"x.__ne__(y) <==> x!=y"),
TPSLOT("__gt__", tp_richcompare, slot_tp_richcompare,
richcmp_gt,
"x.__gt__(y) <==> x>y"),
TPSLOT("__ge__", tp_richcompare, slot_tp_richcompare,
richcmp_ge,
"x.__ge__(y) <==> x>=y"),

So if you define "__lt__" in your object then the type gets a richcmp
function and your == test implicit in the 'in' search always incurs the
cost of figuring out that "__eq__" is not defined.


Andrew
(e-mail address removed)
 
E

Emanuele Aina

(e-mail address removed) dettagliò:
Someone can explain me why?

The list's __contains__ method is very simple
[...]

So if you define "__lt__" in your object then the type gets a richcmp
function and your == test implicit in the 'in' search always incurs the
cost of figuring out that "__eq__" is not defined.

Thank you for the detailed explanation! :)

Do you think I should report this as a performance bug, maybe with the
'wishlist' priority, or I should accept the truth and hope for better
luck next time? ;)
 
S

Steven Bethard

Emanuele said:
(e-mail address removed) dettagliò:
Someone can explain me why?
The list's __contains__ method is very simple
[...]

So if you define "__lt__" in your object then the type gets a richcmp
function and your == test implicit in the 'in' search always incurs the
cost of figuring out that "__eq__" is not defined.

Thank you for the detailed explanation! :)

Do you think I should report this as a performance bug, maybe with the
'wishlist' priority, or I should accept the truth and hope for better
luck next time? ;)

It certainly wouldn't hurt to report it. But I suspect it's not ever
going to get "fixed". Classes with a __lt__ but no __eq__ really aren't
that common.

STeVe
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,039
Latest member
CasimiraVa

Latest Threads

Top