Python interpreter bug

A

alainpoint

Hello,

I came accross what i think is a serious bug in the python interpreter.

Membership testing seems not to work for list of objects when these
objects have a user-defined __cmp__ method.
It is present in Python 2.3 and 2.4. I don't know about other versions.
The following code illustrates the bug:
from random import choice
class OBJ:
def __init__(self,identifier):
self.id=identifier
self.allocated=0
def __cmp__(self,other):
return cmp(other.allocated,self.allocated)
mylist=[OBJ(i) for i in range(20)]
excluded=[obj for obj in mylist if obj.id>choice(range(20))]
for obj in mylist:
if obj in excluded:
assert obj.id in [objt.id for objt in excluded]
continue

Running the above snippet will trigger the assert. The culprit seems to
be the __cmp__ method which sorts on a key with constant value.
Best regards
Alain
 
S

Simon Percivall

Why would it be a bug? You've made it so that every instance of OBJ is
equal to every other instance of OBJ. The behaviour is as expected.
 
S

Steve Holden

Hello,

I came accross what i think is a serious bug in the python interpreter.

Membership testing seems not to work for list of objects when these
objects have a user-defined __cmp__ method.
It is present in Python 2.3 and 2.4. I don't know about other versions.
The following code illustrates the bug:
from random import choice
class OBJ:
def __init__(self,identifier):
self.id=identifier
self.allocated=0
def __cmp__(self,other):
return cmp(other.allocated,self.allocated)
mylist=[OBJ(i) for i in range(20)]
excluded=[obj for obj in mylist if obj.id>choice(range(20))]
for obj in mylist:
if obj in excluded:
assert obj.id in [objt.id for objt in excluded]
continue
I presume you just put the "continue" in there for fun?
.... print obj in excluded
....
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True
True

Running the above snippet will trigger the assert. The culprit seems to
be the __cmp__ method which sorts on a key with constant value.

Well indeed. As far as I can see your objects will all test equal. Did
you mean the __cmp__ method to return cmp(other.id, self.id)?

regards
Steve
 
A

alainpoint

There is definitely a bug.
Maybe the follownig snippet is more clear:
class OBJ:
def __init__(self,identifier):
self.id=identifier
self.allocated=0
#def __cmp__(self,other):
# return cmp(other.allocated,self.allocated)
mylist=[OBJ(i) for i in range(10)]
excluded=[obj for obj in mylist if obj.id in [2,4,6,8]]
exclusion_list_by_id=[2,4,6,8]
print 'exclusion list by id=',exclusion_list_by_id
for obj in mylist:
print 'current obj=',obj.id,
if obj in excluded:
print ' ---> THIS OBJECT IS EXCLUDED'
assert obj.id in exclusion_list_by_id
continue
print

If you uncomment the two lines, the assert will be erroneously
triggered.
Alain
 
F

Fredrik Lundh

I came accross what i think is a serious bug in the python interpreter.
Membership testing seems not to work for list of objects when these
objects have a user-defined __cmp__ method.

it does not work if they have *your* __cmp__ method, no. if you add
a print statement to the method, you'll figure out why:

def __cmp__(self,other):
print "CMP", other.allocated, self.allocated
return cmp(other.allocated,self.allocated)

</F>
 
A

alainpoint

Sorry Fredrik but I don't understand. Just comment out the assert and
you have different results depending on whether an unrelated sort
function is defined.
This seems weird to me !
 
S

Steve Holden

Sorry Fredrik but I don't understand. Just comment out the assert and
you have different results depending on whether an unrelated sort
function is defined.
This seems weird to me !
Perhaps you don't understand what's going on. The test

obj in excluded

is succeeding for all your objects because all instances of the OBJ
class compare equal, and so the assert is failing for the ones that
don;t actually appear in the "excluded" list.

regards
Steve
 
F

Fredrik Lundh

Sorry Fredrik but I don't understand. Just comment out the assert and
you have different results depending on whether an unrelated sort
function is defined.

This seems weird to me !

not if you look at what it prints.

(if it seems weird to you that 0 equals 0, it's time for a break)

</F>
 
A

alainpoint

I understand this, Steve.
I thought the _cmp_ method was a helper for sorting purposes. Why is it
that a membership test needs to call the __cmp__ method?
If this isn't a bug, it is at least unexpected in my eyes.
Maybe a candidate for inclusion in the FAQ?
Thank you for answering
Alain
 
W

wittempj

Your __cmp__ method will always return 0, so all objects will be equal
when you add the method, as Simon and Steve pointed out. The result is
all objects will pass the test of being a member of excluded.
If you do not add a __cmp__ method objects will be compared on identy -
call the id() function to see the identity of an object. An alternative
would be Steve's proposal to compare on the id attribute instead of the
allocated attribute
 
W

wittempj

Your __cmp__ method will always return 0, so all objects will be equal
when you add the method, as Simon and Steve pointed out. The result is
all objects will pass the test of being a member of excluded.
If you do not add a __cmp__ method objects will be compared on identy -
call the id() function to see the identity of an object. An alternative
would be Steve's proposal to compare on the id attribute instead of the
allocated attribute
 
F

Fredrik Lundh

Why is it that a membership test needs to call the __cmp__ method?

because the membership test has to check if the tested item is a member
of the sequence. if it doesn't do that, it's hardly qualifies as a membership
test. from the reference manual:

For the list and tuple types, x in y is true if and only if there exists an
index i such that x == y is true.

</F>
 
A

alainpoint

In fact, i want to sort the list based on the 'allocated attribute' and
at the same time, test membership based on the id attribute.
__cmp__ logically implies an ordering test, not an identity test. These
two notions seems to be confounded in python which is unfortunate. Two
objects could have the same rank while still being different.
Alain
 
B

brianmce

For this, you can also define the __eq__ method, which will be
preferred to __cmp__ for equallity tests while still using __cmp__ for
searching / comparisons.
 
R

Robert Kern

In fact, i want to sort the list based on the 'allocated attribute' and
at the same time, test membership based on the id attribute.
__cmp__ logically implies an ordering test, not an identity test. These
two notions seems to be confounded in python which is unfortunate. Two
objects could have the same rank while still being different.

In that case, define __lt__ and __eq__ separately instead of __cmp__.
list.sort() will use __lt__ if it's provided rather than __cmp__.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
C

Carsten Haese

In fact, i want to sort the list based on the 'allocated attribute' and
at the same time, test membership based on the id attribute.
__cmp__ logically implies an ordering test, not an identity test. These
two notions seems to be confounded in python which is unfortunate. Two
objects could have the same rank while still being different.
Alain

You could use the id in __cmp__ as a tie-breaker if the two objects have
equal "allocated" values.

By the way, __cmp__ is not an identity test. The "in" operator doesn't
use object identity, it uses object equality, and __cmp__ does serve to
test object equality (if no __eq__ method is present).

Perhaps the following example will clarify the behavior of "in":
A = [1]
B = [1]
A==B True
A is B False
A in ["spam", 42, B]
True

HTH,

Carsten Haese.
 
B

bruno modulix

Sorry Fredrik but I don't understand. Just comment out the assert and
you have different results depending on whether an unrelated sort
function is defined.
This seems weird to me !

code snippet:
from random import choice
class OBJ:
def __init__(self,identifier):
self.id=identifier
self.allocated=0 # Your problem begins here...
def __cmp__(self,other):
# it continues here
return cmp(other.allocated,self.allocated)
mylist=[OBJ(i) for i in range(20)]
excluded=[obj for obj in mylist if obj.id>choice(range(20))]
for obj in mylist:
if obj in excluded: # and you see it in action here
assert obj.id in [objt.id for objt in excluded]
continue

How do you think the "membership" test works ? Could it be possible that
it uses the __cmp__ method ? And if so, how do you think your instances
will compare knowing that your __cmp__ method will return equality for
all the instances in this snippet ?

Just change your __init__ method to:

def __init__(self,identifier):
self.id=identifier
self.allocated=identifier

and rerun the test.

I don't think this is a bug...
 
F

Fredrik Lundh

In fact, i want to sort the list based on the 'allocated attribute' and
at the same time, test membership based on the id attribute.
__cmp__ logically implies an ordering test

really?

http://dictionary.reference.com/search?q=compare

com·pare
v. com·pared, com·par·ing, com·pares
v. tr.
1. To consider or describe as similar, equal, or analogous; liken.
2. To examine in order to note the similarities or differences of.
/.../
not an identity test.

it's not an identity test. it's a comparision, and the "in" operator
uses it to determine *equality*, not identity. you need to read
up on object comparisions in the reference manual.

</F>
 
A

alainpoint

No doubt you're right but common sense dictates that membership testing
would test identity not equality.
This is one of the rare occasions where Python defeats my common sense
;-(

Alain
 
C

Christopher Subich

No doubt you're right but common sense dictates that membership testing
would test identity not equality.
This is one of the rare occasions where Python defeats my common sense

But object identity is almost always a fairly ill-defined concept.
Consider this (Python 2.2, 'cuz that's what I have right now):

Python 2.2.3 (#1, Nov 12 2004, 13:02:04)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.1 # True on Py2.4


The only reliable thing that object identity tells you is "these two
foos occupy the same memory location." Even for identical, immutable
objects, this may not be true (see case above) -- some immutables do end
up occupying the same memory location (try i=1;j=2;k=j-1;i is k), but
this is by no means guraranteed, and indeed only happens sometimes
because of optimizations.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,772
Messages
2,569,593
Members
45,112
Latest member
VinayKumar Nevatia
Top