Would there be support for a more general cmp/__cmp__

A

Antoon Pardon

I was wondering how people would feel if the cmp function and
the __cmp__ method would be a bit more generalised.

The problem now is that the cmp protocol has no way to
indicate two objects are incomparable, they are not
equal but neither is one less or greater than the other.

So I thought that either cmp could return None in this
case or throw a specific exception. People writing a
__cmp__ method could do the same.
 
D

Duncan Booth

Antoon said:
The problem now is that the cmp protocol has no way to
indicate two objects are incomparable, they are not
equal but neither is one less or greater than the other.

If that is the case then you implement __eq__ and __ne__ to return
True/False and make cmp throw an exception. I don't see any need to extend
the cmp protocol.
 
A

Antoon Pardon

Op 2005-10-20 said:
If that is the case then you implement __eq__ and __ne__ to return
True/False and make cmp throw an exception. I don't see any need to extend
the cmp protocol.

That is not sufficient. There are domains where an order exists, but not
a total order. So for some couples of elements it is possible to say
one is less than the other, but not for all such couples.

Your solution wouldn't allow for such a case.
 
D

Duncan Booth

Antoon said:
That is not sufficient. There are domains where an order exists, but
not a total order. So for some couples of elements it is possible to
say one is less than the other, but not for all such couples.

Your solution wouldn't allow for such a case.

I don't see why not. If an ordering exists then __cmp__ returns a result
otherwise it throws an exception. So long as you define __eq__ and __ne__,
__cmp__ is never called when checking for equality.
 
M

Mikael Olofsson

Antoon said:
That is not sufficient. There are domains where an order exists, but not
a total order. So for some couples of elements it is possible to say
one is less than the other, but not for all such couples.

Your solution wouldn't allow for such a case.

Wouldn't the full power of rich comparisons be enough? See section 3.3.1
in Python Reference Manual:

http://www.python.org/doc/2.4.2/ref/customization.html

See methods __lt__, __le__, __eq__, __ne__, __gt__, __ge__.

/MiO
 
A

Antoon Pardon

Op 2005-10-20 said:
Wouldn't the full power of rich comparisons be enough? See section 3.3.1
in Python Reference Manual:

http://www.python.org/doc/2.4.2/ref/customization.html

See methods __lt__, __le__, __eq__, __ne__, __gt__, __ge__.

Sure, but it would also be rather cumbersome. It is not uncommon
that a general compare function which could give you one of
four results: one_less_two, one_equal_two, one_greater_two or
one_unequal_two, wouldn't differ much from each of those six
comparison methods. So you either write about six times the
same code or you have to do something like the following:

class partial_order:

def __compare(self, other):
...
return ...

def __lt__(self, other):
return self.__compare(other) == one_less_two

def __le__(self, other):
return self.__compare(other) in [one_less_two, one_equal_two]

def __eq__(self, other):
return self.__compare(other) == one_equal_two

def __ne__(self, other):
return self.__compare(other) == one_unequal_two

def __gt__(self, other):
return self.__compare(other) == one_greater_two

def __ge__(self, other):
return self.__compare(other) in [one_greater_two, one_equale_two]


And then you still wouldn't get a usefull result from:

delta = cmp(e1, e2)
 
S

Steve Holden

Antoon said:
I was wondering how people would feel if the cmp function and
the __cmp__ method would be a bit more generalised.

The problem now is that the cmp protocol has no way to
indicate two objects are incomparable, they are not
equal but neither is one less or greater than the other.

So I thought that either cmp could return None in this
case or throw a specific exception. People writing a
__cmp__ method could do the same.
The current behaviour is, of course, by design: """The operators <, >,
==, >=, <=, and != compare the values of two objects. The objects need
not have the same type. If both are numbers, they are converted to a
common type. Otherwise, objects of different types always compare
unequal, and are ordered consistently but arbitrarily."""

Personally I'm still not convinced that your requirement reflects a
substantial use case (but then I'm getting used to that ;-). Just
because an ordering is partial that doesn't mean that two instances of a
class shouldn't be compared.

What would you have Python do when the programmer tries to perform an
invalid comparison (i.e. what are the exact semantics imposed when
__cmp__() returns None/raises an exception)?

regards
Steve
 
A

Antoon Pardon

Op 2005-10-20 said:
The current behaviour is, of course, by design: """The operators <, >,
==, >=, <=, and != compare the values of two objects. The objects need
not have the same type. If both are numbers, they are converted to a
common type. Otherwise, objects of different types always compare
unequal, and are ordered consistently but arbitrarily."""

Personally I'm still not convinced that your requirement reflects a
substantial use case (but then I'm getting used to that ;-). Just
because an ordering is partial that doesn't mean that two instances of a
class shouldn't be compared.

I'm not suggesting that they shouldn't be compared.
What would you have Python do when the programmer tries to perform an
invalid comparison (i.e. what are the exact semantics imposed when
__cmp__() returns None/raises an exception)?

My appologies, I should have been more complete.

What I want is a way to say that all of the following are False:

a < b, a <= b, a == b, a > b, a >= b

and that only the following is True:

a != b


So if a coder writes one of the comparisons and as a result python
calls the __cmp__ method on one of the terms which would return
either None or raise an exceptions I would like it to reflect
the above behaviour.
 
T

Toby Dickenson

Personally I'm still not convinced that your requirement reflects a
substantial use case (but then I'm getting used to that ;-). Just
because an ordering is partial that doesn't mean that two instances of a
class shouldn't be compared.

C++ has a cmp template function which can be implemented to define a total
ordering for one type. This can be reliably used by implementations of
ordered tree containers (for example) so that this container template class
can only be instantiated for holding types that provide this comparison
template function.

One disadvantage is that these template container classes can only hold one
type.

ZODB's BTrees work in a similar way but use the regular python comparison
function, and the lack of a guarantee of a total ordering can be a liability.
Described here in 2002, but I think same is true today:
http://mail.zope.org/pipermail/zodb-dev/2002-February/002304.html

A BTree might contain two objects that are incomparable. That is, they raise
an exception when compared. However the user will not know this if by
coincidence the BTree implementation has never needed to compare these two
objects.

Now removing a third object from the container such that these two
incomparable objects are adjacent in the BTree. There is a possibility that
an exception might be raised when *reading* content from the BTree, since the
BTree implementation now might need to compare this pair of objects.
What would you have Python do when the programmer tries to perform an
invalid comparison (i.e. what are the exact semantics imposed when
__cmp__() returns None/raises an exception)?

Its too late to handle this by the time a specific comparison method of an
individual object is being called. If you need a total ordering across a
domain of objects then you need to involve some representation of that domain
as a whole.
 
S

Steve Holden

Antoon said:
I'm not suggesting that they shouldn't be compared.
Sorry, I thought that was the meaning of "The problem now is that the
cmp protocol has no way to indicate two objects are incomparable".
My appologies, I should have been more complete.

What I want is a way to say that all of the following are False:

a < b, a <= b, a == b, a > b, a >= b

and that only the following is True:

a != b


So if a coder writes one of the comparisons and as a result python
calls the __cmp__ method on one of the terms which would return
either None or raise an exceptions I would like it to reflect
the above behaviour.
Ergo: use rich comparisons.

regards
Steve
 
T

Tim Peters

[Toby Dickenson]
...
ZODB's BTrees work in a similar way but use the regular python comparison
function, and the lack of a guarantee of a total ordering can be a liability.
Described here in 2002, but I think same is true today:
http://mail.zope.org/pipermail/zodb-dev/2002-February/002304.html

There's a long discussion of this in the ZODB Programming Guide,
subsection "5.3.1 Total Ordering and Persistence" on this page:

http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html

That talks about more than the referenced ZODB thread covered.

Persistence adds more twists, such as that the total ordering among
keys must remain the same across time (including across Python
releases; for example, how None compares to objects of other types has
changed across releases).
 
A

Antoon Pardon

Op 2005-10-20 said:
Ergo: use rich comparisons.

rich comparosons can only solve the problem partly.

Python does have the cmp function and that can give
totaly inconsistent results even when the order
defined by the rich comparison functions is consistent
but only partial.

I think it is a problem if cmp gives me the following
results.
-1

It would be better if cmp would give an indication it
can't compare two objects instead of giving incorrect
and inconsistent results.
 
C

Christopher Subich

Antoon said:
It would be better if cmp would give an indication it
can't compare two objects instead of giving incorrect
and inconsistent results.

If two objects aren't totally comparable, then using 'cmp' on them is
ill-defined to begin with. The Standard Thing To Do is throw an
exception; see the Highly Obscure Case of the Complex Numbers.
Traceback (most recent call last):
File "<stdin>", line 1, in ?
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

So using the well-known case of complex numbers, the semantics are
already well-defined.
.... def __cmp__(self,other):
.... raise TypeError("cannot compare Incomparables using <,
<=, >, >=")
.... def __eq__(self,other):
.... return self is other
.... def __neq__(self,other):
.... return self is not otherTraceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 3, in __cmp__
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 3, in __cmp__
TypeError: cannot compare Incomparables using <, <=, >, >=


So your use-case is already well-defined, and rather simple. Make
__cmp__ raise an exception of your choice, and define rich comparators
only for the comparisons that are supported. If, as you say in another
post, "some" pairs in D cross D are comparable on an operator but not
all of them (and further that this graph is not transitive), then your
_ONLY_ choices, no matter your implementation, is to return some
possibly inconsistent result (a < b == 1, b < c == 1, a < c == 0) or
raise an exception for unapplicable comparisons.

This isn't a Python problem; this is a problem from formal mathematics.

Personally, I'd favor the "raise an exception" case, which is exactly
what will happen if you define rich comparisons and let cmp throw an
exception. Operators that assume comparable objects, like sort, also
almost always assume a total order -- inconsistent operations can give
weird results, while throwing an exception will at least (usually) give
some sort of error that can be caught.

Another poster mentioned the B-tree example, and that isn't solvable in
this case -- B-trees assume a total order, but by their nature aren't
explicit about checking for it; inserting a "partial plus exception"
order might result in tree failure at weird times. An inconsistent
order, however, is even worse -- it would corrupt the tree at the same
times.
 
A

Antoon Pardon

Op 2005-10-21 said:
If two objects aren't totally comparable, then using 'cmp' on them is
ill-defined to begin with.

Where is that documented?
The Standard Thing To Do is throw an
exception; see the Highly Obscure Case of the Complex Numbers.

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

I would say this is the wrong answer. The right answer should
be False IMO.

Especially in the following case do I think the TypeError is
the wrong answer:
Traceback (most recent call last):
File "<stdin>", line 1, in ?

Look at sets:
set([1]) <= set([2]) False
set([2]) <= set([1]) False
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

So using the well-known case of complex numbers, the semantics are
already well-defined.

I think one can argue against this, but I'm willing to leave this
as it is for complex numbers. But I think the situation with sets
should be trated differently than currently.
... def __cmp__(self,other):
... raise TypeError("cannot compare Incomparables using <,
<=, >, >=")
... def __eq__(self,other):
... return self is other
... def __neq__(self,other):
... return self is not other
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 3, in __cmp__

Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 3, in __cmp__
TypeError: cannot compare Incomparables using <, <=, >, >=


So your use-case is already well-defined, and rather simple. Make
__cmp__ raise an exception of your choice,

That is not well defined. It doesn't allow to distinghuish between
something went wrong in __cmp__ and and answer that indicates the
only usefull thing you can say is that they are unequal.

Besides I find the following result inconsistent:
set([1]) <= set([1,2]) True
cmp(set([1]), set([1,2]))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare sets using cmp()

The first set is smaller than the second so acoording to the
documentation, I should get a negative number as a result.
and define rich comparators
only for the comparisons that are supported. If, as you say in another
post, "some" pairs in D cross D are comparable on an operator but not
all of them (and further that this graph is not transitive),

I was talking about partial orders, partial orders are transitive
and python still doesn't handle them right.
then your
_ONLY_ choices, no matter your implementation, is to return some
possibly inconsistent result (a < b == 1, b < c == 1, a < c == 0) or
raise an exception for unapplicable comparisons.

This isn't a Python problem; this is a problem from formal mathematics.

I think it is python's problem if python gives the wrong results.

when a < b then cmp(a,b) should give a negative number as a result.
Python sometimes does not!

when not a < b then cmp(a,b) should not give a negative number as
a result. Python sometimes does!

I think those are python problems.
Personally, I'd favor the "raise an exception" case, which is exactly
what will happen if you define rich comparisons and let cmp throw an
exception.

But I don't want cmp to raise an exception when the two elements are
comparable.

That cmp(set([1]), set([2])) would raise an exception, I can understand,
although I would prefer it returned None, but why should
cmp(set([1]), set([1,2])) raise an exception. There is a perfectly
well defined answer for this.
Operators that assume comparable objects, like sort, also
almost always assume a total order -- inconsistent operations can give
weird results, while throwing an exception will at least (usually) give
some sort of error that can be caught.

No it wont. the sort method will happily sort a list of sets.
lst [set([1]), set([2])]
lst.sort()

I agree that it would have been better if an exception had been in
raised in the sort method here.
Another poster mentioned the B-tree example, and that isn't solvable in
this case -- B-trees assume a total order, but by their nature aren't
explicit about checking for it; inserting a "partial plus exception"
order might result in tree failure at weird times. An inconsistent
order, however, is even worse -- it would corrupt the tree at the same
times.

Yes, and it is almost unavoidable now in python. There is no
documentation about handling these cases. So the most likely
scenario is that the person writing the partial ordered class
will use rich comparisons and leave it at that. In that case
using such a class in a tree will certainly result in a corrupt
tree. But even if __cmp__ raises an exceptions at the appropiate
time, that is no good if the implementation of the tree won't
be affected by it because it uses the comparasons operators instead
of the cmp function.

That is why I would suggest the following:

Define a new exception: UnequalException.

If two elements are just unequal whithout one being smaller
or greater than the other, this exception should be raised
by cmp and programmers can do so in __cmp__.

If __cmp__ is used to solve one of the comparisons, then
this exception should result in 'False' except for !=

If the rich comparisons are used to give a result for cmp(a,b)
then the following should happen.

test if a <= b

if so test a == b to decide between a negative number
and zero

if not test a > b to decide between a positive number
and raising an exception.

This would allow for people implementing tree's or sort
algorithms to detect when the data used is not appropiate
and throw an exception instead of giving meaningless results
or corrupt data structures.
 
S

Steve Holden

Antoon said:
Where is that documented?




I would say this is the wrong answer. The right answer should
be False IMO.
Well in that case it's time to declare Principia Mathematica obsolete.
The real numbers are a two-dimensional field, and you are proposing to
define an ordering for it.

This is a snail that really shoudn't be salted at all.
Especially in the following case do I think the TypeError is
the wrong answer:



Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=
Here you seem to be asking for special-case code to detect complex
numbers that are on the real line. Mathematically speaking this is an
infintesimal subset of the complex plane, so you are expecting a lot.
And presumably next you would argue that "since (3+0j) < (2+0j) returns
False so should (3+1j) < (2+2j)".

I think you should find something more productive to do with our time.
And mine ;-)
Look at sets:

set([1]) <= set([2])
False
set([2]) <= set([1])

False
Set orderingd are explicitly documented as being based on proper
subsetting. This is an abuse of the operators to make subset tests more
convenient rather than a definition of an ordering.[...]
The rest of your post does highlight one or two inconsistencies, but I
don't frankly see yor proposed solutions making anything better.

regards
Steve
 
A

Antoon Pardon

Op 2005-10-24 said:
Antoon said:
set([1]) <= set([2])
False

set([2]) <= set([1])

False
Set orderingd are explicitly documented as being based on proper
subsetting. This is an abuse of the operators to make subset tests more
convenient rather than a definition of an ordering.

It *is* a definition of an ordering.

For something to be an ordering it has to be anti symmetric and transitive.

The subset relationships on sets conform to these conditions so it is a (partial)
ordering. Check your mathematic books, Why you would think this is abuse is beyond me.
 
S

Steven D'Aprano

Where is that documented?

Surely that is so obvious, it doesn't need documenting? If comparisons
aren't defined between two things ("which is longer, water or fire?") then
comparing them is ill-defined.

On second thoughts... no, I don't believe it is so obvious. People are
used to thinking about comparisons in the context of numbers and strings,
and it won't hurt to remind them that not every pair of objects can be
compared.
I would say this is the wrong answer. The right answer should
be False IMO.

I'm afraid that mathematically you are wrong. Greater and less than is not
defined for complex numbers.

I think I see where you are coming from: you are reading "1 < 1j" as "is
one less than one*j", the answer to which is "no". That's a sensible
answer to a sensible question, except for two problems:

(1) you will upset the mathematical purists, who argue that the English
sentence is not quite the same as the mathematical relation; and

(2) you will upset the programmers, since your suggestion would create a
whole class of gotchas and bugs, where people wrongly assume that since
1<1j is False, 1>1j should be True.

This is one case where practicality *is* purity.

Especially in the following case do I think the TypeError is the wrong
answer:

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

That's an interesting case. A mathematical purist might argue that
ordering is not defined for complex numbers, even if the imaginary
component is zero. A *different* purist might argue:

3 = 3+0j
2 = 2+0j

then since 3 > 2 it should also be true that 3+0j > 2+0j.

I can see merit in both arguments, but I'm inclined to go with the second
argument, if and only if the imaginary component is exactly zero. I take
comfort in the fact that mathematicians allow continued fractions with
zero terms even though dividing by zero is verboten.

Look at sets:
set([1]) <= set([2]) False
set([2]) <= set([1])
False

No, sorry, you are making a mistake. In the context of sets, the <
operator doesn't mean "less than", it means "is a subset of". It is
correct to say neither "set([1]) is a subset of set([2])" nor vice versa
is true. Both are false.

By analogy, one can ask, "is the cat inside the box?" and get the answer
"No", but this does not imply that therefore the box must be inside the
cat.


[snip]
That is not well defined. It doesn't allow to distinghuish between
something went wrong in __cmp__ and and answer that indicates the only
usefull thing you can say is that they are unequal.

Agreed there: there should be a distinct exception for trying to compare
incomparable objects, rather than ValueError or TypeError.

Besides I find the following result inconsistent:
set([1]) <= set([1,2]) True
cmp(set([1]), set([1,2]))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare sets using cmp()

The first set is smaller than the second so acoording to the
documentation, I should get a negative number as a result.

No, the first expression tests whether set([1]) is a subset of set([1,2]),
not whether it is "smaller". Smaller and bigger aren't defined for sets.
Informally, we can say that one set is smaller than another if it has
fewer members, but that's only one possible definition of "smaller" out of
many.

Think, for example, of the set of all US military forces. That has one
member. Compare it to the set of all US presidents whose surname is or was
Bush. That set has two members. Do you think it is sensible definition of
"smaller" to say that the set of US military forces is smaller than the
set of Presidents called Bush?


[snip]
I think it is python's problem if python gives the wrong results.

when a < b then cmp(a,b) should give a negative number as a result.
Python sometimes does not!

when not a < b then cmp(a,b) should not give a negative number as a
result. Python sometimes does!

I think those are python problems.

I think you need to give examples of where < or > *defined as comparisons*
returns the wrong results. The examples from sets do not count, because
the operators aren't being used for comparisons.

Personally, I'd favor the "raise an exception" case, which is exactly
what will happen if you define rich comparisons and let cmp throw an
exception.

But I don't want cmp to raise an exception when the two elements are
comparable.

That cmp(set([1]), set([2])) would raise an exception, I can understand,
although I would prefer it returned None, but why should cmp(set([1]),
set([1,2])) raise an exception. There is a perfectly well defined answer
for this.

Alas, there is not. First you have to define what "less than" and "greater
than" means for sets, and there is no well-defined answer to that. They
certainly don't mean "is a subset" and "is a superset". Nor is it sensible
to define them in terms of the number of elements:

def lt(set1, set2):
"""Defines less than for sets in terms of the number of
elements."""
return len(set1) < len(set2)

def gt(set1, set2):
"""Defines greater than for sets in terms of the number of
elements."""
return len(set1) > len(set2)

py> lt( set([1]), set([1, 2]) )
True
py> gt( set([1]), set([1, 2]) )
False

So far so good. But now:

py> lt( set([1]), set([2]) )
False
py> gt( set([1]), set([2]) )
False
py> set([1]) == set([2])
False

So set([1]) is neither less than, nor greater than, nor equal to set([2]).

Operators that assume comparable objects, like sort, also almost always
assume a total order -- inconsistent operations can give weird results,
while throwing an exception will at least (usually) give some sort of
error that can be caught.

No it wont. the sort method will happily sort a list of sets.
lst [set([1]), set([2])]
lst.sort()

That's an accident of implementation. Sorting nows uses only <, rather
than __cmp__, and since < (subset) for sets is defined, sort blindly goes
and sorts the list.

However, look at this:

py> from sets import Set as set
py> L1 = [set([1]), set([2]), set([3])]
py> L2 = [set([2]), set([3]), set([1])]

Notice that L1 and L2 contain the same elements, just in different orders.
Sorting those two lists should give the same order, correct?

py> L1.sort()
py> L2.sort()
py> L1
[Set([1]), Set([2]), Set([3])]
py> L2
[Set([2]), Set([3]), Set([1])]

Should, but doesn't. Oops. That's a bug.

Personally, I argue that sorting is something that you do to lists, and
that all lists should be sortable regardless of whether the objects within
them have a total order or not. If not, the list should impose a sensible
ordering on them, such as (perhaps) lexicographical order ("dictionary
order").

Look at it this way: books do not have a total ordering. What does it mean
to say that Tolstoy's "War and Peace" is less than Tom Clancy's "The Hunt
For Red October"? Are we comparing by title? Author? Contents? Cost of the
book, dimensions of the book, number of pages, some subjective idea of
quality, average word length, maximum sentence length, publication date,
number of protagonists, or what?

And yet libraries somehow find no difficulty in sorting shelves of
hundreds or thousands of books.

So in my opinion, sorting is a red-herring. In an ideal world (Python 3
perhaps?) sorting will be divorced from comparisons, or at least seeing
other people. But that still leaves the problem of comparisons.
 
A

Antoon Pardon

Op 2005-10-24 said:
Surely that is so obvious, it doesn't need documenting? If comparisons
aren't defined between two things ("which is longer, water or fire?") then
comparing them is ill-defined.

On second thoughts... no, I don't believe it is so obvious. People are
used to thinking about comparisons in the context of numbers and strings,
and it won't hurt to remind them that not every pair of objects can be
compared.

I also think there is the problem that people aren't used to partial
ordering. There is an ordering over sets, it is just not a total
ordering. But that some pairs are uncomparable (meaning that neither
one is smaller or greater) doesn't imply that comparing them is
ill defined.
I'm afraid that mathematically you are wrong. Greater and less than is not
defined for complex numbers.

I think I see where you are coming from: you are reading "1 < 1j" as "is
one less than one*j", the answer to which is "no". That's a sensible
answer to a sensible question, except for two problems:

(1) you will upset the mathematical purists, who argue that the English
sentence is not quite the same as the mathematical relation; and

(2) you will upset the programmers, since your suggestion would create a
whole class of gotchas and bugs, where people wrongly assume that since
1<1j is False, 1>1j should be True.

This is one case where practicality *is* purity.

Well it is a wrong assumption is general. There is nothing impure about
partial ordering.
Look at sets:
set([1]) <= set([2]) False
set([2]) <= set([1])
False

No, sorry, you are making a mistake. In the context of sets, the <
operator doesn't mean "less than", it means "is a subset of". It is
correct to say neither "set([1]) is a subset of set([2])" nor vice versa
is true. Both are false.

That is IMO irrelevant. The subset relationship is an ordering and as
such has all characteristics of other orderings like "less than",
except that the subset relationship is partial and the less than
relationship is total. How it is called "subset" vs "less than" is
IMO irrelevant. It is about mathematical characteristics.
By analogy, one can ask, "is the cat inside the box?" and get the answer
"No", but this does not imply that therefore the box must be inside the
cat.

Bad analogy, this doesn't define a mathematical ordering, the subset
relationship does.
set([1]) <= set([1,2]) True
cmp(set([1]), set([1,2]))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare sets using cmp()

The first set is smaller than the second so acoording to the
documentation, I should get a negative number as a result.

No, the first expression tests whether set([1]) is a subset of set([1,2]),
not whether it is "smaller".

This is quibling with words. Both test whether one comes before the
other is a certain order.
Smaller and bigger aren't defined for sets.

It is not about smaller or bigger, it is about the general concept
of (mathematical) order. Whether you call this order, subset or
less than is not important.
I think you need to give examples of where < or > *defined as comparisons*
returns the wrong results. The examples from sets do not count, because
the operators aren't being used for comparisons.

IMO they count, because the subset relationship is a mathematical order
relation just as less than is one.
Personally, I'd favor the "raise an exception" case, which is exactly
what will happen if you define rich comparisons and let cmp throw an
exception.

But I don't want cmp to raise an exception when the two elements are
comparable.

That cmp(set([1]), set([2])) would raise an exception, I can understand,
although I would prefer it returned None, but why should cmp(set([1]),
set([1,2])) raise an exception. There is a perfectly well defined answer
for this.

Alas, there is not. First you have to define what "less than" and "greater
than" means for sets, and there is no well-defined answer to that. They
certainly don't mean "is a subset" and "is a superset". Nor is it sensible
to define them in terms of the number of elements:

Look I have to define this for whatever class of objects I want to use
anyway. On what grounds do you argue that using partial orderings
shouldn't work.

Wether you want to call being a subset "less than" or not is not the
issue. If I need it for my program I can certainly define a set class
where "less than" will be like the subset relationship or I can define
other classes where an ordering exists but not a total one.

IMO there is nothing wrong with using the order symbols python already
has, to use ordering on other objects..
def lt(set1, set2):
"""Defines less than for sets in terms of the number of
elements."""
return len(set1) < len(set2)

def gt(set1, set2):
"""Defines greater than for sets in terms of the number of
elements."""
return len(set1) > len(set2)

py> lt( set([1]), set([1, 2]) )
True
py> gt( set([1]), set([1, 2]) )
False

So far so good. But now:

py> lt( set([1]), set([2]) )
False
py> gt( set([1]), set([2]) )
False
py> set([1]) == set([2])
False

So set([1]) is neither less than, nor greater than, nor equal to set([2]).

So? It seems you never have encounted a partial ordering before. What
you have illustrated here is not special among (partial) orderings.
And the world is full of partial orderings. I think it is wrong to
shoehorn everything in a total ordering and that is why I think
python should give programmers a way in handling partial orderings.
Operators that assume comparable objects, like sort, also almost always
assume a total order -- inconsistent operations can give weird results,
while throwing an exception will at least (usually) give some sort of
error that can be caught.

No it wont. the sort method will happily sort a list of sets.
lst [set([1]), set([2])]
lst.sort()

That's an accident of implementation. Sorting nows uses only <, rather
than __cmp__, and since < (subset) for sets is defined, sort blindly goes
and sorts the list.

However, look at this:

py> from sets import Set as set
py> L1 = [set([1]), set([2]), set([3])]
py> L2 = [set([2]), set([3]), set([1])]

Notice that L1 and L2 contain the same elements, just in different orders.
Sorting those two lists should give the same order, correct?

No. Since we don't have a total ordering, sorting doesn't has to give
the same result. For as far as sorting is defined on this kind of
object, the only thing I would expect is that after a sort the
following condition holds.

for all i said:
py> L1.sort()
py> L2.sort()
py> L1
[Set([1]), Set([2]), Set([3])]
py> L2
[Set([2]), Set([3]), Set([1])]

Should, but doesn't. Oops. That's a bug.

Personally, I argue that sorting is something that you do to lists, and
that all lists should be sortable regardless of whether the objects within
them have a total order or not. If not, the list should impose a sensible
ordering on them, such as (perhaps) lexicographical order ("dictionary
order").

Look at it this way: books do not have a total ordering. What does it mean
to say that Tolstoy's "War and Peace" is less than Tom Clancy's "The Hunt
For Red October"? Are we comparing by title? Author? Contents? Cost of the
book, dimensions of the book, number of pages, some subjective idea of
quality, average word length, maximum sentence length, publication date,
number of protagonists, or what?

And yet libraries somehow find no difficulty in sorting shelves of
hundreds or thousands of books.

But will all libraries sort the shelves in the same way. As far as I
understand a book gets somekind of code in a library and books are
sorted according to these codes. Now everybody sorts these codes the
same way, but the attibution of the code may differ from library to
library.
 
J

James Stroud

Well in that case it's time to declare Principia Mathematica obsolete.
The real numbers are a two-dimensional field, and you are proposing to
define an ordering for it.

I wasn't a math major, but don't you mean "the complex numbers are a two
dimensional field"? If you mean real numbers, please do explain.

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
S

Steve Holden

James said:
I wasn't a math major, but don't you mean "the complex numbers are a two
dimensional field"?
Correct.

> If you mean real numbers, please do explain.

regards
Steve
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top