Python 3.0 - is this true?

Rhamphoryncus

Duncan said:
Very Hard? Several key functions have been suggested on this thread.
Given that 2.x only sorts most but not all types and that the sort is
only guaranteed to be consistent within a session, as I remember, I
suspect you can choose or write something at least as good for your
purposes.

It's actually worse than that: A different input within a single
session will produce different results. Sorting relies on
transitivity, and comparing such types doesn't provide it.

You might as well comment out the sort and call it good. That's what
you really had in 2.x. It was close enough most of the time to *look*
right, yet in truth it silently failed. 3.0 makes it an explicit
failure.

(There were some situations that worked, but they're exceptional. You
can still do them now, but you need to be explicit (via a key function
or a special singleton.))
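
To make the "explicit failure" concrete, here is a minimal sketch of what Python 3 does with a mixed list, and of the explicit key-function route. The sample list and the names `failed` and `explicit` are made up for illustration.

```python
# Python 3: ordering comparisons between unrelated types raise TypeError,
# so the sort fails loudly instead of returning a plausible-looking order.
data = [3, None, 1]  # hypothetical sample data

try:
    sorted(data)
except TypeError as exc:
    failed = True
    print("sort refused:", exc)
else:
    failed = False

# Being explicit with a key function still works: None sorts first here.
explicit = sorted(data, key=lambda x: (0, 0) if x is None else (1, x))
print(explicit)
```

The key function states the intended order, so the result no longer depends on how unrelated types happen to compare.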
 
Steve Holden

Huh? That's like saying it's OK if cmp raises an error
when comparing negative numbers because "abs(x)" always
returns positive ones. You will find plenty of cases
when db apps return NULL, e.g.:

SELECT name, salary WHERE name LIKE 'Steven %'
I'm not saying an RDBMS can't return NULL values. I am saying that
comparisons with NULL return NULL, not true or false. SQL uses
three-valued logic.
So you could say that 3.0 is forcing us to acknowledge database

(Again) huh?
Reality in databases is that NULL *is* comparable.
"NULL==something" returns False, it doesn't raise an error.

That's at best misleading and at worst just plain wrong. If I have the
following table T:

+-------+-------+
| a | b |
+-------+-------+
| 1 | 1 |
+-------+-------+
| 2 | NULL |
+-------+-------+

you appear to be telling me that

SELECT * FROM T WHERE b <> 1

will return (2, NULL), whereas in fact it returns the empty set, since
the tests NULL = something, and NULL <> something both in fact return NULL.

You can't do an equality or inequality comparison between NULL and
anything else - even another NULL - and get anything but a NULL result.

You have to explicitly test for NULLs using IS NULL.
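
The table example can be checked from Python with the standard-library sqlite3 module. This is only a sketch: SQLite stands in for whichever RDBMS you use, and other engines were not checked here. The variable names are illustrative.

```python
import sqlite3

# In-memory database holding the two-row table T from the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [(1, 1), (2, None)])

# NULL <> 1 evaluates to NULL, which WHERE treats as "not true".
not_equal = conn.execute("SELECT * FROM T WHERE b <> 1").fetchall()

# An explicit IS NULL test is what finds the row.
is_null = conn.execute("SELECT * FROM T WHERE b IS NULL").fetchall()

# Comparing NULL with NULL yields NULL, which sqlite3 hands back as None.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(not_equal, is_null, null_eq_null)
conn.close()
```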

regards
Steve
 
Russ P.

I have read that in Python 3.0, the following will raise an exception:

Will that raise an exception? And, if so, why are they doing this? How
is this helpful? Is this new "enhancement" Pythonic?


I realize that I am late to this discussion, but I would like to add
something. I did not read all the replies, so please forgive me if the
following points have already been made.

This new behavior enhances the dynamic typing of Python and is very
helpful for preventing hard-to-detect bugs.

Let me give you an example of one such bug that cost me a significant
amount of time a while back. I had a function that returned a number,
and I meant to use it in a comparison test like this:

if myfunc() < 0: ...

Well, I mistakenly left off the parens:

if myfunc < 0: ...

Python just went ahead and did the comparison anyway. This sort of
behavior is more like Perl, with its weak typing. It is not "Pythonic"
in the sense of strong dynamic typing, without which Python would be
much less powerful than it is. This kind of bug can be hard to detect,
and I am glad that Python 3.0 prevents them. I wonder how much
existing Python code has these kinds of bugs lurking.
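
A small sketch of the same mistake under Python 3. The body of `myfunc` is a hypothetical stand-in, since the original function is not shown.

```python
def myfunc():
    """Hypothetical stand-in for the function in the anecdote."""
    return -1

# Correct call: compares the returned number with 0.
called = myfunc() < 0

# Forgotten parentheses: Python 3 refuses to order a function against an int.
try:
    myfunc < 0
except TypeError as exc:
    caught = True
    print("caught:", exc)
else:
    caught = False
```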
 
rurpy

I'm not saying an RDBMS can't return NULL values. I am saying that
comparisons with NULL return NULL, not true or false. SQL uses
three-valued logic.



That's at best misleading and at worst just plain wrong.

Yes, it's just plain wrong. :-( You are correct that it
returns NULL not False. Nevertheless, that typo does
not change my point that NULLs are comparable to other
values in SQL, in contrast to your original post that
seemed to be using SQL's NULL behavior as justification
for Py3K's making None not comparable to anything.
 
rurpy

Given that in SQL "NULL `op` something" is False for all comparison
operators (even NULL=NULL), raising an error seems a much lesser evil

s/False/NULL/.
Why is that evil? It is logically consistent, and more importantly,
useful.

In Python, the logically consistent argument is a little weaker (not
having tri-state logic) but the useful argument certainly still seems
true.
 
Steve Holden

Yes, it's just plain wrong. :-( You are correct that it
returns NULL not False. Nevertheless, that typo does
not change my point that NULLs are comparable to other
values in SQL, in contrast to your original post that
seemed to be using SQL's NULL behavior as justification
for Py3K's making None not comparable to anything.

Well, I guess now that I understand your real point, that's fair enough.

regards
Steve
 
Steven D'Aprano

I have an object database written in Python. It, like Python, is
dynamically typed. It heavily relies on being able to sort lists where
some of the members are None. To some extent, it also sorts lists of
other mixed types. It will be very hard to migrate this aspect of it to
Python 3.

No, it is "very hard" to sort *arbitrary* objects consistently. If it
appears to work in Python 2.x that's because you've been lucky to never
need to sort objects that cause it to break.

But if you have a list consisting of only a few types, then it is not
that hard to come up with a strategy for sorting them. All you need is a
key function. Suppose that you want to sort a list of numbers and None.
Then this should do what you expect:

# untested
alist.sort(key=lambda x: (0, -99) if x is None else (1, x))
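
For what it's worth, here is that key function run on a small list under Python 3. The sample data is made up.

```python
alist = [3, None, -7.5, 0, None, 42]  # hypothetical sample data

# Tag None with 0 and real numbers with 1; tuples compare element by
# element, so every None lands before every number and the numbers are
# then ordered among themselves. The -99 is only a filler value.
alist.sort(key=lambda x: (0, -99) if x is None else (1, x))
print(alist)  # [None, None, -7.5, 0, 3, 42]
```

Because the first tuple element already separates None from the numbers, None is never compared with a number directly.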


Another suggestion would be a key function that wraps the objects in a
"compare anything" proxy class. This is very hard to get right for
arbitrary types, which is why sorting in Python 2.x apparently contains
subtle bugs. But if you control the types that can appear in your list,
it's much simpler. I leave the full details as an exercise, but the heart
of it will be something like this:

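NoneType = type(None)

class SortableProxy(object):
    # define the order of types (add your own classes, e.g. MyWackyObjectClass)
    types = [NoneType, str, int]
    def __init__(self, obj):
        self.proxy = obj
    def __lt__(self, other):
        if type(self.proxy) == type(other.proxy):
            # None < None raises TypeError in 3.0, so treat two Nones as equal
            return self.proxy is not None and self.proxy < other.proxy
        p = self.types.index(type(self.proxy))
        q = self.types.index(type(other.proxy))
        return p < q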

I leave it to you to sort out the remaining details (pun intended).
 
Steven D'Aprano

You might as well comment out the sort and call it good. That's what
you really had in 2.x. It was close enough most of the time to *look*
right, yet in truth it silently failed. 3.0 makes it an explicit
failure.

I don't doubt that this is correct, but I think the argument that sorting
in Python 2.x has silent bugs would be much stronger if somebody could
demonstrate arrays that sort wrongly.

A shiny wooden nickel for the first person to show such an example!
 
Rhamphoryncus

You might as well comment out the sort and call it good.  That's what
you really had in 2.x.  It was close enough most of the time to *look*
right, yet in truth it silently failed.  3.0 makes it an explicit
failure.

I don't doubt that this is correct, but I think the argument that sorting
in Python 2.x has silent bugs would be much stronger if somebody could
demonstrate arrays that sort wrongly.

A shiny wooden nickel for the first person to show such an example!

--
Steven

>>> sorted([2, 1.5, Decimal('1.6'), 2.7, 2])
[1.5, 2.7000000000000002, Decimal("1.6"), 2, 2]

Where's my nickel? :p
 
Rhamphoryncus

I don't doubt that this is correct, but I think the argument that sorting
in Python 2.x has silent bugs would be much stronger if somebody could
demonstrate arrays that sort wrongly.
A shiny wooden nickel for the first person to show such an example!

[1.5, 2.7000000000000002, Decimal("1.6"), 2, 2]

Where's my nickel? :p

Ahh, I knew I had a copy of the time machine keys buried in that
drawer...

http://mail.python.org/pipermail/python-dev/2005-December/059166.html
 
Duncan Grisby

I have an object database written in Python. It, like Python, is
dynamically typed. It heavily relies on being able to sort lists where
some of the members are None. To some extent, it also sorts lists of
other mixed types. It will be very hard to migrate this aspect of it
to Python 3.

Very Hard? Several key functions have been suggested on this thread.
Given that 2.x only sorts most but not all types and that the sort is
only guaranteed to be consistent within a session, as I remember, I
suspect you can choose or write something at least as good for your
purposes.

Yes, very hard. There are only ever simple types in the lists --
strings, integers, Nones, very occasionally floats, and lists of those
things. The sort is always predictable with those types. Just because
you can contrive situations to demonstrate unpredictable sorts doesn't
mean that all sorts with mixed types are unpredictable.

The sorting is in a performance-critical part of the system, so the
overhead of evaluating a key function is not insignificant. A key
function that returns objects that contrive to emulate the
functionality of a comparison function is definitely not appropriate.
That area of the system already builds the lists using C++ for speed,
so if we ever migrate to Python 3 it will probably be easier to do the
whole thing in C++ rather than jump through hoops to make the Python
sort work efficiently enough.

Cheers,

Duncan.
 
M.-A. Lemburg

No, it is "very hard" to sort *arbitrary* objects consistently. If it
appears to work in Python 2.x that's because you've been lucky to never
need to sort objects that cause it to break.

If you read Duncan's email, he isn't talking about arbitrary objects
at all. He's just referring to being able to sort lists that contain
None elements.

That's far from arbitrary and does work consistently in Python 2.x -
simply because None is a singleton which is special cased in Python:
None compares smaller to any other object in Python.

I'm not sure why this special case was dropped in Python 3.0. None
is generally used to be a place holder for a n/a-value and as
such will pop up in lists on a regular basis.

I think the special case for None should be readded to Python 3.0.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Nov 11 2008)


eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
 
George Sakkis

If you read Duncan's email, he isn't talking about arbitrary objects
at all. He's just referring to being able to sort lists that contain
None elements.

That's far from arbitrary and does work consistently in Python 2.x -
simply because None is a singleton which is special cased in Python:
None compares smaller to any other object in Python.

I'm not sure why this special case was dropped in Python 3.0. None
is generally used to be a place holder for a n/a-value and as
such will pop up in lists on a regular basis.

I think the special case for None should be readded to Python 3.0.

On python-ideas I proposed adding two new builtin singletons instead,
Smallest and Largest, since the behavior of None wrt comparisons was
never officially part of the language.
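
The thread gives no details of that proposal, so the following is only a guess at what a user-defined `Smallest` could look like today. The class name `_Smallest` and every method here are assumptions, not the python-ideas design; a `Largest` would mirror it.

```python
from functools import total_ordering

@total_ordering
class _Smallest:
    """Hypothetical singleton that compares below every other object."""

    def __eq__(self, other):
        return other is self

    def __lt__(self, other):
        # Smaller than everything except itself.
        return other is not self

    def __hash__(self):
        return id(self)

    def __repr__(self):
        return "Smallest"

Smallest = _Smallest()

data = [3, None, 1]  # hypothetical sample data
# Substitute Smallest for None only while sorting; the list keeps its Nones.
ordered = sorted(data, key=lambda x: Smallest if x is None else x)
print(ordered)  # [None, 1, 3]
```

Unlike the old None behaviour, the ordering is opt-in: nothing compares against `Smallest` unless the code puts it there.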

George
 
Robin Becker

M.-A. Lemburg said:
If you read Duncan's email, he isn't talking about arbitrary objects
at all. He's just referring to being able to sort lists that contain
None elements.

That's far from arbitrary and does work consistently in Python 2.x -
simply because None is a singleton which is special cased in Python:
None compares smaller to any other object in Python.

I'm not sure why this special case was dropped in Python 3.0. None
is generally used to be a place holder for a n/a-value and as
such will pop up in lists on a regular basis.

I think the special case for None should be readded to Python 3.0.

I agree here; it seems strange that cmp(None, None) is exceptional. Clearly the
"is" relation applies to None, and so does ==. Do we not have a sorting order for sets
with one element? My maths is now shot, but I seem to remember there are
automatic orders for such simple sets.
 
Terry Reedy

Duncan said:
Yes, very hard.

There is a difference between 'very hard' (to get 'right') and 'too slow'
(for a particular application). I accept the latter.

There are only ever simple types in the lists --
strings, integers, Nones, very occasionally floats, and lists of those
things. The sort is always predictable with those types. Just because
you can contrive situations to demonstrate unpredictable sorts doesn't
mean that all sorts with mixed types are unpredictable.

The 2.5 manual (and, I am sure, those before it) *intentionally* defines the
default cross-type comparisons as unreliable.

"(This unusual definition of comparison was used to simplify the
definition of operations like sorting and the in and not in operators.
In the future, the comparison rules for objects of different types are
likely to change.)"

They have changed in the past and now they change again (yes, a little
more drastically this time, but as expected for some years).

The sorting is in a performance-critical part of the system, so the
overhead of evaluating a key function is not insignificant. A key
function that returns objects that contrive to emulate the
functionality of a comparison function is definitely not appropriate.
That area of the system already builds the lists using C++ for speed,
so if we ever migrate to Python 3 it will probably be easier to do the
whole thing in C++ rather than jump through hoops to make the Python
sort work efficiently enough.

Assuming the premises, agreed. No hurry, but you can even pull
timsort() out of the source, if you otherwise like its large-list
behavior, and hardcode the comparison function.

Terry Jan Reedy
 
Terry Reedy

M.-A. Lemburg said:
I think the special case for None should be readded to Python 3.0.

Quick summary of thread that MAL started on Python-3000-dev list:

Once upon a time, 0 < None was true.

When rich comparisons were added, None < 0 (and *most* other things)
became true as an intentionally undocumented implementation detail.

The None rule only applies for sure when None controls the comparison:
ob < None is true or undefined if type(ob) says so.

Guido's pronouncement: "In short, I'll have None of it."
summarizing

We're not going to add the "feature" back that None compares smaller
than everything. It's a slippery slope that ends with all operations
involving None returning None -- I've seen a proposal made in all
earnestness requesting that None+42 == None, None() == None, and so
on. This Nonesense was wisely rejected; a whole slew of
early-error-catching would have gone out of the window. It's the same
with making None smaller than everything else. For numbers, you can
already use -inf; for other types, you'll have to invent your own
Smallest if you need it.
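
For the numeric case, the -inf suggestion might look like the sketch below. The sample list is made up, and note the caveat that a genuine -inf in the data would tie with None.

```python
import math

readings = [2.5, None, -40, 7, None]  # hypothetical sample data

# Map None to negative infinity so it sorts before every finite number.
ordered = sorted(readings, key=lambda x: -math.inf if x is None else x)
print(ordered)  # [None, None, -40, 2.5, 7]
```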

tjr
 
Martin v. Löwis

The sorting is in a performance-critical part of the system, so the
overhead of evaluating a key function is not insignificant.

Can you easily produce an example? It doesn't have to be real data,
but should have the structure (typewise) of the real data. I would
like to perform some measurements. For example, I could imagine that

import random

l = []
for i in range(1000):
    x = random.randint(0,100)
    if x < 4: l.append(None)
    else: l.append(x)

might adequately model your problem.
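
A measurement along those lines could be sketched as follows. The key function `none_first`, the fixed seed, and the repeat count of 200 are assumptions chosen for illustration; timings will vary by machine, so none are quoted here.

```python
import random
import timeit

random.seed(0)  # fixed seed so runs are repeatable

# Roughly 4% None among small integers, as in the model above.
l = []
for i in range(1000):
    x = random.randint(0, 100)
    l.append(None if x < 4 else x)

def none_first(x):
    """Hypothetical key: None before every number."""
    return (0, 0) if x is None else (1, x)

result = sorted(l, key=none_first)

# Time 200 keyed sorts of the same list.
seconds = timeit.timeit(lambda: sorted(l, key=none_first), number=200)
print(f"200 keyed sorts of {len(l)} items: {seconds:.4f} s")
```

To judge the overhead, the same timing on a None-free copy of the list without a key function gives a baseline to compare against.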

Regards,
Martin
 
Duncan Grisby

The sorting is in a performance-critical part of the system, so the
overhead of evaluating a key function is not insignificant.

Can you easily produce an example? It doesn't have to be real data,
but should have the structure (typewise) of the real data. I would
like to perform some measurements. For example, I could imagine that

import random

l = []
for i in range(1000):
    x = random.randint(0,100)
    if x < 4: l.append(None)
    else: l.append(x)

might adequately model your problem.

Sorry for the delay in replying. Yes, that's not far off. Most of the
time the lists contain strings, though. A better approximation might
be to read lines from a file and randomly replace them with Nones:

import random

l = []
for line in open("bigfile.txt"):
    x = random.randint(0,100)
    if x < 4: l.append(None)
    else: l.append(line)

And maybe once in a while you end up with something not dissimilar to:

import random

l = []
for line in open("bigfile.txt"):
    x = random.randint(0,100)
    if x < 4: l.append(None)
    elif x < 5: l.append([line,line])
    else: l.append(line)

In that kind of case it doesn't really matter what happens to list
items in the sort order, but it's important it doesn't fail to sort
the ones that are strings.
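
For that mix of strings, Nones and the occasional list, a key function along these lines should keep the strings sorted under Python 3. It is only a sketch: `rank_key` and the sample data are made up, the placement of None and lists is arbitrary, and it says nothing about whether the overhead is acceptable.

```python
def rank_key(item):
    """Hypothetical key: None first, then strings, then lists of strings."""
    if item is None:
        return (0, "")
    if isinstance(item, str):
        return (1, item)
    return (2, item)  # lists of strings compare among themselves

l = ["pear", None, ["fig", "fig"], "apple", None]  # hypothetical sample data
ordered = sorted(l, key=rank_key)
print(ordered)  # [None, None, 'apple', 'pear', ['fig', 'fig']]
```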

Cheers,

Duncan.
 
