Python 3.0 - is this true?

R

Roy Smith

Also, I thought that part of the python philosophy was to allow any
sort of object in a list, and to allow the same methods to work with
whatever was in list.

Not really. When the usual argument about the existence (and
justification) of lists & tuples comes along, one common distinction is
that

- tuples contain arbitrary object of varying types, so they are kind
of "records"
- lists should contain uniform objects.[/QUOTE]

I see absolutely nothing wrong with lists of heterogenous types. Or, for
that matter, iterators which generate heterogeneous types. Here's some
perfectly reasonable examples (equally applicable to lists or iterators):

* The tokens parsed out of a file (ints, floats, identifiers, keywords,
various kinds of punctuation, etc)

* The entries in a unix directory (plain files, directories, symlinks,
special files, named sockets, etc)

* The vehicles going through a toll booth (cars, trucks, motorcycles)

I don't see any reason you shouldn't be able to build lists of those things.
 
D

Diez B. Roggisch

Roy said:
Not really. When the usual argument about the existence (and
justification) of lists & tuples comes along, one common distinction is
that

- tuples contain arbitrary object of varying types, so they are kind
of "records"
- lists should contain uniform objects.

I see absolutely nothing wrong with lists of heterogenous types. Or, for
that matter, iterators which generate heterogeneous types. Here's some
perfectly reasonable examples (equally applicable to lists or iterators):

* The tokens parsed out of a file (ints, floats, identifiers, keywords,
various kinds of punctuation, etc)

* The entries in a unix directory (plain files, directories, symlinks,
special files, named sockets, etc)

* The vehicles going through a toll booth (cars, trucks, motorcycles)

I don't see any reason you shouldn't be able to build lists of those things.[/QUOTE]

When I wrote "uniform" I meant objects of the same kind. So for example
subclasses are of course ok. And all of your examples are these: I want
a Token-object, keeping file-location and possibly original string
representation. The same goes for files - they are simply strings, their
properties determined using stat-calls. And vehicles are... vehicles. So
I'd create a common base-class for them, or made them at least behave
proper through duck-typing. Which - for the case at hand - might include
creating a __cmp__-method, based on horsepower or price-tag or...

Diez
 
S

Stefan Behnel

Duncan said:
Roy said:
In 3.0, can you still order types? In 2.x, you can do:

False

If this still works in 3.0, then you can easily do something like:

def total_order(o1, o2):
"Compare any two objects of arbitrary types"
try:
return o1 <= o2
except UncomparableTypesError: # whatever the right name is
return type(o1) <= type(o2)

and get the same effect as you had in 2.x.

No, that won't work. You can compare types for equality/inequality, but
they are not orderable:
type(1)==type('a') False
sorted([1, 'a'], key=lambda x:(type(x),x))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: type() < type()

However, for an arbitrary ordering of types, id(type(o1)) <= id(type(o2)) will
work well enough.

Stefan
 
R

Roy Smith

"Diez B. Roggisch said:
When I wrote "uniform" I meant objects of the same kind. So for example
subclasses are of course ok. And all of your examples are these: I want
a Token-object, keeping file-location and possibly original string
representation.

Maybe, but only if the logic of my program says that's the right way to do
it. If I decide that the appropriate way to return an integer is by
returning something of type(int), that's my business. Why should I have to
define a Token class if using the native Python types works just as well
for what I'm doing? I'll write class Token when there's some added value
I get from it which I can't get with raw types. I don't want to be forced
into it just because a container doesn't like what I'm doing.

I spend too much time in C++ pleading with the compiler to allow me to do
what I want. I come to Python to get away from all that type bondage.

As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type. And
making a list of them is a perfectly reasonable thing to do.
 
R

Roy Smith

Duncan Booth said:
Roy said:
In 3.0, can you still order types? In 2.x, you can do:

False

If this still works in 3.0, then you can easily do something like:

def total_order(o1, o2):
"Compare any two objects of arbitrary types"
try:
return o1 <= o2
except UncomparableTypesError: # whatever the right name is
return type(o1) <= type(o2)

and get the same effect as you had in 2.x.

No, that won't work. You can compare types for equality/inequality, but
they are not orderable:
type(1)==type('a') False
sorted([1, 'a'], key=lambda x:(type(x),x))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: type() < type()

Sigh. So if I really wanted to do that, I'd be forced to write
"str(type(o1)) < str(type(o2))"? To me, that sounds suspiciously like
"sudo type(o1) < type(o2)" :)
 
M

Marc 'BlackJack' Rintsch

Maybe, but only if the logic of my program says that's the right way to
do it. If I decide that the appropriate way to return an integer is by
returning something of type(int), that's my business. Why should I have
to define a Token class if using the native Python types works just as
well for what I'm doing? I'll write class Token when there's some
added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a
`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.
I don't want to be forced into it just because a container doesn't like
what I'm doing.

The container doesn't say what *you* can do with the content, it is just
a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.
As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a
method on them they have to implement that. Not necessarily through
inheritance.
And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch
 
M

Marc 'BlackJack' Rintsch

Maybe, but only if the logic of my program says that's the right way to
do it. If I decide that the appropriate way to return an integer is by
returning something of type(int), that's my business. Why should I have
to define a Token class if using the native Python types works just as
well for what I'm doing? I'll write class Token when there's some
added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a
`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.
I don't want to be forced into it just because a container doesn't like
what I'm doing.

The container doesn't say what *you* can do with the content, it is just
a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.
As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a
method on them they have to implement that. Not necessarily through
inheritance.
And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch
 
M

Marc 'BlackJack' Rintsch

Maybe, but only if the logic of my program says that's the right way to
do it. If I decide that the appropriate way to return an integer is by
returning something of type(int), that's my business. Why should I have
to define a Token class if using the native Python types works just as
well for what I'm doing? I'll write class Token when there's some
added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a
`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.
I don't want to be forced into it just because a container doesn't like
what I'm doing.

The container doesn't say what *you* can do with the content, it is just
a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.
As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a
method on them they have to implement that. Not necessarily through
inheritance.
And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch
 
M

Marc 'BlackJack' Rintsch

Maybe, but only if the logic of my program says that's the right way to
do it. If I decide that the appropriate way to return an integer is by
returning something of type(int), that's my business. Why should I have
to define a Token class if using the native Python types works just as
well for what I'm doing? I'll write class Token when there's some
added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a
`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.
I don't want to be forced into it just because a container doesn't like
what I'm doing.

The container doesn't say what *you* can do with the content, it is just
a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.
As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a
method on them they have to implement that. Not necessarily through
inheritance.
And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch
 
T

Terry Reedy

Kay said:

I was asking the OP ;-)
if len(L1) == len(L2):
return sorted(L1) == sorted(L2) # check whether two lists contain
the same elements
else:
return False

It doesn't really matter here what the result of the sorts actually is
as long as the algorithm leads to the same result for all permutations
on L1 ( and L2 ).

Leaving aside the O(n) alternative for hashable items, which only
matters for 'long' lists....
If you literally mean 'the same elements', then key=id would work.

For some mixtures, such as strings and numbers, key=id will work better
than in 2.x since complex numbers can be included.

If you want to duplicate 2.x behavior, which does *not* work for all
types...

def py2key(item): return (str(type(item)), item)

Guido is aware that universal compare was occasionally useful, but
decided that it also caused problems. So he broke universal compare
when he introduced complex numbers without arbitrary comparisons.
Decimal numbers are even worse. So anyone who objects to the change in
3.0 *should* have objected when complex numbers were introduced.
 
H

Hallvard B Furuseth

Steven said:
How often do you care about equality ignoring order for lists containing
arbitrary, heterogeneous types?

Arbitrary, I never have. Different types of my choice, a few times.
I was only interested in there being some _sort_ order (and the same in
different runs of the program), not in what the sort order was.

However string sorting is partly arbitrary too. It depends on the
code points in the string's character set. u'\u00c5' < u'\u00d8'
(u'Å' < u'Ø') is wrong in Norwegian. '0' < 'B' < 'a' < '~' can be
wrong depending on the application, as can '10' < '5'.

So I don't quite buy the argument about arbitrary sorting. If you have
different types, at least you likely know that you are doing something
weird. OTOH it's quite common - even normal - to produce poor sort
results because one depends on the built-in arbitrary string sort order.


In any case, would it be possible to add a cmp= function which did more
or less what the current default cmp does now? Then it's available when
someone wants it, and the "arbitrariness" of different types is not
inflicted on people who don't. (In my case I _think_ it's just a "nice
to have" to me and nothing more, but I don't remember for sure.)
 
H

Hallvard B Furuseth

Terry said:
If you want to duplicate 2.x behavior, which does *not* work for all
types...

def py2key(item): return (str(type(item)), item)

Nope.
sorted((-1, 2, True, False)) == [-1, False, True, 2]
sorted((-1, 2, True, False), key=py2key) == [False, True, -1, 2]
Might often be good enough though. But uses more memory.
 
A

Arnaud Delobelle

Roy Smith said:
I spend too much time in C++ pleading with the compiler to allow me to do
what I want. I come to Python to get away from all that type bondage.

As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type. And
making a list of them is a perfectly reasonable thing to do.

You do realise that all these objects (like any object in Python) are
all instances of the object type, don't you?
 
A

Aahz

I see absolutely nothing wrong with lists of heterogenous types. Or, for
that matter, iterators which generate heterogeneous types. Here's some
perfectly reasonable examples (equally applicable to lists or iterators):

* The tokens parsed out of a file (ints, floats, identifiers, keywords,
various kinds of punctuation, etc)

* The entries in a unix directory (plain files, directories, symlinks,
special files, named sockets, etc)

* The vehicles going through a toll booth (cars, trucks, motorcycles)

I don't see any reason you shouldn't be able to build lists of those things.

Overall agreed, but I think your reasoning breaks down when you talk
about sorting such a list: generally speaking, when you create a list
like that, it's because you specifically want to preserve ordering.

The case where it's mostly like you'd want to sort (entries in a Unix
directory), you can arguably reduce the discussion to string sorting
quite easily.
 
M

Martin v. Löwis

Yes, key= lets you sort anything anyway you want.
l=[1, '2', 3j]
sorted(l, key = str)
[1, '2', 3j]

The problem here is that str() is degenerate, i.e. a != b does not imply
str(a) != str(b).

So why is this a problem? The sort algorithm is stable, so it gives a
deterministic result. Not a particularly useful one - but I don't see
that any order on these three values could be considered "useful".
sorted(l, key = id)
['2', 3j, 1]

And of course, this has to opposite problem, a == b does not imply id(a) ==
id(b). Whether either of these "problems" is really a problem obviously
depends on what you're trying to do.

And that's Terry's message entirely. Is is YOU who decides how you want
these things sorted (if you want to sort them). Python refuses the
temptation to guess.
In 3.0, can you still order types? In 2.x, you can do:

False

By what criteria? When is a type smaller than another one?
def total_order(o1, o2):
"Compare any two objects of arbitrary types"
try:
return o1 <= o2
except UncomparableTypesError: # whatever the right name is
return type(o1) <= type(o2)

and get the same effect as you had in 2.x.

You can still do that - but you also need to define the order that
types have in 2.x (namely, by name).

Completely emulating the sorting of 2.x is nearly impossible,
considering how complex that is.

Regards,
Martin
 
M

Martin v. Löwis

Sure:
if len(L1) == len(L2):
return sorted(L1) == sorted(L2) # check whether two lists contain
the same elements
else:
return False

It doesn't really matter here what the result of the sorts actually is
as long as the algorithm leads to the same result for all permutations
on L1 ( and L2 ).

Unfortunately, for many releases, the list's sort algorithm would not
provide that property. Release for release, new cases where found where
the builtin ordering was not transitive (i.e. you get a < b and b < c,
but not a < c). With funny definitions of __cmp__ in some classes, you
can still get this today.

Regards,
Martin
 
M

Martin v. Löwis

def comp(x1, x2):
try:
if x1<x2:
return -1
else:
return 1
except TypeError:
if str(x1)<str(x2):
return -1
else:
return 1

Please correct me if I'm wrong, but I think this is not transitive. If
strings and ints are uncomparable, this will give you 20 < "5" and
"5" < 8, but not 20 < 8. As a result, quicksort will do nonsense
with that comparison function (i.e. it won't guarantee that things
are sorted in increasing order)
Not sure how to transform it into a search key that is efficient and
reliable.

Depending on what your notion of "sameness of lists" is, the following
might do:

def key(o):
return id(type(o)),o

This only requires that all objects of the same type in the list can
be sorted. If this is not guaranteed, then

def key(o)
try:
o<o
return id(type(o)),o
except TypeError:
return id(type(o)),id(o)

might do.

Regards,
Martin
 
M

Martin v. Löwis

In any case, would it be possible to add a cmp= function which did more
or less what the current default cmp does now?

As somebody pointed out, it's possible to create a key= function that
provides arbitrary ordering: just return an object custom type whose
__lt__ never fails:

class AlwaysOrdered:
def __init__(self, o):
self.o = o
def __lt__(self, other):
o1 = self.o
o2 = other.o
try:
return o1 < o2
except TypeError:
return cmp((type(o1).__name__, id(o1)),
(type(o2).__name__, id(o2)))<0

L.sort(key=AlwaysOrdered)

(This is not quite the current implementation, and not even
transitive, but you get the idea)

Regards,
Martni
 
A

Arnaud Delobelle

Martin v. Löwis said:
Please correct me if I'm wrong, but I think this is not transitive. If
strings and ints are uncomparable, this will give you 20 < "5" and
"5" < 8, but not 20 < 8. As a result, quicksort will do nonsense
with that comparison function (i.e. it won't guarantee that things
are sorted in increasing order)

Even in 2.x it doesn't work (I think I posted this earlier but I'm not
sure anymore) as this example shows:

2 < 3j and 3j < True, but True < 2
 
M

Martin v. Löwis

Hallvard said:
Terry said:
If you want to duplicate 2.x behavior, which does *not* work for all
types...

def py2key(item): return (str(type(item)), item)

Nope.
sorted((-1, 2, True, False)) == [-1, False, True, 2]
sorted((-1, 2, True, False), key=py2key) == [False, True, -1, 2]
Might often be good enough though. But uses more memory.

Compared to what? To the current 2.x default implementation? Or compared
to an emulation of that, in 2.x, in a pure-Python function, passed
as a cmp= argument?

I think this one is *more* memory-efficient than a cmp function. Calling
a cmp function will always allocate a frame object, for each comparison,
giving you O(n log n) frame objects being created. OTOH, the key
function will only be called O(n) times, causing the allocation of only
O(n) objects. Of course, the peak memory usage is higher with the key
function (O(n), compared to O(1) for the cmp= function, assuming the
frame objects get deallocated and reused immediately).

Regards,
Martin
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,733
Messages
2,569,439
Members
44,829
Latest member
PIXThurman

Latest Threads

Top