# Perl-Python-a-Day: Sorting

Discussion in 'Python' started by Xah Lee, Oct 10, 2005.

1. ### Xah LeeGuest

Sort a List

Xah Lee, 200510

In this page, we show how to sort a list in Python & Perl and also
discuss some math of sort.

To sort a list in Python, use the â€œsortâ€ method. For example:

li=[1,9,2,3];
li.sort();
print li;

Note that sort is a method, and the list is changed in place.

Suppose you have a matrix, and you want to sort by second column.
Example Solution:

li=[[2,6],[1,3],[5,4]]
li.sort(lambda x, y: cmp(x[1],y[1]))
print li; # prints [[1, 3], [5, 4], [2, 6]]

The line â€œli.sort(lambda x, y: cmp(x[1],y[1]))â€ can also be written
as â€œli.sort(cmp=lambda x, y: cmp(x[1],y[1]))â€

The argument to sort is a function of two arguments, that returns -1,
0, 1. This function is a decision function that tells sort() how to
decide the order of any two elements in your list. If the first
argument is â€œlessâ€ then second argument, the function should return
-1. If equal, then 0. Else, 1.

Here's a more complex example. Suppose you have a list of strings.

'my283.jpg'
'my23i.jpg'
'web7-s.jpg'
'fris88large.jpg'
....

You want to sort them by the number embedded in them. What you have to
do, is to provide sort() method a function, that takes two strings, and
compares the integer inside the string. Here's the solution:

li=[
'my283.jpg',
'my23i.jpg',
'web7-s.jpg',
'fris88large.jpg',
]

def myComp (x,y):
import re
def getNum(str): return float(re.findall(r'\d+',str)[0])
return cmp(getNum(x),getNum(y))

li.sort(myComp)
print li # returns ['web7-s.jpg', 'my23i.jpg', 'fris88large.jpg',
'my283.jpg']

Here, we defined a function myComp to tell sort about the ordering.
Normally, one would use the â€œlambdaâ€ construct, but Python's lambda
construct can only represent the simplest functions.

In general, the function f used to determine the order of any two
element must satisfy some constraints:

â€¢ f(a,a)==0
â€¢ if f(a,b)==0 then f(b,a)==0
â€¢ if f(a,b)==0 and f(b,c)==0, then f(a,c)==0.
â€¢ if f(a,b)==-1 and f(b,c)==-1, then f(a,c)==-1.
â€¢ if f(a,b)==-1, then f(b,a)==1.

If the comparison function does not behave as the above, then it is not
consistent, meaning that the result â€œorderedâ€ list is may actually
be different depending how the language happens to implement sort.

The significance of all these is that in real software you may want to
sort a list of non-simple entities by a specialized ordering. For
example, you may want to sort a list of polygonal surfaces in 3D space,
for particular reasons in implementing some computer graphics features.
Say, you want to sort these polygons by their spacial orientations. It
ordering is important. Otherwise, you might have a bewildering result
yet unable to locate any flaws in your code.

Python's â€œsortâ€ method's optional parameters: â€œkeyâ€ and
â€œreverseâ€

Most of the time, sorting is done for a list of atomic element such as
[3,2,4]. This is simply done by myList.sort() without any argument.
Other than simple list, sort is frequently used on matrixes (e.g.
[[2,6],[1,3],[5,4]]). For matrixes, almost always a particular column
is used for the basis of ordering. For example, if we want to sort by
second column, we do: â€œli.sort(lambda x, y: cmp(x[1],y[1]))â€. Since
this is frequently used, Python provides a somewhat shorter syntax for
it, by specifying the column used as the ordering â€œkeyâ€. For
example:

li=[[2,6],[1,3],[5,4]]
li.sort(key=lambda x:x[1] ) # is equivalent to the following
#li.sort(lambda x, y: cmp(x[1],y[1]))
print li; # prints [[1, 3], [5, 4], [2, 6]]

Because Python's implementation is not very refined , this specialized
syntax is actually much speedier than the general form â€œlambda x, y:
cmp(x[1],y[1])â€. It is a burden on the programer to always use the
â€œkeyâ€ syntax idiosyncrasy if he is sorting a large matrix.

Another idiosyncratic provision is the optional â€œreverseâ€ argument.
This parameter is somewhat necessary when using the â€œkeyâ€
parameter. One can reverse the ordering by using the â€œreverseâ€
keyword as a argument to sort. Example:

The following are equivalent:

li.sort(key=lambda x:x[1], reverse=True )
li.sort(lambda x, y: cmp(x[1],y[1]), reverse=True)
li.sort(lambda x, y: cmp(y[1],x[1]))

The official doc on Python's sort method is at (bottom):
http://python.org/doc/2.4/lib/typesseq-mutable.html

Sorting in Perl

(to be posted in a couple of days)

This post is archived at:
http://xahlee.org/perl-python/sort_list.html

Xah

âˆ‘ http://xahlee.org/

Xah Lee, Oct 10, 2005

2. ### Marcin 'Qrczak' KowalczykGuest

Followup-To: comp.lang.scheme

"Xah Lee" <> writes:

> Since this is frequently used, Python provides a somewhat shorter
> syntax for it, by specifying the column used as the ordering â€œkeyâ€.

[...]
> Because Python's implementation is not very refined , this specialized
> syntax is actually much speedier than the general form â€œlambda x, y:
> cmp(x[1],y[1])â€. It is a burden on the programer to always use the
> â€œkeyâ€ syntax idiosyncrasy if he is sorting a large matrix.

It's not only clearer for a human, but also faster in all good
implementations of all languages which support that, except when the
ordering function is very simple. It's called Schwartzian transform
and I wish more language designers and programmers knew about it.

http://en.wikipedia.org/wiki/Schwartzian_transform

I urge future SRFI authors to include it. The withdrawn SRFI-32 for
sorting didn't do that, and I can't find any other SRFI which deals
with sorting.

--
__("< Marcin Kowalczyk
\__/
^^ http://qrnik.knm.org.pl/~qrczak/

Marcin 'Qrczak' Kowalczyk, Oct 10, 2005

3. ### Ulrich HobelmannGuest

Xah Lee wrote:
> To sort a list in Python, use the â€œsortâ€ method. For example:
>
> li=[1,9,2,3];
> li.sort();
> print li;

Likewise in Common Lisp. In Scheme there are probably packages for that
as well. My apologies for not being very fluent anymore.

CL-USER> (setf list (sort '(1 9 2 3) #'<)) ; input
(1 2 3 9) ; output

The second argument is mandatory too (comparison function).

> Note that sort is a method, and the list is changed in place.

Same here. To be safe, assign the result to "list".

> Suppose you have a matrix, and you want to sort by second column.
> Example Solution:
>
> li=[[2,6],[1,3],[5,4]]
> li.sort(lambda x, y: cmp(x[1],y[1]))
> print li; # prints [[1, 3], [5, 4], [2, 6]]

CL-USER> (setf list (sort '((2 6) (1 3) (5 4))
#'(lambda (x y) (< (second x) (second y)))))
((1 3) (5 4) (2 6)) ; output

> The argument to sort is a function of two arguments, that returns -1,
> 0, 1. This function is a decision function that tells sort() how to
> decide the order of any two elements in your list. If the first
> argument is â€œlessâ€ then second argument, the function should return
> -1. If equal, then 0. Else, 1.

In CL you only need a smaller-than function. I guess if elements are
"equal", they don't need sorting anyway.

> li=[
> 'my283.jpg',
> 'my23i.jpg',
> 'web7-s.jpg',
> 'fris88large.jpg',
> ]

CL-USER> (setf list '("my283.jpg" "my23i.jpg" "web7-s.jpg"
"fris88large.jpg"))

> def myComp (x,y):
> import re
> def getNum(str): return float(re.findall(r'\d+',str)[0])
> return cmp(getNum(x),getNum(y))

CL-USER> (defun my-comp (x y)
(flet ((getnum (s)
(parse-integer s :start (position-if #'digit-char-p s)
:junk-allowed t)))
(< (getnum x) (getnum y))))

> li.sort(myComp)
> print li # returns ['web7-s.jpg', 'my23i.jpg', 'fris88large.jpg',
> 'my283.jpg']

CL-USER> (setf list (sort list #'my-comp))
("web7-s.jpg" "my23i.jpg" "fris88large.jpg" "my283.jpg") ; output

> li=[[2,6],[1,3],[5,4]]
> li.sort(key=lambda x:x[1] ) # is equivalent to the following
> #li.sort(lambda x, y: cmp(x[1],y[1]))
> print li; # prints [[1, 3], [5, 4], [2, 6]]

CL-USER> (setf list (sort '((2 6) (1 3) (5 4)) #'< :key #'second))
((1 3) (5 4) (2 6)) ; output

Here some people might jump in and say "lists might be more readable
than vectors, but lists are slow."
If they are slow for your data set, just use vectors instead

--
State, the new religion from the friendly guys who brought you fascism.

Ulrich Hobelmann, Oct 10, 2005
4. ### Pascal CostanzaGuest

Ulrich Hobelmann wrote:
> Xah Lee wrote:
>
>> To sort a list in Python, use the â€œsortâ€ method. For example:
>>
>> li=[1,9,2,3];
>> li.sort();
>> print li;

>
> Likewise in Common Lisp. In Scheme there are probably packages for that
> as well. My apologies for not being very fluent anymore.
>
> CL-USER> (setf list (sort '(1 9 2 3) #'<)) ; input
> (1 2 3 9) ; output

Careful. Common Lisp's sort function is specified to be destructive, so
you shouldn't use it on literal constants. So don't say (sort '(1 9 2 3)
....), say (sort (list 1 9 2 3) ...), etc.

Pascal

--
OOPSLA'05 tutorial on generic functions & the CLOS Metaobject Protocol
++++ see http://p-cos.net/oopsla05-tutorial.html for more details ++++

Pascal Costanza, Oct 10, 2005
5. ### Xah LeeGuest

Python Doc Problem Example: sort() (reprise)

Python Doc Problem Example: sort()

Xah Lee, 200503
Exhibit: Incompletion & Imprecision

Python doc â€œ3.6.4 Mutable Sequence Typesâ€ at
http://python.org/doc/2.4/lib/typesseq-mutable.html

in which contains the documentation of the â€œsortâ€ method of a list.
Quote:

Â«
Operation Result Notes
s.sort([cmp[, key[, reverse]]]) sort the items of s in place (7),
(8), (9), (10)

(7) The sort() and reverse() methods modify the list in place for
economy of space when sorting or reversing a large list. To remind you
that they operate by side effect, they don't return the sorted or
reversed list.

(8) The sort() method takes optional arguments for controlling the
comparisons.

cmp specifies a custom comparison function of two arguments (list
items) which should return a negative, zero or positive number
depending on whether the first argument is considered smaller than,
equal to, or larger than the second argument: "cmp=lambda x,y:
cmp(x.lower(), y.lower())"

key specifies a function of one argument that is used to extract a
comparison key from each list element: "cmp=str.lower"

reverse is a boolean value. If set to True, then the list elements
are sorted as if each comparison were reversed.

In general, the key and reverse conversion processes are much
faster than specifying an equivalent cmp function. This is because cmp
is called multiple times for each list element while key and reverse
touch each element only once.

Changed in version 2.3: Support for None as an equivalent to

Changed in version 2.4: Support for key and reverse was added.

(9) Starting with Python 2.3, the sort() method is guaranteed to be
stable. A sort is stable if it guarantees not to change the relative
order of elements that compare equal -- this is helpful for sorting in
multiple passes (for example, sort by department, then by salary

(10) While a list is being sorted, the effect of attempting to
mutate, or even inspect, the list is undefined. The C implementation of
Python 2.3 and newer makes the list appear empty for the duration, and
raises ValueError if it can detect that the list has been mutated
during a sort.
Â»

As a piece of documentation, this is a lousy one.

The question Python doc writers need to ask when evaluating this piece
of doc are these:

â€¢ can a experienced programer who is expert at several languages but
new to Python, and also have read the official Python tutorial, can he,
read this doc, and know exactly how to use sort with all the options?

â€¢ can this piece of documentation be rewritten fairly easily, so that
the answer to the previous question is a resounding yes?

To me, the answers to the above questions are No and Yes. Here are some
issues with the doc:

â€¢ In the paragraph about the â€œkeyâ€ parameter, the illustration
given is: â€œcmp=str.lowerâ€. It should be be â€œkey=str.lowerâ€

â€¢ This doc lacks examples. One or two examples will help a lot,
especially to less experienced programers. (which comprises the
majority of readers) In particular, it should give a full example of
using the comparison function and one with the â€œkeyâ€ parameter.
Examples are particularly needed here because these parameteres are
functions, often with the â€œlambdaâ€ construct. These are unusual and

â€¢ This doc fails to mention what happens when the predicate and the
shortcut version conflicts. e.g. â€œmyList.sort(cmp=lambda x,y:
cmp(x[0], y[0]), key=lambda x: str(x[1]) )â€

â€¢ The syntax notation Python doc have adopted for indicating optional
parameters, does not give a clear view just exactly what combination of
optional parameters can be omitted. The notation: â€œs.sort([cmp[,
key[, reverse]]])â€ gives the impression that only trailing arguments
can be omitted, which is not true.

â€¢ The doc gives no indication of how to omit a optional arg. Should
it be â€œnulâ€, â€œNullâ€, 0, or left empty? Since it doesn't give
any examples, doc reader who isn't Python experts is left to guess at
how true/false values are presented in Python.

â€¢ On the whole, the way this doc is written does not give a clear
picture of the roles of the supplied options, nor how to use them.

Suggested Quick Remedy: add a example of using the cmp function. And a
example using the â€œkeyâ€ function. Add a example of Using one of
them and with reverse. (the examples need not to come with much
explanations. One sentence annotation is better than none.)

Other than that, the way the doc is layed out with a terse table and
run-on footnotes (employed in several places in Python doc) is not
inductive. For a better improvement, there needs to be a overhaul of
the organization and the attitude of the entire doc. The organization
needs to be programing based, as opposed to implementation or computer
science based. (in this regard, one can learn from the Perl folks). As
to attitude, the writing needs to be Python-as-is, as opposed to
computer science framework, as indicated in the early parts of this
critique series.
----------------
This post is archived at:
http://xahlee.org/perl-python/python_doc_sort.html

Xah

âˆ‘ http://xahlee.org/

Xah Lee, Oct 11, 2005
6. ### Xah LeeGuest

Re: Python Doc Problem Example: sort() (reprise)

Here's further example of Python's extreme low quality of
documentation. In particular, what follows focuses on the bad writing
skill aspect, and comments on some language design and quality issues
of Python.

>From the Official Python documentation of the sort() method, at:

http://python.org/doc/2.4.2/lib/typesseq-mutable.html, Quote:

Â«The sort() method takes optional arguments for controlling the
comparisons.Â»

It should be â€œoptional parameterâ€ not â€œoptional argumentâ€.
Their difference is that â€œparameterâ€ indicates the variable, while
â€œargumentâ€ indicates the actual value.

Â«... for controlling the comparisons.Â»

This is a bad writing caused by lack of understanding. No, it doesn't
â€œcontrol the comparisonâ€. The proper way to say it is that â€œthe
comparison function specifies an orderâ€.

Â«The sort() and reverse() methods modify the list in place for economy
of space when sorting or reversing a large list. To remind you that
they operate by side effect, they don't return the sorted or reversed
list. Â»

This is a example of tech-geeking drivel. The sort() and reverse()
methods are just the way they are. Their design and behavior are really
not for some economy or remind programers of something. The Python doc
is bulked with these irrelevant drivels. These littered inanities
dragged down the whole quality and effectiveness of the doc implicitly.

Â«Changed in version 2.4: Support for key and reverse was added.Â»

Â«In general, the key and reverse conversion processes are much faster
than specifying an equivalent cmp function. This is because cmp is
called multiple times for each list element while key and reverse touch
each element only once.Â»

When sorting something, one needs to specify a order. The easiest way
is to simply list all the elements as a sequence. That way, their order
is clearly laid out. However, this is in general not feasible and
impractical. Therefore, we devised a mathematically condensed way to
specify the order, by defining a function f(x,y) that can take any two
elements and tell us which one comes first. This, is the gist of
sorting a list in any programing language.

The ordering function, being a mathematically condensed way of
specifying the order, has some constraints. For example, the function
should not tell us x < y and y < x. (For a complete list of these
constraints, see http://xahlee.org/perl-python/sort_list.html )

With this ordering function, it is all sort needed to sort a list.
Anything more is interface complexity.

The optional parameters â€œkeyâ€ and â€œreverseâ€ in Python's sort
method is a interface complexity. What happened here is that a compiler
optimization problem is evaded by moving it into the language syntax
for programers to worry about. If the programer does not use the
â€œkeyâ€ syntax when sorting a large matrix (provided that he knew in
advance of the list to be sorted or the ordering function), then he is
penalized by a severe inefficiency by a order of magnitude of execution
time.

This situation, of moving compiler problems to the syntax surface is
common in imperative languages.

Â«Changed in version 2.3: Support for None as an equivalent to omitting

This is a epitome of catering towards morons. â€œmyList.sort()â€ is
complexity just because idiots need it.

The motivation here is simple: a explicit â€œNoneâ€ gives coding
monkeys a direct sensory input of the fact that â€œthere is no
comparison functionâ€. This is like the double negative in black
English â€œI ain't no gonna do it!â€. Logically, â€œNoneâ€ is not
even correct and leads to bad thinking. What really should be stated in
the doc, is that â€œthe default ordering function to sort() is the
â€˜cmpâ€™ function.â€.

Â«Starting with Python 2.3, the sort() method is guaranteed to be
stable. A sort is stable if it guarantees not to change the relative
order of elements that compare equal -- this is helpful for sorting in
multiple passes (for example, sort by department, then by salary

existence, its sort functionality is not smart enough to preserve
order?? A sort that preserves original order isn't something difficult
to implement. What we have here is sloppiness and poor quality common
in OpenSource projects.

Also note the extreme low quality of the writing. It employes the
jargon â€œstable sortâ€ then proceed to explain what it is, and the
latch on of â€œmultiple passesâ€ and the mysterious â€œby department,
by salaryâ€.

Here's a suggested rewrite: â€œSince Python 2.3, the result of sort()
no longer rearrange elements where the comparison function returns
0.â€
-----------
This post is archived at:
http://xahlee.org/perl-python/python_doc_sort.html

Xah

âˆ‘ http://xahlee.org/

Xah Lee, Oct 11, 2005
7. ### BryanGuest

bizarro world (was Re: Python Doc Problem Example: sort() (reprise))

Xah Lee wrote:
>
> Here's further example of Python's extreme low quality of
> documentation. In particular, what follows focuses on the bad writing
> skill aspect, and comments on some language design and quality issues
> of Python.
>
>>From the Official Python documentation of the sort() method, at:

> http://python.org/doc/2.4.2/lib/typesseq-mutable.html, Quote:
>
> Â«The sort() method takes optional arguments for controlling the
> comparisons.Â»
>
> It should be â€œoptional parameterâ€ not â€œoptional argumentâ€.
> Their difference is that â€œparameterâ€ indicates the variable, while
> â€œargumentâ€ indicates the actual value.
>
> Â«... for controlling the comparisons.Â»
>
> This is a bad writing caused by lack of understanding. No, it doesn't
> â€œcontrol the comparisonâ€. The proper way to say it is that â€œthe
> comparison function specifies an orderâ€.
>
> Â«The sort() and reverse() methods modify the list in place for economy
> of space when sorting or reversing a large list. To remind you that
> they operate by side effect, they don't return the sorted or reversed
> list. Â»
>
> This is a example of tech-geeking drivel. The sort() and reverse()
> methods are just the way they are. Their design and behavior are really
> not for some economy or remind programers of something. The Python doc
> is bulked with these irrelevant drivels. These littered inanities
> dragged down the whole quality and effectiveness of the doc implicitly.
>
> Â«Changed in version 2.4: Support for key and reverse was added.Â»
>
> Â«In general, the key and reverse conversion processes are much faster
> than specifying an equivalent cmp function. This is because cmp is
> called multiple times for each list element while key and reverse touch
> each element only once.Â»
>
> When sorting something, one needs to specify a order. The easiest way
> is to simply list all the elements as a sequence. That way, their order
> is clearly laid out. However, this is in general not feasible and
> impractical. Therefore, we devised a mathematically condensed way to
> specify the order, by defining a function f(x,y) that can take any two
> elements and tell us which one comes first. This, is the gist of
> sorting a list in any programing language.
>
> The ordering function, being a mathematically condensed way of
> specifying the order, has some constraints. For example, the function
> should not tell us x < y and y < x. (For a complete list of these
> constraints, see http://xahlee.org/perl-python/sort_list.html )
>
> With this ordering function, it is all sort needed to sort a list.
> Anything more is interface complexity.
>
> The optional parameters â€œkeyâ€ and â€œreverseâ€ in Python's sort
> method is a interface complexity. What happened here is that a compiler
> optimization problem is evaded by moving it into the language syntax
> for programers to worry about. If the programer does not use the
> â€œkeyâ€ syntax when sorting a large matrix (provided that he knew in
> advance of the list to be sorted or the ordering function), then he is
> penalized by a severe inefficiency by a order of magnitude of execution
> time.
>
> This situation, of moving compiler problems to the syntax surface is
> common in imperative languages.
>
> Â«Changed in version 2.3: Support for None as an equivalent to omitting
>
> This is a epitome of catering towards morons. â€œmyList.sort()â€ is
> complexity just because idiots need it.
>
> The motivation here is simple: a explicit â€œNoneâ€ gives coding
> monkeys a direct sensory input of the fact that â€œthere is no
> comparison functionâ€. This is like the double negative in black
> English â€œI ain't no gonna do it!â€. Logically, â€œNoneâ€ is not
> even correct and leads to bad thinking. What really should be stated in
> the doc, is that â€œthe default ordering function to sort() is the
> â€˜cmpâ€™ function.â€.
>
> Â«Starting with Python 2.3, the sort() method is guaranteed to be
> stable. A sort is stable if it guarantees not to change the relative
> order of elements that compare equal -- this is helpful for sorting in
> multiple passes (for example, sort by department, then by salary
>
> existence, its sort functionality is not smart enough to preserve
> order?? A sort that preserves original order isn't something difficult
> to implement. What we have here is sloppiness and poor quality common
> in OpenSource projects.
>
> Also note the extreme low quality of the writing. It employes the
> jargon â€œstable sortâ€ then proceed to explain what it is, and the
> latch on of â€œmultiple passesâ€ and the mysterious â€œby department,
> by salaryâ€.
>
> Here's a suggested rewrite: â€œSince Python 2.3, the result of sort()
> no longer rearrange elements where the comparison function returns
> 0.â€
> -----------
> This post is archived at:
> http://xahlee.org/perl-python/python_doc_sort.html
>
> Xah
>
> âˆ‘ http://xahlee.org/
>

omg!!! wow!!! after reading this i feel like i just stepped in to some bizarro
world. this entire posting is like something you would read in the onion.
unfortunately this posting has just enough real words mixed with BS that a
newbie might actually fall for this stuff. i think mr. xah has published enough
python satire by now that we could probably make a nice bathroom reading book
"python untechgeeked". sorry folks, i'm still laughing... i just don't get how
someone can go on and on and on, day after day, month after month writing this
stuff that is so completely off base that you wonder if he's looking at the same
python as we are. mr. xah, why do you spend so much time agonizing over a
language you obviously don't like. why is python so important to you that you
are willing to waste so much of your life on this? there is obviously something
else i'm not seeing here. is there a pschologist in the house? can someone
explain xah to me? is he clinically depressed? suicidal? does he have signs of
a serial murderer? does he need a girl friend and a social life? or maybe just
take a yoga class? could these ramblings of his simply be on days that he
doesn't take his medication? i'm starting to think he is not simply just a
troll. i think there might be something seriously wrong with him and this is
just his way of asking for help. maybe i'm wrong, there was just something is
his last writings that made think he's hurting inside.

Bryan, Oct 12, 2005
8. ### BryanGuest

Re: bizarro world (was Re: Python Doc Problem Example: sort()(reprise))

mr. xah... would you be willing to give a lecture at pycon 2006? i'm sure you
would draw a huge crowd and a lot of people would like to meet you in person...

thanks.

Bryan, Oct 12, 2005
9. ### =?UTF-8?B?TGFzc2UgVsOlZ3PDpnRoZXIgS2FybHNlbg==?=Guest

Re: bizarro world (was Re: Python Doc Problem Example: sort() (reprise))

Bryan wrote:
> mr. xah... would you be willing to give a lecture at pycon 2006? i'm
> sure you would draw a huge crowd and a lot of people would like to meet
> you in person...
>
> thanks.
>

I think that would be a highly un-pythonesque crowd. Python isn't much
in the sense of limitations, but that crowd probably needs to be limited
in one way or another, like "only 2 rotten fruits per person" or similar.

--
Lasse VÃ¥gsÃ¦ther Karlsen
http://usinglvkblog.blogspot.com/
mailto:
PGP KeyID: 0x2A42A1C2

=?UTF-8?B?TGFzc2UgVsOlZ3PDpnRoZXIgS2FybHNlbg==?=, Oct 12, 2005
10. ### Piet van OostrumGuest

Re: bizarro world

>>>>> Bryan <> (B) wrote:

>B> omg!!! wow!!! after reading this i feel like i just stepped in to some
>B> bizarro world.

So why do you have to repeat the whole thing?? I have kill-filed XL, and
now you put the message in my face. Please don't react to this drivel.
--
Piet van Oostrum <>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email:

Piet van Oostrum, Oct 12, 2005
11. ### Xah LeeGuest

Sorting in Perl

In Perl, to sort a list, do like this:

@li=(1,9,2,3);
@li2 = sort {\$a <=> \$b} @li;
print join(' ', @li2);

In Perl, sort is a function, not some Object Oriented thing. It returns
the sorted result as another list. This is very simple and nice.

It works like this: sort takes the form â€œsort {...} @myListâ€.
Inside the enclosing braces is the body of the ordering function, where
variables \$a and \$b inside are predefined by the language to represent
two elements in the list. The operator â€œ<=>â€ returns -1 if left
element is less than right element. If equal, it returns 0, else 1. It
is equivalent to Python's â€œcmpâ€ function.

Perl provides another comparison operator â€œcmpâ€, which treats its
operands as strings. So, for example:

print "3" <=> "20"; # prints -1
print "3" cmp "20"; # prints 1

In Perl, numbers and strings are mutually automatically converted if
needed.

Another form of sort is â€œsort orderFunctionName @listâ€, which uses
a function name in place of the comparison block

Just for completeness, let's define a Perl equivalent of a Python
example above.

@li=(
'my283.jpg',
'my23i.jpg',
'web7-s.jpg',
'fris88large.jpg',
);

# sorts a list of strings by their embedded number
@li2 = sort { (\$a=~m/(\d+)/)[0] <=> (\$b=~m/(\d+)/)[0];} @li;
print join(' ', @li2); # prints web7-s.jpg my23i.jpg fris88large.jpg
my283.jpg

Note, that Perl also has pure functions (lambda) construct. In Perl, a
pure function is simply done as â€œdef {...}â€, and applied to
argument by the form â€œpureFunction->(args)â€. Unlike Python, Perl
has no limitation in its pure function construct. Because of this, Perl
supports a limited form of functional programing. Here's a simple
example:

\$result = sub(\$) {\$_[0]+1}->(3);
print \$result; # prints 4
# a value obtained by a pure function that adds 1 to its argument,
# applied to a argument of 3.

Perl, like Python, whose compiler is not very smart. In sorting, the
ordering function is called unnecessarily repetitiously. Like Python,
Perlers have sought means to avoid this penalty. If the programer knew
in advance that his matrix is huge or knew in advance his ordering
function, then he can code his sort by writing it using a very complex
workaround.:

Here's how this work around works: Suppose you want to sort a huge list
with a expensive ordering function. If you simply do â€œsort
orderingFunction @myListâ€, then Perl is going to call your
orderingFunction wantonly, and you lose. To beat Perl, you first
generate a copy newList, whose elements are pairs (x,y), where x is
from the original list, and y is the sorting key extracted from x.
Then, you call Perl's sort function on this newList and using a
ordering function based on the second list element (the y). Once this
newList is sorted, you construct your original list by extracting the
first element from this sorted newList. Here's the code example:

# the following two lines are equivalent
@li2 = sort { (\$a=~m/(\d+)/)[0] <=> (\$b=~m/(\d+)/)[0];} @li;
@li2 = map { \$_->[0] } sort { \$a->[1] <=> \$b->[1] } map { [ \$_,
(\$_=~m/(\d+)/)[0] ] } @li;

In the above Perl code, the part â€œmap { [ \$_, (\$_=~m/(\d+)/)[0] ] }
@li;â€ generates the intermediate newList. Then, sort is applied to
it, then, another map â€œmap { \$_->[0] }â€ gets the original list
sorted.

In this way, the cost of the internals of the ordering function is
avoided. (it runs on each list element once) However, your huge list is
copied 1 extra time. So, there are pros and cons. Because this work
around is very complex in both its semantics and syntax, it has
acquired a coolness factor among Perl coders, and is given the name
Schwartzian Transform.

It is interesting to note what compiler flaws can do to imperative
languages and its people. In Python, the language syntax is tainted. In
Perl, a complex construct is invented. In both camps, the basic
mathematics of sorting and its implementation aspects are completely
belied.

For the official doc of Perl's sort, type on the command line:
â€œperldoc -f sortâ€.
---------
this post is archived at:
http://xahlee.org/perl-python/sort_list.html

Xah

âˆ‘ http://xahlee.org/

Xah Lee, Oct 13, 2005
12. ### =?UTF-8?B?TGFzc2UgVsOlZ3PDpnRoZXIgS2FybHNlbg==?=Guest

Xah Lee wrote:
> Sorting in Perl
>
> In Perl, to sort a list, do like this:
>
> @li=(1,9,2,3);
> @li2 = sort {\$a <=> \$b} @li;
> print join(' ', @li2);
>
>
> In Perl, sort is a function, not some Object Oriented thing. It returns
> the sorted result as another list. This is very simple and nice.

Like the sorted function in Python ?

li2 = sorted(li)

you can also specify a key and a cmp function if you need to.

<snip>
> In this way, the cost of the internals of the ordering function is
> avoided. (it runs on each list element once) However, your huge list is
> copied 1 extra time. So, there are pros and cons. Because this work
> around is very complex in both its semantics and syntax, it has
> acquired a coolness factor among Perl coders, and is given the name
> Schwartzian Transform.

That's about the worst explanation I've seen of Scwartzian Transform
anywhere. Why deviate from what *every other source* does? Give an
example, like the typical one of sorting filenames based on the size of
the file they refer to. If you get the size in the comparison function,
you can potentially get the size (which is a rather costly operation)
many times for each file. Instead, you prefetch it once for each
filename and store that in the list, and use the size as the key for
sorting.

So while your explanation is technically correct, you lost some of the
explaining factors. In other words, using Google lets anyone wanting to
know about "Schwartzian Transform" find *much* better explanations than

But that's no surprise, you're Xah Lee anyway, loosing out is your
middle name.

BTW, this way of doing the sort is nothing special for Perl, that
construct can be seen many places simply because it does not fight any
of Perls "problems" with sorting, instead it overcomes a common problem
with sorting in any language.

>
> It is interesting to note what compiler flaws can do to imperative
> languages and its people. In Python, the language syntax is tainted. In

tainted how?

> Perl, a complex construct is invented. In both camps, the basic

invented how?

> mathematics of sorting and its implementation aspects are completely
> belied.

belied how?

It's interesting to note that these "fact posts" of yours are nothing

--
Lasse VÃ¥gsÃ¦ther Karlsen
http://usinglvkblog.blogspot.com/
mailto:
PGP KeyID: 0x2A42A1C2

=?UTF-8?B?TGFzc2UgVsOlZ3PDpnRoZXIgS2FybHNlbg==?=, Oct 13, 2005
13. ### Xah LeeGuest

Lasse VÃ¥gsÃ¦ther Karlsen wrote:

> Like the sorted function in Python ?
>
> li2 = sorted(li)
>
> you can also specify a key and a cmp function if you need to.

Thanks. I didn't know there's also a sort function in Python (2.4),
besides the method. (i've mentioned your name as acknowledgement at my
website essay)

The Python doc really should mention it at the place where the sort
method is documented.

Xah

âˆ‘ http://xahlee.org/

Xah Lee, Oct 13, 2005
14. ### Diez B. RoggischGuest

> In Perl, sort is a function, not some Object Oriented thing. It returns
> the sorted result as another list. This is very simple and nice.

And sometimes exteremely stupid - if your list is large, and making a
copy just form sorting it (when you don't have to keep a referenece to
the old list) is highly inefficient.

Calling this stupid tells us just a little bit more of your total lack
of programming fundamentals.

Not to mention that writing a helper function

def sorted(l):
l = l[:]
l.sort()
return l

to have sorted in versions below 2.4 isn't exactly "magic." But hey,
we're talking about Xah Lee, a guy who believes his hair could have
saved the world [1]....

Diez

[1] http://xahlee.org/PageTwo_dir/Personal_dir/mi_pixra.html

Diez B. Roggisch, Oct 13, 2005
15. ### =?UTF-8?B?TGFzc2UgVsOlZ3PDpnRoZXIgS2FybHNlbg==?=Guest

Diez B. Roggisch wrote:
<snip>

Oh man... Talk about ... bummer.

Seriously, who do we call to get someone with a straightjacket to show
up at his home?

--
Lasse VÃ¥gsÃ¦ther Karlsen
http://usinglvkblog.blogspot.com/
mailto:
PGP KeyID: 0x2A42A1C2

=?UTF-8?B?TGFzc2UgVsOlZ3PDpnRoZXIgS2FybHNlbg==?=, Oct 13, 2005
16. ### Marcin 'Qrczak' KowalczykGuest

Marcin 'Qrczak' Kowalczyk, Oct 13, 2005
17. ### Tim RobertsGuest

Marcin 'Qrczak' Kowalczyk <> wrote:

>Abdulaziz Ghuloum <> writes:
>
>> Python FAQs contain an entry to the schwartzian transform.
>>
>> http://www.python.org/doc/faq/progr...-can-you-do-a-schwartzian-transform-in-python

>
>This entry is obsolete: it should mention the 'key' option of the
>standard sort method.

It should mention it, but not necessarily recommend it.

I haven't run the numbers in Python, but in Perl, the undecorated sort is
so well-optimized that the Schwartzian transform is almost always faster
than passing a custom comparator to the sort function.
--
- Tim Roberts,
Providenza & Boekelheide, Inc.

Tim Roberts, Oct 15, 2005
18. ### Fredrik LundhGuest

Tim Roberts wrote:

> >This entry is obsolete: it should mention the 'key' option of the
> >standard sort method.

>
> It should mention it, but not necessarily recommend it.
>
> I haven't run the numbers in Python, but in Perl, the undecorated sort is
> so well-optimized that the Schwartzian transform is almost always faster
> than passing a custom comparator to the sort function.

the "key" option is used for decoration. from the documentation:

key specifies a function of one argument that is used to extract
a comparison key from each list element: "key=str.lower"

In general, the key and reverse conversion processes are
much faster than specifying an equivalent cmp function. This
is because cmp is called multiple times for each list element
while key and reverse touch each element only once.

</F>

Fredrik Lundh, Oct 15, 2005
19. ### Xah LeeGuest

Re: bizarro world (was Re: Python Doc Problem Example: sort() (reprise))

Bryan wrote:
> mr. xah... would you be willing to give a lecture at pycon 2006? i'm sure you
> would draw a huge crowd and a lot of people would like to meet you in person...
>
> thanks.

I'd be delight to.

My requirements are: 1 cup of fat-free milk, free, and free pizza.

Xah

âˆ‘ http://xahlee.org/

Xah Lee, Oct 17, 2005
20. ### Terry HancockGuest

Re: bizarro world (was Re: Python Doc Problem Example: sort()(reprise))

print """\
On Monday 17 October 2005 02:59 am, Xah Lee wrote:
>
> Bryan wrote:
> > mr. xah... would you be willing to give a lecture at pycon 2006? i'm sure you
> > would draw a huge crowd and a lot of people would like to meet you in person...
> >
> > thanks.

>
> I'd be delight to.
>
> My requirements are: 1 cup of fat-free milk, free, and free pizza.

""".replace('crowd', 'angry mob')

;-D
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com

Terry Hancock, Oct 17, 2005