Sorting troubles

S

seyensubs

I have the following implementations of quicksort and insertion sort:

def qSort(List):
if List == []: return []
return qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
qSort([x for x in List[1:] if x>=List[0]])

def insertSort(List):
for i in range(1,len(List)):
value=List
j=i
while List[j-1]>value and j>0:
List[j]=List[j-1]
j-=1
List[j]=value

Now, the quickSort does not modify the original list, mostly because
it works on products and concatenations, rather than alterations.
The insertion sort, however, does modify the list. Now, to give
results, they should be called in such a way( A is an unsorted list)
A=qSort(A)
# the insertion sort does not require this,
insertSort(A)

I would like to know how can I modify the qSort function such that it
affects the list directly inside
I have tried doing it like this

def qSort(List):
if List == []: return []
List = qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
qSort([x for x in List[1:] if x>=List[0]])
return List

while processing, the list changes, but those changes remain inside
the function, so that once it's over, if nothis catches the return,
the original List remains unchanged.

If not a solution, I would like to at least know why it does what it
does. I so understand that List(above) is only a reference to the
actual data(a list), but I'm not passing a copy of the data to the
function, but the actual reference(hence, insertSort does
modifications). But then how can I change, inside the function, the
object List is referencing to? If I can't modify the original list,
maybe I can make the variable List point to another list? But changes
inside the function are local. Sorry if this is a bit confusing.
 
T

Thomas Nelson

I have the following implementations of quicksort and insertion sort:

def qSort(List):
if List == []: return []
return qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
qSort([x for x in List[1:] if x>=List[0]])

def insertSort(List):
for i in range(1,len(List)):
value=List
j=i
while List[j-1]>value and j>0:
List[j]=List[j-1]
j-=1
List[j]=value

Now, the quickSort does not modify the original list, mostly because
it works on products and concatenations, rather than alterations.
The insertion sort, however, does modify the list. Now, to give
results, they should be called in such a way( A is an unsorted list)
A=qSort(A)
# the insertion sort does not require this,
insertSort(A)

I would like to know how can I modify the qSort function such that it
affects the list directly inside
I have tried doing it like this

def qSort(List):
if List == []: return []
List = qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
qSort([x for x in List[1:] if x>=List[0]])
return List

while processing, the list changes, but those changes remain inside
the function, so that once it's over, if nothis catches the return,
the original List remains unchanged.

If not a solution, I would like to at least know why it does what it
does. I so understand that List(above) is only a reference to the
actual data(a list), but I'm not passing a copy of the data to the
function, but the actual reference(hence, insertSort does
modifications). But then how can I change, inside the function, the
object List is referencing to? If I can't modify the original list,
maybe I can make the variable List point to another list? But changes
inside the function are local. Sorry if this is a bit confusing.


The thing is that [x for x in List[1:]...] is a brand new list created
by iterating over the old one.
How about:
qSortHelp(List):
newlist = qSort(List)
for i, val in enumerate(newlist):
List = val
You have to iterate over one more time, but this sorts the list in
place.
HTH,
Tom
 
N

Nick Vatamaniuc

I have the following implementations of quicksort and insertion sort:

def qSort(List):
if List == []: return []
return qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
qSort([x for x in List[1:] if x>=List[0]])

def insertSort(List):
for i in range(1,len(List)):
value=List
j=i
while List[j-1]>value and j>0:
List[j]=List[j-1]
j-=1
List[j]=value

Now, the quickSort does not modify the original list, mostly because
it works on products and concatenations, rather than alterations.
The insertion sort, however, does modify the list. Now, to give
results, they should be called in such a way( A is an unsorted list)
A=qSort(A)
# the insertion sort does not require this,
insertSort(A)

I would like to know how can I modify the qSort function such that it
affects the list directly inside
I have tried doing it like this

def qSort(List):
if List == []: return []
List = qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
qSort([x for x in List[1:] if x>=List[0]])
return List

while processing, the list changes, but those changes remain inside
the function, so that once it's over, if nothis catches the return,
the original List remains unchanged.

If not a solution, I would like to at least know why it does what it
does. I so understand that List(above) is only a reference to the
actual data(a list), but I'm not passing a copy of the data to the
function, but the actual reference(hence, insertSort does
modifications). But then how can I change, inside the function, the
object List is referencing to? If I can't modify the original list,
maybe I can make the variable List point to another list? But changes
inside the function are local. Sorry if this is a bit confusing.


It does what it does because in the return statement when you
concatenate qsort(...x<..)+List[0:1]+qsort(...x>=..) you create a new
list. In the insertion sort you modify the values of the list directly
by doing List[j]=List[j-1] or List[j]=value.

If you just have to have the list modified in place, create another
wrapper function that calls your qsort and then will copy all data
from the result into the original list and you are done. Something
like:
def qsort_in_place(L):
sortedL=qsort(L)
for (i,x) in enumerate(sortedL):
L=x

Cheers,
-Nick Vatamaniuc
 
S

seyensubs

I see. I figured that list comprehensions made another list(duh), but
I thought I could relink the object(List) to the new list and keep it
once the function ended.

Is it possible to pass a reference(to an object.. Like 'List',
basically) to a function and change the reference to point to
something created inside a function? Or all data unreturned from a
function is lost and no references kept?(The second, I'd guess, since
it's local temporary scope, but you never know, maybe Python can :) )
 
T

Terry Reedy

|I have the following implementations of quicksort and insertion sort:
|
| def qSort(List):
| if List == []: return []
| return qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
| qSort([x for x in List[1:] if x>=List[0]])
|
| def insertSort(List):
| for i in range(1,len(List)):
| value=List
| j=i
| while List[j-1]>value and j>0:
| List[j]=List[j-1]
| j-=1
| List[j]=value
|
| Now, the quickSort does not modify the original list, mostly because
| it works on products and concatenations, rather than alterations.
| The insertion sort, however, does modify the list. Now, to give
| results, they should be called in such a way( A is an unsorted list)
| A=qSort(A)
| # the insertion sort does not require this,
| insertSort(A)
|
| I would like to know how can I modify the qSort function such that it
| affects the list directly inside
| I have tried doing it like this
|
| def qSort(List):
| if List == []: return []
| List = qSort([x for x in List[1:] if x< List[0]]) + List[0:1] + \
| qSort([x for x in List[1:] if x>=List[0]])
| return List
|
| while processing, the list changes, but those changes remain inside
| the function, so that once it's over, if nothis catches the return,
| the original List remains unchanged.
|
| If not a solution, I would like to at least know why it does what it
| does. I so understand that List(above) is only a reference to the
| actual data(a list), but I'm not passing a copy of the data to the
| function, but the actual reference(hence, insertSort does
| modifications). But then how can I change, inside the function, the
| object List is referencing to? If I can't modify the original list,
| maybe I can make the variable List point to another list? But changes
| inside the function are local. Sorry if this is a bit confusing.

The traditional way to do qsort in place (ignoring possible variations):

def qsort(List, start=0, stop=None):
if start >= stop: return
if stop == None: stop = len(List)
p = partition(List, start, stop) # p = index of pivot/partition item
qsort(List, start, p)
qsort(List, p+1, stop)

where partition puts elements less that pivot value before the pivot value
and greater values after.

You could instead call your function qSorted to indicate that it returns a
sorted copy ;-)

Terry Jan Reedy
 
A

Anton Vredegoor

I see. I figured that list comprehensions made another list(duh), but
I thought I could relink the object(List) to the new list and keep it
once the function ended.

Is it possible to pass a reference(to an object.. Like 'List',
basically) to a function and change the reference to point to
something created inside a function? Or all data unreturned from a
function is lost and no references kept?(The second, I'd guess, since
it's local temporary scope, but you never know, maybe Python can :) )

Maybe this (untested):

def qsort(seq):
if seq:
pivot = seq.pop()
Q = L,R = [],[]
for x in seq:
Q[x>=pivot].append(x)
qsort(R)
seq[:] = L+[pivot]+R

A.
 
S

Steven D'Aprano

The thing is that [x for x in List[1:]...] is a brand new list created
by iterating over the old one.
How about:
qSortHelp(List):
newlist = qSort(List)
for i, val in enumerate(newlist):
List = val
You have to iterate over one more time, but this sorts the list in
place.


A better way of spelling that would be:

def qSortHelp(List):
List[:] = qSort(List)
return List


but the helper function is redundant, as it is easy enough to make the
qSort function behave the same way. We can also make the code a smidgen
more concise by reversing the sense of the test at the start of the code:


def qSort(List):
if List:
List[:] = qSort([x for x in List[1:] if x< List[0]]) \
+ List[0:1] + qSort([x for x in List[1:] if x>=List[0]])
return List
 
S

seyensubs

The thing is that [x for x in List[1:]...] is a brand new list created
by iterating over the old one.
How about:
qSortHelp(List):
newlist = qSort(List)
for i, val in enumerate(newlist):
List = val
You have to iterate over one more time, but this sorts the list in
place.


A better way of spelling that would be:

def qSortHelp(List):
List[:] = qSort(List)
return List

but the helper function is redundant, as it is easy enough to make the
qSort function behave the same way. We can also make the code a smidgen
more concise by reversing the sense of the test at the start of the code:

def qSort(List):
if List:
List[:] = qSort([x for x in List[1:] if x< List[0]]) \
+ List[0:1] + qSort([x for x in List[1:] if x>=List[0]])
return List


Ah, I see, just slicing it like that.. nice!
But after doing some timing tests, the version that's in place and
using partitions is about twice faster than the non hybrid qSort.
The hybrid one, with insertionsSort used for smaller lists works
faster, but in a weird way. When given lists of 2000, the best bet to
is to set the threshold to 14, but when given a list of 40000, 14 is
slow, but a threshold of 200(less or more is slower, again) makes it
about 8 times faster than a normal qSort, and 4 times faster than an
in-place qSort, using a self -defined partitioning alg.

Making a hybrid out of the in-place partitioned qSort makes it a bit
faster, but not by much compared to the other hybrid which uses list
comprehensions.

Teach said that the optimal threshold in hybrids is 14-16, but guess
he wasn't so right after all =\\ The overhead of using insertion sort
on a longer list turns out to be faster than just piling on
recursions, when confronted with bigger lists.

I should probably try and make it iterative now, see how it goes.
Gonna need a stack though, I think.

Thanks for all the answers up to now! :)
 
T

Terry Reedy

| Teach said that the optimal threshold in hybrids is 14-16, but guess
| he wasn't so right after all =\\ The overhead of using insertion sort
| on a longer list turns out to be faster than just piling on
| recursions, when confronted with bigger lists.

The current list.sort (is C, of course, not Python) is a hybrid
insert/merge sort with a threshhold, last I knew, of 64. I believe there
are explanatory comments in the source.

tjr
 
S

Steven D'Aprano

Ah, I see, just slicing it like that.. nice! But after doing some timing
tests, the version that's in place and using partitions is about twice
faster than the non hybrid qSort. The hybrid one, with insertionsSort
used for smaller lists works faster, but in a weird way. When given
lists of 2000, the best bet to is to set the threshold to 14, but when
given a list of 40000, 14 is slow, but a threshold of 200(less or more
is slower, again) makes it about 8 times faster than a normal qSort, and
4 times faster than an in-place qSort, using a self -defined
partitioning alg.

Making a hybrid out of the in-place partitioned qSort makes it a bit
faster, but not by much compared to the other hybrid which uses list
comprehensions.

Teach said that the optimal threshold in hybrids is 14-16, but guess he
wasn't so right after all =\\ The overhead of using insertion sort on a
longer list turns out to be faster than just piling on recursions, when
confronted with bigger lists.

Teach may have been thinking of languages where comparing items is fast
and moving data is slow; Python is the opposite, comparisons invoke a
whole bucket-full of object-oriented mechanism, while moving data just
means moving pointers.

It needs to be said, just in case... this is a good learning exercise,
but don't use this in real code. You aren't going to get within a bull's
roar of the performance of the built-in sort method. Tim Peter's
"timsort" is amazingly powerful; you can read about it here:

http://pythonowns.blogspot.com/2002_07_28_pythonowns_archive.html#79780508

http://svn.python.org/projects/python/trunk/Objects/listsort.txt
 
A

Aether.Singularity

Teach may have been thinking of languages where comparing items is fast
and moving data is slow; Python is the opposite, comparisons invoke a
whole bucket-full of object-oriented mechanism, while moving data just
means moving pointers.

It needs to be said, just in case... this is a good learning exercise,
but don't use this in real code. You aren't going to get within a bull's
roar of the performance of the built-in sort method. Tim Peter's
"timsort" is amazingly powerful; you can read about it here:

http://pythonowns.blogspot.com/2002_07_28_pythonowns_archive.html#797...

http://svn.python.org/projects/python/trunk/Objects/listsort.txt

Yeah, I know any stuff your average CS student will crank out is way
below anything in-built in the core language and libraries. After all,
smart, experienced people worked on those a lot more than me on
these :)

The blog post is dead, but the .txt is alive(it's the svn of
python.org, after all). Gonna read it but looks a bit beyond my level
of comprehension, for now anyway.

Thanks for all the answers! ..oh, got a 10(out of 10) points in class
for the program. Writing stuff in a language no one else learns at the
university and doing 5 versions of quickSort to test performance
must've paid off. Thanks again :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,175
Latest member
Vinay Kumar_ Nevatia
Top