reversed heapification?

S

Stefan Behnel

Hi!

I need a general-purpose best-k sorting algorithm and I'd like to use heapq
for that (heapify + heappop*k). The problem is that heapify and heappop do not
support the "reverse" keyword as known from sorted() and list.sort(). While
the decorate-sort-undecorate pattern allows me to replace the "key" option, I
do not see an obvious way to have heapq work in a reverse way without making
assumptions on the data.

Any ideas?

Stefan
 
K

Kent Johnson

Stefan said:
Hi!

I need a general-purpose best-k sorting algorithm and I'd like to use heapq
for that (heapify + heappop*k). The problem is that heapify and heappop
do not
support the "reverse" keyword as known from sorted() and list.sort(). While
the decorate-sort-undecorate pattern allows me to replace the "key"
option, I
do not see an obvious way to have heapq work in a reverse way without
making
assumptions on the data.

heapq.nlargest()
heapq.nsmallest()
?

Python 2.4 only

Kent
 
S

Stefan Behnel

Kent said:
heapq.nlargest()
heapq.nsmallest()
?

Python 2.4 only

Thanks!

Those are *very* well hidden in the documentation. Maybe I already read that
page too often...

Stefan
 
S

Stefan Behnel

Kent said:
heapq.nlargest()
heapq.nsmallest()

On second thought, that doesn't actually get me very far. I do not know in
advance how many I must select since I need to remove duplicates *after*
sorting (they are not necessarily 'duplicate' enough to fall into the same
sort bucket). What I'd like to do is heapify and then create an iterator for
the result. But since heapify doesn't support "reverse" ...

Any other ideas?

Stefan
 
J

Jeff Epler

Can you use something like (untested)
class ComparisonReverser:
def __init__(self, s): self.s = s
def __cmp__(self, o): return cmp(o, self.s)
def __lt__... # or whichever operation hashes use
then use (ComparisonReverser(f(x)), i, x) as the decorated item
instead of (f(x), i, x)

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFCLFhRJd01MZaTXX0RAuYZAKCsgFERoFijEbvsSHPFa9jByIXjVwCdEBT3
n9/eaVmz5jxW5Uo4uDmaG8E=
=35bo
-----END PGP SIGNATURE-----
 
K

Kent Johnson

Stefan said:
On second thought, that doesn't actually get me very far. I do not know
in advance how many I must select since I need to remove duplicates
*after* sorting (they are not necessarily 'duplicate' enough to fall
into the same sort bucket). What I'd like to do is heapify and then
create an iterator for the result. But since heapify doesn't support
"reverse" ...

Any other ideas?

Wrap your data in a class that defines __cmp__ as the inverse of __cmp__ on the underlying data,
then use heapq?
Just sort the list?

Kent
 
S

Stefan Behnel

Jeff said:
Can you use something like (untested)
class ComparisonReverser:
def __init__(self, s): self.s = s
def __cmp__(self, o): return cmp(o, self.s)
def __lt__... # or whichever operation hashes use
then use (ComparisonReverser(f(x)), i, x) as the decorated item
instead of (f(x), i, x)

Thanks!

That *was* untested ;). To avoid infinite recusion (and other bizarre errors),
you'd have to compare "o.s" and "self.s".

Heapq uses "<=" internally for all comparisons, so it's enough to implement
the __le__ method:

def __le__(self, other): return other.s <= self.s

I actually looked at the C-implementation of heapq in 2.4 and saw that it even
provides distinct implementations for min-heaps and max-heaps. It would be
sooooo convinient if both of them became part of the module...

Stefan
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,764
Messages
2,569,567
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top