heapq.merge with key=

K

Kevin D. Smith

I need the behavior of heapq.merge to merge a bunch of results from a
database. I was doing this with sorted(itertools.chain(...), key=...),
but I would prefer to do this with generators. My issue is that I need
the key= argument to sort on the correct field in the database.
heapq.merge doesn't have this argument and I don't understand the code
enough to know if it's possible to add it. Is this enhancement
possible without drastically changing the current code?
 
C

Chris Rebert

I need the behavior of heapq.merge to merge a bunch of results from a
database.  I was doing this with sorted(itertools.chain(...), key=....), but
I would prefer to do this with generators.  My issue is that I need the key=
argument to sort on the correct field in the database.  heapq.merge doesn't
have this argument and I don't understand the code enough to know if it's
possible to add it.  Is this enhancement possible without drastically
changing the current code?

I think so. Completely untested code:

def key(ob):
#code here

class Keyed(object):
def __init__(self, obj):
self.obj = obj
def __cmp__(self, other):
return cmp(key(self.obj), key(other.obj))

def keyify(gen):
for item in gen:
yield Keyed(item)

def stripify(gen):
for keyed in gen:
yield keyed.obj

merged = stripify(merge(keyify(A), keyify(B), keyify(C))) #A,B,C being
the iterables


Cheers,
Chris
 
K

Kevin D. Smith

I think so. Completely untested code:

def key(ob):
#code here

class Keyed(object):
def __init__(self, obj):
self.obj = obj
def __cmp__(self, other):
return cmp(key(self.obj), key(other.obj))

def keyify(gen):
for item in gen:
yield Keyed(item)

def stripify(gen):
for keyed in gen:
yield keyed.obj

merged = stripify(merge(keyify(A), keyify(B), keyify(C))) #A,B,C being
the iterables

Ah, that's not a bad idea. I think it could be simplified by letting
Python do the comparison work as follows (also untested).

def keyify(gen, key=lamda x:x):
for item in gen:
yield (key(item), item)

def stripify(gen):
for item in gen:
yield item[1]

After looking at the heapq.merge code, it seems like something like
this could easily be added to that code. If the next() calls were
wrapped with the tuple creating code above and the yield simply
returned the item. It would, of course, have to assume that the
iterables were sorted using the same key, but that's better than not
having the key option at all.
 
P

Peter Otten

Kevin said:
I think so. Completely untested code:

def key(ob):
#code here

class Keyed(object):
def __init__(self, obj):
self.obj = obj
def __cmp__(self, other):
return cmp(key(self.obj), key(other.obj))

def keyify(gen):
for item in gen:
yield Keyed(item)

def stripify(gen):
for keyed in gen:
yield keyed.obj

merged = stripify(merge(keyify(A), keyify(B), keyify(C))) #A,B,C being
the iterables

Ah, that's not a bad idea. I think it could be simplified by letting
Python do the comparison work as follows (also untested).

def keyify(gen, key=lamda x:x):
for item in gen:
yield (key(item), item)

def stripify(gen):
for item in gen:
yield item[1]

This is not as general. It makes sorting unstable and can even fail if the
items are unorderable:
.... for item in gen:
.... yield key(item), item
........ for item in gen:
.... yield item[-1]
....
sorted(decorate([1j, -1j, 1+1j], key=lambda c: c.real))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

One way to fix it:
.... for index, item in enumerate(gen):
.... yield key(item), index, item
....
sorted(decorate([1j, -1j, 1+1j], key=lambda c: c.real)) [(0.0, 0, 1j), (0.0, 1, -1j), (1.0, 2, (1+1j))]
list(undecorate(_))
[1j, -1j, (1+1j)]

Peter
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,169
Latest member
ArturoOlne
Top