common problem - elegant solution sought

H

Helmut Jarausch

Hi,

I'm looking for an elegant solution of the following tiny but common problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]
pos=-1
found=-1
for (Key,Date) in L :
pos+= 1
if Key == "b" :
found= pos
break

if found >= 0 :
del L[found]

print L

Most probably there are much more elegant solutions.
Unfortunately, the index-list-method doesn't take an
additional function argument for the comparisons.

Many thanks for your hints,

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 
C

cokofreedom

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]

Do they have to be tuples? Seems to me if you are using a Unique_ID
that a dictionary would be perfect. Indeed when you are looking to
delete an entry you can simply look for the Unique_ID in the
dictionary and it will be a very fast look up.

if key in "my_dictionary":

However, if you must use a list of tuples, your current method is very
inefficient. Why not use the del within the if Key == "b":?

I cannot provide extremely elegant solutions with your current system,
but I can tell you with a large enough list your way is going to take
a very long time since you will iterate over the whole list
sequentially to find an entry...
 
T

Tim Golden

Helmut said:
I'm looking for an elegant solution of the following tiny but common problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]
pos=-1
found=-1
for (Key,Date) in L :
pos+= 1
if Key == "b" :
found= pos
break

if found >= 0 :
del L[found]

print L

Most probably there are much more elegant solutions.
Unfortunately, the index-list-method doesn't take an
additional function argument for the comparisons.

Probably the most common solution to this in Python
is to produce a second list which has all the items
in the first except for the one(s) you wish to delete:

<code>
L=[("a","070501"),("b","080115"),("c","071231")]
L2 = [(uniqid, date) for (uniqid, date) in L if not uniqid == 'b']
</code>

It might look a little wasteful, but since Python lists
are supremely fast and since the tuples themselves aren't
copied, only their references, the result is probably what
you need.

Obviously you've given us a toy example, which is fine for
demonstrating the problem. Suggestions might vary if, for
example, your data set were much bigger or if the tuples
were more complex.

TJG
 
D

Diez B. Roggisch

Helmut said:
Hi,

I'm looking for an elegant solution of the following tiny but common
problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]
pos=-1
found=-1
for (Key,Date) in L :
pos+= 1
if Key == "b" :
found= pos
break

if found >= 0 :
del L[found]

print L

Most probably there are much more elegant solutions.
Unfortunately, the index-list-method doesn't take an
additional function argument for the comparisons.

Many thanks for your hints,

Several solutions:

- use a different datastructure, as others suggested

- use a database. SQLite for example.

- replace the list in-place like this:

L[:] = [(k, d) for k, d in L if k != "b"]

Diez
 
H

Helmut Jarausch

Thanks to you all for your help.

The solution to regenerate the list skipping
the one to be deleted is fine for me since my
lists are of moderate size and the operation
is infrequent.

Helmut.


--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 
B

Bruno Desthuilliers

Helmut Jarausch a écrit :
Hi,

I'm looking for an elegant solution of the following tiny but common
problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

If you don't care about ordering, you could use a dict:

L=[("a","070501"),("b","080115"),("c","071231")]
d = dict(L)
del d['b']
L = d.items()

But I guess you care about ordering !-)

Anyway, you can still use a dict to find out the date part:

L=[("a","070501"),("b","080115"),("c","071231")]
d = dict(L)
t = ('b', d['b'])
L.remove(t)

Or you could build an 'index' list to find out the appropriate index:
L=[("a","070501"),("b","080115"),("c","071231")]
index = [key for key, val in L]
pos = index.index('b')
del L[pos]
My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]
pos=-1
found=-1
for (Key,Date) in L :
pos+= 1
if Key == "b" :
found= pos
break

if found >= 0 :
del L[found]

print L

You could have used the enumerate(seq) function here. And since you're
breaking out once the element deleted, you could as well delete it from
within the loop:

for pos, (key, date) in enumerate(L):
if key == 'b':
del L[pos]
break


Benchmarking left as an exercice to the reader !-)

Also note that the "best" solution may depend on your real use case and
dataset.
 
S

Steven D'Aprano

Hi,

I'm looking for an elegant solution of the following tiny but common
problem.

I have a list of tuples (Unique_ID,Date) both of which are strings. I
want to delete the tuple (element) with a given Unique_ID, but I don't
known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")] pos=-1
found=-1
for (Key,Date) in L :
pos+= 1
if Key == "b" :
found= pos
break

if found >= 0 :
del L[found]

print L

Most probably there are much more elegant solutions. Unfortunately, the
index-list-method doesn't take an additional function argument for the
comparisons.


Your code mixes two distinct steps into one, and therefore becomes a
little messy. I suggest you change your problem to the two step problem
(1) find an item with the given key; and (2) delete it.

Here's a solution to the first part:

def find_by_key(L, key, where=0):
"""Search the 'where'th position of elements of L."""
for i, x in enumerate(L):
if x[where] == key:
return i
return -1 # or raise an exception


And the second:

pos = find_by_key(L, 'b')
if pos != -1:
del L[pos]

This too could be made into a function, if necessarily.



How else could we solve this problem?

class Tester(object):
def __init__(self, key, where=0):
self.key = key
self.where = where
def __eq__(self, other):
return other[self.where] == self.key


L = [("a", "070501"), ("b", "080115"), ("c", "071231")]
L.index(Tester("c"))


But beware, this technique might not generalize to arbitrary elements in
the list L. It should work if the elements are tuples, but may not work
if they are a custom class.
 
N

Neil Cerutti

Hi,

I'm looking for an elegant solution of the following tiny but common problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]

If the data is truly sorted by Unique_ID, then binary search may be
feasible (though not actually recommended for such small list).

import bisect

i = bisect.bisect_left(L, ('b', '000000'))
if i[0] == 'b':
del L
 
H

Helmut Jarausch

Neil said:
Hi,

I'm looking for an elegant solution of the following tiny but common problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

My straight forward solution is a bit lengthy, e.g.

L=[("a","070501"),("b","080115"),("c","071231")]

If the data is truly sorted by Unique_ID, then binary search may be
feasible (though not actually recommended for such small list).

import bisect

i = bisect.bisect_left(L, ('b', '000000'))
if i[0] == 'b':
del L


Thanks,
unfortunately, they are sorted on 'Date'
Helmut.


--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 
H

Helmut Jarausch

Again, many thanks to all who provide their solution.
I have timed these (though on my old P3(0.9GHz)) - see below
Helmut.

Helmut said:
Hi,

I'm looking for an elegant solution of the following tiny but common
problem.

I have a list of tuples (Unique_ID,Date) both of which are strings.
I want to delete the tuple (element) with a given Unique_ID, but
I don't known the corresponding Date.

#!/usr/bin/python

import random
import timeit

Lorg=[]

def genList(L) :
for f in range(ord('A'),ord('z')+1) :
for s in range(ord('A'),ord('z')+1) :
L.append((chr(f)+chr(s),str(random.randrange(0,1000000))))

genList(Lorg)
Times= 1000
T0= timeit.Timer('L=list(Lorg)','from __main__ import Lorg').timeit(Times)
print T0

SetUp1=r'''from __main__ import Lorg
def del_by_key(L,key) :
d= dict(L)
del d[key]
L[:]= d.items()
'''

SetUp2=r'''from __main__ import Lorg
def del_by_key(L,key) :
d= dict(L)
t= (key,d[key])
L.remove(t)
'''

SetUp3=r'''from __main__ import Lorg
def del_by_key(L,key) :
index= [k for k,val in L]
pos = index.index(key)
del L[pos]
'''

SetUp4=r'''from __main__ import Lorg
def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break
'''

SetUp5=r'''from __main__ import Lorg
def del_by_key(L,key) :
L[:]= [(k,d) for (k,d) in L if k !=key]
'''

SetUp6=r'''from __main__ import Lorg
class Tester(object) :
def __init__(self,key) :
self.key= key
def __eq__(self,other) :
return other[0] == self.key

def del_by_key(L,key) :
del L[L.index(Tester(key))]
'''

print '*** ready ***'


T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp1).timeit(Times)
print "Method 1 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp2).timeit(Times)
print "Method 2 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp3).timeit(Times)
print "Method 3 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp4).timeit(Times)
print "Method 4 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp5).timeit(Times)
print "Method 5 :",T-T0

T= timeit.Timer("L=list(Lorg);del_by_key(L,'Zz')",SetUp6).timeit(Times)
print "Method 6 :",T-T0

# Results on an old P3 (0.9 GHz)
# *** ready ***
# Method 1 : 10.9850928783
# Method 2 : 5.96455168724
# Method 3 : 3.97821164131
# Method 4 : 1.66151881218
# Method 5 : 8.90886187553
# Method 6 : 6.2503888607


The clear winner is

def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break


--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 
B

bearophileHUGS

Helmut Jarausch:
The clear winner is

def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break

If you use Psyco this is faster:

def del_by_key(L,key):
pos = 0
for pair in L:
if pair[0] == key :
del L[pos]
break
pos += 1

Bye,
bearophile
 
D

Dennis Lee Bieber

Thanks to you all for your help.

The solution to regenerate the list skipping
the one to be deleted is fine for me since my
lists are of moderate size and the operation
is infrequent.
If the list were maintained in sorted order, you could possibly
hijack the bisect module... In effect asking where you would insert a
tuple of (key, ""), and then just checking the list elements to either
side of this insert point for a key match.
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
P

Paul Rubin

Helmut Jarausch said:
def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break

This looks very dangerous, mutating L while iterating over it.
 
H

Helmut Jarausch

Paul said:
Helmut Jarausch said:
def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break

This looks very dangerous, mutating L while iterating over it.

No, as Bruno Desthuilliers has pointed out, because one
breaks out of the loop immediately.

Helmut.

--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 
B

bruno.desthuilliers

Helmut Jarausch said:
def del_by_key(L,key) :
for pos, (k,d) in enumerate(L):
if k == key :
del L[pos]
break

This looks very dangerous, mutating L while iterating over it.

This would indeed be dangerous *without* the break statement !-)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,482
Members
44,901
Latest member
Noble71S45

Latest Threads

Top