I'd like to be able to compare set 1 with set 2 and have it match
filename1 and filename3, or compare set 1 with 3 and get back
filename1 and filename2, etc.
Is there a way for me to do this inside the compare function, rather
than having to make duplicate copies of each set?
Is there a will?
Inevitably there is a way! Whether you should take it is another
question entirely.
Assuming by 'compare' function you mean such methods as 'difference',
'symmetric_difference', 'intersection' and the like... here's a nasty
little hack (using the old-school Set from sets.py). It's not to spec
(you get the tails back in the result, but that's easily fixed), and
it only implements a replacement method for 'difference' (called
'tailess_difference').
I apologise if the google groups mailer kludges the indentation ...
-----
from sets import Set  # Python 2 only; the sets module is gone in Python 3
from itertools import ifilterfalse
from os.path import splitext

class BodgySet(Set):

    def tailess_difference(self, other):
        """Return, as a new BodgySet, the difference of two
        sets, where element identity ignores all characters
        from the last stop (period).

        NOTE: As currently implemented all elements of said
        sets must be strings (fix this in self.has_key)!!!
        """
        assert other.__class__ is self.__class__
        result = self.__class__()
        data = result._data
        value = True
        # Keep each element of self whose head has no match in other.
        for elt in ifilterfalse(other.has_key, self):
            data[elt] = value
        return result

    def has_key(self, target):
        # True if any stored key shares target's head (name minus tail).
        thead, ttail = splitext(target)
        for key in self._data.keys():
            khead, ktail = splitext(key)
            if thead == khead:
                return True
        return False
-----
Using this hacked set:
>>> a = BodgySet(['a1.txt', 'a2.txt'])
>>> b = BodgySet(['a1.xml', 'a2.xml', 'a3.xml'])
>>> b.tailess_difference(a)
BodgySet(['a3.xml'])
Is that the kind of thing you had in mind?
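For anyone without sets.py to hand, roughly the same tail-insensitive difference can be sketched with the builtin set and no subclassing. The function name tailless_difference is made up for illustration. It works out the heads of the other set once up front, so each membership test is a hash lookup rather than a scan over every key. Like the hack above, it hands back the full names, tails included.

```python
from os.path import splitext

def tailless_difference(these, others):
    """Return the names in `these` whose head (name minus extension)
    does not occur in `others`. Hypothetical helper, not a library call."""
    # Heads of the other set, computed once.
    other_heads = set(splitext(name)[0] for name in others)
    return set(name for name in these
               if splitext(name)[0] not in other_heads)

a = set(['a1.txt', 'a2.txt'])
b = set(['a1.xml', 'a2.xml', 'a3.xml'])
print(sorted(tailless_difference(b, a)))  # ['a3.xml']
```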
While it can be done, I would prefer to make copies of the sets, built
from a list comprehension, something like: set([os.path.splitext(x)[0]
for x in orig_set]). Much better readability and probably greater
efficiency (I haven't bothered timing or disassembling it, mind).
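One catch with the plain copy is that the results come back without their extensions. A small variation, sketched here under the assumption that heads are unique within each set, keeps a dict from head to full filename so the original names can be looked up after the comparison. The helper name by_stem and the filenames are invented for the example.

```python
import os.path

def by_stem(names):
    """Map each head (name minus extension) to its full filename.
    Assumes no two names in `names` share a head."""
    return dict((os.path.splitext(n)[0], n) for n in names)

set1 = by_stem(['report.txt', 'notes.txt', 'todo.txt'])
set2 = by_stem(['report.xml', 'todo.xml'])

# Compare on heads only, then recover the full names from set1.
common = set(set1) & set(set2)
only_in_1 = set(set1) - set(set2)

print(sorted(set1[s] for s in common))     # ['report.txt', 'todo.txt']
print(sorted(set1[s] for s in only_in_1))  # ['notes.txt']
```

The comparison itself is still an ordinary set operation on the heads; the dicts only exist to translate the answer back into filenames.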