EXOR or symmetric difference for the Counter class

Discussion in 'Python' started by Paddy, Aug 12, 2010.

  1. Paddy

    Paddy Guest

    I find myself needing to calculate the difference between two Counters
    or multisets or bags.

    I want those items that are unique to each bag. I know how to
    calculate it:

    >>> b = Counter(a=1, b=2)
    >>> c = Counter(a=3, b=1)
    >>> diff = (b - c) + (c - b)
    >>> (b - c)

    Counter({'b': 1})
    >>> (c - b)

    Counter({'a': 2})
    >>> diff

    Counter({'a': 2, 'b': 1})

    But thought why doesn't this operation appear already as a method of
    the class?

    - Paddy.
     
    Paddy, Aug 12, 2010
    #1
    1. Advertising

  2. On Thu, 12 Aug 2010 13:20:19 -0700, Paddy wrote:

    > I find myself needing to calculate the difference between two Counters
    > or multisets or bags.


    Is this collections.Counter from Python 3.1? If so, you should say so,
    and if not, you should tell us which Counter class this is. It will save
    people (by which I mean *me*) from spending an unrewarding few minutes
    trying to import Counter in Python 2.5 and writing up a sarcastic
    response ... :)


    > I want those items that are unique to each bag. I know how to calculate
    > it:
    >
    > >>> b = Counter(a=1, b=2)
    > >>> c = Counter(a=3, b=1)
    > >>> diff = (b - c) + (c - b)
    > >>> (b - c)

    > Counter({'b': 1})
    > >>> (c - b)

    > Counter({'a': 2})
    > >>> diff

    > Counter({'a': 2, 'b': 1})
    >
    > But thought why doesn't this operation appear already as a method of the
    > class?


    Don't know. Perhaps you should put in a feature request.



    --
    Steven
     
    Steven D'Aprano, Aug 13, 2010
    #2
    1. Advertising

  3. Paddy

    Chris Rebert Guest

    On Thu, Aug 12, 2010 at 10:36 PM, Steven D'Aprano
    <> wrote:
    > On Thu, 12 Aug 2010 13:20:19 -0700, Paddy wrote:
    >
    >> I find myself needing to calculate the difference between two Counters
    >> or multisets or bags.

    >
    > Is this collections.Counter from Python 3.1?


    "Changed in version **2.7**: Added Counter and OrderedDict."

    That is still ahead of the curve though.

    Cheers,
    Chris
     
    Chris Rebert, Aug 13, 2010
    #3
  4. Paddy

    Paddy Guest

    On Aug 13, 6:36 am, Steven D'Aprano <st...@REMOVE-THIS-
    cybersource.com.au> wrote:
    > On Thu, 12 Aug 2010 13:20:19 -0700, Paddy wrote:
    > > I find myself needing to calculate the difference between two Counters
    > > or multisets or bags.

    >
    > Is this collections.Counter from Python 3.1? If so, you should say so,
    > and if not, you should tell us which Counter class this is. It will save
    > people (by which I mean *me*) from spending an unrewarding few minutes
    > trying to import Counter in Python 2.5 and writing up a sarcastic
    > response ... :)
    >
    > > I want those items that are unique to each bag. I know how to calculate
    > > it:

    >
    > >     >>> b = Counter(a=1, b=2)
    > >     >>> c = Counter(a=3, b=1)
    > >     >>> diff = (b - c) + (c - b)
    > >     >>> (b - c)
    > >     Counter({'b': 1})
    > >     >>> (c - b)
    > >     Counter({'a': 2})
    > >     >>> diff
    > >     Counter({'a': 2, 'b': 1})

    >
    > > But thought why doesn't this operation appear already as a method of the
    > > class?

    >
    > Don't know. Perhaps you should put in a feature request.
    >
    > --
    > Steven


    Yes it is Counter in both 3.1 and 2.7 (And somewhere on Activestate
    too).

    Before I put in a feature request, I wanted to know if other have seen
    the need.
     
    Paddy, Aug 13, 2010
    #4
  5. On Aug 12, 1:20 pm, Paddy <> wrote:
    > I find myself needing to calculate the difference between two Counters
    > or multisets or bags.
    >
    > I want those items that are unique to each bag.


    Tell us about your use cases. I'm curious how a program would ascribe
    semantic meaning to the result. The phrase "unique to each bag"
    doesn't quite cover it, perhaps something like "number in either
    source above the minimum held in common".

    AFAICT, I've never needed something like this as a primitive. Even
    the xor operation for regular sets is rarely used.


    > I know how to
    > calculate it:
    >
    >     >>> b = Counter(a=1, b=2)
    >     >>> c = Counter(a=3, b=1)
    >     >>> diff = (b - c) + (c - b)
    > >>> diff

    > Counter({'a': 2, 'b': 1})


    That seems simple enough.
    You could also use:

    diff = (b | c) - (b & c) # max(b,c) - min(b,c)


    Raymond
     
    Raymond Hettinger, Aug 14, 2010
    #5
  6. Paddy

    Paddy Guest

    On 14 Aug, 18:14, Raymond Hettinger <> wrote:
    > On Aug 12, 1:20 pm, Paddy <> wrote:
    >
    > > I find myself needing to calculate the difference between two Counters
    > > or multisets or bags.

    >
    > > I want those items that are unique to each bag.

    >
    > Tell us about your use cases.  I'm curious how a program would ascribe
    > semantic meaning to the result.  The phrase "unique to each bag"
    > doesn't quite cover it, perhaps something like "number in either
    > source above the minimum held in common".
    >
    > AFAICT, I've never needed something like this as a primitive.  Even
    > the xor operation for regular sets is rarely used.
    >
    > > I know how to
    > > calculate it:

    >
    > >     >>> b = Counter(a=1, b=2)
    > >     >>> c = Counter(a=3, b=1)
    > >     >>> diff = (b - c) + (c - b)
    > >     >>> diff
    > >     Counter({'a': 2, 'b': 1})

    >
    > That seems simple enough.
    > You could also use:
    >
    >  diff = (b | c) - (b & c)   # max(b,c) - min(b,c)
    >
    > Raymond


    Hi Raymond and others,

    Lets say you have two *sets* of integers representing two near-copies
    of some system, then a measure of their difference could be calculated
    as:

    len(X.symmetric_difference(Y)) / (len(X) + len(Y)) * 100 %

    If the two collections of integers are allowed duplicates then you
    need a Counter/bag/multi-set type and the diff calculation I gave
    originally.

    Thanks.
     
    Paddy, Aug 16, 2010
    #6
  7. [Paddy]
    > Lets say you have two *sets* of integers representing two near-copies
    > of some system, then a measure of their difference could be calculated
    > as:
    >
    > len(X.symmetric_difference(Y)) / (len(X) + len(Y)) * 100 %
    >
    > If the two collections of integers are allowed duplicates then you
    > need a Counter/bag/multi-set type and the diff calculation I gave
    > originally.


    Thanks for sharing your use case.

    It's unlikely that I will add this method to the Counter API because
    the rarity of use case does not warrant the added API complexity.
    IMO, adding a method like this makes the class harder to learn,
    understand and remember. It doesn't seem like much of a win over
    using the existing alternatives:

    * (b - c) + (c - b)
    * (b | c) - (b & c)
    * DIY using the two counters as simple dicts
    * writing a subclass providing additional binary operations

    I would like to see someone post a subclass to the ASPN Cookbook that
    adds a number of interesting, though not common operations. Your
    symmetric_difference() method could be one. A dot_product() operation
    could be another. Elementwise arithmetic is another option (we
    already have add and subtract, but could possibly use multiply,
    divide, etc). Scaling operations are another possibility (multiple
    all elements by five, for example).

    The Counter() class has low aspirations. It is a dictionary that
    fills-in missing values with zero and is augmented by a handful of
    basic methods for managing the counts.


    Raymond
     
    Raymond Hettinger, Aug 17, 2010
    #7
  8. Paddy

    Paddy Guest

    On 17 Aug, 02:29, Raymond Hettinger <> wrote:
    > [Paddy]
    >
    > > Lets say you have two *sets* of integers representing two near-copies
    > > of some system, then a measure of their difference could be calculated
    > > as:

    >
    > > len(X.symmetric_difference(Y)) / (len(X) + len(Y)) * 100 %

    >
    > > If the two collections of integers are allowed duplicates then you
    > > need a Counter/bag/multi-set type and the diff calculation I gave
    > > originally.

    >
    > Thanks for sharing your use case.
    >
    > It's unlikely that I will add this method to the Counter API because
    > the rarity of use case does not warrant the added API complexity.
    > IMO, adding a method like this makes the class harder to learn,
    > understand and remember.  It doesn't seem like much of a win over
    > using the existing alternatives:
    >
    >  * (b - c) + (c - b)
    >  * (b | c) - (b & c)
    >  * DIY using the two counters as simple dicts
    >  * writing a subclass providing additional binary operations
    >
    > I would like to see someone post a subclass to the ASPN Cookbook that
    > adds a number of interesting, though not common operations.  Your
    > symmetric_difference() method could be one.  A dot_product() operation
    > could be another.  Elementwise arithmetic is another option (we
    > already have add and subtract, but could possibly use multiply,
    > divide, etc).  Scaling operations are another possibility (multiple
    > all elements by five, for example).
    >
    > The Counter() class has low aspirations.  It is a dictionary that
    > fills-in missing values with zero and is augmented by a handful of
    > basic methods for managing the counts.
    >
    > Raymond


    I created this that could be an addition to the bottom of the Python 3
    collections.Counter class definition:

    def __xor__(self, other):
    ''' symmetric difference: Subtract count, but keep only abs
    results with non-zero counts.

    >>> Counter('abbbc') ^ Counter('bccd')

    Counter({'b': 2, 'a': 1, 'c': 1, 'd': 1})
    >>> a, b = Counter('abbbc'), Counter('bccd')
    >>> (a-b) + (b - a) == a ^ b

    True

    '''
    if not isinstance(other, Counter):
    return NotImplemented
    result = Counter()
    for elem in set(self) | set(other):
    newcount = self[elem] - other[elem]
    if newcount != 0:
    result[elem] = newcount if newcount > 0 else -newcount
    return result

    - Paddy.
     
    Paddy, Aug 17, 2010
    #8
  9. Paddy

    Paddy Guest

    On Aug 17, 10:47 pm, Paddy <> wrote:
    > On 17 Aug, 02:29, Raymond Hettinger <> wrote:
    >
    >
    >
    > > [Paddy]

    >
    > > > Lets say you have two *sets* of integers representing two near-copies
    > > > of some system, then a measure of their difference could be calculated
    > > > as:

    >
    > > > len(X.symmetric_difference(Y)) / (len(X) + len(Y)) * 100 %

    >
    > > > If the two collections of integers are allowed duplicates then you
    > > > need a Counter/bag/multi-set type and the diff calculation I gave
    > > > originally.

    >
    > > Thanks for sharing your use case.

    >
    > > It's unlikely that I will add this method to the Counter API because
    > > the rarity of use case does not warrant the added API complexity.
    > > IMO, adding a method like this makes the class harder to learn,
    > > understand and remember.  It doesn't seem like much of a win over
    > > using the existing alternatives:

    >
    > >  * (b - c) + (c - b)
    > >  * (b | c) - (b & c)
    > >  * DIY using the two counters as simple dicts
    > >  * writing a subclass providing additional binary operations

    >
    > > I would like to see someone post a subclass to the ASPN Cookbook that
    > > adds a number of interesting, though not common operations.  Your
    > > symmetric_difference() method could be one.  A dot_product() operation
    > > could be another.  Elementwise arithmetic is another option (we
    > > already have add and subtract, but could possibly use multiply,
    > > divide, etc).  Scaling operations are another possibility (multiple
    > > all elements by five, for example).

    >
    > > The Counter() class has low aspirations.  It is a dictionary that
    > > fills-in missing values with zero and is augmented by a handful of
    > > basic methods for managing the counts.

    >
    > > Raymond

    >
    > I created this that could be an addition to the bottom of the Python 3
    > collections.Counter class definition:
    >
    >     def __xor__(self, other):
    >         ''' symmetric difference: Subtract count, but keep only abs
    > results with non-zero counts.
    >
    >         >>> Counter('abbbc') ^ Counter('bccd')
    >         Counter({'b': 2, 'a': 1, 'c': 1, 'd': 1})
    >         >>> a, b = Counter('abbbc'), Counter('bccd')
    >         >>> (a-b) + (b - a) == a ^ b
    >         True
    >
    >         '''
    >         if not isinstance(other, Counter):
    >             return NotImplemented
    >         result = Counter()
    >         for elem in set(self) | set(other):
    >             newcount = self[elem] - other[elem]
    >             if newcount != 0:
    >                 result[elem] = newcount if newcount > 0 else -newcount
    >         return result
    >
    > - Paddy.


    And heres the cartesian product/multiply:

    def __mul__(self, other):
    '''Multiply counts by an integer; or cartesioan product
    of two counters.

    >>> Counter('abbb') * 3

    Counter({'b': 9, 'a': 3})
    >>> Counter('12') * Counter('21')

    Counter({('2', '1'): 1, ('1', '2'): 1, ('1', '1'): 1, ('2',
    '2'): 1})
    >>> Counter('122') * Counter('211')

    Counter({('2', '1'): 4, ('1', '1'): 2, ('2', '2'): 2, ('1',
    '2'): 1})
    '''
    if isinstance(other, int):
    return Counter(**dict((k, v*other)
    for k,v in self.items()))
    elif isinstance(other, Counter):
    return Counter( (x, y)
    for x in self.elements()
    for y in other.elements() )
    else:
    return NotImplemented

    (Although I don't have a use case for this one).

    - Paddy.
     
    Paddy, Aug 17, 2010
    #9
  10. Paddy

    Paddy Guest

    On Aug 17, 2:29 am, Raymond Hettinger <> wrote:

    > I would like to see someone post a subclass to the ASPN Cookbook that
    > adds a number of interesting, though not common operations.  Your
    > symmetric_difference() method could be one.  A dot_product() operation
    > could be another.  Elementwise arithmetic is another option (we
    > already have add and subtract, but could possibly use multiply,
    > divide, etc).  Scaling operations are another possibility (multiple
    > all elements by five, for example).
    >


    >
    > Raymond


    Sample code is at http://code.activestate.com/recipes/577362-extension-to-python-3-counter-class/
     
    Paddy, Aug 17, 2010
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Nick Timkovich
    Replies:
    0
    Views:
    84
    Nick Timkovich
    Feb 25, 2014
  2. Skip Montanaro
    Replies:
    0
    Views:
    84
    Skip Montanaro
    Feb 25, 2014
  3. Peter Otten
    Replies:
    1
    Views:
    103
    Mark Lawrence
    Feb 25, 2014
  4. Peter Otten
    Replies:
    0
    Views:
    75
    Peter Otten
    Feb 25, 2014
  5. Tim Chase
    Replies:
    5
    Views:
    78
    Tim Chase
    Feb 26, 2014
Loading...

Share This Page