Request for useful functions on dicts

Discussion in 'Python' started by Leif, Jul 15, 2012.

  1. Leif

    Leif Guest

    Hi, everybody. I am trying to collect all the functions I've found useful for working with dicts into a library:

    https://github.com/leifp/dictutil

    If you have a favorite dict-related func / class, or know of similar projects, please let me know (or open an issue on github). Bear in mind that the functions don't even have to come from python; if you have a favorite PHP /APL / COBOL / etc associative array function, that's fair game.

    Thanks,
    Leif

    P. S. I'm participating in Julython [ http://www.julython.org/ ], and it'sfun. You should all join and show everyone that your city has the most open source python activity.
     
    Leif, Jul 15, 2012
    #1
    1. Advertising

  2. Leif

    Ian Kelly Guest

    On Sat, Jul 14, 2012 at 5:56 PM, Leif <> wrote:
    > Hi, everybody. I am trying to collect all the functions I've found usefulfor working with dicts into a library:
    >
    > https://github.com/leifp/dictutil
    >
    > If you have a favorite dict-related func / class, or know of similar projects, please let me know (or open an issue on github). Bear in mind that the functions don't even have to come from python; if you have a favorite PHP/ APL / COBOL / etc associative array function, that's fair game.


    Nothing in particular comes to mind, but I'll be happy to critique
    what you've already got.

    One thing that bothers me a bit is that get_in will accept any
    iterable, but set_in and update_in can only accept a sequence in order
    to do the slicing. Here's an alternate implementation for set_in that
    takes any iterable:

    def set_in(d, ks, v):
    tmp = d
    i = None
    for i, next_key in enumerate(ks):
    if i > 0:
    tmp = tmp.setdefault(current_key, {})
    current_key = next_key
    if i is None:
    raise KeyError("Empty keys iterable")
    tmp[current_key] = v

    update_in could be rewritten similarly. You might also notice that
    I'm not returning d here. That's because the Pythonic way is to
    either create a new object and return it, or mutate the existing
    object and return None. If you mutate the existing object and return
    it, that can lead to confusion about which style the method takes.
    This is why list.append (for example) always returns None.

    In Python 2.7+, intersection and difference could be written using
    dictviews, which act like sets.

    partition_on_value, partition_on_key: The only difference between
    these is whether it passes the key or the value to the predicate. You
    could just pass both and let the predicate decide which one (or both)
    to use, and then you only need a single function. Also, why a
    predicate specifically? This could be generalized to partition any
    equivalence relation, not just those with only two equivalence
    classes:

    def partition(f, d):
    """Partition the dict according to an equivalence relation.

    Calls f(key, value) for all (key, value) pairs in the dict d. The return
    value of f must be hashable.
    Returns a new dict where the keys are distinct return values of f, and the
    values are dicts containing the equivalence classes distinguished by those
    return values.
    """
    partition = defaultdict(dict)
    for k, v in d.iteritems():
    partition[f(k, v)][k] = v
    return partition

    If you still wanted the predicate semantics, you could then define
    that as a wrapper:

    def partition_pred(f, d):
    p = partition(lambda k,v: bool(f(k,v)), d)
    return p[True], p[False]

    issubdict could be implemented as a subset operation. I haven't timed
    it so it may not really be any faster, but this way the looping is in
    C instead of in Python:

    def issubdict(d1, d2):
    return set(d1.items()).issubset(d2.items())

    Finally, isempty(d) is basically equivalent to "not d", which is both
    shorter and faster.

    Cheers,
    Ian
     
    Ian Kelly, Jul 15, 2012
    #2
    1. Advertising

  3. Leif

    Leif Guest

    Thanks for the suggestions, Ian! I implemented most of them and pushed thecode.

    > That's because the Pythonic way is to either create a new object and
    > return it, or mutate the existing object and return None.


    You're saying what I was already thinking.

    > In Python 2.7+, intersection and difference could be written using
    > dictviews, which act like sets.


    I'm debating with myself right now whether to support 2.5 and 2.6.

    > def partition(f, d):


    Good idea, thanks.

    > issubdict could be implemented as a subset operation. I haven't timed


    The current implementation is a bit faster, and it's quite a bit faster on the common (for me) case, where one argument is much smaller than the other.. I wrote it that way because if m=len(first), n=len(second), the amount of work is O(min(m,n)). Your implementation is O(m + n). Not really a big deal, unless e.g. m << n.

    --Leif
     
    Leif, Jul 15, 2012
    #3
  4. Leif

    Leif Guest

    Thanks for the suggestions, Ian! I implemented most of them and pushed thecode.

    > That's because the Pythonic way is to either create a new object and
    > return it, or mutate the existing object and return None.


    You're saying what I was already thinking.

    > In Python 2.7+, intersection and difference could be written using
    > dictviews, which act like sets.


    I'm debating with myself right now whether to support 2.5 and 2.6.

    > def partition(f, d):


    Good idea, thanks.

    > issubdict could be implemented as a subset operation. I haven't timed


    The current implementation is a bit faster, and it's quite a bit faster on the common (for me) case, where one argument is much smaller than the other.. I wrote it that way because if m=len(first), n=len(second), the amount of work is O(min(m,n)). Your implementation is O(m + n). Not really a big deal, unless e.g. m << n.

    --Leif
     
    Leif, Jul 15, 2012
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Oplec
    Replies:
    0
    Views:
    402
    Oplec
    Oct 23, 2003
  2. Nick Jacobson

    Useful? __slots__ for functions

    Nick Jacobson, May 25, 2004, in forum: Python
    Replies:
    3
    Views:
    290
    A. Lloyd Flanagan
    May 26, 2004
  3. MJL
    Replies:
    2
    Views:
    316
    Gordon Burditt
    Aug 15, 2004
  4. bruce
    Replies:
    0
    Views:
    277
    bruce
    Jan 10, 2012
  5. Kirk Haines
    Replies:
    11
    Views:
    244
    Gavin Sinclair
    Sep 7, 2004
Loading...

Share This Page