expression form of one-to-many dict?

Discussion in 'Python' started by Steven Bethard, Dec 17, 2004.

  1. So I end up writing code like this a fair bit:

    map = {}
    for key, value in sequence:
    map.setdefault(key, []).append(value)

    This code basically constructs a one-to-many mapping -- each value that
    a key occurs with is stored in the list for that key.

    This code's fine, and seems pretty simple, but thanks to generator
    expressions, I'm getting kinda spoiled. ;) I like being able to do
    something like the following for one-to-one mappings:

    dict(sequence)

    or a more likely scenario for me:

    dict((get_key(item), get_value(item) for item in sequence)

    The point here is that there's a simple sequence or GE that I can throw
    to the dict constructor that spits out a dict with my one-to-one mapping.

    Is there a similar expression form that would spit out my one-to-many
    mapping?

    Steve
    Steven Bethard, Dec 17, 2004
    #1
    1. Advertising

  2. Steven Bethard

    Tim Peters Guest

    [Steven Bethard]
    > So I end up writing code like this a fair bit:
    >
    > map = {}
    > for key, value in sequence:
    > map.setdefault(key, []).append(value)
    >
    > This code basically constructs a one-to-many mapping -- each
    > value that a key occurs with is stored in the list for that key.
    >
    > This code's fine, and seems pretty simple, but thanks to generator
    > expressions, I'm getting kinda spoiled. ;) I like being able to do
    > something like the following for one-to-one mappings:
    >
    > dict(sequence)
    >
    > or a more likely scenario for me:
    >
    > dict((get_key(item), get_value(item) for item in sequence)
    >
    > The point here is that there's a simple sequence or GE that I can
    > throw to the dict constructor that spits out a dict with my one-to-
    > one mapping.


    It's a simple sequence because it's a simple task. It's even simpler
    than perhaps it "should be", since it arbitrarily decides that, if
    more than one instance of some key is seen, it will only remember the
    value associated with the last instance of that key.

    > Is there a similar expression form that would spit out my one-to-
    > manymapping?


    There's a straightforward one-liner in 2.4 (but not notably concise),
    if your keys are totally ordered:

    from itertools import groupby
    d = dict((k, map(get_value, g))
    for k, g in groupby(sorted(sequence, key=get_key),
    key=get_key))

    The real purpose of that is to increase your appreciation for your
    original spelling <0.2 wink>.
    Tim Peters, Dec 17, 2004
    #2
    1. Advertising

  3. Tim Peters wrote:
    >>The point here is that there's a simple sequence or GE that I can
    >>throw to the dict constructor that spits out a dict with my one-to-
    >>one mapping.

    >
    >
    > It's a simple sequence because it's a simple task. It's even simpler
    > than perhaps it "should be", since it arbitrarily decides that, if
    > more than one instance of some key is seen, it will only remember the
    > value associated with the last instance of that key.


    Good point. That's sort of an arbitrary behavior decision, which, while
    reasonable in most situations is not desirable in mine. In thinking
    about this, it struck me that one solution is to change that behavior:

    >>> class OneToMany(object, UserDict.DictMixin):

    .... def __init__(*args, **kwds):
    .... self, args = args[0], args[1:]
    .... self._dict = {}
    .... self.update(*args, **kwds)
    .... def __getitem__(self, key):
    .... if not key in self._dict:
    .... self._dict[key] = []
    .... return self._dict[key]
    .... def __setitem__(self, key, value):
    .... self[key].append(value)
    .... def keys(self):
    .... return self._dict.keys()
    ....
    >>> d = OneToMany((pow(i, 13, 4), i) for i in range(10))
    >>> d[0]

    [0, 2, 4, 6, 8]
    >>> d[1]

    [1, 5, 9]
    >>> d[3]

    [3, 7]

    I'll have to think about whether it would be worth keeping a class like
    this around... It might make working with this kind of data
    interactively simpler...

    Thanks for the comments!

    Steve
    Steven Bethard, Dec 17, 2004
    #3
  4. Steven Bethard

    Larry Bates Guest

    Steven,

    Suggestion: It is a bad idea to name any variable
    "map". When you do, you destroy your ability to call
    Python's map function. Same goes for "list", "str",
    or any other built-in function.

    If you haven't been bitten by this you will, I was.

    Larry Bates

    Steven Bethard wrote:
    > So I end up writing code like this a fair bit:
    >
    > map = {}
    > for key, value in sequence:
    > map.setdefault(key, []).append(value)
    >
    > This code basically constructs a one-to-many mapping -- each value that
    > a key occurs with is stored in the list for that key.
    >
    > This code's fine, and seems pretty simple, but thanks to generator
    > expressions, I'm getting kinda spoiled. ;) I like being able to do
    > something like the following for one-to-one mappings:
    >
    > dict(sequence)
    >
    > or a more likely scenario for me:
    >
    > dict((get_key(item), get_value(item) for item in sequence)
    >
    > The point here is that there's a simple sequence or GE that I can throw
    > to the dict constructor that spits out a dict with my one-to-one mapping.
    >
    > Is there a similar expression form that would spit out my one-to-many
    > mapping?
    >
    > Steve
    Larry Bates, Dec 17, 2004
    #4
  5. Larry Bates wrote:
    > Suggestion: It is a bad idea to name any variable
    > "map". When you do, you destroy your ability to call
    > Python's map function. Same goes for "list", "str",
    > or any other built-in function.
    >
    > If you haven't been bitten by this you will, I was.


    A good reminder for all the newbies out there.

    Sorry, I renamed my variables to simplify the example -- my names
    usually look like '<key>_<value>_map' where <key> and <value> describe
    the items in the dict. Since the generic example didn't really have a
    description for the keys or values, I stripped it down to map. Fine for
    the example, but I should have realized it would draw this comment
    (mainly because I probably would have posted a similar one myself if I
    had seen the example). ;)

    Fortunately, after 2+ years with Python, the risk of me being "bitten"
    again by this is pretty small. ;)

    Actually, it's even smaller now, because I've pretty much removed map
    from all my code in favor of list comprehensions, which I find much
    easier to read.

    Steve
    Steven Bethard, Dec 17, 2004
    #5
  6. Steven Bethard wrote:

    > Actually, it's even smaller now, because I've pretty much removed map
    > from all my code in favor of list comprehensions, which I find much
    > easier to read.


    While I agree that listcomps are more readable in most cases (and certainly for
    all cases with any amount of complexity in the listcomp), I still find that
    map is hard to beat for the simple case of a callable foo:

    outlist = map(foo,inlist)

    is still better in my book, and far more readable, than

    outlist = [foo(x) for x in inlist]

    The map form, in this case, parses instantly in my brain, while the listcomp
    certainly takes a few cycles. And note that I'm not talking about the typing
    conciseness, but about the effort for my brain. But maybe I'm just wired
    funny :)

    Since I tend to have a lot of code like this, I really cringe when I hear of a
    desire to do away with map altogether. I know I could rewrite it myself in
    all my code, but the beauty of these builtins is that they are always there...

    Cheers,

    f
    Fernando Perez, Dec 18, 2004
    #6
  7. Fernando Perez wrote:
    >
    > outlist = map(foo,inlist)
    >
    > is still better in my book, and far more readable, than
    >
    > outlist = [foo(x) for x in inlist]
    >
    > The map form, in this case, parses instantly in my brain, while the listcomp
    > certainly takes a few cycles. And note that I'm not talking about the typing
    > conciseness, but about the effort for my brain. But maybe I'm just wired
    > funny :)


    Well, different at least. I find the map one harder to parse mentally.
    And not for lack of experience with functional programming. But to
    each their own. =)

    Steve
    Steven Bethard, Dec 18, 2004
    #7
  8. Steven Bethard wrote:

    >> The map form, in this case, parses instantly in my brain, while the listcomp
    >> certainly takes a few cycles. And note that I'm not talking about the typing
    >> conciseness, but about the effort for my brain. But maybe I'm just wired
    >> funny :)

    >
    > Well, different at least. I find the map one harder to parse mentally.


    if you have trouble parsing a function call, I'm glad I don't have to maintain
    your Python programs...

    </F>
    Fredrik Lundh, Dec 18, 2004
    #8
  9. Fredrik Lundh wrote:
    > Steven Bethard wrote:
    >
    >>>The map form, in this case, parses instantly in my brain, while the listcomp
    >>>certainly takes a few cycles. And note that I'm not talking about the typing
    >>>conciseness, but about the effort for my brain. But maybe I'm just wired
    >>>funny :)

    >>
    >>Well, different at least. I find the map one harder to parse mentally.

    >
    > if you have trouble parsing a function call, I'm glad I don't have to maintain
    > your Python programs...


    I don't know what I said to upset you so, but I do apologize.

    If you could tell me what it was about my statement (that I find a list
    comprehension to be a clearer description of program flow than a map
    application[1]) that so insulted you, I would be glad to avoid such
    comments in the future, if it would avoid such vicious replies from you.

    Steve

    [1] In my mind, the program flow is spelled out explicitly in the list
    comprehension and only implicitly in the map application. Thus the list
    comprehension is clearer to me.
    Steven Bethard, Dec 18, 2004
    #9
  10. Steven Bethard

    Terry Reedy Guest

    In respect to map(func, seq) versus [func(i) for i in seq], for
    pre-existing func, 'OP' wrote

    >>>>The map form, in this case, parses instantly in my brain, while the
    >>>>listcomp
    >>>>certainly takes a few cycles. And note that I'm not talking about the
    >>>>typing
    >>>>conciseness, but about the effort for my brain. But maybe I'm just
    >>>>wired
    >>>>funny :)


    >> Steven Bethard wrote:
    >>>Well, different at least. I find the map one harder to parse mentally.

    [and]
    > [1] In my mind, the program flow is spelled out explicitly in the list
    > comprehension and only implicitly in the map application. Thus the list
    > comprehension is clearer to me.


    For pre-existing func, I slightly prefer the map form because the
    sequential program flow is *not* spelled out. So I can more easily see the
    sequence items mapped in parallel. On the other hand, I probably prefer
    [i-expression for i in seq] to map(lambda i: i-expression, seq) because I
    find the (unnecessary) mental construction of a function object to be a
    compensating burden.

    I can easily imagine that others will find themselves more comfortably on
    one side or the other of the mental fence I find myself sitting on ;-).

    Terry J. Reedy
    Terry Reedy, Dec 18, 2004
    #10
  11. Steven Bethard <> wrote:
    > map = {}
    > for key, value in sequence:
    > map.setdefault(key, []).append(value)


    I was thinking about exactly that the other day, when converting some
    perl to python.

    [ digression: In perl, you do

    push @{$map->{$key}}, $value

    If $map->{$key} doesn't exist is is autovivified into an array (since
    its in array context). Now thats exactly the sort of magic that gives
    perl a bad name ;-) ]

    However, one thing I noticed is that a list is created and destroyed
    as the second parameter to setdefault whether or not it is used in
    map, which strikes me as a bit wasteful. You obviously can't use the
    same list there either. If it was an object with a more expensive
    constructor then you'd notice it more.

    It would be nice if setdefault didn't evaluate the second argument
    unless it needed it. However I guess it would have to be a language
    feature to do that.

    Here are some timings

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),range(1000))' 'map = {}
    for key, value in sequence:
    map.setdefault(key, []).append(value)'
    1000 loops, best of 3: 1.42e+03 usec per loop

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),[ i%11 for i in range(1000)])' 'map = {}
    for key, value in sequence:
    map.setdefault(key, []).append(value)'
    1000 loops, best of 3: 1.57e+03 usec per loop

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),range(1000))' 'map = {}
    for key, value in sequence:
    if map.has_key(key):
    map[key].append(value)
    else:
    map[key] = [ value ]'
    1000 loops, best of 3: 1.1e+03 usec per loop

    $ /usr/lib/python2.3/timeit.py -s 'sequence=zip(range(1000),[ i%11 for i in range(1000)])' 'map = {}
    for key, value in sequence:
    if map.has_key(key):
    map[key].append(value)
    else:
    map[key] = [ value ]'
    1000 loops, best of 3: 1.11e+03 usec per loop


    Not that timing is everything of course ;-)
    --
    Nick Craig-Wood <> -- http://www.craig-wood.com/nick
    Nick Craig-Wood, Dec 19, 2004
    #11
  12. Steven Bethard

    Mike Meyer Guest

    Nick Craig-Wood <> writes:

    > It would be nice if setdefault didn't evaluate the second argument
    > unless it needed it. However I guess it would have to be a language
    > feature to do that.


    Personally, I'd love a language feature that let you create a function
    that didn't evaluate arguments until they were actually used - lazy
    evaluation. That lets you write the C ?: operator as a function, for
    a start.

    Hmmm. No, iterators can't be used to fake it. Oh well.

    <mike
    --
    Mike Meyer <> http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
    Mike Meyer, Dec 19, 2004
    #12
  13. Mike Meyer wrote:

    > Personally, I'd love a language feature that let you create a function
    > that didn't evaluate arguments until they were actually used - lazy
    > evaluation. That lets you write the C ?: operator as a function, for
    > a start.
    >
    > Hmmm. No, iterators can't be used to fake it. Oh well.


    if you can get some collaboration from the function itself, you can fake
    anything:

    def foo_default():
    while 1:
    print "MAKE A FOO"
    yield "foo"
    foo_default = foo_default()

    class mydict(dict):
    def setdefault(self, key, default=None):
    try:
    return self[key]
    except KeyError:
    if hasattr(default, "next"):
    default = default.next()
    self[key] = default
    return default

    d = mydict()
    d["spam"] = "bacon"

    print d.setdefault("spam", foo_default)
    print d.setdefault("egg", foo_default)

    </F>
    Fredrik Lundh, Dec 19, 2004
    #13
  14. late evaluation of function arguments (WAS: expression form of one-to-manydict?)

    Mike Meyer wrote:
    > Personally, I'd love a language feature that let you create a function
    > that didn't evaluate arguments until they were actually used - lazy
    > evaluation. That lets you write the C ?: operator as a function, for
    > a start.


    You can fake it with generator expresions:

    .. >>> class C(object):
    .. ... def __init__(self, x):
    .. ... print x
    .. ...
    .. >>> def unpack(x):
    .. ... try:
    .. ... x, = x
    .. ... except TypeError:
    .. ... pass
    .. ... return x
    .. ...
    .. >>> def f(x):
    .. ... print "f"
    .. ... print unpack(x)
    .. ...
    .. >>> f(C("created"))
    .. created
    .. f
    .. <__main__.C object at 0x01140790>
    .. >>> f(C("created") for _ in [1])
    .. f
    .. created
    .. <__main__.C object at 0x0114F810>

    Of course, the syntax here isn't great -- create a generator expression
    that yields only one item, and call unpack on all function arguments to
    convert them from generators to items as necessary... But it does
    achieve the stated goal -- with the GE, the C object isn't created until
    unpack is called from inside f.

    Faking ?: this way could look like:

    .. >>> def cond(test, true_exp, false_exp):
    .. ... if test:
    .. ... return unpack(true_exp)
    .. ... else:
    .. ... return unpack(false_exp)
    .. ...
    .. >>> cond(True,
    .. ... (C("true exp") for _ in [1]),
    .. ... (C("false exp") for _ in [1]))
    .. true exp
    .. <__main__.C object at 0x0114F730>
    .. >>> cond(False,
    .. ... (C("true exp") for _ in [1]),
    .. ... (C("false exp") for _ in [1]))
    .. false exp
    .. <__main__.C object at 0x0114F8B0>

    I don't know how often this is really necessary, but as it is
    implementable in current Python, if you really liked it, you could argue
    for some syntactic sugar for the construct...

    Steve
    Steven Bethard, Dec 19, 2004
    #14
  15. On Sun, 19 Dec 2004 21:29:27 +0100, "Fredrik Lundh" <> wrote:

    >Mike Meyer wrote:
    >
    >> Personally, I'd love a language feature that let you create a function
    >> that didn't evaluate arguments until they were actually used - lazy
    >> evaluation. That lets you write the C ?: operator as a function, for
    >> a start.
    >>
    >> Hmmm. No, iterators can't be used to fake it. Oh well.

    >
    >if you can get some collaboration from the function itself, you can fake
    >anything:
    >
    >def foo_default():
    > while 1:
    > print "MAKE A FOO"
    > yield "foo"
    >foo_default = foo_default()
    >
    >class mydict(dict):
    > def setdefault(self, key, default=None):
    > try:
    > return self[key]
    > except KeyError:
    > if hasattr(default, "next"):
    > default = default.next()
    > self[key] = default
    > return default
    >
    >d = mydict()
    >d["spam"] = "bacon"
    >
    >print d.setdefault("spam", foo_default)
    >print d.setdefault("egg", foo_default)
    >

    Just the same, it would be interesting to be able to create special-form style functions easily
    by designating arbitrary arguments for deferred evaluation, to which eval could safely be applied
    (or maybe a restricted unquote_arg function for better safety).
    E.g., double back-tick is a syntax error now, so you could write

    def ternary(c, ``t, ``f):
    if c: return eval(t)
    else: return eval(f)

    (Of course, it's also interesting to speculate about what kind of first class object
    would be returned by
    def foo(``x): return x
    and whether ``x should be a generally valid expression, not just specialized for arg list setup ;-)

    (note that ``x is not quite the same as automated lambda:x -- nor lambda x=x:x -- wrapping,
    since the latter would evaluate x for the lambda and the former accesses the last value set
    in the closure cell:

    >>> def foo():

    ... v = 'a'
    ... x = lambda:v
    ... v = 'b'
    ... y = lambda:v
    ... return x,y
    ...
    >>> [foo()() for i in (0,1)]

    ['b', 'b']

    Anyway, with a suitable `` quoting operator your setdefault could be

    def setdefault(self, key, ``default=None):
    ...
    default = eval(default)
    ... etc

    BTW, note that

    def foo(``default=expr): ...

    and

    def foo(``default=``expr): ...

    could mean different things, depending on preferred semantics and ``expr as
    a generally valid expression.

    Regards,
    Bengt Richter
    Bengt Richter, Dec 20, 2004
    #15
  16. Steven Bethard

    Mike Meyer Guest

    (Bengt Richter) writes:

    > On Sun, 19 Dec 2004 21:29:27 +0100, "Fredrik Lundh" <> wrote:
    > (or maybe a restricted unquote_arg function for better safety).
    > E.g., double back-tick is a syntax error now, so you could write
    >
    > def ternary(c, ``t, ``f):
    > if c: return eval(t)
    > else: return eval(f)


    Actually, I think it would be more pythonic if the indication of
    non-evaluation happened at the function invocation instead of the
    function definition. Having it at the function definition makes it
    implicit at the function invocation point. Having functions that
    expect objects of certain types (closures in this case, right?) is
    pretty much standard operating procedure.

    <mike
    --
    Mike Meyer <> http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
    Mike Meyer, Dec 20, 2004
    #16
  17. Steven Bethard

    Doug Holton Guest

    Mike Meyer wrote:

    > Personally, I'd love a language feature that let you create a function
    > that didn't evaluate arguments until they were actually used - lazy
    > evaluation. That lets you write the C ?: operator as a function, for
    > a start.
    >
    > Hmmm. No, iterators can't be used to fake it. Oh well.


    That is a brilliant idea. I would suggest requesting such a feature for
    Python 3.0: http://www.python.org/cgi-bin/moinmoin/Python3.0

    However, Python 3.0 is likely years away. If you want to know how to
    use this feature today, consult Fredrik Lundh.
    Doug Holton, Dec 21, 2004
    #17
  18. Steven Bethard

    Paul Rubin Guest

    Mike Meyer <> writes:
    > Personally, I'd love a language feature that let you create a function
    > that didn't evaluate arguments until they were actually used - lazy
    > evaluation. That lets you write the C ?: operator as a function, for
    > a start.
    >
    > Hmmm. No, iterators can't be used to fake it. Oh well.


    You can fake it with lambda, but the syntax is contorted.

    a = condition ? expr1 : expr2;

    becomes

    a = ((lambda: expr2, lambda: expr1)[bool(condition)])()
    Paul Rubin, Dec 21, 2004
    #18
  19. Mike Meyer <> wrote:

    > (Bengt Richter) writes:
    >
    > > On Sun, 19 Dec 2004 21:29:27 +0100, "Fredrik Lundh"

    <> wrote:
    > > (or maybe a restricted unquote_arg function for better safety).
    > > E.g., double back-tick is a syntax error now, so you could write
    > >
    > > def ternary(c, ``t, ``f):
    > > if c: return eval(t)
    > > else: return eval(f)

    >
    > Actually, I think it would be more pythonic if the indication of
    > non-evaluation happened at the function invocation instead of the
    > function definition. Having it at the function definition makes it


    As in, say, calling
    x = ternary(c, lambda:t, lambda:f)
    ? The 'lambda:' is a (not nice-looking, but...) "indication of
    non-evaluation"... or am I misundertanding what you're saying?

    Of course, the implementation of ternary could then use 'apply' rather
    than 'eval' (or, simply call t() or f() as appropriate, identically;-).


    Alex
    Alex Martelli, Dec 21, 2004
    #19
  20. Doug Holton wrote:

    > Mike Meyer wrote:
    >
    >> Personally, I'd love a language feature that let you create a function
    >> that didn't evaluate arguments until they were actually used - lazy
    >> evaluation. That lets you write the C ?: operator as a function, for
    >> a start.
    >>
    >> Hmmm. No, iterators can't be used to fake it. Oh well.

    >
    > That is a brilliant idea. I would suggest requesting such a feature for
    > Python 3.0: http://www.python.org/cgi-bin/moinmoin/Python3.0


    Just as a reference, Mathematica does have such a feature, in the form of the
    HoldAll, HoldFirst, etc. function attributes. It can come in quite handy. I
    actually used it to write a little routine to auto-generate python modules for
    mathematica variables of certain types, without having to specify a list of
    strings for their names. The evaluation control allowed me to give it a very
    clean interface, while the equivalent python2mathematica routine requires a
    list of variable names (strings) as input, and plays sys._getframe tricks with
    it. Not very pleasant.

    So yes, I think it would be a really nifty feature to have, though I haven't
    really thought about how well it meshes (or not) with the rest of python's
    design.

    Cheers,

    f
    Fernando Perez, Dec 21, 2004
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. dee
    Replies:
    2
    Views:
    390
  2. Skip Montanaro
    Replies:
    0
    Views:
    403
    Skip Montanaro
    Aug 15, 2003
  3. Replies:
    3
    Views:
    309
  4. jason
    Replies:
    6
    Views:
    156
  5. Evolution
    Replies:
    1
    Views:
    100
    Evolution
    Apr 15, 2011
Loading...

Share This Page