Re: dict: retrieve the original key by key

Discussion in 'Python' started by Christoph Groth, May 15, 2011.

  1. Chris Rebert <> writes:

    > On Sun, May 15, 2011 at 1:28 AM, Christoph Groth <> wrote:
    >> I use a huge python dictionary where the values are lists of that
    >> dictionary's keys (yes, a graph).  Each key is thus referenced
    >> several times.
    >>
    >> As the keys are rather large objects, I would like to save memory by
    >> re-using key objects wherever possible, instead of having several
    >> equal objects in memory.
    >>
    >> There does not seem to be a way to retrieve the original key from a
    >> python dictionary.  Is there a technical reason for this?  (Other
    >> than that such functionality was not considered to be useful enough.)

    >
    > Define "original key".


    def original_key(dictionary, key):
    for k in dictionary:
    if k == key:
    return k
    raise KeyError(key)

    But this is not efficient.

    I would like to avoid having _multiple_ objects which are equal (a == b)
    but not the same (a is not b). This would save a lot of memory.
     
    Christoph Groth, May 15, 2011
    #1
    1. Advertising

  2. On Sun, 15 May 2011 11:11:41 +0200, Christoph Groth wrote:

    > I would like to avoid having _multiple_ objects which are equal (a == b)
    > but not the same (a is not b). This would save a lot of memory.


    Based on the idea of interning, which is used for Python strings:

    cache = {}
    def my_intern(obj):
    return cache.setdefault(obj, obj)


    x = make_some_object()
    x = my_intern(x)

    This ensures that equal objects in the graph are not just equal, but the
    same cached object.


    --
    Steven
     
    Steven D'Aprano, May 15, 2011
    #2
    1. Advertising

  3. Steven D'Aprano <> writes:

    > On Sun, 15 May 2011 11:11:41 +0200, Christoph Groth wrote:
    >
    >> I would like to avoid having _multiple_ objects which are equal (a ==
    >> b) but not the same (a is not b). This would save a lot of memory.

    >
    > Based on the idea of interning, which is used for Python strings:
    >
    > cache = {} def my_intern(obj):
    > return cache.setdefault(obj, obj)
    >
    >
    > x = make_some_object() x = my_intern(x)
    >
    > This ensures that equal objects in the graph are not just equal, but
    > the same cached object.


    This requires another dictionary, though.

    But hey, they keys of my dictionary are actually strings, so I can use
    the built-in intern. Somehow, I have never stumbled accross this
    built-in function so far.

    Thanks a lot for the hint!

    Christoph
     
    Christoph Groth, May 15, 2011
    #3
  4. Christoph Groth

    Terry Reedy Guest

    On 5/15/2011 6:46 AM, Christoph Groth wrote:
    > Steven D'Aprano<> writes:
    >
    >> On Sun, 15 May 2011 11:11:41 +0200, Christoph Groth wrote:
    >>
    >>> I would like to avoid having _multiple_ objects which are equal (a ==
    >>> b) but not the same (a is not b). This would save a lot of memory.


    Python hashed collections have methods used to test if the collection
    has an item/key that is equal to some object. They do not currently have
    a method to return the equal item/key already there. This has been
    proposed and, I believe, rejected due to lack of sufficient presented
    use cases or because, conceptually, one wants to map key values to an
    object with the key value and Stephen's identity dict does precisely that.

    In any case, if you put an object into a collection and you want to use
    the object for other purposes without accessing the collection, you must
    keep a reference to it outside of the collection.

    >> Based on the idea of interning, which is used for Python strings:
    >>
    >> cache = {} def my_intern(obj):
    >> return cache.setdefault(obj, obj)
    >>
    >>
    >> x = make_some_object() x = my_intern(x)
    >>
    >> This ensures that equal objects in the graph are not just equal, but
    >> the same cached object.

    >
    > This requires another dictionary, though.


    It does, however, twice reuse the key already in your graph dict, so
    each entry is minimal extra memory.

    It is typical in graph algorithms to have both a graph map (nodes to set
    of nodes) and a properties map (nodes to property structure). Some
    properties are fixed, others are changed during particular algoritms. It
    is also typical to use counts as node identifiers, so that both maps are
    implemented as sequences, but string indentifiers and dict for maps work
    too.

    > But hey, they keys of my dictionary are actually strings, so I can use
    > the built-in intern. Somehow, I have never stumbled accross this
    > built-in function so far.


    It was, however, removed in 3.x as a seldom-externally-used internal
    implementation detail.

    --
    Terry Jan Reedy
     
    Terry Reedy, May 15, 2011
    #4
  5. Christoph Groth

    Chris Rebert Guest

    On Sun, May 15, 2011 at 1:53 PM, Terry Reedy <> wrote:
    > On 5/15/2011 6:46 AM, Christoph Groth wrote:

    <snip>
    >> But hey, they keys of my dictionary are actually strings, so I can use
    >> the built-in intern.  Somehow, I have never stumbled accross this
    >> built-in function so far.

    >
    > It was, however, removed in 3.x as a seldom-externally-used internal
    > implementation detail.


    Still exists in sys module though:
    http://docs.python.org/dev/library/sys.html#sys.intern

    Cheers,
    Chris
     
    Chris Rebert, May 15, 2011
    #5
  6. Christoph Groth

    Ian Kelly Guest

    On Sun, May 15, 2011 at 4:18 AM, Steven D'Aprano
    <> wrote:
    > On Sun, 15 May 2011 11:11:41 +0200, Christoph Groth wrote:
    >
    >> I would like to avoid having _multiple_ objects which are equal (a == b)
    >> but not the same (a is not b).  This would save a lot of memory.

    >
    > Based on the idea of interning, which is used for Python strings:
    >
    > cache = {}
    > def my_intern(obj):
    >    return cache.setdefault(obj, obj)


    And if the idea of a dictionary with identical keys and values
    offends, you can also use a set:

    cache = set()
    def my_intern(obj):
    cache.add(obj)
    return cache.intersection([obj]).pop()

    Cheers,
    Ian
     
    Ian Kelly, May 16, 2011
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Skip Montanaro
    Replies:
    0
    Views:
    418
    Skip Montanaro
    Aug 15, 2003
  2. Alexander Kozlovsky

    dict!ident as equivalent of dict["ident"]

    Alexander Kozlovsky, May 21, 2006, in forum: Python
    Replies:
    5
    Views:
    365
    Alexander Kozlovsky
    May 22, 2006
  3. Albert van der Horst

    dict's as dict's key.

    Albert van der Horst, Jan 13, 2010, in forum: Python
    Replies:
    5
    Views:
    260
    Lie Ryan
    Jan 17, 2010
  4. Christoph Groth

    dict: retrieve the original key by key

    Christoph Groth, May 15, 2011, in forum: Python
    Replies:
    0
    Views:
    196
    Christoph Groth
    May 15, 2011
  5. Victor Hooi
    Replies:
    1
    Views:
    115
    Victor Hooi
    Oct 29, 2013
Loading...

Share This Page