generating unique set of dicts from a list of dicts

Discussion in 'Python' started by bruce, Jan 10, 2012.

  1. bruce

    bruce Guest

    I'm trying to figure out how to generate a unique set of dicts from a
    JSON list of dicts.

    Initial list:
    [{"pStart1a": {"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
    "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
    "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
    "pSearch1a":
    {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
    {"pStart1":""},
    {"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
    "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
    "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
    "pSearch1a":
    {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
    {"pStart1":""}]



    As an example, the following is the test list:

    [{"pStart1a": {"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
    "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
    "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
    "pSearch1a":
    {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
    {"pStart1":""},
    {"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
    "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
    "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
    "pSearch1a":
    {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
    {"pStart1":""}]

    I'm trying to get the following list of unique dicts, so there aren't
    duplicate dicts.
    I've searched various sites and SO and still have a mental block.

    [
    {"pStart1a":
    {"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
    "instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
    "pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
    "pSearch1a":
    {"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
    {"pStart1":""}]

    I was considering iterating through the initial list and copying each
    dict into a new list only if an equal dict is not already in the new
    list. Is there another or better way?
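
The loop described above can be sketched as follows. This is only a minimal illustration, assuming the JSON has already been parsed into Python dicts (for example with json.loads). The function name unique_dicts_by_comparison and the shortened sample data are made up for this example. Because dicts compare by value with ==, the membership test also handles nested dicts, at the cost of comparing each dict against every dict kept so far.

```python
def unique_dicts_by_comparison(dicts):
    """Return the dicts in first-seen order, skipping any equal to one already kept."""
    unique = []
    for d in dicts:
        # `in` uses ==, which compares dict contents (including nested dicts).
        if d not in unique:
            unique.append(d)
    return unique


# Shortened, hypothetical sample shaped like the list in the post.
sample = [
    {"pStart1a": {"termVal": "1122", "instVal": "OSUSI"}},
    {"pStart1": ""},
    {"pStart1a": {"termVal": "1122", "instVal": "OSUSI"}},
    {"pStart1": ""},
]
result = unique_dicts_by_comparison(sample)
```

For a short list like the one here, this quadratic comparison is likely good enough; hashing starts to pay off only when the list grows large.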

    I posted this to StackOverflow as well:
    http://stackoverflow.com/questions/8808286/simplifying-a-json-list-to-the-unique-dict-items

    There was a potential solution there that I couldn't understand.


    -------------------------
    The simplest approach, using list(set(your_list_of_dicts)), won't
    work because Python dictionaries are mutable and not hashable (that
    is, they don't implement __hash__). This is because Python can't
    guarantee that the hash of a dictionary won't change after you insert
    it into a set or dict.
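
A quick way to see the failure described here is to try it. The snippet below is a small sketch with a made-up two-item list; it catches the exception so the message can be inspected instead of stopping the script.

```python
records = [{"pStart1": ""}, {"pStart1": ""}]

try:
    set(records)
    error_message = None
except TypeError as exc:
    # dicts are unhashable, so set() refuses to store them.
    error_message = str(exc)
```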

    However, in your case, since you don't seem to be modifying the data
    at all, you can compute your own hash, and use this along with a
    dictionary to relatively easily find the unique JSON objects without
    having to do a full recursive comparison of each dictionary to the
    others.

    First, we need a function to compute a hash of the dictionary. Rather
    than trying to build our own hash function, let's use one of the
    built-in ones from hashlib:

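    import hashlib

    def dict_hash(d):
        # Python 2 code: iteritems() and unicode() do not exist in Python 3.
        out = hashlib.md5()
        for key, value in d.iteritems():
            # Feed each key and value into the digest as text.
            out.update(unicode(key))
            out.update(unicode(value))
        return out.hexdigest()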

    (Note that this relies on unicode(...) for each of your values
    returning something unique -- if you have custom classes in the
    dictionaries whose __unicode__ returns something like "MyClass
    instance", this will fail or will require modification. Also, in your
    example, your dictionaries are flat, but I'll leave it as an exercise
    to the reader how to expand this solution to work with dictionaries
    that contain other dicts or lists.)
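
The function above is Python 2 only, and the dicts in the question are in fact nested. One possible adaptation for Python 3 is sketched below. It is not part of the quoted answer: it swaps the per-key updates for json.dumps with sort_keys=True, so that equal dicts serialize to the same text regardless of key order or nesting. It assumes every value is JSON-serializable, which holds for data that came from JSON in the first place. The sample dicts a and b are shortened, made-up stand-ins.

```python
import hashlib
import json


def dict_hash(d):
    """Hex MD5 digest of a JSON-serializable dict, independent of key order."""
    # sort_keys=True gives equal dicts the same text, including nested dicts.
    canonical = json.dumps(d, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


a = {"pStart1a": {"termVal": "1122", "instVal": "OSUSI"},
     "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}}
# Same content as `a`, with keys written in a different order.
b = {"pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"},
     "pStart1a": {"instVal": "OSUSI", "termVal": "1122"}}

same = dict_hash(a) == dict_hash(b)
different = dict_hash(a) != dict_hash({"pStart1": ""})
```

MD5 is used here only as a fingerprint for grouping equal dicts, not for anything security-related.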

    Since dict_hash returns a string, which is immutable, you can now use
    a dictionary to find the unique elements:

    uniques_map = {}
    for d in list_of_dicts:
        uniques[dict_hash(d)] = d
    unique_dicts = uniques_map.values()

    I'm not sure what "uniques" is, or how it should be defined.
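
Since "uniques" is never defined anywhere in the snippet, it appears to be a typo for uniques_map, the dictionary created on the line before the loop. A sketch with one consistent name follows. The dict_key helper is a hypothetical stand-in for dict_hash so that the example runs on its own, and the sample list is shortened from the one in the question.

```python
import json


def dict_key(d):
    # Stand-in for dict_hash: any stable, hashable fingerprint of the dict works.
    return json.dumps(d, sort_keys=True)


list_of_dicts = [
    {"pStart1a": {"termVal": "1122"}, "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
    {"pStart1": ""},
    {"pStart1a": {"termVal": "1122"}, "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
    {"pStart1": ""},
]

uniques_map = {}
for d in list_of_dicts:
    # Equal dicts produce the same key, so a duplicate overwrites the earlier entry.
    uniques_map[dict_key(d)] = d
unique_dicts = list(uniques_map.values())
```

Each distinct dict ends up stored once under its fingerprint, and the values of the map are the unique dicts.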



    thoughts/comments are welcome

    thanks
    bruce, Jan 10, 2012