Delete items in nested dictionary based on value.

B

Brian L. Troutwine

I've got a problem that I can't seem to get my head around and hoped
somebody might help me out a bit:

I've got a dictionary, A, that is arbitarily large and may contains
ints, None and more dictionaries which themselves may contain ints,
None and more dictionaries. Each of the sub-dictionaries is also
arbitrarily large. When pretty printing A, in the context I'm using A
for, it's rather helpful to remove all key:value pairs where value is
None. So I'd like to be able to define a function dict_sweep to recurse
through A and del all the keys with None as a value.

I feel like the solution is right on the tip of my tounge, so to speak,
but I just can't quite get it. Anyway, here's the best I've come up
with:

def dict_sweep(dictionary, key_value):
"""Sweep through dictionary, deleting any key:value where
value is None.

"""
# Find key at position key_value.
key = dictionary.keys()[key_value]

# If the key value is a dictionary we want to move into it.
# Therefore we call rec_rem on dictionary[key] with a
# key_value of 0
if type(dictionary[key]) is type({}):
dict_sweep(dictionary[key], 0)

# If the value isn't a dictionary we care only that it's not None.
# If it is None we nuke it and call rec_rem on the current
# dictionary with key_value incrimented by one.
elif dictionary[key] is None:
del dictionary[key]
dict_sweep(dictionary, key_value+1)

# If the value isn't a dictionary and the value isn't None then
# we still want to walk through to the next value in the
# dictionary, so call rec_rem with key_value+1
else:
dict_sweep(dictionary, key_value+1)

Assuming dict_sweep worked perfectly it would take input like this:

A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}

and output this:

A_out = {1: {2: 2, 3: {2: 2}}, 2: 2}

B_out = {2:2}

This dict_sweep above obviously doesn't work and I'm rather afraid of
hitting python's recursion limit. Does anyone see a way to modify
dict_sweep in it's current state to perform dictionary sweeps like
this? What about a non-recursive implementation of dict_sweep?
 
J

John McMonagle

Assuming dict_sweep worked perfectly it would take input like this:

A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}

and output this:

A_out = {1: {2: 2, 3: {2: 2}}, 2: 2}

B_out = {2:2}

This dict_sweep above obviously doesn't work and I'm rather afraid of
hitting python's recursion limit. Does anyone see a way to modify
dict_sweep in it's current state to perform dictionary sweeps like
this? What about a non-recursive implementation of dict_sweep?

How about this:

A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

def dict_sweep(dictin):

for key in dictin.keys():
if dictin.get(key) == None:
del dictin[key]
elif type(dictin[key]) is dict:
dictin[key] = dict_sweep(dictin[key])
return dictin

A_out = dict_sweep(A_in)
print A_out




Running the above returns:

{1: {2: 2, 3: {2: 2}}, 2: 2}

Regards,

John
 
B

bearophileHUGS

My first try, not much tested:

def clean(d):
for key,val in d.items():
if isinstance(val, dict):
val = clean(val)
if not val:
del d[key]
return d

a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
print clean(b) # Out: {2: 2}

Recursivity overflow problem: you can increase the recursivity limit,
of if you think you need it after some tests on your real data, then
you can use a stack (a python list, using append and pop methods) to
simulate recursivity. You probably don't have 300+ levels of nesting.

Bye,
bearophile
 
B

bearophileHUGS

Better:

def clean(d):
for key,val in d.items():
if isinstance(val, dict):
val = clean(val)
if val is None or val == {}:
del d[key]
return d

a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
print clean(b) # Out: {2: 2}
c = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: 0}
print clean(c) # Out: {2: 2, 3: 0}
 
D

Dale Strickland-Clark

Brian said:
I've got a problem that I can't seem to get my head around and hoped
somebody might help me out a bit:

I've got a dictionary, A, that is arbitarily large and may contains
ints, None and more dictionaries which themselves may contain ints,
None and more dictionaries. Each of the sub-dictionaries is also
arbitrarily large. When pretty printing A, in the context I'm using A
for, it's rather helpful to remove all key:value pairs where value is
None. So I'd like to be able to define a function dict_sweep to recurse
through A and del all the keys with None as a value.

I feel like the solution is right on the tip of my tounge, so to speak,
but I just can't quite get it. Anyway, here's the best I've come up
with:

How about this:

def dict_sweep(dictionary):
for k, v in dictionary.items():
if v is None:
del dictionary[k]
if type(v) == type({}):
dict_sweep(v)
#if len(v) == 0:
# del dictionary[k]


A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}}

dict_sweep(A_in)
dict_sweep(B_in)
print A_in
print B_in


It doesn't produce the output you expect for B_in, but your brackets didn't
match and I'm not sure where the extra close should be. Also, I suspect
you're wanting to delete empty dictionaries but I don't see that mentioned
in the text. If that is so, include the two commented statements.
 
B

bearophileHUGS

Fuzzyman:
Can you delete values from a dictionary whilst iterating over its items ?

Try the code, it works. I am not iterating on the dict, I am not using
dict.iteritems(), that produces a lazy iterable, I am using
dict.items() that produces a list of (key,value) pairs. And I am not
removing elements from that list :)

Hugs,
bearophile
 
D

Dennis Lee Bieber

arbitrarily large. When pretty printing A, in the context I'm using A
for, it's rather helpful to remove all key:value pairs where value is
None. So I'd like to be able to define a function dict_sweep to recurse
through A and del all the keys with None as a value.

I feel like the solution is right on the tip of my tounge, so to speak,
but I just can't quite get it. Anyway, here's the best I've come up
with:
Would it not be easier to modify the pretty-printer to ignore such
entries rather than have to make a duplicate (I presume you have reason
to need the original dictionary unchanged) dictionary and then delete
entries that match your "ignore" criteria?
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
B

Brian L. Troutwine

Would it not be easier to modify the pretty-printer to ignore such
entries rather than have to make a duplicate (I presume you have reason
to need the original dictionary unchanged) dictionary and then delete
entries that match your "ignore" criteria?

Good question. The data I'm handling relates to books: their ISBN,
pricing information, weight, statistical data for each page, citations
in other books to certain pages, those books statistical data, so on
and so forth. It's too much data, really, but this is what I'm tasked
with handling.

Currently I'm just pretty printing the data, but I'm really cheating
right now. The tool I'm writing is supposed to just collect the data,
strip out all the extranious data based on filters defined at the start
of the collection process and then output all that data to stdout,
where it will be caught by a tool to properly format and display it.

Yesterday morning I got into work, went into deep hack mode, wrote the
entire tool and at the end of the day could not, for the life of me,
figure out the recursion I needed. I suppose I'd spent myself. Who
knows?

Anyway, you are right in that if I did indeed need the original
dictionary unchanged it would be much, much easier to modify the
pretty-printer.
 
B

Brian L. Troutwine

This is a general reply to all.

Thanks for all the suggestions. I hadn't really thought about filtering
empty dictionaries because the data I'm processing won't have them, but
it does make for a much nicer, more general filter. I did end up using
bearophileH's code, but that's mostly because he got there first. Each
solution functioned as advertised.

Again, thanks for all the help.
 
D

Dennis Lee Bieber

Good question. The data I'm handling relates to books: their ISBN,
pricing information, weight, statistical data for each page, citations
in other books to certain pages, those books statistical data, so on
and so forth. It's too much data, really, but this is what I'm tasked
with handling.

Currently I'm just pretty printing the data, but I'm really cheating
right now. The tool I'm writing is supposed to just collect the data,
strip out all the extranious data based on filters defined at the start
of the collection process and then output all that data to stdout,
where it will be caught by a tool to properly format and display it.
Begins to sound like something that should be fed to a DBMS...
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,070
Latest member
BiogenixGummies

Latest Threads

Top