Delete dict and subdict items of some name

G

Gnarlodious

Hello. What I want to do is delete every dictionary key/value of the name 'Favicon' regardless of depth in subdicts, of which there are many. What is the best way to do it?

-- Gnarlie
 
O

Oscar Benjamin

Hello. What I want to do is delete every dictionary key/value of the name 'Favicon' regardless of depth in subdicts, of which there are many. What is the best way to do it?

You might need to be a bit clearer about what you mean by subdicts. I
don't really know what you mean.

Could you perhaps post some short code that creates the kind of data
structure you are referring to?

e.g. Do you mean something like this?

d = {
'a': {'b': 'Favicon'},
'b': {'c': 'Favicon'},
}


Oscar
 
M

Mitya Sirenef

Hello. What I want to do is delete every dictionary key/value of the name 'Favicon' regardless of depth in subdicts, of which there are many. What is the best way to do it?

-- Gnarlie

Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)
 
P

Paul Rubin

Gnarlodious said:
Hello. What I want to do is delete every dictionary key/value of the
name 'Favicon' regardless of depth in subdicts, of which there are
many. What is the best way to do it?

Untested:

def unfav(x):
if type(x) != dict: return x
return dict((k,unfav(v)) for k,v in x.iteritems() if k != 'favicon')
 
D

Dave Angel

Hello. What I want to do is delete every dictionary key/value of the name 'Favicon' regardless of depth in subdicts, of which there are many. What is the best way to do it?

-- Gnarlie
I would write a recursive function that accepts a dict.

In that function, if a key "Favicon" exists, then remove it. Then loop
through the dictionary key/value pairs, and for any value that's an
instance of dict, call yourself recursively.

Give it a try, and if it won't work, supply us with a bit more
information, starting with a sample dict, and the python version you're
aiming at.
 
T

Tim Chase

Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?

Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)

Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)

However, assuming the initial structure is tree-ish (acyclic),
Mitya's function should do the trick

-tkc
 
M

Mitya Sirenef

Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?
Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)
Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)

True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?

-m
 
C

Chris Angelico

On 12/17/2012 12:27 PM, Gnarlodious wrote:

Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?

Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)

Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)


True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?

Nope, recursion could occur anywhere. You'd have to maintain a set of
"visited nodes" (or their id()s, same diff), and skip any that are in
it.

ChrisA
 
D

Dave Angel

On 12/17/2012 12:27 PM, Gnarlodious wrote:
Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?
Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)
Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)

True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?
No, that would only cover the self-recursive case. If there's a dict
which contains another one, which contains the first, then the recursion
is indirect, and much harder to check for.

Checking reliably for arbitrary recursion patterns is tricky, but
do-able. Most people degenerate into just setting an arbitrary max
depth. But I can describe two approaches to this kind of problem.

1) build a list of the recursion path at present, and compare against
the whole path, rather than just the tail. If there are any matches, quit.

2) make the iterator an object, and instantiate two of them. Then each
recursive level, iterate the main one once, and the secondary one
twice. If the two ever match, you have a loop. Deciding what to do at
that point is tricky because you may have processed some nodes multiple
times already. But at least it'll terminate, and it doesn't use linear
memory to do so. I call this one the lollypop algorithm.
 
M

MRAB

On 12/17/12 11:43, Mitya Sirenef wrote:
On 12/17/2012 12:27 PM, Gnarlodious wrote:
Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?
Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)
Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)

True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?
No, that would only cover the self-recursive case. If there's a dict
which contains another one, which contains the first, then the recursion
is indirect, and much harder to check for.

Checking reliably for arbitrary recursion patterns is tricky, but
do-able. Most people degenerate into just setting an arbitrary max
depth. But I can describe two approaches to this kind of problem.
Wouldn't a set of the id of the visited objects work?
 
O

Oscar Benjamin

On 12/17/2012 01:30 PM, Tim Chase wrote:
On 12/17/12 11:43, Mitya Sirenef wrote:
On 12/17/2012 12:27 PM, Gnarlodious wrote:

Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?

Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)

Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)


True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?
No, that would only cover the self-recursive case. If there's a dict
which contains another one, which contains the first, then the recursion
is indirect, and much harder to check for.

Checking reliably for arbitrary recursion patterns is tricky, but
do-able. Most people degenerate into just setting an arbitrary max
depth. But I can describe two approaches to this kind of problem.
Wouldn't a set of the id of the visited objects work?

Of course it would. This is just a tree search.

Here's a depth-first-search function:

def dfs(root, childfunc, func):
'''depth first search on a tree
calls func(node) once for each node'''
visited = set()
visiting = OrderedDict()
visiting[id(root)] = it = iter([root])

while True:
try:
node = next(it)
except StopIteration:
try:
node, it = visiting.popitem()
except KeyError:
return
key = id(node)
if isinstance(node, dict) and key not in visited:
func(node)
visiting[key] = it = iter(childfunc(node))
visited.add(key)

Now you can do:

dfs(my_dict_tree, lambda x: x.pop('Favicon', None))


Although I wouldn't bother with the above unless I had some reason to
expect possible cycles.


Oscar
 
S

Steven D'Aprano

Hello. What I want to do is delete every dictionary key/value of the
name 'Favicon' regardless of depth in subdicts, of which there are many.
What is the best way to do it?

Firstly, you should assume we know what you are talking about, rather
than explain in clear terms. Never show us an example of your data
structure. That way we can have the entertainment of guessing what the
structure of the data structure is, and coding for data that you don't
care about.

Secondly, never tell us what (if anything) you have already tried, so
that we can share in the same dead-ends and failed attempts. Share and
share alike.

Thirdly, make sure we don't have a clear idea of what you consider
"best", e.g. fastest to write, fastest to execute, most readable, most
easily maintainable, most memory efficient, short enough to be used as a
one-liner, or something else. For bonus points, this shouldn't be clear
in your own mind either.

Fourth, never be clear about the functional requirements. For example, do
you want to mutate the data in place? Or should the original data remain
untouched, and a modified copy be made? By leaving this unstated or
subtly implied, you can ensure that our answers have a 50% chance of
getting it wrong.

If you keep these points in mind, you too can ask open-ended, poor
questions that generate much discussion but don't solve your actual
problem.

Oh wait, I see you already know this! Sorry for the noise.


*wink*
 
O

Oscar Benjamin

Wouldn't a set of the id of the visited objects work?

Of course it would. This is just a tree search.

Here's a depth-first-search function:

def dfs(root, childfunc, func):
'''depth first search on a tree
calls func(node) once for each node'''
visited = set()
visiting = OrderedDict()
visiting[id(root)] = it = iter([root])

while True:
try:
node = next(it)
except StopIteration:
try:
node, it = visiting.popitem()
except KeyError:
return
key = id(node)
if isinstance(node, dict) and key not in visited:
func(node)
visiting[key] = it = iter(childfunc(node))
visited.add(key)

Now you can do:

dfs(my_dict_tree, lambda x: x.pop('Favicon', None))

Slight correction:
dfs(g, lambda n: n.values(), lambda x: x.pop('Favicon', None))


Oscar
 
M

Mitya Sirenef

On 12/17/12 11:43, Mitya Sirenef wrote:
On 12/17/2012 12:27 PM, Gnarlodious wrote:
Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?
Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)
Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)
True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?
No, that would only cover the self-recursive case. If there's a dict
which contains another one, which contains the first, then the recursion
is indirect, and much harder to check for.

Checking reliably for arbitrary recursion patterns is tricky, but
do-able. Most people degenerate into just setting an arbitrary max
depth. But I can describe two approaches to this kind of problem.

1) build a list of the recursion path at present, and compare against
the whole path, rather than just the tail. If there are any matches, quit.

2) make the iterator an object, and instantiate two of them. Then each
recursive level, iterate the main one once, and the secondary one
twice. If the two ever match, you have a loop. Deciding what to do at
that point is tricky because you may have processed some nodes multiple
times already. But at least it'll terminate, and it doesn't use linear
memory to do so. I call this one the lollypop algorithm.

Thanks, this is quite interesting..
 
D

Dave Angel

On 12/17/2012 01:30 PM, Tim Chase wrote:
On 12/17/12 11:43, Mitya Sirenef wrote:
On 12/17/2012 12:27 PM, Gnarlodious wrote:
Hello. What I want to do is delete every dictionary key/value
of the name 'Favicon' regardless of depth in subdicts, of which
there are many. What is the best way to do it?
Something like this should work:

def delkey(d, key):
if isinstance(d, dict):
if key in d: del d[key]
for val in d.values():
delkey(val, key)
Unless you have something hatefully recursive like

d = {}
d["hello"] = d

:)

True -- didn't think of that..!

I guess then adding a check 'if val is not d: delkey(val, key)'
would take care of it?
No, that would only cover the self-recursive case. If there's a dict
which contains another one, which contains the first, then the recursion
is indirect, and much harder to check for.

Checking reliably for arbitrary recursion patterns is tricky, but
do-able. Most people degenerate into just setting an arbitrary max
depth. But I can describe two approaches to this kind of problem.
Wouldn't a set of the id of the visited objects work?

Sure. But the set will get lots larger than a list, which is limited to
the depth of max recursion. It also locks a lot more objects in memory,
where the list only locks one per level.

Now, maybe if the search is depth-first, and if you prune the set on the
way back up, then it'll be space efficient.
 
G

Gnarlodious

This problem is solved, I am so proud of myself for figuring it out! After reading some of these ideas I discovered the plist is really lists underneath any "Children" key:


from plistlib import readPlist

def explicate(listDicts):
for dict in listDicts:
if 'FavIcon' in dict:
del dict['FavIcon']
if 'Children' in dict:
dict['Children']=explicate(dict['Children'])
return listDicts

listDicts=readPlist(TARGET_FILE)['Children']
explicate(listDicts)
print(listDicts)


This plist is used by the Mac browser iCab for bookmarks. Removing the Favicon data shrinks the file by about 99% and speeds uploading.

I am glad everyone had a nice discussion about my question, but it wasn't really accurate. Sorry bout that!

-- Gnarlie
 
H

Hans Mulder

This problem is solved, I am so proud of myself for figuring it out!
After reading some of these ideas I discovered the plist is really
lists underneath any "Children" key:


from plistlib import readPlist

def explicate(listDicts):
for dict in listDicts:
if 'FavIcon' in dict:
del dict['FavIcon']
if 'Children' in dict:
dict['Children']=explicate(dict['Children'])
return listDicts

It would be more Pythonic to return None, to indicate that you've
changed the list in situ.

Since None is the default return value, this means you can leave
out the return statement.


Hope this helps,

-- HansM
 
G

Gnarlodious

This problem is solved, I am so proud of myself for figuring it out!
After reading some of these ideas I discovered the plist is really
lists underneath any "Children" key:


from plistlib import readPlist

def explicate(listDicts):
for dict in listDicts:
if 'FavIcon' in dict:
del dict['FavIcon']
if 'Children' in dict:
dict['Children']=explicate(dict['Children'])

return listDicts

It would be more Pythonic to return None, to indicate that you've
changed the list in situ.

Since None is the default return value, this means you can leave
out the return statement.
But then it only operates on the outer layer, inner layers might get processed but not written. Unless I don't understand what you're saying.

--Gnarlie
 
T

Terry Reedy

I do not see this used below.
def explicate(listDicts):
for dict in listDicts:
if 'FavIcon' in dict:
del dict['FavIcon']
if 'Children' in dict:
dict['Children']=explicate(dict['Children'])
return listDicts
It would be more Pythonic to return None, to indicate that you've
changed the list in situ.

And since it is being changed at the top level (by deletion), it should
be changed in place all the way down.

dict['Children']=explicate(dict['Children'])
would then need to be
explicate(dict['Children'])
But then it only operates on the outer layer,
inner layers might get processed but not written.

I believe the above answers your concern. But to be sure it is correct,
YOU NEED TEST CASES. In fact, your original post should have contained
at least one non-trivial test case: an input dict and what you wanted it
to look like after processing. Writing at least some tests before code
is a great idea.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,774
Messages
2,569,596
Members
45,143
Latest member
DewittMill
Top