When to clear a dictionary...

Bill Jackson

What is the benefit of clearing a dictionary, when you can just reassign
it as empty? Similarly, suppose I generate a new dictionary b, and need
to have it accessible from a. What is the best method, under which
circumstances?
.... a[key] = b[key]
 
Larry Bates

Bill said:
What is the benefit of clearing a dictionary, when you can just reassign
it as empty?

If you have objects that point to the dictionary (something like a cache)
then you want to clear the existing dictionary instead of just assigning
it to empty. If nothing points to it, assigning it to empty is fast and
you can let garbage collection do the rest.
Similarly, suppose I generate a new dictionary b, and need
to have it accessible from a. What is the best method, under which
circumstances?
... a[key] = b[key]

Syntax error in the first example but if you fix that the first two are
equivalent (but I would suspect that the second would be faster for large
dictionaries).

In the third example, both a and b point to the same dictionary after the
a=b, which you can see from:
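One way to check this, as a minimal sketch: the dictionary contents below are illustrative (borrowed from the examples quoted elsewhere in the thread), and the check relies only on the standard `is` operator and `id()`.

```python
# Illustrative values; any two names bound with a = b behave the same way.
a = {1: 2, 3: 4}
b = {1: 2, 4: 3}

a = b  # rebinds the name a to b's dictionary; no data is copied

print(a is b)          # True: both names refer to one object
print(id(a) == id(b))  # True: same check expressed through id()

b[5] = 6               # a change made through b ...
print(a)               # ... is visible through a: {1: 2, 4: 3, 5: 6}
```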

-Larry
 
Gabriel Genellina

If you have objects that point to the dictionary (something like a cache)
then you want to clear the existing dictionary instead of just assigning
it to empty. If nothing points to it, assigning it to empty is fast and
you can let garbage collection do the rest.

For an actual comparison, see Alex Martelli's posts from a few days ago:
http://mail.python.org/pipermail/python-list/2007-March/433027.html
a = {1:2,3:4}
b = {1:2:4:3}
a.clear()
a.update(b)

a = {1:2,3:4}
b = {1:2,4:3}
for key in b:
    a[key] = b[key]
Syntax error in the first example but if you fix that the first two are
equivalent (but I would suspect that the second would be faster for large
dictionaries).

It's the other way; the first method contains a single Python function
call and most of the work is done in C code; the second does the iteration
in Python code and is about 4x slower.
python -m timeit -s "b=dict.fromkeys(range(10000));a={}" "a.update(b)"
100 loops, best of 3: 10.2 msec per loop
python -m timeit -s "b=dict.fromkeys(range(10000));a={}" "for key in b: a[key]=b[key]"
10 loops, best of 3: 39.6 msec per loop
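The same comparison can be run from a script with the standard `timeit` module. This is only a sketch: the helper names `copy_with_update` and `copy_with_loop` and the repeat count of 200 are illustrative choices, and unlike the command lines above each run fills a fresh dictionary. Absolute numbers and the exact ratio will vary with the machine and Python version.

```python
import timeit


def copy_with_update(src):
    """Copy src into a new dict with a single dict.update call."""
    dst = {}
    dst.update(src)
    return dst


def copy_with_loop(src):
    """Copy src into a new dict one key at a time in Python code."""
    dst = {}
    for key in src:
        dst[key] = src[key]
    return dst


b = dict.fromkeys(range(10000))

# Both approaches must produce the same result before timing them.
assert copy_with_update(b) == copy_with_loop(b)

t_update = timeit.timeit(lambda: copy_with_update(b), number=200)
t_loop = timeit.timeit(lambda: copy_with_loop(b), number=200)
print(f"update: {t_update:.4f}s  loop: {t_loop:.4f}s")
```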
 
Larry Bates

Gabriel said:
On Fri, 20 Apr 2007 14:28:00 -0300, Larry Bates wrote:
If you have objects that point to the dictionary (something like a cache)
then you want to clear the existing dictionary instead of just assigning
it to empty. If nothing points to it, assigning it to empty is fast and
you can let garbage collection do the rest.

For an actual comparison, see Alex Martelli's posts from a few days ago:
http://mail.python.org/pipermail/python-list/2007-March/433027.html
a = {1:2,3:4}
b = {1:2:4:3}
a.clear()
a.update(b)

a = {1:2,3:4}
b = {1:2,4:3}
for key in b:
    a[key] = b[key]
Syntax error in the first example but if you fix that the first two are
equivalent (but I would suspect that the second would be faster for large
dictionaries).

It's the other way; the first method contains a single Python function
call and most of the work is done in C code; the second does the
iteration in Python code and is about 4x slower.
python -m timeit -s "b=dict.fromkeys(range(10000));a={}" "a.update(b)"
100 loops, best of 3: 10.2 msec per loop
python -m timeit -s "b=dict.fromkeys(range(10000));a={}" "for key in b: a[key]=b[key]"
10 loops, best of 3: 39.6 msec per loop

--Gabriel Genellina

That is what I meant to say, thanks for catching the error.

-Larry
 
Steven D'Aprano

What is the benefit of clearing a dictionary, when you can just reassign
it as empty?

They are two different things. In the first case, you clear the
dictionary. In the second case, you reassign the name to a new object
(which may be an empty dictionary, or anything else) while leaving the
original dictionary as-is.

Here's an example of clearing the dictionary.
{}

Because both adict and bdict point to the same dictionary object,
clearing it results in an empty dictionary no matter what name (if any!)
you use to refer to it.
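A minimal sketch of the behaviour just described. The exact statements are an assumption, and the starting contents `{1: 'parrot'}` are borrowed from the output shown in the re-assignment example.

```python
adict = {1: 'parrot'}
bdict = adict        # a second name for the same dictionary object

adict.clear()        # empties that one object in place

print(adict)         # {}
print(bdict)         # {} (same object, so it is empty too)
print(adict is bdict)  # True
```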

Here's an example of re-assigning the name.
{1: 'parrot'}

Although adict and bdict both start off pointing to the same dictionary,
once you re-assign the name adict, they now point to different
dictionaries, only one of which is empty.
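A minimal sketch of the re-assignment case, again assuming the same starting dictionary; only the printed `{1: 'parrot'}` is taken from the post itself.

```python
adict = {1: 'parrot'}
bdict = adict        # both names refer to one dictionary

adict = {}           # rebinds the name adict to a brand-new empty dict

print(bdict)         # {1: 'parrot'} (the original object is untouched)
print(adict)         # {}
print(adict is bdict)  # False: the names now refer to different objects
```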

In this specific case, if bdict didn't exist, the original dictionary
would then be garbage-collected and the memory re-claimed. In the C
implementation of Python (CPython), that will happen immediately; in
the Java and (I think) .Net implementations of Python (Jython and
IronPython) it will happen "eventually", with no guarantee of how long it
will take.
Similarly, suppose I generate a new dictionary b, and need
to have it accessible from a. What is the best method, under which
circumstances?

That depends on what you are trying to do.

What is it that you want to do?

(1) "I want the name 'adict' to point to the same dict as bdict."

Solution:

adict = bdict


(2) "I want the data in bdict to update the data in adict, keeping items
in adict that are not in bdict but replacing them if they are in bdict."

Solution:

adict.update(bdict)
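As a small illustration of what `update` does with overlapping keys (the keys and values here are made up for the example):

```python
adict = {'spam': 1, 'eggs': 2}
bdict = {'eggs': 20, 'ham': 30}

adict.update(bdict)

# 'spam' is kept, 'eggs' is replaced by bdict's value, 'ham' is added.
print(adict)   # {'spam': 1, 'eggs': 20, 'ham': 30}
print(bdict)   # {'eggs': 20, 'ham': 30} (bdict itself is unchanged)
```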


(3) "I want the data in bdict to be copied into adict, throwing away
whatever was already there."

Solution:

adict.clear()
adict.update(bdict)


(4) "I want the data in bdict to be copied into adict, but keeping what
was already there."

Solution:

for key in bdict:
    if key in adict:
        pass  # ignore it
    else:
        adict[key] = bdict[key]  # add it
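An alternative way to get the same effect as case (4), offered as a sketch rather than as part of the original answer: `dict.setdefault` stores a value only when the key is missing. The sample keys and values are made up for the example.

```python
adict = {'spam': 1, 'eggs': 2}
bdict = {'eggs': 20, 'ham': 30}

for key, value in bdict.items():
    # setdefault leaves existing keys alone and adds missing ones.
    adict.setdefault(key, value)

print(adict)   # {'spam': 1, 'eggs': 2, 'ham': 30}
```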
 
