Why is dictionary.keys() a list and not a set?

Christoph Zwerschke · Nov 23, 2005

Ok, the answer is easy: For historical reasons - built-in sets exist
only since Python 2.4.

Anyway, I was thinking about whether it would be possible and desirable
to change the old behavior in future Python versions and let dict.keys()
and dict.values() both return sets instead of lists.

If d is a dict, code like:

for x in d.keys():
...

or even

for x in sorted(d.keys()):
...

would still work and do the same.

However, the older idiom

k = d.keys()
k.sort()
for x in k:
...

would break (since you cannot sort a map directly).

So it seems not possible to change the behavior since it would break old
code. Maybe keys() and values() could be made deprecated functions,
superseded by keyset() and valueset(), but probably this would not be
worth the effort.

Another related question: Actually, dicts can be considered as
subclasses of sets and thus could inherit a lot of set methods and
operators. Only intersection and union must be treated with care,
because dictionaries must be mappings (i.e. map to exactly one value).

In fact, some inheritance from sets has already been implemented:

For instance, by allowing the set operator "in" for dictionaries,
instead of "has_key".

The len(), clear(), copy() and update() functions can also be
interpreted as inherited from set. The dictionary method popitem() is
actually the inherited set method pop(). (d.pop() without arguments
could actually do the same as d.popitem(); I guess the suffix "item" was
chosen to remind you that it returns a key/value pair, not only a value.)

But could other set methods also be useful?

A dictionary could inherit the remove() and discard() method from sets.
d.remove(x) would do the same as del d[x], but d.discard(x) would not
raise a KeyError if x not in d.

Or, for instance, if

a = { 1:11, 2:12 }; b = { 2:22, 3:13 }, c = { 2:32 }

then

c.issubset(b) == c <= b == True
b.issuperset(c) == b >= c == True
a.difference(b) == a - b == { 1:11 }
a.s.symmetric_difference(b) == a ^ b == { 1:11, 3:13 }
a.update(b) == a |= b == a = { 1:11, 2:22, 3:13 }
a.intersection_update(b) == a &= b == a = { 2:22 }
a.difference_update(b) == a -= b == a = { 1:11 }
a.symmetric_difference_update(b) == a ^= b == a = { 1:11, 3:13 }

Of these, a |= b may be particularly interesting as short notation for
a.update(b).

-- Christoph

Mike Meyer · Nov 24, 2005

Christoph Zwerschke said:
Ok, the answer is easy: For historical reasons - built-in sets exist
only since Python 2.4.

Anyway, I was thinking about whether it would be possible and
desirable to change the old behavior in future Python versions and let
dict.keys() and dict.values() both return sets instead of lists.

Two (rhetorical, since you dropped the idea) questions:

Are you sure dict.values() should be a set? After all, values aren't
guaranteed to be unique, so dict(a = 1, b = 1).values() currently
returns [1, 1], but would return set([1]) under your proposal.

What about dict.items()?

For instance, by allowing the set operator "in" for dictionaries,
instead of "has_key".

"in" already works for dicdtionaries:

But could other set methods also be useful?

Looks like it.

<mike

=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?= · Nov 24, 2005

Christoph said:
Anyway, I was thinking about whether it would be possible and desirable
to change the old behavior in future Python versions and let dict.keys()
and dict.values() both return sets instead of lists.

You answer the question whether it would be possible (no, because of
backwards compatibility); you don't attempt to answer the question
whether it would be desirable. Why do you think that would be
a good idea?

If you want the set of keys of a dictionary d, then do set(d):
set([1, 3])

As Mike Meyer explains, the same is meaningless for .values():
they might not be unique. For .items(), conversion into a set
does would appear to be meaningful, but doesn't actually work:
if the values contain unhashable objects, the set of tuples
cannot be constructed (as set members must be immutable).

Regards,
Martin

Peter Hansen · Nov 24, 2005

Christoph said:
Ok, the answer is easy: For historical reasons - built-in sets exist
only since Python 2.4.

Anyway, I was thinking about whether it would be possible and desirable
to change the old behavior in future Python versions and let dict.keys()
and dict.values() both return sets instead of lists.

Definitely not. I believe it's currently guaranteed that the order of
the items in dict.keys() and dict.values() will match (i.e. the index of
any key in its list will be the same as the index of the corresponding
value in its list). This property is almost certainly used in some
code, so it can't be broken without good reason.

-Peter

jepler · Nov 24, 2005

You can already get a set from a dictionary's keys in an efficient manner:Set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDhR6sJd01MZaTXX0RAvz5AJoD02CCRhNt5xsTpZI7rjiVrYUywACfYNzL
vhkMLH1qXOV1i8R1Je5cSzk=
=9crP
-----END PGP SIGNATURE-----

bonono · Nov 24, 2005

Peter said:
Definitely not. I believe it's currently guaranteed that the order of
the items in dict.keys() and dict.values() will match (i.e. the index of
any key in its list will be the same as the index of the corresponding
value in its list). This property is almost certainly used in some
code, so it can't be broken without good reason.

Interesting, but why it guaranteed this as functionality wise they are
two different things and since dict is unordered, there is no need for
such guarantee, would it be just implementation consequence ?

This to me is even more uncertain than the zip(it,it) case.

Peter Hansen · Nov 24, 2005

Interesting, but why it guaranteed this as functionality wise they are
two different things and since dict is unordered, there is no need for
such guarantee, would it be just implementation consequence ?

From the docs (http://docs.python.org/lib/typesmapping.html):

"""
Keys and values are listed in an arbitrary order which is non-random,
varies across Python implementations, and depends on the dictionary's
history of insertions and deletions. If items(), keys(), values(),
iteritems(), iterkeys(), and itervalues() are called with no intervening
modifications to the dictionary, the lists will directly correspond.
This allows the creation of (value, key) pairs using zip(): "pairs =
zip(a.values(), a.keys())". The same relationship holds for the
iterkeys() and itervalues() methods: "pairs = zip(a.itervalues(),
a.iterkeys())" provides the same value for pairs. Another way to create
the same list is "pairs = [(v, k) for (k, v) in a.iteritems()]".
"""

Does that explain it adequately?

-Peter

bonono · Nov 24, 2005

Peter said:
Interesting, but why it guaranteed this as functionality wise they are
two different things and since dict is unordered, there is no need for
such guarantee, would it be just implementation consequence ?

Click to expand...

From the docs (http://docs.python.org/lib/typesmapping.html):

"""
Keys and values are listed in an arbitrary order which is non-random,
varies across Python implementations, and depends on the dictionary's
history of insertions and deletions. If items(), keys(), values(),
iteritems(), iterkeys(), and itervalues() are called with no intervening
modifications to the dictionary, the lists will directly correspond.
This allows the creation of (value, key) pairs using zip(): "pairs =
zip(a.values(), a.keys())". The same relationship holds for the
iterkeys() and itervalues() methods: "pairs = zip(a.itervalues(),
a.iterkeys())" provides the same value for pairs. Another way to create
the same list is "pairs = [(v, k) for (k, v) in a.iteritems()]".
"""

Does that explain it adequately?

Thanks, but yes and no.

Which is also my initial puzzle, items() and iteritems() already gives
you the tuples, why such gurantee or the others ? Doesn't that violate
the general idiom that if we can do certain thing in one way, there
better be one and only one way.

Are there other usage scenarios that would be benefit from this as the
example given is not convincing for me.

jepler · Nov 24, 2005

One reason might be Practicality. The zip() versions handily beat the
listcomp versions on a 10kitem dict. (python2.4)

$ timeit.py -s 'd = dict.fromkeys(range(10000))' '[(v, k) for (k, v) in d.iteritems()]'
100 loops, best of 3: 5.05 msec per loop
$ timeit.py -s 'd = dict.fromkeys(range(10000))' '[(v, k) for (k, v) in d.items()]'
100 loops, best of 3: 7.14 msec per loop
$ timeit.py -s 'd = dict.fromkeys(range(10000))' 'zip(d.itervalues(), d.iterkeys())'
100 loops, best of 3: 3.13 msec per loop
$ timeit.py -s 'd = dict.fromkeys(range(10000))' 'zip(d.values(), d.keys())'
100 loops, best of 3: 3.28 msec per loop

For comparison,
$ timeit.py -s 'd = dict.fromkeys(range(10000))' 'd.items()'
100 loops, best of 3: 2.19 msec per loop

Maybe there are also other reasons to promise this property that I'm not
aware of. Certainly it seems like this property is useful and not hard
to provide for "non-perverse" implementations, much like the
now-documented stability of sort().

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDhTuAJd01MZaTXX0RAp2YAKCCaFalUL3ogLc/JKcIwuX7+N8QigCdG+28
8HTzKvaPYMZxKPamLPajBzw=
=zqeT
-----END PGP SIGNATURE-----

bonono · Nov 24, 2005

[email protected] said:
One reason might be Practicality. The zip() versions handily beat the
listcomp versions on a 10kitem dict. (python2.4)

$ timeit.py -s 'd = dict.fromkeys(range(10000))' '[(v, k) for (k, v) in d.iteritems()]'
100 loops, best of 3: 5.05 msec per loop
$ timeit.py -s 'd = dict.fromkeys(range(10000))' '[(v, k) for (k, v) in d.items()]'
100 loops, best of 3: 7.14 msec per loop
$ timeit.py -s 'd = dict.fromkeys(range(10000))' 'zip(d.itervalues(), d.iterkeys())'
100 loops, best of 3: 3.13 msec per loop
$ timeit.py -s 'd = dict.fromkeys(range(10000))' 'zip(d.values(), d.keys())'
100 loops, best of 3: 3.28 msec per loop

For comparison,
$ timeit.py -s 'd = dict.fromkeys(range(10000))' 'd.items()'
100 loops, best of 3: 2.19 msec per loop

Maybe there are also other reasons to promise this property that I'm not
aware of. Certainly it seems like this property is useful and not hard
to provide for "non-perverse" implementations, much like the

Thanks, but we don't need the list comprehension(or do we) ? As the
return of d.iteritems() is already a list of tuples.

Peter Hansen · Nov 24, 2005

Which is also my initial puzzle, items() and iteritems() already gives
you the tuples, why such gurantee or the others ? Doesn't that violate
the general idiom that if we can do certain thing in one way, there
better be one and only one way.

Are there other usage scenarios that would be benefit from this as the
example given is not convincing for me.

As Jeff's reply emphasizes, the "examples" show tuples with *value* and
then *key*, not the other way around which is what .items() and
..itemitems() gives you.

Anyway, you didn't ask for good examples of use, just "why is it
guaranteed", and that's all I was showing you. Whether it was a good
idea might be an interesting discussion, but probably not a particularly
useful one given once again that it's unlikely this particular feature
could be undone now that code exists (we can assume) which is dependent
on it.

-Peter

bonono · Nov 24, 2005

Peter said:
As Jeff's reply emphasizes, the "examples" show tuples with *value* and
then *key*, not the other way around which is what .items() and
.itemitems() gives you.

Anyway, you didn't ask for good examples of use, just "why is it
guaranteed", and that's all I was showing you. Whether it was a good
idea might be an interesting discussion, but probably not a particularly
useful one given once again that it's unlikely this particular feature
could be undone now that code exists (we can assume) which is dependent
on it.

Please don't take it as a challenge, it was not. If it was, it was
about the need for the guarantee, not about you not giving me the
answer I want. But it is really wondering why ?

As for the (k,v) vs (v,k), I still don't think it is a good example. I
can always use index to access the tuple elements and most other
functions expect the first element to be the key. For example :

a=d.items()
do something about a
b = dict(a)

But using the imaginary example :

a = zip(d.values(), d.keys())
do something about a
b = dict(a) # probably not what I want.

Mike Meyer · Nov 24, 2005

[email protected] said:
Which is also my initial puzzle, items() and iteritems() already gives
you the tuples, why such gurantee or the others ? Doesn't that violate
the general idiom that if we can do certain thing in one way, there
better be one and only one way.

Backwards compatability. The guarantee on the order of keys() and
values() predates items() (and iteritems()). You can't remove the
guarantee without potentially breaking old code, and that's Not Done -
at least not to code that wasn't broken to begin with.

Maybe dropping the guarantee should be considered for P3K, on the off
chance that either keys or values could be made faster at some point
in the future. But I don't see it as a big deal.

<mike

bonono · Nov 24, 2005

Mike said:
Backwards compatability. The guarantee on the order of keys() and
values() predates items() (and iteritems()). You can't remove the
guarantee without potentially breaking old code, and that's Not Done -
at least not to code that wasn't broken to begin with.

Maybe dropping the guarantee should be considered for P3K, on the off
chance that either keys or values could be made faster at some point
in the future. But I don't see it as a big deal.

Thanks, that explain it. I don't know that there was no
items()/iteritems() in earlier version.

Peter Hansen · Nov 24, 2005

Please don't take it as a challenge, it was not. If it was, it was
about the need for the guarantee, not about you not giving me the
answer I want. But it is really wondering why ?

If you really want to know merely the *need* for the guarantee, as in
why was this ever put in in the first place, you will be sure of a
correct answer only if it's provided by those who wrote the code in the
first place. Anything else is guessing or mind-reading. I wasn't
trying to do even that, but just to point to where I had seen the
afore-mentioned guarantee. My responsibility ended there, and with that
I am outta here.

-Peter

(And don't worry, I didn't take it as a challenge. If you thought I
took offense somewhere, please re-read my words without that thought in
mind and I think you'll see merely straight-forward factual responses to
your questions, as they were meant to be. Cheers.)

Steve Holden · Nov 24, 2005

(e-mail address removed) wrote:
[...]

Maybe there are also other reasons to promise this property that I'm not
aware of. Certainly it seems like this property is useful and not hard
to provide for "non-perverse" implementations, much like the

Click to expand...

Thanks, but we don't need the list comprehension(or do we) ? As the
return of d.iteritems() is already a list of tuples.

No it isn't: it's a dictionary item iterator. This can be used to
provide a list, but it's self-evidently an iterator, not a list:

>>> d = dict.fromkeys(range(5))
>>> d.iteritems

Click to expand...

>>> d.iteritems()

Click to expand...

>>> [x for x in d.iteritems()] [(0, None), (1, None), (2, None), (3, None), (4, None)]
>>>

Click to expand...

Click to expand...

regards
Steve

bonono · Nov 24, 2005

Steve said:
(e-mail address removed) wrote:
[...]

Maybe there are also other reasons to promise this property that I'm not
aware of. Certainly it seems like this property is useful and not hard
to provide for "non-perverse" implementations, much like the

Click to expand...

Thanks, but we don't need the list comprehension(or do we) ? As the
return of d.iteritems() is already a list of tuples.

Click to expand...

No it isn't: it's a dictionary item iterator. This can be used to
provide a list, but it's self-evidently an iterator, not a list:

Oops, typo. I was meant to say d.items(). stand corrected.

Fredrik Lundh · Nov 24, 2005

Mike said:
Backwards compatability. The guarantee on the order of keys() and
values() predates items() (and iteritems()).

according to the SVN repository, the type originally had "keys" and
"has_key" only. "values" and "items" were both added in the same
checkin (may 1993).

performance is of course another aspect; if you *need* two parallel
lists, creating a list full of tuples just to pull them apart and throw
them all away isn't exactly the most efficient way to do things.

(if performance didn't matter at all, we could get rid most dictionary
methods; "iterkeys", "in", and locking should be enough, right?)

Maybe dropping the guarantee should be considered for P3K, on the off
chance that either keys or values could be made faster at some point
in the future. But I don't see it as a big deal.

the guarantee only means that the type must use the same algorithm
to traverse the dictionary data structure for both "keys" and "values".
I'm not aware of any dictionary algorithm that doesn't maintain a link
between keys and values, so that's not really much of a restriction...

</F>

bonono · Nov 24, 2005

Fredrik said:
performance is of course another aspect; if you *need* two parallel
lists, creating a list full of tuples just to pull them apart and throw
them all away isn't exactly the most efficient way to do things.

(if performance didn't matter at all, we could get rid most dictionary
methods; "iterkeys", "in", and locking should be enough, right?)

If I need two parallel list(one key, one value), I can still use the
same [(k,v)] tuple, just access it as x[0], x[1].

But the history alone explains it anyway.

Steven D'Aprano · Nov 24, 2005

Peter said:
Definitely not. I believe it's currently guaranteed that the order of
the items in dict.keys() and dict.values() will match (i.e. the index of
any key in its list will be the same as the index of the corresponding
value in its list). This property is almost certainly used in some
code, so it can't be broken without good reason.

As I recall, that guarantee is only if the dict is not
modified between retrieving the keys and retrieving the
values.

Why is Python telling me variable is local not global?	3	Sep 2, 2023
Average of MultiMode of a list of a list	1	Oct 28, 2022
Not sure why drop-down is not working.	2	Mar 24, 2024
A number everyday of the month "and" a different number depending on the day of the month´s day time	2	Mar 16, 2021
Function is not worked in C	2	Jun 27, 2023
Calculate rang and derang of ordering of subsets	1	Feb 6, 2024
Where is my mistake? Why is s equal to minus infinity at some loop iterations?	0	Oct 9, 2022
Why struct not globally changed in function?	1	Aug 22, 2023

Why is dictionary.keys() a list and not a set?

Christoph Zwerschke

Mike Meyer

=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=

Peter Hansen

jepler

bonono

Peter Hansen

bonono

jepler

bonono

Peter Hansen

bonono

Mike Meyer

bonono

Peter Hansen

Steve Holden

bonono

Fredrik Lundh

bonono

Steven D'Aprano

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads