docs patch: dicts and sets

Alan Isaac

This is an attempt to synthesize Bill and Carsten's proposals.
(I'm changing the subject line to better match the topic.)

http://docs.python.org/lib/typesmapping.html: for footnote (3)

Keys and values are listed in an arbitrary order. This order is
indeterminate and generally depends on factors outside the scope of
the containing program. However, if items(), keys(), values(),
iteritems(), iterkeys(), and itervalues() are called with no
intervening modifications to the dictionary, the lists will directly
correspond.
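
The correspondence described in this footnote can be checked with a short sketch. It assumes Python 3, where iteritems(), iterkeys(), and itervalues() no longer exist and keys(), values(), and items() return views rather than lists; the dictionary contents are hypothetical sample data.

```python
# Sketch: with no intervening modification, keys(), values() and items()
# walk the dictionary in the same order.
inventory = {"apples": 3, "pears": 7, "plums": 1}  # hypothetical sample data

keys = list(inventory.keys())
values = list(inventory.values())
items = list(inventory.items())

# Pairing the i-th key with the i-th value reproduces items().
pairs_match = list(zip(keys, values)) == items
print(pairs_match)

# Rebuilding a dict from the two lists gives back an equal dict.
rebuilt = dict(zip(keys, values))
print(rebuilt == inventory)
```

The check says nothing about which order is used, only that the three calls agree with each other as long as the dictionary is not modified in between.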

http://docs.python.org/lib/types-set.html: append a new sentence to 2nd par.

Iteration over a set returns elements in an indeterminate
order, which generally depends on factors outside the scope of the
containing program.

Alan Isaac
 
Raymond Hettinger

This is an attempt to synthesize Bill and Carsten's proposals.
(I'm changing the subject line to better match the topic.)

http://docs.python.org/lib/typesmapping.html: for footnote (3)

Keys and values are listed in an arbitrary order. This order is
indeterminate and generally depends on factors outside the scope of
the containing program. However, if items(), keys(), values(),
iteritems(), iterkeys(), and itervalues() are called with no
intervening modifications to the dictionary, the lists will directly
correspond.

http://docs.python.org/lib/types-set.html: append a new sentence to 2nd par.

Iteration over a set returns elements in an indeterminate
order, which generally depends on factors outside the scope of the
containing program.

This doesn't improve the docs. It suggests some mystic forces at work
while offering nothing that is actionable or that improves
understanding. Adding this kind of muck will only make the docs less
clear.

Recommend dropping this one and moving on to solve some real problems.


Raymond
 
rurpy

This doesn't improve the docs. It suggests some mystic forces at work
while offering nothing that is actionable or that improves
understanding. Adding this kind of muck will only make the docs less
clear.

I too find the suggested text not very clear and would not
immediately predict from it the effects that started this
thread (or actually, the thread at
http://groups.google.com/group/comp.lang.python/msg/4dc632b476fdc6d3?hl=en&).

Recommend dropping this one and moving on to solve some real problems.

Perhaps this attitude helps explain some of the problems
in the current documentation.

Dismissing this as not a "real problem" is both wrong
and offensive to people taking the time to actually
propose improvements.

The current docs are clearly wrong. To repeat what has
already been pointed out, they say, "Keys and values are
listed in an arbitrary order which is non-random,
varies across Python implementations, and depends on the
dictionary's history of insertions and deletions."

It has been shown that even when the history of insertions
and deletions is the same, the order may be different.
Taking "history" to extend across program invocation
boundaries is unconventional, to put it charitably, and
there is no reason to assume that interpretation would
occur to a reasonable reader. The whole issue can be
cleared up simply by clarifying the documentation; I
really fail to see why this should be at all controversial.

I will offer my own suggestion based on the belief that
documentation should be as explicit as possible:

"Keys and values are listed in an arbitrary but non-random
order which may vary across Python versions, implementations,
and the dictionary's history of insertions and deletions.
When the contents are objects using the default implementation
of __hash__() and __eq__(), the order will depend on the
objects' id() values which may be different even between
different invocations of a program (whether executed from
a .py or a .pyc file for example.)"
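
The effect this wording describes can be observed with a small experiment. The sketch below is an illustration under assumptions, not part of the proposed doc text: it assumes Python 3.7 or later, and the class name `Item` and the labels are hypothetical. It runs the same unchanged program in several fresh interpreters and prints the order in which a set of plain objects (default __hash__() and __eq__()) is iterated.

```python
import subprocess
import sys

# Child program: build a set of plain objects that use the default
# __hash__() and __eq__(), then print their labels in iteration order.
CHILD = r'''
class Item(object):          # hypothetical class, for illustration only
    def __init__(self, label):
        self.label = label

items = set(Item(label) for label in "abcdefgh")
print("".join(item.label for item in items))
'''

def run_once():
    """Run the child program in a fresh interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

orders = [run_once() for _ in range(3)]
for order in orders:
    print(order)  # the orders may or may not differ between runs
```

Whether the printed orders differ depends on the interpreter and the platform, which is the reason for the word "may" in the suggested text. Every run contains the same eight labels; only their order is unspecified.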

Apropos sig...
 
Raymond Hettinger

Dismissing this as not a "real problem" is both wrong
and offensive to people taking the time to actually
propose improvements.

I should have elaborated on what I meant by saying that there is not a
real problem. Another way to put it is that the docs are sufficient
when they say that set ordering is arbitrary. That should be a cue to
not have *any* expectations about the internal ordering of sets and
dicts.

Any further documentation of behavior would be a mistake because it
would of necessity expose implementation specific details. For
instance, there is another intentionally undocumented observable
behavior that sets and dicts change their internal order as new
members are added. It is also intentional that Python makes almost no
promises about the location of objects in memory. IIRC, the only
guarantees made about object identity are that "a is a" is always true
and None can be tested with "is".
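
A minimal sketch of the two behaviors mentioned above, with hypothetical sample values. It uses a set rather than a dict because, since Python 3.7, dicts preserve insertion order, so the reordering described here no longer applies to them. No particular output order is claimed.

```python
# Sketch: a set's iteration order can change as members are added.
small = {8, 1}                 # hypothetical sample values
before = list(small)

small.update(range(100, 120))  # growth may trigger an internal resize
after = [x for x in small if x in (1, 8)]

print(before)  # order of 1 and 8 before growth
print(after)   # order of 1 and 8 after growth; may differ from `before`

# Membership is safe to rely on; position is not.
print(sorted(before) == sorted(after))

# Identity comparison is reliable for None.
value = None
print(value is None)
```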


Raymond
 
Alan Isaac

Raymond Hettinger said:
Another way to put it is that the docs are sufficient
when they say that set ordering is arbitrary. That should be a cue to
not have *any* expectations about the internal ordering of sets and
dicts.

You are usually more careful.

1. Please do not conflate two issues here.
It confuses people like Richard T.

Did *anyone* who participated in the initial conversation
express an expectation that set ordering is not arbitrary?
No. Not one.

What surprised people was that this ordering
could vary between two *sequential* executions of
an *unchanged* source.

Martin dismisses this by simply asserting (on what basis?)
that anyone who was surprised lacks Python experience,
and that to address this in any way would make the
reference library assume the role of a tutorial.
Not very plausible, IMO, given the rest of the library
documentation.

2. You say the existing docs "should be a cue",
and yet they clearly did not provide enough guidance
to an ordinary user (me) and some more sophisticated users.
So the docs "should be a cue" to people who do not need a cue.
Do I understand you correctly?

3. Finally, please do not claim that the docs say that set ordering is
arbitrary.
At least not the docs we have been talking about:
http://docs.python.org/lib/types-set.html
It is fascinating that you would confuse this, since it is the core
of the proposed documentation patch (although the proposed
language was "indeterminate" rather than arbitrary).

So it also seems you are now claiming that the patch should not be in
because of the presence of language that is in fact not there.

Look, I was just trying to help other users who might be
as surprised as I was. As I said, I am not attached to any
language, and in fact I just used the proposals of others.
I just wanted there to be some clue for users who read the docs.
If you prefer to leave such users baffled, so be it.
My effort is exhausted.

Cheers,
Alan Isaac
 
rurpy

I should have elaborated on what I meant by saying that there is not a
real problem. Another way to put it is that the docs are sufficient
when they say that set ordering is arbitrary. That should be a cue to
not have *any* expectations about the internal ordering of sets and
dicts.

I disagree. When reading the docs, the reader will
always have and need assumptions because the docs can't
describe all behavior from first principles. Every
programmer will bring and apply his or her understanding
of how computers and computer programs operate under
the hood.

For example, nowhere in the "file" object documentation
does it say that files are read starting from byte 0.
It relies on the fact that the reader will have that
expectation based on previous experience with computers.

There are two basic principles that I think most
programmers apply sans explicit contradictory
information:
* That documentation about behavior applies within
the bounds of a single execution.
* That computers are fundamentally deterministic
(with a possible exception for code running in
Microsoft OSes. :)

When I read that sets return items in arbitrary
order (and the docs aren't even that specific),
I make a natural assumption that, no information
provided to the contrary, within a single program
execution the order will be arbitrary. Since it
says nothing about behavior between executions, the very
strong general rule applies: that if no obvious
source of volatility or dependence on environment
exists, the same program should produce the same
results.

Any further documentation of behavior would be a mistake because it
would of necessity expose implementation specific details.

You don't need to make promises to explain surprising
behavior. (The word "may" is amazingly useful in these
cases :) A "for example" that exposes implementation
details makes no promises yet can make non-intuitive
behavior clear. A concise but clear noting of the surprising
behavior seen by the OP would improve the clarity of the
documentation, not harm it.

For
instance, there is another intentionally undocumented observable
behavior that sets and dicts change their internal order as new
members are added. It is also intentional that Python makes almost no
promises about the location of objects in memory. IIRC, the only
guarantees made about object identity are that "a is a" is always true
and None can be tested with "is".

One last comment. While I treat opinions from Python
experts on Python technical details with great respect
and appreciation, opinions on documentation should be
viewed with much greater skepticism. It can be difficult
for an expert to view Python with the same eyes as a
non-guru level programmer, yet the latter is (or
should be) the target audience of the documentation.
[And please, let's not start the reference vs tutorial
thing!]
 
