lambda

A

Antoon Pardon

Op 2005-01-18 said:
Well, I'd definitely consider an inaccessible key as constituting a
problem, but I don't think that's a good analogy to the list case.

With the dictionary, the change can (though I do agree it does not
have to) interfere with proper operation of the dictionary, while a
list that is no longer sorted still functions perfectly well as a
list.

I think we are talking past each other. I'm talking about the concept
the programmer has in his mind. If the prgrammer has a sorted list in
his mind and an element in that list gets mutated that *does* *always*
cause problems. I'm not talking about a list in which the elements
happen to be in a particular order, but about a list that will be given
to algorithms that depend on the list being ordered.
That is, I feel "problems" are more guaranteed with a
dictionary since we have affected base object behavior, whereas sorted
is not an inherent attribute of the base list type but something the
application is imposing at a higher level.

But if the program uses one of the buildin sorts, isn't it then obvious
that this is an inherent attribute we want to impose? So is sorting
a list with mutable object not also more guaranteed to cause problems.
So shouldn't we limit the sort algorithms to containers with immutable
elements?
For example, I may choose to have an object type that is mutable (and
not worthy for use as a dictionary key) but maintains a logical
ordering so is sortable. I see no problem with sorting a list of such
objects, and then walking that list to perform some mutation to each
of the objects, even if along the way the mutation I am doing results
in the items so touched no longer being in sorted order. The act of
sorting was to provide me with a particular sequence of objects,

Personnaly I don't see what the sorting was good for in this case.
And I think such programming might be slightly dangerous if you
cooperate with other people. They may not understand the particular
nature of your mutation and rely on the fact that your list has been
sorted to conclude that it still is sorted.
but
aside from that fact, the list continues to perform perfectly well as
a list even after the mutations - just no longer delivering objects in
sorted order.

Which may be a problem for those who rely on it.
 
A

Antoon Pardon

Op 2005-01-18 said:
This is word play IMO. What you describe is effectively using the mutables
to construct immutable keys for actual use, not using immutable keys.

No it is not wordplay. There is a difference between inaccessible and
immutable. The difference is that the user can mutate his own objects
which he wants to use as a key, which wouldn't be the case if he was
using immutables as keys.
Having
the immutable (because of access restriction) constructed keys allows
you to check on their consistency with their source any time you want,
but that doesn't change the fact that you have created an alternative dictionary
implementation with immutable keys made immutable by copying and access restriction.
If you update your dictionary when a key no longer matches its source data, you
are just deleting an immutable-by-special-means key and entering the associated
value in association with a new immutable-by-special-means key.

The immutable vs mutable as keys IMO refers to the class the key belongs
to. Not the accessibiliy of individual instances.
If you see a sign like

+--------------------------------->
| WILD GOOSE CHASE THIS WAY ->
+----------------------------->
||
|| \|/
|| =o=
|| /|\
.`.||,..__|_,.

Do you feel that you must make sure?

Talk to my wife. She regularly complains that I don't accept
her word often enough on what she says. :)
The sign painter may only be indicating what his own experience was, after all.
But if it was a core developer's experience, it may be prudent to bet on another trail
-- unless you have an extremely interesting hunch. Then you should follow your passion
and have your own experience, and you may wind up being able to contribute something
(even if only an exclamation point on the sign ;-)

That is so in your mind, and it is so given certain interpretations of the words
you are using,

As far as I understand I using them in a standard python interpretation.
but to others "mutable keys" may be a contradiction in terms, because
they do not use the words in the same sense as you.

I don't think so. I have seen defenders of only immutables as keys, use
the argument that using mutables as keys would require a copy of the key.
Therefore using only immutables allows for optimisations in the implementation.
I saw nothing in there discourse that implied they saw this copy as
immutable.
If you can't see alternative conceptual
constructs you are stuck, and if we can't agree on the meaning of words for a given dialog,
we won't be communicating very well. The difficulty is getting people to recognize their
implicit assumptions and definitions ;-)

I can be wrong, but until now I have seen no indication that I was
using mutable and immutable differently than other people. AFAICT
we all refer to whether an object belongs to a mutable or immutable
class.
People take short cuts in expressing themselves, just as you do.
E.g., you say "mutable key" when you really mean a value that will
remain constant while you are using it as a dictionary key.

What I mean is a key that belongs to a mutable class. Whether I
actually mutate it in certain circumstances or not doesn't change
that.

I also don't want my values to change when I have sorted a list
and still need to apply a number of algorithms that rely on
that. Nobody seems to have problems with the possibility that
the list items are mutable (and called such).
 
N

Nick Coghlan

Antoon said:
A rule of thumb is context sensitive. If circumstances change,
so do the rules of thumb. Principles have a broader field
of application.

IMO there is nothing principally wrong with using a mutable object
as a dictionary key. But avoiding doing so is a good rule of
thumb if you have a python-like implementation of a dictionary.

For a 'mutable key' to make sense, the following:

lst = []
dct = {l: "Hi!"}
print dct[[]]
print dct[lst]
lst.append(1)
print dct[[1]]
print dct[lst]

Should print:
Hi
Hi
Hi
Hi

That's completely impractical though - for any sane implementation, at least one
of the above print statements will throw a KeyError.

Your example of a 'safe_dict' that copies mutable keys means that the final
print statement is the one that fails.

My suggestion of an "identity_dict" means that both of the same-value based
lookups would fail.

Notice that both of these approaches take a mutable key and make it immutable
(either by holding the sole reference to a copy, or retrieving an immutable
property of the mutable object).

There's also your solution of "I promise not to mutate the key while it is in
the dictionary". Workable, but opens the door to some entertaining bug hunts
when the promise is broken.

Which solution is the best default behaviour?

Well, the Zen of Python states "In the face of ambiguity, refuse the temptation
to guess".

So that's the policy the builtin dict follows - it doesn't try to guess when to
make a copy, or whether or not to use identity based semantics in the face of
mutability. Instead, it raises an exception at key entry time, asking the
programmer to clarify their intent.

The principle is *certainly* that hash tables should contain immutable keys.
Your own 'safe_dict' example implicitly concedes this point by making copies
'when required'. 'When required' for what? When required to preserve
immutability, perhaps?

In short, sane hash tables require immutable keys, and how mutable keys acquire
the requisite immutability is going to be application dependent.

Provision of a safe_dict or identity_dict would merely allow a programmer to
state their intent at the time the container is created, rather than having to
state it whenever keys are generated or referenced.

Cheers,
Nick.
 
A

Antoon Pardon

Op 2005-01-19 said:
For a 'mutable key' to make sense, the following:

As you said yourself, people use shortcuts when they express themselves.
With 'mutable key' I mean a mutable object that is used as a key.
That doesn't contradict that the key itself can't change because
it is inaccessible.
lst = []
dct = {l: "Hi!"}
print dct[[]]
print dct[lst]
lst.append(1)
print dct[[1]]
print dct[lst]

Should print:
Hi
Hi
Hi
Hi
That's completely impractical though - for any sane implementation, at least one
of the above print statements will throw a KeyError.

Your example of a 'safe_dict' that copies mutable keys means that the final
print statement is the one that fails.

My suggestion of an "identity_dict" means that both of the same-value based
lookups would fail.

Notice that both of these approaches take a mutable key and make it immutable
(either by holding the sole reference to a copy, or retrieving an immutable
property of the mutable object).

There's also your solution of "I promise not to mutate the key while it is in
the dictionary". Workable, but opens the door to some entertaining bug hunts
when the promise is broken.

That is usually the case when promisses are broken. To mention something
different, I guess a number of equally entertaining bug hunts can be
caused by the fact that the value in the dictionary is changed.
Which solution is the best default behaviour?

IMO a dictionary establisches a relationship between values. The most
general implementation that protects this relationship from external
tampering is copying the key and value for use in the dictionary.
Well, the Zen of Python states "In the face of ambiguity, refuse the temptation
to guess".

But python does guess in the case of the value. If I want to establish a
relationship between "foo" and [3, 5, 8] then I can be in big trouble
if that list gets mutated in something else.
So that's the policy the builtin dict follows - it doesn't try to guess when to
make a copy, or whether or not to use identity based semantics in the face of
mutability. Instead, it raises an exception at key entry time, asking the
programmer to clarify their intent.

The principle is *certainly* that hash tables should contain immutable keys.
Your own 'safe_dict' example implicitly concedes this point by making copies
'when required'. 'When required' for what? When required to preserve
immutability, perhaps?

But they don't have to be immutable within the dictionary itself, that
is just an implementation detail. The only thing that matters is that
the dictionary can reproduce the relationship. How that is done is
immaterial.
In short, sane hash tables require immutable keys, and how mutable keys acquire
the requisite immutability is going to be application dependent.

No they don't, that is just the most easy implementation. If some
implementation would constantly change the internal objects but
consulting the dictionary wouldn't show any different effect then
that would be fine too. And even in this case a mutation
from the outside would probably cause havoc because it would ruin
the sanity of the internal structure.
Provision of a safe_dict or identity_dict would merely allow a programmer to
state their intent at the time the container is created, rather than having to
state it whenever keys are generated or referenced.

Well that would be a good thing IMO. Good enough in any case for me to
spend some time analyzing the requirements. Whether I'll actually
implement something depends on how much time I have and how often I
need it.
 
J

Just

Nick Coghlan said:
Antoon said:
A rule of thumb is context sensitive. If circumstances change,
so do the rules of thumb. Principles have a broader field
of application.

IMO there is nothing principally wrong with using a mutable object
as a dictionary key. But avoiding doing so is a good rule of
thumb if you have a python-like implementation of a dictionary.

For a 'mutable key' to make sense, the following:

lst = []
dct = {l: "Hi!"}
print dct[[]]
print dct[lst]
lst.append(1)
print dct[[1]]
print dct[lst]

Should print:
Hi
Hi
Hi
Hi

That's completely impractical though - for any sane implementation, at least
one
of the above print statements will throw a KeyError.

Your example of a 'safe_dict' that copies mutable keys means that the final
print statement is the one that fails.

My suggestion of an "identity_dict" means that both of the same-value based
lookups would fail.

Notice that both of these approaches take a mutable key and make it immutable
(either by holding the sole reference to a copy, or retrieving an immutable
property of the mutable object).

There's also your solution of "I promise not to mutate the key while it is in
the dictionary". Workable, but opens the door to some entertaining bug hunts
when the promise is broken.

Which solution is the best default behaviour?

Well, the Zen of Python states "In the face of ambiguity, refuse the
temptation
to guess".

So that's the policy the builtin dict follows - it doesn't try to guess when
to
make a copy, or whether or not to use identity based semantics in the face of
mutability. Instead, it raises an exception at key entry time, asking the
programmer to clarify their intent.

The principle is *certainly* that hash tables should contain immutable keys.
Your own 'safe_dict' example implicitly concedes this point by making copies
'when required'. 'When required' for what? When required to preserve
immutability, perhaps?

In short, sane hash tables require immutable keys, and how mutable keys
acquire
the requisite immutability is going to be application dependent.

Provision of a safe_dict or identity_dict would merely allow a programmer to
state their intent at the time the container is created, rather than having
to
state it whenever keys are generated or referenced.

FWIW, the Cocoa framework on OSX supports mutable objects as keys.
However, it makes an immutable copy. This is cheap for immutable objects
-- Cocoa has both a mutable array as well as an immutable array,
likewise for dicts -- since the "copy" will then simply be the same
object. Thanks to PyObjC we can explore this from Python:

Python 2.3 (#1, Sep 13 2003, 00:49:11)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from Foundation import *
>>> lst = []
>>> dct = NSMutableDictionary.alloc().init()
>>> dct {}
>>> dct[lst] = "Hi!"
>>> dct
{(1) = "Hi!"; }

(Note that Python lists get wrapped as native Cocoa arrays. '(1)' is
Cocoa's repr for the wrapped array.)
>>> dct[lst] u'Hi!'
>>> lst.append(1)
>>> dct[lst]
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File
"build/lib.darwin-7.6.0-Power_Macintosh-2.3/objc/_convenience.py", line
77, in __getitem__objectForKey
KeyError: [1, 1]
>>> dct.keys() ((1))
>>> dct[[1]] u'Hi!'
>>> k = dct.keys()[0]
>>> k (1)
>>> k.append(2)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File
"build/lib.darwin-7.6.0-Power_Macintosh-2.3/objc/_convenience.py", line
205, in <lambda>
objc.error: NSInternalInconsistencyException - *** -[NSCFArray
addObject:]: mutating method sent to immutable object
This is not all that different from Python actually, in that Cocoa tries
very hard to make it impossible to mutate dict keys.

Just
 
N

Nick Coghlan

Antoon Pardon wrote:
> I can be wrong, but until now I have seen no indication that I was
using mutable and immutable differently than other people. AFAICT
we all refer to whether an object belongs to a mutable or immutable
class.

The difference is that when you take a copy of the key and stick it in the
dictionary, such that the dictionary now holds the *sole* reference to that key,
you have made that key *effectively* immutable.

This is why no-one really batted an eyelid when you mentioned that mutable keys
can be used safely by making a copy - doing so makes the key *effectively*
immutable, and means that modifying the original key object (i.e. the
application's copy, not the dict's copy), and reusing it to look up in the
dictionary is likely to give a KeyError.

These semantics would be understandably surprising to many users, and hence, are
not supplied by default.

Additionally, a dictionary's keys are accessible via its API. Accordingly, to
preserve this 'effective immutability', making a copy on key input is
insufficient - keys must be copied on *output* as well (that is, dict.keys,
dict.items etc must return *copies* of the key objects, not the key objects
themselves).

Since there is no reliable way in Python to tell if an object is mutable or not
(the closest equivalent is the presence of __hash__, which clearly can't be used
in this example), this copying would need to be done for *every* object.

Alternately, the dictionary can say to the API clients, "make an immutable copy
and supply that instead. It is left to API clients to decide how best to make
the immutable version". The API client copies the key once (to make the
immutable version), and everyone lives happily ever after.

For example:

Py> class mylist(list):
.... def __init__(self, arg):
.... super(mylist, self).__init__(arg)
.... self._tuple = None
.... def frozen(self):
.... if self._tuple is None:
.... self._tuple = tuple(self)
.... return self._tuple
.... def unfreeze(self):
.... self._tuple = None
....
Py> x = mylist(range(10))
Py> x
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Py> x.append(10)
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Py> x.unfreeze()
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

This is much safer than having a list subclass that implements __hash__, and
still lets you avoid redundant copy operations.

Nicely, so long as you don't unfreeze the object you used to key the dictionary,
reusing the same object will always get you the correct dictionary entry, and
two lists that compare equal at the time they are frozen will also get you the
same dictionary entry. The equivalent tuples can be used for the lookup, too.
I also don't want my values to change when I have sorted a list
and still need to apply a number of algorithms that rely on
that. Nobody seems to have problems with the possibility that
the list items are mutable (and called such).

OK, to make this comparison of sorted lists and dictionaries fair:

Write a sorted_list class that is like a regular Python list, but maintains as
an invariant that the list contents will stay sorted.

See how well you go maintaining that invariant while allowing mutable objects in
the list. The only way would be to copy them in when they're supplied, and copy
them out again when you're done. Otherwise, there is no way the class can keep
its promise. The performance will be lousy, since __setitem__ and __getitem__
will be making copies all the time.

Alternatively, the class could declare itself to work reliably only with
immutable objects. Performance will improve, since copies need only be made when
an object *actually* changes (and the old immutable copy is deleted and the new
version inserted in its place).

Cheers,
Nick.
 
A

Antoon Pardon

Op 2005-01-19 said:
Antoon Pardon wrote:
> I can be wrong, but until now I have seen no indication that I was

The difference is that when you take a copy of the key and stick it in the
dictionary, such that the dictionary now holds the *sole* reference to that key,
you have made that key *effectively* immutable.

This is why no-one really batted an eyelid when you mentioned that mutable keys
can be used safely by making a copy - doing so makes the key *effectively*
immutable, and means that modifying the original key object (i.e. the
application's copy, not the dict's copy), and reusing it to look up in the
dictionary is likely to give a KeyError.

These semantics would be understandably surprising to many users, and hence, are
not supplied by default.

This seems like circle reasoning. The reason why these semantics would
be surprising is that those are not the semantics supplied.
Additionally, a dictionary's keys are accessible via its API. Accordingly, to
preserve this 'effective immutability', making a copy on key input is
insufficient - keys must be copied on *output* as well (that is, dict.keys,
dict.items etc must return *copies* of the key objects, not the key objects
themselves).

Yes I haven been thinking about this too. I also think not making a copy
on output is less dangerous than not making a copy on input.
Since there is no reliable way in Python to tell if an object is mutable or not
(the closest equivalent is the presence of __hash__, which clearly can't be used
in this example), this copying would need to be done for *every* object.

Alternately, the dictionary can say to the API clients, "make an immutable copy
and supply that instead. It is left to API clients to decide how best to make
the immutable version". The API client copies the key once (to make the
immutable version), and everyone lives happily ever after.

For example:

Py> class mylist(list):
... def __init__(self, arg):
... super(mylist, self).__init__(arg)
... self._tuple = None
... def frozen(self):
... if self._tuple is None:
... self._tuple = tuple(self)
... return self._tuple
... def unfreeze(self):
... self._tuple = None
...
Py> x = mylist(range(10))
Py> x
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Py> x.append(10)
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Py> x.unfreeze()
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

This is much safer than having a list subclass that implements __hash__, and
still lets you avoid redundant copy operations.

Nicely, so long as you don't unfreeze the object you used to key the dictionary,
reusing the same object will always get you the correct dictionary entry, and
two lists that compare equal at the time they are frozen will also get you the
same dictionary entry. The equivalent tuples can be used for the lookup, too.

Interesting idea. But I think you are wrong when you say that two lists
that compare equal at the time they are frozen, will get the same
dictionary entry. The problem is an object must compare equal to
the key in the dictionary to get at the same entry. So if you freeze
a list and its copy but then mutate them differently, they no longer
are equal and so wont get you at the same entry.

[ Rest snipped, this is getting into a semantic debate on what
'mutable' means. This may be interesting enough on its own
but I feel it detracts too much from what I consider the
more important issue. ]
 
S

Steven Bethard

Nick said:
For a 'mutable key' to make sense, the following:

lst = []
dct = {l: "Hi!"}
print dct[[]]
print dct[lst]
lst.append(1)
print dct[[1]]
print dct[lst]

Should print:
Hi
Hi
Hi
Hi

And here's an implementation that does so:

py> class sillydict(dict):
.... def __getitem__(self, key):
.... items = self.items()
.... self.clear()
.... self.update(items)
.... return super(sillydict, self).__getitem__(key)
....
py> class sillylist(list):
.... def __hash__(self):
.... return hash(tuple(self))
....
py> lst = sillylist([])
py> dct = sillydict({lst: "Hi!"})
py> print dct[sillylist([])]
Hi!
py> print dct[lst]
Hi!
py> lst.append(1)
py> print dct[sillylist([1])]
Hi!
py> print dct[lst]
Hi!

Of course, as noted by the classes themselves, they're silly. ;) But as
long as you don't mind rehashing the dict each time the dict is accessed
;) you can get sane behavior even in the presence of mutable keys.

Steve
 
D

David Eppstein

Nick Coghlan said:
For a 'mutable key' to make sense, the following:

lst = []
dct = {l: "Hi!"}
print dct[[]]
print dct[lst]
lst.append(1)
print dct[[1]]
print dct[lst]

Should print:
Hi
Hi
Hi
Hi

Yes, and what should the following do?

lst1 = [1]
lst2 = [2]
dct = {lst1: "1", lst2: "2"}
lst2[0]=1
lst1[0]=2
print dct[[1]]
print dct[[2]]
 
S

Steven Bethard

David said:
For a 'mutable key' to make sense, the following:

lst = []
dct = {l: "Hi!"}
print dct[[]]
print dct[lst]
lst.append(1)
print dct[[1]]
print dct[lst]

Should print:
Hi
Hi
Hi
Hi


Yes, and what should the following do?

lst1 = [1]
lst2 = [2]
dct = {lst1: "1", lst2: "2"}
lst2[0]=1
lst1[0]=2
print dct[[1]]
print dct[[2]]

Depends on when you want the updates done. Since my implementation only
rehashes when the dict is accessed, neither key gets overwritten in this
case:

py> class sillylist(list):
.... def __hash__(self):
.... return hash(tuple(self))
....
py> class sillydict(dict):
.... def _rehash(self):
.... items = self.items()
.... self.clear()
.... self.update(items)
.... def __getitem__(self, key):
.... self._rehash()
.... return super(sillydict, self).__getitem__(key)
.... def __setitem__(self, key, value):
.... self._rehash()
.... return super(sillydict, self).__setitem__(key, value)
....
py> lst1 = sillylist([1])
py> lst2 = sillylist([2])
py> dct = sillydict({lst1:"1", lst2:"2"})
py> lst2[0] = 1
py> lst1[0] = 2
py> print dct[sillylist([1])]
2
py> print dct[sillylist([2])]
1

The nastier question is, what should the following code do:

lst1 = [1]
lst2 = [2]
dct = {lst1: "1", lst2: "2"}
lst2[0]=1
print dct[[1]]
print dct[[2]]

My implementation does the following:

py> lst1 = sillylist([1])
py> lst2 = sillylist([2])
py> dct = sillydict({lst1:"1", lst2:"2"})
py> lst2[0] = 1
py> print dct[sillylist([1])]
1
py> print dct[sillylist([2])]
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "<interactive input>", line 8, in __getitem__
KeyError: [2]

which is even sillier when you compare to what happens with the original
code you posted. ;)

Steve
 
N

Nick Coghlan

David said:
Yes, and what should the following do?

lst1 = [1]
lst2 = [2]
dct = {lst1: "1", lst2: "2"}
lst2[0]=1
lst1[0]=2
print dct[[1]]
print dct[[2]]

Provide yet another example for why mutable keys are almost guaranteed to result
in suprising semantics :)

Cheers,
Nick.
 
N

Nick Coghlan

Antoon said:
As you said yourself, people use shortcuts when they express themselves.
With 'mutable key' I mean a mutable object that is used as a key.
That doesn't contradict that the key itself can't change because
it is inaccessible.

Yeah - this was the point of terminology that was getting confused, I think. And
your interpretation was closer to the strict sense of the words.
Well, the Zen of Python states "In the face of ambiguity, refuse the temptation
to guess".

But python does guess in the case of the value. If I want to establish a
relationship between "foo" and [3, 5, 8] then I can be in big trouble
if that list gets mutated in something else.

However, the actual promise a Python dict makes is that it maps from keys by
value to values by identity.

That is, "k1 == k2" implies "d[k1] is d[k2]". An identity test on mutable
objects is entirely unambiguous (hence they can be values), but comparison is
not (hence they cannot be keys). Restricting keys to immutable objects
eliminates the ambiguity. Copying keys would also prevent mutation and eliminate
the ambiguity. The restriction approach is a *definite* performance enhancement
over the copying approach, though (this is probably the origin of the mantra
"the restriction to mutable keys is a performance enhancement"). The resulting
semantics of the restriction to immutable objects are also significantly less
surprising than those of the copying approach.

An identity dict would make the promise that "k1 is k2" implies "d[k1] is
d[k2]". In this case, the mutability of the key is irrelevant, so the behaviour
is unambiguous without any additional restrictions.
But they don't have to be immutable within the dictionary itself, that
is just an implementation detail. The only thing that matters is that
the dictionary can reproduce the relationship. How that is done is
immaterial.

Given the Python interpreter's current definition of "immutable" as "defines
__hash__", it's a little hard to have an unhashable object as a key :)
No they don't, that is just the most easy implementation. If some
implementation would constantly change the internal objects but
consulting the dictionary wouldn't show any different effect then
that would be fine too. And even in this case a mutation
from the outside would probably cause havoc because it would ruin
the sanity of the internal structure.

If you insist: s/immutable/effectively immutable/

And no data type can be expected to tolerate having their invariants violated by
external code messing with internal data structures.
Well that would be a good thing IMO.

True - I'm not sure how that 'merely' snuck in there :)
Good enough in any case for me to
spend some time analyzing the requirements. Whether I'll actually
implement something depends on how much time I have and how often I
need it.

My preference is for the identity dict - its performance would actually be
*better* than that of a standard dict in the domains where it applies
(annotation and classification of objects are the two main uses I can see for
such a data structure). This aligns quite well with the stated goal of the
collections module (providing data structures that are less general purpose than
the standard builtins, but deliver better performance for their particular use
cases)

Regards,
Nick.
 
N

Nick Coghlan

Antoon said:
Interesting idea. But I think you are wrong when you say that two lists
that compare equal at the time they are frozen, will get the same
dictionary entry. The problem is an object must compare equal to
the key in the dictionary to get at the same entry. So if you freeze
a list and its copy but then mutate them differently, they no longer
are equal and so wont get you at the same entry.

The trick is that the result of the freezing operation is cached until such time
as you explicitly unfreeze the object.

Output:

x: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Write via x, read via y: Hi there!
x: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y: [0, 1, 2, 3, 4, 5, 6, 7, 8]
Read via mutated x: Hi there!
Read via mutated y: Hi there!
x: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y: [0, 1, 2, 3, 4, 5, 6, 7, 8]
Found unfrozen x: False
Found unfrozen y: False


Produced by:

class mylist(list):
def __init__(self, arg):
super(mylist, self).__init__(arg)
self._tuple = None
def frozen(self):
if self._tuple is None:
self._tuple = tuple(self)
return self._tuple
def unfreeze(self):
self._tuple = None

x = mylist(range(10))
y = mylist(range(10))
dct = {}
dct[x.frozen()] = "Hi there!"
print "x:", x
print "y:", y
print "Write via x, read via y:", dct[y.frozen()]
x.append(10)
del y[-1]
print "x:", x
print "y:", y
print "Read via mutated x:", dct[x.frozen()]
print "Read via mutated y:", dct[y.frozen()]
x.unfreeze()
y.unfreeze()
print "x:", x
print "y:", y
print "Found unfrozen x:", x.frozen() in dct
print "Found unfrozen y:", y.frozen() in dct
 
A

Antoon Pardon

Op 2005-01-20 said:
The trick is that the result of the freezing operation is cached until such time
as you explicitly unfreeze the object.

I missed that you would use it with the idiom: dct[x.frozen()]

I have two problems with this approach.

1) It doesn't work when you get your keys via the keys/items
methods.

2) This is rather minor, but a user could still unfreeze
untimely
 
N

Nick Coghlan

Antoon said:
I missed that you would use it with the idiom: dct[x.frozen()]

The list itself isn't hashable with this approach, so you don't have much
choice. I wasn't particularly clear about that point, though.
I have two problems with this approach.

1) It doesn't work when you get your keys via the keys/items
methods.

True - the frozen object has no link back to the original object. That could be
added though (by returning a tuple subtype with the extra attribute)
2) This is rather minor, but a user could still unfreeze
untimely

True - doing that is less likely than mutating a hashable list though :)

I'm just noting this as a way to avoid copying data more than once when storing
immutable copies of mutable data in a dictionary. You're quite right that there
isn't a really clean idiom for doing that in Python (aside from moving to a
different data structure that works natively as a dict key, naturally).

Cheers,
Nick.
 
A

Antoon Pardon

Op 2005-01-20 said:
Antoon said:
I missed that you would use it with the idiom: dct[x.frozen()]

The list itself isn't hashable with this approach, so you don't have much
choice. I wasn't particularly clear about that point, though.
I have two problems with this approach.

1) It doesn't work when you get your keys via the keys/items
methods.

True - the frozen object has no link back to the original object. That could be
added though (by returning a tuple subtype with the extra attribute)
2) This is rather minor, but a user could still unfreeze
untimely

True - doing that is less likely than mutating a hashable list though :)

I'm just noting this as a way to avoid copying data more than once when storing
immutable copies of mutable data in a dictionary. You're quite right that there
isn't a really clean idiom for doing that in Python (aside from moving to a
different data structure that works natively as a dict key, naturally).

The problem here is IMO is the, we are all consenting adults (which we
are not but I wont start on this now), approach.

I have been thinking a bit in your freeze direction, but more thorough.
The idea i had was that freeze would replace all mutating methods
with methods that would throw an exception. a thaw method would inverse
the proces. It would then be the responsibilty of the dictionary to
freeze an object on entry and thaw it when removed. However this
wouldn't work with ordinary python objects, since there is no easy way
to have readonly or private variables.
 
B

Bengt Richter

Op 2005-01-20 said:
Antoon said:
I missed that you would use it with the idiom: dct[x.frozen()]

The list itself isn't hashable with this approach, so you don't have much
choice. I wasn't particularly clear about that point, though.
I have two problems with this approach.

1) It doesn't work when you get your keys via the keys/items
methods.

True - the frozen object has no link back to the original object. That could be
added though (by returning a tuple subtype with the extra attribute)
2) This is rather minor, but a user could still unfreeze
untimely

True - doing that is less likely than mutating a hashable list though :)

I'm just noting this as a way to avoid copying data more than once when storing
immutable copies of mutable data in a dictionary. You're quite right that there
isn't a really clean idiom for doing that in Python (aside from moving to a
different data structure that works natively as a dict key, naturally).

The problem here is IMO is the, we are all consenting adults (which we
are not but I wont start on this now), approach.

I have been thinking a bit in your freeze direction, but more thorough.
The idea i had was that freeze would replace all mutating methods
with methods that would throw an exception. a thaw method would inverse
the proces. It would then be the responsibilty of the dictionary to
freeze an object on entry and thaw it when removed. However this
wouldn't work with ordinary python objects, since there is no easy way
to have readonly or private variables.
Would you like a dictionary that acts as you want and takes care of all
problems internally, and accepts keys and values of any type without wrapping
or other modification -- or do you want a wrapper that can make any object
suitable for use as key or value in python's curent definition of dict?

Just DYFR please. You still haven't you know ;-)

Regards,
Bengt Richter
 
A

Antoon Pardon

Op 2005-01-21 said:
Would you like a dictionary that acts as you want and takes care of all
problems internally, and accepts keys and values of any type without wrapping
or other modification -- or do you want a wrapper that can make any object
suitable for use as key or value in python's curent definition of dict?

Just DYFR please. You still haven't you know ;-)

My feeling is that I'm hungry and would like to eat and you ask whether
my requirements are a sandwhich or a bowl of soup. :)

The real requirement IMO is that any attempt to mutate an object wont
result in a mutated key. Both proposals seem to solve that. The second
is at the moment for me the most interesting to discuss, since it is
a new approach to me, and most likely to learn me something new.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top