An ordered dictionary for the Python library?

Antoon Pardon · Sep 13, 2007

Actually I meant by key order, so insertion order doesn't matter at
all. If you need a dictionary-like data structure that respects
insertion order you could use a list of (key, value) tuples.

If you want a dictionary that iterates over his keys and does so
in a sorted manner you probably want a tree.

One posible implemenation can be found here:

http://www.pardon-sleeuwaegen.be/antoon/avltree.html

I hope it is usefull to you.

mensanator · Sep 13, 2007

Yes. It should use a functional data structure.

Click to expand...

Could you elaborate?

Here's my impression of the postings so far: there are 3 categories of
response:

(A) There is no need for such a thing.

(B) It is useful, but so easy to implement that it needn't be in the
library.

(C) It is worth having an ordered data structure of some kind.

Regarding (A) and (B): I remember Python before it had the sets
module. "There's no need for sets, you can do them with dicts".
Thankfully sets are in the language and I for one use them
extensively.

For (C) what I had in mind was:

An API that is identical to a dict, with the few additional methods/
behaviours listed below:
- If an item is inserted it is put in the right place (because the
underlying data structure, b*tree, skiplist or whatever is
intrinsically ordered by < on the key)
- Extra methods
key_at(index : int) -> key
item_at(index : int) -> (key, value)
value_at(index : int) -> value
set_value_at(index : int, value) # no return value necessary but
maybe key if convenient
od[x:y:z] -> list of values ordered by key # can read a slice but
not assign a slice
and maybe
keys_for(fromindex : int, uptobutexcludingindex : int) -> ordered
list of keys

I've given examples of the use cases I envisage (and that I use in
both Python with my own ordered dict and in C++ with the map class) in
a previous posting, and I've shown the API I envisage above. I know
that some people might prefer ordering by value or the option of case-
insensitive ordering, but both these can be achieved using the API
above, e.g.,
od[value.lower()] = value # case-insensitive ordering of list of
values
And as for duplicates there are two approaches I use: (1) make each
string unique as I showed in my previous post, or (2) use a value that
itself is a list, dict or set.
(As for those who have suggested an ordered data structure that isn't
a dict, I don't see a conflict, I'd be happy to see more data
structures in the collections module.)

Is what I've suggested sufficient (and sufficiently general) for the
standard library?

Is the index the insertion order? The only use I have an ordered
dictionary is to quickly determine that a sequence value is the
first duplicate found (the structure can have millions of numbers)
and then play back (so they don't have to be re-calculated) the
sequence from the first duplicate to the end.

So I would get the index when I find the duplicate and then
iterate data[index:] to get the sequence in insertion order?

Paul Rubin · Sep 13, 2007

Mark Summerfield said:
Could you elaborate?

I mean use a data structure like an AVL tree, that can make a new dict
object quickly, whose contents are the same as the old object except
for one element (i.e. most of the contents are shared with the old
dict). You should also have ordered versions of both lists and sets.

Consider the situation with lists:

a = some 1000 element list
b = a + [23]

Now a has 1000 elements and b has 1001 elements (23 followed by a's
elements), but constructing b required copying all of a's elements.

It's sort of the same with sets:

a = 1000 element set
b = a + set((23,))

b had to copy the contents of a. By using a tree structure, you can
construct b in O(log(1000)) operations so that most of the elements
are shared with a, while at the same time you don't mutate a. That
makes it a lot easier to program in a style where you have a bunch of
slightly different versions of some set or dict flying around. For
an example of where this came up, see this thread:

http://groups.google.com/group/comp...read/thread/78b5953488d772e9/82f701c302122777

For sample implemetations and API's, you could look at the Hedgehog
Lisp version of AVL trees (including a Python dict-like interface):

http://hedgehog.oliotalo.fi/

and Haskell's Data.Map library:

http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Map.html

I think there is something similar in the ML Basis library but I'm not
familiar with that.

For more about functional data structures, see some of Chris Okasaki's
articles:

http://www.eecs.usma.edu/webs/people/okasaki/pubs.html

Dr. Okasaki also has written a book about the topic, called "Purely
Functional Data Structures", which is very good.

That said, in Python maybe this stuff would be easier to use
improperly because Python coding style uses mutation so much. It
might be best to limit sharing to the case of frozen dicts and frozen
sets.

James Stroud · Sep 13, 2007

Mark said:
- If an item is inserted it is put in the right place (because the
underlying data structure, b*tree, skiplist or whatever is
intrinsically ordered by < on the key)

+1 for all your suggestions below, but -1 for the above. You suggest that

myOrderedDict['key'] = value

would place it in the dictionary in sorted order depending on 'key'.
More natural to some of us (or maybe its just me) would be to append the
key/value pair to the end of items.

My Argument: The problem with inserting at sorted position is the
assumption that the dict is already sorted, but maybe I want the key
ordering to reflect something arbitrary, like a database, etc (e.g.
'Last', 'First', 'MI'). So now I assign to an OrderedDict that already
has these latter 3 keys:

myOrderedDict['Favorite Color'] = 'chartruse'

Where does it go? Answer: it probably depends on how the underlying
sorter works and (even worse) the internal state of the sorter's data
structure. So, my intuition is that SortedDict behavior should be kept
distinct from OrderedDict behavior.

- Extra methods
key_at(index : int) -> key
item_at(index : int) -> (key, value)
value_at(index : int) -> value
set_value_at(index : int, value) # no return value necessary but
maybe key if convenient
od[x:y:z] -> list of values ordered by key # can read a slice but
not assign a slice

I do not see why assigning a slice is a bad idea here. The behavior
could be such:

od = OrderedDict((('Last':'Smith'), ('First':'John'), ('MI':'J'))

od[:1] = ['Brown']
od[:2] = ['Glotz', 'Joe']

od[:2] = ['Williams', 'Mary', 'Francis'] # <== ValueError

The latter would be a phenomenon similar to unpacking a tuple of the
wrong length.

and maybe
keys_for(fromindex : int, uptobutexcludingindex : int) -> ordered
list of keys

Of course the naming is redundant for these in my opinion. E.g.:

key_at -> key
item_at -> item
value_at -> value
set_value_at -> set_value (not simply set)
keys_for -> keys (takes optional argument(s))

James

James Stroud · Sep 13, 2007

James said:
set_value_at -> set_value (not simply set)

Actually, 'set' seems better than set_value.

=?iso-8859-1?q?J=FCrgen_Urner?= · Sep 14, 2007

Puh, what a discussion... most common use case I can think of is

It's a bit tireing having to type

for key in sorted(d.keys()):
do_somethig_with(d[key])

Click to expand...

instead of a trustfully

As far as I can see much cleaner. No black magic needed ++ sort the
dict
to another order and rely on the sort order being stable would be a
really
a nice thing to have.

My 2 cents, Jürgen

Carl Banks · Sep 14, 2007

Mark said:
Mark said:

- If an item is inserted it is put in the right place (because the
underlying data structure, b*tree, skiplist or whatever is
intrinsically ordered by < on the key)

Click to expand...

+1 for all your suggestions below, but -1 for the above. You suggest that

myOrderedDict['key'] = value

would place it in the dictionary in sorted order depending on 'key'.
More natural to some of us (or maybe its just me) would be to append the
key/value pair to the end of items.

Or, maybe, like, you know, you could have two different types that
maintain two different orders?

Carl Banks

James Stroud · Sep 14, 2007

Carl said:
Mark said:

- If an item is inserted it is put in the right place (because the
underlying data structure, b*tree, skiplist or whatever is
intrinsically ordered by < on the key)

Click to expand...

+1 for all your suggestions below, but -1 for the above. You suggest that

myOrderedDict['key'] = value

would place it in the dictionary in sorted order depending on 'key'.
More natural to some of us (or maybe its just me) would be to append the
key/value pair to the end of items.

Click to expand...

Or, maybe, like, you know, you could have two different types that
maintain two different orders?

Do you mean, like, a SortedDict and an OrderedDict like I mentioned in
the post you are replying to?

Mark Summerfield · Sep 14, 2007

Puh, what a discussion... most common use case I can think of is

It's a bit tireing having to type

for key in sorted(d.keys()):
do_somethig_with(d[key])

Click to expand...

Click to expand...

instead of a trustfully

As far as I can see much cleaner. No black magic needed ++ sort the
dict
to another order and rely on the sort order being stable would be a
really
a nice thing to have.

My 2 cents, Jürgen

What I envisage is:

d = ordereddict(a=1, x=20, b=35, m=4)
# some time later
d["e"] = 15
# later still
d["b"] = 70
d.keys() # returns ['a', 'b', 'e', 'm', 'x']
d.values() # returns [1, 70, 15, 4, 20]

So no matter _when_ you add because the underlying data structure is
ordered (by key) you can always get access in key order. Also, you
can't "sort" the ordereddict; it is in key order and that's it. The
whole point is that you can use these things and never have to sort;
if you need different orders you create extra ordereddicts with
suitably munged keys and with values that are object references to the
objects you are interested in.

In reply to James Stroud:

- The reason you can't assign a slice is that the index positions
depend on the keys, so if you do:
od[newkey] = newvalue
where newkey goes (i.e., its index position) depends on how it orders
in relation to the rest, so it could go "anywhere".
I now feel that offering a slice is wrong because the [] syntax is
used for access by key, whereas a slice would be for access by index,
so using [] for both would be v. confusing.

- James proposed better method names which I've adopted (see later),
but I don't think set_value() should be set() because that would make
it seem to be the partner of get(), whereas get() uses a key and
set_value() uses an index for lookup.

- James also suggested a name and behaviour change:

keys(fromindex : int = None, uptobutexcludingindex : int = None) -

ordered list of keys

So keys() called on its own behaves just like dict.keys() (except that
all the keys are returned in order), but with one argument returns the
keys with index positions from the given index to the end, and with
two arguments returns the keys in the given range of indexes. (Note:
in an ordereddict values() returns all the values in key order.)

So from the earlier example:

d.key(2) # returns 'e'
d.item(3) # returns ('m', 4)
d.value(4) # returns 20
d.set_value(3, 99) # maybe returns 'm'; but anyway d.value(3) and
d['m'] now both return 99

- James (and some others) also felt that insertions should just go as
key/value pairs at the end. But that is not what I'm proposing (that's
a completely different data structure).

The whole point of ordereddict is that whatever you insert and
whenever you insert it, the ordereddict is always in key order.

Paul Rubin and at least one other person mentioned the use of an AVL
tree. At the moment I just want to see if I can get enough consensus
on an API and behaviour so that I could put in a PEP for just one
ordered data structure.

So to clarify, here's the entire API I'm proposing for ordereddict. In
all cases the ordereddict is always in (and kept in) key order, and
any method that returns a list or iterator always returns that list or
iterator (whether of keys or values) in key order:

ordereddict(dictionary=None)
The argument can be a dict or another ordereddict; all the dict()
initializer
approaches should also be supported.

ordereddict.update(dictonary=None, **kwargs)
Same as dict.update()---except that key order is preserved (a
point I won't
repeat in the others when I say "same as dict", but which is true
throughout)

@classmethod
ordereddict.fromkeys(cls, iterable, value=None) # Same as dict

ordereddict.key(index : int) -> key
Returns the index-th item's key

ordereddict.item(index : int) -> (key, value)
Returns the index-th item

ordereddict.value(index : int) -> value
Returns the index-th item's value

ordereddict.set_value(index : int, value)
Sets the index-th item's value to value; raises IndexError if
index is out of
range. If not expensive, maybe return the key.

ordereddict.copy() # Same as dict.
ordereddict.clear() # Same as dict.
ordereddict.get(key, value=None) # Same as dict
ordereddict.setdefault(key, value) # Same as dict
ordereddict.pop(key, value) # Same as dict
ordereddict.popitem() # Same as dict

ordereddict.keys(fromindex : int = None, uptoindex : int : None) ->
list of keys
Returns an ordered list of keys, or a slice of keys if one or two
indexes is given

ordereddict.values() # Same as dict
ordereddict.items() # Same as dict
ordereddict.iterkeys() # Same as dict
ordereddict.itervalues() # Same as dict
ordereddict.iteritems() # Same as dict
ordereddict.has_key() # Same as dict

Also the same as dict (and as always, working in key order):

for key in d: pass
if key in d: pass
len(d)
del d[key]
d[key]
d[key] = value

What this means is that users could drop in an ordereddict in place of
a plain dict and get the same behaviour (maybe not the same
performance!), but would now find that they could rely on the ordering
of keys, as well as having just four additional methods and one
existing method with additional optional arguments.

Mark Summerfield · Sep 14, 2007

I forgot one item in the proposed API:

ordereddict.delete(index : int)

Also, the API for keys() should be

ordereddict.keys(firstindex : int = None, secondindex : int = None)

If called with no args, returns list of all keys (in key order of
course); if one arg is given, returns keys with indexes in range(0,
firstindex); if two args are given, returns keys with indexes in
range(firstindex, secondindex).

Below is a stripped-down implementation in pure python that is just 84
lines long.
(I have a proper version with blank lines and doctests which is 482
lines but I thought that was a bit long to post.)

It should work as a drop-in replacement for dict (apart from
performance), but with the keys ordered by <, so that every list or
iterator that it returns is always in key order.

The get(), has_key(), __contains__(), len(), and __getitem__() methods
are not reimplemented because the base class versions work fine.

I'm only posting it to give a better feel for the API---if someone did
a better (faster) implementation (e.g., in C), that'd be great, but
best to get consensus on an API first I think (if consensus is
possible at all!).

import bisect
class ordereddict(dict):
def __init__(self, *args, **kwargs):
dict.__init__(self, *args, **kwargs)
self.__keys = sorted(dict.keys(self))
def update(self, *args, **kwargs):
dict.update(self, *args, **kwargs)
self.__keys = sorted(dict.keys(self))
@classmethod
def fromkeys(cls, iterable, value=None):
dictionary = cls()
for key in iterable:
dictionary[key] = value
return dictionary
def key(self, index):
return self.__keys[index]
def item(self, index):
key = self.__keys[index]
return key, self[key]
def value(self, index):
return self[self.__keys[index]]
def set_value(self, index, value):
self[self.__keys[index]] = value
def delete(self, index):
key = self.__keys[index]
del self.__keys[index]
dict.__delitem__(self, key)
def copy(self):
dictionary = ordereddict(dict.copy(self))
dictionary.__keys = self.__keys[:]
return dictionary
def clear(self):
self.__keys = []
dict.clear(self)
def setdefault(self, key, value):
if key not in self:
bisect.insort_left(self.__keys, key)
return dict.setdefault(self, key, value)
def pop(self, key, value=None):
if key not in self:
return value
i = bisect.bisect_left(self.__keys, key)
del self.__keys
return dict.pop(self, key, value)
def popitem(self):
item = dict.popitem(self)
i = bisect.bisect_left(self.__keys, item[0])
del self.__keys
return item
def keys(self, firstindex=None, secondindex=None):
if firstindex is not None and secondindex is None:
secondindex = firstindex
firstindex = 0
else:
if firstindex is None:
firstindex = 0
if secondindex is None:
secondindex = len(self)
return self.__keys[firstindex:secondindex]
def values(self):
return [self[key] for key in self.__keys]
def items(self):
return [(key, self[key]) for key in self.__keys]
def __iter__(self):
return iter(self.__keys)
def iterkeys(self):
return iter(self.__keys)
def itervalues(self):
for key in self.__keys:
yield self[key]
def iteritems(self):
for key in self.__keys:
yield key, self[key]
def __delitem__(self, key):
i = bisect.bisect_left(self.__keys, key)
del self.__keys
dict.__delitem__(self, key)
def __setitem__(self, key, value):
if key not in self:
bisect.insort_left(self.__keys, key)
dict.__setitem__(self, key, value)
def __repr__(self):
return "ordereddict({%s})" % ", ".join(
["%r: %r" % (key, self[key]) for key in self.__keys])

Antoon Pardon · Sep 14, 2007

Puh, what a discussion... most common use case I can think of is

d = {'a': 1, 'b': 2, 'c': 3}
for key in d:
# do something that relies on order of keys as specified in the constructor

Click to expand...

It's a bit tireing having to type

for key in sorted(d.keys()):
do_somethig_with(d[key])

Click to expand...

instead of a trustfully

for value in d.values():
do_somethig_with(value)

Click to expand...

As far as I can see much cleaner. No black magic needed ++ sort the
dict
to another order and rely on the sort order being stable would be a
really
a nice thing to have.

My 2 cents, Jürgen

Click to expand...

What I envisage is:

d = ordereddict(a=1, x=20, b=35, m=4)
# some time later
d["e"] = 15
# later still
d["b"] = 70
d.keys() # returns ['a', 'b', 'e', 'm', 'x']
d.values() # returns [1, 70, 15, 4, 20]

which seems to be exactly what my avltree module mentioned earlier
provides.

from avltree import Tree
d=Tree(a=1, x=20, b=35, m=4)
d["e"] = 15
d["b"] = 70
d.keys() ['a', 'b', 'e', 'm', 'x']
d.values()

Click to expand...

Click to expand...

[1, 70, 15, 4, 20]

Mark Summerfield · Sep 14, 2007

Puh, what a discussion... most common use case I can think of is
d = {'a': 1, 'b': 2, 'c': 3}
for key in d:
# do something that relies on order of keys as specified in the constructor
It's a bit tireing having to type
for key in sorted(d.keys()):
do_somethig_with(d[key])
instead of a trustfully
for value in d.values():
do_somethig_with(value)
As far as I can see much cleaner. No black magic needed ++ sort the
dict
to another order and rely on the sort order being stable would be a
really
a nice thing to have.
My 2 cents, Jürgen

Click to expand...

Click to expand...

What I envisage is:

Click to expand...

d = ordereddict(a=1, x=20, b=35, m=4)
# some time later
d["e"] = 15
# later still
d["b"] = 70
d.keys() # returns ['a', 'b', 'e', 'm', 'x']
d.values() # returns [1, 70, 15, 4, 20]

Click to expand...

which seems to be exactly what my avltree module mentioned earlier
provides.

from avltree import Tree
d=Tree(a=1, x=20, b=35, m=4)
d["e"] = 15
d["b"] = 70
d.keys()

Click to expand...

Click to expand...

['a', 'b', 'e', 'm', 'x']>>> d.values()

[1, 70, 15, 4, 20]

Antoon,

Your AVL tree looks impressive (although it has five times the lines
of my ordereddict), but it does not appear to support dict.update()'s
API (which was extended in 2.4), so you can't do: avl.update({'g':
20}, a=9, b=22).

Also, it does not provide the key(), value(), and item() methods that
the API I proposed can support (because in an ordereddict, index
positions make sense).

If there was consensus on an API and you, me, and others had different
implementations, we could always come up with some tests to compare
their relative performance, and put the fastest one(s) forward in a
PEP. (I don't care if my own code is used or not, it is the
functionality in Python's standard library that I want, whoever's code
it is.)

Antoon Pardon · Sep 14, 2007

# some time later
d["e"] = 15
# later still
d["b"] = 70
d.keys() # returns ['a', 'b', 'e', 'm', 'x']
d.values() # returns [1, 70, 15, 4, 20]

Click to expand...

which seems to be exactly what my avltree module mentioned earlier
provides.

from avltree import Tree
d=Tree(a=1, x=20, b=35, m=4)
d["e"] = 15
d["b"] = 70
d.keys()

Click to expand...

['a', 'b', 'e', 'm', 'x']>>> d.values()

[1, 70, 15, 4, 20]

Click to expand...

Antoon,

Your AVL tree looks impressive (although it has five times the lines
of my ordereddict), but it does not appear to support dict.update()'s
API (which was extended in 2.4), so you can't do: avl.update({'g':
20}, a=9, b=22).

It is true that I didn't follow up the API difference made to dict
since I wrote the module, but I think these are rather minor.
I don't think it would take a lot of work to correct these.

Also, it does not provide the key(), value(), and item() methods that
the API I proposed can support (because in an ordereddict, index
positions make sense).

At the time I wrote my module I never had a need for these. Do you have
a use case or is it a consideration of completeness that makes you want
these? Maybe I can take a look in how to implement this, but at this
moment it doesn't sound that usefull to me.

On the other hand your API doesn't seem to allow for iterating over only
a part of the keys. Supposing all keys are strings, I can easily iterate
over all keys that start with an 'n' or with any arbitrary prefix.
That IMO seems more usefull.

If there was consensus on an API and you, me, and others had different
implementations, we could always come up with some tests to compare
their relative performance, and put the fastest one(s) forward in a
PEP. (I don't care if my own code is used or not, it is the
functionality in Python's standard library that I want, whoever's code
it is.)

Good luck on finding that consensus. If you really want this I fear you
will have to start writing that PEP before a consensus is reached and
hope to find a proposal that will be acceptable to the majority and
especially the BDFL.

Mark Summerfield · Sep 14, 2007

# some time later
d["e"] = 15
# later still
d["b"] = 70
d.keys() # returns ['a', 'b', 'e', 'm', 'x']
d.values() # returns [1, 70, 15, 4, 20]
which seems to be exactly what my avltree module mentioned earlier
provides.
from avltree import Tree
d=Tree(a=1, x=20, b=35, m=4)
d["e"] = 15
d["b"] = 70
d.keys()
['a', 'b', 'e', 'm', 'x']>>> d.values()
[1, 70, 15, 4, 20]

Click to expand...

Antoon,

Click to expand...

Your AVL tree looks impressive (although it has five times the lines
of my ordereddict), but it does not appear to support dict.update()'s
API (which was extended in 2.4), so you can't do: avl.update({'g':
20}, a=9, b=22).

Click to expand...

It is true that I didn't follow up the API difference made to dict
since I wrote the module, but I think these are rather minor.
I don't think it would take a lot of work to correct these.

I'm sure you're right.

At the time I wrote my module I never had a need for these. Do you have
a use case or is it a consideration of completeness that makes you want
these? Maybe I can take a look in how to implement this, but at this
moment it doesn't sound that usefull to me.

I put them in for completeness, although in some contexts I have found
the ability to ask for the n-th item to be v. useful.

On the other hand your API doesn't seem to allow for iterating over only
a part of the keys. Supposing all keys are strings, I can easily iterate
over all keys that start with an 'n' or with any arbitrary prefix.
That IMO seems more usefull.

That is an appealing feature---but I don't want to make any assumption
about keys (they could be ints, id()s, strs, or anything that is
acceptable to a dict.

There's nothing to stop you creating a PEP for your AVL tree---I'd
certainly be glad for one to be in the collections module. I'm not
advocating "only one" ordered data structure, but rather one
particular one---and I certainly hope the collections module will have
several eventually, and that other people will propose PEPs for other
data structures, such as AVL trees, B*Trees, skiplists, etc., since
all have something to offer.

Good luck on finding that consensus. If you really want this I fear you
will have to start writing that PEP before a consensus is reached and
hope to find a proposal that will be acceptable to the majority and
especially the BDFL.

I don't expect my API to satisfy everyone, but by making it as close
to what exists already, i.e., a dict, yet with keys that "happen" to
be ordered (and offering a few extra methods to help users exploit
that if they want), I am hoping this will make it more likely to be
acceptable.

Antoon Pardon · Sep 14, 2007

I put them in for completeness, although in some contexts I have found
the ability to ask for the n-th item to be v. useful.

If you don't have a use case, I advise you to drop them. As far as I
understand people's sentiments, including a feature without a use case
illustrating its usefullness will decrease your chances.

That is an appealing feature---but I don't want to make any assumption
about keys (they could be ints, id()s, strs, or anything that is
acceptable to a dict.

The feature doesn't depend on any assumption. if your keys are integers
you can iterate over all keys between 121 and 8264. Iterating over all
keys that start with an 'n' just depends on the fact that all such
strings lie between the strings 'n' and 'o'.

However not all keys acceptable to a dict, will be acceptable to
a SortedDict. Some types are hashable but not completly comparable.
Those objects will not be usable as a key in a SortedDict although
they can be used as a key in an normal dict.

There's nothing to stop you creating a PEP for your AVL tree---I'd
certainly be glad for one to be in the collections module. I'm not
advocating "only one" ordered data structure, but rather one
particular one---and I certainly hope the collections module will have
several eventually, and that other people will propose PEPs for other
data structures, such as AVL trees, B*Trees, skiplists, etc., since
all have something to offer.

I'm not interrested in writing a PEP. My impression from asking around
is that is too much work for too little chance to get it accepted.
That is more a personal evaluation of how I value my time and how much I
would prefer it to have my module included than about the PEP process
in itself.

If someone would like to use my avl module as a starting point for a PEP,
I may consider allowing that, but writing the PEP myself is going to
take too much time from other projects.

I don't expect my API to satisfy everyone, but by making it as close
to what exists already, i.e., a dict, yet with keys that "happen" to
be ordered (and offering a few extra methods to help users exploit
that if they want), I am hoping this will make it more likely to be
acceptable.

I wish you all the luck you can get. Maybe if you succeed I'll change
my mind about writing a PEP myself.

However I think your chances will increase if you write your module
and have it available in the cheese shop. If people start using your
module regularly, your chances of it being included in the standard
library will increase greatly.

Mark Summerfield · Sep 14, 2007

On 14 Sep said:
I wish you all the luck you can get. Maybe if you succeed I'll change
my mind about writing a PEP myself.

However I think your chances will increase if you write your module
and have it available in the cheese shop. If people start using your
module regularly, your chances of it being included in the standard
library will increase greatly.

I've taken your advice and put it on PyPI. PyPI isn't as easy to use
as CPAN, and the classifiers don't include "algorithms" or "data
structures" which I find surprising.

James Stroud · Sep 14, 2007

Mark said:
So to clarify, here's the entire API I'm proposing for ordereddict. In
all cases the ordereddict is always in (and kept in) key order, and
any method that returns a list or iterator always returns that list or
iterator (whether of keys or values) in key order:

ordereddict(dictionary=None)
The argument can be a dict or another ordereddict; all the dict()
initializer
approaches should also be supported.

ordereddict.update(dictonary=None, **kwargs)
Same as dict.update()---except that key order is preserved (a
point I won't
repeat in the others when I say "same as dict", but which is true
throughout)

@classmethod
ordereddict.fromkeys(cls, iterable, value=None) # Same as dict

ordereddict.key(index : int) -> key
Returns the index-th item's key

ordereddict.item(index : int) -> (key, value)
Returns the index-th item

ordereddict.value(index : int) -> value
Returns the index-th item's value

ordereddict.set_value(index : int, value)
Sets the index-th item's value to value; raises IndexError if
index is out of
range. If not expensive, maybe return the key.

ordereddict.copy() # Same as dict.
ordereddict.clear() # Same as dict.
ordereddict.get(key, value=None) # Same as dict
ordereddict.setdefault(key, value) # Same as dict
ordereddict.pop(key, value) # Same as dict
ordereddict.popitem() # Same as dict

ordereddict.keys(fromindex : int = None, uptoindex : int : None) ->
list of keys
Returns an ordered list of keys, or a slice of keys if one or two
indexes is given

ordereddict.values() # Same as dict
ordereddict.items() # Same as dict
ordereddict.iterkeys() # Same as dict
ordereddict.itervalues() # Same as dict
ordereddict.iteritems() # Same as dict
ordereddict.has_key() # Same as dict

Also the same as dict (and as always, working in key order):

for key in d: pass
if key in d: pass
len(d)
del d[key]
d[key]
d[key] = value

May I also make one more suggestion, to call it a "sort_ordered_dict"
(or "sortordereddict", or even better a "sorteddict"--where the "ed"
comes from "ordered")? Its hard for me to move past the established
definition of "order", as we think of tuples being ordered--as in the
first sentence of http://en.wikipedia.org/wiki/Tuple--to something that
is preserving an order according to a comparison. The distinction is so
firmly ingrained in my head that it took me a while to wake up to the
fact that you were describing something completely different than an
ordered dictionary (e.g. http://www.voidspace.org.uk/python/odict.html)
even though you were being very unambiguous with your description.

And I also think the ability to drop it in for a built-in dict is very
valuable.

James

Mark Summerfield · Sep 14, 2007

Mark Summerfield wrote: [snip]
May I also make one more suggestion, to call it a "sort_ordered_dict"
(or "sortordereddict", or even better a "sorteddict"--where the "ed"
comes from "ordered")? Its hard for me to move past the established
definition of "order", as we think of tuples being ordered--as in the
first sentence ofhttp://en.wikipedia.org/wiki/Tuple--tosomething that
is preserving an order according to a comparison. The distinction is so
firmly ingrained in my head that it took me a while to wake up to the
fact that you were describing something completely different than an
ordered dictionary (e.g.http://www.voidspace.org.uk/python/odict.html)
even though you were being very unambiguous with your description.

And I also think the ability to drop it in for a built-in dict is very
valuable.

James

It seems that the correct Python terminology for this is indeed
sorteddict (ordered by key), with ordereddict meaning (in insertion
order).

I guess I'll have to rename my module (although unfortunately, my book
has just gone into production and I have a (simpler) example of what I
considered to be an ordered dict, so I will be adding to the
terminology confusion). That notwithstanding, given that it is a
sorteddict, how is the API?

James Stroud · Sep 14, 2007

Mark said:
I guess I'll have to rename my module (although unfortunately, my book
has just gone into production and I have a (simpler) example of what I
considered to be an ordered dict, so I will be adding to the
terminology confusion). That notwithstanding, given that it is a
sorteddict, how is the API?

I must think the API good because I have been implementing, in parallel
with this discussion, my own "OrderedDict" with a very similar API (this
is part of a larger project where I recently found the need to have a
well-implemented ordered dict). The only real omission I see is to allow
instantiating a "sorted dict" with an optional cmp function--to allow
the generalization of such features as case-independence, etc.

James

Carl Banks · Sep 15, 2007

Carl said:
Carl said:

Mark Summerfield wrote:
- If an item is inserted it is put in the right place (because the
underlying data structure, b*tree, skiplist or whatever is
intrinsically ordered by < on the key)
+1 for all your suggestions below, but -1 for the above. You suggest
that

myOrderedDict['key'] = value

would place it in the dictionary in sorted order depending on 'key'.
More natural to some of us (or maybe its just me) would be to append
the key/value pair to the end of items.

Click to expand...

Or, maybe, like, you know, you could have two different types that
maintain two different orders?

Click to expand...

Do you mean, like, a SortedDict and an OrderedDict like I mentioned in
the post you are replying to?

Like, you know, maybe I should read the whole article I reply to.

*sigh*

Carl Banks

Ordered list question	1	Jul 17, 2011
PEP 372 -- Adding an ordered directory to collections	28	Jun 16, 2008
Efficiently building ordered dict	14	Feb 22, 2010
Ordered dict by default	14	Feb 5, 2009
Sorting dictionary by datetime value	0	Feb 8, 2014
odict the Ordered Diction 0.2.2	1	Nov 29, 2006
Yet another ordered dictionary implementation	0	Nov 26, 2005
Ordered dicts	6	Sep 26, 2006

An ordered dictionary for the Python library?

Antoon Pardon

mensanator

Paul Rubin

James Stroud

James Stroud

=?iso-8859-1?q?J=FCrgen_Urner?=

Carl Banks

James Stroud

Mark Summerfield

Mark Summerfield

Antoon Pardon

Mark Summerfield

Antoon Pardon

Mark Summerfield

Antoon Pardon

Mark Summerfield

James Stroud

Mark Summerfield

James Stroud

Carl Banks

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads