No trees in the stdlib?

T

Tom Reed

Why no trees in the standard library, if not as a built in? I searched
the archive but couldn't find a relevant discussion. Seems like a
glaring omission considering the batteries included philosophy,
particularly balanced binary search trees. No interest, no good
implementations, something other reason? Seems like a good fit for the
collections module. Can anyone shed some light?

Thanks.
 
A

Aahz

Why no trees in the standard library, if not as a built in? I searched
the archive but couldn't find a relevant discussion. Seems like a
glaring omission considering the batteries included philosophy,
particularly balanced binary search trees. No interest, no good
implementations, something other reason? Seems like a good fit for the
collections module. Can anyone shed some light?

What do you want such a tree for? Why are dicts and the bisect module
inadequate? Note that there are plenty of different tree implementations
available from either PyPI or the Cookbook.
 
J

João Valverde

Aahz said:
What do you want such a tree for? Why are dicts and the bisect module
inadequate? Note that there are plenty of different tree implementations
available from either PyPI or the Cookbook.
A hash table is very different to a BST. They are both useful. The
bisect module I'm not familiar with, I'll have to look into that, thanks.

I have found pyavl on the web, it does the job ok, but there no
implementations for python3 that I know of.

Simple example usage case: Insert string into data structure in sorted
order if it doesn't exist, else retrieve it.
 
J

João Valverde

João Valverde said:
A hash table is very different to a BST. They are both useful. The
bisect module I'm not familiar with, I'll have to look into that, thanks.

I have found pyavl on the web, it does the job ok, but there no
implementations for python3 that I know of.

Simple example usage case: Insert string into data structure in sorted
order if it doesn't exist, else retrieve it.
Crap, sorry about the mixed identities, I'm not using my own machine.
The original post is mine also.
 
J

João Valverde

João Valverde said:
A hash table is very different to a BST. They are both useful. The
bisect module I'm not familiar with, I'll have to look into that, thanks.

I have found pyavl on the web, it does the job ok, but there no
implementations for python3 that I know of.
The main problem with pyavl by the way is that it doesn't seem to be
subclassable (?). Besides some interface glitches, like returning None
on delete if I recall correctly.

There's also rbtree, which I didn't try. And I think that's it. On the
whole not a lot of choice and not as practical for such a common data
structure.
 
C

Chris Rebert

A hash table is very different to a BST.  They are both useful. The bisect
module I'm not familiar with, I'll have to look into that, thanks.

I have found pyavl on the web, it does the job ok, but there no
implementations for python3 that I know of.

Simple example usage case: Insert string into data structure in sorted order
if it doesn't exist, else retrieve it.

That's pretty much the bisect module in a nutshell. It manipulates a
sorted list using binary search.

Cheers,
Chris
 
S

Stefan Behnel

João Valverde said:
Besides some interface glitches, like returning None
on delete if I recall correctly.

That's actually not /that/ uncommon. Operations that change an object are
not (side-effect free) functions, so it's just purity if they do not have a
return value.

Although practicality beats purity, sometimes... ;)

Stefan
 
M

Miles Kaufmann

That's pretty much the bisect module in a nutshell. It manipulates a
sorted list using binary search.

With O(n) insertions and removals, though. A decent self-balancing
binary tree will generally do those in O(log n).

-Miles
 
J

Jason Scheirer

What do you want such a tree for?  Why are dicts and the bisect module
inadequate?  Note that there are plenty of different tree implementations
available from either PyPI or the Cookbook.

....And heapq is more-or-less an emulation of a tree structure in its
underlying model. I once wrote a binary sorted tree data structure for
Python in C. It performed anywhere from 15-40% worse than dicts. I
think with optimization it will only perform 10% worse than dicts at
best.

Oh hey maybe that is why trees aren't an emphasized part of the
standard. They are going to be much slower than the ultra-optimized
dicts already in the standard lib.
 
J

João Valverde

João Valverde said:
A hash table is very different to a BST. They are both useful. The
bisect module I'm not familiar with, I'll have to look into that, thanks.

I have found pyavl on the web, it does the job ok, but there no
implementations for python3 that I know of.

Simple example usage case: Insert string into data structure in sorted
order if it doesn't exist, else retrieve it.
After browsing the bisect module I don't think it is the complete
answer. Please correct me if I'm mistaken but...

Ignoring for a moment that subjectively I feel this is not very pythonic
for my use case, if I get back the insertion position, doesn't that mean
I have to go over on average N/2 items on a linked list to insert the
item in position? Maybe less for sophisticated implementations but still
O(n)? That doesn't compare favorably to the cost of (possibly) having to
rebalance a tree on insertion.
 
J

João Valverde

Jason said:
...And heapq is more-or-less an emulation of a tree structure in its
underlying model. I once wrote a binary sorted tree data structure for
Python in C. It performed anywhere from 15-40% worse than dicts. I
think with optimization it will only perform 10% worse than dicts at
best.

Oh hey maybe that is why trees aren't an emphasized part of the
standard. They are going to be much slower than the ultra-optimized
dicts already in the standard lib.
But a dict can't be used to implement a (sorted) table ADT.
 
J

João Valverde

Stefan said:
That's actually not /that/ uncommon. Operations that change an object are
not (side-effect free) functions, so it's just purity if they do not have a
return value.

Although practicality beats purity, sometimes... ;)

Stefan
I didn't know that. But in this case I think purity gets pummeled every
time :) It's still not making sense to force a lookup to fetch something
before deleting (another lookup operation). If that were imposed by
python's internal machinery I'd suggest fixing that instead.
 
J

João Valverde

João Valverde said:
I didn't know that. But in this case I think purity gets pummeled
every time :) It's still not making sense to force a lookup to fetch
something before deleting (another lookup operation). If that were
imposed by python's internal machinery I'd suggest fixing that instead.
To be clear what I mean by that is that it's just reference passing so
it can't generate programmatic errors.
 
C

Chris Rebert

After browsing the bisect module I don't think it is the complete answer.
Please correct me if I'm mistaken but...

Ignoring for a moment that subjectively I feel this is not very pythonic for
my use case, if I get back the insertion position, doesn't that mean I have
to go over on average N/2 items on a linked list to insert the item in

Linked list?! Python lists are *array-based*.
<"You must be new here" joke redacted>

Cheers,
Chris
 
J

João Valverde

João Valverde said:
After browsing the bisect module I don't think it is the complete
answer. Please correct me if I'm mistaken but...

Ignoring for a moment that subjectively I feel this is not very
pythonic for my use case, if I get back the insertion position,
doesn't that mean I have to go over on average N/2 items on a linked
list to insert the item in position? Maybe less for sophisticated
implementations but still O(n)? That doesn't compare favorably to the
cost of (possibly) having to rebalance a tree on insertion.
I was indeed corrected on this shameful blunder, but in my defense array
based lists are implementation dependent and the case for trees can be
made on deletion.
 
S

Steven D'Aprano

I once wrote a binary sorted tree data structure for Python in C. It
performed anywhere from 15-40% worse than dicts. I think with
optimization it will only perform 10% worse than dicts at best.

Oh hey maybe that is why trees aren't an emphasized part of the
standard. They are going to be much slower than the ultra-optimized
dicts already in the standard lib.

But dicts require hashable keys, and are in arbitrary order. You can't
(for example) traverse a dict in post-order.

Hash tables (dicts) are useful for many of the same things that trees are
useful for, but they are different data structures with different
strengths and weaknesses, and different uses. To argue that we don't need
trees because we have dicts makes as much sense as arguing that we don't
need dicts because we have lists.
 
P

Paul Rubin

Stefan Behnel said:
That's actually not /that/ uncommon. Operations that change an object are
not (side-effect free) functions, so it's just purity if they do not have a
return value.

But deletes in an AVL tree should not cause mutation. They should
just allocate a new root and path up to where the deleted node was.
That allows having references to the old and new versions of the tree,
etc.
 
P

Paul Rubin

Chris Rebert said:
That's pretty much the bisect module in a nutshell. It manipulates a
sorted list using binary search.

That lets you find an existing entry in log(n) time, but inserting
or deleting an entry still takes linear time. Also, you don't
get a persistent structure in the sense of functional programming.
 
S

Stefan Behnel

Paul said:
But deletes in an AVL tree should not cause mutation. They should
just allocate a new root and path up to where the deleted node was.
That allows having references to the old and new versions of the tree,
etc.

I doubt that there are many AVL implementations that do that. Plus, if
deletion doesn't delete, I'd happily consider that a bug.

Stefan
 
H

Hallvard B Furuseth

Stefan said:
That's actually not /that/ uncommon. Operations that change an object are
not (side-effect free) functions, so it's just purity if they do not have a
return value.

It's purity that they don't return the modified tree/dict/whatever.
They can still return the deleted element and remain pure.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,770
Messages
2,569,583
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top