# Efficient binary search tree stored in a flat array?

Discussion in 'Python' started by Douglas Alan, Jul 13, 2009.

1. ### Douglas AlanGuest

I couldn't find a good algorithms forum on the Internet, so I guess
maintain a binary search tree in a flat array (i.e., without using
pointers), as is typically done for a binary heap?

It *is* possible, of course, to keep an ordered list in a flat array,
and then do a binary search on the ordered array, but then insertion
and deletion are O(n), rather than O(log n). It would also clearly be
possible to store a balanced binary tree in a flat array, storing the
children of the node at index i at indices 2*i and 2*i + 1, just as
one does for a binary heap. But if you do that, I don't know if you
could then do insertions and deletions in O(log n) time.

One idea that came to mind, is that maybe it is possible using a
"treap", which is a combination of a binary heap and a binary search
tree. Insertions and deletions in binary heaps can be done in O(log n)
in a flat array, but I don't know if this is also true for a treap,
since you also have the binary search tree invariants to maintain, in
addition to the binary heap invariants. For all I know, this might
cause rotations to no longer be O(log n).

|>ouglas

Douglas Alan, Jul 13, 2009

2. ### Paul RubinGuest

Douglas Alan <> writes:
> It would also clearly be
> possible to store a balanced binary tree in a flat array, storing the
> children of the node at index i at indices 2*i and 2*i + 1, just as
> one does for a binary heap. But if you do that, I don't know if you
> could then do insertions and deletions in O(log n) time.

Probably not. Maybe you could organize a 2-3 tree like that, at the
expense of some space.

Paul Rubin, Jul 13, 2009

3. ### AahzGuest

In article <>,
Douglas Alan <> wrote:
>
>I couldn't find a good algorithms forum on the Internet, so I guess
>maintain a binary search tree in a flat array (i.e., without using
>pointers), as is typically done for a binary heap?
>
>It *is* possible, of course, to keep an ordered list in a flat array,
>and then do a binary search on the ordered array, but then insertion
>and deletion are O(n), rather than O(log n).

Still, unless your list is large (more than thousands of elements),
that's the way you should go. See the bisect module. Thing is, the
speed difference between C and Python means the constant for insertion
and deletion is very very small relative to bytecode speed. Keep in
mind that Python's object/binding model means that you're shuffling
pointers in the list rather than items.
--
Aahz () <*> http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur." --Red Adair

Aahz, Jul 13, 2009
4. ### Douglas AlanGuest

On Jul 13, 3:57 pm, (Aahz) wrote:

> Still, unless your list is large (more than thousands of elements),
> that's the way you should go.  See the bisect module.  Thing is, the
> speed difference between C and Python means the constant for insertion
> and deletion is very very small relative to bytecode speed.  Keep in
> mind that Python's object/binding model means that you're shuffling
> pointers in the list rather than items.

Thank you. My question wasn't intended to be Python specific, though.
I am just curious for purely academic reasons about whether there is
such an algorithm. All the sources I've skimmed only seem to the
answer the question via omission. Which is kind of strange, since it
seems to me like an obvious question to ask.

If I find the free time, I might try to work out myself whether it can
be done with a treap.

|>ouglas

Douglas Alan, Jul 13, 2009
5. ### Florian BruckerGuest

Douglas Alan wrote:
> Thank you. My question wasn't intended to be Python specific, though.
> I am just curious for purely academic reasons about whether there is
> such an algorithm. All the sources I've skimmed only seem to the
> answer the question via omission. Which is kind of strange, since it
> seems to me like an obvious question to ask.

IIRC comp.programming would be the place to ask such questions.

HTH,
Florian

Florian Brucker, Jul 14, 2009
6. ### Piet van OostrumGuest

>>>>> Douglas Alan <> (DA) wrote:

>DA> On Jul 13, 3:57 pm, (Aahz) wrote:
>>> Still, unless your list is large (more than thousands of elements),
>>> that's the way you should go.  See the bisect module.  Thing is, the
>>> speed difference between C and Python means the constant for insertion
>>> and deletion is very very small relative to bytecode speed.  Keep in
>>> mind that Python's object/binding model means that you're shuffling
>>> pointers in the list rather than items.

>DA> Thank you. My question wasn't intended to be Python specific, though.
>DA> I am just curious for purely academic reasons about whether there is
>DA> such an algorithm. All the sources I've skimmed only seem to the
>DA> answer the question via omission. Which is kind of strange, since it
>DA> seems to me like an obvious question to ask.

Of course you can take any BST algorithm and replace pointers by indices
in the array and allocate new elements in the array. But then you need
array elements to contain the indices for the children explicitely.
--
Piet van Oostrum <>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email:

Piet van Oostrum, Jul 14, 2009
7. ### Douglas AlanGuest

On Jul 14, 7:38 am, Florian Brucker <> wrote:

> Douglas Alan wrote:

> > Thank you. My question wasn't intended to be Python specific, though.
> > I am just curious for purely academic reasons about whether there is
> > such an algorithm. All the sources I've skimmed only seem to the
> > answer the question via omission. Which is kind of strange, since it
> > seems to me like an obvious question to ask.

> IIRC comp.programming would be the place to ask such questions.

Thanks, yes that does look like a good place to post such questions.
Unfortunately, it also looks to be overrun with stories on "hot girls
top and bottom sexy movie", though I'm sure I can ignore those. I'm
scared to look at the posting on "tricky bit twiddling", though.

|>ouglas

Douglas Alan, Jul 14, 2009
8. ### Douglas AlanGuest

On Jul 14, 8:10 am, Piet van Oostrum <> wrote:

> Of course you can take any BST algorithm and replace pointers by indices
> in the array and allocate new elements in the array. But then you need
> array elements to contain the indices for the children explicitely.

And why is this a problem? This is how binary heaps are typically
implemented, and it all works swimmingly. The node rotations for
keeping a binary heap balanced turn out to be suitable for
representation in a flat array. I.e., when you do the node rotations,
you only ever have to copy log n array elements.

In general, however, you can't move nodes around so easily when
represented in a flat array. A node movement in a tree represented
with pointers, might involves changing just two pointers, while when
represented as a flat array, might involve copying most of the array
to maintain the tree invariants. It just so turns out that maintaining
the invariants for a binary heap does not have this issue.

This is why I was curious about treaps, which are a type of binary
heap. The CLRS textbook on algorithms discusses treaps, but doesn't
ever mention whether they are as fortunate as less constrained binary
heaps. I'm sure I could work out for myself whether the treap
rotations are suitable for storage in a flat array, but I might make a
mistake somewhere in my reasoning, and then never know the true

|>ouglas

Douglas Alan, Jul 15, 2009
9. ### Douglas AlanGuest

On Jul 14, 9:19 am, Scott David Daniels <> wrote:

> It may well be that there is no good simple solution, and people avoid

I can't imagine why that should be the case. The CLRS textbook on
algorithms, for instance, goes to some pains to mathematically prove
that there is no comparison sort that can operate in faster than O(n
log n) time.

And any decent discussion of rabies would be sure to mention that
there is no known cure.

CLRS talks about binary heaps, binary search trees, and treaps, and it
shows how to maintain a binary heap in a flat array efficiently (n log
n time overall), but it never even seems to bring up the subject as to
whether a binary search tree or a treap can also be efficiently
maintained in a flat array.

Though it may be the case that these questions are left as exercises
for the student, and therefore buried in the exercises and problem
sets that I didn't read carefully.

> Piet van Oostrum wrote:

> > Of course you can take any BST algorithm and replace pointers by indices
> > in the array and allocate new elements in the array. But then you need
> > array elements to contain the indices for the children explicitely.

> And you loower your locality of reference (cache-friendliness).
> Note the insert in Python, for example, is quite cache-friendly.

I can't see that a binary search tree would typically have
particularly good cache-friendliness, so I can't see why a flat-array
representation, such as is done for a binary heap, would have
particularly worse cache-reference.

|>ouglas

Douglas Alan, Jul 15, 2009
10. ### Douglas AlanGuest

I wrote:

> On Jul 14, 8:10 am, Piet van Oostrum <> wrote:
>
> > Of course you can take any BST algorithm and replace pointers by indices
> > in the array and allocate new elements in the array. But then you need
> > array elements to contain the indices for the children explicitely.

> And why is this a problem?

Oh, I'm sorry -- I see what you are saying now. You're saying you can
just implement a normal binary search tree, but store the tree nodes
in an array, as if it were a chunk of memory, and use array indices as
pointers, rather than using memory addresses as pointers.

Fair enough, but I'm not sure what that would buy you. Other than,
perhaps some improved locality of reference, and the potential to
perhaps get the pointers take up less space if you know the array is
never going to grow to be very large.

|>ouglas

Douglas Alan, Jul 15, 2009
11. ### Paul RubinGuest

Douglas Alan <> writes:
> I can't see that a binary search tree would typically have
> particularly good cache-friendliness, so I can't see why a flat-array
> representation, such as is done for a binary heap, would have
> particularly worse cache-reference.

That is a good point. Maybe we should be paying more attention to
cache-oblivious algorithms.

http://en.wikipedia.org/wiki/Cache-oblivious_algorithm

H. Prokop's masters' thesis cited in the wiki article explains the
subject very well. A fair amount of work has been done on it since
then, but not as much as one might expect.

Paul Rubin, Jul 15, 2009
12. ### Dennis Lee BieberGuest

On Tue, 14 Jul 2009 16:23:58 -0700 (PDT), Douglas Alan
<> declaimed the following in
gmane.comp.python.general:

> And any decent discussion of rabies would be sure to mention that
> there is no known cure.
>

There have been one or two cases of recovery without undergoing the
immediate injection series...
--
Wulfraed Dennis Lee Bieber KD6MOG

HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: )
HTTP://www.bestiaria.com/

Dennis Lee Bieber, Jul 15, 2009
13. ### Piet van OostrumGuest

>>>>> Douglas Alan <> (DA) wrote:

>DA> I wrote:
>>> On Jul 14, 8:10 am, Piet van Oostrum <> wrote:
>>>
>>> > Of course you can take any BST algorithm and replace pointers by indices
>>> > in the array and allocate new elements in the array. But then you need
>>> > array elements to contain the indices for the children explicitely.

>>> And why is this a problem?

>DA> Oh, I'm sorry -- I see what you are saying now. You're saying you can
>DA> just implement a normal binary search tree, but store the tree nodes
>DA> in an array, as if it were a chunk of memory, and use array indices as
>DA> pointers, rather than using memory addresses as pointers.

>DA> Fair enough, but I'm not sure what that would buy you. Other than,
>DA> perhaps some improved locality of reference, and the potential to
>DA> perhaps get the pointers take up less space if you know the array is
>DA> never going to grow to be very large.

My second sentence that you quoted more or less means `it doesn't buy
you much'.
--
Piet van Oostrum <>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email:

Piet van Oostrum, Jul 15, 2009