Builtin classes list, set, dict reimplemented via B-trees

Discussion in 'Python' started by barnesc@engr.orst.edu, Sep 14, 2005.

  1. Guest

    Hi again,

    Since my linear algebra library appears not to serve any practical
    need (I found cgkit, and that works better for me), I've gotten bored
    and went back to one of my other projects: reimplementing the Python
    builtin classes list(), set(), dict(), and frozenset() with balanced
    trees (specifically, counted B-trees stored in memory).

    In short, this allows list lookup, insertion, deletion in O(log(N))
    time. It allows the set and dictionary types to be maintained in
    order, and insert/lookup/remove operations here take O(log(N)) time
    as well. Getting the k least or k greatest elements takes
    O(log(N)+k) time.

    I'm about 50% done with the implementation of this module.

    I thought I'd use this for Dijkstra's algorithm and some schedulers,
    but then I noticed that a binary heap can be retrofitted to have the
    DECREASE-KEY operation (see e.g. David Eppstein's priorityQueue).
    Thus my B-tree based classes aren't really necessary, since there are
    simpler heap implementations that do the same thing.

    So my question is: are there any other *practical* applications of a
    B-tree based list/set/dict ? In other words, is this module totally
    worth coding, or is it just academic wankery and theoretical flim
    flam ? :)

    Thanks for your comments/opinions/etc...

    - Connelly Barnes
    'Y29ubmVsbHliYXJuZXNAeWFob28uY29t'.decode('base64')


    ---------------------------------------------------------------------
    Detailed overview of the bcollections module (from the docstring)
    ---------------------------------------------------------------------

    """
    ....

    The bset() and bdict() classes improve upon the builtin set and
    dictionary types by maintaining the elements in sorted order.

    Generally, the b* classes are a factor of O(log(N)) slower[2] than the
    corresponding builtin classes, except for certain operations:

    - For a bset(), it takes O(log(N)) time to get the minimum, maximum,
    or ith element. Also, it takes O(log(N)+k) time to get the
    k first, k last, or any slice of k elements from the sorted set.
    - For a bdict(), dictionary keys can be retrieved in sorted order
    with the above time bounds.
    - For a blist(), items can be inserted and removed in O(log(N)) time.
    It takes O(log(N)+k) time to get, set, or delete a slice of k
    elements.

    ....

    blist - List implemented via B-tree.
    bset - Set implemented via B-tree.
    Additional methods:
    - s.min() - Minimum element
    - s.max() - Maximum element
    - s.index(x) - Index of x in sorted list of
    elements.
    - s.next(x) - Least element which is > x
    (x need not be in the set).
    - s.previous(x) - Greatest element which is < x
    (x need not be in the set).
    - s - Get element i from sorted list.
    - s[i:j] - Get slice from sorted list.
    bfrozenset - Immutable set implemented via B-tree.
    bdict - Dict implemented via B-tree.
    Additional methods:
    - a.min() - Minimum key
    - a.max() - Maximum key
    - a.index(k) - Index of k in sorted list of
    keys.
    - s.next(x) - Least key which is > x
    (x need not be an existing key).
    - s.previous(x) - Greatest key which is < x
    (x need not be an existing key).
    - a.get_key_at(i) - Get key at index i from sorted
    list of keys.
    - a.get_key_at(i, j) - Get slice from sorted key list.
    """
     
    , Sep 14, 2005
    #1
    1. Advertising

  2. Bryan Olson Guest

    wrote:
    > Hi again,
    >
    > Since my linear algebra library appears not to serve any practical
    > need (I found cgkit, and that works better for me), I've gotten bored
    > and went back to one of my other projects: reimplementing the Python
    > builtin classes list(), set(), dict(), and frozenset() with balanced
    > trees (specifically, counted B-trees stored in memory).

    [...]
    > So my question is: are there any other *practical* applications of a
    > B-tree based list/set/dict ? In other words, is this module totally
    > worth coding, or is it just academic wankery and theoretical flim
    > flam ? :)


    B-trees are specifically designed for disk storage. Seeking to a
    page takes much longer than reading a page of several kilobytes.
    B-trees read the keys for a many-way comparison in one seek.

    For in-memory comparison-based dictionaries, Red-Black trees,
    AVL trees, 2-3 trees, or skip-lists, are a better choice.


    --
    --Bryan
     
    Bryan Olson, Sep 14, 2005
    #2
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    2
    Views:
    391
    Bryan Olson
    Sep 14, 2005
  2. Replies:
    0
    Views:
    323
  3. Preben Randhol
    Replies:
    7
    Views:
    244
    Marc 'BlackJack' Rintsch
    Mar 3, 2008
  4. jacob navia

    Binary search trees (AVL trees)

    jacob navia, Jan 3, 2010, in forum: C Programming
    Replies:
    34
    Views:
    1,451
    Dann Corbit
    Jan 8, 2010
  5. bdb112
    Replies:
    2
    Views:
    311
    Chris Torek
    Jul 2, 2011
Loading...

Share This Page