BTree examples ??

Discussion in 'Perl Misc' started by Kenjis Kaan, Jun 27, 2003.

  1. Kenjis Kaan

    Kenjis Kaan Guest

    Hello. I am wondering if anyone has an example of using BTree for
    storing data on a disk file.?? I found some BTree modules on the
    web, but none give example of how you would then store it and retrieve
    it from disk drive. I have a need to store hundred of thousands of
    key, data values but told not to use a RDBMS. It would be nice if I
    can get some sample of that actually shows the write to disk and
    subsequent retrieval from it, which then allows you to do search for
    records by supplying a key. TIA
    Kenjis Kaan, Jun 27, 2003
    #1
    1. Advertising

  2. Kenjis Kaan

    Russ Jones Guest

    (Kenjis Kaan) wrote in
    news::

    > Hello. I am wondering if anyone has an example of using BTree for
    > storing data on a disk file.?? I found some BTree modules on the
    > web, but none give example of how you would then store it and retrieve
    > it from disk drive. I have a need to store hundred of thousands of
    > key, data values but told not to use a RDBMS. It would be nice if I
    > can get some sample of that actually shows the write to disk and
    > subsequent retrieval from it, which then allows you to do search for
    > records by supplying a key. TIA
    >


    Using a hash to hold your data and then using the Storable module to save
    and retrieve it comes immediately to mind, but that "hundreds of
    thousands" of records sounds pretty big.

    The venerable "Algorithms + Data Structures = Programs," Niklaus Wirth,
    1975, contains (as I recall) some BTree code. It's in Pascal, but it might
    be a starting place.
    Russ Jones, Jun 27, 2003
    #2
    1. Advertising

  3. Kenjis Kaan

    Malte Ubl Guest

    Kenjis Kaan wrote:
    > Hello. I am wondering if anyone has an example of using BTree for
    > storing data on a disk file.?? I found some BTree modules on the
    > web, but none give example of how you would then store it and retrieve
    > it from disk drive. I have a need to store hundred of thousands of
    > key, data values but told not to use a RDBMS. It would be nice if I
    > can get some sample of that actually shows the write to disk and
    > subsequent retrieval from it, which then allows you to do search for
    > records by supplying a key. TIA



    Wouldn't DBD::SQLite be a solution? This module includes a complete
    public domain RDBMS.

    http://search.cpan.org/author/MSERGEANT/DBD-SQLite-0.25/lib/DBD/SQLite.pm

    You could use it and not tell anybody :)

    malte
    Malte Ubl, Jun 27, 2003
    #3
  4. "Kenjis Kaan" <> wrote in message
    news:...
    > Hello. I am wondering if anyone has an example of using BTree for
    > storing data on a disk file.?? I found some BTree modules on the
    > web, but none give example of how you would then store it and retrieve
    > it from disk drive. I have a need to store hundred of thousands of
    > key, data values but told not to use a RDBMS. It would be nice if I
    > can get some sample of that actually shows the write to disk and
    > subsequent retrieval from it, which then allows you to do search for
    > records by supplying a key. TIA


    The DB_File module, that comes with Perl, has a btree opion. You can easily
    store hundres of thousands of key/data pairs in it.

    The example below is derived from the docuemntation that comes with DB_File

    use warnings ;
    use strict ;
    use DB_File ;

    my %h ;
    tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot open file 'tree': $!\n" ;

    # Add a key/value pair to the file
    $h{'Wall'} = 'Larry' ;
    $h{'Smith'} = 'John' ;
    $h{'mouse'} = 'mickey' ;
    $h{'duck'} = 'donald' ;

    # Delete
    delete $h{"duck"} ;

    # Cycle through the keys printing them in order.
    # Note it is not necessary to sort the keys as
    # the btree will have kept them in order automatically.
    foreach (keys %h)
    { print "$_\n" }

    untie %h ;

    Paul
    Paul Marquess, Jun 27, 2003
    #4
  5. Kenjis Kaan

    Vlad Tepes Guest

    Paul Marquess <> wrote:

    > The DB_File module, that comes with Perl, has a btree opion. You can
    > easily store hundres of thousands of key/data pairs in it.
    >
    >
    > The example below is derived from the docuemntation that comes with
    > DB_File
    >
    > use warnings ;
    > use strict ;
    > use DB_File ;
    >
    > my %h ;
    > tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
    > or die "Cannot open file 'tree': $!\n" ;
    >
    > # Add a key/value pair to the file
    > $h{'Wall'} = 'Larry' ;
    > $h{'Smith'} = 'John' ;
    > $h{'mouse'} = 'mickey' ;
    > $h{'duck'} = 'donald' ;
    >
    > # Delete
    > delete $h{"duck"} ;
    >
    > # Cycle through the keys printing them in order.
    > # Note it is not necessary to sort the keys as
    > # the btree will have kept them in order automatically.
    > foreach (keys %h)
    > { print "$_\n" }
    >
    > untie %h ;
    >
    > Paul


    ( I'm not sure about this, but I think I'm right: )

    A small note: If you have a very big hash, you'd might like to save
    memory by doing

    while ( (my $k, undef) = each %h ) {
    print "$k\n";
    }

    instead of using the foreach loop in the above example.

    --
    Vlad
    Vlad Tepes, Jun 27, 2003
    #5
  6. Kenjis Kaan

    Kenjis Kaan Guest

    Jörg Westheide <> wrote in message news:<>...
    > Hi!
    >
    > > I am wondering if anyone has an example of using BTree for
    > > storing data on a disk file.?? I found some BTree modules on the
    > > web, but none give example of how you would then store it and retrieve
    > > it from disk drive.

    >
    > Have a look at the DB File module
    >
    > > I have a need to store hundred of thousands of
    > > key, data values

    >
    > That's no problem (I used it with some millions)
    > The only backdraw I discovered, is the really bad performance on
    > Win(2K), it's fine on Linux though
    >
    > J rg


    But Win2k is what I am doing all this on. Using Activestate Perl.
    Kenjis Kaan, Jun 29, 2003
    #6
  7. Kenjis Kaan

    Kenjis Kaan Guest

    There has go to be a way to use BTree.pm (its floating around the
    internet) without resorting to the fancy DB_File with all the bells
    and whistles. All I want is just ot create a simple file containing
    BTree objects so I can then search and replace records at will. I
    think DB_File is an overkill unless I can put everything in the
    development directory which becomes part of the install package. I
    notice the only thing BTree.pm is missing is the delete operation.
    Which is really weird, I remember studying this stuffs back in school,
    and the books shows you some code with insert etc, but the delete is
    left as an exercise. Its beyond me that some guy would write the
    BTree.pm but left out the delete operations. I know its a bit harder
    to implement, but if you are going to do it, might as well make it
    complete so people can use it.

    "Paul Marquess" <> wrote in message news:<3efc67a3$0$18488$>...
    > "Kenjis Kaan" <> wrote in message
    > news:...
    > > Hello. I am wondering if anyone has an example of using BTree for
    > > storing data on a disk file.?? I found some BTree modules on the
    > > web, but none give example of how you would then store it and retrieve
    > > it from disk drive. I have a need to store hundred of thousands of
    > > key, data values but told not to use a RDBMS. It would be nice if I
    > > can get some sample of that actually shows the write to disk and
    > > subsequent retrieval from it, which then allows you to do search for
    > > records by supplying a key. TIA

    >
    > The DB_File module, that comes with Perl, has a btree opion. You can easily
    > store hundres of thousands of key/data pairs in it.
    >
    > The example below is derived from the docuemntation that comes with DB_File
    >
    > use warnings ;
    > use strict ;
    > use DB_File ;
    >
    > my %h ;
    > tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
    > or die "Cannot open file 'tree': $!\n" ;
    >
    > # Add a key/value pair to the file
    > $h{'Wall'} = 'Larry' ;
    > $h{'Smith'} = 'John' ;
    > $h{'mouse'} = 'mickey' ;
    > $h{'duck'} = 'donald' ;
    >
    > # Delete
    > delete $h{"duck"} ;
    >
    > # Cycle through the keys printing them in order.
    > # Note it is not necessary to sort the keys as
    > # the btree will have kept them in order automatically.
    > foreach (keys %h)
    > { print "$_\n" }
    >
    > untie %h ;
    >
    > Paul
    Kenjis Kaan, Jun 29, 2003
    #7
  8. In article <>,
    Kenjis Kaan <> wrote:
    >There has go to be a way to use BTree.pm (its floating around the
    >internet) without resorting to the fancy DB_File with all the bells
    >and whistles.


    Are you talking about the BTree.pm at

    http://perl.plover.com/BTree/

    ? Or have you found some other package? If you want help with a piece
    of software, you need to tell people what and where it is or else they
    won't know and they can't help you with it.

    >Its beyond me that some guy would write the BTree.pm but left out the
    >delete operations. I know its a bit harder to implement, but if you
    >are going to do it, might as well make it complete so people can use
    >it.


    If you *are* talking about the BTree.pm at
    http://perl.plover.com/BTree/, then I can tell you the answer to your
    question. The BTree.pm module was written as example code to
    accompany a magazine article about how B-Trees work. It was not
    intended to be used for real projects, because any such use would be
    silly even if the module did contain 'delete'. The author left out
    'delete' to keep the code short enough for publication and because the
    article didn't deal with 'delete'.

    However, if you really think that a complete BTree.pm is what you
    need, I expect that for a fee, the module author would be willing to
    add those functions to the module, and also extend it to store its
    values on the disk instead of in memory. You could spend some of the
    money you are saving by not using an RDBMS.
    Mark Jason Dominus, Jun 29, 2003
    #8
  9. "Jörg Westheide" <> wrote in message
    news:...
    Hi!

    > I am wondering if anyone has an example of using BTree for
    > storing data on a disk file.?? I found some BTree modules on the
    > web, but none give example of how you would then store it and retrieve
    > it from disk drive.


    Have a look at the DB_File module

    > I have a need to store hundred of thousands of
    > key, data values


    That's no problem (I used it with some millions)
    The only backdraw I discovered, is the really bad performance on
    Win(2K), it's fine on Linux though

    Jörg
    Paul Marquess, Jun 30, 2003
    #9
  10. "Vlad Tepes" <> wrote in message
    news:bdigil$a5a$...
    > Paul Marquess <> wrote:
    >
    > > The DB_File module, that comes with Perl, has a btree opion. You can
    > > easily store hundres of thousands of key/data pairs in it.
    > >
    > >
    > > The example below is derived from the docuemntation that comes with
    > > DB_File
    > >
    > > use warnings ;
    > > use strict ;
    > > use DB_File ;
    > >
    > > my %h ;
    > > tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
    > > or die "Cannot open file 'tree': $!\n" ;
    > >
    > > # Add a key/value pair to the file
    > > $h{'Wall'} = 'Larry' ;
    > > $h{'Smith'} = 'John' ;
    > > $h{'mouse'} = 'mickey' ;
    > > $h{'duck'} = 'donald' ;
    > >
    > > # Delete
    > > delete $h{"duck"} ;
    > >
    > > # Cycle through the keys printing them in order.
    > > # Note it is not necessary to sort the keys as
    > > # the btree will have kept them in order automatically.
    > > foreach (keys %h)
    > > { print "$_\n" }
    > >
    > > untie %h ;
    > >
    > > Paul

    >
    > ( I'm not sure about this, but I think I'm right: )
    >
    > A small note: If you have a very big hash, you'd might like to save
    > memory by doing
    >
    > while ( (my $k, undef) = each %h ) {
    > print "$k\n";
    > }
    >
    > instead of using the foreach loop in the above example.


    Good catch. You are quite correct. If you have millions of records in the
    database, this is the only way to iterate through them without running out
    of memory.

    Using "keys" of "values" will result in *all* the keys or values being read
    into memory in one go. The "each" function will only read a single key/value
    pair at a time.

    Paul
    Paul Marquess, Jun 30, 2003
    #10
  11. "Jörg Westheide" <> wrote in message
    news:...
    > Hi!
    >
    > > I am wondering if anyone has an example of using BTree for
    > > storing data on a disk file.?? I found some BTree modules on the
    > > web, but none give example of how you would then store it and retrieve
    > > it from disk drive.

    >
    > Have a look at the DB_File module
    >
    > > I have a need to store hundred of thousands of
    > > key, data values

    >
    > That's no problem (I used it with some millions)
    > The only backdraw I discovered, is the really bad performance on
    > Win(2K), it's fine on Linux though


    Can you elaborate on the performance issues on Win2k?

    I don't do any development on Windows myself, and haven't had any reports of
    bad performance (as far as I can remember). If there is a performance issue
    with Windows, I'd like to fix it (if it is fixable), or document it if it is
    not.

    cheers
    Paul
    Paul Marquess, Jun 30, 2003
    #11
  12. Kenjis Kaan

    Guest

    (Kenjis Kaan) wrote:
    > Hello. I am wondering if anyone has an example of using BTree for
    > storing data on a disk file.?? I found some BTree modules on the
    > web, but none give example of how you would then store it and retrieve
    > it from disk drive. I have a need to store hundred of thousands of
    > key, data values but told not to use a RDBMS.


    Were you told *why* not to use a RDBMS? If there is a valid reason
    to exclude RDBMS, that same reason might exclude other suggestions.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service New Rate! $9.95/Month 50GB
    , Jun 30, 2003
    #12
  13. "Jörg Westheide" <> wrote in message
    news:...
    > Hi Paul!
    >
    > > Can you elaborate on the performance issues on Win2k?

    >
    > Well, we have a program that reads data, stores them in a DB_File BTree
    > and then reads and prints them (sorted).
    > With some test data we had times of 40 minutes for a whole run under
    > Linux. Under Win2K we killed it after 24 hours while it was still
    > reading and storing the data. Profiling showed that inserting the data
    > is the problem (and not in our code)
    >
    > > I don't do any development on Windows myself, and haven't had any
    > > reports of bad performance (as far as I can remember).

    >
    > Maybe others don't use it with that much data. We had more than a
    > million entries in the BTree and the db file grew to more than 1GB. It
    > seemed to me that the perfomance became more worse the more data was
    > stored. With only some 10,000 entries there was not a big difference
    >
    > > If there is a performance issue
    > > with Windows, I'd like to fix it (if it is fixable), or document it
    > > if it is not.

    >
    > I would like to see it fixed.
    > If you need additional information, let me know


    Hmm, that certainly is a big difference!

    To take this any further, I need to reproduce the behaviour myself, so I
    need a few more details on the data you are storing - for example, are all
    keys fixed length? Are you using the default sorting algorithm that comes
    with Btree? If not, can I see the user-defined sorting sub please? Is the
    data in a completely random order or partially sorted?

    Ideally, I'd like to have a look at the script you are using, plus a sample
    of real data.

    Paul
    Paul Marquess, Jul 3, 2003
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Nobody
    Replies:
    5
    Views:
    762
    Claudio Puviani
    Feb 9, 2004
  2. Nobody
    Replies:
    2
    Views:
    779
    John Harrison
    Feb 10, 2004
  3. Eloff

    Hashtable or Btree?

    Eloff, Dec 23, 2004, in forum: C++
    Replies:
    4
    Views:
    1,123
    Martin Stettner
    Dec 26, 2004
  4. Spam Me Please

    BTree beginner

    Spam Me Please, Mar 19, 2005, in forum: C++
    Replies:
    4
    Views:
    3,346
    Karl Heinz Buchegger
    Mar 21, 2005
  5. os2

    btree

    os2, Oct 25, 2003, in forum: C Programming
    Replies:
    4
    Views:
    492
    Ravi Uday
    Oct 27, 2003
Loading...

Share This Page