fastest native python database?

Discussion in 'Python' started by per, Jun 18, 2009.

  1. per

    per Guest

    hi all,

    i'm looking for a native python package to run a very simple
    database. i was originally using cPickle with dictionaries for my problem,
    but i was making dictionaries out of very large text files (around
    1000MB in size) and pickling was simply too slow.

    i am not looking for fancy SQL operations, just very simple database
    operations (doesn't have to be SQL style) and my preference is for a
    module that just needs python and doesn't require me to run a separate
    database server like Sybase or MySQL.

    does anyone have any recommendations? the only candidates i've seen
    are SnakeSQL and buzhug... any thoughts/benchmarks on these?

    any info on this would be greatly appreciated. thank you
     
    per, Jun 18, 2009
    #1

  2. On 6/17/2009 8:28 PM per said...
    > hi all,
    >
    > i'm looking for a native python package to run a very simple data
    > base. i was originally using cpickle with dictionaries for my problem,
    > but i was making dictionaries out of very large text files (around
    > 1000MB in size) and pickling was simply too slow.
    >
    > i am not looking for fancy SQL operations, just very simple data base
    > operations (doesn't have to be SQL style) and my preference is for a
    > module that just needs python and doesn't require me to run a separate
    > data base like Sybase or MySQL.


    You might like gadfly...

    http://gadfly.sourceforge.net/gadfly.html

    Emile

    >
    > does anyone have any recommendations? the only candidates i've seen
    > are snaklesql and buzhug... any thoughts/benchmarks on these?
    >
    > any info on this would be greatly appreciated. thank you
     
    Emile van Sebille, Jun 18, 2009
    #2

  3. per

    per Guest

    i would like to add to my previous post that if an option like SQLite
    with a python interface (pysqlite) would be orders of magnitude faster
    than naive python options, i'd prefer that. but if that's not the
    case, a pure python solution without dependencies on other things
    would be the best option.

    thanks for the suggestion, will look into gadfly in the meantime.

     
    per, Jun 18, 2009
    #3
  4. On Jun 17, 8:28 pm, per <> wrote:
    > hi all,
    >
    > i'm looking for a native python package to run a very simple data
    > base. i was originally using cpickle with dictionaries for my problem,
    > but i was making dictionaries out of very large text files (around
    > 1000MB in size) and pickling was simply too slow.
    >
    > i am not looking for fancy SQL operations, just very simple data base
    > operations (doesn't have to be SQL style) and my preference is for a
    > module that just needs python and doesn't require me to run a separate
    > data base like Sybase or MySQL.
    >
    > does anyone have any recommendations? the only candidates i've seen
    > are snaklesql and buzhug... any thoughts/benchmarks on these?
    >
    > any info on this would be greatly appreciated. thank you


    I don't know how they stack up but what about:

    Python CDB

    http://pilcrow.madison.wi.us/#pycdb

    or Dee (for ideological reasons)

    http://www.quicksort.co.uk/

    --
    William Clifford
     
    William Clifford, Jun 18, 2009
    #4
  5. On Thu, Jun 18, 2009 at 05:28, per<> wrote:
    > hi all,

    Hi,

    > i'm looking for a native python package to run a very simple data
    > base. i was originally using cpickle with dictionaries for my problem,
    > but i was making dictionaries out of very large text files (around
    > 1000MB in size) and pickling was simply too slow.
    >
    > i am not looking for fancy SQL operations, just very simple data base
    > operations (doesn't have to be SQL style) and my preference is for a
    > module that just needs python and doesn't require me to run a separate
    > data base like Sybase or MySQL.


    If you just need something which does not depend on any external
    libraries (that's what I understand by "just needs python"), you
    should also consider sqlite3, as it is a built-in module in Python 2.5
    and newer. You do not need modules like pysqlite to use it.
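
A minimal sketch of that suggestion, assuming Python 3 and an in-memory database. The table name, columns, and rows are made up for illustration; only the standard-library sqlite3 module is used.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
# Pass a file path instead of ":memory:" to keep the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# Parameterized inserts; executemany batches the rows in one call.
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [("john", 33), ("mary", 28)],
)
conn.commit()

# A simple filtered query, collected into a plain Python list.
older = [row[0] for row in conn.execute(
    "SELECT name FROM people WHERE age > ?", (30,))]
print(older)

conn.close()
```

For a large text file, the same pattern would apply with a file-backed database and rows fed to executemany from a generator, so the whole data set never has to be held in memory at once.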

    --
    Pierre "delroth" Bourdon <>
    Étudiant à l'EPITA / Student at EPITA
     
    Pierre Bourdon, Jun 18, 2009
    #5
  6. On 18 juin, 05:28, per <> wrote:
    > hi all,
    >
    > i'm looking for a native python package to run a very simple data
    > base. i was originally using cpickle with dictionaries for my problem,
    > but i was making dictionaries out of very large text files (around
    > 1000MB in size) and pickling was simply too slow.
    >
    > i am not looking for fancy SQL operations, just very simple data base
    > operations (doesn't have to be SQL style) and my preference is for a
    > module that just needs python and doesn't require me to run a separate
    > data base like Sybase or MySQL.
    >
    > does anyone have any recommendations? the only candidates i've seen
    > are snaklesql and buzhug... any thoughts/benchmarks on these?
    >
    > any info on this would be greatly appreciated. thank you


    Hi,

    buzhug doesn't use SQL statements, but a more Pythonic syntax:

    from buzhug import Base
    db = Base('foo').create(('name',str),('age',int))
    db.insert('john',33)
    # simple queries
    print db(name='john')
    # complex queries (fields are accessed as attributes of each record)
    print [ rec.name for rec in db if rec.age > 30 ]
    # update (in Python 2, rec is still bound to the last record iterated above)
    rec.update(age=34)

    I made a few speed comparisons with Gadfly, KirbyBase (another
    pure-Python DB, no longer maintained) and SQLite. You can find the
    results on the buzhug home page: http://buzhug.sourceforge.net

    The conclusion is that buzhug is much faster than the other
    pure-Python db engines, and (only) 3 times slower than SQLite.

    - Pierre
     
    Pierre Quentel, Jun 18, 2009
    #6
  7. In message <07ac7d7a-48e1-45e5-a21c-
    >, per wrote:

    > i'm looking for a native python package to run a very simple data
    > base.


    Use Python mapping objects. Most real-world databases will fit in memory
    anyway.
     
    Lawrence D'Oliveiro, Jun 18, 2009
    #7
  8. pdpi

    pdpi Guest

    On Jun 18, 8:09 am, Pierre Quentel <> wrote:
    > The conclusion is that buzhug is much faster than the other pure-
    > Python db engines, and (only) 3 times slower than SQLite


    Which means that, at this point in time, since both gadfly and sqlite
    use approximately the same API, sqlite takes the lead as a core
    package (post-2.5 anyway)
     
    pdpi, Jun 18, 2009
    #8
  9. per <> writes:

    > hi all,
    >
    > i'm looking for a native python package to run a very simple data
    > base. i was originally using cpickle with dictionaries for my problem,
    > but i was making dictionaries out of very large text files (around
    > 1000MB in size) and pickling was simply too slow.
    >
    > i am not looking for fancy SQL operations, just very simple data base
    > operations (doesn't have to be SQL style) and my preference is for a
    > module that just needs python and doesn't require me to run a separate
    > data base like Sybase or MySQL.
    >
    > does anyone have any recommendations? the only candidates i've seen
    > are snaklesql and buzhug... any thoughts/benchmarks on these?
    >
    > any info on this would be greatly appreciated. thank you


    Berkeley DB is pretty fast, and nice features such as locking are
    included. the module you'd be looking at is bsddb, i believe.
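
bsddb was deprecated in Python 2.6 and is not in the Python 3 standard library, so the sketch below swaps in the standard-library dbm module to show the same on-disk key-value style of access. It is not Berkeley DB itself, and the file name, keys, and values are made up for illustration.

```python
import dbm
import os
import tempfile

# Hypothetical file name and records, for illustration only.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "records")

# "c" opens the database for reading and writing, creating it if needed.
with dbm.open(path, "c") as db:
    db[b"gene1"] = b"chr1:100-200"
    db[b"gene2"] = b"chr2:300-400"

# Reopen read-only; each lookup goes to the file on disk, so the
# full data set does not have to be loaded into a dictionary first.
with dbm.open(path, "r") as db:
    value = db[b"gene1"]
    missing = db.get(b"gene3")  # returns None when the key is absent

print(value)
```

Keys and values are stored as bytes, so anything richer than a string would need to be serialized first (which is what shelve does on top of dbm).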
     
    J Kenneth King, Jun 18, 2009
    #9
  10. Ethan Furman

    Ethan Furman Guest

    Ethan Furman wrote:
    > This body part will be downloaded on demand.
    >



    Not sure what happened there... here's the text...

    Howdy, Pierre!

    I have also written a pure Python implementation of a database, one that
    uses dBase III or VFP 6 .dbf files. Any chance you could throw it into
    the mix to see how quickly (or slowly!) it runs?

    The code to run the same steps is (after an import dbf):

    #insert test
    table = dbf.Table('/tmp/tmptable', 'a N(6.0), b N(6.0), c C(100)')
    # if recs is list of tuples
    for rec in recs:
        table.append(rec)
    # elif recs is list of lists
    #for a, b, c in recs:
    #    current = table.append()
    #    current.a = a
    #    current.b = b
    #    current.c = c

    #select1 test
    for i in range(100):
        nb = len(table)
        if nb:
            avg = sum([record.b for record in table])/nb

    #select2 test
    for num_string in num_strings:
        records = table.find({'c':'%s'%num_string}, contained=True)
        nb = len(records)
        if nb:
            avg = sum([record.b for record in records])/nb

    #delete1 test
    for record in table:
        if 'fifty' in record.c:
            record.delete_record()
    # to purge the records would then require a table.pack()

    #delete2 test
    for rec in table:
        if 10 < rec.a < 20000:
            rec.delete_record()
    # again, permanent deletion requires a table.pack()

    #update1 test
    table.order('a')
    for i in range(100): # update description says 1000, update code is 100
        records = table.query(python='10*%d <= a < 10*%d' %(10*i,10*(i+1)))
        for rec in records:
            rec.b *= 2

    #update2 test
    records = table.query(python="0 <= a < 1000")
    for rec in records:
        rec.c = new_c[rec.a]

    Thanks, I hope! :)

    ~Ethan~
    http://groups.google.com/group/python-dbase
     
    Ethan Furman, Jun 19, 2009
    #10
  11. Aaron Brady

    Aaron Brady Guest

    On Jun 17, 8:28 pm, per <> wrote:
    > hi all,
    >
    > i'm looking for a native python package to run a very simple data
    > base. i was originally using cpickle with dictionaries for my problem,
    > but i was making dictionaries out of very large text files (around
    > 1000MB in size) and pickling was simply too slow.
    >
    > i am not looking for fancy SQL operations, just very simple data base
    > operations (doesn't have to be SQL style) and my preference is for a
    > module that just needs python and doesn't require me to run a separate
    > data base like Sybase or MySQL.
    >
    > does anyone have any recommendations? the only candidates i've seen
    > are snaklesql and buzhug... any thoughts/benchmarks on these?
    >
    > any info on this would be greatly appreciated. thank you


    I have one or two. If the objects you're pickling are all
    dictionaries, you could store file names in a master 'shelve' object,
    and nested data in the corresponding files.
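
One way to read that first idea, as a rough sketch assuming Python 3: a master shelf maps each top-level key to a file name, and each file holds one pickled dictionary. The helper names (save_dict, load_dict), the keys, and the data are hypothetical.

```python
import os
import pickle
import shelve
import tempfile

# Hypothetical working directory, for illustration only.
workdir = tempfile.mkdtemp()


def save_dict(master, key, data):
    """Pickle one dictionary to its own file and record the file name in the master shelf."""
    filename = os.path.join(workdir, key + ".pkl")
    with open(filename, "wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
    master[key] = filename


def load_dict(master, key):
    """Look up the file name in the master shelf and unpickle only that file."""
    with open(master[key], "rb") as fh:
        return pickle.load(fh)


with shelve.open(os.path.join(workdir, "master")) as master:
    save_dict(master, "chr1", {"geneA": 100, "geneB": 250})
    save_dict(master, "chr2", {"geneC": 75})
    loaded = load_dict(master, "chr2")

print(loaded)
```

The point of the split is that reading one nested dictionary only unpickles its own file, rather than the whole 1000MB structure.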

    Otherwise, it may be pretty cheap to write the operations by hand
    using ctypes if you only need a few, though that can get precarious
    quickly. Just like life, huh?

    Lastly, the 'sqlite3' module's bark is worse than its byte.
     
    Aaron Brady, Jun 19, 2009
    #11