fastest native python database?

per · Jun 18, 2009

hi all,

i'm looking for a native python package to run a very simple data
base. i was originally using cPickle with dictionaries for my problem,
but i was making dictionaries out of very large text files (around
1000MB in size) and pickling was simply too slow.

i am not looking for fancy SQL operations, just very simple data base
operations (doesn't have to be SQL style) and my preference is for a
module that just needs python and doesn't require me to run a separate
data base like Sybase or MySQL.

does anyone have any recommendations? the only candidates i've seen
are SnakeSQL and buzhug... any thoughts/benchmarks on these?

any info on this would be greatly appreciated. thank you

Emile van Sebille · Jun 18, 2009

You might like gadfly...

http://gadfly.sourceforge.net/gadfly.html

Emile

per · Jun 18, 2009

i would like to add to my previous post that if an option like SQLite
with a python interface (pysqlite) would be orders of magnitude faster
than native python options, i'd prefer that. but if that's not the
case, a pure python solution without dependencies on other things
would be the best option.

thanks for the suggestion, will look into gadfly in the meantime.

William Clifford · Jun 18, 2009

I don't know how they stack up but what about:

Python CDB

http://pilcrow.madison.wi.us/#pycdb

or Dee (for ideological reasons)

http://www.quicksort.co.uk/

Pierre Bourdon · Jun 18, 2009

Hi,

If you just need something which does not depend on any external
libraries (that's what I understand in "just needs python"), you
should also consider sqlite3 as it is a built-in module in Python 2.5
and newer. You do not need modules like pysqlite to use it.
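A minimal sketch of what that could look like with the built-in sqlite3 module, assuming the text files boil down to key/value pairs. The table name `kv` and the helper names `build_db` and `lookup` are made up for illustration:

```python
import sqlite3


def build_db(path, pairs):
    """Create a key/value table at `path` and bulk-load `pairs` into it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
    )
    # One transaction for the whole batch avoids a commit per row.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", pairs
        )
    return conn


def lookup(conn, key):
    """Return the value stored under `key`, or None if it is absent."""
    row = conn.execute(
        "SELECT value FROM kv WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row is not None else None


if __name__ == "__main__":
    # ":memory:" keeps the demo off disk; pass a file name to persist it.
    conn = build_db(":memory:", [("john", "33"), ("mary", "41")])
    print(lookup(conn, "john"))
    conn.close()
```

Since `pairs` can be any iterable, a generator that parses the text file line by line would let the load run without building the whole dictionary in memory first.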

Pierre Quentel · Jun 18, 2009

Hi,

buzhug syntax doesn't use SQL statements, but a more Pythonic syntax:

from buzhug import Base
db = Base('foo').create(('name',str),('age',int))
db.insert('john',33)
# simple queries
print db(name='john')
# complex queries
print [ rec.name for rec in db if rec.age > 30 ]
# update (rec is a record from the database)
rec.update(age=34)

I made a few speed comparisons with Gadfly, KirbyBase (another pure-
Python DB, not maintained anymore) and SQLite. You can find the
results on the buzhug home page: http://buzhug.sourceforge.net

The conclusion is that buzhug is much faster than the other pure-
Python db engines, and (only) 3 times slower than SQLite.

- Pierre

Lawrence D'Oliveiro · Jun 18, 2009

Use Python mapping objects. Most real-world databases will fit in memory
anyway.
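As a rough illustration of the mapping-object approach, here is a sketch that builds a dictionary straight from the lines of a text file instead of unpickling one. The tab-separated layout and the helper name `index_lines` are assumptions, not anything from the original problem:

```python
from collections import defaultdict


def index_lines(lines, sep="\t"):
    """Group the remaining fields of each line under its first field."""
    index = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, rest = line.partition(sep)
        index[key].append(rest.split(sep) if rest else [])
    return index


if __name__ == "__main__":
    sample = ["john\t33\tNY\n", "mary\t41\tLA\n", "john\t34\tSF\n"]
    idx = index_lines(sample)
    print(idx["john"])
```

An open file object can be passed as `lines`, so the file is read once, line by line.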

pdpi · Jun 18, 2009

Pierre Quentel said:
The conclusion is that buzhug is much faster than the other pure-
Python db engines, and (only) 3 times slower than SQLite

Which means that, at this point in time, since both gadfly and sqlite
use approximately the same API, sqlite takes the lead as a core
package (post-2.5 anyway)

J Kenneth King · Jun 18, 2009

berkeley db is pretty fast. locking and such nice features are
included. the module you'd be looking at is bsddb i believe.
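For readers on Python 3: bsddb shipped with Python 2 but is no longer in the standard library. The sketch below uses the standard `dbm` module as a stand-in to show the same dictionary-on-disk style of access. It is not Berkeley DB, so the locking features mentioned above are not part of it, and the helper names `store_pairs` and `fetch` are made up for illustration:

```python
import dbm
import os
import tempfile


def store_pairs(path, pairs):
    """Write str key/value pairs to an on-disk dbm file."""
    # "c" opens for read/write and creates the file if it does not exist.
    with dbm.open(path, "c") as db:
        for key, value in pairs:
            db[key.encode("utf-8")] = value.encode("utf-8")


def fetch(path, key):
    """Read one value back; returns None when the key is missing."""
    with dbm.open(path, "r") as db:
        raw = db.get(key.encode("utf-8"))
    return raw.decode("utf-8") if raw is not None else None


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo")
    store_pairs(path, [("john", "33")])
    print(fetch(path, "john"))
```

Keys and values are stored as bytes, so anything richer than a string would have to be serialized first.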

Ethan Furman · Jun 19, 2009

Howdy, Pierre!

I have also written a pure Python implementation of a database, one that
uses dBase III or VFP 6 .dbf files. Any chance you could throw it into
the mix to see how quickly (or slowly!) it runs?

The code to run the same steps is (after an import dbf):

#insert test
table = dbf.Table('/tmp/tmptable', 'a N(6.0), b N(6.0), c C(100)')
# if recs is list of tuples
for rec in recs:
    table.append(rec)
# elif recs is list of lists
#for a, b, c in recs:
#    current = table.append()
#    current.a = a
#    current.b = b
#    current.c = c

#select1 test
for i in range(100):
    nb = len(table)
    if nb:
        avg = sum([record.b for record in table])/nb

#select2 test
for num_string in num_strings:
    records = table.find({'c':'%s'%num_string}, contained=True)
    nb = len(records)
    if nb:
        avg = sum([record.b for record in records])/nb

#delete1 test
for record in table:
    if 'fifty' in record.c:
        record.delete_record()
# to purge the records would then require a table.pack()

#delete2 test
for rec in table:
    if 10 < rec.a < 20000:
        rec.delete_record()
# again, permanent deletion requires a table.pack()

#update1 test
table.order('a')
for i in range(100): # update description says 1000, update code is 100
    records = table.query(python='10*%d <= a < 10*%d' %(10*i,10*(i+1)))
    for rec in records:
        rec.b *= 2

#update2 test
records = table.query(python="0 <= a < 1000")
for rec in records:
    rec.c = new_c[rec.a]

Thanks, I hope!

~Ethan~
http://groups.google.com/group/python-dbase

Aaron Brady · Jun 19, 2009

I have one or two. If the objects you're pickling are all
dictionaries, you could store file names in a master 'shelve' object,
and nested data in the corresponding files.

Otherwise, it may be pretty cheap to write the operations by hand
using ctypes if you only need a few, though that can get precarious
quickly. Just like life, huh?

Lastly, the 'sqlite3' module's bark is worse than its byte.
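The master 'shelve' idea from the first suggestion might be sketched roughly like this, assuming each nested dictionary gets its own pickle file. The file layout and the helper names `save_nested` and `load_nested` are made up for illustration:

```python
import os
import pickle
import shelve
import tempfile


def save_nested(directory, name, data):
    """Pickle `data` to its own file and record the file name in the master shelf."""
    filename = os.path.join(directory, name + ".pkl")
    with open(filename, "wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
    with shelve.open(os.path.join(directory, "master")) as master:
        master[name] = filename


def load_nested(directory, name):
    """Look up the file name in the master shelf and unpickle only that file."""
    with shelve.open(os.path.join(directory, "master")) as master:
        filename = master[name]
    with open(filename, "rb") as fh:
        return pickle.load(fh)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    save_nested(workdir, "ages", {"john": 33, "mary": 41})
    print(load_nested(workdir, "ages"))
```

The point of the split is that a lookup only unpickles the one file it needs rather than the whole data set.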
