sqlite3 db update extremely slow

coldpizza

I am trying to fill a sqlite3 database with records, and to this end I
have written a class that creates the db, the table and adds rows to
the table.

The problem is that the updating process is *extremely* slow, and
occasionally I get the message "database locked".

I tried removing "self.con.commit()" in the add_record method, but
then nothing is saved in the db. I don't know whether this has
anything to do with it, but I have found no option to enable
autocommit.

This is the class that I am using:

class sqliteDB(object):
    "Wrapper for SQLite methods"

    def __init__(self, db_file="sqlite3.db"):
        'Initialize SQLite database, sqlite_db_init("db_file_name.db")'
        print 'SQLite db init: ', db_file
        self.con = sqlite3.connect(db_file)
        self.cur = self.con.cursor()

    def create_table(self, table):
        "create table (table_name)"
        query = ('CREATE TABLE %s (hword VARCHAR(256) PRIMARY KEY, '
                 'definition TEXT)' % table)
        try:
            self.cur.execute(query)
            self.con.commit()
        except Exception, e:
            print e

    def add_record(self, table, headWord, definition):
        try:
            self.cur.execute('INSERT INTO ' + table +
                             ' (hword, definition) VALUES (?, ?)',
                             (headWord, definition))
            self.con.commit()
        except Exception, e:
            print e

And this is the actual code that I use to write to the db file:

db = sqliteDB()
db.create_table("table_name")

for k, v in myData:
    db.add_record("table_name", k, v)

This works extremely slow (~10KB of data per second) and takes ages to
complete even with small files. Where did I go wrong?

Would it be faster (or even possible) to import a text file into sqlite
using something like MySQL's
LOAD DATA INFILE "myfile.csv"...?
 
Roel Schroeven

coldpizza wrote:
I am trying to fill a sqlite3 database with records, and to this end I
have written a class that creates the db, the table and adds rows to
the table.

The problem is that the updating process is *extremely* slow, and
occasionally I get the message "database locked".

I tried removing "self.con.commit()" in the add_record method, but
then nothing is saved in the db. I don't know whether this has
anything to do with it, but I have found no option to enable
autocommit.

Remove self.con.commit() from add_record(), and do it once after all
records are added.

The reason that the process is slow with a commit after every INSERT is
that sqlite syncs the inserted data to disk before it continues.
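
As a sketch of that advice (using an in-memory database and made-up table
and column names for illustration), batching all the INSERTs into a single
transaction with executemany() and committing once avoids the per-row
disk sync:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # a real file path works the same way
cur = con.cursor()
cur.execute("CREATE TABLE words (hword TEXT PRIMARY KEY, definition TEXT)")

data = [("apple", "a fruit"), ("quark", "a particle")]

# All rows go into one transaction: sqlite syncs to disk only once,
# at the commit, instead of once per INSERT.
cur.executemany("INSERT INTO words (hword, definition) VALUES (?, ?)", data)
con.commit()

print(cur.execute("SELECT COUNT(*) FROM words").fetchone()[0])  # prints 2
```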
 
coldpizza

Thanks a lot, Roel, adding a single commit() at the end did solve the
speed problem.

Another question is do I have to explicitly close the DB connection,
or is it automatically garbage collected? Is it OK not to have any
cleanup code?

Another question would be how to define the encoding for newly added
records?
And how do I set the encoding for the retrieved records? Is it always
utf-8 by default?
 
Steve Holden

coldpizza said:
Thanks a lot, Roel, adding a single commit() at the end did solve the
speed problem.

Another question is do I have to explicitly close the DB connection,
or is it automatically garbage collected? Is it OK not to have any
cleanup code?
It's generally OK, but you can register a function with atexit() if you
are paranoid about cleanup. Here's a sample with an ugly global variable.

from atexit import register

def close():
    global conn
    if conn:
        conn.close()
        print "Database closed"
        conn = None

#
# We try to ensure the database is always closed by registering
# the nodule's close() function to be called on program exit
#
register(close)

import psycopg2 as db
conn = db.connect(database="billing", user="steve", password="tadaa!")
curs = conn.cursor()

print "Database opened"
Another question would be how to define the encoding for newly added
records?
And how do I set the encoding for the retrieved records? Is it always
utf-8 by default?
Generally speaking each database instance will have an associated
encoding. Trying to establish some other encoding would then be pissing
into the wind.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
 
Ben Finney

Steve Holden said:
# We try to ensure the database is always closed by registering
# the nodule's close() function to be called on program exit

Ooh! Where do I find more about writing Python nodules? Is it related
to cluster programming?
 
Guest

coldpizza said:
Thanks a lot, Roel, adding a single commit() at the end did solve the
speed problem.

Another question is do I have to explicitly close the DB connection,
or is it automatically garbage collected? Is it OK not to have any
cleanup code?

Another question would be how to define the encoding for newly added
records?
And how do I set the encoding for the retrieved records? Is it always
utf-8 by default?

SQLite databases store text in UTF-8 encoding. If you use pysqlite, and
always use unicode strings, you will never have any problems with that.
pysqlite does not rap on your knuckles if you store arbitrary encodings in
the database, but you will feel sorry once you try to fetch the data:
... # watch now
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pysqlite2.dbapi2.OperationalError: Could not decode to UTF-8 column 'bar'
with text '��'
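
A minimal sketch of the safe path (standard-library sqlite3 module,
hypothetical table and column names): always bind Unicode strings, and
the text round-trips through SQLite's UTF-8 storage intact:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE foo (bar TEXT)")

# Bind a real (unicode) string; SQLite stores it internally as UTF-8.
cur.execute("INSERT INTO foo (bar) VALUES (?)", (u"caf\xe9",))
con.commit()

row = cur.execute("SELECT bar FROM foo").fetchone()
assert row[0] == u"caf\xe9"  # decoded back to the same string on fetch
```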
HTH

-- Gerhard
 
