Safe file I/O to shared file (or SQLite) from multi-threaded webserver

python

I'm looking for the best practice way for a multi-threaded python web
server application to read/write to a shared file or a SQLite database.

What do I need to do (if anything) to make sure my writes to a regular
file on disk or to a SQLite database are atomic in nature when multiple
clients post data to my application simultaneously?

Do I need to use a Queue type data structure and then run a background
thread that monitors my Queue for data which it (and it alone) removes
and copies to the destination file or SQLite database?
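
That is, something along these lines (a rough sketch only, with made-up
table and queue names; the queue module is the Python 3 spelling):

    import queue      # "Queue" on Python 2
    import sqlite3
    import threading

    write_queue = queue.Queue()

    def writer_thread(db_path="data.db"):
        # The connection is created and used only inside this one thread.
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS posts (payload TEXT)")
        conn.commit()
        while True:
            payload = write_queue.get()   # blocks until a handler enqueues data
            if payload is None:           # sentinel: shut the writer down
                break
            with conn:                    # commits on success, rolls back on error
                conn.execute("INSERT INTO posts (payload) VALUES (?)", (payload,))
        conn.close()

    threading.Thread(target=writer_thread, daemon=True).start()

    # A request handler then only ever does: write_queue.put(some_string)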

Note: In my specific case, the web server will be based on CherryPy 3.1
but I think this type of question is relevant across other Python based
web server frameworks as well.

Thank you,
Malcolm
 
Diez B. Roggisch

Malcolm said:
I'm looking for the best practice way for a multi-threaded python web
server application to read/write to a shared file or a SQLite database.

What do I need to do (if anything) to make sure my writes to a regular
file on disk or to a SQLite database are atomic in nature when multiple
clients post data to my application simultaneously?

Do I need to use a Queue type data structure and then run a background
thread that monitors my Queue for data which it (and it alone) removes
and copies to the destination file or SQLite database?

Note: In my specific case, the web server will be based on CherryPy 3.1
but I think this type of question is relevant across other Python based
web server frameworks as well.

AFAIK, sqlite ensures process-serialization via locking, and threads
synchronize themselves as well.

So you shouldn't need to worry at all.

Diez
 
Roger Binns

Diez said:
AFAIK, sqlite ensures process-serialization via locking, and threads
synchronize themselves as well.

SQLite versions prior to 3.5 did not support using the same connection or
cursors in different threads. (You needed to allocate, use, and close all
in the same thread.) Since then SQLite objects can be used in any thread
you want at any time. The SQLite error-handling API is not threadsafe and
requires a lock to be held, otherwise you can get incorrect errors or, in the
worst case, program crashes. The sqlite3/pysqlite code does not hold that lock
(API introduced in SQLite 3.6.5) so you are only safe if you continue to
only use objects in the same thread. If you use APSW then you can use any
SQLite object at any time in any thread (it holds the lock amongst other
things).
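
In other words, with the stdlib sqlite3 module the safe pattern is to keep
each connection inside the thread that created it, e.g. something like this
(helper name and database path are made up):

    import sqlite3
    import threading

    _local = threading.local()

    def get_conn(db_path="data.db"):
        # Lazily open one connection per thread, so every sqlite3 object is
        # only ever touched by the thread that created it.
        if not hasattr(_local, "conn"):
            _local.conn = sqlite3.connect(db_path)
        return _local.conn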
So you shouldn't need to worry at all.

The main gotcha is that SQLite uses file locking and the default behaviour
when unable to get a lock is to immediately return an error. SQLite does
have an API to specify how long it should wait to acquire the lock (it keeps
retrying until the time expires).

sqlite3/pysqlite only lets you specify this maximum time when opening the
connection and defaults to 5 seconds. On a busy server this may be too
short so you'll end up getting busy errors. (Remember that writes require
at least two disk syncs and that the default behaviour for Linux is to flush
all outstanding writes, not just those for the file requested.)
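
For example, with sqlite3/pysqlite the only knob is the timeout argument to
connect(), given in seconds:

    import sqlite3

    # timeout is in seconds and defaults to 5.0; raise it if a busy server
    # keeps producing "database is locked" errors.
    conn = sqlite3.connect("data.db", timeout=30.0)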

If you use APSW then you get default SQLite behaviour and two APIs - one
lets you set/change the timeout period and the other lets you install your
own busy handler which can do whatever it wants in order to prod things along.
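
Roughly, that looks like the following (call names as I recall them from the
APSW documentation; verify against the version you are using, and note that
installing a busy handler replaces any busy timeout and vice versa):

    import apsw

    conn = apsw.Connection("data.db")

    # Either keep retrying for up to 30 seconds (argument is milliseconds)...
    conn.setbusytimeout(30000)

    # ...or install your own busy handler.  Returning True tells SQLite to
    # retry; returning False makes it give up and report the busy error.
    def busy_handler(prior_calls):
        return prior_calls < 100

    conn.setbusyhandler(busy_handler)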

(Disclosure: I am the author of APSW.)

Roger
 
John Nagle

Malcolm said:
I'm looking for the best practice way for a multi-threaded python web
server application to read/write to a shared file or a SQLite database.

What do I need to do (if anything) to make sure my writes to a regular
file on disk or to a SQLite database are atomic in nature when multiple
clients post data to my application simultaneously?

SQLite can do that correctly, but SQLite isn't intended for use
as a database engine for a busy database being used by many
concurrent operations. Especially if those operations involve
updates. Any update in SQLite locks all tables involved for the duration
of the operation. When SQLite hits a lock, it returns an error code, and
the caller should retry after a delay. If this occurs frequently in
your application, you've hit the limits of SQLite. Then it's
time to move up to MySQL.
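
A rough sketch of that retry-after-delay approach with sqlite3 (function,
table, and column names are illustrative):

    import sqlite3
    import time

    def insert_with_retry(conn, payload, attempts=5, delay=0.1):
        for _ in range(attempts):
            try:
                with conn:   # commits on success, rolls back on error
                    conn.execute("INSERT INTO posts (payload) VALUES (?)",
                                 (payload,))
                return True
            except sqlite3.OperationalError as exc:
                if "locked" not in str(exc):
                    raise
                time.sleep(delay)
        return False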

If you have enough traffic that you need a multi-threaded web server,
it's probably time to move up.

John Nagle
 
Steve Holden

John said:
SQLite can do that correctly, but SQLite isn't intended for use
as a database engine for a busy database being used by many
concurrent operations. Especially if those operations involve
updates. Any update in SQLite locks all tables involved for the duration
of the operation. When SQLite hits a lock, it returns an error code, and
the caller should retry after a delay. If this occurs frequently in
your application, you've hit the limits of SQLite. Then it's
time to move up to MySQL.
Or PostgreSQL, which has superior SQL standards conformance and
excellent high-volume data performance.
If you have enough traffic that you need a multi-threaded web server,
it's probably time to move up.
Yes, but not to MySQL, please. Particularly since there is a sword of
Damocles hanging over its head while the Oracle takeover of Sun is pending.

regards
Steve
 
Lawrence D'Oliveiro

Steve said:
Yes, but not to MySQL, please. Particularly since there is a sword of
Damocles hanging over its head while the Oracle takeover of Sun is
pending.

Ah, I see the FUDsters are crawling out of the woodwork here, as well. I’ve
got news for you: MySQL is an open-source product. And you can’t kill Open
Source. So go crawling back to your proprietary world, if that’s the only
world you understand.
 
Steve Holden

Lawrence said:
Ah, I see the FUDsters are crawling out of the woodwork here, as well. I’ve
got news for you: MySQL is an open-source product. And you can’t kill Open
Source. So go crawling back to your proprietary world, if that’s the only
world you understand.

I have no objection to you attempting to inform me about things I
already understand, but I would appreciate at least some attempt on your
part to maintain civility in your discourse. There is no need to be so
obnoxious, or so ill-informed: I didn't get to chair the Python Software
Foundation by "crawling [around in a] proprietary world", so kindly mind
your manners.

MySQL has always been technically inferior to other choices of open
source database. The current state of affairs was entirely predictable,
and appears to be more to do with Monty Widenius' wish to continue
exploiting a brand that he sold to Sun two years ago than it has to do
with technical issues, as I pointed out yesterday.

http://holdenweb.blogspot.com/2010/01/wht-save-mysql-now.html

regards
Steve
 
Diez B. Roggisch

Lawrence said:
Ah, I see the FUDsters are crawling out of the woodwork here, as well. I’ve
got news for you: MySQL is an open-source product. And you can’t kill Open
Source. So go crawling back to your proprietary world, if that’s the only
world you understand.

Since when is suggesting Postgres a sign of a proprietary world crawler?

And while I don't wish MySQL anything bad - open source *can* die, and
will, if leadership changes for the worse - which can happen. And is
certainly an immediate threat here.

Diez
 
