SQLite or files?

V

vlad

Hello,

I'm a novice in Python and got one question related to the information
storage for my application.

I'm currently working on rewriting my own the program that stores
everyday information sitting in the system tray. It was written in
Delphi some time ago and proved to be durable and fast enough. All the
info was stored in the simple RTF files.

However now I'd like to rewrite this program in Python (using PyQt) as
I want to make it cross-platform and add/remove some features. Now I'm
thinking about where to store my information. Would it be better to
use files as I used to do or use the database, SQLite in particular?
What will be faster and more flexible in the long run? This
application will be in the memory most of the time so I'm also
concerned about memory usage.

If anyone has any experience of comparison these two approaches please
help.

Thanks,
Vlad
 
T

Tim Chase

info was stored in the simple RTF files.
However now I'd like to rewrite this program in Python (using PyQt) as
I want to make it cross-platform and add/remove some features. Now I'm
thinking about where to store my information. Would it be better to
use files as I used to do or use the database, SQLite in particular?
What will be faster and more flexible in the long run? This
application will be in the memory most of the time so I'm also
concerned about memory usage.

Not knowing what you do with the files, nor what sort of data
they contain makes it a bit difficult to make suggestions in
light of the context. However:

- sqlite will let you perform arbitrary queries against your
data, so if you want to aggregate or perform complex conditional
tests, it's just SQL.

- I don't know if you're currently keeping the RTF in memory the
whole time, or if you repeatedly reload (whether in one go, or
streaming it) and reparse the file. This sounds memory and/or
processor intensive. Using sqlite, the processing is done at the
C-module level, the data is kept on disk and only brought into
memory as-requested, being released when you're done with it.

- concurrently sharing a sqlite database should have minimal
issues. Sharing RTF files concurrently means locking/contention
issues. May not be an issue for you.

- sqlite comes built-in with Python2.5+ while RTF processing is
not batteries-included from what I can tell[1]

So in the general case, I see sqlite being a notable win over
RTF. Depending on your data, if it's just key/value pairs of
strings, you might look into the anydbm module[2] which has an
even simpler interface.

-tkc

[1]
http://pyrtf.sf.net

[2]
http://docs.python.org/library/anydbm.html
 
G

Gabriel Genellina

I like it too, but I hear the great powers that be are going to
deprecate it.

Why?
Even if it were to be deprecated, that could only happen in 2.7, the
module would still be there in 2.8, and could only disappear in 2.9 (or
3.4) (see PEP 4: Deprecation of Standard Modules). And even after it's
gone, being a pure Python module it's easy to keep the previous version
around.
So, I wouldn't worry about that.
 
T

TerryP

Gabriel said:
And even after it's
gone, being a pure Python module it's easy to keep the previous version
around.
So, I wouldn't worry about that.

Yeah, I'm sure that is the same kind of thinking that caused 16-bit MS-
DOS applications to remain a part of Windows NT so long.
 
A

alex23

TerryP said:
Yeah, I'm sure that is the same kind of thinking that caused 16-bit MS-
DOS applications to remain a part of Windows NT so long.

So what part of the standard library do you recommend using instead?
Or was there no time for advice between snarkiness?
 
A

Aahz

- I don't know if you're currently keeping the RTF in memory the
whole time, or if you repeatedly reload (whether in one go, or
streaming it) and reparse the file. This sounds memory and/or
processor intensive. Using sqlite, the processing is done at the
C-module level, the data is kept on disk and only brought into
memory as-requested, being released when you're done with it.

You can also make a SQLite database be in-memory, giving you the
performance benefits of skipping the disk.
 
T

TerryP

alex23 said:
So what part of the standard library do you recommend using instead?
Or was there no time for advice between snarkiness?

As a matter of technique, I believe in fitting the storage to the
particulars of the problem at hand.

In my own projects, I will often employ simple text based formats
(unix-rc, ini, or xml) whenever possible, and then roll the
application specifics to suit it -- for any data that I expect that
gaining good compression rates on later, will be favourable.
Personally from what I've read in this thread, I would suggest using
sqlite3 or an xml parser, depending on exactly what the OP wants.
SQLite3 is fairly stable for routine use, and assuming that the OP has
half a clue of figuring it out, would probably suit'em perfectly with
much less bother then the standard xml brews.

Over the years I have seen virtually everything tried for storing
information, down to writing dictionaries out to a file for later
slupin' & eval() recovery, which is a method that I have occasionally
thrown my hands up at.... I don't even want to mention some of the
commercial products I've bumped into!
 
T

Tim Chase

You can also make a SQLite database be in-memory, giving you
the performance benefits of skipping the disk.


Yes, I love the in-memory database -- especially for my automated
testing in Django. However, the OP said that memory footprint
was a concern (granted, I don't know how much data they were
talking about -- a couple dozen rows in 1-2 tables might make
this an attractive option).

-tkc
 
A

AggieDan04

I like it too, but I hear the great powers that be are going to
deprecate it.

If you want the convenience of shelve without the limitations of dbm,
you can do:


"""Implementation of Python shelves using SQLite."""

from __future__ import division

import UserDict
import pickle
import sqlite3

def to_db_type(value):
"""
If value's type is supported natively in SQLite, return value.
Otherwise, return a pickled representation.
"""
if value is None or isinstance(value, (int, long, float,
basestring)):
return value
else:
return buffer(pickle.dumps(value))

def from_db_type(value):
"""
Converts a value from the database to a Python object.
"""
if isinstance(value, buffer):
return pickle.loads(value)
else:
return value

class SQLiteShelf(UserDict.DictMixin):
"""
Shelf implementation using an SQLite3 database.
"""
def __init__(self, filename):
self._database = sqlite3.connect(filename)
self._database.execute("CREATE TABLE IF NOT EXISTS Shelf "
"(Key TEXT PRIMARY KEY NOT NULL, Value
BLOB)")
self._open = True
def __del__(self):
self.close()
def __getitem__(self, key):
row = self._database.execute("SELECT Value FROM Shelf WHERE
Key=?",
[key]).fetchone()
if row:
return from_db_type(row[0])
else:
raise KeyError(key)
def __setitem__(self, key, value):
self._database.execute("INSERT OR REPLACE INTO Shelf VALUES
(?, ?)",
[key, to_db_type(value)])
def __delitem__(self, key):
self._database.execute("DELETE FROM Shelf WHERE Key=?", [key])
def keys(self):
"""Return a list of keys in the shelf."""
return [row[0] for row in
self._database.execute("SELECT Key FROM Shelf")]
def close(self):
"""Commit changes and close the file."""
if self._database is not None:
self._database.commit()
self._database.close()
self._database = None
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,572
Members
45,045
Latest member
DRCM

Latest Threads

Top