Solutions for data storage?

L

Leif K-Brooks

I'm writing a relatively simple multi-user public Web application with
Python. It's a rewrite of a similar application which used PHP+MySQL
(not particularly clean code, either). My opinions on various Web
frameworks tends to vary with the phase of the moon, but currently, I'm
planning to use Quixote.

Needless to say, my application will need some kind of persistent data
storage. The previous PHP+MySQL application uses around 1.1GB of
storage, so something like PyPerSyst where everything is kept in memory
is out.

I've looked at SQLObject, and it's very nice, but it doesn't provide
certain features I really want, like the ability to store lists of
strings or integers directly in the database (using commas in a varchar
column or something).

ZODB is very nice, but IndexedCatalog doesn't seem to provide any
automatic mechanisms for things like ensuring attribute uniqueness (for
e.g. usernames). I suppose I could implement manual checking in every
class that needs it, but it really seems like very error-prone code to
be implementing at the application level.

My ideal solution would be an object database (or object-relational
mapper, I guess) which provided total transparency in all but a few
places, built-in indexing, built-in features for handling schema
changes, the ability to create attributes which are required to be
unique among other instances, and built-in querying with pretty syntax.
Atop seems to come pretty close, but since it's meant to be used in the
asynchronous Twisted environment, it doesn't provide any thread safety.

So basically, I'm wondering if there are any Python data storage
solutions I haven't noticed that come a bit closer to what I'm looking
for. I could live without built-in schema evolution; the querying is
really the most important feature I want.
 
J

Jan Dries

Leif K-Brooks said:
I've looked at SQLObject, and it's very nice, but it doesn't
provide certain features I really want, like the ability to store
lists of strings or integers directly in the database (using commas
in a varchar column or something).

What exactly in SQLObject prevents you from doing this? You may have to
pack/unpack the list into a comma separated string yourself, but surely the 2
line code function that requires can't be the problem.
And SQLObject's support for properties makes this very clean.
See:
http://www.sqlobject.org/docs/SQLObject.html#adding-magic-attributes-properties
Or am I missing something here?

Regards,
Jan
 
S

Steve Holden

Leif said:
I'm writing a relatively simple multi-user public Web application with
Python. It's a rewrite of a similar application which used PHP+MySQL
(not particularly clean code, either). My opinions on various Web
frameworks tends to vary with the phase of the moon, but currently, I'm
planning to use Quixote.

Needless to say, my application will need some kind of persistent data
storage. The previous PHP+MySQL application uses around 1.1GB of
storage, so something like PyPerSyst where everything is kept in memory
is out.

I've looked at SQLObject, and it's very nice, but it doesn't provide
certain features I really want, like the ability to store lists of
strings or integers directly in the database (using commas in a varchar
column or something).
One only slightly related point: if you are storing lists as column
values in a relational database then your database structure is not
normalized, and you may very well be making huge amounts of unnecessary
work for yourself.

The correct way to handle repeated attributes (multiple values in a
single column ) is to make that column a new table with a many-to-one
relationship to the original table.

In other words, if you have a table GOOF with columns myKey and myVals
(along with many others, presumably), you should restructure your
database to include a new table (say GOOFVals). The columns in this
would be something like GOOFKey and GOOFVal. The following row from GOOF:

myKey myVals
1 'ABC,DEF,GHI,JKL'

would become four rows in GOOFVals:

GOOFKey GOOFVal
1 'ABC'
1 'DEF'
1 'GHI'
1 'JKL'

You might also have another automatically-generated column to act as a
primary key, or you might let the (GOOFKey, GOOFVal) pair act as a
primary key - that would be up to your application.

Now it's a simple query to find out which GOOFs have val 'DEF", for example.
ZODB is very nice, but IndexedCatalog doesn't seem to provide any
automatic mechanisms for things like ensuring attribute uniqueness (for
e.g. usernames). I suppose I could implement manual checking in every
class that needs it, but it really seems like very error-prone code to
be implementing at the application level.

My ideal solution would be an object database (or object-relational
mapper, I guess) which provided total transparency in all but a few
places, built-in indexing, built-in features for handling schema
changes, the ability to create attributes which are required to be
unique among other instances, and built-in querying with pretty syntax.
Atop seems to come pretty close, but since it's meant to be used in the
asynchronous Twisted environment, it doesn't provide any thread safety.

So basically, I'm wondering if there are any Python data storage
solutions I haven't noticed that come a bit closer to what I'm looking
for. I could live without built-in schema evolution; the querying is
really the most important feature I want.

Feel free to tell me I'm talking rubbish, but it's rubbish that made Ted
Codd a worldwide reputation. It may well be that a properly-normalized
relational structure is exactly what you need.

regards
Steve
 
P

Pat

Leif said:
I'm writing a relatively simple multi-user public Web application with
Python. It's a rewrite of a similar application which used PHP+MySQL
(not particularly clean code, either). My opinions on various Web
frameworks tends to vary with the phase of the moon, but currently, I'm
planning to use Quixote.

Needless to say, my application will need some kind of persistent data
storage. The previous PHP+MySQL application uses around 1.1GB of
storage, so something like PyPerSyst where everything is kept in memory
is out.

You might want to look at Schevo (http://schevo.org), an ODBMS and
application development framework. Schevo builds on some of the
concepts introduced in Pypersyst, and can use Pypersyst as its backend
storage, but it can also use ZODB and Durus (and it is easy to switch
between backends). Schevo provides schema evolution and migration
features, enforces referential integrity and field constraints,
enforces unique indexes (whether single field or multiple field keys),
etc. All you have to do is describe your objects using a simple syntax
such as this snippet from a weblog application that we are putting
together as an example:

class Author:
"""Authors write posts."""

name = f.string()
password = f.hashedValue()
email = f.string()

_key(name)

_icon('.apps.kuser')

def __str__(self):
return self.name


class Post:
"""Posts contain content posted to the weblog."""

slug = f.string(doc='The short name that appears in the URL.')
title = f.string()
published = f.datetime()
author = f.entity(allow=Author)
excerpt = f.memo()
content = f.memo()

_key(slug)

_icon('.filesystems.desktop')

def __str__(self):
return self.slug


Schevo might not be quite ready for your particular needs, but the
situation you describe is the target for Schevo. While most of our UI
work has been with Qt, our next focus is on Nevow, which we have used
in the past on a predecessor to what is now Schevo. I've used Quixote
in the past, but I'm not sure how easy it would be to use it as the UI
for a Schevo app. Most of our short-term efforts are going to be
concentrated on Nevow and Plone for web applications, and Qt for GUI
apps (and hopefully wxPython at some point).

Matthew Scott and I will be giving a Schevo presentation at PyCon on
the subject of "Developing Database Applications With Schevo". You can
read our outline at http://schevo.org/doc/pycon2005/proposal.

Good luck with your project.
 
P

Pat

Posting using Google Groups messed up the formatting of those class
definition examples. Assume that they contain the usual indentation of
typical class definitions. ;-)
 
S

Shalabh Chaturvedi

Leif said:
I'm writing a relatively simple multi-user public Web application with
Python. It's a rewrite of a similar application which used PHP+MySQL
(not particularly clean code, either). My opinions on various Web
frameworks tends to vary with the phase of the moon, but currently, I'm
planning to use Quixote.

Good choice, IMO! And while you're using Quixote, you might want to look
at QLime [1] for all your data storage needs :)
I've looked at SQLObject, and it's very nice, but it doesn't provide
certain features I really want, like the ability to store lists of
strings or integers directly in the database (using commas in a varchar
column or something).

In any mapper that lets you set attributes for storing columns (such as
QLime), this should be fairly simple using properties.
My ideal solution would be an object database (or object-relational
mapper, I guess) which provided total transparency in all but a few
places, built-in indexing, built-in features for handling schema
changes, the ability to create attributes which are required to be
unique among other instances, and built-in querying with pretty syntax.

Briefly, QLime uses RDBMS indexing, automatically handles schema changes
(in fact you don't define the schema in Python at all, you just create
the table). For uniqueness you'd have to set the constraint on the
table. To look at the query syntax avaialable, (and OR mapping features)
see http://www.qlime.org/0.5.1/qlime_or_mapping.html.

Cheers,
Shalabh

[1] http://www.qlime.org/
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,020
Latest member
GenesisGai

Latest Threads

Top