OODB vs RDBMS

Thomas Guettler

Hi,

Most of the time I use ZODB/Durus to store my data.

I like it, but I know that it has some weaknesses:
- only accessible from Python
- I need to code my own indexes for fast searching myself.

I am thinking about using an RDBMS for the next project. What I don't like
about RDBMSs: if you need a list of values, you need to create a new
table. Example: if you want to store several email addresses for one
customer, you need to create a new table.

Since the namespace of table names is flat, you soon have so many
tables that it is hard to browse them.

Postgres has extensions which allow you to store arrays in a column. Is
this supported by the Python bindings? Are there other databases which
support this?

Are there OR-mappers (object relational mappers) which support lists in
a column?

How is the Unicode support of the Python bindings for RDBMSs? I don't
want to convert query results to Unicode myself.
Can you use Unicode strings in a SELECT statement?

Which database supports full-text searches with customized word
normalization? (These should all be treated the same: str., str, straße, strasse.)

Which OR-mapper do you suggest?
Are there OR-mappers which support this: sub-objects are
fetched automatically as you access the object tree?
Example: one customer has N TroubleTicketItems:

customerobj = get_customer(...)        # first SELECT
for ticket in customerobj.tickets:     # (*)
    ...

(*) The second SELECT is executed only if needed.

I know this message is vague, but maybe we can discuss the
pros and cons of "OODB vs RDBMS".

Thomas
 
Dan Lenski

I'll second the recommendation of sqlalchemy... I've heard it's very
good. Personally I prefer SQLObject, which is a similar library but in
my opinion a bit simpler and more "pythonic".

Needing to make a special "join table" to do arrays is one of the
downsides of RDBMS, but it leads to big performance advantages over
kludged-in things like using a variable-length string to store arrays.
Plus, with an ORM like sqlalchemy/sqlobject, you never have to deal
with the join table directly. If you have a table that matches users
to purchases, for example, you can just do something like:

user = User.get(user_id)          # fetch the user row by primary key
purchases = user.purchases        # related purchases via the join, fetched on access

ORMs like sqlobject are just awesome for making objects in a database
integrate seamlessly with Python code: they make accessing and
manipulating objects in the database as easy as accessing regular
program variables.
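For instance, here is a minimal SQLObject sketch of the customer/tickets case Thomas mentioned (the class and column names are made up, and it assumes an in-memory SQLite connection). The related rows are fetched only when the attribute is accessed:

from sqlobject import SQLObject, StringCol, ForeignKey, MultipleJoin
from sqlobject import connectionForURI, sqlhub

# in-memory SQLite database, just for the sketch
sqlhub.processConnection = connectionForURI('sqlite:/:memory:')

class Customer(SQLObject):
    name = StringCol()
    tickets = MultipleJoin('Ticket')    # the foreign key stays hidden

class Ticket(SQLObject):
    subject = StringCol()
    customer = ForeignKey('Customer')   # creates the customer_id column

Customer.createTable()
Ticket.createTable()

acme = Customer(name='ACME')                          # INSERT
Ticket(subject='printer on fire', customer=acme)      # INSERT

loaded = Customer.get(acme.id)       # first SELECT
for ticket in loaded.tickets:        # second SELECT happens here, on access
    print(ticket.subject)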

Dan
 
Bruno Desthuilliers

Dan said:
I'll second the recommendation of sqlalchemy... I've heard it's very
good. Personally I prefer SQLObject, which is a similar library but in
my opinion a bit simpler and more "pythonic".

SQLObject and SQLAlchemy have different purposes IMHO: the first one is
meant as a way to easily persist objects in a SQL DBMS, while the second
is primarily a (mostly successful) attempt to provide a higher-level,
pythonic interface to SQL DBMSs - the ORM features of SQLAlchemy being
mostly built on top of the SQL/Python integration layer.
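For example (a rough sketch using a recent SQLAlchemy; the table and values are made up), the SQL/Python layer lets you build and run SQL as Python expressions without any mapped classes:

from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, String, select)

engine = create_engine("sqlite://")     # in-memory database for the sketch
metadata = MetaData()
users = Table("users", metadata,
              Column("id", Integer, primary_key=True),
              Column("name", String(50)))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(users.insert().values(name="thomas"))
    result = conn.execute(select(users).where(users.c.name == "thomas"))
    print(result.fetchall())

The ORM then maps classes onto Table objects like this one, which is why both levels stay available.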
 
metaperl

Dan said:
I'll second the recommendation of sqlalchemy... I've heard it's very
good. Personally I prefer SQLObject, which is a similar library but in
my opinion a bit simpler and more "pythonic".

TurboEntity was the first "SQLObject Layer for SQLAlchemy"... it will
soon be improved upon, but it was quite slick.
 
Magnus Lycka

Thomas said:
Hi,

Most of the time I use ZODB/Durus to store my data.

I like it, but I know that it has some weaknesses:
- only accessible from Python
- I need to code my own indexes for fast searching myself.

There are other features of relational database systems
that I find much more relevant than those, but then I don't
know what your use case is. For high-performance, mission-critical,
multi-user database systems, where many simultaneous users actually
do fine-grained data manipulation, I don't think
there is any competition. Also, I don't know any other solution
that lets you do serious tuning without code changes. Even if the
application is the same, data size and usage patterns might lead
to vastly different performance. Being able to speed up some
operation at the expense of some other operations might be just
the right thing for a certain installation at a certain time.
This can be more or less automatic, depending on your brand and
version of RDBMS.

For a non-critical, single-user system without huge amounts of
data in a structure that fits the relational model, it might not
be worth the effort.

Thomas said:
I am thinking about using an RDBMS for the next project. What I don't like
about RDBMSs: if you need a list of values, you need to create a new
table. Example: if you want to store several email addresses for one
customer, you need to create a new table.

I think this is a tiny thing when you look at the big picture.
I don't know a lot about the PostgreSQL extensions, but the way
things work in relational databases has proven to work very well
for quite some time. Obviously, OODBMS's haven't made any huge
impact, despite two decades of efforts.

The "impedance mismatch" between OO programming and relational
databases is annoying, but it's something we have to (and can)
deal with.

Thomas said:
Since the namespace of table names is flat, you soon have so many
tables that it is hard to browse them.

The tablename namespace is not flat in SQL. Where did you get this
from? Although not implemented in every RDBMS, the SQL standard has
the concept of a schema, and every table should belong to a schema.
For instance, Oracle lacks schemata, but more or less makes up for it
through the way it implements users. (Tables are owned by users.)

There's just this two-level structure though, no arbitrary hierarchy.
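For example, with a DB-API driver you address a table through its schema; a sketch, assuming PostgreSQL via psycopg2 and made-up names:

import psycopg2

conn = psycopg2.connect("dbname=test")    # connection details are made up
cur = conn.cursor()
cur.execute("CREATE SCHEMA sales")
cur.execute("CREATE SCHEMA support")
# the same table name can exist once per schema - two levels, no deeper nesting
cur.execute("CREATE TABLE sales.customer (id INTEGER PRIMARY KEY, name VARCHAR(100))")
cur.execute("CREATE TABLE support.customer (id INTEGER PRIMARY KEY, name VARCHAR(100))")
cur.execute("SELECT name FROM sales.customer")
conn.commit()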

Thomas said:
Postgres has extensions which allow you to store arrays in a column. Is
this supported by the Python bindings? Are there other databases which
support this?

I don't know about the first question, but regarding the second, none
of the popular ones do as far as I know. PostgreSQL is a fine RDBMS
though.
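For what it's worth, I believe psycopg2 adapts Python lists to and from PostgreSQL array columns; a minimal sketch, assuming a text[] column and made-up connection details:

import psycopg2

conn = psycopg2.connect("dbname=test")
cur = conn.cursor()
cur.execute("CREATE TABLE customer (id serial PRIMARY KEY, emails text[])")
# a Python list is adapted to a PostgreSQL array on the way in ...
cur.execute("INSERT INTO customer (emails) VALUES (%s)",
            (["info@example.com", "billing@example.com"],))
# ... and comes back as a Python list on the way out
cur.execute("SELECT emails FROM customer")
print(cur.fetchone()[0])
conn.commit()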

By the way, a database is a collection of data, not some software.

Thomas said:
Are there OR-mappers (object relational mappers) which support lists in
a column?

I think they do, but having a separate class for the email addresses
(if we continue with your example above) has its advantages too. If
the customer has several email addresses (and you feel a desire to
keep track of that) they are probably different in some ways. It's
e.g. likely that you should use one particular address as recipient
when you send mail, not just one at random, or all of them. You might
also realize that not only customers, but also other entities, such
as employees, sub-contractors and authorities, have email addresses.
Actually, while you might get more tables due to the first normal
form, your tables might well get leaner, the total number of columns
smaller, and your overall data model more coherent.
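A sketch of what that could look like (sqlite3 here just to keep it runnable; the table and column names are only illustrative):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL
    );
    -- one row per address, with room for per-address attributes such as
    -- which one to use when you actually send mail
    CREATE TABLE email_address (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(id),
        address      TEXT NOT NULL,
        is_primary   INTEGER NOT NULL DEFAULT 0
    );
""")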

Thomas said:
How is the Unicode support of the Python bindings for RDBMSs? I don't
want to convert query results to Unicode myself.
Can you use Unicode strings in a SELECT statement?

As far as I remember, all the bindings I tried returned unicode
objects from varchar and char fields, and always accepted them
as parameters for CHAR/VARCHAR fields.

Remember to always pass parameters properly, i.e. use e.g.
cur.execute("SELECT * FROM T WHERE C=?", (col_value,)) rather than
something like cur.execute("SELECT * FROM T WHERE C=" + col_value).
The former will prevent SQL injection attacks, remove the need
to worry about quoting and escaping, and also improve performance
in the major systems.
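A runnable sketch with the standard sqlite3 module, which uses the qmark paramstyle shown above (other bindings use e.g. %s) and also shows the Unicode round trip from the question:

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (c TEXT)")
# parameters are passed as a sequence, never concatenated into the SQL string
street = u"Stra\u00dfe des 17. Juni"
cur.execute("INSERT INTO t (c) VALUES (?)", (street,))
cur.execute("SELECT * FROM t WHERE c = ?", (street,))
print(cur.fetchone())    # the text column comes back as a unicode string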
 
