The quest for opensource database...

Ruby Tuesdays

Perhaps a database guru here is able to suggest a good choice of
open-source database platform to use for developing projects.

At the moment the project is small/medium, but it will grow in data,
users, and number of transactions. I'm using MySQL right now, but its
lack of triggers, stored procedures, etc. sometimes slows the project.

Is PostgreSQL any better/worse? Or is there any other choice besides the two?
Thanks.
 
Luis Felipe Strano Moraes

PostgreSQL is definitely better. I heard that the new version of MySQL
will have some of the features you ask for, but it's still in beta.
There's also an open-source DB called Firebird, which is based on the
source code released from InterBase 6.x, I think (I might be talking nonsense here, though).

--lf
 
Tom Copeland

Is PostgreSQL any better/worse? Or is there any other choice besides the two?

RubyForge runs on PostgreSQL and I've been really happy with it so
far. RubyForge is not a busy database - only about a query per second
or so, and only about 400K records - but it's doing what we need it to do.

Yours,

Tom
 
Dick Davies

* Ruby Tuesdays said:
Perhaps a database guru here is able to suggest a good choice of
open-source database platform to use for developing projects.

At the moment the project is small/medium, but it will grow in data,
users, and number of transactions. I'm using MySQL right now, but its
lack of triggers, stored procedures, etc. sometimes slows the project.

Is PostgreSQL any better/worse? Or is there any other choice besides the two?
Thanks.

It's fab. The main/only reason to use MySQL is that everyone and
their dog has support for it in their PHP webmail/whatever.
 
Raphael Bauduin

Ruby said:
Perhaps a database guru here is able to suggest a good choice of
open-source database platform to use for developing projects.

At the moment the project is small/medium, but it will grow in data,
users, and number of transactions. I'm using MySQL right now, but its
lack of triggers, stored procedures, etc. sometimes slows the project.

Is PostgreSQL any better/worse? Or is there any other choice besides the two?
Thanks.
I'm using Postgres and am satisfied (I couldn't work without triggers).
I've heard that with tables holding millions of entries, Postgres can become rather slow.
I've used Postgres with both the ruby-dbi and native interfaces without problems.

I program the functions used by the triggers in PL/pgSQL (there's a project to let you write Postgres functions in Ruby, but it was unstable the last time I looked at it).
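For illustration, a minimal PL/pgSQL trigger function of the kind described here might look like the following sketch. The table, column, and function names are invented; the syntax is the PostgreSQL 7.x style, where the function body is a quoted string:

```sql
-- Hypothetical trigger: keep an updated_at column current on every UPDATE.
CREATE FUNCTION touch_updated_at() RETURNS trigger AS '
BEGIN
    NEW.updated_at := now();
    RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER customers_touch
    BEFORE UPDATE ON customers
    FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();
```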

Firebird (which is indeed coming from Interbase) is also a possibility, but I have no experience with it.

Raph
 
Shu-yu Guo

I love PostgreSQL as well, and my main gripe with it is speed. I often
run into situations where classes correspond nicely to tables (like a
User class to a users table), so I'm tempted to write code which would
automatically query the database for the necessary information in its
initialize. However, this gets quite slow because it becomes hard to
combine simple queries into complex ones (which execute much, much
faster). I have made optimizations around this, but they're not as nice
to use as just pure data encapsulation.

This is in no way meant to discourage you from using PostgreSQL, merely
to inform you of the biggest limitation I see. It has too many other
merits to give up (triggers, decent transaction support, and so on).

As a friend has told me again and again, always choose data integrity
over speed.
 
Mike Z

I love PostgreSQL as well, and my main gripe with it is speed. I often
run into situations where classes correspond nicely to tables (like a
User class to a users table), so I'm tempted to write code which would
automatically query the database for the necessary information in its
initialize. However, this gets quite slow because it becomes hard to
combine simple queries into complex ones (which execute much, much
faster). I have made optimizations around this, but they're not as nice
to use as just pure data encapsulation.

This is in no way meant to discourage you from using PostgreSQL, merely
to inform you of the biggest limitation I see. It has too many other
merits to give up (triggers, decent transaction support, and so on).

As a friend has told me again and again, always choose data integrity
over speed.

I'm not convinced by the speed argument - I haven't found anything quicker
for the data I typically store. In other words: PostgreSQL has always been
so fast that I haven't needed to care. Having seen Postgres outperform
Sybase by a _factor_ of 10 when building objects from stored data, it's
clear that a lot depends on the resources you dedicate to the database
rather than the database software itself (in ideal conditions, I'm sure
the race would be much closer).

As well as the rich data types and operations supported in PostgreSQL, the
big win is PL/Ruby, which means you can write triggers in Ruby. It's great
for embedding those tricky bits of business logic and data manipulation in
the database itself.

Cheers,
Mike
 
Ryan Pavlik

I love PostgreSQL as well, and my main gripe with it is speed. I
often run into situations where classes correspond nicely to tables
(like a User class to a users table), so I'm tempted to write code
which would automatically query the database for the necessary
information in its initialize. However, this gets quite slow because
it becomes hard to combine simple queries into complex ones (which
execute much, much faster). I have made optimizations around this,
but they're not as nice to use as just pure data encapsulation.
<snip>

There are a couple of problems here, and a few solutions to them as
well. I'll shamelessly plug Criteria here. Make sure you use the SVN
version:

http://svn.mephle.org/svn/mephle/criteria/

That will help you combine your queries; I do this regularly:

class Customer
  # This is the table; you probably want to use a DBITable, though.
  IDX = SQLTable.new("my_table", ...)

  CUSTOMER_NAME   = (IDX.name == IDX.x0?)  # .x0? is a placeholder
  CUSTOMER_BEFORE = (IDX.date < IDX.x0?)
  # :
  # : more queries
  # :
end

Later on we can combine them:

query = CUSTOMER_NAME["Bob"] & CUSTOMER_BEFORE[1975]

You get the idea. However, this is only one of your problems; the
other is the fact that it's very slow to do a query every time you want
an attribute, either getting or setting. For this I would recommend
"caching" things; make an attr_sql_accessor or the like which (perhaps
using Criteria... I do this too) makes note of when the value changes,
but only updates the database on a flush, and/or ties in transactions
(especially nice with PgSQL's MVCC). This should keep 3 lines like
this:

customer.name = "Larry"
customer.date = 1998
customer.address = "..."

...from generating 3 UPDATE statements; only one is issued instead. The
only thing you need to watch out for is making sure someone doesn't
update the database behind your back; for that I'm not sure what you can
do. (Maybe some magic with stored Ruby procedures and triggers... dunno.)
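A minimal sketch of that write-caching idea. The CachedRecord class, attr_sql_accessor, and the string-building flush are all invented for illustration; a real version would execute the statement through DBI and quote its values properly:

```ruby
# Sketch of the write-caching accessor: setters only record the change
# locally, and flush emits a single UPDATE covering every dirty column.
class CachedRecord
  def initialize(table, id)
    @table, @id = table, id
    @dirty = {}   # column => pending value, not yet written to the DB
  end

  # Generate getter/setter pairs that touch only the local cache.
  def self.attr_sql_accessor(*names)
    names.each do |name|
      define_method(name) { @dirty[name] }
      define_method("#{name}=") { |value| @dirty[name] = value }
    end
  end

  # Build one UPDATE for all pending changes, then clear the cache.
  # A real version would execute (and properly quote) this via DBI.
  def flush
    return nil if @dirty.empty?
    set = @dirty.map { |col, val| "#{col} = '#{val}'" }.join(", ")
    @dirty.clear
    "UPDATE #{@table} SET #{set} WHERE id = #{@id}"
  end
end

class Customer < CachedRecord
  attr_sql_accessor :name, :date, :address
end

customer = Customer.new("customers", 42)
customer.name = "Larry"
customer.date = 1998
customer.address = "10 Main St"
puts customer.flush
# => UPDATE customers SET name = 'Larry', date = '1998', address = '10 Main St' WHERE id = 42
```

Three assignments, one statement; wrapping several flushes in a transaction would batch things further.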
 
Wild Karl-Heinz

In message "The quest for opensource database..."

<snip>

RP> There are a couple problems here, and a few solutions to them, as
RP> well. I'll shamelessly plug Criteria here. Make sure you use the SVN
RP> version:

RP> http://svn.mephle.org/svn/mephle/criteria/

Quite nice :)
But are the SQLite post-setup.rb and the test scripts
up to date?

They produce errors.

I've changed it to try

$SQLITE = SQLite::Database.new(dbfile, 0)

but then it's a bit too heavy for me :)

-bash-2.05b$ ruby post-setup.rb
Skipping DBI tests
Loaded suite post-setup
Started
...............F
Finished in 0.132559 seconds.

1) Failure:
test_sqlite(TestSqlite) [./t/sqlite.rb:14]:
Exception raised:
Class: <NameError>
Message: <"uninitialized constant Criteria::SqliteTable::SQLiteError">
---Backtrace---
/lib/criteria/sqlite.rb:21:in `create'
/t/sqlite.rb:10:in `test_sqlite'
/t/sqlite.rb:9:in `assert_nothing_raised'
/t/sqlite.rb:14:in `test_sqlite'
---------------

16 tests, 65 assertions, 1 failures, 0 errors

maybe it helps.

regards
Karl-Heinz
 
Shu-yu Guo

There are a couple of problems here, and a few solutions to them as
well. I'll shamelessly plug Criteria here. Make sure you use the SVN
version:

Well, this is just abstraction over raw SQL; my problem was where to
*put* the constructed query. Putting it in initialize is slow. Consider
the following scenario:

There is a forum with 30 posts.
Each post has a user associated with it.
Each user has a profile associated with it.

The data is stored in three tables: posts, users, and profiles. If I
were to put the query in initialize, that's 3 queries for each
post, and it can quickly spiral out of control (especially since a good
amount of time might be wasted on pulling the same user again and
again).

What I do now is to define two class methods 'select_one' and
'select_all' (to look uniform with DBI, which I use) in the classes in
question (in this case, Post, User, and User::profile). These methods
fetch data (a WHERE clause is passed in as an argument) from the
database and pass a Hash to initialize. 'select_all' returns an array.

'select_all' is where the optimization happens; the query plan for the
scenario described might be something like the following:

1. In Post#select_all, select a list of all posts, then call
User#select_all with the criterion being all uids which appear in the
list of posts retrieved.
2. In User#select_all, select a list of the users given by the
criterion, then call User::profile#select_all with the criterion again
being all the uids which appear.
3. In User::profile#select_all, fetch all the profiles described by the
criterion and return them.

This select_all chain improves performance significantly, but I
question its OO-ness. For 30 posts, for example, this is 3 queries vs
3*30 queries.
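A toy version of that select_all chain, with the database replaced by in-memory arrays and a counter standing in for round-trips (all data and names here are invented; a real version would issue `SELECT ... WHERE uid IN (...)` through DBI):

```ruby
# Simulate the batched select_all chain: one query per table, not per row.
$queries = 0

POSTS    = [{ :id => 1, :uid => 10 }, { :id => 2, :uid => 10 }, { :id => 3, :uid => 11 }]
USERS    = [{ :uid => 10, :name => "ann" }, { :uid => 11, :name => "bob" }]
PROFILES = [{ :uid => 10, :sig => "hi" }, { :uid => 11, :sig => "yo" }]

# Stands in for "SELECT * FROM table [WHERE uid IN (...)]".
def select_all(table, uids = nil)
  $queries += 1
  uids ? table.select { |row| uids.include?(row[:uid]) } : table
end

posts    = select_all(POSTS)                # 1. all posts
uids     = posts.map { |p| p[:uid] }.uniq   #    uids that actually appear
users    = select_all(USERS, uids)          # 2. their users, one query
profiles = select_all(PROFILES, uids)       # 3. their profiles, one query

puts "#{posts.size} posts loaded with #{$queries} queries"
# => 3 posts loaded with 3 queries (instead of extra queries per post)
```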

While Criteria doesn't seem to solve my problem (well, it doesn't
present a better solution than what I have), I think I'll adopt it
anyway. The more abstraction the merrier. Thanks!
 
Shu-yu Guo

...generate 3 UPDATE statements, but only one instead. The only thing
you need to watch out for is making sure someone doesn't update the
database behind you; for this I'm not sure what you can do. (Maybe
some magic with stored ruby procedures and triggers... dunno.)

My application is web-based, and I just avoid having objects which
represent database data persist for more than one request. This is why
I need fast select. Unless I go distributed, caching things with ruby
procedures and triggers isn't enough, because if I use mod_ruby for
this web app, it's very likely that more than one copy of the app is
run as Apache pre-forks. If child #3 handles some request where some
data was input and it caches that input and updates the database, none
of the other children see that cached object, and unless they read the
database every request, they'll be hopelessly out of sync.

I hope what I said made sense.
 
Ryan Pavlik

On Wed, 21 Apr 2004 10:30:49 +0900

My application is web-based, and I just avoid having objects which
represent database data persist for more than one request. This is why
I need fast select. Unless I go distributed, caching things with ruby
procedures and triggers isn't enough, because if I use mod_ruby for
this web app, it's very likely that more than one copy of the app is
run as Apache pre-forks. If child #3 handles some request where some
data was input and it caches that input and updates the database, none
of the other children see that cached object, and unless they read the
database every request, they'll be hopelessly out of sync.

I hope what I said made sense.
<snip>

It does... I have found that approach to be rather
performance-limited, as you seem to be finding as well. What I did was
make the CGI very thin and do all the work in a persistent backend.
The benefits are manifold:

* There are no longer sessions on the backend, so you don't have
to worry about transaction-oriented programming and kludgery so
much.

* Data is retained as necessary, so access is fast.

* You haven't tied yourself to the web as a frontend, and can
write yourself "real" application frontends.

* You don't have to touch CGI or HTML much if you do it right,
which means focusing on core functionality.

Anyhow, this may or may not be an option for you, but it definitely
does help if you can do it.
 
Charles Comstock

It does... I have found that approach to be rather
performance-limited, as you seem to be finding as well. What I did was
make the CGI very thin and do all the work in a persistent backend.
The benefits are manifold:

* There are no longer sessions on the backend, so you don't have
to worry about transaction-oriented programming and kludgery so
much.

* Data is retained as necessary, so access is fast.

* You haven't tied yourself to the web as a frontend, and can
write yourself "real" application frontends.

* You don't have to touch CGI or HTML much if you do it right,
which means focusing on core functionality.

Anyhow, this may or may not be an option for you, but it definitely
does help if you can do it.

Ah, the wonders of a DRb backend server with a CGI client and a GUI client...

Now if only I could figure out how to effectively use DRb with SSL other
than through ssh or stunnel, it would be wonderful for everyone.
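A minimal sketch of that DRb split (the Backend class is invented for illustration; DRb itself is in Ruby's standard library, and port 0 asks for a free local port):

```ruby
require 'drb/drb'

# The persistent backend: owns the state; frontends are thin proxies.
class Backend
  def initialize
    @hits = 0
  end

  def ping
    @hits += 1
  end
end

# Server side: export one front object on an ephemeral local port.
DRb.start_service('druby://localhost:0', Backend.new)

# Client side (CGI, GUI, test script...): every call goes over the wire.
backend = DRbObject.new_with_uri(DRb.uri)
puts backend.ping   # => 1
puts backend.ping   # => 2
```

The CGI, GUI, and any future frontend all connect the same way, so the state lives in exactly one process.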

Charlie
 
Erwin Moller

Ruby said:
Perhaps a database guru here is able to suggest a good choice of
open-source database platform to use for developing projects.

At the moment the project is small/medium, but it will grow in data,
users, and number of transactions. I'm using MySQL right now, but its
lack of triggers, stored procedures, etc. sometimes slows the project.

Is PostgreSQL any better/worse? Or is there any other choice besides the
two? Thanks.

PostgreSQL is VERY robust and VERY rich in features.
You'll love it. I know I do. :)

Triggers are implemented. As are serial (autonumber) columns, PK, FK, and
all other types of constraints you can conjure up.

A very strong language for scripting stored procedures is implemented too
(named PL/pgSQL).

Exporting, backing up - everything is there.

Red Hat even developed a very nice GUI for PostgreSQL 7.3, which they call
RHDB (Red Hat Database) and which is actually PostgreSQL 7.3 (I didn't find
any differences). It is called Database Administrator for Postgres, Red Hat
Edition 3.

I am still running their alpha, but it works like a charm, so why update?

I hope I sound enthusiastic enough, because I really like PostgreSQL. :)

Regards,
Erwin Moller
 
Dick Davies

* Mike Z said:
I'm not convinced by the speed argument - I haven't found anything quicker
for the data I typically store. In other words: PostgreSQL has always been
so fast that I haven't needed to care.

One other important factor is transactions - by default, *every* Postgres
operation runs in its own transaction, which means that if you wrap several
operations in a single transaction, things actually get _faster_...
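The effect is easy to see with a stub connection that just counts commits (the StubConn class is invented for the sketch; with ruby-dbi you would get roughly the same shape by turning off AutoCommit and committing once):

```ruby
# Stub connection: every statement in autocommit mode pays for its own
# commit; an explicit transaction block pays for exactly one.
class StubConn
  attr_reader :commits

  def initialize
    @commits = 0
    @in_txn  = false
  end

  def do(sql)
    @commits += 1 unless @in_txn   # implicit per-statement transaction
  end

  def transaction
    @in_txn = true
    yield
    @in_txn = false
    @commits += 1                  # one commit for the whole block
  end
end

rows = (1..100).to_a

auto = StubConn.new
rows.each { |r| auto.do("INSERT INTO t VALUES (#{r})") }

batched = StubConn.new
batched.transaction do
  rows.each { |r| batched.do("INSERT INTO t VALUES (#{r})") }
end

puts "autocommit: #{auto.commits} commits"    # => autocommit: 100 commits
puts "batched:    #{batched.commits} commit"  # => batched:    1 commit
```

On a real server each commit is a disk sync, which is why the batched version wins.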
 
Tony Marston

Ruby Tuesdays said:
Perhaps a database guru here is able to suggest a good choice of
open-source database platform to use for developing projects.

At the moment the project is small/medium, but it will grow in data,
users, and number of transactions. I'm using MySQL right now, but its
lack of triggers, stored procedures, etc. sometimes slows the project.

You do not need stored procedures or database triggers to write successful
applications. I once had to maintain a system that was built around
procedures and triggers, and it was a nightmare. The problem was that one
trigger/procedure updated several tables, which fired more triggers which
contained more updates which fired more triggers ..... It was impossible to
keep track of what was being fired where.

Another reason I prefer to put all my business logic into PHP code instead
of triggers is that PHP code is a lot easier to debug. Have you come across
an interactive debugger for database procedures and triggers?
 
Erwin Moller

Tony said:
You do not need stored procedures or database triggers to write successful
applications. I once had to maintain a system that was built around
procedures and triggers, and it was a nightmare. The problem was that one
trigger/procedure updated several tables, which fired more triggers which
contained more updates which fired more triggers ..... It was impossible
to keep track of what was being fired where.

Another reason I prefer to put all my business logic into PHP code instead
of triggers is that PHP code is a lot easier to debug. Have you come
across an interactive debugger for database procedures and triggers?

Very good advice, Tony.
I second that opinion 100%.
Been there too. :-(

Stick to constraints like foreign keys (which I think do fire a trigger
when the relevant columns are updated/inserted) and avoid writing your own.
And stick to constraints like CHECK to make sure the data inserted makes
sense to you, instead of triggers.
Stored procedures often shorten development time, but they are hard to
debug.
It is often easy to do some 'quick and dirty' solution using stored
procedures, but when something changes in the application (which happens
all the time) you have to debug those SPs too.
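As an illustration of that declarative route, the constraints might be expressed directly in the table definition (all names here are invented):

```sql
CREATE TABLE orders (
    id       SERIAL PRIMARY KEY,
    cust_id  INTEGER NOT NULL REFERENCES customers(id),  -- FK constraint
    quantity INTEGER NOT NULL CHECK (quantity > 0),      -- CHECK constraint
    placed   DATE    NOT NULL DEFAULT CURRENT_DATE
);
```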

Regards,
Erwin Moller
 
Jerry Sievers

Tony Marston said:
You do not need stored procedures or database triggers to write successful
applications. I once had to maintain a system that was built around
procedures and triggers, and it was a nightmare. The problem was that one
trigger/procedure updated several tables, which fired more triggers which
contained more updates which fired more triggers ..... It was impossible to
keep track of what was being fired where.


Right! Gratuitous use of triggers, SPs, and the like results in
indirection that can make a system impossible to understand. Use with
care.

But is this a vote in favor of a less advanced system that prevents
the temptation? I don't think so.

YMMV
 
Useko Netsumi

Maybe we can get away without stored procedures. Why would one do
manipulation of the data inside a database engine, right? The function of a
database is to store and retrieve information; that's it. Beyond that you are
really stretching it, and this is where a scripting language such as Ruby comes in.

But as for triggers: to write a commercial-grade transaction processing
application, you definitely need triggers. Triggers help
programmers/designers ensure that certain consistency checks happen
before/after certain operations. How would you do that outside the database
engine without kludging it?

Just my 2 cents back from my college days.


For a simple and less-dynamic-update application, you might not need triggers.
 
Tony Marston

Useko Netsumi said:
Maybe we can get away without stored procedures. Why would one do
manipulation of the data inside a database engine, right? The function of a
database is to store and retrieve information; that's it. Beyond that you are
really stretching it, and this is where a scripting language such as Ruby comes in.

But as for triggers: to write a commercial-grade transaction processing
application, you definitely need triggers. Triggers help
programmers/designers ensure that certain consistency checks happen
before/after certain operations. How would you do that outside the database
engine without kludging it?

I personally create a separate class for each database table and define the
constraints within that class. I then have standard code which is used to
deal with these constraints at the appropriate time (see
http://www.tonymarston.co.uk/php-mysql/databaseobjects2.html for some
examples). In this way I achieve the following objectives:
(1) The constraint details and business logic for a database table are
contained in the same class, therefore everything is in one place instead of
being scattered about hither and yon.
(2) I am in complete control when it comes to implementing the constraints,
so I have the power to change them at runtime should the need arise.
(3) If I have a problem I can always step through with my debugger.
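Tony's linked examples are PHP, but the same shape can be sketched in Ruby (the class and its validation rules are invented for illustration):

```ruby
# One class per table: the column constraints live beside the table's
# logic, so everything is in one place and debuggable in plain Ruby.
class PersonTable
  RULES = {
    :name => lambda { |v| v.is_a?(String) && !v.empty? },
    :age  => lambda { |v| v.is_a?(Integer) && v >= 0 }
  }

  # Return the (sorted) list of columns whose values break a rule.
  def validate(row)
    RULES.reject { |col, rule| rule.call(row[col]) }.keys.sort_by { |k| k.to_s }
  end

  # Enforce the constraints in application code, then (in a real
  # version) build and execute the INSERT through the database layer.
  def insert(row)
    bad = validate(row)
    raise "invalid columns: #{bad.inspect}" unless bad.empty?
    true
  end
end

people = PersonTable.new
people.insert(:name => "Ada", :age => 36)   # passes all rules
people.validate(:name => "", :age => -1)    # => [:age, :name]
```

Because the rules are ordinary Ruby, they can be changed at runtime and stepped through in a debugger, which is the point being made above.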
 
