Looking for minimal SQL

  • Thread starter Marco Aschwanden
  • Start date
M

Marco Aschwanden

Hi

I would like to develop an app that is (more or less) database independet.
Python DB API helps when masking "parameters" of sql statements. The db
driver cares for the correct conversion of a date, text, etc. This already
is a big step into the right direction.

The next step would be to use the least common denominator of all sql
dialects and do without all the sql goodies that the dialects offer... and
for this part I am wondering if anyone has a link / hint / book that
assembled this minimum sql. I am looking for something that says for
example:

Table/DB Design: always use lowercase letters, ...

Allowed sql functions: min, max, count

WHERE / HAVING hints:
- Don't use subselects

The focus is actually limited to SELECT / INSERT / UPDATE / DELETE (I
don't think that GRANT / CREATE XXX / ... statements are very portable).

I am aware, that a solution created this way is not as fast and as elegant
as an app that uses all features of a dialect, but I am really looking for
something database independent...

Thanks for any hints,
Marco
 
?

=?iso-8859-15?Q?Pierre-Fr=E9d=E9ric_Caillaud?=

I am aware, that a solution created this way is not as fast and as
elegant as an app that uses all features of a dialect, but I am really
looking for something database independent...

What kind of application are you developing ?
 
M

Marco Aschwanden

What kind of application are you developing ?

I am developing nothing right now. I still am evaluating... eventually a
simple CRM (Customer Relationship Management). I might start doing
something next year, but my brain started to create a framework.

There are several reasons why I would like to stay "db" independent: For
one I would like to run the "same" application over some rpc-server (which
would have a powerful db) or as a standalone application with a "simple"
db behind (which will synchronized from time to time *plan*plan*). I have
already implemented a "simple" framework that allows me to reuse the same
code be it as a multi-tier app be it as a standalone app. A database
switch in the future is possible... hence... etc. etc.

-> Python offers openess when it comes to os platforms
-> db api offers a certain openness when it comes to dealing and
parameterizing sql-stmts ... I want to stick to a minimal sql-language set
to achieve real db openness

I am aware that programming "sql-neutral" is not the most efficient, the
easiest way to go. For example: If Sub-Selects are not portable than I
would split up the queries and glue them together with Python. If for
example one looks for the oldest customer:

With sub-selects you may write:

SELECT * FROM customer WHERE age = (SELECT max(age) FROM customer)

Without sub-selects I would have to write two seperate statements:

SELECT max(age) FROM customer --> take the "age" out of this query
"SELECT * FROM customer WHERE age = ?", (age)

This might not be the fastest solution but I would like to try!

Hope this answers your question,

Marco
 
O

Oliver Fromme

Marco Aschwanden said:
> The next step would be to use the least common denominator of all sql
> dialects and do without all the sql goodies that the dialects offer... and
> for this part I am wondering if anyone has a link / hint / book that
> assembled this minimum sql.

Then you will have a _very_ limited subset of SQL, especially
when you take mysql into account, which is not very standard-
ANSI-SQL compatible, as far as I know. For example, the
operator "||" is string-concatenation in standard SQL, while
it means logical OR in mysql. Double-quotes are used for
quoted identifiers in standard SQL, while they're used to
enclose strings in mysql. etc. etc.
> example:
>
> Table/DB Design: always use lowercase letters, ...

You will also have to be careful to avoid the reserved words
of _all_ SQL dialects. Not easy. Also, don't make your
identifiers longer than 20 characters.
> The focus is actually limited to SELECT / INSERT / UPDATE / DELETE (I
> don't think that GRANT / CREATE XXX / ... statements are very portable).

Both GRANT and CREATE are specified in standard SQL, and I
believe all the major databases conform to it (I don't know
if mysql does, though).

Maybe the PostgreSQL documentation is helpful for you:
For each SQL command, it contains a description of its
standard SQL conformance. Here's the URL:

http://www.postgresql.org/docs/7.4/static/sql-commands.html

If you stick to that, your SQL code will most probably also
work with Oracle, Sybase and other "serious" databases.

Best regards
Oliver

PS: When I say "standard SQL", I mean ANSI SQL92 and SQL99.
 
L

Leopold Schwinger

Oliver said:
Then you will have a _very_ limited subset of SQL, especially
when you take mysql into account, which is not very standard-
ANSI-SQL compatible, as far as I know. For example, the
operator "||" is string-concatenation in standard SQL, while
it means logical OR in mysql. Double-quotes are used for
quoted identifiers in standard SQL, while they're used to
enclose strings in mysql. etc. etc.

Maybe you don't have to be restricted like that. In my former company we
used the product SourcePro DB from RougeWave-Software. It's a kind
of abstraction layer. You never write an SQL-statement its more
a composition of class-objects. Of course, internal the SourcePro DB
makes an SQL-statement, but for the programmer its an C++ API.

So you don't care about how the SQL-Statement looks exactly like (if
an OR is written by 'OR' or '||'
==> Problem: Each database you want to access needs an own interface inside.

The real restrictions come from the features of an DBMS
a) does it support locks?
b) what about transactions
etc. etc.

greetings Leo
 
?

=?iso-8859-15?Q?Pierre-Fr=E9d=E9ric_Caillaud?=

Then you will have a _very_ limited subset of SQL, especially
when you take mysql into account, which is not very standard-

Um, also consider the fact you can't do much with Mysql...
 
P

Paul Miller

Marco Aschwanden said:
Hi

I would like to develop an app that is (more or less) database independet.
Python DB API helps when masking "parameters" of sql statements. The db
driver cares for the correct conversion of a date, text, etc. This already
is a big step into the right direction.

The next step would be to use the least common denominator of all sql
dialects and do without all the sql goodies that the dialects offer... and
for this part I am wondering if anyone has a link / hint / book that
assembled this minimum sql. I am looking for something that says for
example:

I think what you are looking for is the ANSI SQL standard, probably
the SQL-92 version. However, I think something like PDO (recommended
in another message on this thread) might serve your needs better.
That is, if I understand what PDO does; it seems to me like it does
for database access what wxPython does for GUIs. If so, then there is
your consistent, multiple-DB compatible syntax, without needing to
cripple performance by dropping to SQL-92 or something.
 
D

Dennis Lee Bieber

easiest way to go. For example: If Sub-Selects are not portable than I

"If"??? I think they are still on the TBD list for MySQL, so at
the moment it is not an "if", it is an "as". <G>

I suppose you could start with the O'Reilly SQL book -- as I
recall, that covered MySQL, SQL Server, Oracle (and/or Sybase -- though
didn't SQL Server start life as a rebadged Sybase?).

Of course, MySQL also doesn't (yet) support triggers, stored
procedures, and only begins to touch foreign key/referential integrity.

--
 
S

Sion Arrowsmith

Dennis Lee Bieber said:
"If"??? I think they are still on the TBD list for MySQL, so at
the moment it is not an "if", it is an "as". <G>

They've been available since at least 4.1.0. Not always working
correctly, mind, but they've been there. In fact, I've hit
problems running SQL written for MySQL on Firebird becuase the
latter doesn't support nested selects in all the places the
former does....
 
S

Sion Arrowsmith

David Fraser said:
What we have found the best approach is to start development with two
databases in mind. In that case, you have to build in a basic layer of
database-independence. Then later, adding other databases tends to be an
easy refactor.

Seconded. I'm currently working on sticking a Firebird backend
onto an application which already supports Oracle and MySQL.
Thanks to the DB API, there's a lot of code which can be shared
in a base class, and for the most part the SQL-dialect-specific
tweaks are easy to stick in methods which override the common
code in a derived class. The biggest difference is in the SQL
needed for creating the tables in the DB in the first place.
I'd hate to be doing this in any language other than Python,
of course....
 
?

=?iso-8859-15?Q?Pierre-Fr=E9d=E9ric_Caillaud?=

"If"??? I think they are still on the TBD list for MySQL, so at
the moment it is not an "if", it is an "as". <G>

Take a look at the column types PostgreSQL offers.
Among other interesting things, like a datetime type which works, it has
arrays.
You can define a column as an array of integers for instance.
Example :

Imagine we create a table to hold the index of a book. We should have
this (in pseudo-sql):

table words:
id integer primary key
word varchar

table refs:
page integer not null
word_id integer not null references words(id)

index on all interesting columns.

To find all pages where a word is, we look in "words" (index lookup on
word) and find the word id ; then we look in "refs" all pages with the
corresponding word_id
To find all words on a page, we look in "refs" (index lookup on page),
then we group by word, and we look in words to list the words.

With an array type, we have :

table words:
id integer primary key
word varchar
page integer[]

Then you can use a GIST index on page and ask it "find all rows where the
page list contains this page". This is a lot faster than the above Joins.
Getting the list of pages for a word is also a lot faster (it's only one
select, maximal locality of reference).
The GIST indexes are so optimized in Postgres that the speed for these
lookups is amazing.

I tested this feature by building a list of 100.000 words which appeared
in anything from 1 to 10 different random pages (between 1 and 1000). When
finding all words on a page, the GIST index had acceptable timings (ie. a
few tens of milliseconds to return between a few hundred and a few
thousands records) ; the pivot table had horrendous timings on the verge
of a second (because the Join was looking in the pivot table for the
condition first, then making index lookups for each row in this large
table, which is the only way to do it... and makes a lot of disk seeks !)



If you want to use a "minimal subset of SQL", you'll have to forgo these
interesting features...

Postgresql has indexable geometric types. On a random population of
500.000 points, it takes 90 ms to get the points in a certain bounding box
(this particular query returned 5.000 points)... without clustering on the
index it took about 200 ms...


So, if you want performance, it is difficult to ignore database-specific
points.

I know MySQL has geometric types too. But only for MyISAM tables !
(argh). So you don't get transactions on them. And MyISAM tables have
terrible write concurrency.
 
W

William Park

Marco Aschwanden said:
Hi

I would like to develop an app that is (more or less) database
independet.

No such thing. Entire database industry is about locking in customer to
their own products and upgrades, and to prevent migration to their
competitors at all cost. Even with "open source" database like MySQL
and PostgreSQL, you'll see sufficient differences in C API which will
cause you to abandon what you're trying to do.

This conclusion is result of my attempt to add database interfaces to
shell:
http://freshmeat.net/projects/bashdiff/
http://freshmeat.net/projects/basp/
 
R

Russell Lear

Marco Aschwanden wrote:
I am wondering if anyone has a link / hint / book that
assembled this minimum sql.

"SQL in a Nutshell" from O'Reilly documents both SQL99 and the deviations of
Oracle, MS SQLServer, MySQL, and PostgreSQL (including lists of vender
specific keywords). Don't know if it is quite what you're looking for, but
it might help.

That said, I'm not sure that trying for DB independence is really what you
need to do. By doing that you not only make your life more difficult from
an implementation PoV (by not taking advantages of useful utilities and
extensions offered by the specific vendor), but you may not be able to
provide the level of performance demanded by customers (depending on the
size of your application).

I recently developed an application for an Oracle database. (prototyped in
Python, but moved to C#/.NET cuz that's what was wanted). The original
version used pretty standard SQL and performed correctly. Only problem was
that, under more extreme (but not-atypical) customer loads, it crawled. I
needed to tweak the SQL to play to Oracle's quirks as well as using some of
their packages.

A better approach might be to provide an abstraction layer with
vendor-specific plugins. Above the abstraction layer you expose a model
that's important to your app. Below, you map to the DB.

Just my opinion.

Russell.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,535
Members
45,008
Latest member
obedient dusk

Latest Threads

Top