MySQL vrs SQLite

M

Michael

I'm considering changing a Python program of mine, that uses MySQL as
it's db backend, to use SQLite. I'm mostly wanting something that can
make my program a complete package without requiring the user install
and configure an external db program. Has anyone experience with these?
They both follow the same DB-API so the code should be pretty easy to
port, right? Any draw backs to making this change?
 
D

Dave Benjamin

I'm considering changing a Python program of mine, that uses MySQL as
it's db backend, to use SQLite. I'm mostly wanting something that can
make my program a complete package without requiring the user install
and configure an external db program. Has anyone experience with these?
They both follow the same DB-API so the code should be pretty easy to
port, right? Any draw backs to making this change?

I haven't used SQLite, through Python or otherwise, but I think it's
important to note nonetheless that it is supposedly a "typeless" database,
as opposed to MySQL, which (though perhaps tolerant to a fault) conforms
your data to the datatypes of your columns. The following paragraph from the
SQLite web site makes me rather uncomfortable:

"""SQLite is "typeless". This means that you can store any kind of data you
want in any column of any table, regardless of the declared datatype of that
column. (See the one exception to this rule in section 2.0 below.) This
behavior is a feature, not a bug. A database is suppose to store and
retrieve data and it should not matter to the database what format that data
is in. The strong typing system found in most other SQL engines and codified
in the SQL language spec is a misfeature - it is an example of the
implementation showing through into the interface. SQLite seeks to overcome
this misfeature by allowing you to store any kind of data into any kind of
column and by allowing flexibility in the specification of datatypes."""

-- http://www.hwaci.com/sw/sqlite/datatypes.html

If you are doing any type of database work where you need to deal with
different types of data (ie. not just strings) and you are concerned with
the integrity of your data, I recommend that you take a good hard look at
this "feature, not a bug" and determine if SQLite meets your project's
requirements. Like I said, I've never used it, so this is all conjecture.
 
A

Alan Gauld

No direct experience but doesn't SQLite store its data in memory?
If you have a large database that could lead to problems on PCs
with a small memory installation or running lots of apps?

On the other hand if the data volume is small SQLite could offer
significant performance gains over a disk oriented architecture
like MySQL etc.

Just a thought.

Alan G.
Author of the Learn to Program website
http://www.freenetpages.co.uk/hp/alan.gauld
 
M

Michel Claveau/Hamster

doesn't SQLite store its data in memory?
No ! It's the database-engine who is in embeddable. The data are stored in
one file per database.
 
R

richard

Michael said:
I'm considering changing a Python program of mine, that uses MySQL as
it's db backend, to use SQLite. I'm mostly wanting something that can
make my program a complete package without requiring the user install
and configure an external db program. Has anyone experience with these?

Yes, my project, the Roundup Issue Tracker (http://roundup.sf.net) has
interfaces to both MySQL and SQLite (and postgresql, metakit, anydbm, ...)

In short, sqlite is a *very* capable little database, as long as it only
ever has one user. More than one user, and it'll block access so only one
user may access it at a time.

It scales very well - better than MySQL, and about equal with postgresql as
far as I can tell. See:

http://www.mechanicalcat.net/richard/log/Python/Roundup_benchmark_time_again

for some comparative benchmarks.

They both follow the same DB-API so the code should be pretty easy to
port, right?

In theory, yes. In practise, the DB-API layer simple can't hide the various
little inconsistencies between databases (usually datatypes and incomplete
SQL implementations). The Roundup code includes large "common" interface
which covers 95% of its generic RDBMS interface, and then individual
modules for each of the specific RDBMSes. Postgresql is the thinnest layer,
then SQLite and then MySQL deviates the most from the common
implementation.

Any draw backs to making this change?

SQLite has some limitations on the SQL it implements. Then again, so does
MySQL. The SQLite website has a concise page indicating the unimplement SQL
features. ALTER TABLE is about the most annoying omission. On the other
hand, MySQL doesn't support sub-selects, but sqlite does.

SQLite treats all data as strings, but note that it does some internal
"typecasting" such that a column of numbers will be sorted numerically. You
will need to implement your own data conversion though. It's usually as
trivial as a simple mapping containing conversion functions like (from the
Roundup source):

sql_to_hyperdb_value = {
hyperdb.String : str,
hyperdb.Date : lambda x: date.Date(str(x)),
hyperdb.Link : str,
hyperdb.Interval : date.Interval,
hyperdb.Password : lambda x: password.Password(encrypted=x),
hyperdb.Boolean : int,
hyperdb.Number : rdbms_common._num_cvt,
}


Richard
 
?

=?ISO-8859-1?Q?Gerhard_H=E4ring?=

Michel said:
No ! It's the database-engine who is in embeddable. The data are stored in
one file per database.

You *can* also store the data in memory by specifiying :memory:, i. e.:

cx = sqlite.connect(":memory:")

-- Gerhard
 
R

Richie Hindle

[Dave, quoting the SQLite docs]
"""SQLite is "typeless". This means that you can store any kind of data you
want in any column of any table, regardless of the declared datatype of that
column. (See the one exception to this rule in section 2.0 below.) This
behavior is a feature, not a bug. A database is suppose to store and
retrieve data and it should not matter to the database what format that data
is in. The strong typing system found in most other SQL engines and codified
in the SQL language spec is a misfeature - it is an example of the
implementation showing through into the interface. SQLite seeks to overcome
this misfeature by allowing you to store any kind of data into any kind of
column and by allowing flexibility in the specification of datatypes."""

How about this?

"""Python containers (eg. lists, sets, tuples) are "typeless". This means
that you can store any kind of data you want in any container, without a
declared datatype for that container. This behavior is a feature, not a bug.
A container is suppose to store and retrieve data and it should not matter
to the container what format that data is in. The static typing systems
found in many other programming languages and codified in many language
specs is a misfeature - it is an example of the implementation showing
through into the interface. Python seeks to overcome this misfeature by
allowing you to store any kind of data into any container and by allowing
flexibility in the specification of datatypes."""

See every third posting to comp.lang.python over the past ten years for why
this kind of thing may or may not be a good idea. Most of the arguments
that apply to dynamic typing in Python also apply to typelessness in
databases (the main one being: it's up to your unit tests, not anything
else, to make sure your program works). If Python's dynamic typing fits
your brain, it seems likely that a typeless database should do so too.
 
K

Kevin Dahlhausen

Richard,

Thank you for posting information on SQL lite, especially the relative
sizes of your different store layers. This is really good
information.
 
D

Dave Benjamin

[Dave, quoting the SQLite docs]
"""SQLite is "typeless". This means that you can store any kind of data you
want in any column of any table, regardless of the declared datatype of that
column. (See the one exception to this rule in section 2.0 below.) This
behavior is a feature, not a bug. A database is suppose to store and
retrieve data and it should not matter to the database what format that data
is in. The strong typing system found in most other SQL engines and codified
in the SQL language spec is a misfeature - it is an example of the
implementation showing through into the interface. SQLite seeks to overcome
this misfeature by allowing you to store any kind of data into any kind of
column and by allowing flexibility in the specification of datatypes."""

How about this?

"""Python containers (eg. lists, sets, tuples) are "typeless". This means
that you can store any kind of data you want in any container, without a
declared datatype for that container. This behavior is a feature, not a bug.
A container is suppose to store and retrieve data and it should not matter
to the container what format that data is in. The static typing systems
found in many other programming languages and codified in many language
specs is a misfeature - it is an example of the implementation showing
through into the interface. Python seeks to overcome this misfeature by
allowing you to store any kind of data into any container and by allowing
flexibility in the specification of datatypes."""

First of all, a minor nit: the word you want here is "supposed", not
"suppose". =)

Database relations are not generic containers. For a tuple to exist in a
relation, it must satisfy a set of predicates that have been defined for
that relation. These predicates include primary key, foreign key, and unique
constraints. They also include (IMHO) the domains of the values for any
particular column. A "typeless" database would limit the kinds of assertions
I could make about a relation, and to me this is a misfeature.
See every third posting to comp.lang.python over the past ten years for why
this kind of thing may or may not be a good idea. Most of the arguments
that apply to dynamic typing in Python also apply to typelessness in
databases (the main one being: it's up to your unit tests, not anything
else, to make sure your program works). If Python's dynamic typing fits
your brain, it seems likely that a typeless database should do so too.

I reject the claim that typing issues in programming languages are the same
for databases.

There are definite advantages to programming in dynamically typed languages
like Python; most importantly, they are lean on syntax and don't get in your
way like the popular statically typed languages. The main thing that makes
programming in statically typed languages a pain is the amount of explicit
type annotations and casts typically required to pull it off. This is more
of a flaw in the respective type systems of these languages than with static
typing in general. Languages in the ML family have shown that a type system
can be out of your way (via type inference) yet still allow for compile-time
checks. The "duck typing" concept that Python and Ruby programmers get so
excited about can be accomplished using OCaml's objects because OCaml's type
inference system is powerful enough to support this.

Don't get me wrong; I love Python, and I love dynamically typed languages.
There are still lots of things you can do in a dynamically typed language
that would be difficult or impossible to accomplish in a statically typed
language (without heavy use of reflection and dynamic casts, anyway).

However, you simply cannot extend this issue to cover databases as well. For
instance, SQL does type inference also. The only time you need to specify
the types of data is when you declare your schema. The types of columns
returned by queries are inferred by the database automatically. No explicit
type annotations are necessary here. So, SQL does not suffer from my main
pet peeve with statically typed languages, even inside of a "typeful"
database.

If the database has the expressiveness to make assertions about the data it
contains, these assertions become assumptions to the programs that use the
database. If you like simple, terse Python programs, you'll enjoy using a
database that guarantees that your assertions were met, because these
assertions do not need to be checked by your program.

Databases may be shared by multiple programs written in multiple programming
languages. How are you going to write a unit test to ensure that someone
doesn't write a program in the future that sticks "hello" in a customer's
age field? Put simply, if the database does not ensure that this will not
happen, you'd better guard against this by using an int() cast and catching
the ValueError exception, in which case you're basically back to type
annotation.

Your claims about types in database being a feature remind me of MySQL's
earlier claims about why they didn't support foreign key constraints (if you
write your code properly, you don't need them; they just get in the way).
Now, you'll notice that they have thoroughly removed such claims from their
documentation.

I think it's fine if you want to have a typeless database. I was just
advising the OP to take a look at his requirements and make sure that this
is acceptable. Surely you don't expect him to assume that since he already
uses Python, a dynamically typed (read: not *typeless*) language, he does
not need to consider the repercussions that switching from MySQL to SQLite
will have on his program, because of a few weak analogies and claims of
brain-fitting?
 
D

Dave Benjamin

Sorry, a few minor corrections...

First of all, a minor nit: the word you want here is "supposed", not
"suppose". =)

This typo occurs in the documentation on the web site, so my nit was most
likely misdirected.
Your claims about types in database being a feature remind me of MySQL's

s/types/typelessness/
 
M

Michael

In short, sqlite is a *very* capable little database, as long as it only
ever has one user. More than one user, and it'll block access so only one
user may access it at a time.
It does block though - it doesn't loss data if more than one program
tries to access it at once? In this case I have multiple programs that
need to write to, and read from, the db in a reliable way. Most of the
actions should be small so I don't think blocking should pose a big
question but data loss would be very bad.
SQLite treats all data as strings, but note that it does some internal
"typecasting" such that a column of numbers will be sorted numerically. You
will need to implement your own data conversion though. It's usually as
trivial as a simple mapping containing conversion functions like (from the
Roundup source):
I always type cast, and otherwise clean, all data coming in and out of
my db functions anyway (for security reasons) so that isn't a problem.
Since SQLite stores data as strings that means data such as numbers will
take up more space than in a db such as MySQL?
 
R

richard

Michael said:
It does block though - it doesn't loss data if more than one program
tries to access it at once?
Yes.


Since SQLite stores data as strings that means data such as numbers will
take up more space than in a db such as MySQL?

Yep.


Richard
 
D

David M. Cooke

At some point said:
It does block though - it doesn't loss data if more than one program
tries to access it at once? In this case I have multiple programs that
need to write to, and read from, the db in a reliable way. Most of the
actions should be small so I don't think blocking should pose a big
question but data loss would be very bad.

From the sqlite library FAQ on sqlite.org:

(7) Can multiple applications or multiple instances of the same
application access a single database file at the same time?

Multiple processes can have the same database open at the same
time. Multiple processes can be doing a SELECT at the same time.
But only one process can be making changes to the database at
once.

So, your process will block only if something else is changing the
database. It also journals the changes to the database, so interrupted
transactions are rolled back.
I always type cast, and otherwise clean, all data coming in and out of
my db functions anyway (for security reasons) so that isn't a problem.
Since SQLite stores data as strings that means data such as numbers
will take up more space than in a db such as MySQL?

Depends. If your numbers as strings are less than four bytes, they
should take less space :)

Also, by default, the Python wrapper (pysqlite.sf.net) converts rows
declared as integer and float to the appropiate Python types. And
there's a way to pass type info before a statement so it does the
conversion for you.
 
R

Richie Hindle

[Dave]
[long and reasonable argument snipped]
Surely you don't expect him to assume that since he already
uses Python, a dynamically typed (read: not *typeless*) language, he does
not need to consider the repercussions that switching from MySQL to SQLite
will have on his program, because of a few weak analogies and claims of
brain-fitting?

Not at all. I wasn't advocating blindly accepting typelessness as something
that doesn't need to be considered. I was saying that many of the same
practices that go into writing a decent Python application also apply to
using a typeless database. To put it another way: although it might at
first appear that there's a whole slew of new dangers involved, in fact
they're mostly the same dangers that Python programmers are already familiar
with. Having read your arguments, I still think that's true.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,578
Members
45,052
Latest member
LucyCarper

Latest Threads

Top