database persistence with mysql, sqlite

coldpizza

Hi,

I want to run a database query and then display the first 10 records
on a web page. Then I want to be able to click the 'Next' link on the
page to show the next 10 records, and so on.

My question is how to implement paging, i.e. the 'Next/Prev' NN
records without reestablishing a database connection every time I
click Next/Prev? Is it at all possible with cgi/mod_python?

For example, in a NON-web environment, with sqlite3 and most other
modules, I can establish a database connection once, get a cursor
object on which I run a single 'SELECT * FROM TABLE' statement and
then use cursor.fetchmany(NN) as many times as there are still results
left from the initial query.

How do I do the same for the web? I am not using any high-level
framework. I am looking for a solution at the level of cgi or
mod_python (Python Server Pages under Apache). To call
cursor.fetchmany(NN) over and over I need to pass a handle to the
database connection but how do I keep a reference to the cursor object
across pages? I use mysql and sqlite3 as databases, and I am looking
for an approach that would work with both database types (one at a
time). So far I have successfully used the following modules for
database access: sqlite3, MySQLdb, and pyodbc.

So far, with mysql I use 'SELECT * FROM TABLE LIMIT L1, L2' where L1
and L2 define the range for the 'Next' and 'Previous' commands. I
have to run the query every time I click a 'Next/Prev' link. But I am
not sure that this is the best and most efficient way. I suppose using
cursor.fetchmany(NN) would probably be faster and nicer but how do I
pass an object reference across pages? Is it possible without any
higher-level libraries?

What would be the proper way to do it on a non-enterprise scale?

Would SqlAlchemy or SqlObject make things easier with regard to
database persistence?
 
Bryan Olson

coldpizza said:
I want to run a database query and then display the first 10 records
on a web page. Then I want to be able to click the 'Next' link on the
page to show the next 10 records, and so on.
My question is how to implement paging, i.e. the 'Next/Prev' NN
records without reestablishing a database connection every time I
click Next/Prev? Is it at all possible with cgi/mod_python?

Caching database connections works in mod_python; not so
much with cgi.

For example, in a NON-web environment, with sqlite3 and most other
modules, I can establish a database connection once, get a cursor
object on which I run a single 'SELECT * FROM TABLE' statement and
then use cursor.fetchmany(NN) as many times as there are still results
left from the initial query.

How do I do the same for the web?

Short answer: you don't. It would mean saving cursors with
partial query results, and arranging for incoming requests to
go to the right process. Web-apps avoid that kind of thing.
Many web toolkits offer session objects, but do not support
saving active objects such as cursors. That said, I've
never tried what you are proposing with the tools you name.

Depending on how your database handles transactions, an
open cursor can lock-out writers, and even other readers.
How long do you keep it around if the user doesn't return?

What should happen if the user re-loads a page from a few
sets-of-10 back?

I am not using any high-level
framework. I am looking for a solution at the level of cgi or
mod_python (Python Server Pages under Apache). To call
cursor.fetchmany(NN) over and over I need to pass a handle to the
database connection but how do I keep a reference to the cursor object
across pages? I use mysql and sqlite3 as databases, and I am looking
for an approach that would work with both database types (one at a
time). So far I have successfully used the following modules for
database access: sqlite3, MySQLdb, and pyodbc.

So far, with mysql I use 'SELECT * FROM TABLE LIMIT L1, L2' where L1
and L2 define the range for the 'Next' and 'Previous' commands. I
have to run the query every time I click a 'Next/Prev' link.

You might want to run that query by a MySQL expert.

The basic method is nice in that it needs no server-side
state between requests. (It's a little squirrely in that
it can show a set of records that was never the contents
of the table.)

But I am
not sure that this is the best and most efficient way. I suppose using
cursor.fetchmany(NN) would probably be faster and nicer but how do I
pass an object reference across pages? Is it possible without any
higher-level libraries?

Do you know that you have a performance problem? If so, do
you know that it is due to too many cursor.execute() calls?
Keeping partially-executed queries is way down on the list
of optimizations to try.

What would be the proper way to do it on a non-enterprise scale?

With mod_python, you can cache connections, which may help.
If you use "ORDER BY" with "LIMIT", the right index can make
a big difference.
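
As a rough illustration of both points, here is a minimal sketch. It is not code from this thread. It assumes a long-lived process (as is typical with mod_python, where an imported module stays loaded between requests) and uses sqlite3 for convenience. The names get_connection, fetch_page, items and items_by_name are all made up for the example.

```python
import sqlite3

# Hypothetical module-level cache: in a long-lived process the module stays
# loaded between requests, so the connection is opened once per process.
_connection = None


def get_connection(path=":memory:"):
    """Return the cached connection, opening it on first use."""
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(path)
    return _connection


# Illustrative table and data (assumed, not from the thread)
db = get_connection()
db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO items (name) VALUES (?)",
               [("item%03d" % i,) for i in range(100)])

# Index on the ORDER BY column, so a page can be read in index order
db.execute("CREATE INDEX items_by_name ON items (name)")


def fetch_page(offset, count=10):
    """Return `count` rows starting at `offset`, ordered by name."""
    cur = get_connection().execute(
        "SELECT id, name FROM items ORDER BY name LIMIT ? OFFSET ?",
        (count, offset))
    return cur.fetchall()


for row in fetch_page(10):
    print(row)
```

By default a sqlite3 connection may only be used from the thread that created it, so under a threaded server you would probably want one cached connection per thread rather than one per process.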

Have you considered implementing your 'Next/Prev' commands
on the browser side with Javascript? The server could then
get all the records in one query, and the user would see
point-in-time correct results.

Another possibility is to get all the query results and
save them in a session object, then deal them out a few at
a time. But as a rule of thumb, the less state on the server
the better.
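
A minimal sketch of that session idea, assuming an in-process dictionary stands in for whatever session object your toolkit provides. The names _sessions, start_session and get_page are hypothetical, and a real store would also need expiry and a size limit.

```python
import sqlite3
import uuid

# Hypothetical in-process session store: session id -> list of saved rows
_sessions = {}


def start_session(db, query):
    """Run the query once, keep all rows server-side, return a session id."""
    session_id = uuid.uuid4().hex
    _sessions[session_id] = db.execute(query).fetchall()
    return session_id


def get_page(session_id, page, page_size=10):
    """Deal out one zero-based page of the rows saved for this session."""
    rows = _sessions[session_id]
    start = page * page_size
    return rows[start:start + page_size]


# Illustrative data (assumed)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE numbers (num INTEGER PRIMARY KEY, hex TEXT)")
db.executemany("INSERT INTO numbers VALUES (?, ?)",
               [(i, hex(i)) for i in range(25)])

sid = start_session(db, "SELECT num, hex FROM numbers ORDER BY num")
first = get_page(sid, 0)   # first 10 rows
last = get_page(sid, 2)    # remaining 5 rows
print(first[0], last[-1])
```

The pages come from a snapshot, so later changes to the table do not affect what the user sees, at the cost of holding every row in memory per session.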

Would SqlAlchemy or SqlObject make things easier with regard to
database persistence?

Quite likely, but probably not in the way you propose.
The web frameworks that use those toolkits try to do
things in robust and portable ways.
 
Lawrence D'Oliveiro

coldpizza said:
So far, with mysql I use 'SELECT * FROM TABLE LIMIT L1, L2' where L1
and L2 define the range for the 'Next' and 'Previous' commands. I
have to run the query every time I click a 'Next/Prev' link. But I am
not sure that this is the best and most efficient way.

Try it first, then see what happens. Remember, premature optimization is the
root of all (programming) evil.
 
Gerardo Herzig

coldpizza said:
Hi,

I want to run a database query and then display the first 10 records
on a web page. Then I want to be able to click the 'Next' link on the
page to show the next 10 records, and so on.

My question is how to implement paging, i.e. the 'Next/Prev' NN
records without reestablishing a database connection every time I
click Next/Prev? Is it at all possible with cgi/mod_python?

For example, in a NON-web environment, with sqlite3 and most other
modules, I can establish a database connection once, get a cursor
object on which I run a single 'SELECT * FROM TABLE' statement and
then use cursor.fetchmany(NN) as many times as there are still results
left from the initial query.

How do I do the same for the web? I am not using any high-level
framework. I am looking for a solution at the level of cgi or
mod_python (Python Server Pages under Apache). To call
cursor.fetchmany(NN) over and over I need to pass a handle to the
database connection but how do I keep a reference to the cursor object
across pages? I use mysql and sqlite3 as databases, and I am looking
for an approach that would work with both database types (one at a
time). So far I have successfully used the following modules for
database access: sqlite3, MySQLdb, and pyodbc.

Apache/CGI just doesn't work this way. When Apache receives a new request
(a CGI script being called), it starts a new process, executes it, and gives
the client some result. AND THEN IT KILLS THE PROCESS. Although I have never
used it, I think what you need is FastCGI (fcgi), which keeps a persistent
process connected to the web server.

Cheers.
Gerardo
 
coldpizza

Lawrence D'Oliveiro said:
Try it first, then see what happens. Remember, premature optimization is the
root of all (programming) evil.

It turned out that the method above ('SELECT * FROM TABLE LIMIT L1,
L2') works ok both with mysql and sqlite3, therefore I have decided to
stick with it until I find something better. With Sqlite3 you are
supposed to use LIMIT 10 OFFSET NN, but it also apparently supports
the mysql syntax (LIMIT NN, 10) for compatibility reasons.
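
For reference, a minimal sketch of that approach using the LIMIT ... OFFSET ... form, which both SQLite and MySQL accept. The table, the PAGE_SIZE constant and the fetch_page helper are assumptions made up for the example. With MySQLdb the placeholder would be %s rather than ?.

```python
import sqlite3

PAGE_SIZE = 10


def fetch_page(db, page):
    """Return the rows for a zero-based page number."""
    offset = page * PAGE_SIZE
    cur = db.execute(
        "SELECT num, hex FROM numbers ORDER BY num LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset))
    return cur.fetchall()


# Illustrative data (assumed)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE numbers (num INTEGER PRIMARY KEY, hex TEXT)")
db.executemany("INSERT INTO numbers VALUES (?, ?)",
               [(i, hex(i)) for i in range(25)])

# A 'Next' link only needs to carry page + 1; 'Prev' carries page - 1
for row in fetch_page(db, 1):
    print(row)
```

The query is re-run on every request, but nothing has to be kept on the server between clicks.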
 
M.-A. Lemburg

coldpizza said:
Hi,

I want to run a database query and then display the first 10 records
on a web page. Then I want to be able to click the 'Next' link on the
page to show the next 10 records, and so on.

My question is how to implement paging, i.e. the 'Next/Prev' NN
records without reestablishing a database connection every time I
click Next/Prev? Is it at all possible with cgi/mod_python?

For example, in a NON-web environment, with sqlite3 and most other
modules, I can establish a database connection once, get a cursor
object on which I run a single 'SELECT * FROM TABLE' statement and
then use cursor.fetchmany(NN) as many times as there are still results
left from the initial query.

How do I do the same for the web? I am not using any high-level
framework. I am looking for a solution at the level of cgi or
mod_python (Python Server Pages under Apache). To call
cursor.fetchmany(NN) over and over I need to pass a handle to the
database connection but how do I keep a reference to the cursor object
across pages? I use mysql and sqlite3 as databases, and I am looking
for an approach that would work with both database types (one at a
time). So far I have successfully used the following modules for
database access: sqlite3, MySQLdb, and pyodbc.

So far, with mysql I use 'SELECT * FROM TABLE LIMIT L1, L2' where L1
and L2 define the range for the 'Next' and 'Previous' commands. I
have to run the query every time I click a 'Next/Prev' link. But I am
not sure that this is the best and most efficient way. I suppose using
cursor.fetchmany(NN) would probably be faster and nicer but how do I
pass an object reference across pages? Is it possible without any
higher-level libraries?

What would be the proper way to do it on a non-enterprise scale?

Depends on what "enterprise" scale means to you :)

The easiest way to get excellent performance for such queries is
to use a long-running process (e.g. behind mod_scgi) and have the
browser send a session cookie that you use to identify the request.
You can then open the connection and keep it open while the user
browses the site.

If you want to save yourself from most of the details,
just use Zope or Plone + e.g. our mxODBC Zope DA for the
database connection (it works with all the databases
you mention on Windows, Linux and Mac OS X).

Even if you don't want to code things using Zope/Plone,
you should still consider it for taking care of all the
middleware logic and then write your application as a
separate package which you hook into Zope/Plone using
"external methods" or "Python scripts" (in their Zope
sense).

Would SqlAlchemy or SqlObject make things easier with regard to
database persistence?

Not really: they don't provide the session mechanisms you
would need.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Sep 25 2007)

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
 
Bryan Olson

coldpizza said:
It turned out that the method above ('SELECT * FROM TABLE LIMIT L1,
L2') works ok both with mysql and sqlite3, therefore I have decided to
stick with it until I find something better. With Sqlite3 you are
supposed to use LIMIT 10 OFFSET NN, but it also apparently supports
the mysql syntax (LIMIT NN, 10) for compatibility reasons.

A more reliable form is along the lines:

SELECT keyfield, stuff
FROM table
WHERE keyfield > ?
ORDER BY keyfield
LIMIT 10

With the right index, it's efficient.
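
Called from Python, that query might look like the sketch below. It is only an illustration: the records table, the sample keys and the next_page helper are made up, and sqlite3 is used so it runs anywhere.

```python
import sqlite3


def next_page(db, after_key, page_size=10):
    """Return up to page_size rows whose key sorts after after_key."""
    cur = db.execute(
        "SELECT keyfield, stuff FROM records "
        "WHERE keyfield > ? ORDER BY keyfield LIMIT ?",
        (after_key, page_size))
    return cur.fetchall()


# Illustrative data (assumed); the PRIMARY KEY gives us the index we need
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (keyfield TEXT PRIMARY KEY, stuff TEXT)")
db.executemany("INSERT INTO records VALUES (?, ?)",
               [("key%02d" % i, "row %d" % i) for i in range(25)])

seen = []
page = next_page(db, "")   # the empty string sorts before every key here
while page:
    print([key for key, _ in page])
    seen.extend(key for key, _ in page)
    # The last key on this page is the starting point for the next one
    page = next_page(db, page[-1][0])
```

Each call needs only the last key of the previous page, so the database can seek straight to that position in the index instead of counting past skipped rows.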
 
Lawrence D'Oliveiro

Bryan Olson said:
A more reliable form is along the lines:

SELECT keyfield, stuff
FROM table
WHERE keyfield > ?
ORDER BY keyfield
LIMIT 10

With the right index, it's efficient.

But that involves keeping track of the right starting keyfield value for the
next batch of records, which is complicated and nontrivial. Simpler to let
the DBMS do the work for you.
 
Bryan Olson

Lawrence said:
But that involves keeping track of the right starting keyfield value for the
next batch of records, which is complicated and nontrivial.

I think you missed the idea here. Recall that we return a
web page showing 10 records and a 'Next' link. We write the
link so that the browser will send back the parameter we
need. If the largest keyfield value on the page is
"Two-Sheds" the link might read:

<A HREF="http://rfh.uk/tablnext.cgi?start=Two-Sheds">Next</A>

The solution is stateless. There's no "keeping track" on the
server side. When we respond to a request, we neither look up
a previous request nor store anything for a future response.
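
To illustrate, here is one way the link could be generated and read back in Python 3, using only the standard library. The helper names next_link and start_from_url are hypothetical. The URL is the one from the example above.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit


def next_link(base_url, last_key):
    """Build the 'Next' anchor carrying the largest key on the page."""
    query = urlencode({"start": last_key})   # percent-encodes the value
    return '<A HREF="%s">Next</A>' % escape(base_url + "?" + query, quote=True)


def start_from_url(url):
    """Recover the 'start' parameter when the next request arrives."""
    return parse_qs(urlsplit(url).query)["start"][0]


link = next_link("http://rfh.uk/tablnext.cgi", "Two-Sheds")
print(link)
# <A HREF="http://rfh.uk/tablnext.cgi?start=Two-Sheds">Next</A>

print(start_from_url("http://rfh.uk/tablnext.cgi?start=Two-Sheds"))
# Two-Sheds
```

The value that comes back goes straight into the "WHERE keyfield > ?" placeholder, so everything needed for the next page travels in the link itself.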

Simpler to let
the DBMS do the work for you.

Keeping a cursor with pending data across HTTP requests is
a world of hurt.
 
Lawrence D'Oliveiro

Bryan Olson said:
We write the link so that the browser will send back the parameter we
need. If the largest keyfield value on the page is
"Two-Sheds" the link might read:

<A HREF="http://rfh.uk/tablnext.cgi?start=Two-Sheds">Next</A>

That's assuming keyfield is a) unique, and b) a relevant ordering for
displaying to the user.

Keeping a cursor with pending data across HTTP requests is
a world of hurt.

"limit offset, count" avoids all that.
 
Bryan Olson

Lawrence said:
That's assuming keyfield is a) unique,

Exactly; that was the idea behind the name choice. The
method extends to multi-column keys, so it is generally
applicable.

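
A sketch of the multi-column case, assuming a hypothetical people table keyed on (surname, given). The comparison is spelled out with OR/AND so that it does not depend on row-value support in the database. The names and data are invented for the example.

```python
import sqlite3


def next_page(db, last_surname, last_given, page_size=3):
    """Return rows whose (surname, given) key sorts after the given pair."""
    cur = db.execute(
        "SELECT surname, given FROM people "
        "WHERE surname > ? OR (surname = ? AND given > ?) "
        "ORDER BY surname, given LIMIT ?",
        (last_surname, last_surname, last_given, page_size))
    return cur.fetchall()


# Illustrative data (assumed); the composite primary key is the index
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (surname TEXT, given TEXT, "
           "PRIMARY KEY (surname, given))")
db.executemany("INSERT INTO people VALUES (?, ?)", [
    ("Smith", "Alice"), ("Smith", "Bob"), ("Smith", "Carol"),
    ("Smith", "Dave"), ("Jones", "Arthur"), ("Jones", "Zed"),
    ("Young", "Al"),
])

page1 = next_page(db, "", "")        # empty strings sort before all keys here
page2 = next_page(db, *page1[-1])    # continue after the last pair shown
page3 = next_page(db, *page2[-1])
print(page1, page2, page3)
```

The 'Next' link then has to carry both key columns, but the request is still stateless.
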
and b) a relevant ordering for
displaying to the user.

That's a nice-to-have, but not required.

"limit offset, count" avoids all that.

It can be stateless, but then it is unreliable. Here's an
example with Python 2.5:


import sqlite3

db = sqlite3.connect(":memory:")

# Simple table, an integer and the hex for that integer
db.execute(
    "CREATE TABLE numbers (num INTEGER PRIMARY KEY, hex TEXT)")

# Start with 10-29 in the table
for i in range(10, 30):
    db.execute("INSERT INTO numbers VALUES (?, ?)", (i, hex(i)))


# Print 4 records starting at offset
def next4(offset):
    cur = db.execute(
        "SELECT * FROM numbers LIMIT 4 OFFSET ?",
        (offset,))
    for x in cur:
        print(x)  # parenthesized form also runs on Python 3


# Walk the table with LIMIT and OFFSET

next4(0)  # Good, prints 10-13
next4(4)  # Good, prints 14-17

# Another transaction inserts new records
for i in range(0, 4):
    db.execute("INSERT INTO numbers VALUES (?, ?)", (i, hex(i)))

next4(8)  # Bad, prints 14-17 again

# Another transaction deletes records
for i in range(0, 4):
    db.execute("DELETE FROM numbers WHERE num = ?", (i,))

next4(12)  # Bad, we're missing 18-21



The method I advocated is still not the same as doing the
whole thing in a serializable transaction, but it will
return any record that stays in the table the whole time,
and will not return any record multiple times.
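
For comparison, here is a sketch of the same walk using the key-based query, written for Python 3. The table and the insert/delete steps mirror the example above. The reworked next4 (taking the last key seen instead of an offset) and the seen list are additions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE numbers (num INTEGER PRIMARY KEY, hex TEXT)")

# Start with 10-29 in the table, as in the LIMIT/OFFSET example
db.executemany("INSERT INTO numbers VALUES (?, ?)",
               [(i, hex(i)) for i in range(10, 30)])


def next4(after):
    """Return up to 4 records whose key is greater than `after`."""
    cur = db.execute(
        "SELECT num, hex FROM numbers WHERE num > ? ORDER BY num LIMIT 4",
        (after,))
    return cur.fetchall()


seen = []
page = next4(-1)             # 10-13
seen.extend(page)
page = next4(page[-1][0])    # 14-17
seen.extend(page)

# Another transaction inserts new records before our position
db.executemany("INSERT INTO numbers VALUES (?, ?)",
               [(i, hex(i)) for i in range(0, 4)])
page = next4(page[-1][0])    # 18-21, nothing repeated
seen.extend(page)

# Another transaction deletes those records again
db.execute("DELETE FROM numbers WHERE num < 4")
page = next4(page[-1][0])    # 22-25, nothing skipped
seen.extend(page)

print([num for num, _ in seen])
```

Because each page starts from a key value rather than a row count, rows added or removed earlier in the ordering do not shift the window.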
 
