DBI:mysql and huge tables


Thomas Götz

hi there,

I have the following problem:

I'm using a MySQL database with a table that contains around 500,000
records. If I want to select all of those rows to do something with them,
it gets _very_ memory-consuming.
Here's the code I use:

---------
#!/usr/bin/perl -w

use strict;
use warnings;
use DBI;

my $dsn         = "DBI:mysql:SOME_DATABASE:localhost";
my $user        = "tom";
my $pwd         = "*********";

my $dbh = DBI->connect($dsn, $user, $pwd) or die;
my $sql = qq{SELECT `SOME_FIELD` FROM `SOME_TABLE`};
my $sth = $dbh->prepare($sql);
$sth->execute;

while (my ($result) = $sth->fetchrow_array) {
        print "$result\n";
}
$sth->finish;
$dbh->disconnect;
exit;
---------

So, when executing the statement, it seems that all results are loaded into
memory before they are processed, which is not what I want. What could I do
to speed things up and use less memory?

Tom
 

Tore Aursand

my $sql = qq{SELECT `SOME_FIELD` FROM `SOME_TABLE`};
my $sth = $dbh->prepare($sql);
$sth->execute;
while (my ($result) = $sth->fetchrow_array) {
        print "$result\n";
}
$sth->finish;
$dbh->disconnect;
exit;

So, when executing the statement, it seems that all results are loaded
into memory before they are processed, which is not what I want. What
could I do to speed things up and use less memory?

I have never thought of this, even though I've created scripts which deal
with a "fair amount" of data. Are you _absolutely_ sure that this happens?

Unfortunately, I don't have a large (enough) dataset to test this on, but you
could always try 'fetchrow_arrayref()' instead of 'fetchrow_array()'. I
doubt it will consume less memory (hey, who knows?), but according to the
DBI documentation (AFAIR) it's faster.
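
For what it's worth, a fetchrow_arrayref() loop would look roughly like the
sketch below (untested; it reuses the placeholder database, user, and table
names from the original post):

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Placeholder connection details, as in the original post.
my $dbh = DBI->connect("DBI:mysql:SOME_DATABASE:localhost", "tom", "secret")
    or die "connect failed: $DBI::errstr";

my $sth = $dbh->prepare(qq{SELECT `SOME_FIELD` FROM `SOME_TABLE`});
$sth->execute;

# fetchrow_arrayref() returns a reference to an array holding the row.
# DBI may reuse that same array on every call, so copy the values if you
# need to keep them beyond the current iteration.
while (my $row = $sth->fetchrow_arrayref) {
        print "$row->[0]\n";
}

$sth->finish;
$dbh->disconnect;

The main win is that no fresh list is built for every row on the Perl side;
whether it helps with the server-side buffering is a separate question.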
 

ctcgag

Thomas Götz said:
hi there,

I have the following problem:

I'm using a MySQL database with a table that contains around 500,000
records. If I want to select all of those rows to do something with them,
it gets _very_ memory-consuming.
Here's the code I use: ....
my $sql = qq{SELECT `SOME_FIELD` FROM `SOME_TABLE`};
my $sth = $dbh->prepare($sql);

$sth->{mysql_use_result}=1; ## Add this line here
$sth->execute;

while (my ($result) = $sth->fetchrow_array) {
        print "$result\n";
}
$sth->finish;
$dbh->disconnect;
exit;

Yep, that does appear to be the default behavior. I guess it is because
readers block writers in mysql, so it wants to read the data as fast as
possible (i.e. into memory) to get out of the way.
which is not what I want. What could I
do to speed things up and use less memory?

Set the mysql_use_result attribute after prepare but before execute.

Xho
 

Tore Aursand

Set the mysql_use_result attribute after prepare but before execute.

Ah. Excellent. This teaches me to read the DBD::mysql documentation, not
only the DBI documentation.

One suggestion, though: It's possible to set this "flag" on a per-query
basis, so that only a few queries are affected:

my $sth = $dbh->prepare('SELECT lots_of_data
                         FROM huge',
                        {'mysql_use_result' => 1});
$sth->execute();
...
$sth->finish();

One should have this in mind, though:

"...the 'mysql_use_result' attribute: This forces the driver to
use 'mysql_use_result' rather than 'mysql_store_result'. The
former is faster and less memory consuming, but tends to block
other processes."

In other words: As long as you're running this script quite stand-alone,
you could very well go for the 'mysql_use_result' solution.

Thanks to 'ctcgag' for pointing this out!


--
Tore Aursand <[email protected]>
"Omit needless words. Vigorous writing is concise. A sentence should
contain no unnecessary words, a paragraph no unnecessary sentences,
for the same reason that a drawing should have no unnecessary lines
and a machine no unnecessary parts." -- William Strunk Jr.
 

ctcgag

Tore Aursand said:
Ah. Excellent. This teaches me to read the DBD::mysql documentation,
not only the DBI documentation.

I had actually just read about it in DuBois's book, but didn't understand
what it meant until I read that question here. Then it all made sense.
One suggestion, though: It's possible to set this "flag" on a per-query
basis, so that only a few queries are affected;

my $sth = $dbh->prepare('SELECT lots_of_data
                         FROM huge',
                        {'mysql_use_result' => 1});

I always knew that that second argument to prepare (and selectall*) had to
be good for something! I like the way this code looks better than mine.
However, it is behaviorally the same. When you set the value on the handle
after the prepare, it also only affects that handle. There doesn't seem
to be a way to change the default for an entire connection.
$sth->execute();
...
$sth->finish();

One should have this in mind, though:

"...the 'mysql_use_result' attribute: This forces the driver to
use 'mysql_use_result' rather than 'mysql_store_result'. The
former is faster and less memory consuming, but tends to block
other processes."

In other words: As long as you're running this script quite stand-alone,
you could very well go for the 'mysql_use_result' solution.

It's even a little more flexible than that. As long as you don't
dilly-dally around between the time you execute and the time you read the
last row (or the time you call finish, if you never read through to the
end), it will be fine. Or at least, it will be as good as it gets.

The real problem occurs if you never read to the end and never call finish,
and the handle never falls out of scope. Then your handle is holding a
read lock on the mysql table, blocking all write access to the table. And
once a writer is in queue behind the read lock, all new readers are also
blocked.
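
To make that concrete, here's a rough sketch of the "well-behaved" pattern:
with mysql_use_result set, either read every row, or call finish() the moment
you decide to stop, so the server-side result set (and the read lock with it)
goes away promptly. $dbh is assumed to be an already-connected handle, and
the table, column and is_interesting() check are made up for the example:

my $sth = $dbh->prepare(q{SELECT `SOME_FIELD` FROM `SOME_TABLE`},
                        { mysql_use_result => 1 });
$sth->execute;

while (my ($value) = $sth->fetchrow_array) {
        # Only quick, in-memory work here: every second spent in this loop
        # is a second the table stays read-locked against writers.
        last unless is_interesting($value);   # hypothetical early exit
}

# If we bailed out early, finish() discards the remaining rows and frees
# the server-side result set; after a complete read-through it's harmless.
$sth->finish;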


Xho
 

Thomas Götz

$sth->{mysql_use_result}=1; ## Add this line here

Yeah, that is exactly what I was looking for; thanks for that hint! ;-)
As the script runs against a MySQL server that doesn't get much traffic, I
can live with the limitations you mentioned (blocking ...).

Tom
 
