DBI problem: How can I quickly load one huge table with DBI?


Tim Haynes

Vincent Le-Texier said:
If the query is:

my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
my $s_trans = $dbh->prepare($str_trans);

For each entry I want to load into the transcript table, I do:
$s_trans->bind_param(1,$id);
$s_trans->bind_param(2,$embl_acc);

and $s_trans->execute;

OK. At least you're not preparing it as well as executing it every time :)
I would like to know if there are objects and methods in Perl DBI to
load by blocks of statements (execute one statement every 3000 entries,
for example) instead of each entry.

Two thoughts:

a) do you have any indexes or primary keys on the table while you're
inserting all this stuff? You don't want the hold-up of maintaining the
index every time, so only create such things after all the data's in place;

b) sure you can batch things up, with transactions. Turn off AutoCommit
behaviour, e.g. like:

| $dbh=DBI->connect("dbi:[stuff]", "", "", {AutoCommit => 0})

and then in your main loop, maintain a count of the rows and, every
few hundred, do a commit. Fill in the blanks in the following:

| $rowcount = 0;
| $sth = $dbh->prepare(some_insert_statement);
|
| while (looping_over_input_values) {
|     $rc = $sth->bind_param(1, something);
|     $rc = $sth->bind_param(2, somethingelse);
|     $rc = $sth->execute;
|
|     $rowcount++;
|     if (!($rowcount % 100)) {
|         $dbh->commit;
|     }
| }
| $dbh->commit;    # don't forget this after they're all in

HTH,

~Tim
--
Product Development Consultant
OpenLink Software
Tel: +44 (0) 20 8681 7701
Web: <http://www.openlinksw.com>
Universal Data Access & Data Integration Technology Providers
 

Vincent Le-Texier

Hi all,

I'm using DBI from Perl to load an Oracle database (dbi:Oracle).

My program works, but it is very slow.

The fact is:

If the query is:

my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
my $s_trans = $dbh->prepare($str_trans);

For each entry I want to load into the transcript table, I do:
$s_trans->bind_param(1,$id);
$s_trans->bind_param(2,$embl_acc);

and $s_trans->execute;


That means for each entry, I execute the statement $str_trans.

THIS IS VERY SLOW (if you have, for example, 1 million entries to load).


I would like to know if there are objects and methods in Perl DBI
to load by blocks of statements (execute one statement every 3000 entries,
for example) instead of each entry.

Other ideas are also welcome.

Thanks for your advice,

Vincent.
 

Dominik Seelow

Hello Vincent,

Vincent Le-Texier wrote:

Hi all,

I'm using DBI from Perl to load an Oracle database (dbi:Oracle).

My program works, but it is very slow.

The fact is:

If the query is:

my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
my $s_trans = $dbh->prepare($str_trans);

For each entry I want to load into the transcript table, I do:
$s_trans->bind_param(1,$id);
$s_trans->bind_param(2,$embl_acc);
You don't have to do that.

$s_trans->execute($id, $embl_acc)

works as well.
and $s_trans->execute;


That means for each entry, I execute the statement $str_trans.

THIS IS VERY SLOW (if you have, for example, 1 million entries to load).


I would like to know if there are objects and methods in Perl DBI
to load by blocks of statements (execute one statement every 3000 entries,
for example) instead of each entry.

You should set

$dbi->{AutoCommit} = 0;

so that statements are only committed when you explicitly do so. Of
course, you'll then have to add a

$dbi->commit()

yourself, either at the end of your code (which is probably not a good
idea for 1 million entries) or every 10,000 rows or so.

However, I once wrote the whole NCBI Unigene data (including accession
numbers) into two tables and, surprisingly, it was actually quite fast.

Other ideas are also welcome.

I think SQL*Loader might be faster than Perl. But I've never used it. :)
Thanks for your advice,

Vincent.

Cheers,
Dominik
 

Ron Reidy

Since this is Oracle, you have a couple of options:

1. Use SQL*Loader. You will **never, ever** be able to write anything
that will match the speed. Because of the volume of data, this will be
your best option. Make sure you are using the direct method (1).
2. Use direct inserts with the "/*+ append */" hint (1).
3. Rewrite this using the Oracle::OCI module. I am not sure, but you
may be able to use array processing with this module. If you can use
array processing, it will be orders of magnitude faster than single inserts.

(1) Note: The direct method will leave your indexes in an invalid
state. When the load is completed, they will need to be rebuilt. Also,
any triggers you have on these tables will not fire with this method.
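For what it's worth, plain DBI also exposes array processing through execute_array (with bind_param_array underneath), which DBD::Oracle can map onto Oracle's array interface, so Oracle::OCI may not be needed at all. A rough sketch against the transcript table from this thread; the connect string, credentials, and the @ids/@accs arrays are placeholders:

```perl
# Batch-insert sketch using DBI's execute_array.
# Assumes @ids and @accs are parallel arrays already filled with data;
# 'dbi:Oracle:mydb', 'user', and 'pass' are made-up placeholders.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Oracle:mydb', 'user', 'pass',
                       { AutoCommit => 0, RaiseError => 1 });

my $sth = $dbh->prepare(
    'INSERT INTO transcript (id, embl_acc) VALUES (?, ?)');

# One call sends the whole batch; far fewer round trips than
# calling execute() once per row.
my @tuple_status;
$sth->execute_array({ ArrayTupleStatus => \@tuple_status },
                    \@ids, \@accs);

$dbh->commit;
$dbh->disconnect;
```

This is an untested sketch; check the DBD::Oracle documentation for which DBI array features your driver version actually accelerates.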

In addition to Tim's suggestions, you should be connecting using the
BEQUEATH SQL*Net connection protocol. Network traffic could be an issue
here.
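A bequeath connection is what DBD::Oracle gives you when the script runs on the database host itself and connects without a TNS alias, letting ORACLE_SID select the instance. A minimal sketch, assuming a local server process and made-up instance name and credentials:

```perl
# Local bequeath-connection sketch: no host or TNS alias after
# 'dbi:Oracle:', so the session attaches to the local instance
# directly instead of going through the SQL*Net listener.
# Assumes ORACLE_HOME is set; 'mydb' is a hypothetical SID.
use strict;
use warnings;
use DBI;

$ENV{ORACLE_SID} = 'mydb';

my $dbh = DBI->connect('dbi:Oracle:', 'user', 'pass',
                       { AutoCommit => 0, RaiseError => 1 });
```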

Last, you should look into using the Oracle events system (talk with
your DBA, or look at the Oracle docs). This will tell you what is
slowing your code down. No matter what you are doing, if you are doing
Oracle development, you need to know how to use this tool.

Tim said:
If the query is:

my $str_trans = "INSERT INTO transcript (id,embl_acc) VALUES (?,?)";
my $s_trans = $dbh->prepare($str_trans);

For each entry I want to load into the transcript table, I do:
$s_trans->bind_param(1,$id);
$s_trans->bind_param(2,$embl_acc);

and $s_trans->execute;


OK. At least you're not preparing it as well as executing it every time :)

I would like to know if there are objects and methods in Perl DBI to
load by blocks of statements (execute one statement every 3000 entries,
for example) instead of each entry.


Two thoughts:

a) do you have any indexes or primary keys on the table while you're
inserting all this stuff? You don't want the hold-up of maintaining the
index every time, so only create such things after all the data's in place;

b) sure you can batch things up, with transactions. Turn off AutoCommit
behaviour, e.g. like:

| $dbh=DBI->connect("dbi:[stuff]", "", "", {AutoCommit => 0})

and then in your main loop, maintain a count of the rows and, every
few hundred, do a commit. Fill in the blanks in the following:

| $rowcount = 0;
| $sth = $dbh->prepare(some_insert_statement);
|
| while (looping_over_input_values) {
|     $rc = $sth->bind_param(1, something);
|     $rc = $sth->bind_param(2, somethingelse);
|     $rc = $sth->execute;
|
|     $rowcount++;
|     if (!($rowcount % 100)) {
|         $dbh->commit;
|     }
| }
| $dbh->commit;    # don't forget this after they're all in

HTH,

~Tim
 
