pySQLite Insert speed

M

mdboldin

I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?
 
C

Carsten Haese

I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?

My only problem with (B) is that it should really be this:

sqlxb= 'INSERT INTO DTABLE2 VALUES (?, ?, ?, ?)'
curs.execute( sqlxb, values )

Apart from that, (B) is better than (A). The parameter binding employed in (B)
is not only faster on many databases, but more secure. See, for example,
http://informixdb.blogspot.com/2007/07/filling-in-blanks.html for some
in-depth explanations of why parameter binding is better than string
formatting for performing SQL queries with variable values.

HTH,
 
M

mdboldin

(B) is better than (A). The parameter binding employed in (B)
is not only faster on many databases, but more secure.
See, for example,http://informixdb.blogspot.com/2007/07/filling-in-
blanks.html

Thx. The link was helpful, and I think I have read similar things
before-- that B is faster.
So ... I just rewrote the test code from scratch and B is faster. I
must have had something wrong in my original timing.
 
S

Steve Holden

See, for example,http://informixdb.blogspot.com/2007/07/filling-in-
blanks.html

Thx. The link was helpful, and I think I have read similar things
before-- that B is faster.
So ... I just rewrote the test code from scratch and B is faster. I
must have had something wrong in my original timing.

Don't forget, by the way, that your original (B) code was performing the
string substitution of the parameter markers into the SQL statement each
time the statement was executed. As Carsten pointed out, this overhead
should be performed at most once, and only then if you want your code to
be portable over database backends with different paramstyles. Forget (A).

regards
Steve
 
T

Tim Roberts

I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?

I think you misunderstood. (B) is *ALWAYS* the proper way of doing
parameterized SQL queries. Unconditionally. The (A) style is way too
vulnerable to SQL injection attacks.
 
M

mdboldin

Steve, I want to make sure I understand. My test code is below, where
ph serves as a placeholder. I am preparing for a case where the number
of ? will be driven by the length of the insert record (dx)

dtable= 'DTABLE3'
print 'Insert data into table %s, version #3' % dtable
ph= '?, ?, ?, ?'
sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
t0a=time.time()
for dx in d1:
curs1.execute(sqlx,dx)
print (time.time()-t0a)
print curs1.lastrowid
conn1.commit()

I think you are saying that sqlx is re-evaluated in each loop, i.e.
not the same as pure hard coding of
sqlx= 'INSERT INTO DTABLE3 VALUES ( ?, ?, ?, ? ) '
Is that right? Hence (if I understand python convention), this can be
solved by adding
sqlx= copy.copy(sqlx)
before the looping. And in tests adding this step saved about 5-10% in
time.

And yes, I can see why (B) is always better from a security
standpoint. The python solutions for problems such as there are a
great help for people like me, in the sense that the most secure way
does not have a speed penalty (and in this case is 3-4x faster).
 
S

Steve Holden

Steve, I want to make sure I understand. My test code is below, where
ph serves as a placeholder. I am preparing for a case where the number
of ? will be driven by the length of the insert record (dx)

dtable= 'DTABLE3'
print 'Insert data into table %s, version #3' % dtable
ph= '?, ?, ?, ?'
sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
t0a=time.time()
for dx in d1:
curs1.execute(sqlx,dx)
print (time.time()-t0a)
print curs1.lastrowid
conn1.commit()

I think you are saying that sqlx is re-evaluated in each loop, i.e.
not the same as pure hard coding of
sqlx= 'INSERT INTO DTABLE3 VALUES ( ?, ?, ?, ? ) '
Is that right?

Yes. If the sql is constant then you would be performing an unnecessary
computation inside the loop. Not a biggie, but it all takes time. Is the
loop above your original code? If so I was wrong about the loop.
Hence (if I understand python convention), this can be
solved by adding
sqlx= copy.copy(sqlx)
before the looping. And in tests adding this step saved about 5-10% in
time.
Now this I don;t really understand at all. What's the point of trying to
replace sqlx with a copy of itself? Perhaps if you explained what you
hope this will achieve I could comment more intelligently.
And yes, I can see why (B) is always better from a security
standpoint. The python solutions for problems such as there are a
great help for people like me, in the sense that the most secure way
does not have a speed penalty (and in this case is 3-4x faster).

Yes, it's a real win-win. Since both the table and the number of
arguments appear to be variable one possible solution is to build a dict
that would allow you to look up the right SQL using the table name. So,
suppose you have the following tables and number of arguments:

tables = (("table1", 3),
("table2", 5),
("table3", 2)
)

you could create a suitable dict as (untested):

tdict = {}
for tbl, ct in tables:
tdict[tbl] = "INSERT INTO %s VALUES (%s)" % \
(tbl, ", ".join(["?"] * ct))

Then you can use the table to look up the right SQL, quite a fast
operation compared with actually building it.

regards
Steve
 
D

Diez B. Roggisch

I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?

You most certainly have not found that A is the preferred over B - the
opposite is true. Using A will make you vulnerable against
SQL-injection-attacks. B OTOH will ensure that the parameters are
properly escaped or otherwise dealt with.

Regarding the intuition - that depends on what actually happens inside
B. If B works in a way that it

- converts arguments to strings

- escapes these where necessary

- builts one SQL-statement out of it

- excutes the SQL

then B is slower than A because A is just string-interpolation, whereas
B is sanitizing + string-interpolation. So it must be slower.

But a "sane" DB will instead directly use the SQL passed in B, and
transmit the parameter as binary into the backend, resulting in more
compact representation + lesser or now marshalling overhead plus
possible query parsing overhead reduction due to cached execution plans.
Which could explain B being more performant.

Diez
 
M

mmm

Now this I don;t really understand at all. What's the point of trying to
replace sqlx with a copy of itself? Perhaps if you explained what you
hope this will achieve I could comment more intelligently.

I am/was attempting to convert

sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)

to code that did to need to be re-evaluated. i.e. to insert the
dtable and ph values as if they were hard coded.

copy.copy --> A shallow copy constructs a new compound object and
then (to the extent possible) inserts references into it to the
objects found in the original.
 
S

Steve Holden

mmm said:
I am/was attempting to convert

sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)

to code that did to need to be re-evaluated. i.e. to insert the
dtable and ph values as if they were hard coded.

copy.copy --> A shallow copy constructs a new compound object and
then (to the extent possible) inserts references into it to the
objects found in the original.

Unfortunately you weren't dealing with a compound object object here, so
all you are doing is creating a copy of the string you've just created
and replacing the original with it. Copy.copy() is meant for creating
(say) lists, tuples and dicts where the elements are references to the
same objects that the elements of the original structure referred to.

regards
Steve
 
M

mmm

Steve, I think you were right the first time is saying
it should really be this:
sqlxb= 'INSERT INTO DTABLE2 VALUES (?, ?, ?, ?)'

my copy.copy() has the equivalent effect.

Running this test code produces the output below

import copy

print 'Test 1'
pf= '?,?,?,?'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print sqlx1

print
print 'Test 2'
sqlx2= copy.copy(sqlx1)
sqlx3= sqlx1
pf= '?,?,?, ****'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print 'sqlx1= ', sqlx1
print 'sqlx2= ', sqlx2
print 'sqlx3= ', sqlx2

== output
Test group 1
INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Test group 2
sqlx1= INSERT INTO DTABLE2 VALUES ( ?,?,?, **** )
sqlx2= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx3= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

I interpret this to mean that sqlx1 is not a simple string
 
S

Steve Holden

mmm said:
Steve, I think you were right the first time is saying


my copy.copy() has the equivalent effect.

Running this test code produces the output below

import copy

print 'Test 1'
pf= '?,?,?,?'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print sqlx1

print
print 'Test 2'
sqlx2= copy.copy(sqlx1)
sqlx3= sqlx1
pf= '?,?,?, ****'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print 'sqlx1= ', sqlx1
print 'sqlx2= ', sqlx2
print 'sqlx3= ', sqlx2

== output
Test group 1
INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Test group 2
sqlx1= INSERT INTO DTABLE2 VALUES ( ?,?,?, **** )
sqlx2= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx3= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

I interpret this to mean that sqlx1 is not a simple string

Sorry, since you didn't copy and paste the actual code you ran I don't
regard those results as reliable.

What I will repeat, however, is that while there is a *slight*
difference is semantics between

s = "some string"
s1 = s

and

s = "some string"
s1 = copy.copy(s)

that difference is only to ensure that s and s1 point to different
copies of the same string in the latter case, whereas in the former case
s and s1 point to the same string.

Since strings are immutable in Python this really doesn't make any
difference. In either case changing the value of s will leave the value
of s1 unchanged, since it will still point to the string it was
originally bound to and s will point to some other string.

I suspect some superstition is at play here, or possibly you are cargo
cult programming :)

regards
Steve
 
P

Peter Otten

Steve said:
What I will repeat, however, is that while there is a *slight*
difference is semantics between

s = "some string"
s1 = s

and

s = "some string"
s1 = copy.copy(s)

that difference is only to ensure that s and s1 point to different
copies of the same string in the latter case, whereas in the former case
s and s1 point to the same string.

No, both "point" to the same string:
True

copy.copy() is just an expensive no-op here.

Peter
 
S

Steve Holden

Peter said:
No, both "point" to the same string:

True

copy.copy() is just an expensive no-op here.
I suppose wiht strings being immutable there is no need for copy.copy()
to actually return anything other than its argument for a string. Thanks
for pointing that out.

regards
Steve
 
S

Steve Holden

Peter said:
No, both "point" to the same string:

True

copy.copy() is just an expensive no-op here.
I suppose wiht strings being immutable there is no need for copy.copy()
to actually return anything other than its argument for a string. Thanks
for pointing that out.

regards
Steve
 
M

mmm

Oops I did make a mistake. The code I wanted to test should have been

import copy
print 'Test 1'
pf= '?,?,?,?'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print sqlx1

print
print 'Test 2'
sqlx2= copy.copy(sqlx1)
sqlx3= sqlx1
pf= '?,?,?, ****'
print 'sqlx1= ', sqlx1
print 'sqlx2= ', sqlx2
print 'sqlx3= ', sqlx2

and the results would
== output
Test group 1
INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Test group 2
sqlx1=  INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx2=  INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx3=  INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Such that sqlx1 does to change with the re-assignment of 'pf'
And of course immutables such as strings are immutable. Got it now.
 
S

Steve Holden

mmm said:
Oops I did make a mistake. The code I wanted to test should have been

import copy
print 'Test 1'
pf= '?,?,?,?'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print sqlx1

print
print 'Test 2'
sqlx2= copy.copy(sqlx1)
sqlx3= sqlx1
pf= '?,?,?, ****'
print 'sqlx1= ', sqlx1
print 'sqlx2= ', sqlx2
print 'sqlx3= ', sqlx2

and the results would
== output
Test group 1
INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Test group 2
sqlx1= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx2= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx3= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Such that sqlx1 does to change with the re-assignment of 'pf'
And of course immutables such as strings are immutable. Got it now.

You still aren't showing us the code you are actually running. Why can't
you just paste it into your message?

But anyway, you appear to have got the drift now.

regards
Steve
 
S

Steve Holden

mmm said:
Oops I did make a mistake. The code I wanted to test should have been

import copy
print 'Test 1'
pf= '?,?,?,?'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print sqlx1

print
print 'Test 2'
sqlx2= copy.copy(sqlx1)
sqlx3= sqlx1
pf= '?,?,?, ****'
print 'sqlx1= ', sqlx1
print 'sqlx2= ', sqlx2
print 'sqlx3= ', sqlx2

and the results would
== output
Test group 1
INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Test group 2
sqlx1= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx2= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx3= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Such that sqlx1 does to change with the re-assignment of 'pf'
And of course immutables such as strings are immutable. Got it now.

You still aren't showing us the code you are actually running. Why can't
you just paste it into your message?

But anyway, you appear to have got the drift now.

regards
Steve
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

No members online now.

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,280
Latest member
BGBBrock56

Latest Threads

Top