SQL user function returning list for IN clause

F

Felix

I am using the Python SQLite3 interface, but the question is probably
general to python and SQL.

I want to run a query like

select * from table a, table b where a.foo IN foobar(b.bar)

where foobar is a user function (registered by create_function in
pysqlite3) returning a list of integers. However such functions can
only return basic data types so the above is invalid. I am wondering
what the best way around this is.

I could fetch rows from table b, compute foobar(b.bar) and create a
new query for each result, but that seems very inefficient.
I could create a new table matching each row in b to all values of
b.bar and use that to join but that would be inefficient and very
redundant.

Rewriting the query to say
select * from table a, table b where foobar_predicate(a.foo, b.bar)
would work (foobar_predicate checks if a.foo is in foobar(b.bar). But
it does not allow to use an index on a.foo

If I knew the maximum length of foobar(b.bar) I could say
select * from table a, table b where a.foo in (foobar(b.bar,0), foobar
(b.bar,1), ..., foobar(b.bar,n))
where the second parameter to foobar chooses which element to return.
This is clearly not optimal.

Am I missing some obvious elegant way to do this or is it just not
possible given that the SQL IN statement does not really deal with
lists in the python sense of the word?

Thanks
 
P

Peter Otten

Felix said:
I am using the Python SQLite3 interface, but the question is probably
general to python and SQL.

I want to run a query like

select * from table a, table b where a.foo IN foobar(b.bar)

where foobar is a user function (registered by create_function in
pysqlite3) returning a list of integers. However such functions can
only return basic data types so the above is invalid. I am wondering
what the best way around this is.

I could fetch rows from table b, compute foobar(b.bar) and create a
new query for each result, but that seems very inefficient.
I could create a new table matching each row in b to all values of
b.bar and use that to join but that would be inefficient and very
redundant.

Rewriting the query to say
select * from table a, table b where foobar_predicate(a.foo, b.bar)
would work (foobar_predicate checks if a.foo is in foobar(b.bar). But
it does not allow to use an index on a.foo

If I knew the maximum length of foobar(b.bar) I could say
select * from table a, table b where a.foo in (foobar(b.bar,0), foobar
(b.bar,1), ..., foobar(b.bar,n))
where the second parameter to foobar chooses which element to return.
This is clearly not optimal.

Am I missing some obvious elegant way to do this or is it just not
possible given that the SQL IN statement does not really deal with
lists in the python sense of the word?

Define a function foobar_contains() as follows:

def foobar_contains(foo, bar):
return foo in foobar(bar)

and change the query to

select * from table a, table b where foobar_contains(a.foo, b.bar)

Peter
 
F

Felix

Rewriting the query to say
Define a function foobar_contains() as follows:

def foobar_contains(foo, bar):
    return foo in foobar(bar)

and change the query to

select * from table a, table b where foobar_contains(a.foo, b.bar)

I thought about that (see above), but it would not use an index on
a.foo which a regular a.foo IN (x,y,z) does.

Felix
 
P

Peter Otten

Felix said:
I thought about that (see above), but it would not use an index on
a.foo which a regular a.foo IN (x,y,z) does.

Sorry for not reading your post carefully.

Peter
 
P

pjcoup

I could create a new table matching each row in b to all values of
b.bar and use that to join but that would be inefficient and very
redundant.
[snip]

Is foobar(b.bar) essentially static? (I'm guessing so if you
considered this as an option).
If so, then this actually sounds like the best option to me.
Indexing on the foobar return values, this new table and joining
as described has got to be faster than the other alternatives
(even the foobar(b.bar,k) options).
Without this new table, it seems you have to calculate foobar(b.bar)
for every row of b (even if it is SQLite doing it, and not you
directly), unless I'm missing something. I'm assuming that the
overhead to store these function values won't kill you.
Good luck!

Pete
 
L

Lawrence D'Oliveiro

In message <c601fad6-8126-4f43-
I want to run a query like

select * from table a, table b where a.foo IN foobar(b.bar)

where foobar is a user function (registered by create_function in
pysqlite3) returning a list of integers. However such functions can
only return basic data types so the above is invalid. I am wondering
what the best way around this is.

Is it feasible to iterate over the entire inner join of both tables?

for entry in iterator("select a.foo, b.bar ... from a, b") :
if foobarp(entry["a.foo"], entry["b.bar"]) :
.... this is a combination I want ...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top