Interview Question on Subqueries vs Regular joins

L

leonard.reinstein

Hi!

I was asked the following question at a technical interview: when
would you use subqueries and when would you use regular joins (pros
and cons of each approach in terms of design and performance)?

I have read several articles on the subject but could not find a good
answer to this question.

Any help would be appreciated.

Thanks!
 
C

Charles Hooper

Hi!

I was asked the following question at a technical interview: when
would you use subqueries and when would you use regular joins (pros
and cons of each approach in terms of design and performance)?

I have read several articles on the subject but could not find a good
answer to this question.

Any help would be appreciated.

Thanks!

This technical interview question does not make sense. Oracle _may_
automatically transform a subquery into an inline view, which then is
joined to the rest of the query as a "regular join".

Consider the following example, which may be thought of as a customer
order (T1), the lines for the customer order (T2), and a table
containing a list of parts that are not in stock (T3):
CREATE TABLE T1(
MY_ID VARCHAR2(30),
MY_DATE DATE,
CUSTOMER_ID VARCHAR2(15),
PRIMARY KEY (MY_ID));

CREATE TABLE T2(
T1_MY_ID VARCHAR2(30),
LINE_NO NUMBER(10),
PART_ID VARCHAR2(30),
ORDER_QTY NUMBER(22,4),
LINE_DATE DATE,
PRIMARY KEY (T1_MY_ID,LINE_NO));

CREATE TABLE T3(
PART_ID VARCHAR2(30),
BACK_ORDER_DATE DATE,
PRIMARY KEY (PART_ID));

INSERT INTO T1 VALUES(
'001',
TO_DATE('01-JAN-2006','DD-MON-YYYY'),
'C001');

INSERT INTO T1 VALUES(
'002',
TO_DATE('01-JAN-2006','DD-MON-YYYY'),
'C002');

INSERT INTO T1 VALUES(
'003',
TO_DATE('01-JAN-2006','DD-MON-YYYY'),
'C003');

INSERT INTO T1 VALUES(
'004',
TO_DATE('01-JAN-2006','DD-MON-YYYY'),
'C001');

INSERT INTO T2 VALUES(
'001',
1,
'ABC',
5,
TO_DATE('01-JAN-2006','DD-MON-YYYY'));

INSERT INTO T2 VALUES(
'001',
2,
'ABC',
5,
TO_DATE('05-JAN-2006','DD-MON-YYYY'));

INSERT INTO T2 VALUES(
'001',
3,
'ABC',
10,
TO_DATE('10-JAN-2006','DD-MON-YYYY'));

INSERT INTO T2 VALUES(
'002',
1,
'ABCD',
20,
TO_DATE('05-JAN-2006','DD-MON-YYYY'));

INSERT INTO T2 VALUES(
'004',
1,
'ABCE',
5,
TO_DATE('15-JAN-2006','DD-MON-YYYY'));

INSERT INTO T2 VALUES(
'004',
2,
'ABC',
10,
TO_DATE('20-JAN-2006','DD-MON-YYYY'));

INSERT INTO T3 VALUES(
'ABCD',
TO_DATE('15-FEB-2006','DD-MON-YYYY'));

INSERT INTO T3 VALUES(
'ABCE',
TO_DATE('20-FEB-2006','DD-MON-YYYY'));

INSERT INTO T3 VALUES(
'ABCF',
TO_DATE('25-FEB-2006','DD-MON-YYYY'));

A standard query to combine the header record with the line detail
record (note that header record 003 will not be returned), a "regular
join".
SELECT
T1.MY_ID,
T1.MY_DATE ORDER_DATE,
T1.CUSTOMER_ID,
T2.LINE_NO,
T2.PART_ID,
T2.ORDER_QTY,
T2.LINE_DATE
FROM
T1,
T2
WHERE
T1.MY_ID=T2.T1_MY_ID;

Now, let's add a subquery to return only those line detail records
that are not back ordered:
SELECT
T1.MY_ID,
T1.MY_DATE ORDER_DATE,
T1.CUSTOMER_ID,
T2.LINE_NO,
T2.PART_ID,
T2.ORDER_QTY,
T2.LINE_DATE
FROM
T1,
T2
WHERE
T1.MY_ID=T2.T1_MY_ID
AND T2.PART_ID NOT IN (
SELECT
PART_ID
FROM
T3);

Oracle may automatically transform the above into something like this:
SELECT
T1.MY_ID,
T1.MY_DATE ORDER_DATE,
T1.CUSTOMER_ID,
T2.LINE_NO,
T2.PART_ID,
T2.ORDER_QTY,
T2.LINE_DATE
FROM
T1,
T2,
(SELECT
PART_ID
FROM
T3) T3
WHERE
T1.MY_ID=T2.T1_MY_ID
AND T2.PART_ID=T3.PART_ID(+)
AND T3.PART_ID IS NULL;

Or, it may transform it into something else.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
 
L

leonard.reinstein

This technical interview question does not make sense. Oracle _may_
automatically transform a subquery into an inline view, which then is
joined to the rest of the query as a "regular join".

Understood. So what would be the general guidelines for a developer to
use "regular" joins as opposed to subqueries? Or does it simply a
matter of coding preference and won't matter from design and
performance perspective?
 
J

joel garry

Hi!

I was asked the following question at a technical interview: when
would you use subqueries and when would you use regular joins (pros
and cons of each approach in terms of design and performance)?

I have read several articles on the subject but could not find a good
answer to this question.

Any help would be appreciated.

Thanks!

Holy smokes! People have written entire _books_ about that question.
I would just throw up my hands and shake my head and say I guess I'm
not good enough for that job.

Start here: http://www.google.com/search?hl=en&q=subqueries+joins+"jonathan+lewis"&btnG=Google+Search
http://www.google.com/search?hl=en&q=subqueries+joins+site:asktom.oracle.com

On the other hand, they may just be looking for how you approach such
problems, and the answer might be something like there isn't any
general rule of thumb that will hold up over time, you must start with
a properly normalized design, and thorough testing, examination of
plans and tracing where necessary should elicit where specific coding
styling should be done. After all, we should be giving enough
information to the optimizer to figure out what it has to do, that is
the point of non-procedural languages, after all.

On the third hand, they may have just come back from some presentation
where someone gave specific examples of the limitations of the
optimizer, and be expecting you to know about that.

On the fourth hand, they may have seen some rules of thumb about the
subject, and be expecting that answer.

On the fifth hand, they may be looking for whether you can describe
how to prune queries more specifically with the subqueries, and be
expecting a discussion of various filtering techniques, and perhaps a
discussion of correlation between the parts of a query, unnesting,
what Oracle does to process a query, and maybe even ref cursor usage.
On the join side, they may be expecting this:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i51523

I don't interview well. I seem to have much better luck just going in
and doing the work.

jg
 
C

Charles Hooper

Understood. So what would be the general guidelines for a developer to
use "regular" joins as opposed to subqueries? Or does it simply a
matter of coding preference and won't matter from design and
performance perspective?

Joel Garry provided a more complete answer, but I will take a stab at
the answer...

In Oracle 8i and earlier, there was a difference. In Oracle 8i, the
subquery had to be resolved for _each_ row in the resultset as those
versions did not perform an automatic transformation, which often
yielded poor performance compared to an equivalent query that made use
of an inline view. Oracle 9i, in some/many cases Oracle blindly
performs such transformations, even if the cost (expected time) will
be greater. Oracle 10g may perform such transformations if the cost
will be lower.

A couple months ago I contributed to this thread, in which I compared
the performance of the different methods (EXISTS subquery, IN
subquery, and inline view):
http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/6b51cb4e0c26256b

In the above test case, the subquery method, without allowing any
automatic transformations, required 46 minutes and 21 seconds,
compared to 0.21 seconds when transformations were permitted.

Another thread:
http://groups.google.com/group/comp.databases.oracle.server/browse_thread/thread/db317ea2faa1a7d0

The best general guideline is to test the performance of the
equivalent SQL statements on the version of Oracle that is available.
Compare the execution plans - if the execution plans look very
similar, Oracle probably transformed one or more of the SQL statements
into a more efficient form.

I personally prefer the inline view approach with a "regular" join.
Note that in some cases, in the inline view the DISTINCT clause must
be included to eliminate unintended duplicate result rows when
converting a query from having a subquery to a "regular" join, so
maybe that is the distinction that the interviewer was hoping that you
would identify?

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
 
F

fitzjarrell

Joel Garry provided a more complete answer, but I will take a stab at
the answer...

In Oracle 8i and earlier, there was a difference.  In Oracle 8i, the
subquery had to be resolved for _each_ row in the resultset as those
versions did not perform an automatic transformation, which often
yielded poor performance compared to an equivalent query that made use
of an inline view.  Oracle 9i, in some/many cases Oracle blindly
performs such transformations, even if the cost (expected time) will
be greater.  Oracle 10g may perform such transformations if the cost
will be lower.

A couple months ago I contributed to this thread, in which I compared
the performance of the different methods (EXISTS subquery, IN
subquery, and inline view):http://groups.google.com/group/comp.databases.oracle.misc/browse_thre...

In the above test case, the subquery method, without allowing any
automatic transformations, required 46 minutes and 21 seconds,
compared to 0.21 seconds when transformations were permitted.

Another thread:http://groups.google.com/group/comp.databases.oracle.server/browse_th...

The best general guideline is to test the performance of the
equivalent SQL statements on the version of Oracle that is available.
Compare the execution plans - if the execution plans look very
similar, Oracle probably transformed one or more of the SQL statements
into a more efficient form.

I personally prefer the inline view approach with a "regular" join.
Note that in some cases, in the inline view the DISTINCT clause must
be included to eliminate unintended duplicate result rows when
converting a query from having a subquery to a "regular" join, so
maybe that is the distinction that the interviewer was hoping that you
would identify?

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

It could also be that the interviewer didn't know any more than the
interviewee and was merely reading questions (and 'answers') found in
various places on the Internet.

In my mind the question is too open-ended to allow for a definitive
answer, as you and others have proven in this thread. Any
'answer' (or 'silver bullet') claimed as definitive is most likely
suspect.

My two cents.


David Fitzjarrell
 
R

Robert M. Gary

This technical interview question does not make sense.  Oracle _may_
automatically transform a subquery into an inline view, which then is
joined to the rest of the query as a "regular join".

Perhaps the question was more to style than to how Oracle internally
masticates the query.
-robert
 
L

Lew

Hi!

I was asked the following question at a technical interview: when
would you use subqueries and when would you use regular joins (pros
and cons of each approach in terms of design and performance)?

I have read several articles on the subject but could not find a good
answer to this question.

Just out of curiosity, why was this cross-posted to clj.programmer? It's not
a Java question.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,432
Messages
2,571,681
Members
48,796
Latest member
Greg L.

Latest Threads

Top