JL Computer Consultancy

Subquery factoring – the WITH clause.

(Recreated from an original written for the Dizwell Wiki).

June 2006


Subquery factoring appeared in Oracle 9i as a mechanism which had two possible benefits. First, it could make complex SQL easier to read, second it made it possible for the optimizer to find (or the developer to enforce) a new optimization strategy.

Subquery factoring is introduced by the “with” clause. In outline, something like the following becomes valid:

with {subquery_name_1} as (
        select statement 1
),
{subquery_name_2} as (
        select statement 2
        possibly referencing object {subquery_name_1}
)
select ...
from
        subquery_name_1  sq1,
        subquery_name_2  sq2,
        some_table       st
where   ...

As you can infer from the outline, if you have a very messy query, with lots of complicated sub_clauses, you may be able to write it in parts (the subquery factoring) and then join those parts.

A specific performance option that then becomes available - and can be controlled - is that the optimizer may decide (a) to copy back the text of the factored subquery and optimize the resulting complex statement, or (b) create a global temporary table (with in-memory metadata) for the factored subquery and use this materialized result set in the main query.

As an example - and one of my commonest uses of subquery factoring - you could do the following:

create table t1
nologging              -- adjust as necessary
as
with generator as (
        select  /*+ materialize */
               rownum id
        from    all_objects
        where   rownum <= 3000
)
select
        /*+ ordered use_nl(v2) */
        rownum                 id,
        lpad(rownum,10,'0')    small_vc,
        rpad('x',100)          padding
from
        generator      v1,
        generator      v2
where
        rownum <= 1000000
;

A few things to notice about this query:

I have used the all_objects view in my factored subquery, and given the subquery the name generator, which I have then used twice in the main query.

Because I have acquired 3,000 rows in the subquery (that’s a pretty safe number from a typical install of the database) I could get up to 9,000,000 rows by joining it to itself – event though in this case I’ve only selected 1,000,000 rows. So I’ve got a fairly easy way of creating a big test table, even from a small starting data set.

Because any select statement can be factored like this, I happen to have demonstrated that the method can be used for the select statement inside a CTAS  (create table as select) statement.

I have used the explicit (and undocumented) hint /*+ materialize */ in the subquery, to force Oracle to create a global temporary table holding the 3,000 rows. If I hadn’t done that, the optimizer may have tried to do a self-join (cartesian) on all_objects - with a horrendous execution plan. As it is, there will be a simple cartesian join between two small global temporary tables. If you want to block the creation of the global temporary table, the opposite hint is the /*+ inline */ hint - this will make the optimizer substitute the subquery text in place in the main query.

Drawbacks

If you’ve been reading carefully, you will have spotted the big potential drawback associated with subquery factoring already: Global Temporary Tables. If the factored subquery is materialized, the run-time operation will do direct path writes – which means they will go to disc – to create the result set in your temporary tablespace, and then normal multiblock reads to load the data back into the cache as the query uses the result set. This may be a load on your I/O subsystem that you cannot afford to take.

There have also been some reports of systems suffering from unexpected amounts of lock (enqueue) and latch activity when highly concurrent systems use materialized subqueries very frequently.


Back to Index of Topics