The Oracle (tm) Users' Co-Operative FAQ

Tuning UPDATE/DELETE statements with subqueries


Author's name: Michael Bialik

Author's Email: bialik@isdn.net.il

Date written: 20 Aug 2001

Oracle version(s): 7.x, 8.0.x, 8.1.6, 8.1.7

Tuning UPDATE/DELETE statements with subqueries. 

Back to index of questions

The tuning of UPDATE or DELETE statement referencing a single table is no different from SELECT, however some performance problem arises when a number of tables are involved. In major number of cases there are 2 tables when one of them must be updated based on some criteria from another.

We are going to analyze 2 cases (both with RULE and COST based optimizers) of UPDATE statement execution but the same reasoning and methods apply to DELETE statement as well:

Case 1. EMP and DEPT tables are involved. We would like to update salaries of all employees (lets give them 5% raise) working for ‘SALES’ department. There is always a possibility of using 2 statements:
  a. SELECT from DEPT to get a value of DEPTNO column.
  b. UPDATE EMP SET SAL = SAL * 1.05 WHERE DEPTNO = :DNO

However lets see what happens when we try to do it by using single UPDATE statement (and Rule Based Optimizer - RBO):

   UPDATE EMP SET SAL = SAL * 1.05 
   WHERE DEPTNO IN (SELECT DEPTNO FROM DEPTWHERE DNAME = ‘SALES’)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.04

0.04

0

0

0

0

Execute

1

1.58

1.74

1

40135

2058

2006

Fetch

0

0.00

0.00

0

0

0

0

Total

2

1.62

1.78

1

40135

2058

2006

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

2007

    NESTED LOOPS                                                                                        

20015

       TABLE ACCESS (FULL) OF 'EMP'                                                         a

22020

       TABLE ACCESS (BY INDEX ROWID) OF 'DEPT'                               c

40028

           INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)                        b

As we can see the optimizer performs:
  a. FULL TABLE SCAN of EMP table (about 20000 rows).
  For EACH row it :
  b. Accesses an INDEX of DEPT table by DEPTNO value to get a ROWID
  c. Accesses a DATA part of DEPT table by using ROWID retrieved from an
      index to get value of DNAME column.
      Only now it performs a final elimination of unnecessary rows by filter on
      DNAME column (DNAME = ‘SALES’)

It’s quite obvious that it’s may be more efficient to access DEPT table first and EMP table at later stage.
In order to force the optimizer to access the tables in specified order we have to supply USE_NL hint. It tells optimizer to use EMP table as an inner table and, by-product, DEPT table will be used as outer (driving) table.

 
  UPDATE /*+ USE_NL(EMP) */ EMP SET SAL = SAL * 1.05
  WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE DNAME = ‘SALES’)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.01

0.01

0

0

0

0

Execute

1

0.91

1.68

5

7

2056

2006

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.92

1.69

5

7

2056

2006

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

2007

    NESTED LOOPS                                                                                         

2

       TABLE ACCESS (FULL) OF 'DEPT'

2007

       INDEX (RANGE SCAN) OF 'EMP$DEPT' (NON-UNIQUE)

Now we can compare the results:

 

Without hint

With hint

Improvement(%)

CPU

1.62

0.92

45

Elapsed

1.78

1.69

5

Access count

42193

2063

95

As expected there is a major performance gain.

Now let’s see what happens with the cost based optimizer (CBO).

   UPDATE EMP SET SAL = SAL * 1.05 
   WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE DNAME = ‘SALES’)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.01

0.01

0

0

0

0

Execute

1

0.97

1.71

0

108

2062

2006

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.98

1.72

0

108

2062

2006

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

2007

    HASH JOIN

1

       TABLE ACCESS GOAL: ANALYZED (FULL) OF 'DEPT'

20014

       TABLE ACCESS GOAL: ANALYZED (FULL) OF 'EMP'

As we can see the CBO performs much better then RBO (it uses HASH JOIN instead of NESTED LOOP JOIN of RBO). However it’s interesting to see what will happen when we access EMP table by index instead of performing FTS on it. In order to do it we add a hint to our statement once again and another hint to enforce index access:

   UPDATE /*+ USE_NL(EMP) INDEX(EMP) */ EMP
   SET SAL = SAL * 1.05
   WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE DNAME = ‘SALES’)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.04

0.04

0

0

0

0

Execute

1

0.88

1.02

0

7

2056

2006

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.92

1.06

0

7

2056

2006

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

2007

    NESTED LOOP

2

       TABLE ACCESS GOAL: ANALYZED (FULL) OF 'DEPT'

2007

       INDEX ACCESS GOAL: ANALYZED (RANGE SCAN) OF 'EMP$DEPT'

  (NON-UNIQUE)

2007

       INDEX ACCESS GOAL: ANALYZED (RANGE SCAN) OF 'EMP$DEPT'
(NON-UNIQUE)

Now (using CBO) there is almost no performance difference between hinted and non-hinted statements, however we have to remember a number of facts:
  a. We update about 10 percent of rows in EMP table
  b. Our rows are very short (so the number of block in the table EMP is relatively small)

My conclusion is that there is still a place to use hints when:
  a. FTS of updated table is performed and
  b. Only small percentage of rows are updated

We are going to verify that conclusion in our next example.

Case 2. Let’s check a bit more complicated statement.
  EMP table must be updated based on values in EMP_LOAD table that contains relatively small number of rows ( about 20000 rows in EMP table and 100 rows in EMP_LOAD table):

RBO:
  UPDATE emp e
  SET (ename, job, mgr, hiredate, sal, comm, deptno) =
    (SELECT ename, job, mgr, hiredate, sal, comm, deptno
     FROM emp_load el
     WHERE e.empno = el.empno)
  WHERE e.empno IN (SELECT empno FROM emp_load)

Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.03

0.03

0

0

0

0

Execute

1

0.49

0.54

0

20321

208

100

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.52

0.57

0

20321

208

100

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

101

    NESTED LOOPS                                                                                        

20015

       TABLE ACCESS (FULL) OF 'EMP'

20114

       INDEX   (RANGE SCAN) OF 'EMP_LOAD_PK' (UNIQUE)

0

 TABLE ACCESS (BY INDEX ROWID) OF 'EMP_LOAD'

0

   INDEX (UNIQUE SCAN) OF 'EMP_LOAD_PK' (UNIQUE)

             Once again FULL TABLE SCAN raises it’s ugly head. We are going to use the previous approach and try to access EMP as an INNER table.

     UPDATE /*+ USE_NL(e) INDEX(e) */ emp
     SET (ename, job, mgr, hiredate, sal, comm, deptno) = 
          (SELECT ename, job, mgr, hiredate, sal, comm, deptno
           FROM emp_load el
           WHERE e.empno = el.empno )
     WHERE e.empno IN ( SELECT empno FROM emp_load)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.02

0.02

0

0

0

0

Execute

1

0.05

0.05

0

401

204

100

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.07

0.07

0

401

204

100

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

101

    NESTED LOOPS                                                                                        

101

      INDEX GOAL: ANALYZED (FULL SCAN) OF 'EMP_LOAD_PK' (UNIQUE)

200

      INDEX (RANGE SCAN) OF 'PK_EMP' (UNIQUE)

0

 TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF 'EMP_LOAD'

0

   INDEX   GOAL: ANALYZED (UNIQUE SCAN) OF 'EMP_LOAD_PK'
              (UNIQUE)

          CBO:

    UPDATE emp e 
    SET (ename, job, mgr, hiredate, sal, comm, deptno) =
      (SELECT ename, job, mgr, hiredate, sal, comm, deptno
       FROM emp_load el
       WHERE e.empno = el.empno)
    WHERE e.empno IN (SELECT empno FROM emp_load)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.02

0.02

0

0

0

0

Execute

1

0.52

0.61

0

20321

208

100

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.54

0.63

0

20321

208

100

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

101

    NESTED LOOPS                                                                                         

20015

        TABLE ACCESS   GOAL: ANALYZED (FULL) OF 'EMP'

20114

        INDEX   GOAL: ANALYZED (RANGE SCAN) OF 'EMP_LOAD_PK'
              (UNIQUE)

0

 TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF 'EMP_LOAD'

0

   INDEX   GOAL: ANALYZED (UNIQUE SCAN) OF 'EMP_LOAD_PK'
              (UNIQUE)

The result is identical to the default RBO decision, so let’s try and improve it:

    UPDATE /*+ USE_NL(e) INDEX(e) */ emp
    SET (ename, job, mgr, hiredate, sal, comm, deptno) =
        (SELECT ename, job, mgr, hiredate, sal, comm, deptno
         FROM emp_load el
         WHERE e.empno = el.empno )
    WHERE e.empno IN ( SELECT empno FROM emp_load)
Call

Count

CPU

Elapsed

Disk

Query

Current

Rows

Parse

1

0.02

0.02

0

0

0

0

Execute

1

0.06

0.06

0

401

204

100

Fetch

0

0.00

0.00

0

0

0

0

Total

2

0.08

0.08

0

401

204

100

Rows Execution Plan

0

UPDATE STATEMENT   GOAL: CHOOSE

1

 UPDATE OF 'EMP'

101

    NESTED LOOPS                                                                                        

101

        INDEX   GOAL: ANALYZED (RANGE SCAN) OF 'EMP_LOAD_PK'
              (UNIQUE)

200

        INDEX   GOAL: ANALYZED (RANGE SCAN) OF 'PK_EMP' (UNIQUE)

0

 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
              'EMP_LOAD'

0

   INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'EMP_LOAD_PK'
              (UNIQUE)

Now we can compare the results:

 

Without hint

With hint

Improvement(%)

CPU

0.54

0.08

85

Elapsed

0.63

0.08

87

Access count

20529

605

97

As we can see there is huge performance improvement in these cases just waiting to be executed.


Further reading: N/A


Back to top

Back to index of questions