| Author's name: Michael Bialik Author's Email: bialik@isdn.net.il |
Date written: 20 Aug 2001 Oracle version(s): 7.x, 8.0.x, 8.1.6, 8.1.7 |
| Tuning UPDATE/DELETE statements with subqueries. | |
The tuning of UPDATE or DELETE statement referencing a single table is no different from SELECT, however some performance problem arises when a number of tables are involved. In major number of cases there are 2 tables when one of them must be updated based on some criteria from another.
We are going to analyze 2 cases (both with RULE and COST based optimizers) of UPDATE statement execution but the same reasoning and methods apply to DELETE statement as well:
Case 1. EMP and DEPT tables are involved. We would like
to update salaries of all employees (lets give them 5% raise)
working for SALES department. There is always a
possibility of using 2 statements:
a. SELECT from DEPT to get a value of DEPTNO column.
b. UPDATE EMP SET SAL = SAL * 1.05 WHERE DEPTNO =
:DNO
However lets see what happens when we try to do it by using single UPDATE statement (and Rule Based Optimizer - RBO):
UPDATE EMP SET SAL = SAL * 1.05 WHERE DEPTNO IN (SELECT DEPTNO FROM DEPTWHERE DNAME = SALES)
| Call |
Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse | 1 |
0.04 |
0.04 |
0 |
0 |
0 |
0 |
| Execute |
1 |
1.58 |
1.74 |
1 |
40135 |
2058 |
2006 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
1.62 |
1.78 |
1 |
40135 |
2058 |
2006 |
| Rows |
Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
2007 |
NESTED LOOPS
|
20015 |
TABLE ACCESS (FULL) OF 'EMP'
ç
a |
22020 |
TABLE ACCESS (BY INDEX ROWID) OF 'DEPT'
ç c |
40028 |
INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)
ç b |
As we can see the optimizer performs:
a. FULL TABLE SCAN of EMP table (about 20000 rows).
For EACH row it :
b. Accesses an INDEX of DEPT table by DEPTNO value to
get a ROWID
c. Accesses a DATA part of DEPT table by using ROWID
retrieved from an
index to get value of DNAME
column.
Only now it performs a final
elimination of unnecessary rows by filter on
DNAME column (DNAME =
SALES)
Its quite obvious that its may be more efficient
to access DEPT table first and EMP table at later stage.
In order to force the optimizer to access the tables in specified
order we have to supply USE_NL hint. It tells optimizer to
use EMP table as an inner table and, by-product, DEPT table will
be used as outer (driving) table.
UPDATE /*+ USE_NL(EMP) */ EMP SET SAL = SAL * 1.05 WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE DNAME = SALES)
| Call |
Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse | 1 |
0.01 |
0.01 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.91 |
1.68 |
5 |
7 |
2056 |
2006 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.92 |
1.69 |
5 |
7 |
2056 |
2006 |
| Rows |
Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
2007 |
NESTED LOOPS
|
2 |
TABLE ACCESS (FULL) OF 'DEPT' |
2007 |
INDEX (RANGE SCAN) OF 'EMP$DEPT' (NON-UNIQUE) |
Now we can compare the results:
| |
Without
hint |
With
hint |
Improvement(%) |
| CPU | 1.62 |
0.92 |
45 |
| Elapsed |
1.78 |
1.69 |
5 |
| Access
count |
42193 |
2063 |
95 |
As expected there is a major performance gain.
Now lets see what happens with the cost based optimizer (CBO).
UPDATE EMP SET SAL = SAL * 1.05 WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE DNAME = SALES)
| Call |
Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse | 1 |
0.01 |
0.01 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.97 |
1.71 |
0 |
108 |
2062 |
2006 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.98 |
1.72 |
0 |
108 |
2062 |
2006 |
| Rows |
Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
2007 |
HASH JOIN |
1 |
TABLE ACCESS GOAL: ANALYZED (FULL) OF 'DEPT' |
20014 |
TABLE ACCESS GOAL: ANALYZED (FULL) OF 'EMP' |
As we can see the CBO performs much better then RBO (it uses HASH JOIN instead of NESTED LOOP JOIN of RBO). However its interesting to see what will happen when we access EMP table by index instead of performing FTS on it. In order to do it we add a hint to our statement once again and another hint to enforce index access:
UPDATE /*+ USE_NL(EMP) INDEX(EMP) */ EMP SET SAL = SAL * 1.05 WHERE DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE DNAME = SALES)
| Call |
Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse | 1 |
0.04 |
0.04 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.88 |
1.02 |
0 |
7 |
2056 |
2006 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.92 |
1.06 |
0 |
7 |
2056 |
2006 |
| Rows |
Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
2007 |
NESTED LOOP |
2 |
TABLE ACCESS GOAL: ANALYZED (FULL) OF 'DEPT' |
2007 |
INDEX ACCESS GOAL: ANALYZED (RANGE SCAN) OF 'EMP$DEPT' (NON-UNIQUE) |
2007 |
INDEX ACCESS GOAL: ANALYZED (RANGE SCAN) OF 'EMP$DEPT' (NON-UNIQUE) |
Now (using CBO) there is almost no performance difference
between hinted and non-hinted statements, however we have to
remember a number of facts:
a. We update about 10 percent of rows in EMP table
b. Our rows are very short (so the number of block in
the table EMP is relatively small)
My conclusion is that there is still a place to use hints
when:
a. FTS of updated table is performed and
b. Only small percentage of rows are updated
We are going to verify that conclusion in our next example.
Case 2. Lets check a bit more complicated
statement.
EMP table must be updated based on values in EMP_LOAD
table that contains relatively small number of rows ( about 20000
rows in EMP table and 100 rows in EMP_LOAD table):
RBO:
UPDATE emp e
SET (ename, job, mgr, hiredate, sal, comm, deptno) =
(SELECT ename, job, mgr, hiredate, sal,
comm, deptno
FROM emp_load el
WHERE e.empno = el.empno)
WHERE e.empno IN (SELECT empno FROM emp_load)
| Call |
Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
Parse |
1 |
0.03 |
0.03 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.49 |
0.54 |
0 |
20321 |
208 |
100 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.52 |
0.57 |
0 |
20321 |
208 |
100 |
| Rows | Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
101 |
NESTED LOOPS
|
20015 |
TABLE ACCESS (FULL) OF 'EMP' |
20114 |
INDEX (RANGE SCAN) OF 'EMP_LOAD_PK' (UNIQUE) |
0 |
TABLE
ACCESS (BY INDEX ROWID) OF 'EMP_LOAD' |
0 |
INDEX (UNIQUE SCAN) OF 'EMP_LOAD_PK' (UNIQUE) |
Once again FULL TABLE SCAN raises its ugly head. We are
going to use the previous approach and try to access EMP as an
INNER table.
UPDATE /*+ USE_NL(e) INDEX(e) */ emp
SET (ename, job, mgr, hiredate, sal, comm, deptno) =
(SELECT ename, job, mgr, hiredate, sal, comm, deptno
FROM emp_load el
WHERE e.empno = el.empno )
WHERE e.empno IN ( SELECT empno FROM emp_load)
| Call | Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse |
1 |
0.02 |
0.02 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.05 |
0.05 |
0 |
401 |
204 |
100 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.07 |
0.07 |
0 |
401 |
204 |
100 |
| Rows | Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
101 |
NESTED LOOPS
|
101 |
INDEX GOAL: ANALYZED (FULL SCAN) OF 'EMP_LOAD_PK' (UNIQUE) |
200 |
INDEX (RANGE SCAN) OF 'PK_EMP' (UNIQUE) |
0 |
TABLE
ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'EMP_LOAD' |
0 |
INDEX GOAL: ANALYZED (UNIQUE SCAN) OF
'EMP_LOAD_PK' (UNIQUE) |
CBO:
UPDATE emp e
SET (ename, job, mgr, hiredate, sal, comm, deptno) =
(SELECT ename, job, mgr, hiredate, sal, comm, deptno
FROM emp_load el
WHERE e.empno = el.empno)
WHERE e.empno IN (SELECT empno FROM emp_load)
| Call | Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse |
1 |
0.02 |
0.02 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.52 |
0.61 |
0 |
20321 |
208 |
100 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.54 |
0.63 |
0 |
20321 |
208 |
100 |
| Rows |
Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
101 |
NESTED LOOPS
|
20015 |
TABLE ACCESS GOAL: ANALYZED (FULL) OF 'EMP' |
20114 |
INDEX GOAL: ANALYZED (RANGE SCAN) OF
'EMP_LOAD_PK' (UNIQUE) |
0 |
TABLE
ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'EMP_LOAD' |
0 |
INDEX GOAL: ANALYZED (UNIQUE SCAN) OF
'EMP_LOAD_PK' (UNIQUE) |
The result is identical to the default RBO decision, so
lets try and improve it:
UPDATE /*+ USE_NL(e) INDEX(e) */ emp
SET (ename, job, mgr, hiredate, sal, comm, deptno) =
(SELECT ename, job, mgr, hiredate, sal, comm, deptno
FROM emp_load el
WHERE e.empno = el.empno )
WHERE e.empno IN ( SELECT empno FROM emp_load)
| Call | Count |
CPU |
Elapsed |
Disk |
Query |
Current |
Rows |
| Parse |
1 |
0.02 |
0.02 |
0 |
0 |
0 |
0 |
| Execute |
1 |
0.06 |
0.06 |
0 |
401 |
204 |
100 |
| Fetch |
0 |
0.00 |
0.00 |
0 |
0 |
0 |
0 |
| Total |
2 |
0.08 |
0.08 |
0 |
401 |
204 |
100 |
| Rows | Execution
Plan |
0 |
UPDATE
STATEMENT GOAL: CHOOSE |
1 |
UPDATE
OF 'EMP' |
101 |
NESTED LOOPS
|
101 |
INDEX GOAL: ANALYZED (RANGE SCAN) OF
'EMP_LOAD_PK' (UNIQUE) |
200 |
INDEX GOAL: ANALYZED (RANGE SCAN) OF 'PK_EMP'
(UNIQUE) |
0 |
TABLE
ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF 'EMP_LOAD' |
0 |
INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'EMP_LOAD_PK' (UNIQUE) |
Now we can compare the results:
| |
Without
hint |
With
hint |
Improvement(%) |
| CPU | 0.54 |
0.08 |
85 |
| Elapsed |
0.63 |
0.08 |
87 |
| Access
count |
20529 |
605 |
97 |
As we can see there is huge performance improvement in these cases just waiting to be executed.
Further reading: N/A