ࡱ> '(&ܥhc eT8J05#FFIIIIIIIIII I I"IJCIIIIIIIIIJJJJJJCJXJOJIIIIIIJIIIIIIIIIIIIII`_IIIIIIIIIIExplain Plan - Parallel, Partitions, and Problems Jonathan Lewis JL Computer Consultancy, UK Summary Just when you thought you were getting the hang of understanding the execution plans created by the EXPLAIN PLAN facility, Oracle introduced partitioned tables, parallel execution, objects, new opimisation strategies such as bitmap star transformations, in-line views as columns, row-level security, and CUBE and ROLLUP (Oracle 8.1.5) and the Analytic functions (8.1.6). This article, and the associated presentation, look at some of the more convoluted results that can be produced as the output of EXPLAIN PLAN, and give you a toe-hold into interpreting some of the messier plans that you (or Oracle) can produce in the search for improved performance. Parallelism Consider the query: select ep1.v1, count(ep2.v2) from ep2, ep1 where ep2.v1 = 'Tyne Tees' and ep1.n1 = ep2.n1 group by ep1.v1 order by count(ep2.v2) desc ; No doubt if I printed the serial execution plan, few of you would be surprised to see something like the following: SELECT STATEMENT (choose) SORT (order by) SORT (group by) HASH JOIN TABLE ACCESS (analyzed) JPL1 EP1 (full) TABLE ACCESS (analyzed) JPL1 EP2 (full) You would interpret this to mean that Oracle scans the EP1 table and distributes its rows into a small collection of hash buckets based on the value of the N1 column, then scans the EP2 table, eliminating all but 'Tyne Tees' and distributing those into the same hash buckets using the same N1 value. After performing the join in each bucker, and discarding unwanted rows, Oracle sorts the resulting row set to perform the group by and count, passing the smaller result set forward to the sort operation to order them by the 'count' value. But what does this plan turn into if I make the tables parallel ? There are three extra pieces of information to look out for in the plan_table: the 'OTHER' which is the SQL actually sent to parallel query slaves, the 'OTHER_TAG' which explains how the different rows in the plan are tied together, and the OBJECT_NODE which acts as a reference point for 'table-queues' in the parallel SQL. For example, a parallel version of the above might give you the following: 1 SORT (order by) PARA_TO_SERIAL :Q819004 SELECT A1.C0 C0,A1.C1 C1 FROM :Q819003 A1 ORDER BY A1.C0 DESC 2 SORT (group by) PARA_TO_PARA :Q819003 SELECT COUNT(A1.C1) C0,A1.C0 C1 FROM :Q819002 A1 GROUP BY A1.C0 3 HASH JOIN PARA_TO_PARA :Q819002 SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) */ A2.C1 C0,A1.C1 C1 FROM :Q819000 A1, :Q819001 A2 WHERE A2.C0=A1.C0 4 TABLE ACCESS (analyzed) EP2(full) PARA_TO_PARA :Q819000 SELECT /*+ NO_EXPAND ROWID(A1) */ A1."N1" C0,A1."V2" C1 FROM "EP2" A1 WHERE ROWID BETWEEN :B1 AND :B2 AND A1."V1"='Tyne Tees' 5 TABLE ACCESS (analyzed) EP1(full) PARA_TO_PARA :Q819001 SELECT /*+ NO_EXPAND ROWID(A1) */ A1."N1" C0,A1."V1" C1 FROM "EP1" A1 WHERE ROWID BETWEEN :B1 AND :B2 This is actually an example of how parallel query can become very expensive. There are two clues in the plan that something fairly nasty is probably happening: first the appearance of PARALLEL_TO_PARALLEL (trimmed in my output to para_to_para) which tells you that one layer of parallel slaves is acquiring data then flinging it out to another layer of parallel slaves - this often means that vast numbers of small messages are being hurled around the system. The second clue is that there are a number of very small pieces of SQL generated for parallel query slaves; again, this suggests that layers of slaves are being used, time and time again, to generate a partial result and fling it out to another layer of slaves. Ultimately, though the only sure guideline of the threat is to look at the actual code executed by the parallel query slaves. In this case we see: Line no. 4, node Q819000: The first set of parallel slaves scans EP1, and distributes the results to the second set of parallel slaves based on a random distribution of values of N1. Line no. 5, node Q819001: The same set of parallel slaves then scans EP2, and distributes the results to the second set of parallel slaves in the same way Line no. 3, node Q819002: The second set of slaves operate hash joins on the incoming sets of data - queues Q819000 and Q819001. If necessary, every slave may have to allocate temporary space to complete the hash join. As the hash joins complete, the second set of slaves distribute their results back to the first set of slaves, using a ranging mechanism based on the N1 value to send each row to the most appropriate slave. Line no. 2, node Q819003: the first set of slaves sort and group the data in their incoming queues (Q819002). These slaves may also require temporary space to complete their sort. As the sort/group is finished, they pass the results back to the second set of slaves, using a ranging mechanism based on the N1 value to send each row to the most appropriate slave. Line no. 1, node Q819004: the second set of slaves order the data in their incoming queue (Q819003), and again may have to make use of temporary space to do the necessary sorting, before passing the results, on demand, to the query co-ordinator process that is the end-users 'shadow' process. You may think that this sounds like a lot of work - and it is. The number of messages passed around, and the number of independent sorting operations that take place can be very expensive. Parallel Query isn't necessarily efficient, it is supposed to complete the job in the shortest possible time. There are, however, lots of ways in which parallel queries can be more efficient. When trying to find efficient parallel query plans, try to work out access paths that will allow a single 'layer' of slaves to join, and perhaps sort and aggregate, two or three tables. This can reduce the number of messages passed around quite dramatically, and the number of independent sorts. The clues to watch for in the plan are: the key expression PARALLEL_COMBINED_WITH_PARENT/CHILD, and a small number of large SQL statements as the parallel query SQL. An alternative (hinted) form of the query resulted in only three parallel steps, with the initial parallel SQL being the more obscure, but overall more effective: SELECT /*+ ORDERED NO_EXPAND USE_NL(A2) INDEX(A2 "EP1_PK") */ A2."V1" C0, A1.C1 C1 FROM ( SELECT /*+ NO_EXPAND ROWID(A3) */ A3."N1" C0,A3."V2" C1 FROM "EP2" A3 WHERE ROWID BETWEEN :B1 AND :B2 AND A3."V1"='Tyne Tees' ) A1, "EP1" A2 WHERE A2."N1" = A1.C0 ; Partitioning Partitioning can be even more entertaining than parallel queries. Lets look at what can happen with execution plans when you partition the EP1 and EP2 tables on the N1 column (which is used in the join), and then run the following query; select ep1.v1, count(ep2.v2) from ep2, ep1 where ep2.v1 = 'Tyne Tees' and ep2.n1 between 80 and 120 and ep1.n1 = ep2.n1 group by ep1.v1 ; Look carefully at the two plans (produced off the SAME query, using the SAME dataset, with no added hints) and try to decide why the plans look different and what effect that might have on performance. These plans (with costs eliminated) come from AUTOTRACE: SORT (GROUP BY) PARTITION RANGE (ITERATOR) HASH JOIN TABLE ACCESS (FULL) OF 'EP1' TABLE ACCESS (FULL) OF 'EP2' SORT (GROUP BY) HASH JOIN PARTITION RANGE (ITERATOR) TABLE ACCESS (FULL) OF 'EP1' PARTITION RANGE (ITERATOR) TABLE ACCESS (FULL) OF 'EP2' I was entirely truthful about the queries being on the same datasets - the data content was absolutely identical in both cases, but the big difference between the two plans was in the partitioning. For the first plan I created both tables with the exactly the same partition definitions as follows: partition by range (n1) ( partition p1 values less than (50), partition p2 values less than (100), partition p3 values less than (150), partition p4 values less than (250) ) For the second plan, I left the EP1 table partitioned as above, but partitioned the EP2 table a little bit out of step with the EP1 table, as follows: partition by range (n1) ( partition p1 values less than (51), partition p2 values less than (101), partition p3 values less than (151), partition p4 values less than (251) ) The difference in plans is explained by Oracle 8.1's ability to handle partition-wise joins. If you look at the enhanced plan (not yet available through AUTOTRACE) you will not that you can get some information about which partitions have been used at which stage of the process. There are actually three key columns to watch out for in the plan_table when using partitioning, these are:PARTITION_ID, PARTITION_START and PARTITION_STOP, shown below as 'Pt id', and 'Pt Range': SORT (group by) PARTITION RANGE (iterator) Pt id: 2 Pt Range: 2 - 3 HASH JOIN TABLE ACCESS EP1 (full) Pt id: 2 Pt Range: 2 - 3 TABLE ACCESS EP2 (full) Pt id: 2 Pt Range: 2 - 3 The plan shows 'partition id = 2' applies across the entire query, so that both tables are involved in the same single partitioning event. The start/stop values of 2 and 3 show us that in both cases it is partitions two and three of the two tables that are required to meet the query. The method of execution for this plan is to hash data from partition 2 of EP1 with data from partition 2 of EP2, then repeat (iterate) the process for partition 3 of the two tables. By comparison, the plan for the non-matching partitions shows that two independent partitioning operations take place (ids 3 and 5), even though, coincidentally the partitions accessed from each table are still numbers 2 and 3. SORT (group by) HASH JOIN PARTITION RANGE (iterator) Pt id: 3 Pt Range: 2 - 3 TABLE ACCESS EP1 (full) Pt id: 3 Pt Range: 2 - 3 PARTITION RANGE (iterator) Pt id: 5 Pt Range: 2 - 3 TABLE ACCESS EP2 (full) Pt id: 5 Pt Range: 2 - 3 In this instance, each table is scan separately (though with some economy since only the relevant partitions are visited) to produce one large data set per table, and then a single hash operation is applied to these larger data sets. Under serial access, and with simple examples, it is arguable whether two smaller operations, or one large operation would actually be more efficient; but with larger data sets, more partitions, and other types of access, the benefits of partition-wise joining can be tremendous. The difference becomes particularly noticeable, though, under parallel execution where the partition-wise method produces the desirable strategy of larger, complex SQL using fewer layers of slaves and much less messaging - the SQL generated for the partition-wise approach manages to handle the join in a single layer loaded with in-line views: SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) SWAP_JOIN_INPUTS(A2) */ A2.C1 C0,A1.C1 C1 FROM ( SELECT /*+ NO_EXPAND FULL(A3) */ A3."N1" C0,A3."V2" C1 FROM "EP2" PARTITION(:B1) A3 WHERE A3."V1"='Tyne Tees' AND A3."N1">=80 AND A3."N1"<=120 ) A1, (SELECT /*+ NO_EXPAND FULL(A4) */ A4."N1" C0,A4."V1" C1 FROM "EP1" PARTITION(:B1) A4 WHERE A4."N1">=80 AND A4."N1"<=120 ) A2 WHERE A2.C0=A1.C0 Problems With the ever increasing subtlety of the database engine, there are all sorts of suprises in store in Oracle 8.1. when you start using explain plan. This section lists a few examples of the puzzles that may confound you in the future. Row level security - a user does a simple select statement against a table with a security predicate; they cannot see the predicate and do not know that it exists. The trace file will show the actual access path, but if you try to use EXPLAIN PLAN, you may (a) have no securiry predicate, or (b) a different security predicate from the user. Inline views - it is now possible in Oracle 8.1 to write an SQL statement as a COLUMN inside another statement. (Previous versions of Oracle allowed in-line views to be placed in the FROM clause, 8.1 allows them in the SELECT list). EXPLAIN PLAN, and trace files, and tkprof may not show any references whatsoever to the tables used in the in-line view Materialized views - Oracle can now perform query rewrite if it has information about summary tables, and you have the correct privilege. A query against a large table with a few joined 'dimension' tables may turn into a completely different query against a pre-defined summary table. If you don't have the same privileges, you may get a completely different execution plan from the end-user. Object tables - Tables can contain nested tables. A query which appears to be against one table may be resolved into a query against a completely different table, which just happens to be the nested table. Index Organized tables - It is now possible to created secondary indexes against IOTs. Since the secondary index has to hold the primary key of the table as the 'logical rowid' it is possible that a query that looks as if it should access the 'table' will actually be resolved purely from an index scan on a seconary index. Index Joins - If you have several indexes on a table, which between them cover all the columns you are interested in for a query, Oracle may choose to ignore the table completely, using range scans and hash joins by rowid on the indexes to generate a result. However good you may be at interpreting executions after the event, Oracle is always moving onwards to produced new options to astound, entertain and baffle you. Jonathan Lewis page.  PAGE 1 "88.Apppp50G01123w44E5[56608F8G8M8N8O8P8S8T8x8uPaP uDPU^2A]^f2@[o2G_o0 ! 9 : \ g  +  . P e f ;<t][h #(?Fj 4h'"=[o  * I m !!"A"g"""%#?#d###,####%%&&M&&&@)S)_))) *A*B*E,------.'.@.`.|...../%/,/>/?/H/5012w4,w4E567/808Q8R8S8T8 K @ Normalx]a &@& Heading 110Uc@ Heading 2U^@ Heading 3U(@( Heading 4 <Uc$@$ Heading 5<c(@( Heading 6<V]c"A@"Default Paragraph Font$@$ Normal Indent0+@ Endnote Text @ Header o#$@"$ Footnote TextU"O2" Double IndentOBActionU^ @R Footer 9r &>@b&Title <UckJ@rSubtitle<)@ Page Number*-@* Macro Text ]a O Plain Text]Omacro Tb0@b List BulletG h 4hO Macro TetxtO MacroTextT5T8"%x8 #w4T8 !"T5%!@CTimes New Roman Symbol "Arial1Courier New"P hOFOFh$F+]P OChris Wray - Short contractJonathan LewisPre-Installed User  !"#$%*2Root EntryE F`_)WordDocument?JCompObjjSummaryInformation((  FMicrosoft Word Document MSWordDocWord.Document.69qOh+'0 ,8 ` l x Chris Wray - Short contractJonathan Lewis~kNormalPre-Installed User1Microsoft Word fDocumentSummaryInformation8   FMicrosoft Word Document MSWordDocWord.Document.89qJL Computer Consultancy] Chris Wray - Short contractor Windows 95E@@~(@V@V+՜.+,0@Hhp x JL Computer Consultancy] Chris Wray - Short contract