JL Computer Consultancy

An example of making sensible guesses from a tkprof output

April 2006


A question came up on AskTom recently (the item starts here) describing a performance problem and showing the output from tkprof before and after a first attempt to fix the problem. Looking at the output, I decided that there were several clues available for addressing the problem (or at least, asking questions related to the problem).

People often ask:How do you use tkprof to solve a performance problem ?”  The output supplied seemed to be an ideal example for demonstrating a few ideas. This is the relevant extract from the AskTom thread – I have highlighted the comment made by the original poster in the middle just to make it easier to see:

SELECT COUNT(*) AS JOBMATCHEDCOUNT

FROM   ESC_JOB, ESC_BOOKING

WHERE  ESC_JOB.ACCEPT_DT IS NOT NULL AND BOOKED_BY = :a AND

       ESC_JOB.BOOKING_ID = ESC_BOOKING.BOOKING_ID AND

       ESC_BOOKING.BOOKING_DT >= SYSDATE - (:b / 86400)

 

call     count       cpu    elapsed       disk      query    current        rows

------- ------  -------- ---------- ---------- ---------- ----------  ----------

Parse        1      0.01       0.00          0          0          0           0

Execute      1      0.00       0.00          0          0          0           0

Fetch        2      0.32       4.14       4077       6404          0           1

------- ------  -------- ---------- ---------- ---------- ----------  ----------

total        4      0.33       4.15       4077       6404          0           1

 

Misses in library cache during parse: 1

Optimizer goal: CHOOSE

Parsing user id: 75

 

Rows     Row Source Operation

-------  ---------------------------------------------------

      1  SORT AGGREGATE

      3   TABLE ACCESS BY INDEX ROWID ESC_JOB

      7    NESTED LOOPS

      3     TABLE ACCESS BY INDEX ROWID ESC_BOOKING

   8079      INDEX RANGE SCAN INDX_ESC_BKNG2 (object id 33516)

      3     INDEX RANGE SCAN INDX_ESC_JOB_BKNGID (object id 36513)

 

The above query had used the index on esc_booking.booked_by.

But when I hint the query to use index on esc_booking.booking_dt, it is fast:

 

SELECT /*+  index ( ESC_BOOKING INDX_ESC_BKNG_BKNGDT ) */ COUNT(*) AS

JOBMATCHEDCOUNT

FROM   ESC_JOB, ESC_BOOKING

WHERE  ESC_JOB.ACCEPT_DT IS NOT NULL AND BOOKED_BY = :a AND

       ESC_JOB.BOOKING_ID = ESC_BOOKING.BOOKING_ID AND

       ESC_BOOKING.BOOKING_DT >= SYSDATE - (:b / 86400)

 

call     count       cpu    elapsed       disk      query    current        rows

------- ------  -------- ---------- ---------- ---------- ----------  ----------

Parse        1      0.00       0.00          0          0          0           0

Execute      1      0.00       0.00          0          0          0           0

Fetch        2      0.02       0.04         93       3359          0           1

------- ------  -------- ---------- ---------- ---------- ----------  ----------

total        4      0.02       0.04         93       3359          0           1

 

Misses in library cache during parse: 1

Optimizer goal: CHOOSE

Parsing user id: 75

 

Rows     Row Source Operation

-------  ---------------------------------------------------

      1  SORT AGGREGATE

      3   TABLE ACCESS BY INDEX ROWID ESC_JOB

      7    NESTED LOOPS

      3     TABLE ACCESS BY INDEX ROWID ESC_BOOKING

   4630      INDEX RANGE SCAN INDX_ESC_BKNG_BKNGDT (object id 36508)

      3     INDEX RANGE SCAN INDX_ESC_JOB_BKNGID (object id 36513)

The original poster wanted to know if it would be worth collecting histograms so that the optimizer would choose the “correct” index without the hint being used. This, of course, is a good question. Supplying correct statistics is always the strategically correct choice – hinting ought to be a last resort (or possibly an act of desperation under time pressure). However, in this case, it looked to me as if there was probably a more appropriate solution, and that histograms wouldn’t have an effect anyway.


Note – the database version is at least 9i, since the nested loop shows the newer structure allowing a table pre-fetch on the second table (esc_job). This can make it a little harder to work out exactly where the work is being done in some cases, but the numbers that appear in this case are so unbalanced that it is quite easy to see what it going on.

Note also that we are looking at execution paths with the heading “Row Source Operation”. This means we are seeing the execution paths that actually happened, not a theoretical plan generated by a call to explain plan. (There are cases where the “Row Source Operation” that gets dumped isn’t the only execution path that has been used – but since this was a test case, and since we see only one library cache miss reported, we can be fairly sure that what we are seeing is true).


Observations on the first query

The output reports 4,077 disk reads in the fetch line with an elapsed time of 4.14 seconds, an average of about 1,000 I/Os per second – clearly most of these I/Os are not real disc reads but are coming from a cache layer somewhere between Oracle’s data buffer and the physical disc.

The rows value for the range scan on index indx_esc_bkng2 is 8,079 – which means Oracle had to collect 8,079 rows from the esc_booking table. But the rows value from the table access line shows only 3 rows output – which means most of those accesses are redundant. It seems reasonably likely that most of the 6,404 logical I/Os (of which 4,077 were disk reads)  were caused by the need to visit approximately 6,400 different blocks of the table to check those 8.000 rows.

Observations on the second query

The output reports only 93 disk reads in the fetch line – which is far better – but the elapsed time of 0.04 seconds (an average of about 2,300 I/Os per second) again shows that most of these I/Os are not real disc reads.

The rows value for the range scan on index indx_esc_bkng_bkngdt is 4,630 – which means Oracle had to collect 4,630 rows from the esc_booking table. But the rows value from the table access line shows only 3 rows output – again most of those accesses are redundant. But of the 3,359 logical I/Os recorded, only 93 turned into disk reads. Possibly this was because many of the blocks were still in the Oracle db cache after running the previous query – possibly this could be because most of the required rows were packed into a much smaller number physical blocks which were frequently re-visited during the course of the query execution.

Guesses

a)         The purpose of the query is probably to find the most recent booking for an individual - given that the predicates with user inputs are:

BOOKED_BY = :a
AND   ESC_BOOKING.BOOKING_DT >= SYSDATE - (:b / 86400)

b)         Even without the comment from the original poster, you might have guessed that there were two relevant indexes on this table – both single column indexes, one on (booked_by), one on (booking_dt). Mind you, t is a little surprising (given the final reported row counts) that the optimizer has not done a “_b_tree_bitmap_plan conversion to combine both indexes before visiting the table – but if we assume this is Oracle 9, then values like “sysdate – (:b / 86400)” are treated as “unknown value” – so the selectivity of the date predicate is fixed at 5%. (This is a deficiency addressed in 10g – see my book)

c)         It seems reasonably likely that bookings for a date range will be packed into a small number of table blocks as they arrive, bookings by an individual will be scattered over a large time span – this would be consistent with the use of the index on booking date being more efficient than the use of the index on booking id.  (Although the optimizer may not find this path in 9i because of the 5% assumption noted above. It is also possible that the low clustering_factor that you might expect on a date-sequenced index like this may have been eliminated by the use of multiple freelists or ASSM).

d)         If referential integrity is being enforced, then the index on (booked_by) may exist for referential integrity purposes. (The foreign key index issue).

Suggestions

There is no efficient path available into the esc_booking table for this query, but an index on (booked_by, booking_dt) looks like a perfect index for this query, and may well be a generally more useful index. Since it starts with (booked_by) it will support the requirements of referential integrity which means the index on (booked_by) can probably be dropped when the two-column index is created.

Of course, there are always risks involved with changing index definitions to tune individual queries – other queries (even ones that don’t use either the old, or the new, index) may change their execution paths as a consequence. One particular trap to watch out for is that queries that used to use the (booked_by) index may not want to use the new (booked_by, booking_dt) because it is larger or because its clustering_factor is higher than the original. (In this case the clustering_factor is unlikely to get worse given my comments in (c) above, but if it is it could be fixed by using the dbms_stats.set_index_stats procedure).


Conclusion:

When you examine the output from tkprof it often takes a simultaneous cross-check of the block level statistics, the Row Source Operation counts, and some sensible guesswork about the purpose of the query, to be able to understand where the problem is, and how it may be addressed.


Back to Index of Topics