{\rtf1\ansi \deff4\deflang1033{\fonttbl{\f1\froman\fcharset2\fprq2 Symbol;}{\f4\froman\fcharset0\fprq2 Times New Roman;}{\f5\fswiss\fcharset0\fprq2 Arial;}{\f11\fmodern\fcharset0\fprq1 Courier New;}}
{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;
\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\widctlpar \f4\fs20 \snext0 Normal;}{\s1\qr\sb240\sa240\widctlpar\noline \shading1250 \b\f5\fs96 \sbasedon16\snext0
heading 1;}{\s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 \sbasedon16\snext0 heading 2;}{\s3\sb120\sa120\keepn\widctlpar\noline \f4\ul \sbasedon16\snext0 heading 3;}{\s4\keepn\widctlpar \b\i\f4\fs20\lang2057 \sbasedon0\snext0 heading 4;}{\*\cs10
\additive Default Paragraph Font;}{\s15\li720\widctlpar\tx720 \f11\fs20\lang2057 \snext15 macro;}{\s16\sa120\widctlpar\noline \f4 \snext16 main text;}{\s17\sa120\widctlpar\noline\brdrt\brdrs\brdrw15\brsp20 \brdrl\brdrs\brdrw15\brsp80 \brdrb
\brdrs\brdrw15\brsp20 \brdrr\brdrs\brdrw15\brsp80 \i\f4 \sbasedon16\snext17 Special Thought;}{\s18\li475\widctlpar\tx720 \f11\fs20\lang2057 \sbasedon15\snext18 SQL text;}{\s19\qr\sb240\sa240\sl480\slmult1\widctlpar\noline \b\f4\fs48 \sbasedon0\snext19
Highlights;}{\s20\qr\sb240\sa240\sl480\slmult1\widctlpar\noline \b\f4\fs48 \sbasedon16\snext20 Chapter Heading;}{\s21\fi-360\li720\sa120\widctlpar\noline\tx720{\*\pn \pnlvl10\pnf1\pnstart1\pnindent360\pnhang{\pntxtb \'b7}}\f4 \sbasedon16\snext21
bullet point;}{\*\cs22 \additive\ul\cf2 \sbasedon10 Hyperlink;}{\s23\widctlpar\tqc\tx4153\tqr\tx8306 \f4\fs20 \sbasedon0\snext23 header;}{\s24\widctlpar\tqc\tx4153\tqr\tx8306 \f4\fs20 \sbasedon0\snext24 footer;}}{\info
{\title How the Cost-Based Optimiser works}{\author Jonathan Lewis}{\operator Jonathan Lewis}{\creatim\yr2001\mo12\dy12\hr8\min49}{\revtim\yr2002\mo3\dy22\hr10\min51}{\version4}{\edmins2}{\nofpages4}{\nofwords1644}{\nofchars9372}{\*\company Gateway}
{\vern57443}}\widowctrl\ftnbj\aenddoc\hyphcaps0\formshade \fet0\sectd \linex0\headery709\footery709\colsx709\endnhere {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl2\pnucltr\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl3
\pndec\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang{\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl6\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}
{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}\pard\plain
\s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 {\lang2057 A New Paradigm in Hit Ratios.
\par }\pard\plain \widctlpar \f4\fs20 {\i The topic of hit ratios is one that can easily raise temperatures amongst DBAs. It is now generally recognised that hit ratios are }{\b\i NOT}{\i
the be-all and end-all of tuning. However, used with understanding and good humour at the appropriate point, they can be helpful.
\par }
\par It is becoming more widely appreciated that it is not a good idea to rely on {\b\i hit ratios}
as a method for assessing the efficiency of an Oracle database system. But it is important to remember that an idea is not automatically bad simply because it has gone out of fashion.
\par
\par Remember that Newton's laws of motion survived for about 300 years before Einstein produced the theory of special relativity, but even today Newton's laws are still totally adequate when used for the right reason and in the correct circumstances.
\par
\par Although the thoughtless extraction and reporting of {\b\i hit ratios} is not a sound idea, this paper would like to introduce a hit ratio that could possibly be used to produce a reasonable observation of real-life behaviour.
\par
\par \pard\plain \s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 What's Wrong with Hit Ratios ?
\par \pard\plain \widctlpar \f4\fs20
If you take a list of 1,000 numbers and reduce it to just three numbers (and those three numbers are specifically the count, mean and standard deviation) then you have created some useful information. On the other hand, if you start with just two numbers,
and reduce them to a single number (specifically their ratio) then you have lost information, and this loss of information is the generic drawback to the hit ratio.
\par
\par On top of this inherent information loss there is the added, and much more important, problem that there is often a devastating lack of understanding about the underlying measurements being used in the calculation. To clarify this statement, let me quote
an abstract of the simplest type of formula typically used to calculate a {\b\i hit ratio}.
\par
\par \pard\plain \s18\li475\widctlpar\tx720 \f11\fs20\lang2057 hit ratio H = 100 * (1 - (measurement P / measurement Q))
\par \pard\plain \widctlpar \f4\fs20
\par The trouble is that the measurements P and Q are often just a couple of numbers picked out of some {\b\i V$ dynamic performance view}, and it is often the case that people do not really know the exact significance of what these numbers are measuring.
\par
\par The classic (though more complex) example of this is, of course, the {\b\i buffer hit ratio}, a.k.a. the {\b\i cache hit ratio}, where one of several measurements used is the number of {\b\i current mode}
gets. However, over the years, the activity associated with a {\b\i current mode} get has varied dramatically, moreover the actual cost of a {\b\i current mode}
get varies with circumstances, and Oracle performs many actions which are virtually indistinguishable from {\b\i current mode} gets but does not record them as such in {\b\i v$sysstat}
. (You can confirm this last observation for yourself by comparing the activity of the {\b\i cache buffers chains }latch children with the {\b\i db block gets} statistic of {\b\i v$sysstat}
when performing different Oracle operations such as update, rollback and delayed block cleanout). So how can a {\b\i buffer hit ratio} be of any use when there is so much variation possible in the meaning (and accuracy) of the input values ?
\par
\par \pard\plain \s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 The FAN Hit Ratio
\par \pard\plain \widctlpar \f4\fs20 Despite my reservations about hit ratios in general, there is one Oracle database hit ratio that I have found to be a reasonable diagnostic tool for real-life applications. This is the {\b\i FAN hit ratio}
, which has to be applied at a macroscopic or operational level to an Oracle task. For a given operation (be it query, overnight batch, or anything else) the {\b\i FAN hit ratio} is calculated (in a fairly typical way) as:
\par
\par \tab FAN Hit Ratio =\tab 100 * (1 - least (1, calculated response time / actual response time ) )\tab (1)
\par Or\tab FHR\tab =\tab 100 * (1 - least (1, CRT/ ART ) )\tab \tab \tab \tab \tab (2)
\par
\par
\par Note: The FHR is never more than 100, and the presence of the {\b\i least()} function ensures that the FHR cannot return a negative value.
\par
\par Of the two input values, the {\b\i actual response time} (ART) is of course easy to understand and measure. However the {\b\i calculated response time} (CRT) is more of a problem, and it will sometimes require a fair deg
ree of expertise to be able to produce a reasonably accurate value. In fact I have seen many cases where an apparent performance problem is the result of an over-optimistic value for the {\b\i calculated response time}.
\par
\par The CRT for an operation represents the optimum time required to perform that operation assuming the database is operationally sound, correctly designed, and the best access path is taken. In the simplest cases the calculation basically requires you to em
ulate the work done by the {\b\i cost based optimiser} (CBO) in evaluating the cost of a query, but with the luxury of having perfect knowledge about the system, and plenty of time to do the analysis.
\par
\par Let me give a few examples:
\par
\par \trowd \trqc\trgaph108\trleft-108\trbrdrt\brdrs\brdrw15\brdrcf1 \trbrdrl\brdrs\brdrw15\brdrcf1 \trbrdrb\brdrs\brdrw15\brdrcf1 \trbrdrr\brdrs\brdrw15\brdrcf1 \trbrdrh\brdrs\brdrw15\brdrcf1 \trbrdrv\brdrs\brdrw15\brdrcf1 \clbrdrt\brdrs\brdrw15\brdrcf1
\clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \clcfpat1\clcbpat8\clshdng10000 \cellx964\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr
\brdrs\brdrw15\brdrcf1 \clcfpat1\clcbpat8\clshdng10000 \cellx3425\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \clcfpat1\clcbpat8\clshdng10000 \cellx5530\clbrdrt
\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \clcfpat1\clcbpat8\clshdng10000 \cellx6269 \pard \widctlpar\intbl {\b Example\cell Calculated response time\cell Actual response time\cell
FHR\cell }\pard \widctlpar\intbl {\b \row }\trowd \trqc\trgaph108\trleft-108\trbrdrt\brdrs\brdrw15\brdrcf1 \trbrdrl\brdrs\brdrw15\brdrcf1 \trbrdrb\brdrs\brdrw15\brdrcf1 \trbrdrr\brdrs\brdrw15\brdrcf1 \trbrdrh\brdrs\brdrw15\brdrcf1 \trbrdrv
\brdrs\brdrw15\brdrcf1 \clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx964\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1
\clbrdrr\brdrs\brdrw15\brdrcf1 \cellx3425\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx5530\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb
\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx6269 \pard \widctlpar\intbl 1\cell 3 seconds\cell 12 seconds\cell 75%\cell \pard \widctlpar\intbl \row \pard \widctlpar\intbl 2\cell 4 minutes\cell 5 minutes\cell 20%\cell \pard \widctlpar\intbl
\row \pard \widctlpar\intbl 3\cell 0.12 seconds\cell 0.1 seconds\cell 0%\cell \pard \widctlpar\intbl \row \pard \widctlpar\intbl 4\cell 0.001 seconds\cell 0.01 seconds\cell 90%\cell \pard \widctlpar\intbl \row \pard \widctlpar\intbl 5\cell 100 minutes
\cell 1,000 minutes\cell 90%\cell \pard \widctlpar\intbl \row \trowd \trqc\trgaph108\trleft-108\trbrdrt\brdrs\brdrw15\brdrcf1 \trbrdrl\brdrs\brdrw15\brdrcf1 \trbrdrb\brdrs\brdrw15\brdrcf1 \trbrdrr\brdrs\brdrw15\brdrcf1 \trbrdrh\brdrs\brdrw15\brdrcf1
\trbrdrv\brdrs\brdrw15\brdrcf1 \clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx964\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb
\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx3425\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx5530\clbrdrt\brdrs\brdrw15\brdrcf1 \clbrdrl
\brdrs\brdrw15\brdrcf1 \clbrdrb\brdrs\brdrw15\brdrcf1 \clbrdrr\brdrs\brdrw15\brdrcf1 \cellx6269 \pard \widctlpar\intbl 6\cell 4 hours\cell 5 hours\cell 20%\cell \pard \widctlpar\intbl \row \pard \widctlpar
\par Examin
ing the table, you will see that when the actual response time is close to the calculated response time the FHR is close to zero, but if the actual response time is much greater than the calculated response time the FHR starts to climb. For the FHR, zero
is best and 100% is worst.
\par
\par In example 3, we see that the {\b\i actual response time }is less than the {\b\i calculated response time} (perhaps Oracle has used a new access method that we did not know about, perhaps some unexpectedly propitious buffering effect has elimi
nated some expected wait time). So the presence of the {\b\i least()} function in the formula has given us an FHR of zero.
\par
\par Examples 4 and 5 demonstrate the pathological problem of {\b\i hit ratios}
. In example 4 we have an operation where the FHR is 90% (nominally appalling), however the total elapsed time is only one hundredth of a second and who is going to complain that they have lost a whole nine-thousandths of a second ? However, what if you h
ave to perform this operation 6,000,000 times per day ? In this case we h
ave made a mistake in granularity - we are not really interested in example 4, but in example 5. Does that six million occurrences of the operation take 100 minutes, or 1,000 minutes. A {\b\i hit ratio}
in a vacuum is no good - you need a second number to give you an idea of the scale at which the {\b\i hit ratio} applies.
\par
\par Example 6 highlights another limitation of hit ratios - in this case, the FHR is a fairly reasonable 20%. However it applies to a large batch job. If you have only four and a half hours as the window to complete this job it is {\b\i totally irrelevant}
that your FHR is pretty good - the actual run time won't fit the window.
\par
\par \pard\plain \s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 So what is the FHR ?
\par \pard\plain \widctlpar \f4\fs20 Unlike any other of the other {\b\i hit ratios} invented for Oracle databases the {\b\i FAN hit ratio}
(when used with a little caution, as indicated by examples 4, 5 and 6 above) is a single number that supplies two critical pieces of information. More significant
ly, though, the FHR is not just a meaningless number, we can equate it with a real-life phenomenon that can be experienced at a gut level by the practicing DBA.
\par
\par At a human-comprehension level, the {\b\i FAN hit ratio} is the probability that something nasty is going to hit the fan. Over and above this, however, the {\b\i FAN hit ratio}
is also a pretty accurate indicator of the speed of impact - the higher the ratio gets, the harder the substance hits.
\par \pard\plain \s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 Future Development
\par \pard\plain \widctlpar \f4\fs20 Work is still being done to refine the FAN hit
ratio. The author has already identified a couple of serious deficiencies in using the ratio as an absolute determinant of trouble. For example, a heuristically based guideline tells us that a {\b\i FAN hit ratio}
of about 50% is likely to be perfectly adequate for most mid-range system whereas for a large-scale, high-throughput, systems a value as low as 3% may not be adequate. Clearly the FHR requires a factor to ensure that its value is automatically increased
according to the scale and importance of the system - a Big System (BS) factor is under investigation.
\par
\par Furthermore, two variant of the FHR have already been discovered. The first is the so-called {\b\i flying pig hit ratio} (FPHR) where the {\b\i calculated response time} (CRT) is replaced by the {\b\i desired response time}
(DRT). Typically, this hit ratio will be seen in management reporting systems where a sub-second response time is required for reports that access tens of thousands of data items. The difficulty here is that the use of the DRT guarantees that the FHR wil
l almost inevitably be very close to 100%. (In real-world terms, this syndrome can be recognised by the complaint: "Whatever we do the user is never happy").
\par
\par The second variant, known provisionally as the {\b\i friendly user hit ratio} (FUHR), is derived from the more tolerant use of the {\b\i acceptable response time}
(ART). Recent research suggests that this variant is in fact the best ratio to use as a target for tuning efforts. If you use the original definition of the FHR as your target, it is very easy to be tempte
d into wasting large amounts of effort trying to achieve the elusive {\b\i perfect zero}
result. (This is of course a consequence of the variant of the Pareto rule that tells us that 80% of the effort will be spent on the last 20% of the work.) However, if you switch from the {\b\i calculated response time} (CRT) to the {\b\i
acceptable response time} (ART), then the first 20% of the effort may indeed be sufficient to reduce the resulting modified FUHR to zero.
\par
\par There is, inevitably, a problem here that the author is still struggling to resolve. Abbreviation Confusion Syndrome (ACS) can occur, resulting in a major indirection of effort; if we substitute the {\b\i acceptable response time} (ART) for the {\b\i
calculated response time} (CRT) in formula (2) above, we see:
\par
\par \pard \fi720\widctlpar FUHR\tab =\tab 100 * (1 - least (1, ART/ ART ) )
\par \pard\plain \s23\widctlpar \f4\fs20 {\lang2057
\par Careless use of this formula will, inevitably, result in the na\'efve DBA cancelling the ART with the ART, to produce the result:
\par
\par }\pard\plain \fi720\widctlpar \f4\fs20 FUHR\tab =\tab 100 * (1 - least (1, 1/1 ) )
\par FUHR\tab =\tab 100 * (1 - 1)
\par FUHR\tab =\tab 100 * (0)
\par FUHR\tab =\tab 0
\par \pard \widctlpar
\par With the modified FUHR apparently evaluating to zero the na\'ef
ve DBA could easily be deceived into thinking that their database is running perfectly. (Again, restating this in real-world terms, we have the well-known argument: "The users are lucky to get anything").
\par \pard\plain \s23\widctlpar \f4\fs20 {\lang2057
\par }\pard\plain \s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 Conclusion
\par \pard\plain \s23\widctlpar \f4\fs20 {\lang2057 This document is a work in progress, and research is still going on regarding the possible suitability, or safety, of using the }{\b\i\lang2057 FAN hit ratio}{\lang2057
and its variants as the basis for a tuning methodology. Any attempt by the reader to use the }{\b\i\lang2057 FAN hit ratio}{\lang2057
on any system, under any circumstances whatsoever, is at the reader's own risk. However, there are several serious points in this document that should not be casually ignored.
\par }\pard\plain \s2\sb120\sa120\keepn\widctlpar\noline \b\i\f4\fs40 Acknowledgements
\par \pard\plain \s23\widctlpar \f4\fs20 {\lang2057 Mogens Norgaard of Miracle AS (Denmark) for supplying the 15-year old malt whisky that made this important piece of research possible.
\par The audience of the UKOUG Conference 2001 for laughing in all the right places.
\par
\par }\pard\plain \widctlpar \f4\fs20 {\i Jonathan Lewis is a freelance consultant with more than 15 years experience of Oracle.
He specialises in physical database design and the strategic use of the Oracle database engine, is author of 'Practical Oracle 8I - Designing Efficient Databases' now in its 3}{\i\super rd}{\i
print run from Addison-Wesley. He maintains a web-site of Oracle-based material at www.jlcomp.demon.co.uk ,which includes The Co-operative Oracle Users' FAQ for the Oracle-related Usenet newsgroups. This article was first publ
ished in the magazine of the UK Oracle User Group.
\par }\pard\plain \s23\widctlpar \f4\fs20 {\lang2057
\par }}