Hints and a masterstroke
The magazine of the Northern California Oracle User Group has recently taken to publishing a series entitled “Ask the Oracles”. Note the plural in that title – it’s a very important detail. Each issue, the editor puts the same question to half a dozen specialists who then write a short response – just a few hundred words – summarizing their views on the question. It’s a brilliant idea, and one that can only help to improve the quality of information that the reader receives. There are two major benefits to the strategy.
First – there will always be variation in opinion, or at least emphasis, even among the people who really do qualify as experts; moreover slight differences in interpreting the question could also lead to different aspects of the issue being highlighted from different perspectives. All in all, a set of short notes is likely to produce a balanced and intelligent “discussion” of a topic that allows the reader to recognize how, and where, comments from the experts might be most applicable to their current circumstances.
Secondly – there is always the benefit that the group approach will encourage good writing, sensible thought, and quality information. There is a lot of rubbish floating around on the internet claiming the title of “expert advice”, and it won’t go away until the commercial benefit of churning it out disappears. There are plenty of websites where quantity is more important than quality; and commercial sites where the editor has no idea of (and possibly no interest in) the correctness of what they publish. So how do you separate the wheat from the chaff – put little bits of each side by side, and suddenly it’s a lot easier for people see the difference. (I’m not expecting to see any chaff merchants in the NoCOUG series, by the way, as it’s a non-profit organization driven by Oracle users for the benefit of Oracle users – but if some chaff does manage to creep I think you’ll be able to spot it).
There is a third possible benefit – some of the biggest chaff merchants may decide they want to play; and if they do the only way to succeed is to write some good material instead of their usual dross – which means they have to stop churning out chaff – so everyone wins.
So congratulations of Iggy Fernandez (Editor of the NoCOUG Journal) for coming up with the idea, and good luck with future issues.
The reason for this note was that I was one of the “Oracles” asked to respond to the question “Does the Optimizer need a hint?” (Published as Does the Optimizer need a clue”. The other contributors for this question were: Gaja Vaidyanatha, Guy Harrison, Cary Millsap, Chris Lawson, and Dan Tow. I can’t reprint the comments from the other experts, of course, but the following was my contribution:
Does the Optimizer need a hint?
If you design your application perfectly, make all the smart choices with data structures, create all the relevant constraints, generate suitable statistical information about your data, and write carefully crafted code, then you will still find that there are a few statements that need hints before the optimizer follows the “best” execution path.
There are many reasons why the optimizer may need help – including bugs, simple deficiencies in the optimizer model, and the inherent problems of collecting, storing, and processing the complex statistics needed to describe real-world data thoroughly.
Counter-intuitively, another reason why the optimizer may need help is because there are some extremely cunning strategies built into the run-time engine. Consider, for example, the simple query:
from emp outer
where outer.sal > (
from emp inner
where inner.dept_no = outer.dept_no
There are two major strategies that the optimizer could adopt for this query – create a result set with the structure (deptno, avg_sal) and do a join to the driving table (an unnest operation), or scan the driving table and execute the subquery whenever necessary (a filter operation). If you want the unnest to happen you could include the /*+ unnest */ hint in the subquery; to force the filter operation you could include the /*+ no_unnest */ hint.
Of course, you might try to avoid using hints by manually unnesting, rewriting the query as:
from emp outer
select dept_no, avg(inner.sal) avg_sal
outer.dept_no = inner.dept_no
and outer.sal > inner.avg_sal;
Alas, if you do this in recent versions of Oracle you might then need to stop the optimizer from doing a cunning – but possibly catastrophically inefficient – piece of complex view merging by including the /*+ no_merge */ hint in what is now the inline view (you could also achieve this through a /*+ no_merge(inner) */ in the main query).
But why might you want to control the strategy that the optimizer chooses for the original query anyway? Because there is a clever trick, known as scalar subquery caching, that can occur at run-time to minimize the number of times the filter subquery is executed – but it is impossible for the optimizer to know how many times the filter subquery will actually run. (The optimizer may be able to work out the minimum number of times the filter subquery has to run, but that’s not necessarily a good estimate of the actual run-time activity).
In this specific example it is likely that unnesting will be the better option– in other cases it will be less obvious. And if the optimizer chooses the wrong option you have to give it a hint, or rewrite the query, to make it do the right thing.
But look at the comment I made about rewriting this query. In Oracle 8i, my rewrite with the inline view could be a good idea – in Oracle 9i the inline view might get merged, with disastrous side-effects on performance.
The same type of issue appears with hinting – you find a hint that seems to solve a problem in one version of Oracle and causes a problem when you upgrade to the next version. (The ordered hint is a good example of this in 8i, and the push_subq hint is a good example in 9i). The biggest problem with hints is that they are badly documented – it is almost impossible to find out exactly what each specific hint is supposed to do, and if you don’t know what a particular hint does, how can you work out why it seems to solve a particular problem.
Hints can be very useful to solve urgent problems – but my general advice is (a) don’t use them as a first resort, (b) check whether the real problem is in the statistics (c) if you really need to hint your SQL, you probably need an average of at least one hint per table to lock in the execution path you expect and (d) assume that you’re going to have to revisit and retest any hinted SQL on the next upgrade.