JL Computer Consultancy

Circular References – proof by restatement (updated 20th May).

Nov 2004


I was reading an article by Don Burleson on DBAZine quite recently when I came upon a wonderful example of an issue that has kept Oracle users stumbling around in the dark for years: the propagation of legend, fairy tale, disinformation, and “myth”.

The article was entitled “Busting the Oracle Myth Busters”. I won’t comment on how correct Mr. Burleson’s perception (or representation) of real life might be, but I would like to draw your attention to one very important point that I just happened to notice tucked away in one section of this article. It was such a good example of something that happens so often in the Oracle publishing arena.

Towards the end of the document, in a section on index rebuilds, you will find the following


Detractors of frequent index rebuilding say that Oracle indexes are self-balancing, always optimal, and always fast. They discount Oracle’s own guidelines (MetaLink Note 77474.1) for when to rebuild an index and claim that nobody has ever documented a clear-cut set of criteria to use to predict when an index should be rebuilt


I have to say that I know many people who are detractors of blindly indulging in “frequent index rebuilding”, and I think that they are all aware that B*trees are (by definition) always balanced – but I have yet to hear any of them say that B*trees are always optimal and always fast. Be that as it may, you can definitely put me into the “detractors” group and tell me that I’m in the wrong for being there –that’s a perfectly rational thing to do, and I’m quite happy to be corrected, all you have to do is provide a reasonably convincing proof of your point, or demonstrate my error.

But there are two references in this paragraph alone that are supposed to make that convincing demonstration - so let’s check them:


Reference 1:

Note 77574.1 (there was a typo in the publication, but I found the correct Metalink note): Guidelines on When to Rebuild a B-tree index.

If the BLEVEL were to be more than 4, it is recommended to rebuild the index”.

Consider this – with a blevel of five or more (blevel has to be a whole number, and 5 is the first whole number that is more than 4) you have a height of six of more.  I think most people (or the ones who understand a bit about how B-tree indexes work) would be concerned about an index with a height of six and want to investigate it a little – and there’s a chance that the index should be rebuilt. On the other hand, if the index is supposed to have a height of 6, then the table and index will be pretty big, and the cost of rebuilding it will be enormous; so I certainly wouldn’t want an automatic program to rebuild every B-tree index of blevel more than 4.

Mind you, there are people who discount Oracle’s advice - Mr. Burleson, for example, has been known to state that B-tree indexes should be rebuilt when their height exceeds three and another of the references in “Mythbusters” even points to an article by Mr. Burleson where he seems to approve of this height as a good indicator.

In passing – if you want to see how to build a truly degenerate index with height 24 in less than one second, then read “How high can you go ?”, also published on the DBAZine website. There ARE cases where b-tree indexes behave in an extreme fashion – but the good DBA will know which they are because the anomaly will be a side-effect of the business use of data, not an implicit issue of b-trees.


Reference 2:

The article on “when to rebuild an index” contained several errors, and a number of unsubstantiated claims. In particular, when I read it it contained a description of index leaf blocks “spawning” new levels so that an index with a hot spot would end up “unbalanced” because the hot spot would have a much greater height than the rest of the index.

I wrote to the author of this article, asking if he could supply a test case to demonstrate this behaviour. After a brief, friendly exchange of email, it transpired that this behaviour was not something he had seen or tested, it was just something that he had read about somewhere. After checking his sources, it turned out that the original comment came from some material by Don Burleson - in which the claim had also not been proved.

So there you have the biggest problem with the quality of information published about Oracle:

            Writer A publishes some incorrect idea – without proof

            Writer B quotes writer A, without attribution, and without proof.

            Writer A quotes writer B as proof that writer A is correct.

In passing, the author has now changed the article to remove the incorrect description – although unfortunately he still uses the term “spawning”, and does not describe the way that an index split propagates upwards (see Unbalanced Indexes – also published by DBAZine - for a correct description of leaf-block splitting. The images don’t reproduce well on my browser, so the original source is here ).


Conclusion:


Myths could not propagate in the Oracle market if the material published by “professionals” was subject to stricter (self-imposed) rules of quality. If you plan to write articles that could have a serious impact on the way in which working DBAs spend their time, then any significant claim you make should come with a proof, or a reference to a supporting article which includes a proof. At the very least, you should be able to supply a proof (perhaps in the form of a reproducible test case) on request.

 

“circular reference” does not constitute proof.

 


Addendum: (6th Nov 2004 – updated 11th Nov)

I was Googling for articles on Circular References recently when I found an article dated 18th Nov 2004 on one of the Burleson websites that carried an item head-lined “Oracle test-case “Proofs” are often invalid”.  The article has since been removed, or relocated, and replaced by a pointer to the catalog of presentations at Oracle Open World in San Francisco. (You will need to be a registered attendee with a password to see the agenda, though).

The article claimed to have found a web page that suggests that no Oracle experiences should be considered valid unless they can be proven with a test case. The web page referenced happens to be the one you are reading now; and the article quotes the opening paragraph of my conclusion (but omits the line about “professionals” subjecting themselves to stricter rules of quality) as the source of the suggestion. Read that paragraph again, and then ask yourself how the author could get from expressions like:  “serious impact” and “significant claim” to a sweeping generalization like “no Oracle experiences”.

I don’t think many people could have taken my conclusion to mean that (for example) if an article mentioned that ‘select * from dual’ normally returns just one row, it should have a test case included to prove the point. But that’s how the author seems to have interpreted things. I can only assume the author was driven by the need to create a story rather than by a desire for careful analysis.

The article then quotes Mike Ault as saying:

"Practicing DBA's know that artificial "proofs" on PC's often breakdown at high loads.  In fact, many performance problems only come after exceptionally high processing that directly contradict the test cases offered as proof."

(I think Mike has been taken out of context here. The central point of the original discussion was the mechanism of leaf block splitting and I think we can be fairly sure that there is no code in the Oracle kernel that goes: “wow, the load’s going up, I’d better propagate leaf block splits downwards instead of upwards”).

I agree with Mike’s statement – I spend quite a lot of my time demonstrating that test cases and proofs of concept have important flaws in them. I also find that a large fraction of my proofs are about things NOT working – and in most cases things that don’t work on a small scale still don’t work when you increase volume, speed and concurrency.

But the fact that a proof may break down at high loads isn’t an excuse for NOT attempting to supply any proof at all. If you offer a proof you have at least given other people the opportunity to review it, evaluate it, and (using their prior experience) identify the flaws or omissions that could make the proof irrelevant for their circumstances.

Consider a recent article where Don Burleson has stated quite categorically that rebuilding indexes online is low-risk, unobtrusive and cheap. As proof (and a decent explanatory statement is often totally sufficient for “proof”) he gives us the following (paraphrased) statements:

“Indexes rebuilds are low risk because the new index is built as a temporary segment and the old index is not destroyed until the new one is ready”. 

But many readers will see the obvious flaw as soon as they read the proof – he hasn’t considered concurrency. Imagine that another session is executing a long-running index-driven (read-only) query whilst you are going through your “automatic rebuild” of indexes. What happens to this other session when the switch from the old index to the new index takes place? Nothing at all – it keeps on using the old physical image of the index, even though the space has now been freed. Of course, as your automatic program rebuilds the next index, it might overwrite that free space and the other session will crash with Oracle error ORA-01410: Invalid Rowid.

“Index rebuilds are unobtrusive because they don’t interrupt availability and DML activity.”

Again many readers will see the obvious error as soon as they read the proof – Even an on-line rebuild locks the underlying table briefly at the start and end of the rebuild, and this definitely could have an effect on availability, and could interrupt DML activity.

“Online rebuilds are cheap because the computing resources and disk requirements are negligible.”

Again, the statement supporting the claim makes it possible for the reader to draw their own conclusions. In this case, there is no absolute right or wrong; it would be helpful if Don had also pointed out that the online rebuild has to do a tablescan and sort, and builds a materialized view log, which requires a row-level trigger on the table. Nevertheless, the statement is valid – some people will rightly decide that they may as well do the rebuild because they do consider the cost to be marginal, and think they may get some benefit. (I’ve made a similar comment to Don’s in an article of my own about index rebuilds).

The final section of the article is a quote from Don Burleson which starts:

Many people get caught-up in non-productive minutiae, such as how indexes behave, while loosing focus on the real-world. 

If understanding how indexes behave falls into the category of “non-productive minutiae”, why has Don written so many articles telling us to keep rebuilding them, and why does he want us to put them into the tablespace with the largest blocksize? And if he thinks that a knowledge of indexes falls into “non-productive minutiae” does that mean he thinks we shouldn’t bother to go and hear his presentation on indexes at Oracle World in December.

The quote finishes with:

It is ludicrous to suggest that test cases are required to "prove" real-world phenomenon such as the breakdown of Automatic Segment Space Management (bitmap freelists) at his DML load rates.  All real-world DBA's see it, and it's beyond stupid to suggest a proof for the obvious

I’m not really sure what this refers to – but it looks as if it might be a reference to a note I wrote about a bug in 9.2.0.2.

At this point I realized that I was reading an article dated 18th Nov 2004 on 6th Nov 2004: so maybe I’ve managed to find an article that was still under construction. If so, the content when it becomes ‘official’ may be different, and I’ll comment further then.


Addendum: (20th May 2005)

I gave a couple of presentations at the IOUG-A annual conference in Orlando earlier on this month, and while I was there, I picked up a copy of Select – the magazine of the IOUG-A. One of the articles was called “Using Oracle Multiple Blocksizes”, apparently written by someone called Alexey B. Danchenkov. The article started:

Databases with multiple blocksizes have been around for more than 20 years, and were first introduced in the 1980s as a method to segregate and partition data buffers. Once Oracle adopted multiple blocksizes in 9i in 2001, the database foundation for using multiple blocksizes was already a well-tested and proven approach. Non-relational databases such as CA IDMS/R network database have been using multiple blocksizes for nearly two decades.

Strangely, this opening paragraph, reminded me of something. Compare it with this – the opening paragraph of an article published under the byline of Don Burleson, on his website, several months ago.

Databases with multiple blocksizes have been around for more than 20 years and were first introduced in the 1980’s as a method to segregate and partition data buffers.  Once Oracle adopted multiple blocksizes in Oracle9i in 2001, the database foundation for using multiple blocksizes was already as well-tested and proven approach.  Non-relation databases such as the CA IDMS/R network database have been using multiple blocksizes for nearly two decades.

What a coincidence!

And the coincidence goes on, and on, and on. The Danchenkov article matches the entire Burleson article virtually word for word. In fact, the only bits that are different are the closing paragraphs, which advertise a couple of books from Rampant Techpress – one written by Don Burleson, and the other (from which this article was an extract) written by Alexey B Danchenkov but edited by Don Burleson.

Suddenly realization dawns: that’s why the two texts are so similar – Alexey B Danchenkov wrote it several months after Don Burleson had applied his inimitable editing style to it. And wasn’t Don Burleson lucky to snap up this budding new author for Rampant Techpress – he must be worth reading because he’s already been published in Select Magazine.

I emailed the editor of Select about this, and have received a reply explaining that the article was actually co-authored by Danchenkov and Burleson, and that they (Select) had inadvertently left Burleson’s name off the article and would be running a correction in the next issue of Select.

What a pity that Danchenkov and Burleson hadn’t been able to find a single extract from the book that had some input from Danchenkov. What a pity that they hadn’t even been able to find an extract that wasn’t just a reprint from an existing web-based article.


Back to Index of Topics