JL Computer Consultancy

Can we have a sensible debate?

Apr 2005
 (updated Aug 2007)


A few days ago, Don Burleson let me know that he was launching a discussion of [Aug 2007: the Forum thread identified by the following URL seems to have been deleted some time in the last two years] “Empirical” vs. “Research” DBAs on the forum that he operates, and gave me the invitation: “Please feel free to join the discussion, if you want.” I took a look at the thread – but haven’t bothered to contribute to it. (You need an account on the site to do so, and a website has to be really useful before I go to the trouble of memorizing yet another set of account details – when I explained this point of view to Burleson, he reported it as “Lewis doesn’t want to join a forum where he can be identified”.)

Should I have joined the debate? Think about the following before you decide.


Quoting as at 1st April: The title of the thread seems to be:

Do single-user scripts "prove" Oracle Performance?, Let's examine the issue

The opening article, written by Don Burleson starts:

Today we see two “types” of Oracle DBA’s, each saying that other “types” of DBA’s are perpetuating Oracle Myths. The Research DBA seeks “laws”, tautologies about Oracle behavior that are true, always true. Conversely, the Empirical DBA seeks “correlations” and deals in probabilities rather than absolutes.

- The Empirical DBA observes real-world situations, notes correlations and then generalizes their “rules” and applies their rules and observations to new databases. The Empirical DBA believes that real-world Oracle databases with thousands of active users and hundreds of transactions per minute are often impossible to simulate and they do their testing with large, multi-user simulations.

- The Research DBA has the motto is “Prove it,” and “Trust, but Verify”. They love to perform research on their Oracle systems, creating conditions and proving the behavior of Oracle via reproducible tests. The Research DBA believes that a database can be described with SQL*Plus test scripts and every assertion about database behavior can be proven with research.

and it closes with:

So, do SQL*Plus scripts accurately reflect real-world Oracle behavior, or do you need full-blown multi-user benchmarks?

Remember the rule here that everyone treat the others with respect and dignity, even if you strongly disagree. . . .


To highlight the inherent futility of attempting to squeeze anything useful out of the proposed “debate”, I’d like to propose a different debate as an analogy. The topic of the debate is:  “Amphibious Birds vs. Emperor Penguins”, and this is how I’d like to start:

Do the Wright brothers “prove” Jumbo Jets? Let’s examine the issue.

Today we see two types of DBA – the “Amphibious Bird” DBA, and the “Emperor Penguin” DBA. The Emperor Penguin [ed: see footnote] is obviously a daft creature with an idiotic and pathetic lifestyle. The Amphibious Bird is a dynamic creature that loves to get on in the world and can deal with all sorts of challenges.

Amphibious Birds are marvels of evolution, able to face challenges in two completely different environments in the real world. Whether they are operating in the air or in the water, they are able to detect the currents around them, and respond to them with an instantaneous change in direction – even doing U-turns at break-neck speed. They don’t bother to learn the laws of aerodynamics or hydrodynamic theory to do this, they simply apply the experience and instincts they have acquired over the years to deal with the constantly changing world around them.

Emperor Penguins spend most of their time standing around on the ice doing nothing, can’t fly, and are such feeble birds that they only have a single chick in a season. They can hardly even walk properly, and when traveling may do half the distance sliding along on their fronts. They spend so much time hanging around in gangs, wearing their silly black and white costumes, that they barely notice the real world going by.  They wouldn’t recognize a challenge if it jumped up and smacked them in the face.

So, do Alcock and Brown accurately reflect the principle of flight, or do you need to build a Jumbo Jet?


Of course the analogy is a little over-blown. But it’s almost as sensible and meaningful as the original. So let’s take the original piece by piece.

Do single-user scripts "prove" Oracle Performance?

Since the word “prove” is in quotes it is possible that the author has some idiosyncratic interpretation of the word in mind. If so, how are we supposed to debate the question if we don’t know what the author’s special chosen interpretation is?

So let’s assume the quotation marks are there only for emphasis – a fairly common convention on the internet – and that the word has its generally accepted meaning, as in “convince me, I need to see something that will make me believe” (rather than the high-precision mathematical sense of “derive it from the axioms and rules of the formal system”).

So what does it mean to “prove Oracle performance”?  We may try to prove that “the sky is blue”, we may try to prove that “Jumbo jets can fly”, we may try to prove that “excessive latch activity will limit scalability in a high concurrency system”. Note that everything we try to prove is described by a sentence (or more technically, an assertion). “Oracle Performance” is a noun (or noun phrase for the sticklers), and you cannot prove a noun.  (Of course, we could try to prove Einstein’s Theory of General Relativity, but in that case the noun phrase “Einstein’s Theory of General Relativity” is a convenient notation for representing the complex assertion that makes up the theory.)

What is the expression ‘single-user script’ supposed to mean? Is it a script that only one person is allowed to execute, or is it a script that may only be executed from one session at a time, or is it a script that is only allowed to run on a machine with a single CPU? And whatever it means, why restrict the debate to such a narrow toolset?

“Do” – such a small word, and yet so easy to use incorrectly. Start a question with the word “Do”, and you imply that you are looking for a simple Yes/No answer.  Do birds fly? Do SQL*Plus scripts contain 100 lines of text? In the absence of a qualifier such as ‘all’ or ‘some’ (Do all birds fly? Do some SQL*Plus scripts contain 100 lines of text?) such questions tend to be lacking in sense – but when the qualifier is put in place the corrected questions then tend to be trivial and not worth debating.

Clearly the author had some desire to express a relationship between scripts, proof, and Oracle Performance. So let’s try and produce a question that might be what the author intended. (I have to say that I had to reword this question three times before I was happy that I had eliminated any ambiguity.)

Are there any aspects of Oracle’s performance characteristics that can be analyzed and explained beyond reasonable doubt using a simple script (or small set of simple scripts)?

Of course the question has just become trivial – the answer is yes, as Tom Kyte frequently shows on his website. And it’s not just performance characteristics – there are other questions about Oracle that succumb to the same treatment such as ‘do freelists get unbalanced’, ‘can an index grow to different depths in different locations’.

Note that I have emphasized the word “any” in the re-written question. If you changed that to “all”, the answer would not be the same. The number, or complexity, of the scripts would probably have to increase with the subtlety of the problem; and the degree of certainty would probably go down – even for a ‘full-blown benchmark’.
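As a minimal sketch of what a "simple script" of this kind looks like, the following uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made purely so that the example is self-contained; the table `t`, index `t_k` and function names are hypothetical). It asks one narrow question, "does creating an index on a column change how a lookup on that column is executed?", and answers it reproducibly by capturing the execution plan before and after.

```python
import sqlite3


def plan_for_lookup(conn):
    """Return the plan text SQLite reports for a single-key lookup on t."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT payload FROM t WHERE k = ?", (42,)
    ).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[3] for row in rows)


def run_test():
    """Build a small table, then capture the plan before and after indexing."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, k INTEGER, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO t (k, payload) VALUES (?, ?)",
        ((i % 1000, "x" * 50) for i in range(10000)),
    )
    before = plan_for_lookup(conn)   # no index on k yet
    conn.execute("CREATE INDEX t_k ON t (k)")
    after = plan_for_lookup(conn)    # index on k now available
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = run_test()
    print("without index:", before)
    print("with index   :", after)
```

Anyone can rerun the script and see the same change of plan, which is what makes the answer hard to argue with. Note how little it claims, though: it shows one aspect of one engine's behavior for one query, which is the "any" in the question above and nowhere near the "all".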


Empirical DBAs and Research DBAs.

There are two problems with the first paragraph in this section. First, the author presumably does not know the meaning of the word empirical, and therefore does not realize that practical research is all about empiricism (acquiring knowledge by experiment and observation). I recall the same author once tried to divide DBAs into ‘pragmatic’ DBAs and ‘scientific’ DBAs – because he did not realize that scientists are the ultimate pragmatists. This latest error is just more of the same, though possibly a little more understandable as there are fields of research that may not require experimentation. Researchers in history (say) cannot change the past to find out what might have happened. Researchers in some branches of mathematics have the luxury of dealing with the pure logic of formal systems. (For such mathematicians, of course, the words theory (or theorem) and proof take on meanings that are much more precise than the less formal meanings used by the average person. Ironically, Gödel's incompleteness theorem tells us that any consistent, sufficiently strong formal system must contain statements that can be known to be true, but cannot be proved to be true within the formality of the system.)

Secondly, the author clearly exposes his intention of avoiding anything resembling a sensible discussion by his extreme definition of a ‘research DBA’.  This may seem an extreme inference, but consider the alternative. ‘Research DBAs’ (I am assuming that Don Burleson is trying to classify the Tom Kytes of this world) seem to spend a noticeable fraction of their time pointing out that the directives issued by self-proclaimed experts are often sweeping generalizations that are correct only in certain circumstances, and may even be wrong most (if not all) of the time. So the only alternative interpretation of Don Burleson’s definition for a ‘research DBA’ is that he literally cannot tell the difference between the statements:

                        Your assertion is correct only in certain limited circumstances.

and

                        Your assertion is always wrong because it’s not always right.
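The difference between the two statements can be made concrete with a small sketch. The helper `classify` and the example claim below are hypothetical, chosen only for illustration: a claim is checked against a set of test cases and reported as holding in every case tested, in some of them, or in none. Finding one failing case moves a claim from the first category to the second, not to the third.

```python
def classify(claim, cases):
    """Classify a claim by how many of the supplied test cases it holds for."""
    results = [claim(case) for case in cases]
    if all(results):
        return "holds in every case tested"
    if any(results):
        return "holds only in some circumstances"
    return "holds in no case tested"


# Hypothetical claim for illustration: "doubling a number makes it bigger".
def doubling_makes_it_bigger(n):
    return 2 * n > n


if __name__ == "__main__":
    # Tested only on positive numbers, the claim looks like a universal rule.
    print(classify(doubling_makes_it_bigger, [1, 5, 100]))
    # Widen the test cases and it turns out to be correct only sometimes.
    print(classify(doubling_makes_it_bigger, [-3, 0, 1, 5]))
```

The second result does not say that the claim is always wrong; it says the claim needs a qualifier ("for positive numbers") before it is safe to apply.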

Before proceeding to the second and third paragraphs, I think we need to replace the words ‘empirical’ and ‘research’ with something more representative of what the debate is supposed to be about. How about using a couple of non-judgmental choices like: careless and careful; or perhaps slipshod and thorough; or haphazard and organized; possibly chaotic and structured. (Spot the pattern – am I trying to introduce a bias in your thinking before you even start thinking?)  Okay, let’s stick with ‘optimistic’ and ‘realistic’.

- The optimistic DBA observes real-world situations (practises on your production database system), notes correlations (using a sample size of 1, and doesn’t necessarily worry too much about causation) and then generalizes their “rules” (guesses) and applies their rules and observations to new databases (the next production database they come to).

The optimistic DBA believes that real-world Oracle databases with thousands of active users and hundreds of transactions per minute are often impossible to simulate and they do their testing with large, multi-user simulations. (But they don’t seem to bother with doing any small-scale tests, or understanding how to create, observe, and predict, based on small scale tests – so how confident are you that an optimistic DBA could build an appropriate, large, multi-user, simulation and interpret the results properly – and what’s the difference between a simulation and a set of SQL statements fired at a database through some form of a scripting mechanism?)

- The realistic DBA has the motto is “Prove it,” and “Trust, but Verify”. They love to perform research on their Oracle systems (but only on the test systems, not on the production systems), creating conditions and proving the behavior of Oracle via reproducible tests. (So it sounds like they might be good at doing simulations that range from small and cheap to large, multi-user and expensive).

The realistic DBA believes that a database can be described with SQL*Plus test scripts (but that’s just restating Ted Codd’s non-subversion rule – of course a database can be described by the SQL used to create and populate it) and every assertion about database behavior can be proven (or shown to be false) with research (and fortunately the realistic DBA is aware of the concept of cost/benefit analysis and knows how to use the available tools to get high quality information at minimum cost without having to run to an expensive, large, multi-user simulation unnecessarily).


Finally, the sentence that apparently launches us into debate.

So, do SQL*Plus scripts accurately reflect real-world Oracle behavior, or do you need full-blown multi-user benchmarks?

Once again we see the use of the word “do”, suggesting that there can only be a Yes/No answer about the use of SQL*Plus scripts. (And why did the question arbitrarily change from “single-user” to “SQL*Plus” – are realistic DBAs considered to be too ignorant to use all the tools that are available?)

We even see the further restriction that there has to be an absolute dichotomy between SQL*Plus scripts and full-blown multi-user benchmarks. (What happened to the simulations? Suddenly we have to have real users?) Have you ever stopped to consider how many keyboard operators it would take to reproduce Oracle’s latest TPC-C benchmarks if it were a proper multi-user benchmark? Do you think that maybe Oracle scripted the whole thing?

So let’s be nice to Don, and help him write some sensible questions:

Can test scripts accurately reflect features of Oracle behavior that would appear on production systems?   (Answer: Yes)

Is it possible to demonstrate, without resorting to multi-user simulations, that some common notions of how Oracle behaves are woefully inadequate? (Answer: Yes)

Do you sometimes need a complex model to predict how Oracle would perform at the concurrency levels you expect on your production systems? (Answer: Yes)

Do you always need a complex model to show that Oracle won’t perform at the concurrency levels you expect on your production systems? (Answer: No)

Is it ever impossible, or at least financially unfeasible, to create a sufficiently complex model to predict how Oracle would perform at the required concurrency levels? (Answer: Yes)

Are some complex models totally unrealistic scenarios that are only pretending to reflect the real world so that Oracle’s behavior can be made to look good? (Answer: popular opinion has it that this is an automatic yes if the model has been designed to meet a TPC council benchmark)


Conclusion

So what’s the debate really about?

Clearly it’s not intended to be a rational debate about whether or not you should actually try to test things before you implement them on production. Moreover, there are obvious indications that you are supposed to realize that testing is a very complex, expensive and subtle operation that you should leave to real experts  – with the corollary that if a test is simple to set up, it is obviously worthless.

So maybe the whole thing is just an advert for a new book from Rampant Press (the Don Burleson publishing company) which apparently includes a description of a benchmark that spreads over dozens of pages according to comments made by Don Burleson at later points in the thread.

So if you want to read up on benchmarks, why not try a free 206-page benchmark? Go to www.tpc.org and look at

TPC-C – OLTP  ->  Top Ten results by performance  -> All  -> HP Integrity rx5670 Cluster 64P 

When you get to the screen summarizing the HP Integrity results, you will find a URL at the bottom of the page that allows you to download the Full Disclosure Report for the benchmark. It makes interesting reading – especially after you’ve spotted the five or six critical paragraphs that tell you exactly which features Oracle depended on to get the performance they needed. Someone clearly applied some careful thought about how to use Oracle properly before they built the model.


Footnote:

For those who may not be familiar with Antarctic birds, the Emperor Penguin is an amphibious bird. My comments about its daft and idiotic lifestyle were made purely for the purpose of demonstrating a point and do not reflect my real opinion of Emperor Penguins. The Emperor Penguin is an extraordinary creature that has evolved to survive in one of the most hostile environments on the planet. If you want to learn more about Emperor Penguins, there is at least one website dedicated to this creature, and a few notes on general penguin adaptation.

Wilbur and Orville Wright were credited with the first successful sustained powered flight in a “heavier than air” machine on 17th Dec 1903 at Kill Devil Hill (Kitty Hawk). The distance of their first flight was 120 feet – about half the length of a Boeing 747.

In June 1919, John Alcock and Arthur Whitten Brown were the first people to fly an aircraft non-stop across the Atlantic.  They flew a modified Vickers Vimy IV biplane from St. John’s, Newfoundland to Clifden, Ireland in a time of 16 hours and 27 minutes.


Footnote 2:  5th April 2005

References to penguins in the above article should not be interpreted as being related in any way to Linux.


Footnote 3:  5th April 2005

You might note that the last line in the introduction to the debate reads:

Remember the rule here that everyone treat the others with respect and dignity, even if you strongly disagree. . . .

It is interesting to note, therefore, the following entry from Don Burleson in the body of the article.

Posted: Apr 1 2005, 11:04 AM

I'm NOT going to bash another Oracle author, it's unprofessional. . . .

 

I was interviewing a DBA yesterday (30 years DBA experience, Certified Oracle Master, ex-Oracle University instructor), and I asked "What do you think about the work of Tom Kyte? "

He replied that he suspected that Kyte had very little real-world experience diagnosing and tuning large production databases. (I've wondered about that myself, since I cannot find his experience and qualifications anywhere online.)

He also said that he was troubled by the "lab-only" approach, and the mis-leading conclusions from his "proofs".

 

So Don is prepared to state that “someone else suspected that Kyte has very little real-world experience …” That doesn’t sound respectful or dignified to me; it just sounds like a little bit of malicious gossip. Notice, of course, the absence of any proof – but Don, of course, doesn’t approve of proof; an opinion based on hearsay with no concrete evidence is totally sufficient. Anyway, it’s not Don that is failing to treat Tom with respect and dignity, he’s only quoting someone else.

 

Moreover, as we see in another post from Don a couple of days later, it’s not actually ‘everyone’ that should be treated with respect and dignity, it’s just forum members, and Tom isn’t a forum member – so that’s alright then.

 

Posted: Apr 3 2005, 11:26 AM

Please know that anyone who fails to treat other forum members with respect and dignity will not be allowed here.

Of course, this isn’t the first time that Don has displayed his belief that his rules of conduct are only for other people to obey. Somewhere around page 8 in this debate he makes the following point (to address – or avoid addressing – the comment from Bill S. that “This is America, and while you can say what you want, if you stand there yelling about how this is false you are going to have to PROVE it is false sooner or later”):

 

Hi Bill,

Let's be careful not to offend non-USA folks here:

 

            quote              

            This is America, and while you can say what you want

end quote

On the other hand, in a response to an item on Mike Ault’s blog (Be careful in what you prove), he says:

Hi Mike,

Is it just me, or does it seem like the vast majority of these "script kiddies" come from outside the USA? Why is it that with the overwhelming majority of Oracle users in the United States, there is such a disproportional amount of this nonsense from foreign lands?

JPL: This remark has since been deleted from the blog but has been quoted on page 5 of another debate proposed on Don Burleson’s forum at http://dba.ipbhost.com/index.php?showtopic=1239&st=60 


Footnote 4:  5th April 2005

Since writing this article, I have received various emails expressing approval of my criticism of authors who ought to apply more thought to what they say and how they say it when discussing technical matters. However I have also received a complaint from Australia about the article which, in the interests of fairness, I reproduce below:


 
