JL Computer Consultancy

When is a script a proof?

Apr 2005


In a recent forum thread, Don Burleson announced the following:

Mar 31 2005, 02:18 PM

Sorry, I guess I'm not being clear. Please note:

- Scripts CAN AND DO PROVE how Oracle works

- Scripts CANNOT prove anything about Oracle performance and tuning

I don’t think a statement could be much clearer than that. So we see from Don’s perspective that a set of scripts can be considered to be a proof; but if the purpose is to prove something about Oracle performance then a set of scripts cannot do the job.

I’d like to offer the following set of scripts as a proof of an important assertion about Oracle performance. After you’ve examined the scripts, you can decide for yourself whether or not “Scripts CANNOT prove anything about Oracle performance and tuning”.

You may, of course, decide that I am not using scripts, that the demonstration does not constitute proof, or that the demonstration is not about performance – that’s your choice – but at least you will have a concrete example of what I mean when I say: “I can show you a proof about a performance issue.”


Assertion: For systems that run large numbers of high-precision SQL statements in a highly concurrent fashion, it is important for performance and scalability reasons to make coding with bind variables the general practice, allowing 'literal string' values to appear only in special cases.

Proof: Demonstrate the relative impact of bind variables vs. literal strings for SQL. Test case created on 9.2.0.6, XP Pro.


Script 1: Data definition

create table t1 as
select
        rownum          id,
        lpad(rownum,10) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 10000
;

alter table t1 add constraint t1_pk primary key(id);

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                cascade          => true,
                estimate_percent => null,
                method_opt       => 'for all columns size 1'
        );
end;
/


Now we write one script that generates a lot of parse calls using ‘literal string’ SQL, and another script that generates the same number of parse calls, produces the same results, but uses ‘bind variable’ SQL. Run each a few times and report the resource consumption on the Nth test.

Note: the snap_xxx packages are simple packages I wrote several years ago to query various v$ views (or occasionally x$ objects) and generate snapshot data. The start_snap procedure queries the view and stores the results in a pl/sql table; the end_snap re-queries the view, calculates the difference between the current values and stored values, and reports the difference. The first article I published about this technique is here, and a more recent article with the full text for package snap_my_stats is here.
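The packages themselves are not reproduced here, but the underlying technique is simply "snapshot, do the work, re-snapshot, report the difference". A minimal sketch of that idea in Python (the counter names and the stats dictionary are invented for illustration; the real packages query v$ views from inside the database):

```python
# Sketch of the snap_xxx technique: store a copy of a set of counters,
# do some work, then report only the counters that changed.
# The counter names here are invented for illustration.

def start_snap(stats):
    """Store a copy of the current counter values."""
    return dict(stats)

def end_snap(before, after):
    """Report the difference between current and stored values."""
    return {name: after[name] - before.get(name, 0)
            for name in after
            if after[name] != before.get(name, 0)}

stats = {"parse count (total)": 100, "CPU used by this session": 50}
before = start_snap(stats)

stats["parse count (total)"] += 10000    # pretend the test loop ran

print(end_snap(before, stats))           # {'parse count (total)': 10000}
```

Reporting only the deltas is what makes the output readable: a v$ view holds hundreds of counters, but a single test usually moves only a handful of them.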

To minimise side effects and isolate the parse costs, the db_cache_size is set large enough to hold the entire table and index used in the test, and the shared pool is sized to hold all the SQL generated. The code loads the table and index into the buffer cache before the main test starts.

The main body of the test simply runs through a pl/sql loop, generating and executing a high-precision query using native dynamic SQL (execute immediate). The point of using NDS for this test is that it always generates a parse call, rather than using the built-in pl/sql optimization of holding cursors open, and so it is a closer approximation to the behaviour of a typical front-end programming language.
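The cost difference is easy to picture even outside the database: the shared pool is (very roughly) a cache keyed by the SQL text, so literal SQL defeats the cache while bind SQL hits it. A toy model in Python (the cache and the hard/soft parse bookkeeping are inventions for illustration, not Oracle internals):

```python
# Toy model: the shared pool as a cache keyed by the SQL text.
# The first time a text is seen it costs a 'hard parse'; a repeated
# text is a cheap cache hit. Everything here is illustrative only.

shared_pool = {}
hard_parses = 0

def parse(sql_text):
    global hard_parses
    if sql_text not in shared_pool:
        shared_pool[sql_text] = "cursor"    # build a new child cursor
        hard_parses += 1

# Bind-variable style: one statement text, executed 10,000 times.
for i in range(1, 10001):
    parse("select small_vc from t1 where id = :b1")

binds_hard = hard_parses                    # one distinct text

# Literal style: a different statement text on every execution.
for i in range(1, 10001):
    parse("select small_vc from t1 where id = " + str(i))

print(binds_hard, hard_parses - binds_hard)     # 1 10000
```

The model says nothing about latching, of course; the point of running the real scripts is to see how expensive each of those 10,000 extra "hard parses" actually is.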


Script 2: Bind variable usage

set timing on

alter system flush shared_pool;

begin
        snap_sgastat.start_snap;
        snap_latch.start_snap;
        snap_my_stats.start_snap;
end;
/

set autotrace on

select  /*+ index(t1 t1_pk) */
        count(small_vc)
from
        t1
where
        id >= 0
;

set autotrace off

spool bind_stats

declare
        v1 varchar2(10);
begin
        snap_sgastat.start_snap;
        snap_latch.start_snap;
        snap_my_stats.start_snap;
        for i in 1..10000 loop
                execute immediate
                        'select small_vc from t1 where id = :b1'
                into v1 using trunc(dbms_random.value(1,10000));
        end loop;
end;
/

execute snap_my_stats.end_snap
execute snap_latch.end_snap
execute snap_sgastat.end_snap

spool off


Script 3: Literal string usage

set timing on

alter system flush shared_pool;

begin
        snap_sgastat.start_snap;
        snap_latch.start_snap;
        snap_my_stats.start_snap;
end;
/

set autotrace on

select  /*+ index(t1 t1_pk) */
        count(small_vc)
from
        t1
where
        id >= 0
;

set autotrace off

spool literals

declare
        v1 varchar2(10);
begin
        snap_sgastat.start_snap;
        snap_latch.start_snap;
        snap_my_stats.start_snap;
        for i in 1..10000 loop
                execute immediate
                        'select small_vc from t1 where id = ' ||
                                trunc(dbms_random.value(1,10000))
                into v1;
        end loop;
end;
/

execute snap_my_stats.end_snap
execute snap_latch.end_snap
execute snap_sgastat.end_snap

spool off


Results

The following is an extract from the results. Some of the figures are included to show the differences in resource consumption. Some are included to indicate that the two tests were producing the same “end-user results” using the same access paths.

Note: 15th April:

Thanks to a reader who ran the tests on their own system, I discovered that I had forgotten to increase the shared pool when running my tests. This made some difference to the CPU costs in the ‘literals’ case, and a huge difference to the memory requirements. The change in size of the shared pool had no impact on the ‘bind variable’ test.

You will note that the CPU times marked with an asterisk are larger than the corresponding elapsed times. CPU times are not 100% trustworthy, typically because of the granularity errors that Cary Millsap discusses in his book. (NB In the first version of this table, I had left a zero for 'CPU used by this session' in the 'binds' column – this was a copying error when transferring results from the original output.)

Title                                    Bind variables    Literals              Literals
                                                           (small shared pool)   (big shared pool)

Test completion time (seconds)                     1.07               15.01                 14.05

Session Stats
Recursive cpu usage (cs)                           125*               1,411                 1,400
CPU used by this session (cs)                      133*               1,514*                1,453*
Parse time cpu (cs)                                  31               1,238                 1,203
Enqueue requests                                      3               9,739                10,000
Consistent gets                                  30,081              30,590                30,030
Consistent gets - examinations                   30,032              30,231                30,013
Table fetch by rowid                             10,029              10,149                10,010
Index fetch by key                               10,005              10,063                10,002

Latch Activity (willing to wait gets)
Enqueues                                             12              19,504                20,021
Row cache enqueue latch                              20             155,546               160,002
Row cache objects                                    22             155,631               160,003
Shared pool                                      50,405             315,431               310,370
Library cache                                   110,468             229,785               230,303
Library cache pin                                80,264              90,802                90,160
Library cache pin allocation                     40,128              60,544                60,073
Child cursor hash table                              25              68,177                70,014

SGA statistics (changes, bytes)
Shared pool free memory                         -60,372          -2,383,064          -132,850,108
Shared pool library cache                        18,496             785,008            35,554,564
Shared pool sql area                             12,920           1,633,652            96,858,308

From these results, I think we can note that:

        The literal string option is much slower than the bind option

        The literal string option uses much more CPU than the bind option

        The literal string option is much more aggressive on latching than the bind option

        The literal string option is much more aggressive on memory manipulation than the bind option.

We can therefore conclude that in a highly concurrent OLTP system, where many users frequently execute what is essentially the same high-precision statement with different 'input parameters', the coding strategy that uses bind variables to avoid generating large numbers of marginally different SQL statements will use far less machine resource, and therefore offer better performance.

If, further, we understand the significance of latch acquisition – specifically that Oracle almost invariably uses it to protect linked lists by serialising access to those lists – we will appreciate that the latching information is a clear indication that the coding strategy using bind variables will also cause much less contention, hence will scale better and offer better performance at a higher degree of concurrency.

If you have trouble seeing the implications of the increased latching, it is very easy to modify the tests slightly and run multiple copies concurrently on a small server with a couple of CPUs (you can even get laptops with 2 CPUs these days). Use dbms_random (after seeding it) to vary the query's 'input parameter'; make sure that each test restricts its queries to a different range of the table; and put a small sleep into the loop so that you don't simply demonstrate that a CPU can't run at more than 100%.
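The shape of such a multi-session harness is sketched below in Python, with a stand-in execute function (the worker layout, per-session seed, private id ranges and sleep are the points of interest; against a real database each worker would be a separate session running the test SQL):

```python
import random
import threading
import time

def worker(worker_id, iterations, range_size, execute):
    """One simulated session: seeded random, private id range, small sleep."""
    rng = random.Random(worker_id)          # seed the generator per session
    low = worker_id * range_size + 1        # each session gets its own range
    for _ in range(iterations):
        id_value = low + rng.randrange(range_size)
        execute("select small_vc from t1 where id = :b1", id_value)
        time.sleep(0.001)                   # avoid a pure 100%-CPU demo

calls = []
lock = threading.Lock()

def fake_execute(sql, bind_value):          # stand-in for a real DB call
    with lock:
        calls.append(bind_value)

threads = [threading.Thread(target=worker, args=(w, 50, 1000, fake_execute))
           for w in range(4)]
for t in threads: t.start()
for t in threads: t.join()

print(len(calls))                           # 200
```

Giving each session its own id range matters: it keeps the sessions from colliding on the same data blocks, so any contention you then measure is down to the latching, not the buffers.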


Conclusion

I believe that this article uses a very small set of very simple scripts to prove an important point about Oracle performance.

You may wish to argue that:

        I haven’t used scripts

        This demonstration doesn’t constitute proof

        This demonstration isn’t about performance

If so, then that’s fine with me; but at least you now have a concrete example of why I think it is possible to prove points about Oracle performance with simple scripts, and something about which to argue.


Footnote:

I did try to word my hypothesis carefully to avoid one of the simple counter-arguments about bind variables. It is easy to generate examples where the resources saved on parsing are far outweighed by the resource spent using a bad execution path. To cater for problems of non-uniform data distributions in OLTP systems, it is perfectly reasonable to have several variants of the ‘same’ SQL, along the lines of:

select * from tablex where status = 'new'        and col1 = :b1 and col2 between :b2 and :b3;
select * from tablex where status = 'In Transit' and col1 = :b1 and col2 between :b2 and :b3;
select * from tablex where status = 'Delivered'  and col1 = :b1 and col2 between :b2 and :b3;

Of course, these literal values may be concealed from the front-end through views:

select * from tablex_new_view     where col1 = :b1 and col2 between :b2 and :b3;
select * from tablex_transit_view where col1 = :b1 and col2 between :b2 and :b3;
select * from tablex_del_view     where col1 = :b1 and col2 between :b2 and :b3;

Once you know how Oracle works, and where the performance costs are going to be, it’s so much easier to build a high-performance system.


Exercises:

The test case is a simple loop that sends essentially the same statement to the database, introducing a ‘parameter’ change on each iteration. Code the controlling loop in your favourite programming environment to see if this makes any difference.

Test the scale of the issue when using a more complex query – such as selecting ten columns to return four rows from a three-table join where there are two or three indexes on each table.

Check what happens if the test query is changed to an anonymous pl/sql block using the begin/end syntax; repeat using the call syntax.

Philosophical question:

        If I put the test code into a package that I create using a script, am I using a script for testing?

        If I use native compilation on that package, am I still using a script for testing?

        If I switch to an interpreted language (3GL or 4GL) with embedded SQL, am I still using a script for testing?

        If I get a compiler to compile that code, am I still using a script for testing?

        If there happens to be no interpreter for the language I like most and I go straight to compiling, am I still using a script for testing?

        What’s so special about scripts?


Addendum: 15th April.

I’ve had an interesting follow-up to this post:

One reader ran the scripts (using only the snap_my_stats procedure) on his copy of 9i, and then repeated the process on 10g – and got some rather different results. It turns out that there is an optimisation built into pl/sql in 10g that can reduce the work needed to handle native dynamic SQL, and that difference could make you want to change the way you've written some of your code to improve its scalability. I'll probably manage to write a short note about the changes later.

The important point to note, though, is that I have published a test script that someone else used to discover a new option for enhancing their code that they would not have discovered in any other way. Test cases are very useful not just for the moment you write them – but for every upgrade and every new venture. (The need to run the test on 10g and take a note of the results also made me spot the error in my 9i test that I mentioned above – so I’ve also enhanced the test case by adding a snapshot check on x$kghlu to make sure that Oracle hasn’t had to flush any chunks from the shared pool during the test).


Addendum 2: 15th April.

No matter how much you write, there are always more points to cover. The purpose of the article was to show that simple scripts could prove points about Oracle performance. The chosen topic for the “proof of provability” was the impact of using bind variables on the parse costs.

However, as a closing point, I mentioned that parse costs are not the only consideration and that there will always be cases where the judicious use of literals is important. This led to a brief exchange of email with Mike Ault about a point he had recently made about range-based predicates in a forum discussion on the use of bind variables.

Changing code from “literal value” usage to “bind variable” usage generally does have an impact on the optimizer’s arithmetic. For the type of high-precision SQL that you normally get in high-concurrency OLTP systems, the change is often going to be insignificant. However, some types of construct, and some data patterns, are more susceptible to problems than others. In particular, range-based predicates can be more of a problem than equality predicates because if the optimizer sees bind variables (and doesn’t peek into them) in a range-based predicate it uses a fixed selectivity of 5% (unbounded range) or 0.25% (bounded range) in the arithmetic. On top of this, of course, the whole bind-variable / selectivity / costing thing changes with both histograms and bind-variable peeking.
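Those fixed selectivities are easy to turn into numbers. A quick check of the arithmetic (the 5% and 0.25% figures are the ones quoted above; the table size is just an example):

```python
# Cardinality estimates for range predicates on an unpeeked bind variable,
# using the fixed selectivities quoted in the text: 5% for an unbounded
# range (col > :b1) and 0.25% for a bounded range (col between :b1 and :b2).

num_rows = 10000        # example table size

unbounded_estimate = round(num_rows * 0.05)     # col > :b1
bounded_estimate   = round(num_rows * 0.0025)   # col between :b1 and :b2

print(unbounded_estimate, bounded_estimate)     # 500 25
```

Note that 0.25% is just 5% of 5% – the two sides of the bounded range each contribute the unbounded selectivity. If the real data in the range is far more (or less) than 0.25% of the table, the bind-variable version of the query can get a very different plan from the literal version.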

