JL Computer Consultancy

db_block_buffers and extents - producing a bad mix (v7)

Prior to Aug 1999

Updated Nov 2009


There are two topics that come up with some frequency on the Oracle newsgroups (comp.databases.oracle.server is the one I read most). The first is whether it is possible to have too many blocks in the db_block_buffer, the second is whether or not there are drawbacks to having large numbers of extents in an object.

In general the answer to both questions, individually, is 'No, not really', although I have to admit that I have a general aversion to a very large number of extents and tend to view 20 as the right ball-park figure.

Having said that, ask yourself when you might you want have a very large value for db_block_buffers ? And why might you want to have a large number of extents in an object ? The answer is 'when you have a very large database'. Then you have to ask yourself what else are you likely to have if you have a very large database ? The Parallel Query Option, and that is when the trouble begins. At this point you might like to go straight to the summary and skip the little story..

A little while ago a DBA asked me why a query was taking 45 seconds to execute in parallel when it took about three seconds to execute serially. The query was a simple probe of a partition view on a 7.3.4 database running on Sun. A check of the access path showed that the same path was being used in both cases, this was the only activity on the database and there was NO physical I/O taking place. Repeating the query with SQL_TRACE on showed that the access path was actually as specified by explain plan, and that the time lost was in the execute phase, not the parse phase (which has been a problem in the past with partition views). I ran my usually 2-stage process for solving such problems:

Step 1, - repeat the query, dumping wait states -

        alter session set events '10046 trace name context forever, level 8';

The trace file listed 557 lines looking like this -

        WAIT #1: nam='rdbms ipc reply' ela= 8 p1=3 p2=21474836 p3=0

Quick calculation time: 557 * 8 / 100 = 44.56. Our time lost was the session waiting for DBWR (process id = 3) to reply to some sort of message. But what sort of dialogue was going on ?

Step 2, a slightly more sophisticated attack; I wheeled in a couple of packaged procedures that take snapshots of v$ objects and calculate differences over time. In this case I fired up the ones for v$session_event and v$sysstat with a 60 seconds snapshot time and re-ran the query. The significant results were:

        v$sysstat              DBWR Checkpoints               557
        v$session_event        rdbms ipc report               557

Spot the co-incidence ? At this point I remembered a call I had made to Oracle support the previous year about a problem with Oracle 8 (possibly 8.0.3), parallel query and parititioned tables when running Oracle Parallel Server. I executed the following SQL statement:

select  sum(extents)
from    dba_segments
where   segment_name in (
               {list of tables in the partition view}
        );

You will not be surprised to learn that the answer was 557.

Extent checkpoints

When a Parallel Query runs in newer versions of Oracle, it has the option to use direct reads, bypassing the SGA completely (there is an exception to this, which will appear in an article on small tables and cache buffering). In general this is a good thing; however to avoid the risk of reading a block from disc which is OLDER than a block in the SGA, all blocks that are likely to be addressed by the query have to be flushed to disc before the query starts. This is achieved by a process called the Extent-based Checkpoint - a call goes to DBWR for every extent that the query may cover, and DBWR hunts out any dirty blocks from that extent in the SGA, and writes them to disc.

This does not usually take much time, but if you have a large number of buffers, DBWR has to work through a large number of hash-chains checking for dirty blocks which have not yet been moved to the dirty list. If there are a lot of extents then the process has to be repeated a large number of times.

In my case, with 32 tables of about 18 extents each, and an SGA of 250,000 blocks each scan took a bare 8/100 of a second, but there were 557 of them to do !!!

4 Possible Solutions:

Move to Oracle 8 - instead of issuing extent based checkpoints, Oracle issues object-based checkpoints, or, as the new statistics in v$sysstat listed below show 'object flush calls'. These will still take time in a large SGA, but there will be far fewer of them. (I still have to check whether the OPS version of 8.0.4/5 issues global checkpoints before running a parallel query, as the side-effect of regular global checkpoints can be totally catastrophic on large OPS databases - given the appearance of the phrase cross instance in the stat names below I suspect that global checkpoints are no longer used).

        DBWR Flush object cross instance calls
        DBWR Flush object call found no dirty buffers

Make as many tablespaces as possible READ ONLY. Oracle knows that an object in a read only tablespace cannot get dirty, and so it will not try to flush any of its extents. In our case this was the solution - the tablespaces were in read write mode for a major rebuild, they had always been in READ ONLY mode on the previous platform, and once the change was made the query time dropped back to a few seconds.

Don't make the db_block_buffers unnecessarily large. If you are using parallel query a lot, then you probably won't be using the buffer for most of your data accesses anyway, and you can probably use the memory more effectively for sorting or hashing.

If you have objects that are regularly involved in parallel queries and you cannot make them read only, and you really need a very large db_block_buffer, then consider rebuilding these objects with a small number of extents.

Footnote:

The same extent-based checkpointing takes place when you drop or truncate an object, and with a large db_block_buffer the checkpointing can take more time than all the updates on the space management tables.

Update (Nov 2009):

Following a delay of many years (at least 10) and a discussion on my blog (http://jonathanlewis.wordpress.com/2009/11/12/no-change/#comment-34834 ) it has been pointed out to me that the segment-level checkpoint disappeared almost immediately and its role was covered by a tablespace checkpoint. The segment level checkpoint only reappeared in 10gR2 when Oracle introduced a new linked list structure in the buffer header structure allowing all the buffers hold data from the same object to be chained together.


Back to Main Index of Topics