JL Computer Consultancy

Are partition tables always better than partition views ?

April 1999


The partition tables introduced in 8.0 are in many ways superior to the partition views of version 7.3, especially in very large database systems where the overhead of the large numbers of dictionary row cache entries needed to define a partition view can be very high, and the parse time can be exorbitant.

There is, however, one area where partition tables have clearly taken a step backwards in functionality. This is because the partition key is one-dimensional (even though it can be multi-columned).

Imagine a pan-European application, with large amounts of time-related data. For convenience and efficiency I would like to partition the data both by month and by country so that I can get maximum benefit from a query like:

        select .....
        from 
               partition_table
        where
               country = 'Germany'
        and     month = 199901
        and     product = 'Widget';

Assuming I want to keep a rolling 12 months of data, and since there happen to be 12 countries in the EU (at the time of writing), I can create a set of 144 data segments one for each combination of month and country.

With partition views, I can create these as a set of 144 separate tables with a constraint on each of the form:

        check (country = 'France' and month = 199906)

My example query will pick exactly the one table for the query. If I choose to examine widget sales for the whole year in Germany the suitably modified query would select exactly the 12 tables needed; if I choose to examine widget sales for the whole European Union in January again the suitably modified query would select exactly the 12 tables needed.

But what happens if I try to model this with a partition table: I have to create a partition key using syntax of the form:

        partition by range (month, country) (
        /*      199804  */
               partition b199804 values less than (199804,'Belgium#')
               partition d199804 values less than (199804,Denmark#')
               ...
               partition p199804 values less than (199804'Portugal#')
               partition s199804 values less than (199804,'Spain#')
        /*      199805  */
               partition b199805 values less than (199805,'Belgium#')
               partition d199805 values less than (199805,'Denmark#')
               ...
               partition p199805 values less than (199805,'Portugal#')
               partition s199805 values less than (199805,'Spain#')
               ...............
        /*      199903  */
               partition b199903 values less than (199903,'Belgium#')
               partition d199903 values less than (199903,'Denmark#')
               ...
               partition p199903 values less than (199903,'Portugal#')
               partition s199903 values less than (199903,'Spain#')
        )

alternatively I could do this the other way round (which leads to a need for 156 partitions, see below) -

        partition by range (country, month) (
        /*      Belgium */
               partition b199804 values less than ('Belgium',199804.1)
               partition b199905 values less than ('Belgium',199903.1)
               ......
               partition b199903 values less than ('Belgium',199903.1)
               partition b199904 values less than ('Belgium',maxvalues)
        /*      Denmark */
                partition d199804 values less than ('Denmark',199804.1)
               ......
               partition d199903 values less than ('Denmark',199903.1)
               partition d199904 values less than ('Denmark',maxvalues)
               .......
        /*      Spain   */
               partition s199804 values less than ('Spain',199804.1)
               ......
               partition s199903 values less than ('Spain',199903.1)
               partition s199904 values less than ('Spain',maxvalues)
        )

If by this point you are struggling to keep up, you've already discovered one of the painful features of partition tables - they are designed for handling continuously variable values, NOT for handling discrete values like a list of country names or month numbers.

Since partitions are defined by an open-ended upper bound our first option requires us to specify the limiting value of 'a little bit more than "Belgium"' if we want 'Belgium' to appear in the partition, hence 'Belgium#' as the tail of the partition key; similarly with our second option, because we want to store 199901 in a given partition we have to put the partition boundary a little higher than the value.

The mental gymnastics needed to deal with multi-column partitioning, and discrete value partitioning are not my biggest complaint though - I am more concerned with the loss of performance benefit due to the restricted partition elimination.

If I run my original 'Germany and 199901' query both partitioning strategies will give me elimination of all but the one critical partition. But what if my query is for 'Germany for the whole year' or 'The whole EU for 199901'.

The following table matches query condition against number of partitions accessed -

 

country= 'Germany'

month = 199901

Partition by country then month

13

156

Partition by month then country

144

12

In other words, whatever I do, one of my clsses of queries is always going to fail to eliminate partitions.

The problem is that partition keys behave just like multi-column indexes - if the leading columns of the key do not appear in the query's WHERE clause then the partition key cannot be used at all for partition elimination. There are ways to reduce the performance cost of this excessive partition access, but they all revolve around trading disk space and processing time to achieve a result that is already trivially available in partition views.

Back to Main Index of Topics

Adding partitions in the middle of a partitioned table.

Given my requirement for a rolling 12 months the least worst implementation in my case is likely to be the (month, country) partition key. However if the most commonly used queries includes the country but excludes the month then the performance demands of the end-users may require me to build the partition key in the order (country, month).

In this case, my 144 partitions has gone up to 156 - Why ? To make the administration easier and the cost cheaper when the time comes to roll in a new month.

The extra 12 partitions are always defined at the 'top-end' of each country, and cover the appearance of data which should not be there e.g.

        partition b199903 values less than ('Belgium',199903.1)
        partition b199904 values less than ('Belgium',maxvalues)

The highest dated data for Belgium is supposed to be 199903, so there should never be any data in the partition for ('Belgium', maxvalues). The reason for having this spare partition here is because we will eventually want to add data for 199904 without causing a lot of down-time due to partition invalidation. We do this by splitting the b199904 partition into 2, one for ('Belgium',19904.1) and the other, renamed to b199905, for ('Belgium',maxvalues). Since this partition will be empty when we split it the time taken by Oracle to do the actual split and revalidation will be very small.

Of course, if we were able to partition by (month, country) the process of rolling forward one month would simply require us to add 12 partitions (in the correct order) to the top of the table, and drop 12 partitions (truncating them first perhaps) in the correct order from the bottom of the table.


Back to Main Index of Topics