|Author's name: Mark D Powell
Author's Email: Mark.Powell@eds.com
|Date written: 4th Sept 2001
Oracle version(s): 18.104.22.168
|What's the quickest way of deleting all (or a large fraction of) the data in a table ?|
Back to index of questions
The answer depends on the circumstances, but if you need to delete all the rows in a table then the truncate command is the way to go as this command minimizes the writing of redo and undo (rollback) entries. The table and any indexes have their high water mark, stored in the header block, reset and except for the dictionary or bitmap updates for freed extents beyond the initial extent the job is done. If the table is to be reloaded and will need to reuse all or nearly all of the space it held then the reuse option can be used to eliminate even the need to update the dictionary space management tables or bitmap header for locally managed objects:
truncate table owner.tablename [drop storage| reuse storage]
The drop storage clause is the default and does not need to be provided.
When all the data cannot be deleted then your options are limited by whether you can get exclusive access to the table to perform the task or the table has to be kept available at all times and how your applications use the table. The next set of approaches all involve the same idea: Instead of deleting 90% of the data, extract the 10% of the data that is good to a new table and discard the original.
In the case where the in use applications only need insert access to the target table then if access can be stopped for a very short window the following approach can be used:
When no FK to target and no insert or delete triggers exist then 1- rename target to target_old 2- Re-Create the target table 3- recreate indexes, re-establish PK and UK constraints, and re-apply grants 4- allow access 5- extract good data from target_old and insert it into the recreated target 6- truncate the target_old table 7- drop the target_old table
If there is an insert trigger but no delete trigger or FK referencing the table then the re-insertion of the data needs to happen before the users are allowed access to the table since it is not desirable to have the trigger fire for data it has already processed, but as long as the time to select and insert this data is satisfactory then a create tables as select can be used for the re-create table step. This eliminates a step from the list above, but the ordering of the list above is designed to minimize the time the table needs to be unavailable for use.
If the target table has FK references to other tables then since all existing rows either meet the requirement or already violate it because the novalidate option was used in creating the FK constraints then they can be disabled in those cases where the good data is copied prior to allowing access to the table. This can save a fair amount of work and the constraints should again be re-enabled using the novalidate option to save performing unnecessary work. The existence of FK's referencing the target table complicates matters in that either the necessary deletes in the referencing tables must be done in a separate process allowing the FK to be disabled, or the deletes must take place against the target, and the re-naming/re-creating method cannot be used.
If the table is partitioned then native Oracle parallel DML can be used so that each partition has a delete job dedicated to it. This can greatly reduce clock time at the expense of system resources. For a normal table the delete can be logically partitioned to manually duplicate the parallel delete process be running multiple delete statements, each of which has its where clause limited to a specified set of rowid values. If you take this approach then I recommend that you create the indexes for this table with delete job number of transaction work areas itl, such as initrans 4. This will help to ensure each delete job can obtain an existing itl entry so that they do not have to dynamically allocate any, or worse have to wait because the space to allocate an itl is unavailable. When multiple delete jobs are ran the first job should have an unbounded lower range and the last job an unbounded upper range to ensure no section of the target table is missed. It is probably advisable to generate the rowids rather than hardcode them. Sample code to find the boundary rowids follows at the end of the FAQ. It would be fairly easy to change the dbms_output statements to an insert of a single row into a delete job control table that the delete jobs reference in their where clauses to control the delete ranges.
Still another concept is to partition the table such that all rows to be dropped exist in the same partition. This method requires that the data lend itself to partitioning on the columns that are used to determine that the data can be deleted and this is often not practical. If the data requires global indexes or a few rows within the partition key range end up not being valid for deletion then the benefit of portioning the table to support deletes is lost. Still the method should not be overlooked.
set serveroutput on declare -- counters v_ctr pls_integer := 0 ; v_ct_rows pls_integer := 0 ; v_rowid rowid ; v_rows_per_job pls_integer := 0 ; -- cursor c_get_rowids is select rowid from po_blanket_hist ; -- -- pl/sql table to hold boundry rowids -- type t_rowids is table of varchar2(18) index by binary_integer ; t_rows t_rowids ; I binary_integer ; -- -- get total row count -- begin select count(*) into v_ct_rows from po_blanket_hist ; -- -- In this example we are going to use 4 jobs -- v_rows_per_job := floor(v_ct_rows / 4) ; -- -- While data read, test, save rowid -- open c_get_rowids ; I := 1 ; loop fetch c_get_rowids into v_rowid ; exit when c_get_rowids%notfound ; -- quit when done v_ctr := v_ctr + 1 ; -- count rows returned if v_ctr = v_rows_per_job then t_rows(I) := v_rowid ; -- mod takes longer than using a counter v_ctr := 0 ; I := I + 1 ; end if ; end loop ; -- close c_get_rowids ; -- -- This is where we could save the rowids to a job control row -- dbms_output.put_line('Row Count is '||v_ct_rows) ; dbms_output.put_line('Rows per job '||v_rows_per_job) ; dbms_output.put_line('First Rowid is '||t_rows(1)) ; dbms_output.put_line('Second Rowid is '||t_rows(2)) ; dbms_output.put_line('Third Rowid is '||t_rows(3)) ; dbms_output.put_line('Forth Rowid is '||t_rows(4)) ; end ; / Row Count is 32695051 Rows per job 8173762 First Rowid is AAAAb+AFQAAALHWAAO Second Rowid is AAAAb+AFgAAAMiNAA5 Third Rowid is AAAAb+AGQAAADZnAAL Forth Rowid is AAAAb+AGQAAAKCMABB PL/SQL procedure successfully completed.
Further reading: If you have complex delete logic then the section on views for performance in Jonathan Lewis's book Practical Oracle 8i might be of interest. You can also see Metalink document id 19890.996 and 89799.996, which are forum threads. Oracle support did not offer anything not mentioned in this FAQ.
Back to top
Back to index of questions