Faking it (2nd Dec 2005)
Histogram SQL (13th Sept 2012)
Typo (2nd Dec 2008)
Page 168: paragraph 2: I say:
The maximum cumulative value in a frequency histogram – which appears in the column endpoint_number – should match the number of rows in the table minus the number of nulls in the column. If it doesn’t, then Oracle adjusts the cardinality calcIulation accordingly.
However, a mismatch between the column histograms and the table statistics can happen without any faking at all. All you have to do is use a sample to generate a histogram (or even the table level and column histograms simultaneously). The figures stored in the database for the histogram reflect the sample size exactly, but the figures stored in user_tables as the num_rows and in user_tab_columns as the num_nulls are scaled up to offset the sample size.
So (for example) if you sample at 5%, you may find that the highest count in the frequency histogram is 100,000, whilst the number of rows reported for the table is 2,000,000 (which is 100,000 * 1/0.05).
I will be expanding this note, with a worked example in the code depot, some time in the future.
Page 156: There is a complex SQL statement from a 9i trace file showing the work done by Oracle to derive the data needed to populate a histogram. The subsequent text then explains the meaning of some of the columns. While doing some work recently with 184.108.40.206 I traced the work done by Oracle in creating histograms and found that the SQL had been completely re-written. Notes on the changes are on my blog at: Histogram Generation « Oracle Scratchpad
Page 165: Table 7-1: Row 8, Column 2 (Description):
I’ve used the Predicate “skew = -1” in this row, with the description “Below high value”; this should be “Below low value”.