ࡱ> ܥhc ea,=#((++++++++++ + +++C+++++++++++++++7,X,O++++++++++++++++++++++K$nk++++++++++Analytic Functions Jonathan Lewis JL Computer Consultancy, UK Summary One of the most interesting, and possibly most exciting, developments to appear in Oracle 8.1 is the Analytic Function set, introduced in version 8.1.6 In earlier versions of Oracle even relatively trivial reports that required selectivity of summarised data could result in an annoying cost in machine, or programmer resources. Analytic Functions may be the key that allows you to produce simple SQL in many cases where you currently need a programmer, an expensive front-end tool, or a heavy-duty machine. In particular, they may make it possible to produce very powerful summary reports for web browsers from very simple code. Life before Analytic Functions. Consider the following data from a sales database representing the sales of computer software from stores scattered around the UK. Store Software Title Sales ------- ---------------------- ------ Glasgow Age of Umpires 72 Glasgow Crash Simulator 1934 Glasgow Dome 482 Glasgow Manic the Gerbil 913 Glasgow Portal Combat 315 London Age of Umpires 110 London Crash Simulator 247 London Dome 2167 London Manic the Gerbil 52 London Portal Combat 1824 We would like to report the two best-selling products by store. Traditionally, there were two main solutions to this problem - and a host of ambiguities. One option was to write a procedural program, based around an SQL statement that sorted the data by Store and Sales. A very bare examplein PL/SQL might look something like the following: declare cursor c1 is select store, title, sales from game_sale order by store, sales desc ; m_last_row c1%rowtype; m_out_ct number := 0; begin for r1 in c1 loop if (m_last_row.store != r1.store) then m_out_ct := 0; dbms_output.new_line; end if; if m_out_ct != 2 then dbms_output.put_line ( r1.store || ' - ' || r1.title || ' - ' || r1.sales ); m_out_ct := m_out_ct + 1; end if; m_last_row := r1; end loop; end; To keep the code short, it does no error trapping (e.g. for ORA-01555 snapshot too old), and it does nothing fancy to format the output. If you want a robust, aesthetically satisfying report you will need to put in a more significant programming error. Before you start, though, think about what happens if two titles tie for second place in Glasgow - this code reports one of them, then skips on to the next store. This is probably not the desired solution did you want to see BOTH of them or neither of them? You need to rethink the complexity of the code. What if you rewrite the code for this case so that titles carry on being reported until the sales drop ? What do you do if two titles tie for first place - did you intend to report a second place at all in this case, or should you stop reporting after the first place? When you describe the requirement carefully the procedural code can become quite complex. An alternative solution is the pure SQL one, code something like the following: Select store, title, sales from game_sale gs1 where 2 > ( select count(*) from game_sale gs2 where gs2.store = gs1.store and gs2.sales > gs1.sales ) order by store, sales desc; In this case with just the simple SQL*Plus front-end, you get a tidy little report, with the option for including totals at the end of the page, at the cost of a short and fairly simple SQL statement. What's the catch ? First, the amount of work done is large - the code looks clean and simple, but we actually access the GAME_SALE table twice in a rather expensive way - for each row in the outer select we issue a subquery which counts the number of rows for the same store where the value of sales is higher than the sales in the current row; if this count is strictly less than two then the row has an interestingly high sales figure and we keep it, otherwise we discard it. Secondly, the query may not do what we want - if two titles tie for first place then every other title will discover those two rows in the subquery so there will be no row reported with a 'second highest' value. Although PL/SQL code can be adjusted to cater for the subtleties of our exact requirement this isn't necessarily the case with pure SQL. The Analytic Solution The Analytic answer from 8.1.6 is as follows: select store, title, sales, in_store_rank from ( select store, title, sales, rank() over ( partition by store order by sales desc ) as in_store_rank from game_sale ) where in_store_rank <= 2 order by store, in_store_rank; STORE TITLE SALES IN_STORE_RANK ---------- -------------------- ------- ------------- Glasgow Crash Simulator 1934 1 Tonic the Gerbil 913 2 London Dome 2167 1 Portal Combat 1824 2 The power of Analytic functions appears in the clause over ( partition by ...) Historically, SQL supplies only a very limited functionality for sorting and grouping data. Oracle Corp. however has now introduced a very compact but flexible extension to SQL that allows you to 'partition' data sets, and apply secondary processing on a partition by partition basis. At the same time, it has introduced a few extra functions which are particularly useful when used in conjunction with this partitioning strategy. In the example I have used the RANK() function, which applies an ordering value to a set of ordered rows. By introducing a partitioning clause with its own 'order by' clause I have stated that I want the data split by STORE - since the partiton order clause is SALES DESC the RANK function is then applied to each store in turn, with the highest value of sales in each store getting the rank() of 1. Once each row has been given a rank within store, I can then select all those rows in first or second place (rank() is 1 or 2) and then list them in order of store and rank. If you are wondering about the problems of ties, I can choose an alternative function, dense_rank(), if I want to see all titles which match the two highest sales figures. The syntax is very compact but the proliferation of brackets and subordinate clauses rapidly becomes confusing, which is why I tend to make use of layers of 'in-line' views when using analytic functions. One layer can be used to summarise the data from a complex query, the next layer to introduce the relevant analytic functions, and the final layer to sort the output and take advantage of the values produced by the analytic functions. There is a lot more to analytic functions than a short note can cover. The presentation will give further examples of the available functions, how to use them, and a couple of hints on best practice. Jonathan Lewis page.  PAGE 1 "88.AppppO I=STZ[\]`auPaP uDPUc">?Ga 3]/YZ    1 H I O c  , 8 > ^ h i } (x$9:?U29FJQeo*-*`a t<=^_`aK @ Normalx]a &@& Heading 110Uc@ Heading 2U^@ Heading 3U(@( Heading 4 <Uc$@$ Heading 5<c(@( Heading 6<V]c"A@"Default Paragraph Font$@$ Normal Indent0+@ Endnote Text @ Header o#$@"$ Footnote TextU"O2" Double IndentOBActionU^ @R Footer 9r &>@b&Title <UckJ@rSubtitle<)@ Page Number*-@* Macro Text ]a O Plain Text]Omacro Tb0@b List BulletG h 4hO Macro TetxtO MacroTextaa"% *aa%!@CTimes New Roman Symbol "Arial1Courier New"P h2Lf2Lfh$Fu /P OChris Wray - Short contractJonathan LewisPre-Installed User #Root Entryeports that required selectivity of summar Fg cost iK$nkograWordDocumentFunctions may be the key that allows ymple SQL in many cases where you cur,eed CompObjxpensive front-end tool, or a heavy-duty machir, they may make it j powSummaryInformationrom very simple code. Li(ic Functions. Considom a  FMicrosoft Word Document MSWordDocWord.Document.69qOh+'0 ,8 ` l x Chris Wray - Short contractJonathan LewisNormalPre-Installed User1Microsoft Word fDocumentSummaryInformation8   FMicrosoft Word Document MSWordDocWord.Document.89qJL Computer Consultancy/ Chris Wray - Short contractor Windows 95@@XC(@|nk@|nku՜.+,0@Hhp x JL Computer Consultancy/ Chris Wray - Short contract