SQL/OLAP defines several functions that deal with ordered sets. The two inverse distribution functions are PERCENTILE_CONT and PERCENTILE_DISC. These analytical functions take a percentile value as the function argument and operate on a group of data specified in the WITHIN GROUP clause or operate on the entire data set.
These functions return one value per group. For PERCENTILE_DISC (discrete), the data type of the results is the same as the data type of its ORDER BY item specified in the WITHIN GROUP clause. For PERCENTILE_CONT (continuous), the data type of the results is either numeric, if the ORDER BY item in the WITHIN GROUP clause is a numeric, or double, if the ORDER BY item is an integer or floating point.
The inverse distribution analytical functions require a WITHIN GROUP (ORDER BY) clause. For example:
PERCENTILE_CONT ( expression1 ) WITHIN GROUP ( ORDER BY expression2 [ ASC | DESC ] )
The value of expression1 must be a constant of numeric data type and range from 0 to 1 (inclusive). If the argument is NULL, then a “wrong argument for percentile” error is returned. If the argument value is less than 0, or greater than 1, then a “data value out of range” error is returned.
The ORDER BY clause, which must be present, specifies the expression on which the percentile function is performed and the order in which the rows are sorted in each group. This ORDER BY clause is used only within the WITHIN GROUP clause and is not an ORDER BY for the SELECT statement.
The WITHIN GROUP clause distributes the query result into an ordered data set from which the function calculates a result.
The value expression2 is a sort specification that must be a single expression involving a column reference. Multiple expressions are not allowed and no rank analytical functions, set functions, or subqueries are allowed in this sort expression.
The ASC or DESC parameter specifies the ordering sequence as ascending or descending. Ascending order is the default.
Inverse distribution analytical functions are allowed in a subquery, a HAVING clause, a view, or a union. The inverse distribution functions can be used anywhere the simple nonanalytical aggregate functions are used. The inverse distribution functions ignore the NULL value in the data set.
PERCENTILE_CONT example This example uses the PERCENTILE_CONT function to determine the 10th percentile value for car sales in a region using the following data set:
sales region dealer_name ----- --------- ----------- 900 Northeast Boston 800 Northeast Worcester 800 Northeast Providence 700 Northeast Lowell 540 Northeast Natick 500 Northeast New Haven 450 Northeast Hartford 800 Northwest SF 600 Northwest Seattle 500 Northwest Portland 400 Northwest Dublin 500 South Houston 400 South Austin 300 South Dallas 200 South Dover
In the following example query, the SELECT statement contains the PERCENTILE_CONT function:
SELECT region, PERCENTILE_CONT(0.1) WITHIN GROUP ( ORDER BY ProductID DESC ) FROM ViewSalesOrdersSales GROUP BY region;
The result of the SELECT statement lists the 10th percentile value for car sales in a region:
region percentile_cont --------- --------------- Canada 601.0 Central 700.0 Eastern 700.0
South 700.0 Western 700.0
PERCENTILE_DISC example This example uses the PERCENTILE_DISC function to determine the 10th percentile value for car sales in a region, using the following data set:
sales region dealer_name ----- --------- ----------- 900 Northeast Boston 800 Northeast Worcester 800 Northeast Providence 700 Northeast Lowell 540 Northeast Natick 500 Northeast New Haven 450 Northeast Hartford 800 Northwest SF 600 Northwest Seattle 500 Northwest Portland 400 Northwest Dublin 500 South Houston 400 South Austin 300 South Dallas 200 South Dover
In the following query, the SELECT statement contains the PERCENTILE_DISC function:
SELECT region, PERCENTILE_DISC(0.1) WITHIN GROUP (ORDER BY sales DESC ) FROM carSales GROUP BY region;
The result of the SELECT statement lists the 10th percentile value for car sales in each region:
region percentile_cont --------- --------------- Northeast 900 Northwest 800 South 500
For more information about the distribution functions, see “PERCENTILE_CONT function [Analytical]” and “PERCENTILE_DISC function [Analytical],” in Chapter 4, “SQL Functions,” in Reference: Building Blocks, Tables, and Procedures.