CUME_DIST function

The cumulative distribution function, CUME_DIST, is sometimes defined as the inverse of percentile. CUME_DIST computes the normalized position of a specific value relative to the set of values in the window. The range of the function is between 0 and 1.

As the window moves down the input rows, the cumulative distribution is calculated for the expression specified in the window's ORDER BY clause. If the ORDER BY clause includes multiple expressions, the second and subsequent expressions are used to break ties if the first expression has the same value in adjacent rows. NULL values are sorted before any other value (in ascending sequence).

Example 1

The following example returns a result set that provides a cumulative distribution of the salaries of employees who live in California.

SELECT DepartmentID, Surname, Salary,
    CUME_DIST( ) OVER ( PARTITION BY DepartmentID
      ORDER BY Salary DESC ) "Rank"
  FROM Employees 
  WHERE State IN ( 'CA' );

This query returns the following result:

DepartmentID Surname Salary Rank
200 Savarino 72300.00 0.333333333333333
200 Clark 45000.00 0.666666666666667
200 Overbey 39300.00 1
Example 2

The CUME_DIST function provides a simple method to determine the median of a set of values. CUME_DIST can be used to compute the median value successfully in the face of ties and whether the input contains an even or odd number of rows. Essentially, you need only determine the first row with a CUME_DIST value of greater than or equal to 0.5.

The following query returns the product information for the product with the median unit price:

SELECT FIRST *
  FROM ( SELECT Description, Quantity, UnitPrice,
            CUME_DIST( ) OVER ( ORDER BY UnitPrice ASC ) AS CDist
           FROM Products ) As DT
  WHERE CDist >= 0.5
  ORDER BY CDist;

The query returns the following result:

Description Quantity UnitPrice CDist
Wool cap 12 10.00 0.5

For more information about the syntax for the CUME_DIST function, see CUME_DIST function [Ranking].