Analytical functions

Function

Analytical functions include:

Table 4-2 lists the analytical functions and their parameters. Unlike some aggregate functions in Table 4-1, you cannot specify DISTINCT in window functions.

Table 4-2: Analytical functions

Function

Parameters

AVG

( { column-name | numeric-expr } )

COUNT

( * )

COUNT

( { column-name | expression } )

DENSE_RANK

()

GROUPING *

( { GROUPING group-by-expression } )

MAX

( { column-name | expression } )

MIN

( { column-name | expression } )

NTILE

integer )

PERCENT_RANK

()

PERCENTILE_CONT

numeric-expr )

PERCENTILE_DISC

numeric-expr )

RANK

()

ROW_NUMBER

()

STDDEV

( [ ALL ] expression )

STDDEV_POP

( [ ALL ] expression )

STDDEV_SAMP

( [ ALL ] expression )

SUM

( { column-name | expression } )

VAR_POP

( [ ALL ] expression )

VAR_SAMP

( [ ALL ] expression )

VARIANCE

( [ ALL ] expression )

* The OLAP SQL standard allows Grouping() in GROUP BY CUBE, or GROUP BY ROLLUP operations only.

Windowing aggregate function usage

A major feature of the ISO/ANSI SQL extensions for OLAP is a construct called a window. This windowing extension let users divide result sets of a query (or a logical partition of a query) into groups of rows called partitions and determine subsets of rows to aggregate with respect to the current row.

You can use three classes of window functions with a window: ranking functions, the row numbering function, and window aggregate functions.

Windowing extensions specify a window function type over a window name or specification and are applied to partitioned result sets within the scope of a single query expression. A window partition is a subset of rows returned by a query, as defined by one or more columns in a special OVER clause:

OVER (PARTITION BY col1, col2...)

Windowing operations let you establish information such as the ranking of each row within its partition, the distribution of values in rows within a partition, and similar operations. Windowing also lets you compute moving averages and sums on your data, enhancing the ability to evaluate your data and its impact on your operations.

A window partition is a subset of rows returned by a query, as defined by one or more columns in a special OVER() clause:

OVER (PARTITION BY col1, col2...)

Ranking functions usage

The OLAP ranking functions let application developers compose single-statement SQL queries that answer questions such as “Name the top 10 products shipped this year by total sales,” or “Give the top 5% of salespeople who sold orders to at least 15 different companies.” These functions include the ranking functions, RANK(), DENSE_RANK(), PERCENT_RANK(), ROW_NUMBER(), and NTILE().

Rank analytical functions rank items in a group, compute distribution, and divide a result set into a number of groupings. The rank analytical functions, RANK(), DENSE_RANK(), PERCENT_RANK(), ROW_NUMBER(), and NTILE() all require an OVER (ORDER BY) clause. For example:

RANK() OVER ( [PARTITION BY] ORDER BY <expression> 
[ ASC | DESC ] )

The ORDER BY clause specifies the parameter on which ranking is performed and the order in which the rows are sorted in each group. This ORDER BY clause is used only within the OVER clause and is not an ORDER BY for SELECT. No aggregation functions in the rank query ROW are allowed to specify DISTINCT.

NoteThe OVER (ORDER_BY) clause of the ROW_NUMBER() function cannot contain a ROWS or RANGE clause.

The OVER clause indicates that the function operates on a query result set. The result set is the rows that are returned after the FROM, WHERE, GROUP BY, and HAVING clauses have all been evaluated. The OVER clause defines the data set of the rows to include in the computation of the rank analytical function.

The value expression is a sort specification that can be any valid expression involving a column reference, aggregates, or expressions invoking these items.

The ASC or DESC parameter specifies the ordering sequence as ascending or descending. Ascending order is the default.

Rank analytical functions are only allowed in the select list of a SELECT or INSERT statement or in the ORDER BY clause of the SELECT statement. Rank functions can be in a view or a union. You cannot use rank functions in a subquery, a HAVING clause, or in the select list of an UPDATE or DELETE statement. More than one rank analytical function is allowed per query in Sybase IQ 15.2.

Statistical aggregate analytic function usage

Summarize data over a group of rows from the database. The groups are formed using the GROUP BY clause of the SELECT statement. Aggregate functions are allowed only in the select list and in the HAVING and ORDER BY clauses of a SELECT statement. These functions include STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, and VAR_SAMP.

The OLAP functions can be used as a window function with an OVER() clause in a SQL query specification that conceptually creates a moving window over a result set as it is processed.

Distribution functions usage

The inverse distribution analytical functions PERCENTILE_CONT and PERCENTILE_DISC take a percentile value as the function argument and operate on a group of data specified in the WITHIN GROUP clause, or operate on the entire data set. These functions return one value per group. For PERCENTILE_DISC, the data type of the results is the same as the data type of its ORDER BY item specified in the WITHIN GROUP clause. For PERCENTILE_CONT, the data type of the results is either numeric, if the ORDER BY item in the WITHIN GROUP clause is a numeric, or double, if the ORDER BY item is an integer or floating-point.

The inverse distribution analytical functions require a WITHIN GROUP (ORDER BY) clause. For example:

PERCENTILE_CONT ( expression1 ) WITHIN GROUP ( ORDER BY expression2 [ASC | DESC ] )

The value of expression1 must be a constant of numeric data type and range from 0 to 1 (inclusive). If the argument is NULL, then a “wrong argument for percentile” error is returned. If the argument value is less than 0, or greater than 1, then a “data value out of range” error is returned.

The ORDER BY clause, which must be present, specifies the expression on which the percentile function is performed and the order in which the rows are sorted in each group. This ORDER BY clause is used only within the WITHIN GROUP clause and is not an ORDER BY for the SELECT.

The WITHIN GROUP clause distributes the query result into an ordered data set from which the function calculates a result.

The value expression2 is a sort specification that must be a single expression involving a column reference. Multiple expressions are not allowed and no rank analytical functions, set functions, or subqueries are allowed in this sort expression.

The ASC or DESC parameter specifies the ordering sequence as ascending or descending. Ascending order is the default.

Inverse distribution analytical functions are allowed in a subquery, a HAVING clause, a view, or a union. The inverse distribution functions can be used anywhere the simple non analytical aggregate functions are used. The inverse distribution functions ignore the NULL value in the data set.

Inter-row functions usage

The inter-row functions LAG and LEAD enable access to previous values or subsequent values in a data series. These functions provide access to more than one row of a table or partition simultaneously without a self join. The LAG function provides access to a row at a given physical offset prior to the CURRENT ROW in the table or partition. The LEAD function provides access to a row at a given physical offset after the CURRENT ROW in the table or partition. Use the LAG and LEAD functions to create queries such as “What was the stock price two intervals before the current row,” and “What was the stock price one interval after the current row.” See “Inter-row functions usage” in Chapter 2, “Using OLAP,” in the System Administration Guide: Volume 2.

Inter-row functions require an OVER (ORDER_BY) clause. See “Windowing aggregate function usage”.

Compatibility

The ranking and inverse distribution analytical functions are not supported by Adaptive Server Enterprise.

See also

See the individual analytical function descriptions in this chapter for specific details on the use of each function.

See Chapter 2, “Using OLAP,” in the System Administration Guide: Volume 2

For information on analytical function support of the LONG BINARY and LONG VARCHAR data types, see Chapter 9, “Function Support,” in Unstructured Data Analytics in Sybase IQ.