Window framing

For nonranking aggregate OLAP functions, you can define a window frame with a window frame clause, which specifies the beginning and end of the window relative to the current row.

<WINDOW FRAME CLAUSE> ::=
  <WINDOW FRAME UNIT>
  <WINDOW FRAME EXTENT>

This OLAP function is computed with respect to the contents of a moving frame rather than the fixed contents of the whole partition. Depending on its definition, the partition has a start row and an end row, and the window frame slides from the starting point to the end of the partition.

Figure 2-3: Three-row moving window with partitioned input

shows current row as part of sliding window grouped by partition A, current and C.

UNBOUNDED PRECEEDING and FOLLOWING

Window frames can be defined by an unbounded aggregation group that either extends back to the beginning of the partition (UNBOUNDED PRECEDING) or extends to the end of the partition (UNBOUNDED FOLLOWING), or both.

UNBOUNDED PRECEDING includes all rows within the partition preceding the current row, which can be specified with either ROWS or RANGE. UNBOUNDED FOLLOWING includes all rows within the partition following the current row, which can be specified with either ROWS or RANGE. See “ROWS” and “RANGE”.

The value FOLLOWING specifies either the range or number of rows following the current row. If ROWS is specified, then the value is a positive integer indicating a number of rows. If RANGE is specified, the window includes any rows that are less than the current row plus the specified numeric value. For the RANGE case, the data type of the windowed value must be comparable to the type of the sort key expression of the ORDER BY clause. There can be only one sort key expression, and the data type of the sort key expression must allow addition.

The value PREDCEEDING specifies either the range or number of rows preceding the current row. If ROWS is specified, then the value is a positive integer indicating a number of rows. If RANGE is specified, the window includes any rows that are less than the current row minus the specified numeric value. For the RANGE case, the data type of the windowed value must be comparable to the type of the sort key expression of the ORDER BY clause. There can be only one sort key expression, and the data type of the sort key expression must allow subtraction. This clause cannot be specified in second bound group if the first bound group is CURRENT ROW or value FOLLOWING.

The combination BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING provides an aggregate over an entire partition, without the need to construct a join to a grouped query. An aggregate over an entire partition is also known as a reporting aggregate.

CURRENT ROW concept

In physical aggregation groups, rows are included or excluded based on their position relative to the current row, by counting adjacent rows. The current row is simply a reference to the next row in a query’s intermediate results. As the current row advances, the window is reevaluated based on the new set of rows that lie within the window. There is no requirement that the current row be included in a window.

If a window frame clause is not specified, the default window frame depends on whether or not a window order clause is specified:

You can also define a window by specifying a window frame unit that is row-based (rows specification) or value-based (range specification).

<WINDOW FRAME UNIT> ::= ROWS | RANGE
<WINDOW FRAME EXTENT> ::= <WINDOW FRAME START> | <WINDOW FRAME BETWEEN>

When a window frame extent specifies BETWEEN, it explicitly provides the beginning and end of a window frame.

If the window frame extent specifies only one of these two values then the other value defaults to CURRENT ROW.


ROWS

The window frame unit, ROWS, defines a window in the specified number of rows before or after the current row, which serves as the reference point that determines the start and end of a window. Each analytical calculation is based on the current row within a partition. To produce determinative results for a window expressed in rows, the ordering expression should be unique.

The reference point for all window frames is the current row. The SQL/OLAP syntax provides mechanisms for defining a row-based window frame as any number of rows preceding or following the current row or preceding and following the current row.

The following list illustrates common examples of a window frame unit:

Row-based window frames In the example in Figure 2-4, rows [1] through [5] represent a partition; each row becomes the current row as the OLAP window frame slides forward. The frame is defined as Between Current Row And 2 Following, so each frame includes a maximum of three rows and a minimum of one row. When the frame reaches the end of the partition, only the current row is included. The shaded areas indicate which rows are excluded from the frame at each step in Figure 2-4.

Figure 2-4: Row-based window frames

Shows current row moving in row-based window frame

The window frame in Figure 2-4 imposes the following rules:

The following diagram applies these rules to a specific set of values, showing the OLAP AVG function that would be calculated for each row. The sliding calculations produce a moving average with an interval of three rows or fewer, depending on which row is the current row:

Shows how sliding calculations produce a moving average with an interval of three rows or fewer, depending on which row is the current row

The following example demonstrates a sliding window:

SELECT dimension, measure,
  AVG(measure) OVER(partition BY dimension
    ORDER BY measure 
    ROWS BETWEEN CURRENT ROW and 2 FOLLOWING) 
    AS olap_avg
FROM ...

The averages are computed as follows:

Similar calculations would be computed for all subsequent partitions in the result set (such as, B, C, and so on).

If there are no rows in the current window, the result is NULL, except for COUNT.


RANGE

Range-based window frames The previous example, Row-based window frames, demonstrates one among many row-based window frame definitions. The SQL/OLAP syntax also supports another kind of window frame whose limits are defined in terms of a value-based—or range-based—set of rows, rather than a specific sequence of rows.

Value-based window frames define rows within a window partition that contain a specific range of numeric values. The OLAP function’s ORDER BY clause defines the numeric column to which the range specification is applied, relative to the current row’s value for that column. The range specification uses the same syntax as the rows specification, but the syntax is interpreted in a different way.

The window frame unit, RANGE, defines a window frame whose contents are determined by finding rows in which the ordering column has values within the specified range of value relative to the current row. This is called a logical offset of a window frame, which you can specify with constants, such as “3 preceding,” or any expression that can be evaluated to a numeric constant. When using a window defined with RANGE, there can be only a single numeric expression in the ORDER BY clause.

NoteORDER BY key must be a numeric data in RANGE window frame

For example, a frame can be defined as the set of rows with year values some number of years preceding or following the current row’s year:

ORDER BY year ASC range BETWEEN CURRENT ROW and 1 PRECEDING

In the above example query, 1 preceding means the current row’s year value minus 1.

This kind of range specification is inclusive. If the current row’s year value is 2000, all rows in the window partition with year values 2000 and 1999 qualify for the frame, regardless of the physical position of those rows in the partition. The rules for including and excluding value-based rows are quite different from the rules applied to row-based frames, which depend entirely on the physical sequence of rows.

Put in the context of an OLAP AVG() calculation, the following partial result set further demonstrates the concept of a value-based window frame. Again, the frame consists of rows that:

Shows range based window frame with row, dimension, year, measure and OLAP average

The following query demonstrates a range-based window definition:

SELECT dimension, year, measure,
  AVG(measure) OVER(PARTITION BY dimension
    ORDER BY year ASC 
    range BETWEEN CURRENT ROW and 1 PRECEDING) 
    as olap_avg
FROM ...

The averages are computed as follows:

Ascending and descending order for value-based frames The ORDER BY clause for an OLAP function with a value-based window frame not only identifies the numeric column on which the range specification is based; it also declares the sort order for the ORDER BY values. The following specification is subject to the sort order that precedes it (ASC or DESC):

RANGE BETWEEN CURRENT ROW AND n FOLLOWING

The specification n FOLLOWING means:

For example, assume that the year column contains four distinct values, from 1999 to 2002. The following table shows the default ascending order of these values on the left and the descending order on the right:

Shows two columns: order by year ascending (1999 to 2002) and order by year descending (2002 to 1999)

If the current row is 1999 and the frame is specified as follows, rows that contain the values 1999 and 1998 (which does not exist in the table) are included in the frame:

ORDER BY year DESC range BETWEEN CURRENT ROW and 1 FOLLOWING

NoteThe sort order of the ORDER BY values is a critical part of the test for qualifying rows in a value-based frame; the numeric values alone do not determine exclusion or inclusion.

Using an unbounded window The following query produces a result set consisting of all of the products accompanied by the total quantity of all products:

SELECT id, description, quantity,
  SUM(quantity) OVER () AS total
FROM products;

Computing deltas between adjacent rows Using two windows—one over the current row and the other over the previous row—provides a direct way of computing deltas, or changes, between adjacent rows.

SELECT EmployeeID, Surname, SUM(salary) OVER (ORDER BY BirthDate rows between current row and current row)AS curr, SUM(Salary) OVER (ORDER BY BirthDate rowsbetween 1 preceding and 1 preceding) AS prev, (curr-prev) as deltaFROM Employees WHERE State IN ('MA', 'AZ', 'CA', 'CO') AND DepartmentID>10ORDER BY EmployeeID, Surname;

The results from the query:

EmployeeID  Surname    curr       prev        delta
------  			---------    --------  ---------  ----------
148			Jordan 				51432.000191			Bertrand 		 		29800.000			 39300.000			 -9500.000278			Melkisetian 			 48500.000				 42300.000			 6200.000299			Overbey 				39300.000			 41700.750			 -2400.750318			Crow 				41700.750			 45000.000			 -3299.250586			Coleman 				42300.000			 46200.000			 -3900.000690			Poitras 				46200.000			 29800.000			 16400.000703			Martinez 				55500.800			 51432.000 	 4068.800949			Savarino 				72300.000			 55500.800			 16799.2001101			Preston				37803.000			 48500.000			 -10697.0001142			Clark 				45000.000			 72300.000 			-27300.000 

Although the window function SUM() is used, the sum contains only the salary value of either the current or previous row because of the way the window is specified. Also, the prev value of the first row in the result is NULL because it has no predecessor; therefore, the delta is NULL as well.

In each of the examples above, the function used with the OVER() clause is the SUM() aggregate function.