Declaring an Aggregate UDF

Aggregate UDFs are more powerful and more complex to create than scalar UDFs.

After the UDF code has been written and compiled, create a SQL function that invokes the UDF from the appropriate library file, sending the input data to the UDF.

When implementing an aggregate UDF, you must decide:

  • Whether it will operate only across an entire data set or partition as an online analytical processing (OLAP) -style aggregate, like RANK.

  • Whether it will operate as either a simple aggregate or an OLAP-style aggregate, like SUM.

  • Whether it will operate only as a simple aggregate over an entire group.

The declaration and the definition of an aggregate UDF reflects these usage decisions.

The syntax for creating user-defined aggregate functions is:

aggregate-udf-declaration:
	CREATE AGGREGATE FUNCTION [ owner.]function-name
		  ( [ parameter , ... ] )
	RETURNS data-type
		[ aggregate-routine-characteristics ... ]
	EXTERNAL NAME library-and-entry-point-name-string

parameter:
	param-name data-type [ DEFAULT value ]

aggregate-routine-characteristics:
	  DUPLICATE { SENSITIVE | INSENSITIVE }
		-- is the server allowed to eliminate DISTINCT
	| SQL SECURITY {INVOKER | DEFINER}
	| OVER restrict
	| ORDER order-restrict
		| WINDOW FRAME 
		{ { ALLOWED | REQUIRED } 
			[ window-frame-constraints ... ] 
		  | NOT ALLOWED }
	| ON EMPTY INPUT RETURNS { NULL | VALUE }
  -- Call or skip function on NULL inputs

window-frame-constraints:
	  VALUES { [ NOT ] ALLOWED }
	| CURRENT ROW { REQUIRED | ALLOWED }
	| [ UNBOUNDED ] { PRECEDING | FOLLOWING } restrict

restrict:   { [ NOT ] ALLOWED } | REQUIRED

order-restrict: 
{ NOT ALLOWED | SENSITIVE | INSENSITIVE | REQUIRED

The handling of the return data type, arguments, data types, and default values are identical to that in the scalar UDF definition.

If an aggregate UDF can be used as a simple aggregate, then it can potentially be used with the DISTINCT qualifier. The DUPLICATE clause in the aggregate UDF declaration determines:

  • Whether duplicate values can be considered for elimination before the aggregate UDF is called because the results are sensitive to duplicates (such as for the built-in “COUNT(DISTINCT T.A)”) or,

  • Whether the results are insensitive to the presence of duplicates (such as for “MAX(DISTINCT T.A)”).

The DUPLICATE INSENSITIVE option allows the optimizer to consider removing the duplicates without affecting the result, giving the optimizer the choice on how to execute the query. Write the aggregate UDF to expect duplicates. If duplicate elimination is required, the server performs it before starting the set of _next_value_extfn calls.

Most of the remaining clauses that are not part of the scalar UDF syntax allow you to specify the usages for this function. By default, an aggregate UDF is assumed to be usable as both a simple aggregate and as an OLAP-style aggregate with any kind of window frame.

For an aggregate UDF to be used only as a simple aggregate function, declare it using:

OVER NOT ALLOWED

Any attempt to then use this aggregate as an OLAP-style aggregate generates an error.

For aggregate UDFs that allow or require an OVER clause, the UDF definer can specify restrictions on the presence of the ORDER BY clause within the OVER clause by specifying “ORDER” followed by the restriction type. Window-ordering restriction types:

  • REQUIRED – ORDER BY must be specified and cannot be eliminated.

  • SENSITIVE – ORDER BY may or may not be specified, but cannot be eliminated when specified.

  • INSENSITIVE – ORDER BY may or may not be specified, but the server can do ordering elimination for efficiency.

  • NOT ALLOWED – ORDER BY cannot be specified.

Declare an aggregate UDF that makes sense only as an OLAP-style aggregate over an entire set or partition that has been ordered, like the built-in RANK, with:

OVER REQUIRED 
ORDER REQUIRED 
WINDOW FRAME NOT ALLOWED

Declare an aggregate UDF that makes sense only as an OLAP-style aggregate using the default window frame of UNBOUNDED PRECEDING to CURRENT ROW, with:

OVER REQUIRED
ORDER REQUIRED
WINDOW FRAME ALLOWED 
	RANGE NOT ALLOWED
	UNBOUNDED PRECEDING REQUIRED
	CURRENT ROW REQUIRED
	FOLLOWING NOT ALLOWED

The defaults for the all various options and restriction sets are:

DUPLICATE SENSITIVE
SQL SECURITY DEFINER
OVER ALLOWED
ORDER SENSITIVE
WINDOW FRAME ALLOWED
CURRENT ROW ALLOWED
PRECEDING ALLOWED
UNBOUNDED PRECEDING ALLOWED
FOLLOWING ALLOWED
UNBOUNDED FOLLOWING ALLOWED

The server can be started with a library load path that includes the location of the UDF library. On UNIX variants, this can be done by modifying the LD_LIBRARY_PATH within the start_iq startup script. While LD_LIBRARY_PATH is universal to all UNIX variants, SHLIB_PATH is preferred on HP, and LIB_PATH is preferred on AIX.

On UNIX platforms, the external name specification can contain a fully qualified name, in which case the LD_LIBRARY_PATH is not used. On the Windows platform, a fully qualified name cannot be used and the library search path is defined by the PATH environment variable.

Note: Scalar user-defined functions and user-defined aggregate functions are not supported in updatable cursors.
Related concepts
Context Storage of Aggregate User-Defined Functions
Related tasks
Defining an Aggregate UDF