User-Defined Aggregate Functions in C/C++

An aggregate function must have a place to store its data between invocations.

For example, if you write your own SUM() function, you must store the subtotal from the previous call and then add the new value from the current call.

Since an aggregate function can be called by many queries at overlapping times, you cannot use local storage space or global variables to store values such as subtotals.

Your aggregate function must work with the server to store data in a place that will persist after the UDF returns from its call. Sybase CEP provides a pair of functions that allow you to store and retrieve data. The function

void C8SetState(C8Udf *ctx, const void *data, C8UInt data_size)

allows you to pass a sequence of bytes to the server and store those bytes. The complementary function

const void *C8GetState(C8Udf *ctx, C8UInt *data_size)

allows your UDF to retrieve bytes that it stored previously. The bytes stored and retrieved are specific to a particular occurrence of a particular function in a particular statement. If your Query Module has the following statements:

...
SELECT aggregate_foo(col1)
FROM Window1
...
SELECT aggregate_foo(col2)
FROM Window2

then each aggregate_foo will have a unique internal identifier that the Server uses as part of the context variable passed to your function. The context variable then allows the server to know which aggregate_foo's bytes to return when the UDF calls

C8GetState(ctx, ...)

The C8GetState() and C8SetState() functions are documented in more detail later in this chapter.
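The per-occurrence behavior described above can be illustrated with a small, self-contained sketch. It does not use the real Sybase CEP headers: MockCtx, mock_set_state(), mock_get_state(), and count_calls() are hypothetical stand-ins invented for this illustration, and they only imitate the copy-and-return behavior that this section attributes to C8SetState() and C8GetState().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the server-side context; NOT the real C8Udf. */
typedef struct MockCtx {
    void  *state;       /* server-owned copy of the UDF's bytes */
    size_t state_size;  /* number of bytes currently stored */
} MockCtx;

/* Imitates the described behavior of C8SetState(): copy the caller's bytes. */
static void mock_set_state(MockCtx *ctx, const void *data, size_t data_size)
{
    void *copy;

    assert(ctx != NULL);
    copy = malloc(data_size);
    assert(copy != NULL || data_size == 0);
    if (data_size > 0) {
        memcpy(copy, data, data_size);
    }
    free(ctx->state);   /* release the old copy only after the new one exists */
    ctx->state = copy;
    ctx->state_size = data_size;
}

/* Imitates the described behavior of C8GetState(): return stored bytes or NULL. */
static const void *mock_get_state(MockCtx *ctx, size_t *data_size)
{
    assert(ctx != NULL && data_size != NULL);
    *data_size = ctx->state_size;
    return ctx->state;
}

/* A toy aggregate that counts its own calls. Its state lives only in the
   context, never in a global or static variable. */
static long count_calls(MockCtx *ctx)
{
    size_t size = 0;
    long count = 0;
    const long *prev = (const long *)mock_get_state(ctx, &size);

    if (prev != NULL && size == sizeof(long)) {
        count = *prev;
    }
    count++;
    mock_set_state(ctx, &count, sizeof count);
    return count;
}
```

If two MockCtx values stand for the two aggregate_foo occurrences in the statements above, calling count_calls() on one of them never changes the count held by the other, because each context carries its own stored bytes.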

An aggregate function operates on a window, so your UDF must take into account not only newly arriving rows but also expiring rows. To do this, your aggregate UDF will usually be called twice each time that a new row arrives: once with the value of the new row and once with the value of the expired row. Inside your aggregate UDF, you distinguish between the new and expired values by calling C8IsPositiveMessage() or C8IsNegativeMessage().

Below is a sample of code that shows typical usage of these aggregate-related functions. The sample shows an aggregate UDF that performs the same work as the pre-defined AVG() function.

...
typedef struct _AvgData {
    C8Int m_sum;
    int m_count;
} AvgData;
...
AvgData initial_data = { 0, 0 };
C8UInt size = 0;
AvgData *data_ptr = NULL;
/* Get state (if any) from previous calls. */
data_ptr = (AvgData*)C8GetState(ctx, &size);
/* If there is no usable state from previous calls, this is
   probably the first invocation, so start from the zeroed
   local variable initial_data. */
if (data_ptr == NULL || (size != sizeof(AvgData))) {
    data_ptr = &initial_data;
}
/* Update the sum and count */
if(C8IsPositiveMessage(ctx)) {
    data_ptr->m_sum += C8GetInt(ctx, 0);
    data_ptr->m_count++;
} else {
    /* Negative message - a row just exited the window */
    data_ptr->m_sum -= C8GetInt(ctx, 0);
    data_ptr->m_count--;
}
/* Set the result */
C8SetOutputFloat(ctx,
    (C8Float)(data_ptr->m_sum) / data_ptr->m_count);
/* Save state */
C8SetState(ctx, data_ptr, sizeof(AvgData));
return;
...

The initial_data variable is allocated as a local, non-static variable. This is the variable that the function uses to store data if this is the first time that the function has been called.

When data is passed to the C8SetState() function, C8SetState() copies that data from the user's local memory (for example, initial_data) to persistent memory that the server allocates. The storage allocated locally (for example, initial_data) does not itself persist after the UDF returns to the caller. Only a copy of the contents persists. The UDF is responsible for deallocating any memory that the UDF allocated, and the Server is responsible for deallocating any memory that the Server allocated. Do not try to deallocate any memory (such as the memory returned by C8GetState) that was allocated by the server.
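The copy semantics and ownership rules above can be sketched without the real server. MockStore, mock_store_set(), mock_store_get(), Counter, and add_to_total() are hypothetical names used only for this illustration. The sketch also shows one possible pattern, not necessarily the one used by the product samples: copy the saved bytes into a local variable, modify the local, and store it again, so that the UDF never writes to or frees server-owned memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the server's persistent storage. */
typedef struct MockStore {
    void  *bytes;  /* owned by the "server", never freed by the UDF */
    size_t size;
} MockStore;

/* Store a private copy of the caller's bytes, as C8SetState() is described. */
static void mock_store_set(MockStore *s, const void *data, size_t n)
{
    void *copy;

    assert(s != NULL);
    copy = malloc(n);
    assert(copy != NULL || n == 0);
    if (n > 0) {
        memcpy(copy, data, n);
    }
    free(s->bytes);  /* the store, not the caller, releases the old copy */
    s->bytes = copy;
    s->size = n;
}

/* Return a read-only view of the stored bytes, or NULL if nothing is stored. */
static const void *mock_store_get(const MockStore *s, size_t *n)
{
    assert(s != NULL && n != NULL);
    *n = s->size;
    return s->bytes;
}

typedef struct Counter {
    long total;
} Counter;

/* Copy in, modify the local copy, copy out. The local variable
   disappears when the function returns; only the stored copy persists. */
static long add_to_total(MockStore *s, long value)
{
    Counter local = { 0 };
    size_t n = 0;
    const void *saved = mock_store_get(s, &n);

    if (saved != NULL && n == sizeof local) {
        memcpy(&local, saved, sizeof local);
    }
    local.total += value;
    mock_store_set(s, &local, sizeof local);
    return local.total;
}
```

In this sketch, changing a local Counter after it has been passed to mock_store_set() has no effect on the stored bytes, which mirrors the statement above that only a copy of the contents persists.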

You must also set the IsAggregator attribute to "true" in the XML file that describes the function.

You may want to look at the sample code for the runningAverage() function, located in the SybaseC8Repository, under examples/FeatureExamples/FunctionsAndOperators/UserDefinedFunctions/src.