my_md5 is a simple function that calculates the MD5 hash value of an input file.
When my_md5 resides within the dynamically linkable library my_shared_lib, the declaration for this example is:
CREATE FUNCTION my_md5(IN arg1 LONG BINARY) // RETURNS VARCHAR(32) // DETERMINISTIC // IGNORE NULL VALUES // EXTERNAL NAME 'my_md5@libudfex'
This declaration says that my_md5 is a simple scalar UDF residing in my_shared_lib with a descriptor routine named describe_my_md5. Since the behavior of a UDF may require more than one actual C/C++ entry point for its implementation, this set of entry points is not directly part of the CREATE FUNCTION syntax. Instead, the CREATE FUNCTION statement EXTERNAL NAME clause identifies a descriptor function for this UDF. A descriptor function, when invoked, returns a descriptor structure that is defined in detail in the next section. That descriptor structure contains the required and optional function pointers that embody the implementation of this UDF.
This declaration also says that my_md5 accepts one LONG BINARY argument and returns an VARCHAR(32) result value.
The declaration states that this function is deterministic. A deterministic function always returns the identical result value when supplied the same input values. This means the result cannot depend on any external information beyond the supplied argument values, or on any side effects from previous invocations. By default, functions are assumed to be deterministic, so the results are the same if this characteristic is omitted from the CREATE statement.
The last piece of this declaration is the IGNORE NULL VALUES characteristic. Nearly all built-in scalar functions return a NULL result value if any of the input arguments are NULL. The IGNORE NULL VALUES states that the my_md5 function follows that convention, and therefore this UDF routine is not actually invoked when either of its input values is NULL. Since RESPECT NULL VALUES is the default for functions, this characteristic must be specified in the declaration for this UDF to get the performance benefits. All functions that may return a non-NULL result given a NULL input value must use the default RESPECT NULL VALUES characteristic.