CREATE SPLITTER Statement

The Splitter construct is a multi-way filter that sends data to different target streams depending on the filter condition. It works similarly to the ANSI 'case' statement.

Syntax

CREATE [[LOCAL]|OUTPUT] SPLITTER name AS
{ WHEN condition THEN {target_streamname [, …]} } […]
[ ELSE {target_streamname [, …]} ]
SELECT { column_list | * }
FROM source_name [{[alias] [KeepClause]}|{[KeepClause][alias]}]
;

Components

condition

Any expression that results in a 0 or 1.

name

Any string specified to identify the splitter construct. Must be unique within a module or top level project.

target_streamname

Name of a stream or delta stream into which the filtered records are inserted. Must be unique within the module or top level project.

source_name

The source (stream, window, or delta stream) that provides input data on which the splitter logic is applied.

column_list

A set of expressions referring only to the columns in the source stream, constant expressions, constant literals, global variables and functions, or parameters.

Usage

The target streams or delta streams are implicitly defined by the compiler. The schema for the target streams is derived from the column_list specification. All targets are defined as either local or output, depending on the visibility clause defined for the splitter. The default is local. When the splitter has output visibility, output adapters can be attached directly to the splitter targets, even though those targets are implicitly defined.
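
As a minimal sketch of the implicit target definitions, assume a hypothetical input stream Trades with the columns Symbol, Price, and Volume (this schema and the names PriceSplitter, HighPriced, and LowPriced are illustrative and not part of this reference). Because the splitter is declared OUTPUT and projects two columns, the compiler would be expected to define HighPriced and LowPriced as output streams, each with the schema (Symbol, Price):

// Assumed source stream; the schema is hypothetical.
CREATE INPUT STREAM Trades
SCHEMA (Symbol string, Price float, Volume integer);

// HighPriced and LowPriced are not declared anywhere else;
// they are implicitly defined by the splitter.
CREATE OUTPUT SPLITTER PriceSplitter AS
WHEN Trades.Price >= 100.0 THEN HighPriced
ELSE LowPriced
SELECT Trades.Symbol, Trades.Price
FROM Trades;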

Each filter condition in a splitter can have one or more target streams defined. However, each target stream name can appear only once in the target list of a given condition. This makes it possible to send an event down multiple paths in the graph, as shown in the Examples section.
Note: Once a condition evaluates to true, the remaining conditions are neither considered nor evaluated.
The semantics of the splitter are those of a switch statement. When a condition evaluates to true (a non-zero value), the record as projected in the column_list is inserted into the corresponding target streams. If the source is a:
  • Stream, the targets are also streams.
  • Delta stream or window, the targets are delta streams.
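
The switch-like evaluation order can be sketched with overlapping conditions. This example assumes a hypothetical stream Trades with an integer Volume column; the splitter and target names are illustrative. A trade with a Volume of 20000 satisfies both WHEN conditions, but it is inserted only into BlockTrades because the first condition that evaluates to true ends the evaluation:

CREATE SPLITTER VolumeSplitter AS
WHEN Trades.Volume >= 10000 THEN BlockTrades   // checked first
WHEN Trades.Volume >= 1000 THEN LargeTrades    // reached only if the first condition is false
ELSE SmallTrades                               // reached only if no condition is true
SELECT * FROM Trades;

Because Trades is assumed to be a stream here, all three targets are streams.
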
If the source is a window or delta stream, the primary key columns must be copied as-is in the column_list. The other columns can be changed.
Note: When the source is a window or a delta stream, the warning that applies to delta streams also applies to splitters: results are unpredictable if one of the projections contains a non-deterministic expression.
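
A sketch of a splitter on a window, assuming a hypothetical input window TradesWin keyed on TradeId (the schema and all names are illustrative). The key column is projected unchanged, while a non-key column is replaced by a deterministic expression:

CREATE INPUT WINDOW TradesWin
SCHEMA (TradeId integer, Symbol string, Price float, Volume integer)
PRIMARY KEY (TradeId);

// The source is a window, so LargeLots and SmallLots are delta streams.
CREATE SPLITTER LotSplitter AS
WHEN TradesWin.Volume >= 1000 THEN LargeLots
ELSE SmallLots
SELECT TradesWin.TradeId,                       // primary key copied as-is
       TradesWin.Symbol,
       TradesWin.Price * 1.1 AS AdjustedPrice   // non-key column, changed deterministically
FROM TradesWin;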

Local DECLARE blocks cannot be specified on splitters. However, functions, parameters, and variables in the global DECLARE block can be accessed in the conditions or in the column expressions of the projection.
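
A sketch of a global DECLARE block whose parameter is used in a splitter condition. It assumes a hypothetical stream Trades with an integer Volume column; the parameter name LargeVolume, its default value, and the target names are illustrative, and the exact declaration syntax should be checked against the DECLARE statement reference:

// Global DECLARE block; LargeVolume is a hypothetical parameter.
DECLARE
    PARAMETER integer LargeVolume := 1000;
END;

// The condition reads the global parameter; no local DECLARE block is used.
CREATE SPLITTER ThresholdSplitter AS
WHEN Trades.Volume >= LargeVolume THEN AtOrAboveThreshold
ELSE BelowThreshold
SELECT * FROM Trades;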

Examples

Create a Splitter

In the following example, if a trade event arrives where the Symbol is 'IBM' or 'ORCL', the event is directed to both the ProcessHardWareStock and ProcessSoftwareStock streams. If a trade event arrives where the Symbol is 'SAP' or 'MSFT', it is directed to the ProcessSoftwareStock stream only. All other trades are directed to the ProcessOtherStock stream.

CREATE SPLITTER Splitter1 AS
WHEN Trades.Symbol IN ('IBM', 'ORCL') THEN ProcessHardWareStock, ProcessSoftwareStock
WHEN Trades.Symbol IN ('SAP', 'MSFT') THEN ProcessSoftwareStock
ELSE ProcessOtherStock
SELECT * FROM Trades;

Performance Considerations

When there is more than a two-way split, a splitter is typically more efficient, in both CPU utilization and throughput, than an equivalent construct composed of two or more streams that each implement a filter. Unlike other streams in ESP, a splitter and all its target streams run in a single thread. This means that the splitter thread is responsible for distributing data to its dependents.

The Splitter is more efficient than its equivalent multi-threaded logic for these reasons:
  • The performance of a stream is inversely proportional to the amount of data that it needs to distribute to its targets. If a stream has two dependent streams, it needs to distribute twice the amount of data it produces (that is, one copy for each target stream). Similarly, if a stream has five dependents, it needs to distribute five times the data it produces. For example, when three filter streams depend on one source, the source distributes three copies of its data, even though each filter produces only a third of the input as output. With a splitter, the source needs to distribute the data only once, to the splitter, which reduces the load on the source stream.
  • The decrease in CPU utilization comes from not having three separate streams that each process 100% of the input data to produce, for example, a third of it as output. With a splitter, the incoming data is analyzed only once, and typically no more than 100% of the incoming data is distributed to the appropriate target streams when the filter conditions are satisfied.

However, because the splitter is single threaded, its performance advantage degrades quickly when it needs to distribute the same data more than once. This happens, for example, when there is more than one target stream for a filter condition, or when the target streams themselves have many dependents.
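
For comparison, the two constructs discussed in this section might look as follows for a three-way split. This is a sketch that assumes a hypothetical stream Trades with a Symbol column; all stream and splitter names are illustrative, and the two forms are only roughly equivalent (for instance, they may treat a NULL Symbol differently). In the filter form, Trades distributes every event to three dependent streams, and each one evaluates its own condition against all of the input:

// Filter form: three streams, each processing 100% of the input.
CREATE LOCAL STREAM HardwareFiltered AS
SELECT * FROM Trades WHERE Trades.Symbol IN ('IBM', 'ORCL');

CREATE LOCAL STREAM SoftwareFiltered AS
SELECT * FROM Trades WHERE Trades.Symbol IN ('SAP', 'MSFT');

CREATE LOCAL STREAM OtherFiltered AS
SELECT * FROM Trades
WHERE NOT (Trades.Symbol IN ('IBM', 'ORCL', 'SAP', 'MSFT'));

// Splitter form: Trades distributes each event once, to the splitter,
// which evaluates the conditions in order and routes the event to one target.
CREATE SPLITTER SymbolSplitter AS
WHEN Trades.Symbol IN ('IBM', 'ORCL') THEN HardwareSplit
WHEN Trades.Symbol IN ('SAP', 'MSFT') THEN SoftwareSplit
ELSE OtherSplit
SELECT * FROM Trades;

Each condition in the splitter form has exactly one target, which is the case where the single-threaded splitter keeps its advantage.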