TS_OUTLIER_IDENTIFICATION function [Time Series]

NoteThis function is available only with RAP – The Trading Edition Enterprise.

Function

Detects and determines outliers and simultaneously estimates the model parameters in a time series where the underlying outlier-free series follows a general seasonal or non-seasonal ARMA model.

Syntax

TS_OUTLIER_IDENTIFICATION (timeseries_expression, p_value, q_value, s_value, d_value, [, delta_value[, critical_value]])
OVER (window-spec)

Parameters

timeseries_expression A numeric expression, generally a column name, containing an element in a time series.

p_value An integer containing the p-portion of the autoregressive integrated moving average (ARIMA) (p, 0, q)x(0, d, 0)s model that the outlier free series follows.

q_value An integer containing the q-portion of the ARIMA (p, 0, q)x(0, d, 0)s model that the outlier free series follows.

s_value An integer containing the s-portion of the ARIMA (p, 0, q)x(0, d, 0)s model that the outlier free series follows.

d_value An integer containing the d-portion of the ARIMA (p, 0, q)x(0, d, 0)s model that the outlier free series follows.

delta_value (Optional) A double precision float value containing the dampening effect parameter used in detecting a temporary change outlier. The integer must be greater than 0 and less than 1. The default value is 0.7.

critical_value (Optional) A double-precision float value used as a threshold for outlier detection. The default is 3.0.

window-spec TS_OUTLIER_IDENTIFICATION is an OLAP function requiring an OVER () clause with an unbounded window. This function does not support value-based windows; for example, you cannot use a range specifier in the OVER () clause.

Usage

This function returns an outlier-free time series. TS_OUTLIER_IDENTIFICATION calls the function imsls_d_ts_outlier_identification in the IMSL libraries.

IMSL mapping

The arguments of TS_OUTLIER_IDENTIFICATION map to the IMSL library function imsls_d_ts_outlier_identification as follows:

params = imsls_d_ts_outlier_identification(n_objs, model[], z[], 0);

n_objs Contains the number of rows in the current window frame.

model An array containing the TS_OUTLIER_IDENTIFICATION arguments p_value, s_value, q_value, d_value: model[0] = p_value; model[1] = s_value; model[2] = q_value; model[3] = d_value;

z[] Contains the value of timeseries_expression for the current window frame.

If delta_value is non-null, the arguments of TS_OUTLIER_IDENTIFICATION map to the IMSL library function imsls_d_ts_outlier_identification as follows:

params = imsls_d_ts_outlier_identification(n_objs, model[], z[], IMSL_DELTA, delta_value, 0);

If critical_value is non-null, the arguments of TS_OUTLIER_IDENTIFICATION map to the IMSL library function imsls_d_ts_outlier_identification as follows:

params = imsls_d_ts_outlier_identification(n_objs, model[], z[],		IMSL_CRITICAL, critical_value, 0);

If both delta_value and critical_value are non-null, the arguments of TS_OUTLIER_IDENTIFICATION map to the IMSL library function imsls_d_ts_outlier_identification as follows:

params = imsls_d_ts_outlier_identification(n_objs, model[], z[], 		IMSL_DELTA, delta_value, IMSL_CRITICAL, critical_value, 0);

For detailed information on how the IMSL function imsls_d_ts_outlier_identification performs time series calculations, see IMSL Numerical Library User’s Guide: Volume 2 of 2 C Stat Library.

Example

This example shows an input data table, a SQL statement containing the TS_OUTLIER_IDENTIFICATION function, and the data values returned by the function. This example uses the following table (called DATASET) as its input data. The DATASET table contains 50 rows of time series data:

Table 4-57: Input data table DATASET

rownum

data

1

0.315523

2

0.485859

3

0.676886

4

1.97381

5

2.77555

6

2.73657

7

2.64233

8

4.26118

9

3.13641

10

4.16566

11

2.95952

12

2.14504

13

1.98799

14

0.805859

15

0.833405

16

2.29075

17

1.30045

18

0.467122

19

-0.170107

20

-0.256657

21

-0.382597

22

-0.505511

23

-1.90147

24

-0.981688

25

-1.43116

26

-1.39389

27

-2.34823

28

-2.91122

29

-0.927423

30

-0.044383

31

-0.389648

32

0.545008

33

0.614096

34

0.364668

35

1.16043

36

-0.654063

37

0.616094

38

2.00875

39

1.86696

40

2.80171

41

3.78422

42

4.11499

43

2.77188

44

4.00312

45

4.21298

46

5.00413

47

4.74498

48

4.89621

49

3.93273

50

4.31592

The following SQL statement detects and determines outliers on data from the data column:

select ts_outlier_identification(data,1,1,1,1,0.7,3.0) over (order by rownum rows between unbounded preceding and unbounded following) as res FROM DATASET

Sybase IQ returns 50 rows:

Table 4-58: Values returned from ts_outlier_identification

res

0.315523

0.485859

0.676886

1.97381

2.77555

2.73657

2.64233

4.26118

3.13641

4.16566

2.95952

2.14504

1.98799

0.805859

0.833405

2.29075

1.30045

0.467122

-0.170107

-0.256657

-0.382597

-0.505511

-1.90147

-0.981688

-1.43116

-1.39389

-2.34823

-2.91122

-0.927423

-0.044383

-0.389648

0.545008

0.614096

0.364668

1.16043

-0.654063

0.616094

2.00875

1.86696

2.80171

3.78422

4.11499

2.77188

4.00312

4.21298

5.00413

4.74498

4.89621

3.93273

4.31592

Standards and compatibility

See also

Chapter 2, “Using OLAP” in the System Administration Guide: Volume 2

IMSL Numerical Library User’s Guide: Volume 2 of 2 C Stat Library