Basic CCL

Questions and answers to basic questions about CCL queries. Discusses what CCL queries loos like, filter queries, time-based windows, row-based windows, syntax rules, and so on.

Questions


How is CCL different from SQL?

First and foremost, there is a difference in processing models: SQL queries are request-response queries and operate on tables. CCL queries are registered and continuously operate on data streams.

CCL also provides several extensions to SQL. These include an ability to abstract queries into hierarchies of parameterizable modules and projects, many kinds of windows that can be applied to data streams, OUTPUT EVERY / OUTPUT AFTER clauses, the ability to detect patterns of rows that occur or don't occur in one or more specified data streams, and several other extensions.

What does the simplest CCL query look like?

The following query takes all messages from InputStream and inserts them into OutputStream:

INSERT INTO OutputStream 
SELECT * 
FROM InputStream;

This is a continuous query. It continuously waits for data in InputStream. As soon as the data arrives, it is inserted into the OutputStream.

What does a simple filter query look like?

Here is an example of a simple filter:

INSERT INTO OutputStream 
SELECT * 
FROM InputStream 
WHERE InputStream.Amount > 1000;

This query passes only those messages to OutputStream where the Amount column of InputStream has a value greater than 1000.

What is a time-based window and how can it be used?

Simple filter queries are stateless, in other words, they do not store any state. However, many queries need to store a certain state, such as a portion of the data stream or a value of an aggregator. These queries are defined in terms of the data window on a particular stream.

For example, suppose that a stock trading application needs to keep track of the highest price of a stock in the past 15 minutes. Here is a query to do that:

INSERT INTO MaxPrice 
SELECT Trades.Symbol, MAX(Trades.Price) 
FROM Trades KEEP 15 MINUTES 
WHERE Trades.Symbol = 'IBM';

Note the FROM Trades KEEP 15 MINUTES part of the query. This defines a sliding 15 minute window over the Trades stream. This query produces output on every incoming message in Trades, as long as the value of the Symbol column is "IBM". The output does not include the current price of IBM, but the highest price of IBM within the 15 minute window.

Sybase CEP Engine keeps just enough data in memory to be able to produce the maximum value, and then discards old messages as they exit the window.

See also the discussion of rows-based windows, below, for more information about windows.

What is a rows-based window?

Suppose that you want to write a query similar to the one in the previous question, but instead of learning the highest price over the past 15 minutes, you want to know the price over the last fifteen trades. Then instead of using a time-based window, you would use a rows-based window:

INSERT INTO MaxPrice 
SELECT Trades.Symbol, MAX(Trades.Price) 
FROM Trades KEEP 15 ROWS 
WHERE Trades.Symbol = 'IBM';

Rows-based windows are very versatile. For advanced features of rows-based windows, see the Sybase CEP Frequently Asked Questions (Advanced) document or the Sybase CEP Programmer's Guide.

Can I use GROUP BY to process an incoming data stream, as if it were split into several streams, based on a specified attribute?

Yes, this is a very powerful feature of CCL. For example, previous questions keep the highest price of IBM over some time period. But what if we want to keep the price of all stocks in the stream over the past 15 minutes? This is where GROUP BY may be used. Here is an example:

INSERT INTO MaxPrice 
SELECT Trades.Symbol, MAX(Trades.Price) 
FROM Trades KEEP 15 MINUTES 
GROUP BY Trades.Symbol;

The GROUP BY clause here tells CCL that this query should keep separate states for all trades with different symbols. Logically, the stream is split into a set of streams according to the symbol. The resulting streams are merged back after processing, so that the output stream is again one stream that lists the maximum value of each stock over the last 15 minutes. There is one output value in MaxPrice for every input value in Trades. GROUP BY is quite powerful and can be used to split streams by IP address, Employee ID, Event ID, RFID Tag ID, Reader ID, or any other column.

How do I use the OUTPUT EVERY clause to produce output over regular time periods?

In general, CCL queries produce results whenever there are new messages in the input stream (see Sybase CEP Frequently Asked Questions (Advanced) for more details). However, sometimes it is a good idea to stabilize the stream: that is to produce data at predetermined time intervals, for example, every minute. OUTPUT EVERY is designed for this. Look at the following query:

INSERT INTO MaxPrice 
SELECT Trades.Symbol, Trades.Price, 
MAX(Trades.Price) 
FROM Trades KEEP 15 MINUTES 
GROUP BY Trades.Symbol 
OUTPUT EVERY 1 MINUTE;

This query will produce output once a minute, for those stocks where there was activity during that minute. To see outputs of the last value for stocks that had no activity, use the OUTPUT ALL EVERY clause.

How do I use the OUTPUT AFTER clause to delay output?

Sometimes it is necessary to delay all output from a query by some time period. For example, this is useful when setting up a timer based on each message. OUTPUT AFTER is used in this case. Here is an example:

INSERT INTO MaxPrice 
SELECT Trades.Symbol, Trades.Price, MAX(Trades.Price) 
FROM Trades 
KEEP 5 MINUTES 
GROUP BY Trades.Symbol 
OUTPUT AFTER 15 MINUTES; 

OUTPUT AFTER delays the entire stream, or shifts the time series by a specified time period.

Are there joins in CCL?

There are several kinds of joins in CCL. There is a join between a stream and one or more windows, and also a join between multiple windows. CCL also supports both inner and outer joins. Finally, there are joins between a stream and data coming from a traditional relational database.

What is a join between a stream and a window?

Suppose that every time an event happens in stream S1, you want to know whether an event also happened in stream S2 within the past 30 minutes, in all cases where the Id column in S1 and the Id column in S2 have the same value. This is the simplest illustration of a join between a stream and a window. It looks like this:

INSERT INTO Result 
SELECT * 
FROM
S1,
S2 KEEP 30 MINUTES 
WHERE
S1.Id = S2.Id;

Just as with a database join, this join performs a cartesian product between S1 and S2, subject to the join condition S1.id=S2.id . Unlike a database join, however, this join is continuous, in other words, it continuously publishes new rows into the Result stream as rows arrive in S1. The join remembers 30 minutes worth of rows in S2, so that it can compare incoming S1 values against stored S2 values.

In the case of stream-window joins, only rows arriving in the stream, in this case S1, trigger new output rows. Rows arriving in S2 do not result in new output rows. See the Sybase CEP Frequently Asked Questions (Advanced) document for more information on this subject.

What is a join between two windows?

Suppose that you have streams S1 and S2 (just like in the previous example) but whenever an event occurs in either stream, you want to know whether a corresponding event occurred in the other stream. In this case, you can join two windows:

INSERT INTO Result 
SELECT * 
FROM
S1 KEEP 30 MINUTES,
S2 KEEP 30 MINUTES 
WHERE
S1.id = S2.id;

This is the simplest case of a join between two windows. The windows may be of different kinds, durations, and so on. The WHERE condition used to filter query results may be arbitrarily complex.

How do I do event pattern matching with CCL?

CCL comes with a powerful event pattern matching syntax, introduced with the keyword MATCHING. For example, suppose that you have streams S1 and S2, and you want to be notified if two related events occur in S1 within a ten-second interval, but no event occurs in S2 during this time. Here is a query that watches for this pattern. In this example, the relationship between S1 and S2 is established based on column Id:

INSERT INTO Alerts 
SELECT * 
FROM
S1 AS a,
S1 AS b,
S2 AS c 
MATCHING [10 SECONDS: a, b, !c] 
ON a.Id = b.Id = c.id;

Note several aspects of this query:


  • The syntax of the query is somewhat similar to a join, but the windows are defined implicitly by the MATCHING clause (which is set to ten (10) seconds, in this case).

  • The "sequence" operator, designated by a comma (,) is used to denote one event happening after another in the stated order.

  • The "not" operator, designated by an exclamation point (!) is used to specify a non-event, in other words, an event that does not occur within the specified time period.

  • The ON clause is used to provide a condition that relates the events.

Just as with a join, a WHERE clause may also be used to provide extra conditions.

How do I use data from traditional databases, like Oracle or MySQL, in CCL queries?

CCL lets you embed SQL statements into CCL queries. SQL queries are executed on demand against the target database, and the results are cached in memory, if necessary. Here is a sample query to illustrate a database subquery:

INSERT INTO OutHistoricalComparison 
SELECT InTrades.Symbol, InTrades.Price, DbResult.ClosingPrice 
FROM
  InTrades,
  (DATABASE 'OracleDb'
   SCHEMA '../schemas/closing-price.ccs'
   [[SELECT closing_price
     FROM Price_history ph
     WHERE ph.Symbol = ?InTrades.Symbol
     AND ph.closing_date = current_date-1]]
  ) as DbResult 
;

This query is a join (or a correlated subquery) that joins the InTrades stream with an Oracle table called Price_history. Note the following characteristics of this query:


  • The Oracle SQL query is enclosed in double brackets ([[...]]) to clearly separate it from the CCL query. This syntax is required.

  • The SQL query refers to data from the stream column InTrades.Symbol to perform the join (note the question mark to indicate data from the Sybase CEP data stream).

  • The database connection 'OracleDb' must first be defined in a configuration file called c8-services.xml before being used in the database subquery.

  • Data from the database will be cached, according to a cache policy specified in the same file.

How do I learn more about CCL and Sybase CEP Engine?

For more information, please see the following documents:


  • Sybase CEP Getting Started tutorial.

  • Sybase CEP Frequently Asked Questions (Advanced).

  • Sybase CEP Studio Guide.

  • Sybase CEP Continuous Computation Language Programmer's Guide.

  • Sybase CEP CCL Reference.

  • Sybase CEP Integration Guide.

  • Sybase CEP Examples Guide.