This section describes how to create an inquiry stream, which receives a row whenever an inquiry is made. Each row that arrives in this stream can be joined to a window that holds the desired data.
Here is an example of a simple join query that publishes the most recent trade for a stock whenever an inquiry is made for that stock symbol:
INSERT INTO TradeResults
SELECT StockTrades.*
FROM TradeInquiry, StockTrades KEEP LAST
WHERE StockTrades.symbol = TradeInquiry.symbol
GROUP BY StockTrades.symbol;
The TradeInquiry stream is a single column stream that receives a row with a stock symbol whenever someone needs to view the latest trade for that symbol. It can be thought of as a simple stream coming from a terminal, web page or other application. Here is the schema for the TradeInquiry stream:
The StockTrades stream is directed into a KEEP LAST window which, because of the GROUP BY clause, keeps the last row for each symbol. (Without the GROUP BY clause, only the most recent trade for the entire StockTrades stream would be kept.)
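The difference between the grouped and ungrouped KEEP LAST window can be sketched in Python. This is only a model of the behavior described above, not CCL and not part of any product API; the class `KeepLastWindow`, its `group_key` parameter, and the sample trade rows are hypothetical.

```python
# Hypothetical model of a KEEP LAST window (illustration only, not CCL).
class KeepLastWindow:
    def __init__(self, group_key=None):
        # group_key=None models a window without GROUP BY: one row in total.
        self.group_key = group_key
        self.rows = {}

    def insert(self, row):
        # A new row replaces the previous row that has the same group value.
        key = row[self.group_key] if self.group_key is not None else None
        self.rows[key] = row

    def contents(self):
        return list(self.rows.values())


# Made-up sample trades, in arrival order.
trades = [
    {"tradeid": 1, "symbol": "AAPL", "price": 100.0},
    {"tradeid": 2, "symbol": "IBM", "price": 50.0},
    {"tradeid": 3, "symbol": "AAPL", "price": 101.0},
]

grouped = KeepLastWindow(group_key="symbol")  # like KEEP LAST with GROUP BY
ungrouped = KeepLastWindow()                  # like KEEP LAST alone
for trade in trades:
    grouped.insert(trade)
    ungrouped.insert(trade)

grouped_ids = sorted(row["tradeid"] for row in grouped.contents())
ungrouped_ids = [row["tradeid"] for row in ungrouped.contents()]
print(grouped_ids, ungrouped_ids)
```

With grouping, the window holds one row per symbol (trades 2 and 3 here); without it, only the single most recent trade (trade 3) remains.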
The output stream schema for TradeResults is the same as the schema for StockTrades. The use of the qualified term StockTrades.* puts all of the columns from StockTrades, but not the symbol column from TradeInquiry, into the output stream.
Here is an illustration of how this simple join query works:
Here are some important things to notice about this simple join query:
The FROM clause of the query lists the data sources used for the join. Multiple data sources in a join are separated by commas in the FROM clause.
When two or more data sources are used in a join query, only one of the sources can be a stream. The other sources must be windows.
This is a very important point to understand. A query can only process the most recent row from a stream, and only at precisely the moment when the row enters the stream. Since rows from different streams will almost never "exist" in a query at the same moment in time, it is virtually impossible to join two streams. Only one of the streams will have data at any given moment, so there will be nothing to join to in the other stream. Fortunately, a window can easily be used to maintain the state of the data from a stream so that when it needs to be joined to another stream, there will be relevant data in both the stream and the window to accomplish the join operation.
The rows in the stream and the window need to be matched up using a join condition. This is provided in the WHERE clause (WHERE StockTrades.symbol = TradeInquiry.symbol). The join condition in this query makes sure that the incoming row in TradeInquiry is matched only to rows in the StockTrades window with the same symbol.
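As a rough sketch of what the join condition does, the Python below matches one incoming inquiry row against the rows currently held in the window. The function name `join_inquiry` and the sample rows are hypothetical, and the sketch assumes ordinary inner-join behavior, so an inquiry with no matching window row produces no output.

```python
# Illustration only: models WHERE StockTrades.symbol = TradeInquiry.symbol.
def join_inquiry(inquiry_row, window_rows):
    """Return the window rows whose symbol matches the incoming inquiry."""
    return [
        dict(trade)  # copy, mirroring SELECT StockTrades.*
        for trade in window_rows
        if trade["symbol"] == inquiry_row["symbol"]
    ]


# Made-up window contents: the last trade kept for each symbol.
window_rows = [
    {"tradeid": 11, "symbol": "AAPL", "price": 101.0},
    {"tradeid": 12, "symbol": "IBM", "price": 50.0},
]

aapl_result = join_inquiry({"symbol": "AAPL"}, window_rows)
unknown_result = join_inquiry({"symbol": "XYZ"}, window_rows)
print(aapl_result, unknown_result)
```

Only the AAPL row is returned for the AAPL inquiry; the IBM row stays in the window untouched, and the inquiry for a symbol with no trades returns nothing.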
Notice that at 07:15:19, the row with tradeid 5009 is inserted into the TradeResults stream. At 07:15:20, the row with tradeid 5008 is inserted into the TradeResults stream, even though it appeared in the StockTrades stream prior to tradeid 5009. The effects of the window can result in an output stream whose rows appear in a completely different sequence from the rows in any of the data sources.
Also notice that at 07:15:25, the row with tradeid 5009 is repeated in the TradeResults stream. Since no other trade for AAPL had occurred when the next inquiry for that stock arrived, the query simply publishes the same row it published for the previous AAPL inquiry at 07:15:19.
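The reordering and the repeated row can be reproduced with a small Python simulation. The inquiry times and the trade IDs 5008 and 5009 follow the description above; the trade arrival times and the symbol for trade 5008 are not given in the text and are assumed here purely for illustration.

```python
# Illustration only: an event-by-event model of the stream/window join.
events = [
    # (time, kind, row); trade times and the IBM symbol are assumptions.
    ("07:15:10", "trade", {"tradeid": 5008, "symbol": "IBM"}),
    ("07:15:12", "trade", {"tradeid": 5009, "symbol": "AAPL"}),
    ("07:15:19", "inquiry", {"symbol": "AAPL"}),
    ("07:15:20", "inquiry", {"symbol": "IBM"}),
    ("07:15:25", "inquiry", {"symbol": "AAPL"}),
]

last_trade = {}      # models the KEEP LAST window, one row per symbol
published_ids = []   # trade IDs in the order they reach TradeResults

for time, kind, row in events:
    if kind == "trade":
        # A trade only updates the window; it produces no output by itself.
        last_trade[row["symbol"]] = row
    elif row["symbol"] in last_trade:
        # An inquiry publishes whatever the window holds for that symbol.
        published_ids.append(last_trade[row["symbol"]]["tradeid"])

print(published_ids)
```

Output is driven by the inquiries rather than by the trades, which is why 5009 is published before 5008 and then published again when the second AAPL inquiry arrives.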
Because both the TradeInquiry and StockTrades schemas have a column called symbol, it is incorrect to use the column name symbol by itself. The CCL compiler would not know which column named symbol in the query to use. Instead, the query must use qualified names TradeInquiry.symbol and StockTrades.symbol. Preceding the column name with the name of the stream or window (separated from the column name by a period) makes the reference to the different symbol columns unambiguous.
There can be only one row timestamp for any row. Since a join merges two or more rows into a single row, the row timestamp associated with the output row of a stream/window join is taken from the row in the stream.
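A minimal sketch of this timestamp rule in Python follows. The row layout (a dictionary with `timestamp` and `columns` keys), the function `join_rows`, and the trade's timestamp and price are hypothetical and chosen only to make the rule visible.

```python
# Illustration only: the joined row takes its timestamp from the stream row.
def join_rows(stream_row, window_row):
    return {
        "timestamp": stream_row["timestamp"],   # from the stream (inquiry)
        "columns": dict(window_row["columns"]),  # data from the window row
    }


# Assumed values: the trade's own timestamp is earlier than the inquiry's.
inquiry = {"timestamp": "07:15:19", "columns": {"symbol": "AAPL"}}
stored_trade = {
    "timestamp": "07:15:12",
    "columns": {"tradeid": 5009, "symbol": "AAPL", "price": 101.0},
}

output_row = join_rows(inquiry, stored_trade)
print(output_row["timestamp"])
```

The trade's original timestamp is not carried into the output row; the result is stamped with the time of the inquiry that triggered it.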