The same data source may be referenced more than once in the FROM clause by using different aliases for each copy of the same data source.
Each mention of the data source under a different alias may include its own KEEP clause, allowing you to base more than one type of window on the same data stream. References to the different aliases in the rest of the query are then treated as references to different data sources although the data sources actually have the same incoming data.
Consider the need to compare the incoming volume for a specific trade with the cumulative volume for the preceding 10 seconds (again, this example period is for illustration with our limited data stream, but this period could be 15 minutes, an hour or longer). You would like to see this comparison every time a new trade arrives for any stock symbol.
Here is an example that joins individual rows of data in the StockTrades stream with a window which is also based on the StockTrades stream:
INSERT INTO VolumeComparison SELECT StockTrades.symbol, StockTrades.volume, SUM(STWindow.volume) FROM StockTrades, StockTrades AS STWindow KEEP 10 SECONDS WHERE StockTrades.symbol = STWindow.symbol GROUP BY StockTrades.symbol;
The VolumeComparison stream has the following schema:
Column |
Datatype |
Description |
---|---|---|
symbol |
String |
The symbol for this trade. |
volume |
Integer |
The volume for this trade. |
vol10sec |
Integer |
The volume over the previous 10 seconds for this stock. |
In order to perform the volume calculation, a window is required on the StockTrades stream with a window policy that retains rows for 10 seconds. Though the two data sources refer to the same stream and the same columns, they are now treated as different and distinct. Rows that enter the StockTrades data stream are also routed to StockTradesWindow where they are retained for 10 seconds.
The GROUP BY clause groups the resulting output by stock symbol. Also, as noted in the Window chapter, it breaks the window into subwindows, each containing the data for one stock symbol.
The WHERE clause defines a join condition that matches the row in the StockTrades stream with the same group of rows in the window.
The SELECT clause calculates the sum of the volumes for the rows in the window and aggregates down to a single output row for each incoming stock trade.
Since the result of the join operation ensures that the symbol from both the StockTrades stream and the window is the same for any given row, it doesn't matter which data source is used as the source of the symbol column, but it is mandatory that one of the sources be chosen to qualify the column name symbol. Since both StockTrades and StockTradesWindow contain the symbol column, a compile error would result if the unqualified name symbol were used in the query.
Here is an illustration of how this self-join query works: