Detecting a sequence of events: Exercise

Detect patterns of an arriving row.

With Sybase CEP Engine, an event is an arriving row. In this tutorial, you will use a project to monitor for fraud by detecting patterns of rows representing incoming stock trade requests from customers and the resulting orders placed by brokers. The first goal is to detect "front running," defined as a broker placing an order for him or her self for the same stock as a customer, immediately before placing the order for the customer.

Follow these steps to open a project that searches for front running:

  1. Outside of Sybase CEP Studio, locate the directory SybaseC8Repository/version/examples/GettingStarted/Projects (where version is the version number of Sybase CEP Engine). Copy the directory EventPatternMatching into another location under SybaseC8Repository/version.
  2. Start Sybase CEP Studio.
  3. If the Explorer view shows any projects loaded into the workspace, unload them by clicking each in turn and then clicking Unload Project on the File menu (this is just to keep things uncluttered; other projects in Sybase CEP Studio won't interfere with the new project).
  4. Click Load Project on the File menu. Navigate to your copy of EventPatternMatching and then open the project file.

Examine and Run the Project

Follow these steps to examine and run the project:

  1. Use the Explorer view and the Properties view to examine the project components:
    • The first statement defines a schema named OrdersSchema.

    • The input streams StreamPhoneOrders, StreamPlacedOrders, and StreamWebOrders all use OrdersSchema and are each attached to a ReadFromCSVFile adapter. These streams carry stock trade information for an imaginary brokerage.

    • The properties for each input adapter include the specification File Has Timestamp Column and a Timestamp Column Format string. For this project, Sybase CEP Engine reads the timestamp for each row from the file rather than setting the timestamp to the current time. As a result, the Synchronization properties for the project as a whole specify Use message timestamp rather than Use current server timestamp.

    • The output stream StreamFraudAlerts carries rows identifying fraudulent orders.

  2. Click the project name in the Explorer view and then enter "30" into the Accelerated playback rate box on the Properties tab.:

    Accplay PNG

    When a project uses message timestamp, Sybase CEP Engine processes rows according to the timestamp contained in each row. For example, if the timestamps in two rows are 12 minutes apart, Sybase CEP Engine processes the second row 12 minutes after processing the first row. Accelerated playback speeds up this processing by the specified multiplier. For this project, the CSV files contain about 30 minutes of data, so accelerating the playback 30 times means it will take Sybase CEP Engine approximately one minute to process all of the rows. Accelerated playback can be useful when you're analyzing historical data when you're testing a project in development before deploying to production.

  3. Copy the following text into the Queries tab below the other statements (or copy the text from PatternMatchingCCL.txt between the lines "-- Pattern Matching Code Block 1" and "-- Pattern Matching Code Block 2"):
    -- Match a pattern indicating front running:
    --   Row with customer order
    --   Row with placed order for broker
    --   Row with placed order for customer
    
    INSERT INTO StreamFraudAlerts
    SELECT "Front Running", Broker.OrderId, 
    	Broker.Symbol, Broker.Customer
    FROM StreamPhoneOrders Phone, StreamPlacedOrders Broker, 
    	StreamPlacedOrders Cust
    MATCHING [10 SECONDS: Phone, Broker, Cust]
    ON Phone.Broker = Broker.Customer = Cust.Broker 
    	AND Phone.Symbol = Broker.Symbol = Cust.Symbol
    WHERE Phone.OrderId = Cust.OrderId;
    

    This Query statement includes the following clauses:

    • The SELECT clause specifies a literal string to be inserted into each output row.

    • The FROM clause joins three data sources from two input streams. Note that the clause assigns an alias to each data source without using the word AS.

    • The MATCHING clause looks for a row arriving on Phone, Broker, and Cust, in that order, within 10 seconds. Broker and Cust are two different aliases for the same input stream, so the sequence being matched is a row arriving on StreamPhoneOrders followed by two rows arriving on StreamPlacedOrders within 10 seconds.

    • The ON clause defines two conditions for the join, specifying that the values in the indicated columns be equal. When you define an ON clause with a MATCHING clause, you must define one or more equalities across all the data sources specified in the FROM clause.

    • The WHERE clause paces a further condition on the join, limiting the search to a sequence when the values in the OrderId column of Phone and Cust are equal.

    As a result, the query detects the following pattern of events:

    • A row arrives on StreamPhoneOrders.

    • A row arrives on StreamPlacedOrders with the same value in the Symbol column and with a value in the customer ID column (Broker.Customer) that matches the value in the broker ID column in the first row (Phone.Broker). This indicates that the broker has placed an order for him or her self for the same stock that the customer ordered.

    • A row arrives on StreamPlacedOrders with the same value in the Symbol column as the first two rows, the same value in the broker ID column as the first row and the customer ID column in the second row, and with the same value in the OrderId column as the first row. This is the final event, indicating that the broker placed the order for the customer.

  4. Open viewers for StreamPhoneOrders, StreamPlacedOrders, and StreamFraudAlerts. You can open a viewer for a specific stream by clicking the name in the Explorer view and then clicking the View Stream button (labeled with a magnifying glass) in the toolbar, or by clicking View Stream in the shortcut menu associated with the stream name. Note that the Query statement does not use StreamWebOrders, so a viewer for that stream will not display anything.
  5. Start the project and allow it to run for about a minute, until you see order 1005 in the StreamPlacedOrders viewer:


    POrdStream PNG

    Notice that, at the bottom of the viewer, there's a pair of rows that would match the pattern, except that the rows are more than 10 seconds apart.

  6. Examine the StreamFraudAlerts viewer:


    FAStream PNG

    The rows in the output stream include the value of the OrderId column from Broker, which is the row containing the fraudulent order, not the original customer request. Look through the rows in the other viewers to ensure that you can identify how the rows in the streams match the pattern in the Query statement.

  7. Stop the project.