Detecting deleted source records

Records that have been deleted from the source after the last update of the target dimension table still have a corresponding current record in the target dimension table.

  1. Key attributes of all current records in the target dimension table are read using an appropriate Data Provider component. See “Source components”.

  2. The existence of a corresponding record in the source is checked for each current target record based on the source key attribute.

    1. Choose an appropriate Lookup component. See “Lookup components”.

      If your source data does not reside in a database, use the lookup functionality of the Data Calculator component. See “Data Calculator JavaScript”.

    2. Select the lookup data from source. As this is a simple existence check, only the source keys are needed from all source records. However, lookups in ETL always return a value for a key, so you must select an appropriate return value as well.

    3. Add an attribute to the port structure to populate the lookup result. The lookup result determines whether a source record has been deleted. This attribute indicates the data state and allows the data to be filtered in the next step of the transformation process. The new attribute is selected as the value attribute of a Lookup component or the output attribute of a Data Calculator rule. See “Modifying port structures”.

    4. Set an appropriate lookup default value. The default value is returned by the lookup for nonexisting keys. To ensure that the lookup value correctly indicates the existence of records, set it to a constant that is different from all lookup values for the existing keys.

      Example:

      • Target data – select Key, Valid_From from PRODUCT_PRICE where Valid_To='9999-12-31'

      • Lookup data – select Key, '0' from PRODUCT

      • Value attribute – Removed (integer)

      • Default value – 1

      Performing this lookup results in records with attributes Key, Valid_From, Removed. The value of the Removed attribute will be 0 (lookup value) for all existing source records and 1 (default value) for all nonexisting records.