Consistent recovery lets you set up a project that can recover its data if it is interrupted by a server or connection failure.
The consistent recovery feature can restore all the windows in a project to a consistent state after a server or connection failure. (Recovery consistency depends on following guidelines for log stores.) When consistent recovery is enabled, the server uses coordinated checkpoints to save data in log stores. When any log store fails to complete a checkpoint, all the log stores for that project roll back to their state as of the previous successful checkpoint. This rule ensures that even if a server or connection fails, all log stores in a project are consistent with one another. However, any input data that has not been checkpointed is not recovered upon restart.
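The all-or-nothing behavior of coordinated checkpoints can be pictured with a short sketch. The following Java fragment is a conceptual model only, not the server's implementation, and the LogStore interface and its method names are hypothetical. It illustrates the rule stated above: a checkpoint counts only if every log store completes it, and any failure rolls every store back to the previous successful checkpoint so that the stores stay mutually consistent.

    import java.util.List;

    /**
     * Conceptual model (not ESP's actual implementation) of the coordinated
     * checkpoint rule: a checkpoint succeeds only if every log store
     * completes it; otherwise all stores return to the previous checkpoint.
     */
    final class CoordinatedCheckpointModel {

        /** Hypothetical view of a single log store, for illustration only. */
        interface LogStore {
            boolean writeCheckpoint(long checkpointId);   // returns false on failure
            void rollBackTo(long checkpointId);           // discard partial checkpoint data
        }

        private long lastCompletedCheckpoint = 0;

        /** Attempt a checkpoint across all stores; succeed only if every store succeeds. */
        boolean checkpointAll(List<LogStore> stores, long checkpointId) {
            boolean allSucceeded = true;
            for (LogStore store : stores) {
                if (!store.writeCheckpoint(checkpointId)) {
                    allSucceeded = false;
                    break;
                }
            }
            if (allSucceeded) {
                lastCompletedCheckpoint = checkpointId;   // the new consistent state
                return true;
            }
            // Any failure: every store rolls back to the previous successful
            // checkpoint, so on restart all stores in the project agree.
            for (LogStore store : stores) {
                store.rollBackTo(lastCompletedCheckpoint);
            }
            return false;
        }
    }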
You enable consistent recovery in the project configuration (CCR) file, either in Studio or by editing the CCR file manually. See the Studio Users Guide for details.
Enabling consistent recovery has no effect if the project has no log stores. When you enable consistent recovery for a project, place the log stores on a shared drive that all machines in the Event Stream Processor cluster can access.
In consistent recovery mode, a project treats commits issued by publishers as checkpoint requests. When a commit returns successfully to the publisher, the publisher can notify its data source that the data in question has been processed.
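The following Java sketch illustrates the publish-then-commit handshake described above. The EspPublisher and UpstreamSource interfaces are hypothetical stand-ins, not the actual ESP SDK API; the point is the ordering: the publisher acknowledges its source only after the commit, which doubles as a checkpoint request, has returned.

    /**
     * Sketch of the publish-then-commit handshake. The interfaces below are
     * hypothetical; consult the ESP SDK documentation for the real publisher
     * API. Acknowledge the source only after commit() has returned.
     */
    final class CheckpointingPublisher {

        /** Hypothetical stand-ins for the SDK publisher and the upstream source. */
        interface EspPublisher {
            void publish(byte[] row);
            void commit();                    // in consistent recovery mode, acts as a checkpoint request
        }
        interface UpstreamSource {
            void acknowledge(long batchId);   // tells the source the batch has been processed
        }

        private final EspPublisher publisher;
        private final UpstreamSource source;

        CheckpointingPublisher(EspPublisher publisher, UpstreamSource source) {
            this.publisher = publisher;
            this.source = source;
        }

        void publishBatch(long batchId, Iterable<byte[]> rows) {
            for (byte[] row : rows) {
                publisher.publish(row);
            }
            // The commit returns only after the project has checkpointed the data,
            // so it is now safe to tell the source the batch has been processed.
            publisher.commit();
            source.acknowledge(batchId);
        }
    }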
All guaranteed delivery (GD) subscribers to a window stored in a log store receive checkpoint notifications. A GD subscriber can use such a notification as an indication that it is safe to commit the corresponding data in its target.
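The following Java sketch shows one way a GD subscriber might act on checkpoint notifications. The callback and interface names are hypothetical rather than the SDK's API: rows are staged in the target as they arrive and committed there only when a checkpoint notification confirms that the project has persisted them.

    /**
     * Sketch of a GD subscriber reacting to checkpoint notifications. The
     * callback names and the TargetStore interface are hypothetical; see the
     * SDK documentation for the real GD subscription API.
     */
    final class GdCheckpointHandler {

        /** Hypothetical target system that the subscriber writes into. */
        interface TargetStore {
            void stage(byte[] row);       // buffer rows as they arrive
            void commitStaged();          // make staged rows durable in the target
        }

        private final TargetStore target;

        GdCheckpointHandler(TargetStore target) {
            this.target = target;
        }

        /** Called for each data row delivered to the GD subscriber. */
        void onRow(byte[] row) {
            target.stage(row);
        }

        /**
         * Called when a checkpoint notification arrives. Everything delivered
         * so far is now persisted in the project's log stores, so it is safe
         * to commit it in the target as well.
         */
        void onCheckpointNotification() {
            target.commitStaged();
        }
    }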
Consistent recovery works well with projects configured for cold failover if log stores are set up following the log store guidelines. When a project set for cold failover stops responding, the cluster restarts the project, typically on another host. Consistent recovery enables the restarted project to come back up to a consistent state corresponding to the last checkpoint. SAP does not recommend using consistent recovery with HA active-active mode (dual project instances) because there is no guarantee that the data produced in the primary instance is identical to the data in the secondary instance. This is a consequence of the nondeterministic nature of Event Stream Processor.
When consistent recovery is not enabled (the default), the project does not ensure that all log stores recover to the same point in time after a server failure. Some log stores may recover to an earlier checkpoint than others because checkpoints across log stores are not performed as a single atomic operation. This is not an issue when the project has only one log store.
When you use consistent recovery, the recommendation that all input windows in a project and their direct or indirect dependents be placed in the same log store no longer applies. Instead, SAP recommends that you use multiple log stores, placed on different disks, to improve performance. Using multiple log stores is safe because consistent recovery ensures that all the log stores in the project remain consistent with one another.