Loss detection after rebuilding stable queues

To determine if any messages could not be recovered after the stable queues were rebuilt, the Replication Server performs loss detection. By checking Replication Server loss-detection messages, you can determine what kind of user intervention, if any, is necessary to restore all data to the system.

Replication Server detects two types of losses after rebuilding stable queues:

SQM loss, which refers to data lost between two Replication Servers, detected at the next downstream site
DSI loss, which refers to data lost between a Replication Server and a replicate database that the Replication Server manages

Both kinds of loss detection are addressed in the following sections.

If all data is available, no intervention is necessary and the replication system can return to normal operations. For example, if you know that the save interval for the route or connection is set for a longer length of time than the failure, you can recover all messages with no intervention. However, if the save interval is not set or is set too low, some messages may be lost.

A Replication Server that has detected a loss does not accept messages from the source. Loss detection prevents the source from truncating its stable queues. For example, if Replication Server RS2 detects that replicate data server DS2.RDB has lost data from primary data server DS1.PDB, Replication Server RS1 cannot truncate its queues until you decide how to handle the loss. As a result, RS1 may run out of stable storage. Before a loss is detected (that is, after the “Checking Loss” message is reported), you can choose to ignore losses for a source and destination pair.