Recovering from Partition Loss or Failure

Recover from a Replication Server partition loss or failure when Replication Server detects a lost, damaged, or failed stable queue.

  1. Log in to the Replication Server and drop the failed partition:
    drop partition logical_name
    Replication Server does not immediately drop a partition that is in use. If the partition is undamaged, Replication Server drops it only after all of the messages it holds are delivered and deleted. See Replication Server Reference Manual > Replication Server Commands > drop partition.
  2. If the failed partition was the only one available to the Replication Server, add another one to replace it:
    create partition logical_name
    on 'physical_name' with size size
    [starting at vstart]
    See Replication Server Reference Manual > Replication Server Commands > create partition.
  3. Because the partition is damaged, its queues cannot drain normally, so you must rebuild the stable queues:
    rebuild queues

    When all stable queues on the partition are removed, Replication Server drops the failed partition from the system and rebuilds the queues online using the remaining partitions.

  4. After rebuilding the queues, check the Replication Server logs for loss detection messages.
  5. If Replication Server detected message loss, do one of the following:
    • Recover the lost messages from off-line database logs.
    • Tell Replication Server to ignore the loss by executing the ignore loss command for the database, on the Replication Server where the loss was detected.
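
As an illustration only, the commands from the steps above might look like the following when entered in an isql session connected to the Replication Server. Every name here is a hypothetical placeholder, not a value from this guide: P1 (the failed partition), P2 and /dev/hypothetical_device (the replacement partition and its device), the size of 20 (megabytes), and PRIMARY_DS.pdb (the database for which loss was reported). The lines that begin with -- are annotations for the reader and are not meant to be sent to the server. The ignore loss form shown is assumed to be the basic "from data_server.database" variant; check the Replication Server Reference Manual for the full syntax before using it.

```
-- Step 1: drop the failed partition (P1 is a hypothetical logical name)
drop partition P1
go

-- Step 2: only needed if P1 was the sole partition
-- (P2, the device path, and the 20 MB size are hypothetical)
create partition P2
on '/dev/hypothetical_device' with size 20
go

-- Step 3: rebuild the stable queues on the remaining partitions
rebuild queues
go

-- Step 5, second option: accept the loss reported for a hypothetical
-- primary database PRIMARY_DS.pdb
ignore loss from PRIMARY_DS.pdb
go
```

Run ignore loss only after you have checked the log in step 4 and decided not to recover the messages from off-line database logs, because the ignored messages are not replicated.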
Next
If you tell Replication Server to ignore message losses and you have rebuilt the queues of a Replication Server that is part of a route, either re-create the subscriptions at the destination or use the rs_subcmp program with the -r flag to reconcile the primary and replicate data.
Related concepts
Rebuild Queues Online
Loss Detection After Rebuilding Stable Queues
Related tasks
Recovering Messages from Off-line Database Logs