Basic RSSD recovery procedure

Use the basic RSSD recovery procedure to restore the RSSD if you have executed no DDL commands since the last RSSD dump. DDL commands in RCL include those for creating, altering, or deleting routes, replication definitions, subscriptions, function strings, functions, function-string classes, or error classes.

Certain steps in this procedure are also referenced by other RSSD recovery procedures in this chapter.

WARNING! Do not execute any DDL commands until you have completed this recovery procedure.

To perform basic RSSD recovery, follow these steps:

Shut down all RepAgents that connect to the current Replication Server.

Because its RSSD has failed, the current Replication Server is down. If for some reason it is still running, log in to it and shut it down with the shutdown command.

Some messages may still be in the Replication Server stable queues. Data in those queues may be lost when you rebuild these queues in later steps.

Restore the RSSD by loading the most recent RSSD database dump and all transaction dumps.
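
The restore might be sketched as the following isql batch, run at the Adaptive Server that holds the RSSD. This is a minimal sketch rather than a site-specific script: the dump file paths are hypothetical placeholders, rssd_name stands for your RSSD database, and load tran would be repeated once per transaction dump, in the order the dumps were taken.

```sql
-- Assumption: the dump file paths below are hypothetical placeholders.
use master
go
-- Load the most recent full database dump of the RSSD.
load database rssd_name from "/dumps/rssd_full.dmp"
go
-- Load each subsequent transaction dump, oldest first.
load tran rssd_name from "/dumps/rssd_tran_1.dmp"
go
-- Bring the database online once all dumps are loaded.
online database rssd_name
go
```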

Restart the Replication Server in standalone mode, using the -M flag.

You must start the Replication Server in standalone mode, because the stable queues are now inconsistent with the RSSD state. When the Replication Server starts in standalone mode, reading of the stable queues is not automatically activated.

Log in to the Replication Server, and get the generation number for the RSSD, using the admin get_generation command:

admin get_generation, data_server, rssd_name

For example, the Replication Server may return a generation number of 100.

In the Replication Server, rebuild the queues with the following command:
```
rebuild queues
```
See “Rebuilding queues online” for a description of this process.

Start all RepAgents (except the RSSD RepAgent) that connect to the current Replication Server in recovery mode:
```
sp_start_rep_agent dbname, recovery
```
Wait until each RepAgent logs a message in the Adaptive Server log that it is finished with the current log.

Check the loss messages in the Replication Server log, and in the logs of all the Replication Servers with direct routes from the current Replication Server.
- If all your routes were active at the time of failure, you probably will not experience any real data loss.
- However, loss detection may indicate real loss. Real data loss may be detected if the database logs were truncated at the primary databases, so that the rebuild process did not have enough information to recover. If you have real data loss, reload database logs from old dumps. See “Recovering from truncated primary database logs”.
- See “Loss detection after rebuilding stable queues” for background and details on loss detection.

Shut down RepAgents for all primary databases managed by the current Replication Server:
```
sp_stop_rep_agent dbname
```
Shut down the Replication Server.

Execute the dbcc settrunc command at the Adaptive Server for the restored RSSD to move up the secondary truncation point:
```
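use rssd_name
go
-- Tell Adaptive Server to ignore the secondary truncation point
dbcc settrunc('ltm', 'ignore')
go
-- Truncate the transaction log
dump tran rssd_name with truncate_only
go
-- Run an empty transaction 40 times to move the log onto the next page
begin tran commit tran
go 40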
```
Running begin tran commit tran with go 40 executes an empty transaction 40 times, which moves the Adaptive Server log onto the next page.

After completing step 10 and before continuing with step 11, run the following command to clear the locator information.
```
rs_zeroltm rssd_server, rssd_name 
go
```
Execute the dbcc settrunc command at the Adaptive Server for the restored RSSD to set the generation number to one higher than the number returned by admin get_generation in step 5.
```
dbcc settrunc ('ltm', 'gen_id', generation_number)
go
dbcc settrunc('ltm', 'valid')
go
```
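For example, assuming admin get_generation returned 100 (the illustrative value used earlier in this procedure), the new generation number would be 101. The sketch below substitutes that value and then runs dbcc gettrunc, which reports the current truncation point settings, to confirm the change. The value 101 is an assumption for illustration only; use the number returned at your site plus one.

```sql
use rssd_name
go
-- Assumption: admin get_generation returned 100, so set 100 + 1 = 101.
dbcc settrunc('ltm', 'gen_id', 101)
go
dbcc settrunc('ltm', 'valid')
go
-- Confirm the generation ID and the secondary truncation state.
dbcc gettrunc
go
```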
Record this generation number and the current time so that you can return to this RSSD recovery procedure if necessary. Alternatively, dump the database after setting the generation number.

Restart the Replication Server in normal mode.

If you performed this procedure as part of the subscription comparison or subscription re-creation procedure, the upstream RSI outbound queue may contain transactions, bound for the RSSD of the current Replication Server, that have already been applied using rs_subcmp. If this is the case, after starting the Replication Server, the error log may contain warnings referring to duplicate inserts. You can safely ignore these warnings.

Restart RepAgents for the RSSD and for user databases in normal mode.

If you performed this procedure as part of the subscription comparison or subscription re-creation RSSD recovery procedure, you should expect to see messages regarding RSSD losses being detected in all Replication Servers that have routes from the current Replication Server.