Use admin who_is_down to determine if the primary and destination Replication Server threads are up. The procedure below verifies that the Replication Servers and the Replication Server threads are up.
Verifying that the Replication Server and Replication Server threads are up
Log in to the destination Replication Server and execute admin who_is_down. This command displays all the threads on this Replication Server that are down. If any threads are down, an error message should have been displayed in the Replication Server error log. Check the error log again. Also see the section that corresponds to the thread that is down:
DSI – see Chapter 8, “Data Server Interface Problems”. The problem could be duplicate keys or permissions failure.
DSI EXEC – see Chapter 8, “Data Server Interface Problems”.
RSI User – see Chapter 6, “Replication Server Interface Problems”.
RS User – see Chapter 5, “Subscription Problems”.
SQM – the SQM should not go down. Restart the Replication Server; you cannot resume the SQM thread.
SQT – see “13045: replication suspended because RSSD restarted”.
User – this should have no effect on replication.
Log in to the primary Replication Server and use admin who_is_down to display all the threads on this Replication Server that are down. See the section that corresponds to the thread that is down as follows:
DIST – see “7035: Replication Server out of memory” and “13045: replication suspended because RSSD restarted”.
DSI – see Chapter 8, “Data Server Interface Problems”. The problem could be duplicate keys or permissions failure.
DSI EXEC – see Chapter 8, “Data Server Interface Problems”.
RepAgent User – see Chapter 7, “RepAgent Problems”.
RSI – see Chapter 6, “Replication Server Interface Problems”.
RSI User – see Chapter 6, “Replication Server Interface Problems”.
RS User – see Chapter 5, “Subscription Problems”.
SQM – the SQM should not go down. Restart the Replication Server; you cannot resume the SQM thread.
SQT – see “13045: replication suspended because RSSD restarted”.
User – this should have no effect on replication.
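The two lists above can be read as a lookup from the type of the thread that is down to the place to look next. The following is a minimal sketch of that lookup in Python, not part of Replication Server itself. The function name next_step, the role strings "primary" and "destination", and the dictionary names are hypothetical and chosen only for illustration; the entries are copied from the lists above.

```python
# Hypothetical lookup built from the lists above; not a Replication Server API.

# Thread types listed for both the destination and the primary Replication Server.
COMMON = {
    "DSI": 'Chapter 8, "Data Server Interface Problems"',
    "DSI EXEC": 'Chapter 8, "Data Server Interface Problems"',
    "RSI USER": 'Chapter 6, "Replication Server Interface Problems"',
    "RS USER": 'Chapter 5, "Subscription Problems"',
    "SQM": "Restart the Replication Server; the SQM thread cannot be resumed",
    "SQT": '"13045: replication suspended because RSSD restarted"',
    "USER": "No effect on replication",
}

# Thread types listed only for the primary Replication Server.
PRIMARY_ONLY = {
    "DIST": '"7035: Replication Server out of memory" and '
            '"13045: replication suspended because RSSD restarted"',
    "REPAGENT USER": 'Chapter 7, "RepAgent Problems"',
    "RSI": 'Chapter 6, "Replication Server Interface Problems"',
}


def next_step(thread_type, role):
    """Return the reference for a down thread, or None if it is not listed.

    role is "primary" or "destination" (labels assumed for this sketch).
    """
    if role not in ("primary", "destination"):
        raise ValueError("role must be 'primary' or 'destination'")
    # Normalize spacing and case so "dsi  exec" matches "DSI EXEC".
    key = " ".join(thread_type.split()).upper()
    if key in COMMON:
        return COMMON[key]
    if role == "primary":
        return PRIMARY_ONLY.get(key)
    return None
```

For example, next_step("RepAgent User", "primary") returns the Chapter 7 reference, while the same thread type with role "destination" returns None because the destination list above does not include it.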
If all of the following conditions exist, the keepalive value is set too low, and the TCP connection was terminated and never restarted:
The DSI is down,
The RepAgent is not connected to the Replication Server or Adaptive Server, and
The entire (or part of the) network went down and was restarted.
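The keepalive diagnosis applies only when all three conditions hold together. As a minimal sketch, the check can be written as a single conjunction. The function and parameter names are hypothetical, and the three flags are assumed to come from your own observations (for example, from admin who_is_down and from network records), not from any Replication Server call.

```python
def keepalive_too_low(dsi_down, repagent_disconnected, network_restarted):
    """Return True only when all three conditions listed above hold.

    All parameters are booleans supplied by the person doing the diagnosis;
    this sketch does not query Replication Server.
    """
    return all((dsi_down, repagent_disconnected, network_restarted))
```

If any one condition is false, the function returns False, and the keepalive setting is not implicated by this test alone.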
If the DSI is up, check for data loss. Data loss error messages appear in the Replication Server error log, but each message is logged only once and may have been written several days earlier. See Chapter 8, “Data Server Interface Problems”.
If you cannot log in to a Replication Server, that Replication Server is down. See the next procedure.