Verifying That Replication Server Threads are Up

Use admin who_is_down to determine whether the threads on the primary and destination Replication Servers are up.

  1. Log in to the destination Replication Server.
    If you cannot log in to a Replication Server, then it is down.
  2. Execute admin who_is_down.

    This command displays all the threads that are down on this Replication Server, and records error messages in the Replication Server error log.

  3. Log in to the primary Replication Server and use admin who_is_down to display all the threads on this Replication Server that are down.
    1. Check the Replication Server error log for these conditions:
      • The Data Server Interface (DSI) is down,

      • The RepAgent is not connected to the Replication Server or Adaptive Server, and

      • All or part of the network went down and was restarted.

      If these conditions exist, the keepalive value is probably set too high: the TCP connection was terminated and never restarted.

  4. If the DSI is up, check for data loss.
    Data loss error messages appear in the Replication Server error log. However, each message is logged only once and may have been written several days earlier, so search back through the log.
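
The checks in steps 1 through 3 can be scripted. The following is a minimal Python sketch, not a supported tool. It assumes the isql utility is on the PATH. The server names DEST_RS and PRIMARY_RS and the helper names are placeholders that do not come from this guide. It also assumes that isql exits with a nonzero status when the login fails, which you should confirm in your environment.

```python
import subprocess

# Placeholder server names; check the destination first, then the primary.
SERVERS = ["DEST_RS", "PRIMARY_RS"]

# isql batch: the command followed by the "go" terminator.
BATCH = "admin who_is_down\ngo\n"


def isql_command(server, user, password):
    """Build the isql command line (-U login, -P password, -S server)."""
    return ["isql", "-U", user, "-P", password, "-S", server]


def check_server(server, user, password, timeout=30):
    """Run admin who_is_down on one server.

    Returns (logged_in, output). Treating a nonzero exit status as a
    failed login is an assumption about isql, not documented behavior.
    """
    try:
        result = subprocess.run(
            isql_command(server, user, password),
            input=BATCH,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # isql is missing or the server did not answer in time.
        return False, str(exc)
    return result.returncode == 0, result.stdout + result.stderr


def check_all(user, password, servers=SERVERS):
    """Check each server in order; return (server, logged_in, output) tuples."""
    return [(s, *check_server(s, user, password)) for s in servers]
```

Calling check_all with a login and password returns one tuple per server. If logged_in is False, treat the server as possibly down, as described in step 1. Otherwise, review the output for the threads that are down.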
Next

If a thread is down, determine the cause of the failure and correct the problem.

| Thread That Failed | Action |
|---|---|
| Distributor (DIST) | Determine if the failure is due to Replication Server error 7035 or 13045, and correct the problem. |
| DSI | May indicate duplicate keys or a permission failure. See Replication Server Troubleshooting Guide > Data Server Interface Problems. |
| DSI EXEC | See Replication Server Troubleshooting Guide > Data Server Interface Problems. |
| RepAgent User | See Replication Server Troubleshooting Guide > RepAgent Problems. |
| Replication Server (RS) User | See Replication Server Troubleshooting Guide > Subscription Problems. |
| Replication Server Interface (RSI) and RSI User | See Replication Server Troubleshooting Guide > Replication Server Interface Problems. |
| Stable Queue Manager (SQM) | The SQM should not go down. Restart the Replication Server; you cannot resume the SQM thread. |
| Stable Queue Thread (SQT) | Determine if the failure is due to Replication Server error 13045, and correct the problem. |
| User | This should have no effect on replication. |
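
The table above can be encoded as a lookup for use in a monitoring script. This Python sketch is illustrative only. The keys are the thread labels used in the table, which may not match the names that admin who_is_down prints, and the names ACTIONS and action_for are hypothetical.

```python
# Actions summarized from the table; keys are the table's thread labels,
# upper-cased. Topic names refer to the Replication Server Troubleshooting Guide.
ACTIONS = {
    "DIST": "Check for Replication Server error 7035 or 13045 and correct the problem.",
    "DSI": "Check for duplicate keys or a permission failure; see Data Server Interface Problems.",
    "DSI EXEC": "See Data Server Interface Problems.",
    "REPAGENT USER": "See RepAgent Problems.",
    "RS USER": "See Subscription Problems.",
    "RSI": "See Replication Server Interface Problems.",
    "RSI USER": "See Replication Server Interface Problems.",
    "SQM": "Restart the Replication Server; the SQM thread cannot be resumed.",
    "SQT": "Check for Replication Server error 13045 and correct the problem.",
    "USER": "This should have no effect on replication.",
}


def action_for(thread):
    """Return the suggested action for a failed thread, or None if the
    thread label is not in the table."""
    # Normalize case and collapse repeated whitespace before the lookup.
    return ACTIONS.get(" ".join(thread.upper().split()))
```

For example, action_for("SQM") returns the restart instruction, and a label that is not in the table returns None so that the caller can flag it for manual review.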

Related concepts
Data Server Interface Problems
Replication Server Interface Problems
RepAgent Problems
Subscription Problems
Related reference
Error 13045
Error 7035