Overview

A correctly configured Replication Server system is designed to be fault-tolerant. However, in the event of a serious failure, it might be necessary to manually intervene and fix the problem. This guide will help you locate, identify, and fix the cause of the problem.

You can solve most Replication Server problems by following the procedures in this guide. The key to finding the cause of a replication system failure is to eliminate possible causes. Use the following troubleshooting methods to narrow down the possible causes:

If an error message appears in a Replication Server error log, start there. See Chapter 2, “Analyzing Error Messages” to learn how to read error log files, and Chapter 3, “Common Errors” to check whether the error is a common one. Otherwise, go to the chapter that matches the symptoms of your problem.

If you do not see an error message, you must use diagnostic tools to further analyze the replication system.
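As a rough illustration of the first method above, the following Python sketch scans log lines and keeps only the entries that look like errors. It is not part of Replication Server. The assumed line shape (a severity letter and a period, then a date and time), the set of severity letters treated as problems, the function name find_problem_entries, and the sample lines are all assumptions made for this example; check them against your own error log and Chapter 2 before relying on them.

```python
import re
from typing import Iterable, List, Tuple

# Assumed entry shape: "<severity>. <YYYY/MM/DD> <HH:MM:SS>. <text>"
# This pattern is an assumption for the sketch, not a documented format.
ENTRY = re.compile(r"^([A-Z])\. (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\. (.*)$")

# Severity letters treated as problems here (assumption; adjust as needed).
PROBLEM_SEVERITIES = {"E", "F", "H"}


def find_problem_entries(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (severity, timestamp, text) for entries with a problem severity."""
    found = []
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        # Lines that do not start a new entry (continuations) are skipped.
        if match and match.group(1) in PROBLEM_SEVERITIES:
            found.append(match.groups())
    return found


# Hypothetical sample lines, invented for illustration only.
sample_log = [
    "I. 2024/01/15 10:00:00. Hypothetical informational message",
    "E. 2024/01/15 10:00:05. Hypothetical error message text",
    "   continuation line that does not start an entry",
]

for severity, timestamp, text in find_problem_entries(sample_log):
    print(severity, timestamp, text)
```

To scan a real file, pass an open file object instead of sample_log. If the scan finds nothing, that corresponds to the second case above, where diagnostic tools are needed.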

Problems with Replication Manager (RM) and the Embedded Replication Server System Database (ERSSD) are not covered in this guide. RM uses the Sybase Central message logging feature to log all commands that RM sends to any server. It also has a view queue data feature that helps you troubleshoot transactions in a queue. See the online help for the Replication Manager plug-in and the Replication Server Administration Guide Volume 1 for more information on how to use these features. See also the Replication Server Administration Guide Volume 1 for ERSSD recovery procedures.

This guide may help you identify hardware, network, and operating system problems, but solving them is beyond the scope of this guide. Whenever a server or a network connection is down, also check for hardware, network, or operating system problems.

On Windows, stack traces that occur randomly or frequently at the same time as errors in the Replication Server error log usually indicate a hardware or operating system problem.

Check the operating system error log for errors that indicate hardware or operating system problems. Correcting such failures might only partially resolve their effects on the replication system. You may still need to resynchronize data between the primary and destination databases.