If you set the RS_FAILOVER_MODE parameter to QUIESCE or SWITCH, the RCM monitors Replication Server during the failover process to determine when the Replication Server switch active or suspend log transfer command has completed.
Certain configuration parameters control how RCM monitors the failover process:
FAILOVER_WAIT – the number of seconds the RCM waits after a potential failover is detected before initiating the failover process. This failover waiting period gives the active Adaptive Server an opportunity to recover automatically. For example, if you set FAILOVER_WAIT to 60, RCM waits 60 seconds before initiating the failover process.
MONITOR_WAIT – the number of seconds the RCM monitors Replication Server after invoking a failover command in Replication Server and before switching end users to the standby Adaptive Server. This gives the Replication Server time to empty its queues. For example, if you set MONITOR_WAIT to 60, RCM monitors Replication Server for 60 seconds.
The MONITOR_WAIT parameter is not used if the RS_FAILOVER_MODE parameter is set to NONE.
If you set MONITOR_WAIT to -1 and RS_FAILOVER_MODE to QUIESCE, the RCM quiesces Replication Server and ensures that the Replication Server queues are completely empty before it switches user connections to the standby Adaptive Server. For more information, see RS_FAILOVER_MODE in “Understanding RCM configuration parameters.”
TIMER_INTERVAL – the number of seconds the RCM waits between server pings and monitoring commands. The TIMER_INTERVAL value must be less than or equal to the values of the FAILOVER_WAIT and MONITOR_WAIT parameters.
For example, if you set TIMER_INTERVAL to 5, the RCM waits 5 seconds between server pings and monitoring commands. If you also set FAILOVER_WAIT to 60, the RCM pings the server 12 times (60 divided by 5) before beginning the failover process.
If the TIMER_INTERVAL value is greater than the FAILOVER_WAIT value, the MONITOR_WAIT value, or both, the RCM does not start and displays a notification that there is an error in the parameter settings.
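The rule above can be checked before you start the RCM. The following Python sketch is illustrative only and is not part of the RCM: the function names check_timing and pings_before_failover are hypothetical, and treating a MONITOR_WAIT value of -1 as exempt from the comparison is an assumption, because -1 is a special setting rather than a number of seconds.

```python
def check_timing(timer_interval, failover_wait, monitor_wait):
    """Return a list of rule violations; an empty list means the values are consistent."""
    problems = []
    if timer_interval > failover_wait:
        problems.append("TIMER_INTERVAL is greater than FAILOVER_WAIT")
    # Assumption: MONITOR_WAIT = -1 (wait until the queues are empty)
    # is not compared against TIMER_INTERVAL.
    if monitor_wait != -1 and timer_interval > monitor_wait:
        problems.append("TIMER_INTERVAL is greater than MONITOR_WAIT")
    return problems


def pings_before_failover(timer_interval, failover_wait):
    """Number of whole ping intervals that fit in the failover waiting period."""
    return failover_wait // timer_interval


if __name__ == "__main__":
    print(check_timing(5, 60, 60))        # consistent settings
    print(check_timing(90, 60, 60))       # TIMER_INTERVAL too large for both
    print(pings_before_failover(5, 60))   # the example from the text
```

With TIMER_INTERVAL set to 5 and FAILOVER_WAIT set to 60, pings_before_failover returns 12, which matches the example above.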
You can tune the system using these configuration parameters. Used together, these parameters work as described in the following scenario:
The RCM detects a potential failover in the system.
RCM pings the active Adaptive Server every TIMER_INTERVAL seconds for FAILOVER_WAIT seconds to determine if it has recovered.
After FAILOVER_WAIT seconds, the Adaptive Server has not recovered, so the RCM initiates the failover process. The RCM begins to monitor Replication Server. Every TIMER_INTERVAL seconds, the RCM issues a monitoring command.
The RCM continues to monitor Replication Server for MONITOR_WAIT seconds. At that time, or when the Replication Server finishes the failover process if that is sooner, the RCM switches the users to the standby Adaptive Server.
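The scenario can be sketched as a simple timeline. The following Python function is a simplified model, not RCM code: the name failover_timeline and its server_recovers_at and rs_done_at arguments are hypothetical, the first ping is assumed to occur one TIMER_INTERVAL after detection, and the special MONITOR_WAIT value of -1 is not modeled.

```python
def failover_timeline(timer_interval, failover_wait, monitor_wait,
                      server_recovers_at=None, rs_done_at=None):
    """Return (second, event) tuples for one detected potential failover.

    server_recovers_at: seconds after detection at which the active server
        recovers, or None if it never recovers (hypothetical input).
    rs_done_at: seconds after failover starts at which Replication Server
        finishes, or None if it does not finish in time (hypothetical input).
    """
    events = [(0, "potential failover detected")]
    t = 0

    # Waiting period: ping the active server every TIMER_INTERVAL seconds.
    while t < failover_wait:
        t += timer_interval
        if server_recovers_at is not None and server_recovers_at <= t:
            events.append((t, "active server recovered; failover not initiated"))
            return events
        events.append((t, "ping: active server still down"))

    events.append((t, "failover initiated"))
    start = t

    # Monitoring period: issue a monitoring command every TIMER_INTERVAL seconds
    # until MONITOR_WAIT elapses or Replication Server finishes, whichever is sooner.
    while t - start < monitor_wait:
        t += timer_interval
        if rs_done_at is not None and rs_done_at <= t - start:
            events.append((t, "Replication Server finished failover"))
            break
        events.append((t, "monitoring command issued"))

    events.append((t, "users switched to standby"))
    return events


if __name__ == "__main__":
    for second, event in failover_timeline(5, 60, 60, rs_done_at=20):
        print(f"{second:>4}s  {event}")
```

In this model, with TIMER_INTERVAL set to 5 and both FAILOVER_WAIT and MONITOR_WAIT set to 60, users are switched 120 seconds after detection if Replication Server never reports completion, and 80 seconds after detection if it finishes 20 seconds into the monitoring period.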
The RCM uses the FAILOVER_WAIT and TIMER_INTERVAL parameters to monitor the environment even when you set the RS_FAILOVER_MODE parameter to NONE because you plan to fail over the Replication Server manually. In this case, the RCM responds to a failover by locking user connections out of the Adaptive Server, but does not invoke any Replication Server commands.