Replacing the Coordinator (Manual Failover)

Make sure that the coordinator is no longer running before you replace it.

Prerequisites
  • The coordinator process must be dead before manual failover.
    Note: In a worst-case scenario, the former coordinator computer might be running but disconnected from the network, or in a hardware hibernation state. In this situation, you cannot log into the coordinator computer, but it could resume normal operation without warning. Ideally, shut down the computer on which the coordinator was running for the duration of the manual failover process.
  • Use a reader for the designated failover node, if possible. Readers have no pending writeable transactions, which makes failover easier.
  • The designated failover node must be part of the multiplex and must be included (not excluded).
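
On UNIX, the first prerequisite can be checked with a short script. The following is a minimal sketch, not part of the product: it assumes that it runs on the former coordinator host, that POSIX ps and awk are available, and that the server process is named iqsrv16. It cannot detect a host that is hibernating or disconnected from the network.

```shell
#!/bin/sh
# Sketch: check for a live iqsrv16 process on this host before manual failover.
# Assumption: the server binary is named iqsrv16.

coordinator_running() {
    # List the command name of every process, keep only the last path
    # component, and return 0 if any of them is exactly "iqsrv16".
    ps -A -o comm= | awk -F/ '$NF == "iqsrv16" { found = 1 } END { exit !found }'
}

if coordinator_running; then
    echo "An iqsrv16 process is still running: do not start failover."
    exit 1
else
    echo "No iqsrv16 process found on this host."
fi
```

If several servers run on the same host, a name match alone does not tell them apart; check the process arguments before stopping anything.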
Task
  1. Ensure that the coordinator process is dead.
    Warning!  Initiating manual failover while the former coordinator process is alive may cause database corruption.
    If any read-write transactions were running on secondary nodes when the original coordinator was shut down, these transactions roll back. Ideally, if the coordinator is running on dedicated server hardware, that computer should be shut down during the failover process.
    • On UNIX, log into the coordinator machine and make sure that the environment variables are set, then issue the following command and stop the appropriate iqsrv16 process:
      stop_iq
    • On Windows, log into the coordinator machine. Start Task Manager and look for the process name iqsrv16.exe. Stop the iqsrv16.exe process.

  2. To identify the designated failover node, connect to any running multiplex server and execute the stored procedure sp_iqmpxinfo. The column coordinator_failover shows the designated failover node.
  3. Connect to the designated failover node and run COMMIT, then BEGIN TRANSACTION to ensure that this node is up to date with the latest TLV log.

    Shut down the designated failover node cleanly, using the dbstop utility.

  4. At the command line, restart the intended coordinator using the failover switch (-iqmpx_failover 1) on the server startup utility:
    start_iq -STARTDIR/host1/mpx
    @params.cfg -iqmpx_failover 1 
    -n mpxnode_w1 -x "tcpip{port=2764}"
    mpxtest.db
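
The command in step 4 is displayed across several lines but is entered as a single invocation. The following sketch only assembles and prints that command so that it can be reviewed before it is run. Every value is copied from the example above; the variable names (SERVER_NAME, PORT, DB_FILE) are illustrative and not part of the product.

```shell
#!/bin/sh
# Sketch: build the step 4 restart command as one line and print it for review.
# Values are taken from the example in step 4; replace them with your own.
SERVER_NAME=mpxnode_w1
PORT=2764
DB_FILE=mpxtest.db

failover_cmd() {
    # Prints the command only; nothing is started here.
    printf '%s\n' "start_iq -STARTDIR/host1/mpx @params.cfg -iqmpx_failover 1 -n $SERVER_NAME -x \"tcpip{port=$PORT}\" $DB_FILE"
}

failover_cmd
```

Printing the command first makes it easy to confirm that the failover switch is present and that the server name and port belong to the designated failover node.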

Once the server starts, the failover process is complete and the designated failover node is the new coordinator node. After failover, on the next transaction, other secondary servers recognize the new coordinator and connect to it for all read-write transactions. The former coordinator becomes a reader and can be started as a regular secondary node once you synchronize it against the new coordinator.

To perform failover using Sybase Control Center, see the Sybase Control Center for SAP Sybase IQ online help in SCC or at http://sybooks.sybase.com/sybooks/sybooks.xhtml?prodID=10680.