Failover

It is the user’s responsibility to ensure that the former coordinator process is no longer running before attempting failover. In a worst-case scenario, the former coordinator computer might be running but disconnected from the network, or in a hardware hibernation state. In this situation, you cannot log into the coordinator computer, and tools such as Sybase Central cannot reach it, yet it could resume normal operation without warning. Ideally, the computer on which the coordinator was running should be shut down for the duration of the manual failover process.

WARNING! Initiating manual failover while the former coordinator process is alive may cause database corruption.
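On a UNIX host that you can still log into, a quick process check along the following lines can support this verification. It is only a sketch under stated assumptions: the server binary is assumed to be named iqsrv15 (the name used in the command-line steps in this section), `ps -e -o comm=` is assumed to be available, and `has_iqsrv15` is a hypothetical helper name, not part of Sybase IQ. It inspects only the local host, so it cannot help when the computer is unreachable or hibernating.

```shell
#!/bin/sh
# Sketch: check whether an iqsrv15 process is present on THIS host.
# has_iqsrv15 is a hypothetical helper, not part of Sybase IQ.

# $1: a process listing, one command name per line.
# Succeeds (exit 0) if any line contains "iqsrv15".
has_iqsrv15() {
    printf '%s\n' "$1" | grep -q 'iqsrv15'
}

# `ps -e -o comm=` prints the command name of every process, no header.
if has_iqsrv15 "$(ps -e -o comm=)"; then
    echo "iqsrv15 is still running here: do not start failover"
else
    echo "no iqsrv15 process found on this host"
fi
```

If several IQ servers share the host, a match does not say which one is running; confirm the process belongs to the former coordinator before acting on the result.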

Replacing the coordinator (Sybase Central)

Make sure that the coordinator is really down before you replace it.

  1. Right-click the multiplex set node in the Sybase Central tree view and open the Failover wizard. The wizard is enabled only when the coordinator is down and the designated failover node is running.

  2. Specify the action to take against the current coordinator by choosing one of three options from the drop-down list: Drop it (the default), Keep it as Reader, or Keep it as Writer.

    If you choose to drop the server, the Delete Server Files check box appears (deselected by default).

    If you choose to keep the server as reader or writer, two radio buttons display. Choose Included or Excluded (the default). If you choose Included, the Synchronize After Failover check box appears (deselected by default).

  3. Specify the new failover node by choosing a node from the Identify the New Failover Node drop-down list.

  4. Click Finish to start the failover process.

    Two dialog boxes display.

  5. Click Yes if you are certain that the coordinator is down and you are ready to fail over. Several progress messages display at the base of the wizard screen.

Replacing the coordinator (Command line)

The coordinator process must be dead before you initiate replacement. The designated failover node must be included and part of the multiplex. Sybase recommends that the designated failover node be a reader. Readers have no pending writable transactions, which makes failover easier.

  1. Ensure that the coordinator process is dead.

    Any read-write transactions that were running on secondary nodes when the original coordinator was shut down are rolled back. Ideally, if the coordinator is running on dedicated server hardware, that computer should be shut down during the failover process.

    • On UNIX, log into the coordinator machine, make sure that the environment variables are set, and then issue the following command:

      stop_iq

      Then stop the appropriate iqsrv15 process.

    • On Windows, log into the coordinator machine. Start Task Manager and look for the process name iqsrv15.exe. Stop the iqsrv15.exe process.

  2. To identify the designated failover node, connect to any running multiplex server and execute the stored procedure sp_iqmpxinfo. The column coordinator_failover shows the designated failover node.

  3. Connect to the designated failover node and run COMMIT, then BEGIN TRANSACTION to ensure that this node is up to date with the latest TLV log.

    Shut down the designated failover node cleanly, using Sybase Central (Right-click > Control > Stop) or the dbstop utility.

  4. At the command line, restart the intended coordinator using the failover switch (-iqmpx_failover 1) on the server startup utility:

    start_iq -STARTDIR /host1/mpx \
      @params.cfg -iqmpx_failover 1 \
      -n mpxnode_w1 -x "tcpip{port=2764}" \
      mpxtest.db

Once the server startup is complete, the failover process is complete and the designated failover node becomes the new coordinator node. After failover, the other secondary servers recognize the new coordinator on their next transaction and connect to it for all read-write transactions. The former coordinator becomes a reader and can be started as a regular secondary node.
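Command-line steps 2 through 4 could be strung together roughly as follows. This is a sketch rather than a supported procedure. The connection string (user ID, password, host) is a placeholder, `run` is a hypothetical helper, and passing SQL to `dbisql -nogui` and stopping the server with `dbstop -y -c` are assumptions about how you would choose to issue those steps. The start_iq arguments are taken from step 4, written here with a space after -STARTDIR. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of command-line steps 2 to 4. Dry run by default: each command
# is printed with a leading "+", not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical connection string for the designated failover node.
# uid, pwd and host are placeholders; server name and port follow the
# start_iq example in step 4.
CONN='uid=DBA;pwd=CHANGE_ME;eng=mpxnode_w1;links=tcpip{host=localhost;port=2764}'

# run is a hypothetical helper: print the command, or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 2: the coordinator_failover column names the designated failover node.
run dbisql -nogui -c "$CONN" "CALL sp_iqmpxinfo()"

# Step 3: bring the node up to date with the TLV log, then stop it cleanly.
run dbisql -nogui -c "$CONN" "COMMIT; BEGIN TRANSACTION"
run dbstop -y -c "$CONN"

# Step 4: restart the node with the failover switch.
run start_iq -STARTDIR /host1/mpx @params.cfg -iqmpx_failover 1 \
    -n mpxnode_w1 -x "tcpip{port=2764}" mpxtest.db
```

Review the printed commands before setting DRY_RUN=0. The script does not verify that the former coordinator is down; that check remains a manual prerequisite.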

To start the former coordinator, you must first synchronize it against the new coordinator. Follow steps 1 through 4 in “Synchronizing servers (command line)”, but in step 2 (dbbackup), make sure that the connection string specified with the -c parameter contains the new coordinator's connection parameters.
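As an illustration of that one change, the sketch below builds a dbbackup command whose -c string points at the new coordinator. The synchronization procedure itself is not reproduced here, so the -y and -d options, the user ID, password, host, and target directory are all assumptions or placeholders; only the server name mpxnode_w1 and port 2764 come from the start_iq example in this section. `build_dbbackup_cmd` is a hypothetical helper that prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: print a dbbackup command that connects to the NEW coordinator.
# Nothing is executed; copy and adapt the printed line.

# Placeholders: replace uid, pwd, host and the target directory.
# mpxnode_w1 and port 2764 are the new coordinator in the start_iq example.
NEW_COORD_CONN='uid=DBA;pwd=CHANGE_ME;eng=mpxnode_w1;links=tcpip{host=localhost;port=2764}'
FORMER_COORD_DIR='/path/to/former/coordinator/dir'

# build_dbbackup_cmd is a hypothetical helper.
# $1: connection string, $2: directory that receives the backup.
# -y: replace existing files without confirmation (assumed option choice)
# -d: back up the main database file only (assumed option choice)
build_dbbackup_cmd() {
    printf 'dbbackup -y -d -c "%s" %s\n' "$1" "$2"
}

build_dbbackup_cmd "$NEW_COORD_CONN" "$FORMER_COORD_DIR"
```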