High Availability Configuration for Containers

This topic describes the High Availability configuration for Containers within Sybase CEP Server clusters, as well as their general functionality.

A Sybase CEP Server cluster includes several Containers and one active Manager. Sybase CEP Server may set each Container to active or passive status. The Manager distributes the workload among active Containers, while passive Containers remain in standby mode, carrying no workload until an active Container fails.

All Containers notify the Manager of their availability when they first start up. While running, each Container also sends a periodic "heartbeat" to the Manager. The interval between heartbeats is the number of seconds you specify in the "HeartbeatFrequencySeconds" preference in the Container's c8-server.conf file.

The Manager detects Container failure when it misses three consecutive heartbeats from the Container. The Manager considers the first heartbeat missing if the gap since the last heartbeat exceeds three times the interval set by "HeartbeatFrequencySeconds". It considers the second heartbeat missing if no heartbeat arrives after an additional "HeartbeatFrequencySeconds" interval, and the third heartbeat missing if no heartbeat arrives within yet another "HeartbeatFrequencySeconds" interval. For example, if "HeartbeatFrequencySeconds" is set to 2, the first heartbeat is assumed to be missing after six seconds, and the second and third heartbeats are each assumed to be missing after an additional two seconds. A total of ten seconds elapses before the Manager initiates failover procedures.
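The timing rule above can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical and are not part of Sybase CEP Server. It simply reproduces the arithmetic described in the preceding paragraph.

```python
def missed_heartbeat_deadlines(heartbeat_frequency_seconds):
    """Return the elapsed seconds (since the last received heartbeat) at which
    the first, second, and third heartbeats are considered missing."""
    # First heartbeat: missing once the gap exceeds three intervals.
    first = 3 * heartbeat_frequency_seconds
    # Second and third heartbeats: one additional interval each.
    second = first + heartbeat_frequency_seconds
    third = second + heartbeat_frequency_seconds
    return [first, second, third]


def failure_detection_delay(heartbeat_frequency_seconds):
    """Total seconds that elapse before failover procedures begin."""
    return missed_heartbeat_deadlines(heartbeat_frequency_seconds)[-1]


print(missed_heartbeat_deadlines(2))   # [6, 8, 10]
print(failure_detection_delay(2))      # 10
```

In general, the delay before failover works out to five times the "HeartbeatFrequencySeconds" value, so lowering the setting shortens detection time at the cost of more frequent heartbeat traffic.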

After waiting the appropriate number of seconds, the Manager initiates failover procedures in the following order:

  1. If an active Container fails momentarily and restarts immediately, the Manager reassigns the Container's workload back to the Container.

    This occurs if a Container fails and restarts before the Manager detects the failure. Once the Manager determines that a Container has failed, it does not attempt to restart that Container.

  2. If an active Container fails and does not recover immediately, and one or more passive Containers are available, the Manager activates a passive Container and reassigns the failed Container's work to the newly activated Container.

  3. If an active Container fails and does not recover immediately, and no passive Containers are available, the Manager redistributes the failed Container's workload among the remaining active Containers.
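The three cases above can be summarized as a simple decision function. This Python sketch is a hedged illustration of the documented order, not the Manager's actual implementation. The function name, its parameters, and the returned action labels are all hypothetical. In particular, the sketch picks the first passive Container in the list, which is an arbitrary assumption: the documentation does not say how the Manager chooses among passive Containers.

```python
def choose_failover_action(restarted_before_detection,
                           passive_containers,
                           remaining_active_containers):
    """Return (action, target_containers) following the documented order.

    restarted_before_detection: True if the Container came back before the
        Manager detected the failure (case 1).
    passive_containers: names of passive Containers currently available.
    remaining_active_containers: names of the other active Containers.
    """
    # Case 1: the Container recovered on its own, so it gets its work back.
    if restarted_before_detection:
        return ("reassign_to_same_container", [])

    # Case 2: a passive Container is available; activate one and give it
    # the failed Container's work. Choosing the first entry is an assumption.
    if passive_containers:
        return ("activate_passive", [passive_containers[0]])

    # Case 3: no passive Containers; spread the work over the remaining ones.
    return ("redistribute", list(remaining_active_containers))


print(choose_failover_action(False, ["standby1"], ["c1", "c2"]))
# ('activate_passive', ['standby1'])
print(choose_failover_action(False, [], ["c1", "c2"]))
# ('redistribute', ['c1', 'c2'])
```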

When configuring Container High Availability features, you can specify the number of active Containers that the Manager should try to maintain at any given time and the maximum number of CCX modules that each Container is allowed to run. However, you cannot explicitly indicate to the Manager which Containers it should activate at a particular time.