Event Stream Processor clusters are designed for simplicity and, once started, require minimal interaction from administrators.
A cluster consists of a group of nodes, which are processes that run on hosts. A cluster can have a single node or multiple nodes. Single-node clusters provide a convenient starting point from which to build and refine multinode clusters.
A single-node cluster is a cluster with a single manager node, which functions as both a manager and a controller. In development and test environments, a single-node cluster may be sufficient. You can deploy several projects to a single-node cluster, which monitors project status and, if a deployed project has failover configured, restarts that project when it fails. However, as you develop and refine your Event Stream Processor environment, the demands on your cluster grow. You can then expand your cluster to include additional nodes and, if necessary, add more clusters.
A cluster with multiple manager nodes is called a multinode cluster. In a multinode cluster, all manager nodes are considered primary, so there is no single point of failure in the cluster. However, if you configure only one controller for multiple managers, the controller can become a single point of failure.
When a project is deployed to a cluster, it maintains a heartbeat with one of the managers in the cluster. If the manager node detects three consecutive missed heartbeats from a project, it assumes the project has failed, issues a STOP command, and, if the deployed project has failover configured, restarts the project. If CPU utilization on the host reaches 100 percent, the project server may not be able to send heartbeats to the cluster manager, in which case the manager stops the project. In multinode clusters, the manager that monitors a project may be different from the manager through which the project was deployed.
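The heartbeat rule described above can be illustrated with a minimal sketch. This is not Event Stream Processor code: the class ProjectMonitor, the method on_interval, and the recorded action strings are hypothetical names chosen for illustration, and the sketch assumes the manager checks for a heartbeat once per fixed interval. Only the threshold of three consecutive misses and the STOP-then-restart behavior come from the text.

```python
from dataclasses import dataclass, field

# From the text: three consecutive missed heartbeats mean assumed failure.
MISSED_HEARTBEAT_LIMIT = 3


@dataclass
class ProjectMonitor:
    """Hypothetical model of how a manager might track one project."""
    name: str
    failover_enabled: bool = False
    missed: int = 0
    actions: list = field(default_factory=list)

    def on_interval(self, heartbeat_received: bool) -> None:
        """Record the outcome of one heartbeat interval (assumed fixed)."""
        if heartbeat_received:
            # Any heartbeat resets the count; misses must be consecutive.
            self.missed = 0
            return
        self.missed += 1
        if self.missed >= MISSED_HEARTBEAT_LIMIT:
            # The manager assumes failure and stops the project.
            self.actions.append("STOP")
            if self.failover_enabled:
                # Restart only when failover is configured for the project.
                self.actions.append("RESTART")
            self.missed = 0


monitor = ProjectMonitor("demo_project", failover_enabled=True)
for received in [False, False, True, False, False, False]:
    monitor.on_interval(received)
print(monitor.actions)
```

In this run the first two misses are cancelled by the heartbeat that follows them, so only the final three consecutive misses trigger the stop and restart. A host saturated at 100 percent CPU would look the same to the manager as a failed project, because the manager sees only the missing heartbeats.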
Manager nodes are paired with other managers through a shared cache. If a manager node starts a project and subsequently fails, any other manager that shares the cache can take over management of the projects that the failed manager node was monitoring.
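The takeover described above can be pictured with a small sketch. It rests on several assumptions: the shared cache is modeled as a plain dictionary mapping project names to the manager that monitors them, the function take_over and the manager names are hypothetical, and orphaned projects are spread round-robin across the surviving managers. The text does not say how a surviving manager is chosen, so the round-robin policy is purely illustrative.

```python
def take_over(shared_cache: dict, failed_manager: str, live_managers: list) -> list:
    """Reassign projects monitored by a failed manager (illustrative only).

    shared_cache maps project name -> monitoring manager name and stands in
    for the cache that paired managers share.
    """
    if not live_managers:
        raise ValueError("no surviving manager shares the cache")
    # Projects whose monitoring manager has failed.
    orphaned = sorted(
        project for project, manager in shared_cache.items()
        if manager == failed_manager
    )
    # Assumed policy: distribute orphaned projects round-robin.
    for index, project in enumerate(orphaned):
        shared_cache[project] = live_managers[index % len(live_managers)]
    return orphaned


cache = {"orders": "managerA", "alerts": "managerB", "trades": "managerA"}
moved = take_over(cache, "managerA", ["managerB", "managerC"])
print(moved)
print(cache)
```

Because the ownership record lives in the shared cache rather than inside the failed manager, the surviving managers have what they need to resume monitoring without the project being redeployed.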