High Availability

In Event Stream Processor, server clusters promote failure recovery and data redundancy. Event Stream Processor supports an added level of high availability at the project level called active-active mode. Active-active mode is also known as HA (high availability) mode; the terms are interchangeable.

A single-node cluster provides project-level failure recovery, meaning it detects when a project stops running and automatically restarts it. However, a single-node cluster does not protect against server failure.

A multinode cluster can protect against server failure. When a server in such a cluster fails, the projects running on the failed server are restarted on other servers if their affinities allow it. (Affinities control which server or servers a project can run on.)
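
The restart decision described above can be sketched in a few lines of Python. This is an illustrative model only, not Event Stream Processor code: the function name plan_restarts, the dictionary shapes, and the reduction of affinities to a plain set of permitted node names are all assumptions made for the example. A project with no entry in allowed_nodes is treated as unrestricted.

```python
from typing import Dict, List, Optional, Set


def plan_restarts(
    failed_node: str,
    placements: Dict[str, str],
    allowed_nodes: Dict[str, Set[str]],
    live_nodes: List[str],
) -> Dict[str, Optional[str]]:
    """Pick a new node for each project that was running on failed_node.

    Hypothetical helper: a project maps to None when no surviving node
    satisfies its (simplified) affinities, so it cannot be restarted.
    """
    plan: Dict[str, Optional[str]] = {}
    for project, node in placements.items():
        if node != failed_node:
            continue  # project was not running on the failed server
        allowed = allowed_nodes.get(project)  # None means "no restriction"
        candidates = [
            n for n in live_nodes
            if n != failed_node and (allowed is None or n in allowed)
        ]
        plan[project] = candidates[0] if candidates else None
    return plan


if __name__ == "__main__":
    placements = {"trades": "node1", "quotes": "node1", "risk": "node2"}
    allowed = {"trades": {"node1", "node2"}, "quotes": {"node1"}}
    print(plan_restarts("node1", placements, allowed, ["node2", "node3"]))
```

In this example, trades can move to node2, while quotes is pinned to the failed node and stays down until that node returns.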

When you deploy projects in active-active mode, two instances of the same project run in the cluster, preferably on separate machines. One instance of the project is designated as the primary instance, and the other is designated as the secondary instance. All connections from outside the cluster (adapters, clients, Studio) are directed to the primary instance. If the primary instance fails, all connections are automatically directed to the secondary instance.
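
As a rough illustration of the redirection behavior, the following Python sketch models an active-active pair and chooses the instance that outside connections should use. The class names Instance and ActiveActivePair, the healthy flag, and the route method are hypothetical and exist only for this example; the product performs this redirection itself.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    host: str
    healthy: bool = True


class ActiveActivePair:
    """Hypothetical model of one project deployed in active-active mode."""

    def __init__(self, primary: Instance, secondary: Instance) -> None:
        self.primary = primary
        self.secondary = secondary

    def route(self) -> Instance:
        """Return the instance that adapters, clients, and Studio connect to."""
        if self.primary.healthy:
            return self.primary
        if self.secondary.healthy:
            return self.secondary  # primary failed: connections move here
        raise RuntimeError("no healthy instance available")


if __name__ == "__main__":
    pair = ActiveActivePair(Instance("hostA"), Instance("hostB"))
    print(pair.route().host)      # primary while it is healthy
    pair.primary.healthy = False  # simulate a primary failure
    print(pair.route().host)      # connections now go to the secondary
```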

Data is continuously synchronized between the primary and secondary instances. The primary instance receives each message first. To maintain redundancy, the secondary instance must also acknowledge receipt of the message before the primary instance begins processing.
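
The ordering described above (receive on the primary, acknowledge on the secondary, then process) can be sketched as follows. This is a minimal in-memory model, not the product's replication protocol: the Primary and Secondary classes, the boolean acknowledgement, and the choice to raise an error when no acknowledgement arrives are all assumptions made for illustration.

```python
from typing import Callable, List


class Secondary:
    """Hypothetical secondary instance that stores each forwarded message."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def receive(self, message: str) -> bool:
        self.log.append(message)
        return True  # acknowledgement sent back to the primary


class Primary:
    """Hypothetical primary instance that waits for the ack before processing."""

    def __init__(self, secondary: Secondary, process: Callable[[str], None]) -> None:
        self.secondary = secondary
        self.process = process
        self.received: List[str] = []

    def on_message(self, message: str) -> None:
        self.received.append(message)         # 1. primary receives first
        acked = self.secondary.receive(message)  # 2. secondary must acknowledge
        if not acked:
            # Assumption for this sketch only: treat a missing ack as an error.
            raise RuntimeError("secondary did not acknowledge the message")
        self.process(message)                 # 3. only now does processing begin


if __name__ == "__main__":
    processed: List[str] = []
    primary = Primary(Secondary(), processed.append)
    primary.on_message("trade-1")
    print(primary.secondary.log, processed)
```

Because processing starts only after the acknowledgement, every message the primary has processed is already held by the secondary, which is what allows the secondary to take over without losing data.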