The SDK supports failover, either fully transparent or automatic with event notification, in the following situations:
- Cluster Failovers – the URIs used to connect to a back-end component can include a list of cluster manager specifications.
The SDK maintains connections to these transparently.
If any one manager in the cluster goes down, the SDK tries to reconnect to another instance.
If connections to all known instances fail, the SDK returns an error.
If working in callback or select access modes, you can configure the SDK with an additional level of tolerance for loss of connectivity.
In this case, the SDK does not disconnect a Server instance even if all known manager instances are down.
Instead, it generates a ServerEvent.STALE event.
If it manages to reconnect after a (configurable) number of attempts, it generates a ServerEvent.UPTODATE event. Otherwise, it disconnects and generates a ServerEvent.DISCONNECTED event.
- Project Failovers – an Event Stream Processor cluster lets you deploy projects with failover.
Based on the configuration settings, a cluster restarts a project if it detects that the project has exited. Projects that the user explicitly closes are not restarted.
To support this, you can have Project instances monitor the cluster for project restarts and then reconnect.
This works only in callback or select modes.
When the SDK detects that a project has gone down, it generates a ProjectEvent.STALE event.
If it is able to reconnect, it generates a ProjectEvent.UPTODATE event; otherwise, it generates a ProjectEvent.DISCONNECTED event.
- Active-Active Deployments – you can deploy a project in active-active mode.
In this mode, a cluster starts two instances of the project, a primary instance and a secondary instance.
Any data published to the primary is automatically mirrored to the secondary instance.
The SDK supports active-active deployments.
When connected to an active-active deployment, if the currently connected instance goes down, the Project tries to reconnect to the alternate instance.
Unlike cluster and project failovers, this happens transparently: if the reconnection succeeds, the user receives no indication that it occurred.
In addition to the Project, there is support for this mode when publishing and subscribing.
If subscribed to a project in an active-active deployment, the SDK does not disconnect the subscription if the instance goes down.
Instead, it generates a SubscriberEvent.DATA_LOST event.
It then tries to reconnect to the peer instance.
If it is able to reconnect, the SDK resubscribes to the same streams.
Subscription clients then receive a SubscriberEvent.SYNC_START event, followed by the data events, and finally a SubscriberEvent.SYNC_END event.
Clients can use this sequence to maintain consistency with their view of the data if needed.
Reconnection during publishing is also supported, but only when publishing in synchronous mode.
In any other mode, the SDK cannot guarantee data consistency.
Reconnection during publishing happens transparently; there are no external user events generated.
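
The subscription resynchronization sequence described above (DATA_LOST, then SYNC_START, data events, SYNC_END) can be modeled with a small client-side view. The following Java sketch is illustrative only and does not use the SDK: `SubEventType`, `ResyncingView`, and `ResyncDemo` are hypothetical names, the `DATA` constant is a stand-in for ordinary data events, and the sketch assumes that the data delivered between SYNC_START and SYNC_END is a complete snapshot that should replace the previous view.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the SDK's subscriber event types. Only the names
// DATA_LOST, SYNC_START, and SYNC_END come from the text; DATA is assumed.
enum SubEventType { DATA, DATA_LOST, SYNC_START, SYNC_END }

// Hypothetical client-side view of subscribed records, keyed by record key.
class ResyncingView {
    private final Map<String, String> live = new LinkedHashMap<>();
    private Map<String, String> pending = null; // non-null while a resync is in progress
    private boolean stale = false;              // true between DATA_LOST and SYNC_END

    void onEvent(SubEventType type, String key, String value) {
        switch (type) {
            case DATA_LOST:
                // The connected instance went down; the view may now be out of date.
                stale = true;
                break;
            case SYNC_START:
                // Collect the resynchronized data separately from the live view.
                pending = new LinkedHashMap<>();
                break;
            case DATA:
                // During a resync, buffer the record; otherwise apply it directly.
                (pending != null ? pending : live).put(key, value);
                break;
            case SYNC_END:
                // Assumption: the buffered data is a full snapshot, so swap it in.
                if (pending != null) {
                    live.clear();
                    live.putAll(pending);
                    pending = null;
                }
                stale = false;
                break;
        }
    }

    boolean isStale() { return stale; }

    Map<String, String> snapshot() { return new LinkedHashMap<>(live); }
}

class ResyncDemo {
    public static void main(String[] args) {
        ResyncingView view = new ResyncingView();
        view.onEvent(SubEventType.DATA, "a", "1");        // normal delivery
        view.onEvent(SubEventType.DATA_LOST, null, null); // instance went down
        view.onEvent(SubEventType.SYNC_START, null, null);
        view.onEvent(SubEventType.DATA, "b", "2");        // data from the peer instance
        view.onEvent(SubEventType.SYNC_END, null, null);
        System.out.println(view.snapshot());
    }
}
```

Buffering until SYNC_END means readers of the view never see a half-synchronized state. If the resynchronized data turns out to be incremental rather than a full snapshot, the swap in the SYNC_END branch would need to merge instead of replace.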