The SDK supports failover, either fully transparent or automatic with event notification, in the following situations:
- Cluster failovers – the URIs used to connect to a back-end component can include a list of cluster manager specifications. The SDK maintains connections to these managers transparently. If any one manager in the cluster goes down, the SDK tries to reconnect to another instance. If connections to all known instances fail, the SDK returns an error. When working in callback or select access mode, you can configure the SDK with an additional level of tolerance for loss of connectivity. In this case, the SDK does not disconnect a NetEspServer instance even if all known manager instances are down. Instead, it generates a NET_ESP_SERVER_EVENT_STALE event. If it manages to reconnect within a configurable number of attempts, it generates a NET_ESP_SERVER_EVENT_UPTODATE event. Otherwise, it disconnects and generates a NET_ESP_SERVER_EVENT_DISCONNECTED event.
- Project failovers – an Event Stream Processor cluster allows a project to be deployed with failover. Based on the configuration settings, the cluster restarts a project if it detects that it has exited (projects are not restarted, however, if the user explicitly closes them). To support this, you can have NetEspProject instances monitor the cluster for project restarts and then reconnect. This works only in callback or select access mode. The SDK generates a NET_ESP_PROJECT_EVENT_STALE event when it detects that the project has gone down. If it is able to reconnect, it generates a NET_ESP_PROJECT_EVENT_UPTODATE event. Otherwise, it generates a NET_ESP_PROJECT_EVENT_DISCONNECTED event.
- Active-active deployments – you can deploy a project in active-active mode. In this mode, the cluster starts two instances of the project, a primary instance and a secondary instance. Any data published to the primary instance is automatically mirrored to the secondary instance. When the SDK is connected to an active-active deployment and the currently connected instance goes down, NetEspProject tries to reconnect to the alternate instance. Unlike cluster and project failovers, this happens transparently: if the reconnection is successful, no event is generated for the user. In addition to NetEspProject, publishing and subscribing also support this mode. If a client is subscribed to a project in an active-active deployment, the SDK does not disconnect the subscription when the connected instance goes down. Instead, it generates a NET_ESP_SUBSCRIBER_EVENT_DATA_LOST event and then tries to reconnect to the peer instance. If it is able to reconnect, the SDK resubscribes to the same streams. Subscription clients then receive a NET_ESP_SUBSCRIBER_EVENT_SYNC_START event, followed by the data events, and finally a NET_ESP_SUBSCRIBER_EVENT_SYNC_END event. Clients can use this sequence to keep their view of the data consistent if needed. Reconnection during publishing is also supported, but only when publishing in synchronous mode, because the SDK cannot guarantee data consistency otherwise. Reconnection during publishing happens transparently; no user-visible events are generated.
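
The subscriber event sequence described for active-active deployments can be modeled with a small, SDK-independent sketch. The following Python code is illustrative only and does not use the SDK's API: the event names come from the text above, but representing them as strings, the `DATA` placeholder, the `StreamView` class, and its `on_event` method are all hypothetical. It also assumes that the data events delivered between SYNC_START and SYNC_END form a complete snapshot of the stream, which the text does not state explicitly.

```python
# Event names are taken from the text; modeling them as strings is an assumption.
DATA_LOST = "NET_ESP_SUBSCRIBER_EVENT_DATA_LOST"
SYNC_START = "NET_ESP_SUBSCRIBER_EVENT_SYNC_START"
SYNC_END = "NET_ESP_SUBSCRIBER_EVENT_SYNC_END"
DATA = "DATA"  # hypothetical stand-in for the SDK's data events


class StreamView:
    """Hypothetical client-side view of one subscribed stream, keyed by row key."""

    def __init__(self):
        self.rows = {}        # the view exposed to the application
        self.stale = False    # True between DATA_LOST and SYNC_END
        self._staging = None  # rows collected while a resync is in progress

    def on_event(self, kind, key=None, value=None):
        if kind == DATA_LOST:
            # The connected instance went down; the view may be missing updates.
            self.stale = True
        elif kind == SYNC_START:
            # Collect the resync data separately instead of patching the old view.
            self._staging = {}
        elif kind == DATA:
            target = self.rows if self._staging is None else self._staging
            target[key] = value
        elif kind == SYNC_END:
            # Assumption: the data between SYNC_START and SYNC_END is a full
            # snapshot, so it replaces the old view in one step.
            if self._staging is not None:
                self.rows = self._staging
                self._staging = None
            self.stale = False


# Usage: row "b" was never re-sent by the peer instance, so it is dropped.
view = StreamView()
view.on_event(DATA, "a", 1)
view.on_event(DATA, "b", 2)
view.on_event(DATA_LOST)
view.on_event(SYNC_START)
view.on_event(DATA, "a", 1)
view.on_event(DATA, "c", 3)
view.on_event(SYNC_END)
```

Staging the resynchronized rows and swapping them in at SYNC_END means the application never sees a half-rebuilt view; a client that only needs the latest values could instead apply the data events directly and use the `stale` flag to warn its users.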