Batch Processing

When stream processing logic is relatively light, inter-stream communication can become a bottleneck. To avoid such bottlenecks, you can publish data to the ESP server in micro batches. Batching reduces the overhead of inter-stream communication and thus increases throughput at the expense of increased latency.
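As a rough illustration of the idea, the following Python sketch groups an incoming sequence of records into fixed-size micro batches before they are handed to a publisher. It is not ESP SDK code: the function name `micro_batches` and the `publish_batch` callable mentioned in the comments are hypothetical, and the actual publish call depends on the client library you use.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size records.

    The final batch may be shorter if the input does not divide evenly.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical usage: publish_batch stands in for whatever call your
# client library provides to send one envelope or transaction.
# for batch in micro_batches(incoming_rows, 500):
#     publish_batch(batch)
```

Each batch costs one publish call instead of one call per record, which is where the reduction in communication overhead comes from. The first record of a batch is held until the batch is complete, which is the source of the added latency.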

ESP supports two modes of batching: envelopes and transactions.

In both cases, the number of records to place in a micro batch depends on the nature of the model and must be determined by trial and error. Typically, the best performance is achieved with batch sizes ranging from a few tens of rows to a few thousand rows. Increasing the number of rows per batch may increase throughput, but it also increases latency.
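The trial-and-error evaluation can be organized as a simple sweep over candidate batch sizes. The Python sketch below is a minimal harness under stated assumptions, not part of ESP: `publish_batch` is a hypothetical callable that sends one batch (as an envelope or a transaction) and returns when the send completes, and the names `publish_in_batches` and `sweep_batch_sizes` are illustrative only.

```python
import time
from typing import Callable, List, Sequence, Tuple


def publish_in_batches(
    records: Sequence,
    batch_size: int,
    publish_batch: Callable[[Sequence], None],
) -> int:
    """Send records in slices of batch_size; return the number of publish calls."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    calls = 0
    for start in range(0, len(records), batch_size):
        publish_batch(records[start:start + batch_size])
        calls += 1
    return calls


def sweep_batch_sizes(
    records: Sequence,
    candidate_sizes: Sequence[int],
    publish_batch: Callable[[Sequence], None],
) -> List[Tuple[int, int, float]]:
    """Time one full publish of records for each candidate batch size.

    Returns a list of (batch_size, publish_calls, elapsed_seconds).
    """
    results = []
    for size in candidate_sizes:
        started = time.perf_counter()
        calls = publish_in_batches(records, size, publish_batch)
        elapsed = time.perf_counter() - started
        results.append((size, calls, elapsed))
    return results


if __name__ == "__main__":
    # Stand-in publisher that only collects batches; replace it with a real
    # publish call to obtain meaningful timings for your model.
    sent = []
    rows = list(range(10_000))
    for size, calls, elapsed in sweep_batch_sizes(rows, [10, 100, 1000], sent.append):
        print(f"batch_size={size} calls={calls} elapsed={elapsed:.6f}s")
```

Timings from a stand-in publisher say nothing about a real model. Run the sweep against the actual project with representative data, and measure end-to-end latency alongside throughput, since a larger batch that improves one typically worsens the other.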