Avoiding deadlock scenarios

When you apply response time thresholds to components and specify a low number of minimum instances, the server may deadlock with some application architectures, including:

  - components that make intercomponent calls and are also invoked directly by base clients
  - components that call themselves recursively

To understand how deadlock can occur and how to avoid it, you must first understand how the Performance Monitor governs client load. During normal operation (after server start-up), Performance Monitor governs load as follows:

  1. Before an entity starts executing a request, it calls the Performance Monitor to check whether it should proceed.

  2. The Performance Monitor checks the entity’s threshold properties and measured average response time. Based on these values, the entity is blocked temporarily or allowed to execute.

  3. After the request completes, the entity again calls the Performance Monitor, allowing it to measure the actual execution time. The Performance Monitor folds this measurement into the entity’s average response time.

    If the average time is lower than the configured threshold, the Performance Monitor increases the maximum allowable simultaneous instances for the entity by one. If the average time is higher than the threshold, it reduces the maximum allowable instances by one.

    This reduction continues until the maximum allowable instances equals the configured Minimum Number of Instances.

During server start-up, the Performance Monitor uses the same algorithm, but starts with the configured Minimum Number of Instances as the maximum number of instances that can execute. The number of instances can grow if initial response times are lower than the specified threshold.
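
As a rough illustration, the governing algorithm can be sketched as follows. The class and method names are hypothetical, not the server's actual API:

```python
import threading

class PerformanceMonitor:
    """Minimal sketch of the load-governing algorithm; all names
    here are illustrative, not the server's real interface."""

    def __init__(self, min_instances, threshold_ms):
        self.min_instances = min_instances
        self.threshold_ms = threshold_ms
        # At server start-up, the cap equals the configured minimum.
        self.max_instances = min_instances
        self.active = 0
        self.avg_ms = 0.0
        self.samples = 0
        self.lock = threading.Lock()

    def try_acquire(self):
        # Steps 1-2: called before a request executes; the entity is
        # blocked temporarily once the instance cap is reached.
        with self.lock:
            if self.active < self.max_instances:
                self.active += 1
                return True
            return False

    def release(self, elapsed_ms):
        # Step 3: called after the request completes; the measured time
        # is folded into the running average, and the cap is adjusted.
        with self.lock:
            self.active -= 1
            self.samples += 1
            self.avg_ms += (elapsed_ms - self.avg_ms) / self.samples
            if self.avg_ms < self.threshold_ms:
                self.max_instances += 1
            elif self.max_instances > self.min_instances:
                self.max_instances -= 1  # never drops below the minimum
```

A fast request (average below the threshold) grows the cap by one; a slow request shrinks it, but never below the configured minimum.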

When components make intercomponent calls and are also invoked directly by base clients, deadlock can occur when client invocations have exhausted the allowable number of instances and intercomponent calls require the creation of additional instances. Consider components A and B, both with response-time thresholds configured and a Minimum Number of Instances of 5, and the following sequence of events:

  1. Five clients invoke A and B, creating five instances of each component.

  2. A attempts to call B, but is blocked because measured response times are over the threshold (or the server is just starting, and no response times have been measured).

  3. B attempts to call A, but is blocked because measured response times are over the threshold. A and B are deadlocked.
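
The sequence above can be simulated with ordinary semaphores, modeling each component's instance pool as 5 permits. This is a model of the scenario, not the server's implementation:

```python
import threading

# Each component's instance pool modeled as 5 permits -- the configured
# minimum, which is also the cap at start-up or after slow responses.
slots_a = threading.Semaphore(5)
slots_b = threading.Semaphore(5)

# Step 1: five base clients occupy every slot of A and of B.
for _ in range(5):
    assert slots_a.acquire(blocking=False)
    assert slots_b.acquire(blocking=False)

# Step 2: an instance of A tries to call B, but no slot is free.
a_calls_b = slots_b.acquire(blocking=False)   # False: A is blocked

# Step 3: an instance of B tries to call A, but no slot is free.
b_calls_a = slots_a.acquire(blocking=False)   # False: B is blocked

# No occupied slot can be released until these calls finish, and the
# calls cannot start until a slot is released: A and B are deadlocked.
assert not a_calls_b and not b_calls_a
```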

To avoid this pitfall, either remove response time monitoring from the components and apply it to the network listener, or split the component logic into two sets of components: create a thin wrapper to be invoked by base clients, and have it call the original component, which performs the logic that requires intercomponent calls. Configure response time monitoring only on the wrapper components that base clients invoke directly.
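
The split might look like the following sketch, with hypothetical class names (WrapperA, CoreA, CoreB); only the wrapper layer would have a response time threshold configured:

```python
class CoreB:
    """Original component B logic (hypothetical name); not monitored."""
    def compute(self):
        return 42

class CoreA:
    """Original component A logic (hypothetical name); free to make
    intercomponent calls because it is not monitored."""
    def do_work(self, core_b):
        return core_b.compute()

class WrapperA:
    """Thin wrapper invoked by base clients. Only this layer would be
    subject to response time monitoring (monitoring omitted here)."""
    def __init__(self, core_a, core_b):
        self.core_a = core_a
        self.core_b = core_b

    def handle_request(self):
        return self.core_a.do_work(self.core_b)
```

Base clients see only the wrapper; because the governed layer never makes intercomponent calls itself, throttling the wrappers cannot strand a call between components.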

A component that calls itself recursively can deadlock in a similar scenario, and the cure is also similar: remove response time monitoring from the component and apply it to the network listener, or create a wrapper component to be invoked by base clients that calls the recursive component, configuring response time monitoring on the wrapper instead of the recursive component.
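
The recursive case can be modeled the same way (again a semaphore model, not the server itself): with the component pinned at a minimum of one instance, the recursive call needs a slot that its own caller holds.

```python
import threading

# A recursive component pinned at a minimum of one instance,
# modeled as a single permit.
slot = threading.Semaphore(1)

assert slot.acquire(blocking=False)   # a base client invokes the component
# The component now calls itself; the recursive call needs a second
# instance, but the only slot is held by its own caller: self-deadlock.
recursive_call = slot.acquire(blocking=False)
assert not recursive_call
```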

If you suspect that your components are deadlocked because of response time monitoring, analyze the stack traces in the Performance Monitor statistics.