Communication Failure or Coordinator Failure and Restart During Global Transaction

If internode communication (INC) fails, or the coordinator fails or is shut down, during a writer-initiated global transaction, the transaction suspends and then resumes automatically, provided that INC is restored before a user-specified timeout expires.

Delays in command execution may indicate INC suspend and resume operations. If INC is interrupted, the coordinator suspends a global transaction for up to an hour by default. If INC is restored within that period, the transaction resumes successfully as soon as communication returns. If the timeout value elapses first, the transaction fails. To change the timeout period, set the MPX_LIVENESS_TIMEOUT database option.
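
The timeout rule described above can be modeled in a few lines of Python. This is an illustrative sketch only, not SAP IQ code: the function name transaction_outcome, the use of seconds as the unit, and the treatment of an outage exactly equal to the timeout as a failure are assumptions made for the example. The 3600-second default simply mirrors the one-hour suspension mentioned above.

```python
# Assumption: the one-hour suspension described above, expressed in seconds.
DEFAULT_TIMEOUT_SECONDS = 3600


def transaction_outcome(outage_seconds, timeout_seconds=DEFAULT_TIMEOUT_SECONDS):
    """Return the fate of a suspended global transaction in this toy model.

    outage_seconds: how long INC was unavailable (hypothetical input).
    timeout_seconds: stand-in for the MPX_LIVENESS_TIMEOUT setting.
    """
    if outage_seconds < 0 or timeout_seconds < 0:
        raise ValueError("durations must be non-negative")
    # INC restored before the timeout expires: the transaction resumes.
    if outage_seconds < timeout_seconds:
        return "resumes"
    # The timeout elapsed first (modeled here as outage >= timeout): it fails.
    return "fails"
```

In this model, transaction_outcome(60) returns "resumes", while transaction_outcome(90, timeout_seconds=60) returns "fails".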

The following cases describe the behavior of writer nodes.

Communication to Coordinator Resumes Before Timeout

Writer Command Status: Actively executing command
Command Behavior: Command suspends, except for ROLLBACK, which executes locally on the writer.
Result: Command succeeds.

Writer Command Status: New DML command
Command Behavior: Command suspends and resumes, except for ROLLBACK and ROLLBACK TO SAVEPOINT, which execute locally on the writer.
Result: If communication is restored, resumed commands succeed.

Communication Failure Exceeds Timeout

Writer Command Status: Suspended DML command on connection
Command Behavior: The suspended command fails and returns an error about the non-recoverable state of the transaction.
Result: You must roll back the transaction. Rollback happens automatically if the suspended command is COMMIT or ROLLBACK TO SAVEPOINT.

Writer Command Status: No suspended DML command on connection
Command Behavior: The next command returns an error about the non-recoverable state of the transaction.
Result: You must roll back the transaction.
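
The two cases above amount to a small decision table, which the following Python sketch encodes for quick reference. It is a reading aid under stated assumptions, not part of the product: the function names before_timeout and after_timeout and the returned strings are hypothetical, and the logic only restates the rows listed above.

```python
def before_timeout(command, actively_executing):
    """Writer behavior when communication resumes before the timeout."""
    command = command.upper()
    # Per the first case, only ROLLBACK is exempt for an actively executing
    # command; for a new command, ROLLBACK TO SAVEPOINT is exempt as well.
    if actively_executing:
        local_commands = {"ROLLBACK"}
    else:
        local_commands = {"ROLLBACK", "ROLLBACK TO SAVEPOINT"}
    if command in local_commands:
        return "executes locally on the writer"
    return "suspends, then succeeds once communication is restored"


def after_timeout(suspended_command=None):
    """Writer behavior when the communication failure exceeds the timeout."""
    if suspended_command is None:
        # No suspended command: the error surfaces on the next command.
        return "next command fails; you must roll back the transaction"
    if suspended_command.upper() in {"COMMIT", "ROLLBACK TO SAVEPOINT"}:
        return "suspended command fails; rollback happens automatically"
    return "suspended command fails; you must roll back the transaction"
```

For example, before_timeout("ROLLBACK", True) reports local execution on the writer, while after_timeout("COMMIT") reports an automatic rollback.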

To check connection status, run the sp_iqconnection system procedure on a writer node or the sp_iqmpxsuspendedconninfo system procedure on the coordinator.

Run sp_iqmpxincstatistics for a snapshot of the aggregate statistics of the INC status since server startup.

Note: If a global transaction initiated from a writer node modifies both global and local persistent objects (for example, an SA base table and an IQ base table), and the coordinator fails during commit, global object changes may be committed while local object changes are lost. This is consistent with a scenario that updates both local and proxy tables in the same transaction, where “best effort” is used to commit both local and global components of a transaction.