Monitoring links between instances

The monCIPCLinks monitoring table monitors the state of the links between instances in the cluster. monCIPCLinks includes two states for each link: “passive” and “active.”

Note: A logical cluster and each instance in the cluster can have different states.

See “Cluster and instance states” for a detailed description of cluster “states.”

The “passive” state is used to monitor day-to-day messages sent over the links. The Cluster Edition gathers the “active” state when there is no message traffic over the link. The status of each link is described as a “state,” and each state has an associated age, in milliseconds. The states are “Up,” “Down,” and “In doubt.” The passive state is “In doubt” when no messages are being sent between the instances.

When the cluster is healthy, regular internode traffic is used to determine the state of the link. This is referred to as passive monitoring, and it maintains the link’s passive state. If the state is determined during a period of inactivity in the cluster, it may become stale and unreliable: a state that was determined to be Up may in fact be Down, but the inactivity prevents the monCIPCLinks table from showing this in the result set. This inactive state is described in the PassiveState column as “In doubt.” Once a link is marked as “In doubt,” active link state monitoring is triggered and the value in the ActiveState column is valid.

The active and passive states each have an associated age, showing how long ago the state was last updated. If normal traffic is sufficient to maintain the link state, the active state is not updated and its age value becomes large. A large age value indicates that the associated state may no longer accurately represent the true state of the link.

If instances are not sending messages, the PassiveState is listed as In doubt, but the ActiveState shows the actual state: Up, In doubt, or Down.
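The rule for choosing which column to trust can be sketched in Python. This is a minimal illustration only, not part of the Cluster Edition: the function name effective_link_state and its return value are hypothetical, and only the column names and the state values (“Up,” “Down,” “In doubt”) come from monCIPCLinks.

```python
def effective_link_state(passive_state, active_state):
    """Pick the state to trust for one monCIPCLinks row (hypothetical helper).

    While regular traffic keeps PassiveState current, use it. Once
    PassiveState is "In doubt", fall back to ActiveState, which may itself
    be "Up", "Down", or "In doubt".
    """
    if passive_state != "In doubt":
        return passive_state, "PassiveState"
    return active_state, "ActiveState"


# Traffic is flowing, so the passive state is used.
print(effective_link_state("Up", "In doubt"))    # ('Up', 'PassiveState')
# No traffic on the link, so the active state is used.
print(effective_link_state("In doubt", "Down"))  # ('Down', 'ActiveState')
```

The helper returns the name of the column it used so that a caller can report where the answer came from.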

This example shows a two-node cluster in which both links are running and have traffic flowing between them. Because the PassiveStateAge is 0 for all links, you can assume the output is a true reflection of the link state:

InstanceID     LocalInterface     RemoteInterface         PassiveState
      PassiveStateAge    ActiveState    ActiveStateAge
------------   --------------     ------------            --------------
      --------------     ------------   ---------------
2              ase2               ase2                    Up
      0                  Up             10300
2              blade2             blade1                  Up
      0                  Up             0
1              ase1               ase2                    Up
      0                  In doubt       17900
1              blade1             blade2                  Up
      0                  Up             100

This example shows the same two-node cluster after the primary interconnected network fails. The PassiveState value for the link between the network endpoints “ase1” and “ase2” is “In doubt”, and the value for the PassiveStateAge is “large” (indicating that the ActiveState represents the true state of the links). The ActiveState value is younger and shows the links as “Down”:

InstanceID     LocalInterface     RemoteInterface         PassiveState
      PassiveStateAge    ActiveState    ActiveStateAge
------------   --------------     ------------            --------------
      --------------     ------------   ---------------
2              ase2               ase1                    In doubt
      13500              Down           700
2              blade2             blade1                  Up
      0                  Up             700
1              ase1               ase2                    In doubt
      13600              Down           400
1              blade1             blade2                  Up
      0                  Up             400

Note: There is a slight delay between the failure of a link and the time the active state truly reflects the state of the link.

Ignore any state for which the ActiveStateAge value is “large”: the state is old and may be inaccurate. An old link state with a “large” ActiveStateAge means that active monitoring has been triggered by the absence of messages but has not yet determined the link state.
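As a rough sketch of this age check, the Python fragment below keeps only the states whose age is under a cutoff. The documentation does not define “large,” so the 10-second cutoff is purely an assumption for illustration, and the helper name fresh_states and the dictionary layout of a row are hypothetical.

```python
# Assumed cutoff: the documentation only says "large", so this
# 10,000 ms value is an illustrative guess, not a product setting.
STALE_AFTER_MS = 10_000


def fresh_states(row, stale_after_ms=STALE_AFTER_MS):
    """Return the states of one monCIPCLinks row whose age is below the cutoff.

    `row` is a dict keyed by monCIPCLinks column names (hypothetical layout).
    """
    fresh = {}
    for name in ("PassiveState", "ActiveState"):
        if row[name + "Age"] < stale_after_ms:
            fresh[name] = row[name]
    return fresh


# Row for instance 2, ase2 -> ase1, after the primary network fails.
row = {"PassiveState": "In doubt", "PassiveStateAge": 13500,
       "ActiveState": "Down", "ActiveStateAge": 700}
print(fresh_states(row))  # {'ActiveState': 'Down'}
```

An empty result means neither state is recent enough to rely on under the chosen cutoff, which matches the case where active monitoring has been triggered but has not yet reported.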

Note: When you set up both primary and secondary interconnected networks in your cluster input file, do not restart the cluster unless both interconnected networks are running.