You can view the state transitions here[1].
General Flow of Activities
- Once we start and do some basic sanity checks, we go into the S_NOT_DC state and await instructions from the DC or input from the CCM which indicates the election algorithm needs to run. If the election algorithm is triggered, we enter the S_ELECTION state, from where we can either go back to the S_NOT_DC state or progress to the S_INTEGRATION state (or S_RELEASE_DC if we used to be the DC but aren't anymore).
Refer to the ClusterResourceManagerDaemon[2] page for details of the election and its algorithm.
Once the election is complete, if we are the DC, we enter the S_INTEGRATION state, which is a DC-in-waiting style state. We are the DC, but we shouldn't do anything yet because we may not have an up-to-date picture of the cluster. There may of course be times when this fails, in which case we should go back to the S_RECOVERY state and check everything is ok. We may also end up in S_INTEGRATION if a new node comes online, since each node is authoritative on itself and we would want to incorporate its information into the CIB.
Once we have the latest CIB, we enter the S_POLICY_ENGINE state, where we invoke the Policy Engine. It is possible that we receive more input between invoking the Policy Engine and receiving an answer. In this case we would discard the original result and invoke it again.
Once we are satisfied with the output from the Policy Engine, we enter S_TRANSITION_ENGINE and feed the Policy Engine's output to the Transition Engine, which attempts to make the Policy Engine's calculation a reality. If the transition completes successfully, we enter S_IDLE; otherwise we go back to S_POLICY_ENGINE with the current unstable state and try again.
Of course we may be asked to shut down at any time; however, we must progress to S_NOT_DC before doing so. Once we have handed over DC duties to another node, we can then shut down like everyone else, that is, by asking the DC for permission and waiting for it to take all our resources away. The case where we are the DC and the only node in the cluster is a special case and is handled as an escalation which takes us to S_SHUTDOWN. Similarly, if any other point in the shutdown fails or stalls, this is escalated and we end up in S_TERMINATE.
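The happy path described above can be pictured as a small lookup table keyed on (state, input). The following is only a sketch: the state and input names come from this page, but which input triggers which transition is an assumption made for illustration, and the real transition table is considerably larger. Leaving the state unchanged for unknown pairs is also a simplification.

```python
# Illustrative only: the (state, input) pairings below are assumptions based on
# the prose description, not the CRMd's actual transition table.
FLOW = {
    ("S_NOT_DC", "I_ELECTION"): "S_ELECTION",
    ("S_ELECTION", "I_NOT_DC"): "S_NOT_DC",
    ("S_ELECTION", "I_RELEASE_DC"): "S_RELEASE_DC",
    ("S_ELECTION", "I_ELECTION_DC"): "S_INTEGRATION",
    ("S_INTEGRATION", "I_SUCCESS"): "S_POLICY_ENGINE",
    ("S_POLICY_ENGINE", "I_SUCCESS"): "S_TRANSITION_ENGINE",
    ("S_TRANSITION_ENGINE", "I_SUCCESS"): "S_IDLE",
    ("S_TRANSITION_ENGINE", "I_FAIL"): "S_POLICY_ENGINE",
}


def next_state(state, fsa_input):
    """Return the next state; pairs missing from the sketch leave the state unchanged."""
    return FLOW.get((state, fsa_input), state)


def run(start, inputs):
    """Feed a sequence of inputs to the sketch and return every state visited."""
    state = start
    path = [state]
    for fsa_input in inputs:
        state = next_state(state, fsa_input)
        path.append(state)
    return path


if __name__ == "__main__":
    # A node wins the election, one transition fails, the retry succeeds.
    for visited in run("S_NOT_DC", ["I_ELECTION", "I_ELECTION_DC", "I_SUCCESS",
                                    "I_SUCCESS", "I_FAIL", "I_SUCCESS", "I_SUCCESS"]):
        print(visited)
```

In this sketch a failed transition sends the node back to S_POLICY_ENGINE, and only a successful one reaches S_IDLE, matching the retry loop in the text.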
At any point, the CRMd/DC can relay messages for its sub-systems, but outbound messages (from sub-systems) should probably be blocked until S_INTEGRATION (for the DC case) or until the join protocol has completed (for the CRMd case).
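One step of the flow, discarding a Policy Engine result that has been overtaken by new input, could be handled with a generation counter. This is a hedged sketch of that idea, not the CRMd's actual mechanism: the class `PolicyEngineSession` and its methods are hypothetical names introduced here for illustration.

```python
class PolicyEngineSession:
    """Hypothetical helper that tracks which Policy Engine request is current."""

    def __init__(self):
        self.generation = 0    # bumped every time new input arrives
        self.requested = None  # generation of the outstanding request, if any

    def invoke(self):
        """Record that the Policy Engine was invoked against the current input."""
        self.requested = self.generation
        return self.requested

    def new_input(self):
        """New input arrived; any outstanding request is now stale."""
        self.generation += 1

    def on_result(self, request_generation, result):
        """Accept a current result; discard a stale one and invoke again."""
        if request_generation != self.generation:
            self.invoke()      # original result is discarded
            return None
        self.requested = None
        return result
```

A result is only accepted when no input arrived after the request it answers; otherwise the caller gets `None` and a fresh request is outstanding.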
States
The list of possible states the FSA can be in.
- S_IDLE: Nothing happening
- S_ELECTION: Take part in the election algorithm as described below
- S_INTEGRATION: integrate the status of new nodes (which is all of them if we have just been elected DC) to form a complete and up-to-date picture of the CIB
- S_NOT_DC: we are in crmd/slave mode
- S_POLICY_ENGINE: Determine the next stable state of the cluster
- S_RECOVERY: Something bad happened, check everything is ok before continuing and attempt to recover if required
- S_RECOVERY_DC: Something bad happened to the DC, check everything is ok before continuing and attempt to recover if required
- S_RELEASE_DC: we were the DC, but now we aren't anymore, possibly by our own request, and we should release all unnecessary sub-systems, finish any pending actions, do general cleanup and unset anything that makes us think we are special
- S_PENDING: we are just starting out
- S_STOPPING: We are in the final stages of shutting down
- S_TERMINATE: We are going to shut down; this is the equivalent of "Sending TERM signal to all processes" in Linux and in worst-case scenarios could be considered a self-STONITH
- S_TRANSITION_ENGINE: Attempt to make the calculated next stable state of the cluster a reality
- S_ILLEGAL: This is an illegal FSA state (must be last)
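The note that S_ILLEGAL "must be last" suggests it doubles as an upper bound on the legal values. A minimal sketch of that convention follows, assuming the states are numbered from zero in the order listed above; the actual numbering in the CRMd sources may differ, and `FsaState` and `is_legal` are names invented for this example.

```python
from enum import IntEnum

# Order follows the list on this page; the numeric values are an assumption.
STATE_NAMES = [
    "S_IDLE", "S_ELECTION", "S_INTEGRATION", "S_NOT_DC", "S_POLICY_ENGINE",
    "S_RECOVERY", "S_RECOVERY_DC", "S_RELEASE_DC", "S_PENDING", "S_STOPPING",
    "S_TERMINATE", "S_TRANSITION_ENGINE",
    "S_ILLEGAL",  # must be last: every legal state sorts below it
]

FsaState = IntEnum("FsaState", STATE_NAMES, start=0)


def is_legal(value):
    """True if value is the number of a real state, i.e. below the S_ILLEGAL sentinel."""
    return 0 <= value < FsaState.S_ILLEGAL
```

Keeping the sentinel last means a new state can be added above it without touching the range check.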
Inputs
Inputs/Events/Stimuli to be given to the finite state machine. Some of these are true events, and others are synthesised based on the "register" (see below) and the contents or source of messages.
- I_NULL: Nothing happened
- I_CCM_EVENT
- I_CIB_OP: An update to the CIB occurred
- I_CIB_UPDATE: An update to the CIB occurred
- I_DC_TIMEOUT: We have lost communication with the DC
- I_ELECTION: Someone started an election
- I_RELEASE_DC: The election completed and we were not elected, but we were the DC beforehand
- I_ELECTION_DC: The election completed and we were (re-)elected DC
- I_ERROR: Something bad happened (more serious than I_FAIL) and may not have been due to the action being performed. For example, we may have lost our connection to the CIB.
- I_FAIL: The action failed to complete successfully
- I_INTEGRATION_TIMEOUT
- I_NODE_JOIN: A node has entered the CCM membership list
- I_NODE_LEFT: A node shut down (possibly unexpectedly)
- I_NODE_LEAVING: A node has asked to be shut down
- I_NOT_DC: We are not and were not the DC before or after the current operation or state
- I_RECOVERED: The recovery process completed successfully
- I_RELEASE_FAIL: We could not give up DC status for some reason
- I_RELEASE_SUCCESS: We are no longer the DC
- I_RESTART: The current set of actions needs to be restarted
- I_REQUEST: Some non-resource, non-CCM action is required of us, e.g. ping
- I_ROUTER: Do our job as router and forward this to the right place
- I_SHUTDOWN: We need to shut down
- I_STARTUP
- I_SUCCESS: The action completed successfully
- I_WELCOME: Welcome a newly joined node
- I_WELCOME_ACK: The newly joined node has acknowledged us as overlord
- I_WAIT_FOR_EVENT: we may be waiting for an async task to "happen" and until it does, we can't do anything else
- I_DC_HEARTBEAT: The DC is telling us that it is alive and well
- I_LRM_EVENT
- I_ILLEGAL: This is an illegal value for an FSA input
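As an illustration of a synthesised input, the three election outcomes listed above (I_ELECTION_DC, I_RELEASE_DC, I_NOT_DC) can be derived from two facts: whether we were the DC beforehand and whether we won. This is a sketch under the assumption that these inputs are derived this way; the function name and its parameters are hypothetical, with `was_dc` standing in for a flag held in the "register".

```python
def election_result_input(was_dc, won_election):
    """Pick the FSA input for a completed election.

    was_dc:       hypothetical register flag, True if we were the DC beforehand
    won_election: True if the election chose us as DC
    """
    if won_election:
        return "I_ELECTION_DC"   # elected, or re-elected, as DC
    if was_dc:
        return "I_RELEASE_DC"    # not elected, but we were the DC beforehand
    return "I_NOT_DC"            # not the DC before or after
```

For example, a node that was the DC and lost the election would be given I_RELEASE_DC, which per the flow above leads it towards S_RELEASE_DC.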
References
[1] http://www.linux-ha.org/_cache/ClusterResourceManagerDaemon_FSA__fsa_inputs_1.png
[2] http://www.linux-ha.org/ClusterResourceManagerDaemon