Cluster Concepts

The following definitions were adapted from the Open Cluster Framework definitions.

Cluster
For our purposes, a cluster is a collection of loosely coupled cooperating computing elements which we refer to as nodes. Failures in clusters are not observed instantaneously or simultaneously by every node. Instead, failures occur asynchronously, and are observed stochastically and independently by the nodes of the cluster. It is not guaranteed that any particular failure will be observed in the same way by every node in the cluster, or even observed at all by every node.

Subcluster
At any given point in time, a cluster is divided into zero or more subclusters (or partitions) of live nodes. Each live node has a view of subcluster membership, and acknowledges membership in no more than one of these subclusters. The most desirable state is that there is only one subcluster in the cluster, and that all (active) nodes belong to that subcluster. However, communication failures can cause a single cluster to divide into multiple subclusters which are partially or completely unaware of each other.

Primary Subcluster
One of these subclusters may be designated the primary subcluster. Traditionally, this primary subcluster is said to have quorum (or more specifically have cluster-quorum, or cluster-wide-quorum). How and why the primary subcluster is chosen is implementation-dependent. How and why each node associates itself with a particular subcluster is implementation-dependent.

Membership
Membership calculation is the process whereby each node associates itself with a particular subcluster, and obtains a (probabilistically current) view of subcluster membership. Quorum calculation is the process whereby the primary subcluster (if any) is selected.

Normally, a membership algorithm will try to make the primary subcluster as large as it can, but it can be proven that this cannot be achieved in every circumstance. The membership algorithm is under no obligation to make non-primary subclusters as large as possible.

At any given point in time, each node can view the subcluster of which it is a member as either having stable membership or being in transition. Stable membership is defined to mean that no event has yet been observed by the node which would cause it to recalculate subcluster membership and/or quorum. In transition means that a particular node has observed an event which may cause a membership change, but that the process of recomputing membership and/or quorum has not yet completed. Different nodes in a subcluster may have different views of the stable/transition state at any given point in time.
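The stable/in-transition distinction can be illustrated with a minimal sketch. The names MembershipState, NodeView, observe_event, and complete_recalculation are hypothetical and exist only for this illustration; they do not come from any particular cluster implementation.

```python
from enum import Enum


class MembershipState(Enum):
    STABLE = "stable"
    IN_TRANSITION = "in transition"


class NodeView:
    """One node's local view of its subcluster (hypothetical model)."""

    def __init__(self, members):
        self.members = frozenset(members)
        self.state = MembershipState.STABLE

    def observe_event(self):
        # An event that may change membership was seen; recalculation is pending.
        self.state = MembershipState.IN_TRANSITION

    def complete_recalculation(self, members):
        # Recalculation finished; the node regards its new view as stable.
        self.members = frozenset(members)
        self.state = MembershipState.STABLE


# Nodes a and b start with the same view of a three-node subcluster.
view_a = NodeView({"a", "b", "c"})
view_b = NodeView({"a", "b", "c"})

# Only node a has noticed that c may have failed.
view_a.observe_event()
print(view_a.state.value, "/", view_b.state.value)  # in transition / stable

view_a.complete_recalculation({"a", "b"})
print(sorted(view_a.members))  # ['a', 'b']
```

Between the two print calls, nodes a and b disagree about whether the subcluster is stable, which is the situation described in the last sentence of the paragraph above.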

Because failures occur asynchronously and are observed stochastically, membership provides only probabilistic (and not absolute) assurances of ability to communicate with any particular node. Because it is impossible to know whether a node in a subcluster might observe an event which would make it go into transition, membership is, in a very real sense, only probabilistically certain, never absolutely certain.

Determination of Quorum
The quorum calculation process is required to select no more than one primary subcluster at a time, but need not select any at all. Under some circumstances, designating more than one primary subcluster at a time can lead to irrecoverable application failures. Nevertheless, no quorum algorithm can provide an absolute guarantee of this property in the presence of arbitrary failures. Different quorum implementations provide different degrees of certainty for any given configuration and set of expected failures.

A common method for defining quorum is to say that a subcluster of size n in a cluster of m total nodes has quorum if it contains a majority of the cluster's nodes. That is, it has n members where n > INT(m/2). This simple method works quite well for larger clusters with generally reliable communications. However, it breaks down for 2-node clusters, and may perform poorly for geographically dispersed clusters without highly reliable communications between sites.
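The n > INT(m/2) rule can be written out as a short sketch. The function name has_quorum is hypothetical, and integer division stands in for INT(m/2).

```python
def has_quorum(n: int, m: int) -> bool:
    """Return True if a subcluster of n nodes has quorum in a cluster of m nodes."""
    if not 0 <= n <= m:
        raise ValueError("subcluster size must be between 0 and the cluster size")
    return n > m // 2


# Five-node cluster split 3/2: only the larger side has quorum.
print(has_quorum(3, 5), has_quorum(2, 5))  # True False

# Two-node cluster split 1/1: neither side has quorum.
print(has_quorum(1, 2))  # False
```

The second example shows why the rule breaks down for 2-node clusters: after any split, no surviving node can satisfy it alone. Because two disjoint subclusters cannot both hold more than half of the nodes, the rule never selects more than one primary subcluster, provided that every node agrees on m.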

Strongly Connected Subclusters
Strongly connected subclusters are subclusters where each node can communicate with every member of its subcluster.
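As a minimal sketch, strong connectivity can be checked from a table of which nodes can communicate with which. The function name is_strongly_connected and the can_reach mapping are assumptions made for this illustration; a real membership layer would derive this information from its own communication layer.

```python
def is_strongly_connected(members, can_reach):
    """Check that every member can communicate with every other member.

    members:   set of node names in the subcluster
    can_reach: dict mapping node -> set of nodes it can communicate with
    """
    return all(
        other in can_reach.get(node, set())
        for node in members
        for other in members
        if other != node
    )


full_mesh = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
}
print(is_strongly_connected({"a", "b", "c"}, full_mesh))  # True

# Node a can no longer reach c, although c can still reach a.
degraded = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"a", "b"},
}
print(is_strongly_connected({"a", "b", "c"}, degraded))  # False
```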

Consensus Subclusters
Consensus subclusters are subclusters in which, whenever every member views subcluster membership as stable (as defined above), each node has precisely the same view of subcluster membership as every other member of its subcluster.
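The consensus property can be expressed as a check over per-node views. This is a sketch only; the function name satisfies_consensus and the views and stable mappings are hypothetical. Note that the property places no constraint on moments when some member is in transition.

```python
def satisfies_consensus(members, views, stable):
    """Check the consensus property for one subcluster.

    members: iterable of node names in the subcluster
    views:   dict mapping node -> iterable of nodes it believes are members
    stable:  dict mapping node -> True if that node views membership as stable
    """
    members = list(members)
    # The property only constrains moments when every member is stable.
    if not all(stable[node] for node in members):
        return True
    # All members must hold exactly the same membership view.
    distinct_views = {frozenset(views[node]) for node in members}
    return len(distinct_views) <= 1


members = ["a", "b"]
all_stable = {"a": True, "b": True}

agree = {"a": {"a", "b"}, "b": {"a", "b"}}
print(satisfies_consensus(members, agree, all_stable))  # True

disagree = {"a": {"a", "b"}, "b": {"b"}}
print(satisfies_consensus(members, disagree, all_stable))  # False

# While b is still in transition, differing views do not violate the property.
print(satisfies_consensus(members, disagree, {"a": True, "b": False}))  # True
```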

Cluster Membership Quality of Service
High-availability membership algorithms typically offer probabilistic guarantees that they always create strongly connected consensus subclusters.

The lowest grade of membership algorithms only meet the letter of the law above, never select a primary subcluster, and make no guarantees regarding strong connectivity or consensus membership. Many traditional high-performance clustering applications tolerate such mechanisms, but most high-availability applications require stronger guarantees.

Such properties can be referred to collectively as cluster quality of service (CQOS) properties.

Fencing
We use the term fencing to refer to the act of separating a cluster node from the resources it manages without the cooperation of the node being fenced. Given that a cluster node which is subjected to fencing is typically thought to be errant, and the nature of the fault it has experienced is unknown, relying on a third-party (involuntary) fencing mechanism increases the probability that an errant node is no longer using its resources. Using fencing is considered more reliable than simply relying on an errant node to stop using resources on its own.

Relationship Between Quorum and Fencing
Normally, quorum and fencing are used in combination; that is, only the designated (primary) subcluster would fence nodes. However, in certain configurations, reliably determining a designated subcluster is difficult. In these cases, fencing can be used to keep cluster resources from being improperly used, in spite of an imperfect quorum method.

If reliable fencing is used without quorum (or with an imperfect quorum mechanism), then certain undesirable behaviors (such as mutual fencing) may occur, but fencing will still protect cluster resources from being improperly used in spite of any imperfections in the quorum method.
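The interaction between quorum and fencing, including mutual fencing, can be illustrated with a small sketch. The function planned_fencing is hypothetical, it assumes the simple majority quorum rule, and it only computes which nodes each subcluster would attempt to fence; it performs no actual fencing.

```python
def planned_fencing(cluster_nodes, partitions, require_quorum=True):
    """Return {fencing subcluster: nodes it would try to fence}.

    cluster_nodes:  set of all configured node names
    partitions:     list of sets of live nodes, one per subcluster
    require_quorum: if True, only a majority subcluster may fence
    """
    cluster_nodes = frozenset(cluster_nodes)
    plan = {}
    for part in partitions:
        part = frozenset(part)
        if require_quorum and not len(part) > len(cluster_nodes) // 2:
            continue  # this subcluster lacks quorum, so it must not fence
        targets = cluster_nodes - part
        if targets:
            plan[part] = targets
    return plan


two_nodes = {"n1", "n2"}
split = [{"n1"}, {"n2"}]

# With majority quorum, neither half of a split 2-node cluster may fence.
print(len(planned_fencing(two_nodes, split, require_quorum=True)))  # 0

# Without quorum, each half targets the other: mutual fencing.
print(len(planned_fencing(two_nodes, split, require_quorum=False)))  # 2
```

In the second case each node tries to fence the other. That outcome is undesirable, but whichever attempt succeeds still separates the losing node from the shared resources.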