Providing Open Source High-Availability Software for Linux and other OSes since 1999.



Last site update:
2008-10-14 19:10:21

Split-Brain

A split-brain condition is the result of a ClusterPartition in which each side believes the other is dead and then proceeds to take over resources as though the other side no longer owned any.

After this, BadThingsWillHappen, including the destruction of shared disk data.

This is the result of acting on incomplete information - neglecting DunnsLaw. That is, when a node is declared "dead", its status is, by definition, not known. Perhaps it is dead, perhaps it is merely incommunicado. The only thing that is known is that its status is not known.
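
The point about "dead" really meaning "unknown" can be made concrete with a small sketch. It is illustrative only, not Heartbeat code: the names NodeStatus and observed_status are hypothetical, and deadtime is assumed to be the silence threshold in seconds.

```python
from enum import Enum


class NodeStatus(Enum):
    ALIVE = "alive"      # a heartbeat arrived within deadtime
    UNKNOWN = "unknown"  # silent: perhaps dead, perhaps merely incommunicado
    # Deliberately no DEAD member: silence alone cannot prove death.


def observed_status(seconds_since_heartbeat, deadtime):
    """Classify a peer using only what the observer can actually know."""
    if seconds_since_heartbeat <= deadtime:
        return NodeStatus.ALIVE
    return NodeStatus.UNKNOWN
```

A node that treats UNKNOWN as if it meant "dead" and takes over resources on that basis is exactly the node that causes split-brain.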

The ultimate cure for this is to use Fencing and lock the other side out.

The problem with merely using quorum without fencing is that, in the worst case, the loss of quorum can take an unbounded amount of time to detect and react to.

Fencing does not require knowledge of the timing or behavior of the "errant" nodes, nor does it require the cooperation or sanity of errant nodes. In addition, fencing operations receive positive confirmation. Hence, fencing has a high degree of certainty.
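
As a rough sketch of why positive confirmation matters, the takeover logic might be ordered as below. This is an assumption-laden illustration rather than the Heartbeat implementation: take_over, fence, and start_resources are hypothetical names, and fence is assumed to return True only when the fencing device confirms that the peer has been locked out.

```python
def take_over(peer, fence, start_resources):
    """Take over the peer's resources only after fencing is confirmed.

    fence:           hypothetical callable; returns True only on positive
                     confirmation that `peer` is locked out.
    start_resources: hypothetical callable that starts the peer's
                     resources on the local node.
    """
    if not fence(peer):
        # No confirmation means the peer's state is still unknown,
        # so touching shared resources would risk split-brain.
        return False
    start_resources()
    return True
```

Note that nothing here depends on the errant node responding or behaving sanely; the decision rests entirely on the answer from the fencing operation.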

A good way of avoiding split-brain conditions in most cases, without having to resort to fencing, is to configure redundant and independent cluster communication paths, so that the loss of a single interface or path does not break communication between the nodes. That is, cluster communications should not have a single point of failure.
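
One way to check the "independent" part of that advice is to list the components each communication path relies on and look for anything common to all of them. The sketch below is hypothetical: the function name and the component labels (NICs, switches, serial ports) are made up for illustration.

```python
def single_points_of_failure(paths):
    """Return the components that every communication path depends on.

    paths: dict mapping a path name to the components it relies on.
    An empty result means no single component failure cuts all paths.
    """
    component_sets = [set(components) for components in paths.values()]
    if not component_sets:
        return set()
    return set.intersection(*component_sets)
```

For example, two Ethernet links that pass through the same switch are redundant but not independent, so the switch would be reported; pairing one Ethernet link with a serial cable that shares no hardware with it would yield an empty set.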

We highly recommend using both redundant communications and fencing.

See Also

Split-brain, quorum, fencing overview, ClusterConcepts, fencing, quorum, STONITH, SPOF, FAQ on tuning deadtime, deadtime directive, warntime directive