Note

This page describes the release 1 architecture of the Linux-HA project. See the BasicArchitecture[1] page for an overview of the release 2[2] architecture.

Technical Overview

The best-known component of the Linux-HA project is Heartbeat[3]. Each Heartbeat[3] instance sends heartbeat packets over the network (or over serial ports) to the other Heartbeat[3] instances as a keep-alive message.

Heartbeat[3] itself acts much like a cluster[4]-wide init daemon: it makes sure that each of the services it manages is running at all times, as though they had been spawned with an init(8) respawn directive.

When heartbeat packets are no longer received from a node[5], that node[5] is assumed to be dead, and any services (resources[6]) it was providing are failed over[7] to the other node[5]. Properly integrating STONITH[8] or a watchdog[9] timer ensures that this assumption of death is actually true.
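
The timeout-based death assumption described above can be illustrated with a minimal sketch. This is not Heartbeat's code: the function name find_dead_nodes, the node names, and the timestamps are all hypothetical, and the deadtime parameter simply stands for "how long a node may stay silent before it is assumed dead".

```python
from typing import Dict, List


def find_dead_nodes(last_heard: Dict[str, float], now: float, deadtime: float) -> List[str]:
    """Return the nodes whose most recent heartbeat is older than deadtime seconds."""
    return sorted(node for node, seen in last_heard.items() if now - seen > deadtime)


# Hypothetical two-node cluster: node1 was heard 2 seconds ago, node2 45 seconds ago.
last_heard = {"node1": 98.0, "node2": 55.0}

# With a 30-second deadtime, only node2 has been silent for too long.
dead = find_dead_nodes(last_heard, now=100.0, deadtime=30.0)
```

Note that silence alone cannot distinguish a dead node from a broken communication path, which is why the text pairs this assumption with STONITH[8] or a watchdog[9] timer.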

Heartbeat[3] can also monitor routers and switches as though they were cluster members using the ping[10] or ping_group[11] directives. Combined with the ipfail[12] program, Heartbeat[3] can also cause failovers when network connectivity is compromised.

Heartbeat[3] provides services (which it calls resources[13]) as configured in the haresources[14] file. The services listed in the haresources[14] file are started in left-to-right order and stopped in right-to-left order.
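
The ordering rule can be sketched in a few lines. The resource names and the start_resources and stop_resources helpers below are hypothetical and only illustrate the ordering; they do not reflect haresources[14] syntax or how Heartbeat invokes resource scripts.

```python
from typing import Callable, List


def start_resources(resources: List[str], start: Callable[[str], None]) -> None:
    """Start resources in the order they are listed (left to right)."""
    for name in resources:
        start(name)


def stop_resources(resources: List[str], stop: Callable[[str], None]) -> None:
    """Stop resources in the reverse of the listed order (right to left)."""
    for name in reversed(resources):
        stop(name)


# Hypothetical resource list: each entry depends on the ones to its left.
resources = ["ip_address", "filesystem", "web_server"]
log: List[str] = []

start_resources(resources, lambda name: log.append("start " + name))
stop_resources(resources, lambda name: log.append("stop " + name))
```

Stopping in reverse order means a service is always shut down before the resources it depends on, such as the filesystem holding its data.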

Communication

Heartbeat[3] can communicate over any combination of UDP broadcast[15], UDP multicast[16], UDP unicast[17], and serial[18] communication paths.

Heartbeat uses a multicast communications protocol to deal with lost and corrupted packets.
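
This page does not describe how the protocol handles lost and corrupted packets. As a generic illustration only, and not Heartbeat's wire format, the sketch below assumes two common techniques: a checksum to detect corruption and sequence numbers to detect loss. The packet layout and all function names are hypothetical.

```python
import zlib
from typing import List, Tuple

Packet = Tuple[int, bytes, int]  # (sequence number, payload, CRC32 of payload)


def make_packet(seq: int, payload: bytes) -> Packet:
    """Build a packet carrying a sequence number and a checksum of its payload."""
    return seq, payload, zlib.crc32(payload)


def is_intact(packet: Packet) -> bool:
    """Check that the payload still matches the checksum sent with it."""
    _seq, payload, checksum = packet
    return zlib.crc32(payload) == checksum


def missing_sequences(received: List[int]) -> List[int]:
    """List the sequence numbers absent between the lowest and highest seen."""
    if not received:
        return []
    seen = set(received)
    return [seq for seq in range(min(seen), max(seen) + 1) if seq not in seen]


good = make_packet(1, b"alive")
# Simulate corruption in transit: the payload changes but the checksum does not.
corrupted = (2, b"al1ve", zlib.crc32(b"alive"))

# A receiver that saw packets 1, 2, 4, 5 and 7 can tell which ones to ask for again.
gaps = missing_sequences([1, 2, 4, 5, 7])
```

In a scheme like this, a receiver would discard packets that fail the checksum and request retransmission of the sequence numbers reported as gaps.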

See Also

ClusterConcepts[19], FactSheet[20], ipfail[12], ha.cf[21], Heartbeat[3], haresources[14], HeartbeatResourceAgent[13], NewHeartbeatDesign[2], BasicArchitecture[1], TechnicalPapers[22]


References

[1]http://www.linux-ha.org/BasicArchitecture
[2]http://www.linux-ha.org/NewHeartbeatDesign
[3]http://www.linux-ha.org/HeartbeatProgram
[4]http://en.wikipedia.org/wiki/Computer_cluster
[5]http://www.linux-ha.org/ClusterNode
[6]http://www.linux-ha.org/resource
[7]http://en.wikipedia.org/wiki/Failover
[8]http://www.linux-ha.org/STONITH
[9]http://www.linux-ha.org/ha.cf/WatchdogDirective
[10]http://www.linux-ha.org/ha.cf/PingDirective
[11]http://www.linux-ha.org/ha.cf/PingGroupDirective
[12]http://www.linux-ha.org/ipfail
[13]http://www.linux-ha.org/HeartbeatResourceAgent
[14]http://www.linux-ha.org/haresources
[15]http://www.linux-ha.org/ha.cf/BcastDirective
[16]http://www.linux-ha.org/ha.cf/McastDirective
[17]http://www.linux-ha.org/ha.cf/UcastDirective
[18]http://www.linux-ha.org/ha.cf/SerialDirective
[19]http://www.linux-ha.org/ClusterConcepts
[20]http://www.linux-ha.org/FactSheet
[21]http://www.linux-ha.org/ha.cf
[22]http://www.linux-ha.org/TechnicalPapers


This information provided courtesy of the Linux-HA project at http://linux-ha.org/