Ha.cf

The ha.cf file
The ha.cf file is one of the more important files to understand when configuring Heartbeat. It lists the cluster nodes, the communications topology, and which features of the configuration are enabled. This page does not discuss the FineArtOfConfiguringaCluster, which is a separate topic worthy of significant thought. For Heartbeat novices, it is probably worthwhile to read the article on GettingStartedWithHeartbeat.

Global ha.cf options
It is important to note that certain options in the ha.cf file are global in nature, and that ordering of these global options is important in configuring the ha.cf file, since each directive is interpreted as it is encountered in ha.cf.

These global options are:
 * use_logd
 * logfacility - (deprecated)
 * logfile - (deprecated)
 * debugfile - (deprecated)
 * udpport
 * baud

It is recommended that these options be placed first in the ha.cf file. Placing the logging entries first is especially recommended.

The default values for each of these options can be found below.
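Because each directive is interpreted as it is encountered, a global option only affects the directives that follow it. The sketch below is a simplified model of that behaviour, not Heartbeat's actual parser: the function name ports_in_effect is hypothetical, and the default port is hard-coded to 694 rather than looked up in /etc/services.

```python
DEFAULT_UDPPORT = 694  # assumption: ignore the /etc/services lookup

def ports_in_effect(ha_cf_text):
    """Return (directive, arguments, port) for each bcast/ucast line, in file order."""
    port = DEFAULT_UDPPORT
    result = []
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        name, *args = line.split()
        if name == "udpport":
            port = int(args[0])  # only affects directives read after this line
        elif name in ("bcast", "ucast"):
            result.append((name, args, port))
    return result

sample = """
bcast eth0                 # read before udpport, so it keeps the default port
udpport 6940
ucast eth1 10.10.10.133    # read after udpport, so it uses 6940
"""
for entry in ports_in_effect(sample):
    print(entry)
```

In this model the bcast line keeps port 694 because udpport appears after it, which is why the global options are best placed first.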

A Minimum ha.cf file
A minimum ha.cf file contains one or more node directives, and one or more of the communication topology (bcast, mcast, ucast, or serial) directives.
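As a rough illustration of this rule, the following sketch checks whether a piece of ha.cf text contains at least one node directive and at least one communication directive. The helper names are hypothetical, the node names in the usage are made up, and the check deliberately ignores the autojoin case, where node directives may be omitted.

```python
COMM_DIRECTIVES = {"bcast", "mcast", "ucast", "serial"}

def directive_names(ha_cf_text):
    """Return the first word of every non-empty, non-comment line."""
    names = []
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            names.append(line.split()[0])
    return names

def looks_minimal(ha_cf_text):
    """True if the text has a node directive and a communication directive."""
    names = directive_names(ha_cf_text)
    has_node = "node" in names
    has_comm = any(name in COMM_DIRECTIVES for name in names)
    return has_node and has_comm

print(looks_minimal("node alpha beta\nbcast eth0\n"))
print(looks_minimal("node alpha beta\n"))
```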

apiauth - API authorization directive
The apiauth directive specifies which users and/or groups are allowed to connect to a specific API group name. The syntax is simple:

apiauth apigroupname [uid=uid1,uid2 ...] [gid=gid1,gid2 ...]

You can specify a uid list, a gid list, or both, but you must specify at least one of them. If you include both a uid list and a gid list, then a process is authorized to connect to that API group if it is in either the uid list or the gid list.

The API group name default has special meaning. If it is specified, it will be used for authorizing clients without any API group name, and all client groups not identified by any other apiauth directive.
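The matching rules described above can be sketched as follows. This is only an illustration of the logic, not Heartbeat code: the dictionary layout and function names are hypothetical, and denying access when neither a matching rule nor a default rule exists is an assumption, since the text does not cover that case.

```python
def rule_for(rules, api_group):
    """Pick the rule for an API group, falling back to the 'default' group."""
    if api_group in rules:
        return rules[api_group]
    return rules.get("default")

def is_authorized(rules, api_group, uid, gids):
    """A client passes if its uid is in the uid list OR any of its gids is in the gid list."""
    rule = rule_for(rules, api_group)
    if rule is None:
        return False  # assumption: no rule and no default means no access
    uid_list = rule.get("uid", [])
    gid_list = rule.get("gid", [])
    return uid in uid_list or any(g in gid_list for g in gids)

# Equivalent of:  apiauth ipfail uid=hacluster
#                 apiauth default gid=haclient
rules = {
    "ipfail": {"uid": ["hacluster"]},
    "default": {"gid": ["haclient"]},
}
print(is_authorized(rules, "ipfail", "hacluster", []))
print(is_authorized(rules, "someclient", "nobody", ["haclient"]))
```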

Unless you specify otherwise in the ha.cf file, certain services are given the following default authorizations (service - default apiauth):

 * ipfail - uid=hacluster
 * ccm - gid=haclient
 * ping - gid=haclient
 * cl_status - gid=haclient
 * lha-snmpagent - uid=root
 * crm - uid=hacluster

auto_failback directive - set failback policy
The auto_failback option determines whether a resource will automatically fail back to its "primary" node, or remain on whatever node is serving it until that node fails, or an administrator intervenes.

The possible values for auto_failback are:
 * on - enable automatic failbacks
 * off - disable automatic failbacks
 * legacy - enable automatic failbacks in systems where all nodes in the cluster do not yet support the auto_failback option.

Both auto_failback on and auto_failback off are backwards compatible with the old "nice_failback on" setting.

See the FAQ document for information on how to convert from "legacy" to "on" without a flash cut (i.e., using a RollingUpgrade process).

The default value for auto_failback is "legacy", which will issue a warning at startup. So, make sure you put an auto_failback directive in your ha.cf file (note: auto_failback can be any boolean value or legacy). Typically, you want to set auto_failback on for an ActiveActive cluster, and commonly to off for an ActivePassive cluster.

NOTE: auto_failback does not have any effect on a Release 2 CRM-style cluster (one configured with crm on). For CRM-style clusters, this has been replaced with the default_resource_stickiness attribute in the CIB.

autojoin - enables automatic node joining
The autojoin directive enables nodes to join automatically just by communicating with the cluster, hence not requiring node directives in the ha.cf file. Since our communication is normally strongly authenticated, only nodes which know the cluster key can join (automatically or otherwise).

The general syntax of the autojoin directive is:

autojoin (none|other|any)

All legal autojoin directives are shown below:

autojoin none
autojoin other
autojoin any

The values you can give for the autojoin directive have the following meanings:
 * none: disables automatic joining.
 * other: allows nodes other than ourselves that are not listed in ha.cf to join automatically. In other words, our node has to be listed in ha.cf, but other nodes do not.
 * any: allows any node to join automatically without being listed in ha.cf, even the current node.

Note that the set of nodes currently considered part of the cluster is kept in the hostcache file. With autojoin enabled, the node directive is no longer authoritative - the hostcache file is.

baud - set serial communication speed
The baud directive is used to set the speed for serial communications. Any of the following speeds can be specified, provided they are supported by your operating system: 9600, 19200, 38400, 57600, 115200, 230400, 460800. The default speed is 19200. A sample baud directive is shown below:

baud 38400

bcast - configure broadcast communication path
The bcast directive configures which interfaces Heartbeat sends UDP broadcast traffic on. More than one interface can be specified on the line. If the udpport directive is specified before the bcast directive, it sets the port used for these broadcast communications; otherwise the default port is used. A couple of sample bcast lines are shown below.

bcast eth0 eth1 # on Linux systems
bcast le0       # for Solaris systems

On CRM-enabled clusters, the bcast directive does not work on FreeBSD and OpenBSD because of the fragmentation issue described in W. Richard Stevens - Unix Network Programming - Vol 1 - 3rd Edition: The Sockets Networking API, section 20.4, "dg_cli Function Using Broadcasting":

... IP Fragmentation and Broadcasts: Berkeley-derived kernels do not allow a broadcast datagram to be fragmented. If the size of an IP datagram that is being sent to a broadcast address exceeds the outgoing interface MTU, EMSGSIZE is returned (pp. 233-234 of TCPv2). This is a policy decision that has existed since 4.2BSD. There is nothing that prevents a kernel from fragmenting a broadcast datagram, but the feeling is that broadcasting puts enough load on the network as it is, so there is no need to multiply this load by the number of fragments. .... AIX, FreeBSD, and MacOS implement this limitation. Linux, Solaris, and HP-UX fragment datagrams sent to a broadcast address.

The problem arises because CRM clusters try to send large (>MTU size) packets over the cluster communication media.

compression - set compression method
The compression directive sets which compression method will be used when a message is big and compression is needed.

It can be either zlib or bz2, depending on which of the corresponding libraries are installed on the system. You can check /usr/lib/heartbeat/plugins/HBcompress to see which compression modules are available.

If this directive is not set, there will be no compression.

compression (zlib|bz2)

compression_threshold - set compression threshold for a message
The compression_threshold directive sets the threshold to compress a message, e.g. if the threshold is 1, then any message with size greater than 1 KB will be compressed. The default is 2 (KB). This directive only makes sense if you have set the compression directive.

compression_threshold 2

conn_logd_time - the directive to set interval to reconnect to the logging daemon
The conn_logd_time directive specifies how long Heartbeat waits before trying to reconnect to the logging daemon after the connection between Heartbeat and the logging daemon is broken. The value is given in the Time format described at the end of this page. For example:

conn_logd_time 60 # 60 seconds

The default is 60 seconds.

Note: Heartbeat will not automatically reconnect to the logging daemon. It only tries to reconnect when it needs to log a message and conn_logd_time has passed since the last attempt to connect.

coredumps - enable capturing core dumps
The coredumps directive tells Heartbeat to take the steps needed to make core dumps possible, should it need to dump core.

The general syntax of a coredumps directive is:

coredumps boolean

The most common coredumps directives are shown below:

coredumps true
coredumps false

Heartbeat core dumps should show up in one of these two locations, depending on the release of Heartbeat you're using:
 * /etc/ha.d
 * /var/lib/heartbeat/cores/*

crm - enabling and disabling the Pacemaker cluster manager
The crm directive specifies whether Heartbeat should run the 1.x-style cluster manager or the 2.x-style cluster manager that supports more than 2 nodes.

The syntax is simple:

crm off|on|respawn

When set to on or respawn, the directive automatically implies:

apiauth stonithd       uid=root
apiauth crmd           uid=hacluster
apiauth cib            uid=hacluster

respawn hacluster      ccm
respawn hacluster      cib
respawn root           stonithd
respawn root           lrmd
respawn hacluster      crmd

deadping - set failure (death) detection time for ping nodes
The deadping directive is used to specify how quickly Heartbeat should decide that a ping node in a cluster is dead. Setting this value too low will cause the system to falsely declare the ping node dead. Setting it too high will delay detection of communication failure.

The deadping value is given in the Time format described at the end of this page. Two sample deadping specifications are shown below.

deadping 20    # 20 seconds
deadping 750ms # 750 milliseconds

deadtime - set failure (death) detection time
The deadtime directive is used to specify how quickly Heartbeat should decide that a node in a cluster is dead. Setting this value too low will cause the system to falsely declare itself dead. Setting it too high will delay takeover after the failure of a node in the cluster. Please read the FAQ document for more information on how to configure (tune) this important parameter.

The deadtime value is given in the Time format described at the end of this page. Two sample deadtime specifications are shown below.

deadtime 10    # 10 seconds
deadtime 250ms # 250 milliseconds (1/4 second)

debug - set debug level
The debug directive is used to set the level of debugging in effect in the system. Production systems should have their debug level set to zero (i.e., turned off). This is the default. Legal values of the debug option are between 0-255. The most useful values are between 0 (off) and 3. Setting the debug level greater than 1 can have an adverse effect on the size of your log files, and on the system's ability to send heartbeats at rapid rates, thus affecting the cluster reliability.

The debug level of the system can also be specified on the command line using the -d option. Additionally, the debug level of the system can be dynamically changed by sending the heartbeat process SIGUSR1 and SIGUSR2 signals. SIGUSR1 raises the debug level, and SIGUSR2 lowers it. A sample debug directive is shown below.

debug 0

debugfile - configures file for debug messages
The debugfile directive is deprecated for version 2.x configurations. Please enable the use_logd directive instead.

The debugfile directive specifies the file Heartbeat will write debug messages to.

A sample debugfile directive is shown below:

debugfile /var/log/ha-debug

hbaping directive
Hbaping directives are given to declare fiber channel devices as PingNodes to Heartbeat.

The syntax of the hbaping directive is simple:

hbaping fc-card-name

The fc-card-name is the name obtained from the hbaapitest program that is part of the hbaapi package mentioned below. Running hbaapitest will produce verbose output. One of the first lines is similar to:

Adapter number 0 is named: qlogic-qla2200-0

Here fc-card-name is qlogic-qla2200-0.

This directive is not normally enabled in distributed versions of the Linux-HA software. To enable this directive, follow these steps:
 * Obtain the source to the HBAAPI library from http://hbaapi.sourceforge.net.
 * Compile it. (Unfortunately, the Makefile included in the tarball does not work on Linux; a Makefile for Linux has to be obtained separately.)
 * Copy the libHBAAPI.so file it produced into /usr/lib.
 * Copy the hbaapi.h file from the package to /usr/include.
 * Obtain and install the vendor-specific HBAAPI plugin for your HBA (Host Bus Adapter) from your HBA vendor.
 * Configure, compile and install the Linux-HA (Heartbeat) package.

As an alternative, install Heartbeat from RPM, configure and compile Heartbeat from same-version source, and manually copy only the hbaping.so file to /usr/lib/heartbeat/plugins/HBcomm.

hbgenmethod - specifies method for creating Heartbeat communications generation number
The hbgenmethod directive specifies how Heartbeat should compute its current generation number for communications. This is a specialized and obscure directive, used mainly in firewalls which have no local disk, and other devices which do not have a method of storing data persistently across reboots. It defaults to storing the Heartbeat generations in a file. Generation numbers are used by Heartbeat for replay attack protection.

All legal hbgenmethod directives are shown below:

hbgenmethod time
hbgenmethod file # this is the default.

Caveats
If one specifies the time method, there are certain cases where trouble can arise. If a machine restarts Heartbeat and its local time of day clock is less than or equal to the value of the time of day clock when Heartbeat last started, then that node will be unable to join the cluster.
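The caveat amounts to requiring a strictly increasing generation number. A minimal sketch of that condition, assuming the generation is simply the clock reading (in seconds) taken when Heartbeat starts, and using a hypothetical function name:

```python
def can_rejoin(previous_start_time, current_start_time):
    """With 'hbgenmethod time', the new generation must be strictly greater
    than the one used the last time Heartbeat started on this node."""
    return current_start_time > previous_start_time

print(can_rejoin(1_000_000, 1_000_060))  # clock moved forward between starts
print(can_rejoin(1_000_000, 1_000_000))  # clock unchanged
print(can_rejoin(1_000_000, 999_000))    # clock was set back
```

A node whose clock was set back between restarts fails this check, which matches the situation described above.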

hopfudge - sets serial port forwarding maximum count
The hopfudge directive controls how many nodes a packet can be forwarded through before it is thrown away in the worst case. However, the hopfudge value is added to the number of nodes in the system. It defaults to 1.

A sample hopfudge directive is shown below:

hopfudge 1

initdead - set initial deadtime detection interval
The initdead parameter sets the time that it takes to declare a cluster node dead when Heartbeat is first started. This parameter generally needs to be set to a higher value, because experience suggests that operating systems sometimes take many seconds to bring up their communication systems before they operate correctly. initdead is given in the Time format described at the end of this page. A sample initdead value is shown below:

initdead 30

In some switched network environments, switches engage in a spanning tree algorithm whenever a NIC connects to a port. This can take a long time to complete, and it is only necessary if the NIC being connected is another switch. If this is the case, you may be able to configure certain NICs as not being switches and shrink the connection delay significantly. If not, you'll need to raise initdead to make this problem go away.

If initdead is set too low, you'll see one node declare the other dead and, for non-CRM clusters, you'll see "both nodes own XXX resources" in the logs.

keepalive - set heartbeat keep-alive interval
The keepalive directive sets the interval between heartbeat packets. It is given in the Time format described at the end of this page.

Two sample keepalive directives are shown below:

keepalive 100ms

keepalive 2 # 2 seconds

logfacility - configures syslog logging facility
The logfacility directive tells Heartbeat which syslog logging facility it should use for logging its messages.

The possible values for logfacility vary by operating system, but some of the most common ones are {auth, authpriv, daemon, syslog, user, local0, local1, local2, local3, local4, local5, local6, local7}.

A sample logfacility directive is shown below:

logfacility local7

If you want to disable logging to syslog:

logfacility none

logfile - configures logging file
The logfile directive is deprecated for version 2.x configurations. Please enable the use_logd directive instead.

The logfile directive configures a log file. All non-debug messages from Heartbeat will go into this file.

A sample logfile directive is shown below:

logfile /var/log/ha-log

Caveats
Configuring a log file (instead of using syslog logging) can cause Heartbeat to block for several seconds under heavy load. This can affect the deadtime required for the system.

mcast - configures multicast communication path
The mcast directive is used to configure a multicast communication path.

The syntax of an mcast directive is:

mcast dev mcast-group udp-port ttl 0

 * dev - IP device to send/rcv heartbeats on
 * mcast-group - multicast group to join (class D multicast address 224.0.0.0 - 239.255.255.255). For most Heartbeat uses, the first byte should be 239.
 * udp-port - UDP port to sendto/rcvfrom (set this to the same value as udpport)
 * ttl - the ttl value for outbound heartbeats. This affects how far the multicast packet will propagate. (0-255). Set to 1 for the current subnet. Must be greater than zero.

A sample mcast directive is shown below:

mcast eth0 239.0.0.1 694 1 0
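The constraints on the mcast parameters can be checked mechanically. The following sketch uses Python's standard ipaddress module; the function name check_mcast and the wording of the messages are hypothetical, and it only covers the rules stated above (class D group, preferably starting with 239, ttl greater than zero).

```python
import ipaddress

def check_mcast(fields):
    """fields: the five whitespace-separated values that follow 'mcast'.
    Returns a list of notes; an empty list means nothing was flagged."""
    dev, group, port, ttl, loop = fields  # dev, port and loop are not checked here
    notes = []
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        notes.append("group is outside 224.0.0.0 - 239.255.255.255")
    elif addr.packed[0] != 239:
        notes.append("first byte is not 239")
    if not 1 <= int(ttl) <= 255:
        notes.append("ttl must be between 1 and 255")
    return notes

print(check_mcast("eth0 239.0.0.1 694 1 0".split()))
print(check_mcast("eth0 10.0.0.1 694 0 0".split()))
```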

Bugs
This directive has a few more parameters than it should.

mcast6 - configures IPv6 multicast communication path
The mcast6 directive is used to configure an IPv6 multicast communication path.

The syntax of an mcast6 directive is:

mcast6 [device] [mcast6 group] [port] [mcast6 hops] [mcast6 loop]

 * device - network interface to send/rcv heartbeats on
 * mcast6 group - multicast group to join.
 * port - UDP port to sendto/rcvfrom (set this to the same value as udpport)
 * hops - the IPv6 hop count for outbound heartbeats. This affects how far the multicast packet will propagate. (0-255). Set to 1 for the current subnet. Must be greater than zero.
 * loop - should outgoing messages be looped back to the sender as well? Development only. Always set to 0.

A sample mcast6 directive is shown below, using link-local scope with some "transient" group:

mcast6 eth0 ff12::1:2:3:4 694 1 0

msgfmt - the directive to set the message format in wire
The msgfmt directive specifies the format Heartbeat uses on the wire.

msgfmt (classic|netstring)

 * classic - Heartbeat will convert a message into a string and transmit it on the wire. Binary values are converted with a base64 library.
 * netstring - Binary messages will be transmitted directly. This is more efficient since it avoids conversion between string and binary values.

The default is classic. If not sure, choose classic.

node directive
The node directive tells what machines are in the cluster. The syntax of the node directive is simple:

node nodename1 nodename2 ...

Node names in the directive must (normally) match the "uname -n" of that machine.

You can declare multiple node names in one directive. You can also use the directive multiple times. Normally every node in the cluster must be listed in the ha.cf file, including the current node, unless the autojoin directive is enabled.

Note that starting with 2.0.4, the node directive is not completely authoritative with regard to the nodes Heartbeat will communicate with. If a node has ever been added in the past, it will tend to remain in the hostcache file until it's manually removed. See also: http://www.osdl.org/developer_bugzilla/show_bug.cgi?id=1226

ping directive
Ping directives are given to declare PingNodes to Heartbeat.

The syntax of the ping directive is simple:

ping ip-address ...

Each IP address listed in a ping directive is considered to be independent. That is, connectivity to each node is considered to be equally important.

In order to declare that a group of nodes are equally qualified for a particular function, and that the presence of any of them indicates successful communication, use the ping_group directive.

ping_group directive
Ping group directives are given in the ha.cf file to declare a group PingNode to Heartbeat.

The syntax of the ping_group directive is simple:

ping_group group-name ip-address ...

Each IP address listed in a ping_group directive is considered to be related, and connectivity to any one node is considered to be connectivity to the group.

A ping group is considered by Heartbeat to be a single cluster node (group-name). The ability to communicate with any of the group members means that the group-name member is reachable. This is useful when (for example) two different routers may be used to contact the internet, depending on which is up, or when finding an appropriate reliable single ping node is difficult.
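The difference between ping and ping_group can be illustrated with a small model. The function names and addresses below are made up for the example; the set of reachable addresses stands in for whatever Heartbeat learns from its ping replies.

```python
def ping_nodes_up(reachable, ping_ips):
    """ping: every address is its own ping node with its own status."""
    return {ip: ip in reachable for ip in ping_ips}

def ping_group_up(reachable, group_ips):
    """ping_group: the group counts as reachable if any member answers."""
    return any(ip in reachable for ip in group_ips)

routers = ["10.10.10.254", "10.10.10.253"]  # hypothetical router addresses
reachable = {"10.10.10.253"}                # only the second router answers

print(ping_nodes_up(reachable, routers))
print(ping_group_up(reachable, routers))
```

With one of two routers down, the ping model reports one dead ping node, while the ping_group model still reports the group as reachable.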

realtime - enable realtime features in Heartbeat
The realtime directive specifies whether or not Heartbeat should try and take advantage of the operating system's realtime scheduling features. When enabled, Heartbeat will lock itself into memory, and raise its priority to a realtime priority (as set by the rtprio directive). This feature is mainly used for debugging various kinds of loops which might otherwise cripple the system and impair debugging them. The realtime flag is a boolean value, whose default value is true. A sample realtime directive is shown below.

realtime on

respawn - specifies programs for Heartbeat to run at startup
The respawn directive is used to specify a program to run and monitor while it runs. If this program exits with anything other than exit code 100, it will be automatically restarted. The first parameter is the user id to run the program under, and the second parameter is the program to run. Subsequent parameters will be given to the program as arguments.
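The restart rule can be sketched as a small supervision loop. This is an illustration only, not how Heartbeat implements respawn: it does not switch user ids, and the max_starts cap is a hypothetical addition so that the example terminates.

```python
import subprocess
import sys

NO_RESPAWN_EXIT = 100  # the only exit code that stops restarts

def supervise(argv, max_starts=5):
    """Run argv repeatedly until it exits with code 100 or the cap is hit.
    Returns how many times the program was started."""
    starts = 0
    while starts < max_starts:
        starts += 1
        code = subprocess.call(argv)
        if code == NO_RESPAWN_EXIT:
            break  # the program asked not to be restarted
    return starts

# A child that exits with 100 is started once; any other code triggers a restart.
once = supervise([sys.executable, "-c", "import sys; sys.exit(100)"])
capped = supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_starts=3)
print(once, capped)
```

Note that under this rule a clean exit with code 0 also leads to a restart.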

At the current time, the program most people will be interested in running this way is ipfail.

A sample respawn directive is shown below:

respawn hacluster /usr/lib/heartbeat/ipfail

SECURITY NOTE: It is a bad security practice to run programs from Heartbeat as root unless they are prepared to change their user ids once they're started. None of the programs which come with Heartbeat to be used with respawn should be run as root. Do not run them as root. If you ignore this advice, just remember that BadThingsMayHappen, and don't blame us.

rtprio - specifies Heartbeat's realtime priority
The rtprio directive is used to specify the priority at which Heartbeat runs. It does not need to be specified unless other realtime priority programs are also running on the system. The minimum and maximum values for this field can be determined from the sched_get_priority_min(SCHED_FIFO) and sched_get_priority_max(SCHED_FIFO) calls respectively. The default value for rtprio is halfway between the minimum and maximum values.
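On systems where Python exposes the same scheduler calls, the limits can be inspected as shown below. The midpoint calculation is only an approximation of the "halfway" default described above; the exact rounding Heartbeat uses is not stated here, so treat it as an assumption.

```python
import os

# Query the priority range for the SCHED_FIFO policy on this system.
lowest = os.sched_get_priority_min(os.SCHED_FIFO)
highest = os.sched_get_priority_max(os.SCHED_FIFO)

# Assumption: integer midpoint as a stand-in for "halfway between".
midpoint = (lowest + highest) // 2

print("min:", lowest, "max:", highest, "midpoint:", midpoint)
```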

A sample rtprio directive is shown below:

rtprio 5

serial - configure serial communication path
The serial directive tells Heartbeat to use the specified serial port(s) for its communication. The parameters to the serial directive are the names of tty devices suitable for opening without waiting for carrier first. On Linux, those ports are typically named /dev/ttySX.

A few sample serial directives are shown below:

serial /dev/ttyS0 /dev/ttyS1    # Linux
serial /dev/cuaa0               # FreeBSD
serial /dev/cua/a               # Solaris

If the baud directive is specified before the serial directive, it sets the baud rate for the port(s); otherwise the default baud rate is used.

stonith directive
The stonith directive is used to configure STONITH for Heartbeat (release 1 only). It assumes you're going to put a STONITH configuration file on each machine in the cluster to configure the (single) STONITH device that this node will use to reset the other node in the cluster.

The syntax of the stonith directive is:

stonith {stonith-device-type} {stonith-configuration-file}

where {stonith-device-type} is the type of (supported) STONITH device being configured, and {stonith-configuration-file} is the name of the file in which you put the STONITH configuration information for this particular STONITH device.

To get a list of valid {stonith-device-type}s, issue this command: stonith -L

To get a list of how to configure each type of STONITH device, issue the following command: stonith -h

NOTE: This command is mutually exclusive with the stonith_host directive.

stonith_host directive
The stonith_host directive is used to configure STONITH for Heartbeat (release 1 only). With this directive, you put all the STONITH configuration information for the devices in your cluster in the ha.cf file, rather than in a separate file.

You can configure multiple stonith devices using this directive. The format of the line is:

stonith_host {hostfrom} {stonith_type} {params...}

 * {hostfrom} is the machine the stonith device is attached to or * to mean it is accessible from any host.
 * {stonith_type} is the type of stonith device
 * {params...} are the configuration parameters this STONITH device requires.

Only one stonith_host directive can have a * for {hostfrom}.

Caveats
If you put your stonith device access information in ha.cf, and you make this file publicly readable, you're inviting a denial of service attack.

To get a list of valid {stonith_type}s, issue this command: stonith -L

To get a list of {params...} for each type of STONITH device, issue the following command: stonith -h

NOTE: This command is mutually exclusive with the stonith directive.

traditional_compression - controls compression mode
The general syntax of a traditional_compression directive is:

traditional_compression boolean

An example ha.cf snippet is shown below, together with recommended related settings suitable for use with pacemaker:

# pacemaker passes down xml fields,
# which compress best with bz2
compression bz2
# pacemaker may choose to compress some message fields itself,
# but recent pacemaker will only do so if the plain text exceeds 128kB,
# which is too much for UDP.
# If that happens to be a value of a message field NOT marked for
# "selective compression", instead of risking "EMSGSIZE",
# compress whole packets in the heartbeat core ...
traditional_compression on
# with a reasonable threshold
compression_threshold 40

With traditional_compression off, but compression bz2, only selected message fields will be compressed by the heartbeat core, if so requested by the user of the heartbeat communication infrastructure.

With traditional_compression on, whole packets will be compressed, regardless of field type, including header information. This is less efficient, because all packets have to be uncompressed, even on nodes that could have ignored the message had they been able to see from the header information that they are not the intended recipient.

Non-pacemaker users of heartbeat should probably set this to off.

ucast - configures unicast Heartbeat communication
The ucast directive configures Heartbeat to communicate over a UDP unicast communications link. The udpport directive is used to configure which port is used for these unicast communications if the udpport directive is specified before the ucast directive, otherwise the default port will be used.

The general syntax of a ucast directive is:

ucast dev peer-ip-address

where dev is the device to use when talking to the peer, and peer-ip-address is the IP address we will send packets to.

Although this is a unicast communication link, the protocol carried by the UDP packets sent over this link is a multicast protocol.

A sample ucast directive is shown below:

ucast eth0 10.10.10.133

This directive will cause us to send packets to 10.10.10.133 over interface eth0.

Note that ucast directives which go to the local machine are effectively ignored. This allows the ha.cf directives on all machines to be identical.

udpport - specifies port for UDP communication
The udpport directive specifies which port Heartbeat will use for its UDP intra-cluster communication. There are two common reasons for overriding this value: there are multiple bcast clusters on the same subnet, or this port is already in use in accordance with some locally-established policy.

The default value for this parameter is the port ha-cluster in /etc/services (if present), or 694 if port ha-cluster is not in /etc/services. 694 is the IANA registered port number for Heartbeat (a.k.a. ha-cluster).

A sample udpport directive is shown below:

udpport 694

You have to configure udpport in ha.cf before you configure ucast or bcast; otherwise Heartbeat will use the default port (694) for them.

NOTE: The GUI doesn't use UDP, and isn't intracluster communications, so GUI communication is not affected by this directive.

BUGS: Due to a specification error in the syntax of the mcast directive, this directive does not apply to mcast communications.

use_logd - the directive to determine whether heartbeats use logging daemon or not
The use_logd directive specifies whether or not Heartbeat logs its messages through the logging daemon. The syntax is simple:

use_logd boolean

(Note: use_logd can be any boolean value.)

The detailed policy is:
 * If there is any entry for debugfile/logfile/logfacility in ha.cf:
   * if use_logd is not set, the logging daemon will not be used
   * if use_logd is set to on, the logging daemon will be used
   * if use_logd is set to off, the logging daemon will not be used
 * If there is no entry for debugfile/logfile/logfacility in ha.cf:
   * if use_logd is not set, the logging daemon will be used
   * if use_logd is set to on, the logging daemon will be used
   * if use_logd is set to off, this is a config error, i.e. you cannot turn off all logging options

If the logging daemon is used, all log messages will be sent through IPC to the logging daemon, which then writes them into log files. In case the logging daemon dies (for whatever reason), a warning message will be logged and all messages will be written to log files directly.
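The six cases of this policy reduce to a short decision function. The sketch below is only a restatement of the policy in code; the function and parameter names are hypothetical, and None is used to represent a use_logd directive that is not set.

```python
def logging_daemon_used(has_log_entries, use_logd=None):
    """has_log_entries: True if ha.cf has any debugfile/logfile/logfacility entry.
    use_logd: True, False, or None when the directive is absent.
    Returns True if the logging daemon will be used."""
    if use_logd is None:
        # Not set: the daemon is used only when no log targets are configured.
        return not has_log_entries
    if use_logd is False and not has_log_entries:
        # Nothing would be logged anywhere, which is a configuration error.
        raise ValueError("config error: all logging options are turned off")
    return use_logd

print(logging_daemon_used(True))         # entries present, use_logd not set
print(logging_daemon_used(False))        # no entries, use_logd not set
print(logging_daemon_used(True, False))  # entries present, use_logd off
```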

If the logging daemon is used, logfile/debugfile/logfacility in this file are not meaningful any longer. You should check the config file for logging daemon (the default is /etc/logd.cf).

If use_logd is not used, all log messages will be written to log files directly.

The logging daemon is started/stopped in heartbeat script.

Setting use_logd to "yes" is recommended.

uuidfrom - selects how the local UUID is generated
In the normal case, heartbeat generates a UUID for each node in the system as a way of uniquely identifying a node - even if it should change nodenames. This UUID is typically stored in the file /var/lib/heartbeat/hb_uuid.

For certain kinds of installations (those booting from CDs or other read-only media), it is impossible for heartbeat to save a generated UUID to disk as it normally does. In these cases, one can use the uuidfrom directive to instruct heartbeat to use the nodename as though it were a UUID, by specifying uuidfrom nodename.

All possible legal uuidfrom directives are shown below.

uuidfrom file
uuidfrom nodename

warntime - set late heartbeat warning time
The warntime directive is used to specify how quickly Heartbeat should issue a "late heartbeat" warning.

The warntime value is given in the Time format described at the end of this page. A sample warntime specification is shown below.

warntime 10   # 10 seconds

The warntime directive is important for tuning deadtime.

watchdog - configure watchdog device
The watchdog directive configures Heartbeat to use a watchdog device. In some circumstances, a watchdog device can be used in place of a STONITH device. In any case, it is a reasonable thing to configure if you don't have a STONITH device, or if you wish, in addition to your STONITH device.

It is the purpose of a watchdog device to shut the machine down if Heartbeat does not hear its own heartbeats as often as it thinks it should. This keeps things like scheduler bugs from becoming split-brain configurations.

The general syntax of a watchdog directive is: watchdog watchdog-device-name

A sample watchdog directive is shown below: watchdog /dev/watchdog

The most common watchdog device currently used with general Linux systems is the softdog device. The softdog device is a software-based watchdog device and is usually referred to as /dev/watchdog - although like most UNIX devices, this is a convention not a rule.

Caveats
Heartbeat tries to set the watchdog device to reboot the system at the next second after it would declare itself dead.

It also tries to ensure that if it is shut down gracefully, that it will keep the system from rebooting when it exits. However, this behavior is out of its hands. It depends on the watchdog device driver. For the softdog driver see the softdog page for details on how you can make this work the way you want it to.

Defaults
The ha.cf directives with default values are shown below - along with a brief description. This was produced by heartbeat -DW # 2.0.3

Boolean
All of the following values are equivalent specifications for a true value:
 * true
 * yes
 * on
 * y
 * 1

All of the following values are equivalent specifications for a false value:
 * false
 * no
 * off
 * n
 * 0
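A parser for these boolean spellings might look like the sketch below. The function name is hypothetical, and treating the values as case-insensitive is an assumption that the lists above do not confirm.

```python
TRUE_VALUES = {"true", "yes", "on", "y", "1"}
FALSE_VALUES = {"false", "no", "off", "n", "0"}

def parse_boolean(text):
    """Map any of the listed spellings to True or False."""
    value = text.strip().lower()  # assumption: case does not matter
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {text!r}")

print(parse_boolean("on"), parse_boolean("n"))
```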

Time
When it is necessary to specify time intervals to Heartbeat, times can be specified as a floating point number followed by an optional units specifier. The units specifiers allowed are:
 * ms - milliseconds
 * us - microseconds
 * usec - microseconds

If the units specifier is omitted, seconds are assumed. The following are examples of legal Heartbeat time interval specifications:

1
100ms
100000us
.001
1500.1ms
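The time format can be parsed with a short regular expression. This is a sketch of the rules as described here, not Heartbeat's own parser: the function name is hypothetical, and the accepted number syntax (digits with an optional decimal point, no exponent or sign) is an assumption.

```python
import re

# Seconds per unit; an omitted unit means seconds.
UNIT_SECONDS = {"": 1.0, "ms": 1e-3, "us": 1e-6, "usec": 1e-6}

# A plain decimal number followed by an optional unit.
_TIME_RE = re.compile(r"(\d+\.?\d*|\.\d+)(usec|us|ms)?")

def parse_time(text):
    """Convert a Heartbeat-style time specification to seconds (float)."""
    match = _TIME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid time specification: {text!r}")
    number, unit = match.groups()
    return float(number) * UNIT_SECONDS[unit or ""]

for spec in ["1", "100ms", "100000us", ".001", "1500.1ms"]:
    print(spec, "->", parse_time(spec), "seconds")
```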