Heartbeat Resource Agents

From Linux-HA

The legacy Heartbeat resource agents (scripts) are essentially LSB init scripts with a slightly unusual status operation.

The following is true for all resource agents (even init scripts) for haresources mode, and for class=heartbeat primitives in Pacemaker. But when using Pacemaker, you are usually better off using the corresponding OCF Resource Agents, or, if none is available, real LSB Resource Agents instead.

The only operations on the resource scripts which the cluster performs are:

  • start
  • stop
  • status

These operations are as follows:

start operation

Activate the given resource.

According to the LSB, it is never an error to start an already active resource. Exit with 0 on success, nonzero on failure. The cluster will only start a resource if it wants it to be running on the current machine and status shows that it is not already running. The cluster will never start the same resource on different nodes of the cluster at the same time.
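As a rough illustration of the start semantics described above, here is a minimal sketch of an idempotent start function. The resource is hypothetical: it counts as "active" when a marker file exists. The names demo_start and STATE_FILE are assumptions made for this example and are not part of Heartbeat.

```shell
#!/bin/sh
# Hypothetical resource: "active" means this marker file exists.
STATE_FILE="${STATE_FILE:-/tmp/demo-resource.active}"

demo_start() {
    # Starting an already active resource is not an error: report success.
    if [ -e "$STATE_FILE" ]; then
        return 0
    fi
    # Activate the resource; a nonzero return signals failure to the caller.
    touch "$STATE_FILE" || return 1
    return 0
}
```

A real agent would replace the marker file with whatever actually activates the resource, but the exit code convention stays the same.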

stop operation

Deactivate the given resource.

Performed when we want to make sure a resource is not running. Although there are occasions when we check to see if a resource is running before stopping it, during shutdown, we will stop all resources whether or not we think they're running.

According to the LSB, stopping a resource which is already stopped is always permissible. The cluster will DEFINITELY stop resources that it does not know to be running. Stop failures can result in the machine being rebooted to clear the error. Note that some init scripts are not LSB-compliant and complain when asked to stop resources which are not running. You will have to fix those for them to work properly as cluster resource agents.
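The following sketch shows one way a stop function could satisfy this requirement. It uses a hypothetical resource that counts as "active" when a marker file exists; demo_stop and STATE_FILE are names invented for the example.

```shell
#!/bin/sh
# Hypothetical resource: "active" means this marker file exists.
STATE_FILE="${STATE_FILE:-/tmp/demo-resource.active}"

demo_stop() {
    # Already stopped: succeed quietly instead of complaining.
    if [ ! -e "$STATE_FILE" ]; then
        return 0
    fi
    # Deactivate the resource; a nonzero return signals a stop failure.
    rm -f "$STATE_FILE" || return 1
    return 0
}
```

An init script that exits nonzero in the "already stopped" branch is the kind of script that would need fixing before use as a cluster resource agent.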

status operation

Determine running status of the given resource.

The status operation must report status correctly, and it must print either OK or running when the resource is active. It must not print either of those words when the resource is inactive. The exit code of the status operation is ignored.

This sounds quite odd, but it's a historical hangover for compatibility with earlier versions of Linux distributions where the init scripts didn't reliably give proper status exit codes, but they did print OK or running reliably.
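To make the output convention concrete, here is a small sketch. demo_status is a hypothetical agent-side function for a resource that counts as "active" when a marker file exists. output_says_active is an assumed caller-side check based on simple substring matching; it is not taken from Heartbeat's code, and the real matching rules may differ.

```shell
#!/bin/sh
# Hypothetical resource: "active" means this marker file exists.
STATE_FILE="${STATE_FILE:-/tmp/demo-resource.active}"

demo_status() {
    if [ -e "$STATE_FILE" ]; then
        echo "running"
    else
        # Deliberately avoids the words "OK" and "running".
        echo "stopped"
    fi
}

# Assumed caller-side interpretation: only the printed text matters,
# and the exit code of the status operation is not consulted.
output_says_active() {
    case "$1" in
        *OK*|*running*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Under this assumed substring matching, an inactive resource that printed "not running" would be treated as active, which is why the inactive message here is "stopped".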

Heartbeat calls the status operation in many places. We do it before starting any resource, and also (IIRC) when releasing resources.

After repeated stop failures, we will do a status on the resource. If the status reports that the resource is still running, then we will reboot the machine to make sure things are really stopped. Note that this behaviour applies only to haresources-based clusters; CRM/Pacemaker clusters use STONITH instead.

Concurrency

Start, stop and status operations are NEVER overlapped on a given resource on a given machine. You don't have to worry about concurrency of an operation on a resource.

Parameters

Unlike LSB Resource Agents, a Heartbeat Resource Agent can be passed a list of positional parameters. The parameters go before the operation name, like this:

IPaddr 10.10.10.1 start

The haresources entry that corresponds to this set of parameters is:

IPaddr::10.10.10.1

The cluster then invokes the script with these parameters, followed by the start operation.
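The mapping between a haresources entry and the resulting command line can be sketched as follows. Both helpers are hypothetical and written only for illustration: resource_to_command assumes that parameters are separated by "::" and contain no whitespace, and split_args shows how an agent might separate its positional parameters from the trailing operation name.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not part of Heartbeat.

# Turn a haresources-style entry plus an operation into a command line.
# Assumes "::" separates the script name and each parameter.
resource_to_command() {
    entry="$1"
    operation="$2"
    printf '%s %s\n' "$(printf '%s\n' "$entry" | sed 's/::/ /g')" "$operation"
}

# Inside an agent: the last argument is the operation, and everything
# before it is a positional parameter. Parameters are joined with
# spaces, so this sketch does not preserve embedded whitespace.
split_args() {
    OPERATION=""
    PARAMS=""
    [ "$#" -ge 1 ] || return 1
    while [ "$#" -gt 1 ]; do
        PARAMS="${PARAMS:+$PARAMS }$1"
        shift
    done
    OPERATION="$1"
}
```

For example, resource_to_command IPaddr::10.10.10.1 start prints "IPaddr 10.10.10.1 start", and calling split_args 10.10.10.1 start leaves OPERATION set to "start" and PARAMS set to "10.10.10.1".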

Location

In haresources mode, Heartbeat looks for resource scripts first in /etc/ha.d/resource.d and then in /etc/init.d.
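The search order can be illustrated with a short sketch. find_resource_script and RESOURCE_DIRS are hypothetical names, and the test for an executable regular file is an assumption; the real lookup in Heartbeat may apply different checks.

```shell
#!/bin/sh
# Directories in the search order described above. The variable can be
# overridden for experimentation; directory names must not contain spaces.
RESOURCE_DIRS="${RESOURCE_DIRS:-/etc/ha.d/resource.d /etc/init.d}"

# Print the path of the first matching script, or return 1 if none is found.
find_resource_script() {
    for dir in $RESOURCE_DIRS; do
        # Assumption: a usable script is an executable regular file.
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            echo "$dir/$1"
            return 0
        fi
    done
    return 1
}
```

With this order, a script of the same name in /etc/ha.d/resource.d takes precedence over one in /etc/init.d.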

See Also

Resource Agents, haresources, LSB Resource Agents, OCF Resource Agents