Last site update: 2008-10-11 01:00:32

Contents

  1. Start and Stop
    1. Timeout Example
  2. Resource Monitoring
    1. Monitoring Examples
  3. Per Action Parameters
    1. Parameter Examples
    2. Finding Parameters in Resource Agents
      1. LSB Init Scripts
      2. OCF Scripts
      3. Legacy Heartbeat Scripts

Start and Stop

Version 1.x of Heartbeat supported three actions:

  • start,
  • stop, and
  • status

Version 2 supports only the start and stop actions directly, but it lets you specify a timeout for either action. If the action does not complete within the timeout, it is considered to have failed and recovery measures are taken.

Timeouts must be specified per action, per resource. There is no global default and no per-resource default.

Timeout Example

<primitive id="NameServer" class="lsb" type="named">
    <operations>
        <op id="1" name="stop"  timeout="3s"/>
        <op id="2" name="start" timeout="5s"/>
    </operations>
</primitive>

Annotated DTD

Resource Monitoring

One of the most requested Heartbeat features was the ability to detect the failure of an individual resource, not just of the whole node.

To support this, the CRM also knows about monitor actions.

NOTE: monitor actions are not executed by default. If you want Heartbeat to make sure the resource is running, you must specify one or more monitor actions in the operations section of the resource. You have to define one monitor action for each of the resource's roles (e.g. role="master", role="slave").

In addition to the timeout field, monitor actions must also specify an interval. This tells Heartbeat how often it should check the resource's status.

Monitoring Examples

This example indicates that the resource should be checked every 10 seconds to see if it is still running.

<primitive id="NameServer" class="lsb" type="named">
    <operations>
        <op id="1" name="stop"  timeout="3s"/>
        <op id="2" name="start" timeout="5s"/>
        <op id="3" name="monitor" interval="10s"  timeout="3s"/>
    </operations>
</primitive>

Here we add a second monitor action, one that runs once per minute.

<primitive id="NameServer" class="lsb" type="named">
    <operations>
        <op id="1" name="stop"  timeout="3s"/>
        <op id="2" name="start" timeout="5s"/>
        <op id="3" name="monitor" interval="10s"  timeout="3s"/>
        <op id="4" name="monitor" interval="1min" timeout="5s"/>
    </operations>
</primitive>

NOTE: Each monitor operation for the resource must have a unique interval.
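The unique-interval rule is easy to break when copying op lines around. The following is a minimal sketch of a check for it, not a Heartbeat tool: the function name duplicate_monitor_intervals is made up for this page, it assumes the XML for a single resource is fed to it with one op element per line (as in the examples above), and it compares the interval strings literally, so it will not notice that "60s" and "1min" describe the same interval.

```shell
#!/bin/sh
# Hypothetical helper: print every monitor interval that occurs more
# than once in the resource XML read from standard input.
# Assumptions: one <op .../> element per line, a single resource per
# input, and literal comparison of the interval strings.
duplicate_monitor_intervals() {
    grep 'name="monitor"' |
        sed -n 's/.*interval="\([^"]*\)".*/\1/p' |
        sort |
        uniq -d
}
```

Feeding it the XML of one resource prints nothing when all monitor intervals differ, and prints each repeated interval once otherwise.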

Here we define a monitor action for a MultiState (master_slave) resource.

<master_slave id="ms_1" interleave="true">
    <meta_attributes id="ms_1_ma">
        <attributes>
        ...
        </attributes>
    </meta_attributes>
    <primitive class="ocf" id="drbd" provider="heartbeat" type="drbd">
        <operations>
            <!-- role="Started" is the default value -->
            <op name="monitor" id="drbd_www_mon_normal" interval="15s" timeout="10s" />
            <op name="monitor" id="drbd_www_mon_slave" interval="10s" timeout="10s" role="Slave" />
            <op name="monitor" id="drbd_www_mon_master" interval="5s" timeout="10s" role="Master" />
        </operations>
    </primitive>
</master_slave>

NOTE: As always, each monitor operation for the resource must have a unique interval. Moreover, if no role="" attribute is given, role defaults to "Started".

Per Action Parameters

It is also possible to pass extra parameters to a ResourceAgent depending on the type of action being performed. This is done using instance_attributes.

Parameter Examples

Below you'll find an example of how to tell the resource agent what type of check to use when determining the resource's status.

  • The example below follows the OCF standard (specifically section 2.5.3.1) to specify what type of check to make.

<primitive id="NameServer" class="ocf" type="apache" provider="heartbeat">
    <operations>
        <op id="1" name="stop"  timeout="3s"/>
        <op id="2" name="start" timeout="5s"/>
        <op id="3" name="monitor" interval="10s"  timeout="3s">
            <instance_attributes id="monitor_10s">
                <attributes>
                    <nvpair id="OCF_CHECK_LEVEL_MON_10SEC" name="OCF_CHECK_LEVEL" value="0"/>
                </attributes>
            </instance_attributes>
        </op>
        <op id="4" name="monitor" interval="1min" timeout="5s">
            <instance_attributes id="monitor_1min">
                <attributes>
                    <nvpair id="OCF_CHECK_LEVEL_MON_1MIN" name="OCF_CHECK_LEVEL" value="10"/>
                </attributes>
            </instance_attributes>
        </op>
        <op id="5" name="monitor" interval="30min" timeout="20s">
            <instance_attributes id="monitor_30min">
                <attributes>
                    <nvpair id="OCF_CHECK_LEVEL_MON_30MIN" name="OCF_CHECK_LEVEL" value="20"/>
                </attributes>
            </instance_attributes>
        </op>
    </operations>
</primitive>
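On the agent side, the check level has to be read and acted upon. The following is a rough sketch of what a monitor action might do with it, not code from any shipped agent. The three check_* helpers are placeholders, the function name example_monitor is made up, and the sketch assumes that a higher level asks for a more thorough check. Because the exact variable name seen by the agent is an assumption here, the sketch accepts either OCF_CHECK_LEVEL or OCF_RESKEY_OCF_CHECK_LEVEL and falls back to 0.

```shell
#!/bin/sh
# Placeholder checks; a real agent would probe its service here.
check_process()  { echo "level 0: process check"; }
check_response() { echo "level 10: service answers a request"; }
check_content()  { echo "level 20: full end-to-end check"; }

# Hypothetical monitor action: run progressively deeper checks
# depending on the requested check level.
example_monitor() {
    # Assumption: the level arrives under one of these two names;
    # if neither is set, use the cheapest check (0).
    level="${OCF_CHECK_LEVEL:-${OCF_RESKEY_OCF_CHECK_LEVEL:-0}}"

    check_process || return 1          # non-zero signals failure
    if [ "$level" -ge 10 ]; then
        check_response || return 1
    fi
    if [ "$level" -ge 20 ]; then
        check_content || return 1
    fi
    return 0
}
```

With this layout, the 10-second monitor from the example runs only the cheapest check, while the 30-minute monitor runs all three.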

The example below assumes someone named named-provider has provided you with an OCF-compliant resource agent.

<primitive id="NameServer" class="ocf" type="named" provider="named-provider">
    <operations>
        <op id="1" name="stop"  timeout="3s"/>
        <op id="2" name="start" timeout="5s">
            <instance_attributes id="start">
                <attributes>
                    <nvpair id="foo" name="foo" value="bar"/>
                </attributes>
            </instance_attributes>
        </op>
        <op id="3" name="monitor" interval="10s"  timeout="3s">
            <instance_attributes id="monitor_10s">
                <attributes>
                    <nvpair id="OCF_CHECK_LEVEL_MON_10S" name="OCF_CHECK_LEVEL" value="0"/>
                    <nvpair id="check_hosts_mon_10s" name="check_hosts" value="www.mycorp.com"/>
                </attributes>
            </instance_attributes>
        </op>
        <op id="4" name="monitor" interval="1min" timeout="5s">
            <instance_attributes id="monitor_1min">
                <attributes>
                    <nvpair id="OCF_CHECK_LEVEL_MON_1MIN" name="OCF_CHECK_LEVEL" value="10"/>
                    <nvpair id="check_hosts_mon_1min" name="check_hosts" value="www.mycorp.com,www.google.com"/>
                </attributes>
            </instance_attributes>
        </op>
        <op id="5" name="monitor" interval="30min" timeout="20s">
            <instance_attributes id="monitor_30min">
                <attributes>
                    <nvpair id="OCF_CHECK_LEVEL_MON_30MIN" name="OCF_CHECK_LEVEL" value="20"/>
                    <nvpair id="check_hosts_mon_30_min" name="check_hosts" value="www.mycorp.com,www.google.com"/>
                    <nvpair id="verify_with" name="verify_with" value="alt.dns.server"/>
                </attributes>
            </instance_attributes>
        </op>
    </operations>
</primitive>

Finding Parameters in Resource Agents

Remember that parameters are named differently depending on the type of ResourceAgent you are using.

LSB Init Scripts

LSB init scripts do not take parameters.

OCF Scripts

OCF resource scripts are the only form of resource agent that takes name/value parameters. Each parameter reaches the script as an environment variable whose name is the parameter name with an OCF_RESKEY_ prefix.

echo $OCF_RESKEY_check_hosts
www.mycorp.com,www.google.com 
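An agent will usually want to do more than echo the value. As a small sketch, the function below splits the comma-separated check_hosts value into one host per line so that the agent can loop over it. Both check_hosts and the function name list_check_hosts are illustrative only; check_hosts is the made-up parameter from the named-provider example, not a parameter of any shipped agent.

```shell
#!/bin/sh
# Hypothetical helper for an OCF agent: print each host listed in the
# comma-separated check_hosts parameter on its own line.
# Prints nothing if the parameter is unset or empty.
list_check_hosts() {
    [ -n "${OCF_RESKEY_check_hosts:-}" ] || return 0
    printf '%s\n' "$OCF_RESKEY_check_hosts" | tr ',' '\n'
}
```

A monitor action could then read the output line by line and test each host in turn.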

Legacy Heartbeat Scripts

Named parameters are not supported; instead, the name must refer to the (relative) position of the value as an argument.

No example is provided here, because using action parameters with Legacy Heartbeat RAs is painful and discouraged. Please consider using the OCF version instead, or converting any custom RAs to OCF scripts. It's really quite easy.

There is a reason they are called Legacy scripts :)