
pingd is a replacement for ipfail that, like the rest of version 2, allows connectivity monitoring (and resource placement based on relative connectivity) to work in clusters with any number of nodes.
The role of pingd is to detect changes to a node's connectivity and ensure that updates of this information to the CIB[1] occur (at least effectively) simultaneously.
To place resources on the node(s) with the greatest connectivity, an administrator must use the information that pingd places in the CIB. This is done by creating resource location[2] constraints that reference the attribute created by pingd. See "Using pingd Output in Location Constraints" below.
There are two options for configuring pingd.
The first is to add a respawn directive to ha.cf, e.g.:
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
See "pingd Usage Information" below for details on the meaning of the options passed to pingd.
The other option is to create a clone[3] using the pingd OCFResourceAgent[4]. Since RAs are started as root, you need to add a line like "apiauth pingd uid=root" to your ha.cf[5] or, even better, add the parameter user=hacluster to your RA configuration so that it uses the implicit apiauth directive, as the respawn method does. With version 2.0.6 you also have to add the pidfile parameter, pointing to a location where the chosen user has write permission (e.g. /tmp/pingd-default).
Both methods need the ping nodes listed in your ha.cf[5] file. The host_list parameter of the RA can only use a subset of those, not some other hosts.
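The subset requirement can be checked before the cluster configuration is loaded. The following is a minimal sketch, not part of Heartbeat: it assumes ping nodes appear in ha.cf on lines beginning with the "ping" keyword and that host_list is a whitespace-separated string. The function names and the sample addresses are hypothetical.

```python
def ping_nodes_from_ha_cf(text):
    """Collect the hosts named on 'ping' lines of an ha.cf-style text.

    Assumption: ping nodes are declared as 'ping host1 host2 ...'.
    """
    nodes = set()
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if fields and fields[0] == "ping":
            nodes.update(fields[1:])
    return nodes


def unknown_hosts(host_list, ha_cf_text):
    """Return the hosts in host_list that ha.cf does not list as ping nodes."""
    return sorted(set(host_list.split()) - ping_nodes_from_ha_cf(ha_cf_text))


# Hypothetical ha.cf excerpt used only for illustration.
sample_ha_cf = """
ping 10.0.0.1 10.0.0.2
ping 10.0.0.3
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s
"""

print(unknown_hosts("10.0.0.1 10.0.0.3", sample_ha_cf))     # a valid subset
print(unknown_hosts("10.0.0.1 192.168.1.1", sample_ha_cf))  # one host missing from ha.cf
```

An empty result means every host in host_list is also a ping node in ha.cf.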
Both methods also require the addition of one or more location constraints to the CIB. See "Using pingd Output in Location Constraints" below.
The advantage of using the resource agent is that you can:
An equivalent resource for the respawn directive above would be:
<clone id="pingd-clone">
<meta_attributes id="pingd-clone-ma">
<attributes>
<nvpair id="pingd-clone-1" name="globally_unique" value="false"/>
</attributes>
</meta_attributes>
<primitive id="pingd-child" provider="heartbeat" class="ocf" type="pingd">
<operations>
<op id="pingd-child-monitor" name="monitor" interval="20s" timeout="60s" prereq="nothing"/>
<op id="pingd-child-start" name="start" prereq="nothing"/>
</operations>
<instance_attributes id="pingd_inst_attr">
<attributes>
<nvpair id="pingd-1" name="dampen" value="5s"/>
<nvpair id="pingd-2" name="multiplier" value="100"/>
</attributes>
</instance_attributes>
</primitive>
</clone>
NOTE: Changing the attribute's location in the CIB, while possible, is discouraged. You may end up with multiple copies of the attribute for each node, causing the cluster to behave differently than expected.
usage: pingd [-V?p:a:d:s:S:h:Dm:]
--help (-?) This text
--daemonize (-D) Run in daemon mode
--pid-file (-p) <filename> File in which to store the process' PID
* Default=/tmp/pingd.pid
--attr-name (-a) <string> Name of the node attribute to set
* Default=pingd
--attr-set (-s) <string> Name of the set in which to set the attribute
* Default=cib-bootstrap-options
--attr-section (-S) <string> Which part of the CIB to put the attribute in
* Default=status
--ping-host (-h) <single_host_name> Monitor a subset of the ping nodes listed in ha.cf
(can be specified multiple times)
--attr-dampen (-d) <integer> How long to wait for no further changes to occur before
updating the CIB with a changed attribute
--value-multiplier (-m) <integer> For every connected node, add <integer> to the value set in the CIB
* Default=1
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a default_ping_set
| Node | Connected Ping Nodes | default_ping_set Value |
| c001n01 | 5 | 500 |
| c001n02 | 4 | 400 |
| c001n03 | 5 | 500 |
| c001n04 | N/A | N/A |
| c001n05 | 0 | 0 |
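The values in this table follow from the -m 100 option: each connected ping node adds the multiplier to the attribute. A minimal sketch of that arithmetic, with a hypothetical helper name and None standing in for a node on which the attribute is not set (N/A):

```python
def ping_attribute_value(connected_ping_nodes, multiplier=1):
    """Value pingd would publish: multiplier added once per connected ping node.

    None models a node where the attribute is not set at all (N/A in the table).
    """
    if connected_ping_nodes is None:
        return None
    return connected_ping_nodes * multiplier


connected = {"c001n01": 5, "c001n02": 4, "c001n03": 5, "c001n04": None, "c001n05": 0}
values = {node: ping_attribute_value(count, multiplier=100)
          for node, count in connected.items()}

for node, value in values.items():
    print(node, value)
```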
<rsc_location id="my_resource_connected" rsc="my_resource">
<rule id="my_resource_connected_rule" score_attribute="default_ping_set">
<expression id="my_resource_connected_expr_defined" attribute="default_ping_set" operation="defined"/>
</rule>
</rsc_location>
The above constraint:
requires a value to be set for default_ping_set (c001n04 is unaltered)
adds the value of default_ping_set to the node's score, so a value of 0 changes nothing (c001n05 is unaltered)
increases the preference for my_resource to run on c001n01 by 500
increases the preference for my_resource to run on c001n02 by 400
increases the preference for my_resource to run on c001n03 by 500
| Node | Connected Ping Nodes | default_ping_set Value | Combined Score |
| c001n01 | 5 | 500 | 500 |
| c001n02 | 4 | 400 | 400 |
| c001n03 | 5 | 500 | 500 |
| c001n04 | N/A | N/A | 0 |
| c001n05 | 0 | 0 | 0 |
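The effect of a rule that combines score_attribute with a single "defined" expression can be modelled roughly as follows. This is an illustrative sketch, not the policy engine's actual code; the function name and the dictionary representation of node attributes are assumptions.

```python
def score_attribute_rule(node_attrs, attribute):
    """Score contributed by a rule with score_attribute=<attribute> and one
    'defined' expression on the same attribute.

    If the attribute is missing, the rule does not match and contributes 0.
    """
    if attribute not in node_attrs:
        return 0
    return int(node_attrs[attribute])


# Node attributes as in the table above; c001n04 has no default_ping_set at all.
nodes = {
    "c001n01": {"default_ping_set": "500"},
    "c001n02": {"default_ping_set": "400"},
    "c001n03": {"default_ping_set": "500"},
    "c001n04": {},
    "c001n05": {"default_ping_set": "0"},
}

scores = {name: score_attribute_rule(attrs, "default_ping_set")
          for name, attrs in nodes.items()}
print(scores)
```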
If we also had the following constraint:
<rsc_location id="my_resource_preferred" rsc="my_resource">
<rule id="my_resource_prefer_c001n01" score="100">
<expression id="my_resource_prefer_c001n01_expr" attribute="#uname" operation="eq" value="c001n01"/>
</rule>
<rule id="my_resource_prefer_c001n02" score="200">
<expression id="my_resource_prefer_c001n02_expr" attribute="#uname" operation="eq" value="c001n02"/>
</rule>
<rule id="my_resource_prefer_c001n03" score="300">
<expression id="my_resource_prefer_c001n03_expr" attribute="#uname" operation="eq" value="c001n03"/>
</rule>
<rule id="my_resource_never" score="-INFINITY" boolean_op="or">
<expression id="my_resource_never_c001n04_expr" attribute="#uname" operation="eq" value="c001n04"/>
<expression id="my_resource_never_c001n05_expr" attribute="#uname" operation="eq" value="c001n05"/>
</rule>
</rsc_location>
Then the updated scores for running the resource would be:
| Node | Connected Ping Nodes | default_ping_set Value | Combined Score |
| c001n01 | 5 | 500 | 600 |
| c001n02 | 4 | 400 | 600 |
| c001n03 | 5 | 500 | 800 |
| c001n04 | N/A | N/A | -INFINITY |
| c001n05 | 0 | 0 | -INFINITY |
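The combined scores are the sum of the connectivity score and the per-node preference from the second constraint. The sketch below reproduces that addition; it uses Python's float("-inf") as a stand-in for -INFINITY, which is an assumption made for illustration (the cluster has its own representation), chosen because adding any finite score to it still yields -INFINITY.

```python
NEG_INF = float("-inf")  # stand-in for the CIB's -INFINITY (illustrative assumption)

# Scores from the connectivity rule (default_ping_set value, 0 where undefined).
ping_scores = {"c001n01": 500, "c001n02": 400, "c001n03": 500,
               "c001n04": 0, "c001n05": 0}

# Scores from the my_resource_preferred constraint.
preferences = {"c001n01": 100, "c001n02": 200, "c001n03": 300,
               "c001n04": NEG_INF, "c001n05": NEG_INF}

combined = {node: ping_scores[node] + preferences[node] for node in ping_scores}

for node, score in combined.items():
    print(node, score)
```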
At this point, if the resource was not running or default_resource_stickiness[6] was set to zero, then the resource would be started on c001n03, with c001n01 and c001n02 equally preferred as backups.
However, if the resource was running on c001n02 and resource_stickiness was set to 1000, then the updated scores would be:
| Node | Connected Ping Nodes | default_ping_set Value | Combined Score |
| c001n01 | 5 | 500 | 600 |
| c001n02 | 4 | 400 | 1600 |
| c001n03 | 5 | 500 | 800 |
| c001n04 | N/A | N/A | -INFINITY |
| c001n05 | 0 | 0 | -INFINITY |
and the resource would be left running on c001n02.
Alternatively, if resource_stickiness was set to 100, then the scores would look like this:
| Node | Connected Ping Nodes | default_ping_set Value | Combined Score |
| c001n01 | 5 | 500 | 600 |
| c001n02 | 4 | 400 | 700 |
| c001n03 | 5 | 500 | 800 |
| c001n04 | N/A | N/A | -INFINITY |
| c001n05 | 0 | 0 | -INFINITY |
and the resource would be moved to c001n03.
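The two stickiness scenarios can be replayed with a small model: stickiness is added to the score of the node currently running the resource, and the highest total wins. This is a simplified sketch under those assumptions, not the cluster's real placement algorithm; the function name is hypothetical and float("-inf") again stands in for -INFINITY.

```python
NEG_INF = float("-inf")  # stand-in for -INFINITY (illustrative assumption)

# Combined scores before stickiness is applied.
base_scores = {"c001n01": 600, "c001n02": 600, "c001n03": 800,
               "c001n04": NEG_INF, "c001n05": NEG_INF}


def place(scores, running_on=None, stickiness=0):
    """Return (chosen node, final scores) after adding stickiness to the
    node that currently runs the resource, if any."""
    final = dict(scores)
    if running_on is not None:
        final[running_on] += stickiness
    return max(final, key=final.get), final


print(place(base_scores))                                       # not running anywhere
print(place(base_scores, running_on="c001n02", stickiness=1000))
print(place(base_scores, running_on="c001n02", stickiness=100))
```

With a stickiness of 1000 the resource stays on c001n02 (1600 beats 800); with 100 it moves to c001n03 (700 loses to 800).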
This should also adequately demonstrate the importance of correctly setting:
pingd's --value-multiplier option
default_resource_stickiness / resource_stickiness
score in rsc_location constraints
Add this to ha.cf:
respawn root /usr/lib/heartbeat/pingd -m 100 -d 5s -a pingd
Add this constraint to the CIB:
It is sometimes desirable to shut a particular service down if ping connectivity is lost. This rule will prohibit the service from running anywhere that there is no ping connectivity to the outside world, and all nodes with some connectivity are treated as the same, regardless of how many ping nodes are accessible.
<rsc_location id="my_resource:connected" rsc="my_resource">
<rule id="my_resource:connected:rule" score="-INFINITY" boolean_op="or">
<expression id="my_resource:connected:expr:undefined"
attribute="pingd" operation="not_defined"/>
<expression id="my_resource:connected:expr:zero"
attribute="pingd" operation="lte" value="0"/>
</rule>
</rsc_location>
Of course, if you have configured the pingd[7] daemon to set some attribute name besides its default (pingd), then you need to change the name of the attribute above from pingd to whatever name you have configured the pingd[7] daemon to use.
Attention: Note that this will stop the resource everywhere if the pinged node(s) do go down or heartbeat loses connectivity to them (firewalls et cetera). Consider using the PingdAttrAsScore idiom[8] instead, which expresses a positive preference for the node with the best connectivity.
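The matching logic of the -INFINITY rule (attribute not defined, or less than or equal to zero) can be sketched as below. It also shows the situation the warning describes: when every node loses connectivity, no node remains eligible. The function name and the node data are hypothetical.

```python
def prohibited(node_attrs, attribute="pingd"):
    """True if the -INFINITY rule matches: the attribute is not defined
    OR its value is less than or equal to zero (boolean_op="or")."""
    if attribute not in node_attrs:
        return True
    return int(node_attrs[attribute]) <= 0


def eligible_nodes(cluster, attribute="pingd"):
    """Names of the nodes on which the resource may still run."""
    return sorted(name for name, attrs in cluster.items()
                  if not prohibited(attrs, attribute))


healthy = {"node1": {"pingd": "200"}, "node2": {"pingd": "100"}, "node3": {}}
outage = {"node1": {"pingd": "0"}, "node2": {"pingd": "0"}, "node3": {}}

print(eligible_nodes(healthy))  # nodes with some connectivity
print(eligible_nodes(outage))   # empty: the resource is stopped everywhere
```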
This is probably one of the better ways to use pingd. In this method, the pingd attribute value becomes the score for the rule. So, the --value-multiplier you set will depend heavily on the scores you give other criteria. This rule will not stop a resource completely if all nodes lose connectivity to the outside world.
It is often desirable to use the value of the attribute that pingd[7] sets directly as the score for a particular rule.
If you set the pingd[7] scaling factor to 100, then having access to one ping node is worth 100, two ping nodes are worth 200, and so on.
This way, if all else is equal, the node with the highest ping connectivity will be selected. If two or more eligible nodes have the same score, then they will be given equal weight according to the rule below.
<rsc_location id="my_resource:connected" rsc="my_resource">
<rule id="my_resource:connected:rule" score_attribute="pingd" >
<expression id="my_resource:connected:expr:defined"
attribute="pingd" operation="defined"/>
</rule>
</rsc_location>
Of course, if you have configured the pingd[7] daemon to set some attribute name besides its default (pingd), then you need to change the name of the score_attribute above from pingd to whatever attribute you have configured the pingd[7] daemon to use.
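Because the attribute name appears twice in the constraint (in score_attribute and in the expression), generating the XML from a single parameter keeps the two in sync. The following is a hedged sketch using Python's standard xml.etree.ElementTree module; the function name is hypothetical, and the element and id layout simply mirrors the constraint shown above.

```python
import xml.etree.ElementTree as ET


def score_rule_constraint(resource, attribute="pingd"):
    """Build an rsc_location element that uses <attribute> as the rule score."""
    location = ET.Element(
        "rsc_location", {"id": f"{resource}:connected", "rsc": resource})
    rule = ET.SubElement(
        location, "rule",
        {"id": f"{resource}:connected:rule", "score_attribute": attribute})
    ET.SubElement(
        rule, "expression",
        {"id": f"{resource}:connected:expr:defined",
         "attribute": attribute, "operation": "defined"})
    return location


# Example: pingd was started with "-a default_ping_set".
constraint = score_rule_constraint("my_resource", "default_ping_set")
xml_text = ET.tostring(constraint, encoding="unicode")
print(xml_text)
```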
| [1] | http://www.linux-ha.org/CIB |
| [2] | http://www.linux-ha.org/v2/dtd1.0/annotated#rsc%2Blocation |
| [3] | http://www.linux-ha.org/v2/Concepts/Clones |
| [4] | http://www.linux-ha.org/OCFResourceAgent |
| [5] | http://www.linux-ha.org/ha.cf |
| [6] | http://www.linux-ha.org/v2/dtd1.0/annotated#default%2Bresource%2Bstickiness |
| [7] | http://www.linux-ha.org/pingd |
| [8] | http://www.linux-ha.org/CIB/Idioms/PingdAttrAsScore |
This information provided courtesy of the Linux-HA project at http://linux-ha.org/