Kindly provided by John R. Shearer, PUREmail, Inc. <john@puremail.com>
Edited by Lars Ellenberg
Last updated: 2005-08-22
Providing NFS service in a High Availability cluster presents several specific problems. This document is intended to give the reader enough information to implement HA file service successfully. Because of various distribution-specific issues, only Debian GNU/Linux 3.1 "Sarge" is discussed here, though most of the information applies to other Un*x and Linux variants. Where possible I have included information for other distributions.
See Also: HaNFS
To properly install the proposed environment the following hardware will be required:
For the purposes of this document the hosts will be named and have IP addresses as follows:
host-a      10.0.0.10
host-b      10.0.0.20
fs-cluster  10.0.0.30
Alternatively, just apt-get it; see ../InstallDebianPackages07. Precompiled packages for other distributions (rpms) can be found at http://www.linbit.com/support/drbd-current
# apt-get install drbd0.7-module-source
# apt-get install drbd0.7-utils
# apt-get install dpatch
# cd /usr/src
# tar -zxf drbd0.7.tar.gz
# cd /usr/src/modules/drbd
# module-assistant prepare
# module-assistant automatic-install drbd0.7-module-source
Navigate the module package creation procedure as logically as possible; details for this procedure are not provided.
# cd /usr/src
# dpkg -i drbd0.7-module-2.4.27-2-k7_0.7.10-3+2.4.27-8_i386.deb
http://lists.linbit.com/pipermail/drbd-dev/2005-February/000266.html
resource drbd-resource-0 {
protocol C;
incon-degr-cmd "halt -f"; # killall heartbeat would be a good alternative :->
disk {
on-io-error panic;
}
syncer {
rate 10M; # Note: 'M' is MegaBytes, not MegaBits
}
on host-a {
device /dev/drbd0;
disk /dev/hda8;
address 10.0.0.10:7789;
meta-disk internal;
}
on host-b {
device /dev/drbd0;
disk /dev/hda8;
address 10.0.0.20:7789;
meta-disk internal;
}
}
# /etc/init.d/drbd start
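To get a feel for what the syncer rate in the configuration above means in practice, the arithmetic can be sketched in shell. This is only a rough estimate: it assumes the whole device is resynchronized at exactly the configured rate, and the 40960 MB partition size is a made-up example, not a value from this setup. The function name full_sync_seconds is likewise just an illustration.

```shell
#!/bin/sh
# Rough estimate of a full DRBD resync time.
# Assumption: the whole device is copied at exactly the configured syncer rate.
# Arguments: device size in megabytes, syncer rate in megabytes per second.
full_sync_seconds() {
    size_mb=$1
    rate_mb=$2
    # Round up so that a partial last second still counts.
    echo $(( (size_mb + rate_mb - 1) / rate_mb ))
}

# Hypothetical 40 GB partition at "rate 10M" (10 megabytes per second).
full_sync_seconds 40960 10
```

For the hypothetical 40 GB partition this prints 4096, i.e. a little over an hour for a full resync, which is worth keeping in mind when choosing the rate and the replication link.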
host-a: # drbdadm primary all
You may have to force the issue by running:
host-a: # drbdsetup /dev/drbd0 primary --do-what-I-say
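Before going further it can help to confirm what state DRBD reports for the device. The following is a minimal sketch, not part of the original procedure: it assumes that the status file (normally /proc/drbd) contains a line of the form " 0: cs:Connected st:Primary/Secondary ld:Consistent", and the helper name drbd_conn_state is hypothetical. The status file can be passed as a second argument, which makes the helper easy to try against a saved copy.

```shell
#!/bin/sh
# Print the connection state (the "cs:" field) of one DRBD minor device.
# Assumption: the status file has a line such as
#   " 0: cs:Connected st:Primary/Secondary ld:Consistent"
# Arguments: minor number, optional status file (defaults to /proc/drbd).
drbd_conn_state() {
    minor=$1
    status_file=${2:-/proc/drbd}
    awk -v m="$minor:" '$1 == m {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^cs:/) { sub(/^cs:/, "", $i); print $i }
    }' "$status_file"
}

# Example: only report success once the peers are connected.
# if [ "$(drbd_conn_state 0)" = "Connected" ]; then echo "drbd0 is connected"; fi
```

If the layout of the status file differs on your DRBD version, the awk pattern will need adjusting; the function prints nothing when no matching line is found.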
host-a: # mkreiserfs /dev/drbd0 (or mke2fs, or whatever filesystem you prefer)
host-a: # mkdir /share
host-a: # mkdir /share/spool0
host-a: # mount /dev/drbd0 /share/spool0
Now create a subdirectory that will be shared out via NFS:
host-a: # mkdir /share/spool0/data
# apt-get install nfs-kernel-server
# /etc/init.d/nfs-kernel-server stop
# echo "/share/spool0/data 10.3.11.0/255.255.255.0(rw,sync)" >> /etc/exports
# update-rc.d -f nfs-kernel-server remove
(For non-Debian systems you can probably just delete /etc/init.d/nfs or /etc/init.d/nfs-server.)
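Because the echo command above appends to /etc/exports unconditionally, repeating the procedure leaves a duplicate entry. A small sketch of an idempotent variant follows; the helper name add_export_once is hypothetical, and the file can be overridden with a second argument for a dry run against a scratch file.

```shell
#!/bin/sh
# Append a line to an exports-style file only if that exact line is missing.
# Arguments: the line to add, optional file (defaults to /etc/exports).
add_export_once() {
    line=$1
    exports_file=${2:-/etc/exports}
    # -x matches whole lines, -F treats the line as a fixed string.
    if ! grep -qxF -- "$line" "$exports_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$exports_file"
    fi
}

# Example (as root, on both hosts):
# add_export_once "/share/spool0/data 10.3.11.0/255.255.255.0(rw,sync)"
```

The comparison is on the exact line, so an entry for the same directory with different options would still be added alongside the existing one.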
host-a: # mv /var/lib/nfs /share/spool0/varlibnfs
host-a: # ln -s /share/spool0/varlibnfs /var/lib/nfs
Note that although this procedure is often recommended, it is contested by Peter Kruse at
http://lists.linbit.com/pipermail/drbd-user/2004-June/001116.html
The author of this document makes no claim as to its necessity.
# echo 'STATDOPTS="-n my_clusters_name"' >> /etc/default/nfs-common
In Red Hat/Fedora the procedure is something like:
# echo 'STATD_HOSTNAME=my_clusters_name' >> /etc/sysconfig/network
(Anyone: please verify this.)
For other distributions you will probably have to insert
rpc.statd -n my_clusters_name
into the NFS startup/shutdown script. For a better description of why this modification is required, see an excellent post by Ragnar Kjørstad at http://lists.community.tummy.com/pipermail/linux-ha-dev/2003-June/006000.html
# apt-get install heartbeat
/etc/ha.d/authkeys for both hosts:
auth 2
1 crc
2 sha1 ThisIsASampleKeyAnythingAlphaNumericIsGoodHere
3 md5 ThisIsASampleKeyAnythingAlphaNumericIsGoodHere
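Rather than typing a key by hand, one can generate a random one. The following is a sketch under a few assumptions: it writes a single sha1 entry instead of the three sample entries above, the helper name write_authkeys is hypothetical, and it sets mode 600 because heartbeat expects the key file to be readable by root only. The same file must then be copied to both hosts.

```shell
#!/bin/sh
# Write a heartbeat authkeys file with a random sha1 key.
# Arguments: optional target path (defaults to /etc/ha.d/authkeys).
write_authkeys() {
    keyfile=${1:-/etc/ha.d/authkeys}
    # 32 random bytes rendered as 64 hex characters.
    key=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
    # Restrict permissions before the key is written.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$keyfile" )
    chmod 600 "$keyfile"
}

# Example (as root on host-a, then copy the file to host-b unchanged):
# write_authkeys
```

Passing a path as the first argument writes the file elsewhere, which is handy for inspecting the result before installing it.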
/etc/ha.d/ha.cf for host-a (host-b would have the same configuration but with "ucast eth0 10.0.0.10" instead of 10.0.0.20):
keepalive 1
deadtime 10
warntime 5
initdead 60
udpport 694
baud 19200
serial /dev/ttyS0
ucast eth0 10.0.0.20
auto_failback off
watchdog /dev/watchdog
node host-a
node host-b
The node names must match the output of uname -n on each host. Note that auto_failback is off; we should not fail back unless a human has confirmed that the DRBD state is consistent.
http://lists.linbit.com/pipermail/drbd-user/2004-June/001107.html to resolve a problem getting the error "Stale NFS file handle" during failovers. This may be a Debian specific problem and it may even have been resolved; consider omitting it during your testing. If anyone has any advice on this step please let me know. I'd suggest reading the HaNFS page - those directions work fine without Delay...
# echo 'killall -9 nfsd ; exit 0' > /etc/heartbeat/resource.d/killnfsd
# chmod 755 /etc/heartbeat/resource.d/killnfsd
It seems that the Debian nfs-kernel-server RC script often fails to stop all of the running nfsd processes. This can result in clients getting "Stale NFS file handle" errors when attempting to fail over gracefully. This script seems to solve the problem. This procedure was inspired by Jens Dreger's post above. This is fixed in newer versions of NFS, and is also documented and explained on the HaNFS page.
/etc/heartbeat/haresources:
host-a drbddisk::drbd-resource-0 \
Filesystem::/dev/drbd0::/share/spool0::reiserfs \
killnfsd \
nfs-common \
nfs-kernel-server \
Delay::3::0 \
IPaddr::10.0.0.30/24/eth0
client-a: # mkdir /mnt/hafs
client-a: # echo "fs-cluster:/share/spool0/data /mnt/hafs nfs defaults 0 0" >> /etc/fstab
client-a: # mount -a
Note that fs-cluster must resolve to 10.0.0.30, the shared IP address of our new cluster.
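One way to check the name mapping on a client is sketched below. It assumes that names are resolved through a hosts-format file such as /etc/hosts rather than DNS, and the helper name hosts_lookup is hypothetical.

```shell
#!/bin/sh
# Look up the first address listed for a name in a hosts-format file.
# Arguments: host name, optional file (defaults to /etc/hosts).
hosts_lookup() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                      # drop comments
        { for (i = 2; i <= NF; i++)
              if ($i == n) { print $1; exit } }
    ' "$hosts_file"
}

# Example: warn if the cluster name does not point at the shared address.
# [ "$(hosts_lookup fs-cluster)" = "10.0.0.30" ] || echo "fs-cluster is not 10.0.0.30" >&2
```

If the clients use DNS instead, the same comparison can be made against whatever your resolver tool returns for fs-cluster.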
In the above examples the two hosts are connected to the network, and therefore to their NFS client hosts, by a single Ethernet interface. Should that interface, cable or hub/switch fail then the hosts will be unreachable and, moreover, they will be unable to replicate. Using the Ethernet bonding driver we can set up hosts that have multiple interfaces which operate in fail-over mode.
To improve file service availability during a full DRBD synchronization, the two file servers can be connected to each other by a cross-over cable on a separate Ethernet interface. This will reduce replication times, increase general throughput (when using DRBD protocol C), and prevent replication from interfering with file service.
These two enhancements can be combined. A server with four interfaces, two that provide network connectivity and two that provide host-to-host connectivity, gives the ultimate in throughput and reliability.
See also the many other sample NFS configurations in the PressRoom - most using DRBD.
Highly Available NFS-server using DRBD, Heartbeat and User Mode Linux by Nelson Castillo
DRBD HOWTO by David Krovich
Various posts to the DRBD-user mailing list
Various other pages on linux-ha.org, e.g. HaNFS.