Appendix D. Supplementary Hardware Information

The following sections provide additional information about configuring the hardware used in a cluster system.

D.1. Setting Up Power Controllers

This section discusses power controllers. For more information about power controllers and their role in a cluster environment, refer to Section 2.1.3 Choosing the Type of Power Controller.

D.1.1. Power Switches

For a list of serial-attached and network-attached power switches tested with and/or supported by Red Hat, Inc. for cluster power control, refer to the Red Hat Hardware Compatibility List located at the following URL:

http://hardware.redhat.com/hcl/

D.1.2. Setting Up Watchdog Power Switches

A description of the usage model for watchdog timers as a cluster data integrity provision appears in Section 2.1.3 Choosing the Type of Power Controller. As described in that section, there are two variants of watchdog timers: hardware-based and software-based.

This section details the configuration tasks required to set up watchdog timer usage in a cluster hardware configuration.

Regardless of which type of watchdog timer is employed, it is necessary to create the device special file appropriate for the watchdog timer. This can be accomplished with the following commands:

cd /dev
./MAKEDEV watchdog
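
As a quick sanity check after running the commands above, the following minimal sketch reports whether a character device node exists. The function name check_watchdog_node is hypothetical, and the default path /dev/watchdog is an assumption based on the MAKEDEV invocation shown above.

```shell
# Hypothetical helper: report whether a watchdog device node exists.
# Assumes the node is /dev/watchdog unless another path is given.
check_watchdog_node() {
    node="${1:-/dev/watchdog}"
    if [ -c "$node" ]; then
        # -c is true only for character special files
        echo "present: $node"
    else
        echo "missing: $node"
        return 1
    fi
}
```

Calling check_watchdog_node with no argument tests the default path. The check only confirms that the device file exists; it does not open the device, so it does not start a watchdog timer.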

When using the Cluster Configuration Tool, each new member added to the cluster has software watchdog functionality enabled by default.

D.1.2.1. Configuring the Software Watchdog Timer

Any cluster system can utilize the software watchdog timer as a data integrity provision, as no dedicated hardware components are required. The cluster software automatically loads the corresponding loadable kernel module called softdog.
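
To confirm that the softdog module has actually been loaded, one option is to look for it in the kernel's module list. The sketch below assumes the list is readable at /proc/modules and that each line begins with the module name followed by a space; the function name softdog_loaded is hypothetical.

```shell
# Hypothetical helper: succeed if softdog appears in the loaded-module list.
# Assumes the list is at /proc/modules unless another file is given.
softdog_loaded() {
    modules_file="${1:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    grep -q '^softdog ' "$modules_file"
}
```

For example, softdog_loaded && echo "softdog is loaded" prints the message only when the module is present.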

If the cluster is configured to utilize the software watchdog timer, the cluster quorum daemon (cluquorumd) periodically resets the timer interval. Should cluquorumd fail to reset the timer, the failed cluster member reboots itself.

When using the software watchdog timer, there is a small risk that the system hangs in such a way that the software watchdog thread is not executed. In this unlikely scenario, the other cluster member may take over the services of the apparently hung cluster member. Generally, this is a safe operation, but if the hung cluster member resumes, data corruption could occur. To further reduce this risk when using the software watchdog timer, administrators should also configure the NMI watchdog timer as well as an external power switch (if available).

D.1.2.2. Enabling the NMI Watchdog Timer

If you are using the software watchdog timer as a data integrity provision, it is also recommended to enable the Non-Maskable Interrupt (NMI) watchdog timer to enhance the data integrity guarantees. The NMI watchdog timer is a different mechanism for causing the system to reboot in the event of a hang scenario where interrupts are blocked. This NMI watchdog can be used in conjunction with the software watchdog timer.

Unlike the software watchdog timer, which is reset by the cluster quorum daemon (cluquorumd), the NMI watchdog timer counts system interrupts. Normally, a healthy system receives hundreds of device and timer interrupts per second. If there are no interrupts in a 5-second interval, the system is considered hung and the NMI watchdog timer expires, initiating a system reboot.

A robust data integrity solution can be implemented by combining the health monitoring of the cluster quorum daemon and the software watchdog timer with the low-level system status checks of the NMI watchdog.

Correct operation of the NMI watchdog timer mechanism requires that the cluster members contain an APIC chip on the main system board.

The NMI watchdog is enabled on supported systems by adding nmi_watchdog=1 to the kernel's command line. Here is an example /etc/grub.conf:

Note

The following GRUB and LILO bootloader configurations only apply to the x86 architecture of Red Hat Enterprise Linux.

# grub.conf
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title HA Test Kernel (2.4.9-10smp)
        root (hd0,0)
        # This is the kernel's command line.
        kernel /vmlinuz-2.4.9-10smp ro root=/dev/hda2 nmi_watchdog=1

# end of grub.conf

On systems using LILO, add "nmi_watchdog=1" to the append section in /etc/lilo.conf. For example:

# lilo.conf
prompt
timeout=50
default=linux
boot=/dev/hda
map=/boot/map
install=/boot/boot.b
lba32

image=/boot/vmlinuz-2.4.9-10smp
        label=linux
        read-only
        root=/dev/hda2
        append="nmi_watchdog=1"

# end of lilo.conf

Run /sbin/lilo after editing /etc/lilo.conf for the changes to take effect.
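
After rebooting with either bootloader, it can be useful to confirm that the parameter actually reached the running kernel. The following sketch assumes the kernel command line is readable at /proc/cmdline; the function name nmi_watchdog_param is hypothetical.

```shell
# Hypothetical helper: print the nmi_watchdog setting from the kernel
# command line, or return non-zero if the parameter is absent.
# Assumes the command line is at /proc/cmdline unless a file is given.
nmi_watchdog_param() {
    cmdline_file="${1:-/proc/cmdline}"
    # Split the command line into one parameter per line, then filter.
    tr ' ' '\n' < "$cmdline_file" | grep '^nmi_watchdog='
}
```

If the function prints nothing, the bootloader entry that was edited is probably not the one that was booted.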

To determine if a server supports the NMI watchdog timer, first try adding "nmi_watchdog=1" to the kernel command line as described above. After the system has booted, log in as root and type:

cat /proc/interrupts

The output should appear similar to the following:

           CPU0       
  0:    5623100          XT-PIC  timer
  1:         13          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  7:          0          XT-PIC  usb-ohci
  8:          1          XT-PIC  rtc
  9:     794332          XT-PIC  aic7xxx, aic7xxx
 10:     569498          XT-PIC  eth0
 12:         24          XT-PIC  PS/2 Mouse
 14:          0          XT-PIC  ide0
NMI:    5620998       
LOC:    5623358 
ERR:          0
MIS:          0

The relevant portion of the above output is the line labeled NMI in the left column. If the NMI count is larger than zero (0), the server supports the NMI watchdog.

If this approach fails, that is, NMI is zero, try passing nmi_watchdog=2 to the kernel instead of nmi_watchdog=1 in the manner described previously. Again, check /proc/interrupts after the system boots. If NMI has a value larger than zero, the NMI watchdog has been configured properly. If NMI is zero, your system does not support the NMI watchdog timer.
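
The check described above can be scripted. The following is a minimal sketch, not a definitive test: the function name nmi_total is hypothetical, and it assumes the NMI line has the form shown in the sample output, with one count per CPU column.

```shell
# Hypothetical helper: print the sum of the NMI counts across all CPUs.
# Assumes the interrupt table is at /proc/interrupts unless a file is given.
# Returns non-zero if no NMI line is present.
nmi_total() {
    interrupts_file="${1:-/proc/interrupts}"
    awk '$1 == "NMI:" {
             # Add up every purely numeric field after the label.
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) total += $i
             found = 1
         }
         END { if (!found) exit 1; print total + 0 }' "$interrupts_file"
}
```

For example, [ "$(nmi_total)" -gt 0 ] && echo "NMI watchdog active" prints the message only when the count is larger than zero.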

D.1.2.3. Configuring a Hardware Watchdog Timer

The kernel provides driver support for various types of hardware watchdog timers. Some of these timers are implemented directly on the system board, whereas others are separate hardware components such as PCI cards. Hardware-based watchdog timers provide excellent data integrity provisions in the cluster because they operate independently of the system processor and are therefore fully capable of rebooting a system in the event of a system hang.

Due to a lack of uniformity among low-level hardware watchdog components, it is difficult to make generalizations describing how to know if a particular system contains such components. Many low-level hardware watchdog components are not self-identifying.

When configuring any of the watchdog timers supported by the kernel, it is necessary to place a corresponding entry into the /etc/modules.conf file. For example, if an Intel-810 based TCO watchdog timer is to be used, the following line should be added to /etc/modules.conf:

alias wdt i810-tco
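
To verify that the entry is in place, a small check such as the following can be used. This is a sketch only: the function name wdt_alias is hypothetical, and it assumes the entry uses the alias name wdt exactly as in the example above.

```shell
# Hypothetical helper: print the driver aliased to "wdt", or return
# non-zero if no such alias exists.
# Assumes the file is /etc/modules.conf unless another file is given.
wdt_alias() {
    conf_file="${1:-/etc/modules.conf}"
    awk '$1 == "alias" && $2 == "wdt" { print $3; found = 1 }
         END { exit !found }' "$conf_file"
}
```

With the example line above in place, wdt_alias prints the driver name given in that line.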