            RED HAT Cluster Suite RHEL3 U5 Release Notes
                  Copyright(c) 2005 Red Hat, Inc.
-------------------------------------------------------

Introduction:

The following topics are covered in this document:

   o New Red Hat Cluster Suite Features for RHEL3 U5
   o Defects Fixed in the Release
   o Related Documentation


New Red Hat Cluster Suite Features for RHEL3 U5

   o STONITH support for various Bull systems

     A STONITH module for the Bull Novascale FAME architecture has been
     added. Support is made possible by accessing the Platform
     Administration Processor (PAP) management console. The configurable
     parameters in the Cluster Configuration Tool
     (redhat-config-cluster) GUI are as follows:

     pap:
       IP Address: <-- IP address of PAP management console
       Port:       <-- Domain (the Bull virtual host)
       Username:   <-- Administrative user capable of issuing power
                       on/off commands for the given Domain
       Password:   <-- Password to authenticate the administrative
                       username

     Additionally, STONITH module support has been added for any Bull
     machine that uses Intelligent Platform Management Interface over
     local area networks (IPMI-over-LAN). The configurable parameters
     in the Cluster Configuration Tool (redhat-config-cluster) GUI are
     as follows:

     ipmilan:
       IP Address: <-- IP address of the node's IPMI port
       Port:       <-- Unused
       Username:   <-- Administrative user capable of issuing power
                       on/off commands to the given IPMI port
       Password:   <-- Password to authenticate the administrative
                       username

   o STONITH support for IBM BladeCenter

     A STONITH module for the IBM BladeCenter has been added. Support
     is made possible by accessing the BladeCenter console via telnet.
     The configurable parameters in the Cluster Configuration Tool
     (redhat-config-cluster) GUI are as follows:

     bladecenter:
       IP Address: <-- IP address of management blade
       Port:       <-- Blade number
       Username:   <-- Administrative user capable of issuing power
                       on/off commands on the BladeCenter
       Password:   <-- Password to authenticate the administrative
                       username

   o Per-IP address link monitoring for clusters where heartbeating is
     done over a private LAN

     A checkbox that enables link monitoring has been added to the IP
     address dialog box of the Cluster Configuration Tool
     (redhat-config-cluster).


Defects Fixed in the Release

   o Service number mismatches in pathological cases are retried

     Users periodically received a "service number mismatch 4,6"
     message in the logs. The issue has been isolated to the Red Hat
     Enterprise Linux kernel, and a retry was added so that clumanager
     only prints the message if it receives bad data multiple times in
     a row. Cluster development is working with Kernel engineers to
     find the root cause of the issue.

   o STONITH subsystem left temporary files unaccounted for

     The STONITH subsystem passed configuration information in files
     in the /tmp directory. When a new STONITH device (like a fence
     agent) was configured, it wrote all the information necessary to
     use that device into a file in /tmp. The temporary file was mode
     0600, but it still contained passwords and login names for the
     power switch. The temporary files are now removed after their
     contents have been added to the cluster configuration.

   o The clumanager init script missed a default case

     The clumanager service init script did not print an error message
     in default cases (such as entering an invalid option or not
     entering an option at all) as it should have.

   o Routed IP tiebreaker addresses were not properly handled in some
     cases

     The length of the ICMP packet (stored in the ICMP packet) was
     incorrect, which caused pings to go unanswered by various network
     router models.

   o Random restarts of services with bogus status returns

     Clumanager would restart a service after receiving an invalid
     status return. Cluster developers could not reproduce the issue,
     but isolated it to a wait() call that was interrupted by a signal.

   o Upgrade postuninstall script failed if /etc/cluster.xml did not
     exist

     An error message was reported if the /etc/cluster.xml
     configuration file was not found during an upgrade of Red Hat
     Cluster Suite packages.

   o Signals blocked when starting user services

     Cluster services that used signals for communication/wakeups were
     broken because signals were blocked during service initialization.

   o Force unmount killed too many processes

     To force-unmount a file system, Cluster Manager must kill all
     processes using it, regardless of whether or not Cluster Manager
     started those processes. However, an issue was found in which a
     force unmount killed too many processes. Suppose there are two
     different mounts:

       /var/mail         <-- managed by cluster
       /var/mail-backup  <-- not managed by cluster; local copy

     When Cluster Manager killed the processes using /var/mail while
     trying to force-unmount it, it also killed all processes using
     /var/mail-backup.
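
     The /var/mail example above can be illustrated with a short
     sketch. These notes do not state the actual cause of the defect;
     the sketch only assumes, for illustration, that open paths were
     compared against the mount point as plain string prefixes. The
     function names naive_match and is_under_mount are hypothetical
     and are not part of clumanager.

```python
import os


def naive_match(path, mount_point):
    # Plain string-prefix comparison (assumed here for illustration).
    # It treats /var/mail-backup as if it were inside /var/mail.
    return path.startswith(mount_point)


def is_under_mount(path, mount_point):
    """Return True if path is the mount point itself or lies beneath it."""
    path = os.path.normpath(path)
    mount_point = os.path.normpath(mount_point)
    if path == mount_point:
        return True
    # Require a path separator directly after the mount point, so the
    # match stops at a directory boundary.
    prefix = mount_point.rstrip(os.sep) + os.sep
    return path.startswith(prefix)


if __name__ == "__main__":
    for p in ("/var/mail/spool/user", "/var/mail-backup/user"):
        print(p, naive_match(p, "/var/mail"), is_under_mount(p, "/var/mail"))
```

     With a boundary-aware comparison, a process holding a file under
     /var/mail-backup is no longer selected when /var/mail is being
     force-unmounted.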