Chapter 9. Managing Node Groups
Red Hat HPC cluster management is built around the concept of node groups. A node group is a template mechanism that allows the cluster administrator to define characteristics shared by a group of nodes. Red Hat HPC ships with a default set of node groups for installer nodes, package-installed compute nodes, diskless compute nodes and imaged compute nodes. The default node groups can be modified, or new node groups can be created from them. All of the nodes in a node group share the following:
Node Name format
Operating System Repository
Kernel parameters
Kits and components
Network Configuration and available networks
Additional RPM packages
Custom scripts (for automated configuration of tools)
Partitioning
A typical HPC cluster is created from a single installer node and many compute nodes. Compute nodes are normally identical to each other, with a few exceptions such as the node name or other host-specific configuration files. A node group for compute nodes makes it easy to configure and manage one node or 100 nodes from the same definition.
The ngedit command is a TUI (Text User Interface) tool run by the cluster administrator to create, delete and modify node groups. The ngedit tool modifies cluster information in the Red Hat HPC database and also automatically calls other tools and plugins to perform actions or update configuration. For example, modifying the set of packages associated with a node group in ngedit automatically calls cfm (configuration file manager) to synchronize all of the nodes in the node group, using yum to install the new packages. Modifying the partitioning on a node group instead notifies the administrator that the nodes in the node group must be re-installed for the new partitioning to take effect.
The Red Hat HPC database keeps track of the node group state, so several changes can be made to a node group at once, and the physical nodes in the group can be updated immediately or at a later time using the cfmsync command.
9.1. Adding RPM Packages in RHEL to Node Groups
Open a Terminal and run the node group editor as root.
# ngedit
Select the compute-rhel node group and move through the Text User Interface screens by pressing F8 or by choosing Next on the screen. Stop at the Optional Packages screen.
Additional RPM packages are added by selecting them in the tree list. Pressing the space bar on a list heading expands or collapses it to display the available packages.
Packages are sorted alphabetically by default. To sort the list by Red Hat package groups instead, choose Toggle View.
Select the additional packages using the space bar. When a package is selected, an asterisk is displayed beside the package name.
Package dependencies are handled automatically by yum. If a selected package requires other packages, they are automatically included when the package is installed on the cluster nodes.
ngedit automatically calls cfm to synchronize the nodes and install new packages but, by design, does not automatically remove packages from nodes in the cluster. If required, pdsh and rpm can be used to remove packages from the RPM database on each node in the cluster.
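The manual removal described above can be sketched as a small script. This is only an illustration under stated assumptions: the host range in NODES and the package name in PKG are hypothetical placeholders, and the DRY_RUN switch and helper functions are inventions of this sketch, not part of Red Hat HPC. It relies only on the standard pdsh -w option (target host list) and the rpm -e (erase) and rpm -q (query) options. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: NODES and PKG are hypothetical placeholders, not values
# defined by Red Hat HPC. Replace them with real host names and a real package.
NODES="${NODES:-compute-00-[00-03]}"
PKG="${PKG:-example-package}"
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Erase package $2 from the RPM database on every host in $1.
remove_pkg() {
    run pdsh -w "$1" rpm -e "$2"
}

# Query each host in $1 to check whether package $2 is still installed.
verify_pkg() {
    run pdsh -w "$1" rpm -q "$2"
}

remove_pkg "$NODES" "$PKG"
verify_pkg "$NODES" "$PKG"
```

Run the script as root on the installer node with DRY_RUN=0 to execute the commands. Note that rpm -e refuses to erase a package that other installed packages depend on, so dependent packages may need to be removed first.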