High performance computing (HPC) generally refers to processing complex calculations at high speeds across multiple servers in parallel. Those groups of servers are known as clusters and are composed of hundreds or even thousands of compute servers that have been connected through a network. In an HPC cluster, each component computer is often referred to as a node.
HPC clusters run batches of computations. The core of any HPC cluster is the scheduler, used to keep track of available resources, allowing job requests to be efficiently assigned to various compute resources (CPU and GPU) via fast network.
A typical HPC solution has 3 main components:
A supercomputer is made up of thousands of compute nodes that work together to complete tasks.
While historically "supercomputers" were single super fast machines, current high performance computers are built using massive clusters of servers with one or more central processing units (CPUs).
The supercomputers of today are aggregating computing power to deliver significantly higher performance than single desktops or servers and are used for solving complex problems in engineering, science, and business.
By applying more compute power with HPC, data-intensive problems can be run using larger datasets, in the same amount of time. This ability allows problems to be described and examined at higher resolution, larger scale, or with more elements.
HPC solutions require an operating system in order to run. LinuxⓇis the dominant operating system for high performance computing, according to TOP500 list that keeps track of world’s most powerful computer systems. All TOP500 supercomputers run Linux, and many in the top 10 run Red HatⓇ Enterprise Linux.
With the increased use of technologies like the Internet of Things (IoT), artificial intelligence (AI), and machine learning (ML), organizations are producing huge quantities of data, and they need to be able to process and use that data more quickly in real-time.
HPC is is now running anywhere from cloud to the edge, and can be applied to a wide scope of problems—and across industries such as science, healthcare, and engineering—due to its ability to solve large-scale computational problems within reasonable time and cost parameters.
To power increasingly sophisticated algorithms, high-performance data analysis (HPDA) has emerged as a new segment that applies the resources of HPC to big data. In addition, supercomputing is enabling deep learning and neural networks to advance artificial intelligence.
HPC can also be applied in other industries and use cases, such as government and academic research, high-performance graphics, biosciences, genomics, manufacturing, financial services and banking, geoscience, and media.
The compute resources needed to analyze big data and solve complex problems are expanding beyond the on-premise compute clusters in the datacenter that are typically associated with HPC and into the resources available from public cloud services.
Cloud adoption for HPC is central to the transition of workloads from an on-premise-only approach to one that is decoupled from specific infrastructure or location.
Cloud computing allows resources to be available on demand, which can be cost-effective and allow for greater flexibility to run HPC workloads.
The adoption of container technologies has also gained momentum in HPC. Containers are designed to be lightweight and enable flexibility with low levels of overhead—improving performance and cost. Containers also help to meet the requirements of many HPC applications, such as scalability, reliability, automation, and security.
The ability to package application code, its dependencies and even user data, combined with the demand to simplify sharing of scientific research and findings with a global community across multiple locations, as well as the ability to migrate said applications into public or hybrid clouds, make containers very relevant for HPC environments.
By using containers to deploy HPC apps and workloads in the cloud, you are not tied to a specific HPC system or cloud provider.
Red Hat Enterprise Linux provides a platform that delivers reliability and efficiency to HPC workloads at scale for on-premise, in the cloud, or in hybrid HPC environments. Red Hat Enterprise Linux provides a range of container tools, delivering enhanced portability and reproducibility for HPC workloads.
Red Hat® Enterprise Linux® for Workstations, a variant of Red Hat Enterprise Linux, is optimized for high-performance graphics, animation, and scientific activities. It is also available on the Amazon Web Services (AWS) cloud (GRID driver and TESLA driver), delivered via NICE DCV, a high-performance remote display protocol with enhanced security components.
Red Hat OpenShift is an enterprise container orchestration platform that extends Kubernetes capabilities and provides consistent operations and application life cycle management at scale using flexible topology options to support low-latency workloads anywhere.