EXECUTIVE SUMMARY
Many hardware vendors now offer both Ceph-optimized servers and rack-level solutions designed for distinct workload profiles. To simplify the hardware selection process and reduce risk for organizations, Red Hat has worked with multiple storage server vendors to test and evaluate specific cluster options for different cluster sizes and workload profiles. Red Hat’s exacting methodology combines performance testing with proven guidance for a broad range of cluster capabilities and sizes. With appropriate storage servers and rack-level solutions, Red Hat® Ceph Storage can provide storage pools serving a variety of workloads, from throughput-sensitive and cost/capacity-focused workloads to emerging IOPS-intensive workloads.
INTRODUCTION
Red Hat Ceph Storage significantly lowers the cost of storing enterprise data and helps organizations manage exponential data growth. The software is a robust and modern petabyte-scale storage platform for public or private cloud deployments. Red Hat Ceph Storage offers mature interfaces for enterprise block and object storage, making it an optimal solution for active archive, rich media, and cloud infrastructure workloads characterized by tenant-agnostic OpenStack® environments. Delivered as a unified, software-defined, scale-out storage platform, Red Hat Ceph Storage lets businesses focus on improving application innovation and availability by offering capabilities such as:
- Scaling to hundreds of petabytes.
- No single point of failure in the cluster.
- Lower capital expenses (CapEx) by running on commodity server hardware.
- Lower operational expenses (OpEx) with self-managing and self-healing properties.
Red Hat Ceph Storage can run on myriad industry-standard hardware configurations to satisfy diverse needs. To simplify and accelerate the cluster design process, Red Hat conducts extensive performance and suitability testing with participating hardware vendors. This testing allows evaluation of selected hardware under load and generates essential performance and sizing data for diverse workloads — ultimately simplifying Ceph storage cluster hardware selection.
As discussed in this guide, multiple hardware vendors now provide server and rack-level solutions optimized for Red Hat Ceph Storage deployments with IOPS-, throughput-, and cost/capacity-optimized solutions as available options. For more information on configuring a Red Hat Ceph Storage cluster, see the Red Hat Ceph Storage hardware configuration guide. Full performance and sizing guides for several vendors are also available, providing complete and detailed information on the systems tested and results achieved.
WORKLOAD-OPTIMIZED CEPH PERFORMANCE DOMAINS
One of the key benefits of Ceph storage is the ability to support different types of workloads within the same cluster using Ceph performance domains. Dramatically different hardware configurations can be associated with each performance domain. Storage pools can then be deployed on the appropriate performance domain, providing applications with storage tailored to specific performance and cost profiles. Selecting appropriately sized and optimized servers for these performance domains is an essential aspect of designing a Red Hat Ceph Storage cluster.
Table 1 provides the criteria Red Hat uses to identify optimal Red Hat Ceph Storage cluster configurations on storage servers. These categories are provided as general guidelines for hardware purchases and configuration decisions, and can be adjusted to satisfy unique workload blends. Actual hardware configurations will vary depending on the specific workload mix and vendor capabilities; a brief worked capacity example follows Table 1.
TABLE 1. CEPH CLUSTER OPTIMIZATION CRITERIA
IOPS-OPTIMIZED
Properties:
- Lowest cost per IOPS
- Highest IOPS per GB
- 99th percentile latency consistency
Example uses:
- Typically block storage
- 3x replication for hard disk drives (HDDs) or 2x replication for solid-state drives (SSDs)
- MySQL on OpenStack clouds

THROUGHPUT-OPTIMIZED
Properties:
- Lowest cost per MBps (throughput)
- Highest MBps per TB
- Highest MBps per BTU
- Highest MBps per Watt
- 97th percentile latency consistency
Example uses:
- Block or object storage
- 3x replication
- Active performance storage for video, audio, and images
- Streaming media

COST/CAPACITY-OPTIMIZED
Properties:
- Lowest cost per TB
- Lowest BTU per TB
- Lowest Watts required per TB
Example uses:
- Typically object storage
- Erasure coding common for maximizing usable capacity
- Object archive
- Video, audio, and image object repositories
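To make the capacity trade-off between the replication and erasure-coding schemes in Table 1 concrete, the short Python sketch below compares usable capacity under 3x replication and under an 8+3 erasure-coding profile. The 1,000 TB raw figure and the 8+3 profile are illustrative assumptions chosen for the arithmetic, not tested or recommended configurations.

```python
# Illustrative usable-capacity arithmetic for the protection schemes in Table 1.
# The raw capacity and the 8+3 erasure-coding profile are assumptions chosen
# for this example, not tested or recommended configurations.
raw_tb = 1000                 # assumed raw cluster capacity in TB

# 3x replication stores three full copies, so one third of raw capacity is usable.
replica_usable = raw_tb / 3

# An 8+3 erasure-coded pool stores 11 chunks for every 8 data chunks.
ec_usable = raw_tb * 8 / 11

print(f"3x replication: {replica_usable:6.0f} TB usable "
      f"({raw_tb / replica_usable:.2f}x raw-to-usable overhead)")
print(f"8+3 erasure   : {ec_usable:6.0f} TB usable "
      f"({raw_tb / ec_usable:.2f}x raw-to-usable overhead)")
```

The same arithmetic explains why erasure coding appears under the cost/capacity-optimized profile: for the same raw hardware it yields roughly twice the usable capacity of 3x replication, at the cost of additional CPU work and reconstruction traffic.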
To the Ceph client interface that reads and writes data, a Ceph cluster appears as a simple pool where data is stored. However, the storage cluster performs many complex operations in a manner that is completely transparent to the client interface. Ceph clients and Ceph object storage daemons (Ceph OSDs, or simply OSDs) both use the controlled replication under scalable hashing (CRUSH) algorithm for storage and retrieval of objects. OSDs run on OSD hosts — the storage servers within the cluster.
A CRUSH map describes a topology of cluster resources, and the map exists both on client nodes and on Ceph Monitor (MON) nodes within the cluster. Ceph clients and Ceph OSDs both use the CRUSH map and the CRUSH algorithm. Ceph clients communicate directly with OSDs, eliminating a centralized object lookup and a potential performance bottleneck. With awareness of the CRUSH map and communication with their peers, OSDs can handle replication, backfilling, and recovery, allowing the cluster to recover dynamically from failures.
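CRUSH itself is a hierarchical, weighted placement algorithm, but its defining property, that any party holding the same map computes the same placement without a central lookup, can be shown with a much simpler stand-in. The Python sketch below uses rendezvous (highest-random-weight) hashing rather than the real CRUSH algorithm; the OSD names and replica count are illustrative assumptions.

```python
import hashlib

def placement(object_name, osds, replicas=3):
    """Deterministically choose OSDs for an object.

    This is not the real CRUSH algorithm, only rendezvous
    (highest-random-weight) hashing. It shares the property described
    in the text: every client or OSD that knows the same OSD list
    computes the same mapping, so no central lookup service is needed.
    """
    scores = []
    for osd in osds:
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).hexdigest()
        scores.append((int(digest, 16), osd))
    # The highest-scoring OSDs hold the object's replicas.
    return [osd for _, osd in sorted(scores, reverse=True)[:replicas]]

# A client and an OSD computing placement independently get the same answer.
cluster_osds = [f"osd.{i}" for i in range(12)]
print(placement("rbd_data.1234", cluster_osds))
```

Because placement is recomputed from the map rather than looked up in a table, adding or removing OSDs only moves the objects whose computed placement actually changes.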
The CRUSH map is also used to implement both failure domains and performance domains. A performance domain is simply a hierarchy that takes the performance profile of the underlying hardware into account. The CRUSH map describes how Ceph stores data, and it is implemented as a simple hierarchy (acyclic graph) and a ruleset. The CRUSH map can support multiple hierarchies to separate one type of hardware performance profile from another. For example (a configuration sketch follows this list):
- HDDs are typically appropriate for cost/capacity-focused workloads.
- HDDs with Ceph write journals on SSDs are often used for throughput-sensitive workloads.
- SSDs are used for IOPS-intensive workloads such as MySQL and MariaDB.
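As a concrete illustration of tying storage pools to these performance domains, the sketch below assumes a Ceph release with CRUSH device classes and an admin host where the ceph CLI is already configured; it is a minimal sketch, not a tested deployment script. Pool names, placement-group counts, and rule names are illustrative assumptions rather than sizing recommendations.

```python
# Minimal sketch: create one CRUSH rule per device class and bind a pool to
# each, so pools land only on OSDs of the matching class. Assumes a Ceph
# release with CRUSH device classes and a working ceph CLI on this host.
import subprocess

def ceph(*args):
    """Run a ceph CLI command and raise if it fails."""
    subprocess.run(["ceph", *args], check=True)

# IOPS-optimized domain: replicated rule restricted to SSD-class OSDs.
ceph("osd", "crush", "rule", "create-replicated", "fast", "default", "host", "ssd")
ceph("osd", "pool", "create", "mysql-volumes", "128", "128", "replicated", "fast")

# Throughput/cost-oriented domain: replicated rule restricted to HDD-class OSDs.
ceph("osd", "crush", "rule", "create-replicated", "bulk", "default", "host", "hdd")
ceph("osd", "pool", "create", "media-archive", "256", "256", "replicated", "bulk")
```

Because each rule is restricted to a device class, an IOPS-intensive database pool and a capacity-oriented archive pool can coexist in the same cluster without competing for the same drives.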