Resiliency with performance for PostgreSQL
Red Hat® OpenShift® Container Storage offers scalable and available persistent storage for cloud-native applications based on PostgreSQL. Container-native, software-defined storage lets application and development teams dynamically provision persistent volumes (PVs), quickly scaling or deprovisioning storage on demand. Moreover, Red Hat OpenShift Container Storage can provide business continuity with resilience across multiple cloud provider availability zones, maintaining performance comparable to cloud providers’ storage offerings that only run within a single availability zone.
Crunchy Data PostgreSQL on Red Hat OpenShift Container Storage
With many PostgreSQL database deployments taking place in the cloud, software-defined storage increasingly plays a critical role in both database performance and resilience for enterprise applications. As a full-featured, tier-1 relational database management system (RDBMS),[1] PostgreSQL is growing rapidly. The DB-Engines ranking of popular database management systems regularly shows PostgreSQL as one of the fastest-growing open source RDBMSs over the last several years, at the expense of other popular solutions.[2]
Enterprises deploying PostgreSQL for critical applications in the cloud need more than a full feature set. They need both robust performance and resilience for critical data. Ultimately, software-defined storage is essential for achieving both. As with most databases, organizations have multiple methods to make PostgreSQL resilient, including:
- Application-layer resiliency. With PostgreSQL replication, the database itself manages database resiliency. While this approach offers application awareness, it introduces greater complexity, requiring more in-depth PostgreSQL knowledge (or third-party software) to manage data replication. Moreover, any configured resilience applies only to PostgreSQL. Other applications and databases would need their own resilience methods, adding complexity and duplication.
- Storage-layer resiliency. In contrast, storage-layer resiliency relies on underlying storage services to manage data replication. This approach is usually more straightforward than implementing application-layer resiliency and provides potentially greater flexibility. Storage-layer resiliency protects not only PostgreSQL databases but also other types of databases and applications. It can also offer more control over replication fine-tuning.
Red Hat testing of the latest releases demonstrated storage performance comparable to cloud-native storage, even while Red Hat OpenShift Container Storage provided resilience across three Amazon Web Services (AWS) Availability Zones.
Crunchy Data PostgreSQL
PostgreSQL is a popular open source, object-relational database system with more than 20 years of continuous development. For testing, Red Hat engineers chose Crunchy Data PostgreSQL. Crunchy Data provides commercial support for PostgreSQL on a subscription basis, ensuring that enterprises of all sizes have access to certified software packages, updates, bug fixes, security patches, and 24x7x365 technical support from PostgreSQL experts. Crunchy Certified PostgreSQL is a trusted, commercially supported, and Common Criteria EAL 2+ certified distribution of open source PostgreSQL. Crunchy PostgreSQL for Kubernetes is a containerized PostgreSQL deployment that uses the operator pattern for Kubernetes and has achieved the autopilot capability level as part of Red Hat OpenShift Operator Certification.[3]
Crunchy Data helps enterprises benefit from the power and efficiency of PostgreSQL for critical applications through its suite of open source products and services, offering:
- More secure and high-availability PostgreSQL deployments.
- Elastic hybrid cloud PostgreSQL solutions on all infrastructures.
- Geospatial, big data, and artificial intelligence (AI) architectures backed by PostgreSQL.
- Certified PostgreSQL installations and automated compliance verification.
Red Hat OpenShift Container Storage
Red Hat OpenShift Container Storage offers more reliable and scalable persistent storage for cloud-native applications like PostgreSQL running in the cloud. It provides agile, scalable, portable, and highly available storage that can be provisioned and deprovisioned on demand. Application teams can dynamically provision PVs for many workload categories. The platform offers:
- Agility to streamline app and dev workflows across the hybrid cloud.
- Scalability to support emerging data-intensive workloads.
- Portability to allow simple data placement and access across clouds.
Red Hat OpenShift Container Storage features Red Hat Ceph® Storage, which supports the needs of modern stateful applications. The use of the Kubernetes orchestration framework and Kubernetes operators makes Red Hat OpenShift Container Storage simpler and easier to install. Operators are software extensions to Kubernetes that use custom resources to automate and manage applications and their components.
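As a minimal sketch of what dynamic provisioning looks like from the application side, the Python below builds a PersistentVolumeClaim manifest and prints it as JSON, a format that `oc apply -f -` and `kubectl apply -f -` accept. The claim name, namespace, requested size, and the storage class name `ocs-storagecluster-ceph-rbd` are assumptions chosen for illustration and are not taken from the tested configuration. Check which storage classes actually exist in the target cluster.

```python
import json


def build_pvc(name, namespace, size_gib, storage_class):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # A single-writer block volume, typical for one database pod.
            "accessModes": ["ReadWriteOnce"],
            # Assumed storage class name; adjust to the cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


if __name__ == "__main__":
    # All four arguments are hypothetical example values.
    pvc = build_pvc("pgdata-example", "pgo", 50, "ocs-storagecluster-ceph-rbd")
    print(json.dumps(pvc, indent=2))
```

Submitting such a claim asks the storage class to create and bind a matching PV. Deleting the claim releases the storage again, subject to the class's reclaim policy.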
Storage-based resiliency options
Within a cloud-based platform, organizations have choices for configuring and deploying storage-based resiliency. These choices can have ramifications for both performance and cost. Additionally, public cloud customers can choose between general-purpose storage classes or higher-performance, direct-attached storage volumes for their PVs. These storage options typically limit recovery from data failures to within a single AWS Availability Zone, which may not satisfy application requirements.[4]
In contrast, adding Red Hat OpenShift Container Storage to AWS storage volumes can provide data failover protection across multiple AWS Availability Zones—independent of the cloud-provider storage class selected. Red Hat testing has shown that this additional resilience can be accomplished while providing consistent performance for small databases. Table 1 summarizes the advantages and disadvantages of different AWS instance and storage classes, both with and without Red Hat OpenShift Container Storage.
While Elastic Block Store (EBS) general-purpose (gp2) offers failover within a single AWS Availability Zone, Red Hat OpenShift Container Storage adds automatic failover for AWS instances with direct-attached storage—resulting in additional performance and resiliency. AWS provides no storage failover options across multiple Availability Zones. In contrast, Red Hat OpenShift Container Storage provides automatic failover across multiple Availability Zones, while ensuring performance for applications.
Table 1. Red Hat OpenShift Container Storage has performance, failover, and cost implications for single and multiple AWS Availability Zones
To evaluate the performance of different software-defined storage options, Red Hat engineers used the Sysbench benchmark suite to load a Crunchy Data PostgreSQL cluster with both small (20GB) and large (120GB) databases. AWS instances and storage volumes used are shown in Table 2.
Table 2. Sysbench test configuration
| |Small database (20GB)|Large database (120GB)|
|Master nodes|3x M5.xlarge instances|3x M5.xlarge instances|
|Compute nodes|3x M5.xlarge instances (Crunchy Data PostgreSQL)|3x M5.xlarge instances (Crunchy Data PostgreSQL)|
|Storage nodes|3x M5.4xlarge (Red Hat OpenShift Container Storage 4.2)|3x i3en.2xlarge (Red Hat OpenShift Container Storage 4.3)|
|Storage devices|3x 2TB EBS gp2 volumes per node|2x 2.3TB direct-attached NVMe solid-state drives (SSDs) per node|
Small database tests
For the small 20GB database tests, each PostgreSQL pod requirement specification included one vCPU and 3GB of memory. Each Red Hat OpenShift Container Platform compute node held 12 PostgreSQL pods and another set of 12 pods running Sysbench. Test runs compared systems using AWS EBS gp2 volumes against the same systems with Red Hat OpenShift Container Storage running on the EBS gp2 volumes.
Testing showed that when PostgreSQL is backed directly by EBS gp2 PVs, latency grows and performance drops dramatically because of how gp2 burst credits are calculated for small volumes. In contrast, performance in terms of transactions per second (TPS) was consistent when using Red Hat OpenShift Container Storage 4.2 running on the same EBS gp2 volumes (Figures 1 and 2).[5]
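The burst-credit effect can be illustrated with a short calculation. The sketch below uses the gp2 figures published by AWS (3 IOPS per GiB baseline with a floor of 100 and a ceiling of 16,000, bursting to 3,000 IOPS from a bucket of 5.4 million I/O credits). The volume sizes are assumptions: the 100 GiB volume is a hypothetical small PV, since the per-pod PV size used in the tests is not stated here, and 2,048 GiB stands in for the 2TB volumes listed in Table 2.

```python
# gp2 parameters as published by AWS for General Purpose SSD (gp2) volumes.
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP2_BURST_IOPS = 3_000
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume of the given size."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib))


def gp2_burst_seconds(size_gib):
    """Seconds a full credit bucket lasts at sustained 3,000 IOPS.

    Returns None when the baseline already meets or exceeds the burst
    rate, in which case the volume does not depend on credits.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return None
    # Credits drain at the burst rate and refill at the baseline rate.
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)


if __name__ == "__main__":
    for size_gib in (100, 2048):  # hypothetical small PV vs. a 2TB-class volume
        burst = gp2_burst_seconds(size_gib)
        note = "no burst dependency" if burst is None else f"bucket lasts {burst:.0f} s"
        print(f"{size_gib} GiB: baseline {gp2_baseline_iops(size_gib)} IOPS, {note}")
```

Under these assumptions, a small volume falls back to its low baseline once the bucket is empty, while a 2TB-class volume has a baseline above the burst rate and never relies on credits. That is one plausible reading of why pooling large gp2 volumes produced steadier results, not a measured conclusion.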
Large database tests
For the large 120GB database tests, each PostgreSQL pod requirement specification included eight vCPUs and 32GB of memory. Each Red Hat OpenShift Container Platform compute node held a single PostgreSQL pod and a single Sysbench pod.
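The pod layouts of the two test series can be compared with some simple arithmetic. The sketch below totals the PostgreSQL pod requests per compute node and across the cluster, using the pod counts and per-pod requests given in the text and the three compute nodes from Table 2. The `PodLayout` class and its field names are hypothetical helpers, and the Sysbench pods are left out because their resource requests are not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodLayout:
    """PostgreSQL pod placement for one test series (hypothetical helper)."""
    name: str
    pods_per_node: int
    vcpus_per_pod: int
    memory_gb_per_pod: int
    compute_nodes: int = 3  # from Table 2

    def per_node_requests(self):
        """Return (vCPUs, memory in GB) requested by PostgreSQL pods on one node."""
        return (self.pods_per_node * self.vcpus_per_pod,
                self.pods_per_node * self.memory_gb_per_pod)

    def cluster_pods(self):
        """Total PostgreSQL pods across all compute nodes."""
        return self.pods_per_node * self.compute_nodes


small = PodLayout("small (20GB)", pods_per_node=12, vcpus_per_pod=1, memory_gb_per_pod=3)
large = PodLayout("large (120GB)", pods_per_node=1, vcpus_per_pod=8, memory_gb_per_pod=32)

if __name__ == "__main__":
    for layout in (small, large):
        vcpus, memory_gb = layout.per_node_requests()
        print(f"{layout.name}: {layout.cluster_pods()} pods in total, "
              f"{vcpus} vCPUs and {memory_gb}GB requested per node")
```

The small series spreads load over many lightly provisioned databases, while the large series concentrates it in one heavily provisioned database per node.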
In Red Hat OpenShift Container Storage 4.2, EBS gp2 volumes form the basis of the cluster. As such, workload performance is necessarily lower than when using the EBS gp2 volumes directly (Figure 3). However, it is important to note that the performance shown for Red Hat OpenShift Container Storage 4.2 includes replication across three AWS Availability Zones, while the EBS gp2 result measures performance within a single Availability Zone only.
Red Hat OpenShift Container Storage 4.3 includes support for direct-attached storage. In the third set of columns in Figure 3, Red Hat OpenShift Container Storage used direct-attached storage instances (i3en.2xlarge instance store volumes) instead of EBS gp2 volumes. With direct-attached storage, Red Hat OpenShift Container Storage has comparable performance to that of a cluster based on EBS gp2 alone, while still providing resilience across three AWS Availability Zones.
Red Hat OpenShift Container Storage provides a flexible storage platform for PostgreSQL databases. Red Hat OpenShift Container Storage demonstrated consistent performance when tested using Crunchy Data PostgreSQL pods based on EBS gp2 volumes. Moreover, this approach added the ability to support high-performance, direct-attached storage and provide replication across three AWS Availability Zones. This functionality gives those deploying PostgreSQL the flexibility they need to supply resiliency and performance that matches their most demanding database applications.
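Replication across three Availability Zones also has a capacity cost, which a rough calculation makes concrete. The sketch below takes the storage devices listed in Table 2 and assumes three full replicas, one per Availability Zone. The replica count is an assumption rather than a figure stated in the test description, and the result ignores file system and cluster overhead.

```python
def usable_capacity_tb(nodes, devices_per_node, device_tb, replicas=3):
    """Return (raw TB, usable TB) for a replicated storage cluster.

    Assumes every object is stored `replicas` times and ignores overhead.
    """
    raw_tb = nodes * devices_per_node * device_tb
    return raw_tb, raw_tb / replicas


if __name__ == "__main__":
    configs = {
        "OCS 4.2 on EBS gp2": (3, 3, 2.0),         # 3 nodes, 3x 2TB volumes each
        "OCS 4.3 on direct-attached": (3, 2, 2.3),  # 3 nodes, 2x 2.3TB NVMe each
    }
    for label, (nodes, devices, size_tb) in configs.items():
        raw, usable = usable_capacity_tb(nodes, devices, size_tb)
        print(f"{label}: {raw:.1f}TB raw, about {usable:.1f}TB usable")
```

With these inputs, roughly one third of the raw capacity is available to databases. That is the price of keeping a full copy of the data in each Availability Zone.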
[1] PostgreSQL features ACID properties (atomicity, consistency, isolation, and durability) as well as primary and unique indexes, updatable views, triggers, foreign keys (FKs), and stored procedures (SPs).
[2] “DB-Engines Ranking,” DB-Engines, accessed 14 July 2020.
[3] “Crunchy PostgreSQL for Kubernetes 4.2 Receives Red Hat OpenShift Operator Certification,” Crunchy Data, 10 Feb. 2020.
[4] For example, Amazon Web Services (AWS) Elastic Block Store (EBS) general-purpose (gp2) storage classes as well as AWS direct-attached storage do not offer failover across AWS Availability Zones.
[5] Each Sysbench run lasted 10 minutes, with a workload ratio of 75% reads to 25% writes.