One of a cloud architect's responsibilities is to establish and maintain a cloud's capacity and availability at a justifiable cost and with efficient resource usage. This involves managing resources and capacity for the infrastructure (compute, storage, networking), software, and people.
In the first article in this series, we highlighted various aspects of cloud architecture; in the second article, we described self-service delivery; and in the third article, we explored operations. Keeping the platform stable and available requires a disciplined approach to resource and capacity management. The right approach balances cost, availability, and performance rather than optimizing for any one of them alone.
As we wrote in the first article, Capability Maturity Model Integration (CMMI) provides a framework for the maturity of the processes that combine the people, procedures, and tools to deliver capabilities. The CMMI process areas most relevant to resource and capacity management are:
- Requirements Management (REQM)
- Measurement and Analysis (MA)
- Capacity and Availability Management (CAM)
- Service Continuity (SCON)
Mature teams understand these requirements at different levels. They can measure current usage and predict future resource requirements from historical data and from visibility into upcoming workloads.
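As a rough illustration of predicting requirements from historical data, the sketch below fits a straight line to past usage and extrapolates it. The function names and the monthly vCPU figures are hypothetical, and a linear trend is only an assumption; real workloads may call for seasonal or more sophisticated models.

```python
def fit_line(history):
    """Least-squares fit of usage against period index; returns (slope, intercept)."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_usage(history, periods_ahead):
    """Extrapolate the fitted trend periods_ahead periods past the last sample."""
    slope, intercept = fit_line(history)
    future_x = len(history) - 1 + periods_ahead
    return slope * future_x + intercept


# Hypothetical monthly peak vCPU usage for the last six months.
monthly_vcpus = [400, 420, 440, 460, 480, 500]
print(forecast_usage(monthly_vcpus, periods_ahead=3))
```

With this perfectly linear sample history (20 vCPUs of growth per month), the forecast three months out is about 560 vCPUs.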
Establish a strategy
Cloud platform teams need to establish and maintain a capacity and availability management strategy that lets them respond quickly to capability changes.
They typically start with guardrails to control capacity demand as users consume services. They also use showback and chargeback capabilities, predictive analytics, and automation to justify and drive expansion of the cloud as demand requires.
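To make the showback idea concrete, here is a minimal sketch that turns per-unit rates into a monthly figure for one tenant. The `UnitRates` class, the rates, and the 730-hour month are all illustrative assumptions, not figures from any real platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitRates:
    """Hypothetical internal rates; substitute your platform's real costs."""
    vcpu_hour: float
    gib_ram_hour: float
    gib_storage_month: float


def monthly_showback(rates, vcpus, ram_gib, storage_gib, hours=730):
    """Return the monthly cost of a tenant's allocation, rounded to cents."""
    compute = vcpus * rates.vcpu_hour * hours
    memory = ram_gib * rates.gib_ram_hour * hours
    storage = storage_gib * rates.gib_storage_month
    return round(compute + memory + storage, 2)


rates = UnitRates(vcpu_hour=0.02, gib_ram_hour=0.005, gib_storage_month=0.10)
print(monthly_showback(rates, vcpus=8, ram_gib=32, storage_gib=500))
```

Whether the resulting number is only reported (showback) or actually billed (chargeback) is a policy decision; the calculation is the same.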
The strategy should include resource overcommitment policies that apply to different environments and service-level objectives (SLOs). Ideally (and depending on the service supported), the team should avoid operating production services on overcommitted hardware configurations.
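One way to express per-environment overcommitment policies is a simple lookup of ratios applied to physical capacity. The ratios and environment names below are hypothetical placeholders; the only point taken from the text is that production runs without overcommitment.

```python
# Hypothetical overcommit ratios (schedulable : physical) per environment.
OVERCOMMIT_POLICY = {
    "production": {"cpu": 1.0, "memory": 1.0},   # no overcommitment
    "test": {"cpu": 4.0, "memory": 1.5},
    "development": {"cpu": 8.0, "memory": 2.0},
}


def schedulable_capacity(environment, physical_cores, physical_ram_gib):
    """Return the capacity the scheduler may hand out under the policy."""
    policy = OVERCOMMIT_POLICY[environment]
    return {
        "vcpus": int(physical_cores * policy["cpu"]),
        "ram_gib": int(physical_ram_gib * policy["memory"]),
    }


for env in OVERCOMMIT_POLICY:
    print(env, schedulable_capacity(env, physical_cores=64, physical_ram_gib=512))
```

Keeping the ratios in one place makes it easier to audit that nobody has quietly raised the production values.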
Know how and when to expand
Capacity management requires a clear understanding of procurement processes and funding models. Before it can expand the platform, the cloud team needs to know how long an expansion takes, taking into account all organizational levels and the level of project commitment required. Enhanced project visibility improves the ability to predict resource requirements. Teams should resist the temptation to introduce or increase overcommitment in production environments just to accommodate a new application or customer that arrived without adequate project planning.
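The relationship between growth, remaining headroom, and procurement lead time can be sketched as a simple check. The function names and figures are hypothetical, and constant growth per period is an assumption.

```python
import math


def periods_until_full(capacity, current_usage, growth_per_period):
    """Periods left before usage reaches capacity, assuming constant growth."""
    if growth_per_period <= 0:
        return math.inf  # flat or shrinking usage never fills the platform
    return max(0.0, (capacity - current_usage) / growth_per_period)


def must_order_now(capacity, current_usage, growth_per_period, lead_time_periods):
    """True when the remaining runway is no longer than the procurement lead time."""
    runway = periods_until_full(capacity, current_usage, growth_per_period)
    return runway <= lead_time_periods


# Hypothetical platform: 1,000 vCPUs, 800 in use, growing 40 per month.
print(periods_until_full(1000, 800, 40))                    # months of runway
print(must_order_now(1000, 800, 40, lead_time_periods=6))   # six-month procurement
```

In this example the platform has five months of runway, so a six-month procurement cycle means the order is already late.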
Consider the efficiency of the workload
Kubernetes-based clouds are driving a new set of capabilities, such as scale to zero and cloud-optimized application runtimes, that can significantly reduce resource usage when deployed at scale. Two examples are:
- Tekton Pipelines: These provide serverless Kubernetes-native pipelines that run only when and where needed across cloud platforms.
- Quarkus: This is a Kubernetes-native Java stack that uses a fraction of memory and CPU resources compared to traditional Java runtimes. For example, a large European telco saw a 50% reduction in memory requirements and saved on infrastructure costs when swapping out Spring Boot for Quarkus.
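To see how a per-instance memory reduction could translate into platform capacity, the sketch below counts the nodes needed for a fleet of identical service instances. The instance count, memory sizes, and node size are hypothetical, and the model assumes memory is the only constraint.

```python
import math


def nodes_required(instances, mem_per_instance_gib, node_mem_gib):
    """Nodes needed when memory is the only constraint (a simplifying assumption)."""
    return math.ceil(instances * mem_per_instance_gib / node_mem_gib)


# Hypothetical fleet: 400 instances on nodes with 64 GiB of usable memory.
before = nodes_required(400, mem_per_instance_gib=1.0, node_mem_gib=64)
after = nodes_required(400, mem_per_instance_gib=0.5, node_mem_gib=64)  # 50% less memory
print(before, after)
```

In practice, CPU, storage, and failure-domain headroom also shape node counts, so the saving is rarely exactly proportional.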
Conclusion
Effective, efficient resource and capacity management is one of the most challenging parts of running a private cloud environment compared to a public cloud. Tracking usage metrics and maintaining a forward-looking view, combined with a clear description of cost per unit of compute, memory, and storage, will help your team anticipate and justify platform expansion when you need it.
About the authors
Johan has 19 years of experience in Information Technology in various sectors including Banking and Finance, and Government. For Red Hat, he worked as a Federal Government Technology Specialist. He successfully used CMMI models to establish a team that operated on DevOps principles for one of Australia's largest retail organizations. The practices used by his team became a catalyst for change in the broader IT organization.
Mohammad has 20+ years of experience in multi-tiered system development and automated solutions. He has extensive experience in online services that use open source software based on UNIX and Linux. He is primarily focused on IT infrastructure and has a background in open source web development. Mohammad has dedicated the last 2 years to emerging technologies, primarily Kubernetes using OpenShift and automation using Ansible.
Maurice is a Senior Consultant at Red Hat with 30+ years of experience in Information Technology. He has worked for vendors, system integrators, and end-user organizations and has experienced the challenges of each. He's well versed in server hardware and operating systems, particularly Linux, which has been his dominant focus for the last 15 years. Maurice has dedicated the last 5 years of his career to private cloud deployment and operations.
John, a Senior Technical Support Engineer, has 16 years of systems administration, operations, and IT management experience around UNIX, Linux, Performance/Capacity Management, Automation, Configuration Management, and OpenStack private clouds. Between 2010 and 2015, he spent 4.5 years as a Service Delivery Manager operating IT for a major financial institution in Australia using ITILv3 methodologies with team members stretching across 5 countries. In 2017, John joined Red Hat as a consultant.