Protecting infrastructure and workloads, to reduce the impact to containerized services.
As organizations move their systems to the hybrid cloud, resilience is often a critical concern. The ability to withstand errors and failures without data loss is key to providing reliable application services that contribute to business continuity. Critical applications must also continue to perform well even under component failure. Applications alone can go only so far in providing resilience, ultimately depending on underlying data-services infrastructure for resilience and performance under failure conditions.
High Availability protects infrastructure or applications within a single site to ensure continuous operations. The aim is to reduce single points of failure in a computing stack, generally through redundant access paths and resilient components. Building high availability concepts into an environment means services have built-in resiliency and can recover on their own. For example, a failed service might restart itself; a faulted node might be restarted; a workload on failed hardware might be redeployed elsewhere in the environment; and when a network path fails, transactions might be resent to the service or sent to a different instance of that service.
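The last recovery behavior above, resending a transaction or sending it to a different instance of a service, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `call_with_failover`, `failed_instance`, and `healthy_instance` are hypothetical names, and a real platform would typically handle this in a load balancer or service mesh rather than in application code.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(
    instances: Sequence[Callable[[str], str]],
    request: str,
    attempts_per_instance: int = 2,
) -> str:
    """Resend the request to an instance, then fall back to the next one."""
    last_error: Optional[Exception] = None
    for instance in instances:
        for _ in range(attempts_per_instance):
            try:
                return instance(request)
            except ConnectionError as exc:
                # The network path or the instance failed; retry or move on.
                last_error = exc
    raise RuntimeError("all service instances failed") from last_error


# Hypothetical service instances, for illustration only.
def failed_instance(request: str) -> str:
    raise ConnectionError("network path down")


def healthy_instance(request: str) -> str:
    return f"handled: {request}"


response = call_with_failover([failed_instance, healthy_instance], "order-42")
print(response)  # prints "handled: order-42"
```

The caller sees a single successful response even though the first instance was unreachable, which is the effect redundancy is meant to provide.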
High Availability is key to ensuring your applications operate without downtime and can handle unforeseen failures. Learn more about how to make your applications, clusters, and hybrid cloud platform highly available. Technologies such as Containers, Kubernetes, and Serverless present new opportunities in application development but still need a recovery plan in the event of a failure.
Disaster Recovery (DR) protects infrastructure or applications across geographically distributed sites to reduce business impact as much as possible. The aim is to enable automated or automatic recovery over longer distances than traditional high availability, extending recovery to a different cluster. In environments where an application is restricted to one site at a time, migration between sites may be automated but still require an individual with authority to decide to move computing services between sites. This approval step is needed when resynchronizing applications after a failover between sites carries a cost. Reducing the time it takes to recover from incidents is critical to your organization's success.
Disaster recovery is the ability to recover and continue business-critical applications after natural or human-caused disasters. It is the overall business continuity strategy of any major organization, designed to preserve business operations during major adverse events.
Regional-DR provides replication of persistent volume data and metadata across geographically dispersed sites. In the public cloud, this is akin to protecting against a regional failure. Regional-DR ensures business continuity while a geographical region is unavailable, accepting a predictable amount of data loss. These tolerances are usually expressed as a Recovery Point Objective (RPO) and a Recovery Time Objective (RTO).
RPO is a measure of how frequently you take backups or snapshots of persistent data. In practice, the RPO indicates the amount of data that will be lost or need to be reentered after an outage.
RTO is the amount of downtime a business can tolerate. The RTO answers the question, "How long can it take for our system to recover after we were notified of a business disruption?"
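As a rough illustration of how the two objectives are applied, the sketch below checks a hypothetical incident timeline against an RPO and an RTO. The function names, the timestamps, and the 5-minute and 30-minute targets are assumptions made for this example, not values taken from any product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_replication: datetime, failure: datetime) -> timedelta:
    """Data written after the last successful replication is lost."""
    return failure - last_replication


def downtime(notified: datetime, restored: datetime) -> timedelta:
    """Time from notification of the disruption until service is restored."""
    return restored - notified


def meets_objectives(last_replication, failure, notified, restored, rpo, rto):
    """Compare one incident against the agreed RPO and RTO."""
    return {
        "rpo_met": data_loss_window(last_replication, failure) <= rpo,
        "rto_met": downtime(notified, restored) <= rto,
    }


# Hypothetical incident: replication ran 4 minutes before the failure,
# and recovery took 38 minutes from the moment of notification.
result = meets_objectives(
    last_replication=datetime(2024, 1, 1, 11, 56),
    failure=datetime(2024, 1, 1, 12, 0),
    notified=datetime(2024, 1, 1, 12, 2),
    restored=datetime(2024, 1, 1, 12, 40),
    rpo=timedelta(minutes=5),
    rto=timedelta(minutes=30),
)
print(result)  # prints {'rpo_met': True, 'rto_met': False}
```

In this example the data loss stays within the RPO, but the recovery time exceeds the RTO, so the recovery process rather than the replication interval would need attention.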
Advanced Cluster Management datasheet
Our ecosystem partners work with us to validate their solutions with our platforms and enable a robust Data Protection and Disaster Recovery strategy. Find out more about some of our Network and Storage infrastructure partners:
Red Hat Consulting offers more than just technical expertise. We're strategic advisers who take a big-picture view of your organization, analyze your challenges, and help you overcome them.