High availability is the protection of infrastructure or applications on a single site, ensuring continuous operation. The aim is to reduce single points of failure in a computing stack, generally through redundant access paths and component resiliency. Including high-availability concepts in an environment means services have built-in resiliency and can recover on their own. Recovery options for these services might include:
- Restarting a service if it fails,
- Restarting a faulted node,
- Redeploying a workload from failed hardware to a different location in the environment,
- Re-sending transactions to the service, or sending them to a different instance of the service, when a network path fails.
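As a rough illustration of the last option, the sketch below re-sends a request to the same instance a limited number of times and then moves on to a different instance. The function name `call_with_failover`, the `retries_per_instance` parameter, and the use of `ConnectionError` to signal a failed network path are all assumptions made for this example, not part of any particular platform.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(
    instances: Sequence[Callable[[str], str]],
    request: str,
    retries_per_instance: int = 2,
) -> str:
    """Send a request, re-sending on failure and then trying other instances.

    Each entry in `instances` is a hypothetical callable standing in for one
    instance of a service; it raises ConnectionError when its path fails.
    """
    last_error: Optional[Exception] = None
    for instance in instances:
        for _ in range(retries_per_instance):
            try:
                return instance(request)
            except ConnectionError as exc:
                # Remember the failure, then re-send or move to the next instance.
                last_error = exc
    raise RuntimeError("all service instances failed") from last_error
```

A production setup would normally leave this to a load balancer, service mesh, or client library, and would add backoff between attempts; the sketch only shows the control flow.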
High availability is key to ensuring that your applications operate without downtime and can handle unforeseen failures, and it applies equally to your applications, clusters, and hybrid cloud platform. Technologies such as containers, Kubernetes, and serverless present new opportunities in application development, but they still need a recovery plan in the event of a failure.
Disaster recovery (DR) is the protection of infrastructure or applications in a geographically distributed manner, to reduce business impact as much as possible. The aim is to enable automated or automatic recovery over longer distances than traditional high availability, extending recovery to a different cluster. In environments where an application is restricted to one site at a time, migration between sites may be automated yet still require an individual with authority to decide to move computing services between sites. This human decision is needed when failover between sites incurs a cost to resynchronize applications. Reducing the time it takes to recover from incidents is critical to your organization's success.
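The interplay between automated migration and a human decision can be sketched as a small decision function. This is a minimal illustration under stated assumptions: the function `decide_failover`, its boolean inputs, and the four action names are hypothetical and do not correspond to any specific DR product.

```python
def decide_failover(primary_healthy: bool, standby_healthy: bool, approved: bool) -> str:
    """Return the next action for an application restricted to one site at a time.

    The migration itself is assumed to be automated; `approved` records whether
    an individual with authority has agreed to move services between sites.
    """
    if primary_healthy:
        return "stay"  # nothing to recover from
    if not standby_healthy:
        return "escalate"  # no healthy site to move to
    if not approved:
        return "await_approval"  # resync has a cost, so a person must decide
    return "fail_over"  # run the automated migration to the standby site


# Example: the primary site is down, the standby is ready, approval is pending.
print(decide_failover(primary_healthy=False, standby_healthy=True, approved=False))
```

Keeping the approval as an explicit input makes the manual step visible in the recovery procedure, while everything after the decision can still run without further intervention.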