In my previous blog post, I shared the vision of Disaster Recovery as a Service (DRaaS) for OpenStack as an umbrella topic that describes what needs to be done to protect workloads running in an OpenStack cloud from a large-scale disaster.
Last week we shared this vision in several sessions at the OpenStack Summit. While attendees were discussing infrastructure Disaster Recovery topics in Hong Kong, Typhoon Haiyan (also known as Typhoon Yolanda), one of the strongest tropical cyclones ever recorded, devastated multiple coastal cities in the Philippines, took thousands of lives, and displaced millions. The storm destroyed entire cities and villages, along with airports, roads, and power and communications infrastructure.
If there is one thing that history not only has taught us but keeps teaching us every year, it is that catastrophic events do happen, and that if we don't invest in preventative measures now, we will pay a hefty price later.
What would happen to your organization if this type of calamity hit?
It is hard enough to protect hosted workloads even from an overheated datacenter, which can knock down production servers and deeply impact your operations and revenue-generating activities. For service providers, downtime is not an option: every hour that your production service is down, you can lose not only business but also your reputation.
It is one thing to put your application workload in the cloud, but how can you guarantee that when the hosting service goes down, you can provide the right safety net and business continuity for your customers?
Although an entire datacenter can in fact go down in a disaster, what service providers should care about, from a user's point of view, is how to protect their own data and keep their services running after such an event.
Elastic clouds are all about adapting to workload changes by dynamically provisioning and decommissioning resources. The more dynamic and elastic the cloud platform is, the more challenges you face in keeping your data and services highly available in the event of a disaster. Data recovery is usually not the end goal; the goal is the ability to restore the services that use this data. For service providers, this is the hosted workload.
Replication Targets
Disaster Recovery between a primary cloud and a target cloud requires the data to be available in at least two geographically dispersed, independent sites in a shared-nothing model.
OpenStack replication targets can include:
- Private cloud to Private cloud
- Private cloud to Public cloud
- Public cloud to Public cloud
- Bare-metal environments to Public cloud
As our recovery target is the hosted workload, we should look at ways to achieve DR at the workload level. Imagine selecting a DR service-level flavor for a workload, such as applying a “Gold” profile to an application service that requires the highest protection level, with the shortest recovery point objective (RPO) and the shortest recovery time objective (RTO). Such a DR policy can be based on synchronous replication and a hot backup site. Or what if you could select other policies, such as “Silver”, based on periodic replication, or “Bronze”, based on asynchronous replication with a low-capacity standby site, for application services that require lower protection levels and can tolerate longer RPOs and RTOs?
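The profile idea above can be sketched as a small lookup table. This is only an illustration: `DRProfile`, `DR_PROFILES`, and `profile_for` are hypothetical names, not part of any OpenStack API, and the standby site for “Silver” is left unset because the text does not specify one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Replication(Enum):
    SYNCHRONOUS = "synchronous"
    PERIODIC = "periodic"
    ASYNCHRONOUS = "asynchronous"


@dataclass(frozen=True)
class DRProfile:
    """Hypothetical DR service-level profile attached to a workload."""
    name: str
    replication: Replication
    standby_site: Optional[str]  # None where the text gives no detail


# Hypothetical catalog mirroring the Gold/Silver/Bronze description above.
DR_PROFILES = {
    "gold": DRProfile("gold", Replication.SYNCHRONOUS, "hot"),
    "silver": DRProfile("silver", Replication.PERIODIC, None),
    "bronze": DRProfile("bronze", Replication.ASYNCHRONOUS, "low-capacity"),
}


def profile_for(flavor: str) -> DRProfile:
    """Look up a DR profile by name, ignoring case and surrounding spaces."""
    key = flavor.strip().lower()
    if key not in DR_PROFILES:
        raise ValueError(f"unknown DR profile: {flavor!r}")
    return DR_PROFILES[key]


if __name__ == "__main__":
    chosen = profile_for("Gold")
    print(chosen.name, chosen.replication.value, chosen.standby_site)
```

A real policy would also carry concrete RPO and RTO targets; they are omitted here because the text gives no figures.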
The first step for Disaster Recovery enablement in OpenStack is the ability to support data and state (metadata) replication. Several different approaches may be applicable, such as leveraging application-based replication, host-based replication (Hypervisor VM level) and of course array-based replication.
Replicating Data
OpenStack Swift globally distributed cluster object storage can be used to replicate Glance virtual machine images. Swift is currently designed to work in a single region, where a region is defined by low-latency links between Swift zones. As long as the sites are close to each other, zones can be distributed over multiple sites.
Another option for replicating virtual machine images is Glance's multiple image locations feature. Starting with the OpenStack Havana release, an Image service image can be stored in multiple locations. This enables efficient consumption of image data and the use of backup images in the event of a primary image failure.
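The fallback behavior that multiple image locations make possible might look like the sketch below. It does not call Glance: `fetch_image` and the `fetch` callback are hypothetical helpers, and the location strings in the usage example are made-up placeholders.

```python
from typing import Callable, Iterable


def fetch_image(locations: Iterable[str], fetch: Callable[[str], bytes]) -> bytes:
    """Return image data from the first location that can be read.

    `fetch` is a caller-supplied function (hypothetical) that downloads
    one location and raises OSError on failure.
    """
    errors = []
    for location in locations:
        try:
            return fetch(location)
        except OSError as exc:
            # Remember the failure and fall through to the next copy.
            errors.append(f"{location}: {exc}")
    raise OSError("all image locations failed: " + "; ".join(errors))


if __name__ == "__main__":
    def fake_fetch(location: str) -> bytes:
        # Simulate a primary site that is down.
        if "site-a" in location:
            raise OSError("primary site unreachable")
        return b"image-bytes"

    data = fetch_image(["swift://site-a/image", "swift://site-b/image"], fake_fetch)
    print(len(data))
```

Trying the locations in order keeps the primary copy preferred while the backup is only read when the primary fails.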
Cinder can be extended to support storage array based replication in the following ways:
- Utilize the scheduler to create “protected” volumes on storage arrays that are continuously replicating
- Use volume types to create replicated volumes where drivers support volume level granularity for replication
- Replicate data in two independent volumes (across different storage backends and possibly sites) using hypervisor-based replication
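The first two options rely on matching what a volume asks for against what a backend offers. Below is a minimal sketch of that matching step in plain Python. It is not Cinder's scheduler code, and the `replication` key, backend names, and dictionary layout are assumptions chosen for illustration.

```python
def backend_matches(extra_specs: dict, capabilities: dict) -> bool:
    """True when a backend offers every capability the volume type requests."""
    return all(capabilities.get(key) == value for key, value in extra_specs.items())


def eligible_backends(volume_type: dict, backends: list) -> list:
    """Return the names of backends able to host a volume of this type."""
    specs = volume_type.get("extra_specs", {})
    return [b["name"] for b in backends if backend_matches(specs, b["capabilities"])]


if __name__ == "__main__":
    # Hypothetical volume type that requests a replicating backend.
    protected = {"name": "protected", "extra_specs": {"replication": "enabled"}}
    # Hypothetical backend inventory with self-reported capabilities.
    backends = [
        {"name": "array-1", "capabilities": {"replication": "enabled"}},
        {"name": "array-2", "capabilities": {"replication": "disabled"}},
        {"name": "lvm-1", "capabilities": {}},
    ]
    print(eligible_backends(protected, backends))
```

A volume type with no extra specs matches every backend, so unprotected volumes can still be placed anywhere.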
Replicating OpenStack Services State
Disaster Recovery in OpenStack should include support for:
- Capturing the metadata relevant to the protected workloads and resources, either as point-in-time snapshots of the metadata or as continuous replication of the metadata. Without capturing the state of the different OpenStack services, we will not be able to achieve a complete failover of the hosted workloads to the recovery site.
- Ensuring consistency of the replicated data and metadata with checkpoints
Examples of OpenStack metadata that requires replication include:
- Nova: VM flavors and SSH keys
- Keystone: Identities of tenants and users
- Neutron: Virtual networks between VMs
- Cinder: Volume types and pairing
- Glance: Registry and image metadata
We note that metadata changes are less frequent than application data changes, and different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc).
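A point-in-time metadata snapshot with an integrity check could be sketched as follows. This is a simplified illustration: the function names and the metadata layout are hypothetical, and it only detects a damaged or partially replicated snapshot. It does not coordinate consistency across services or with volume data.

```python
import hashlib
import json


def make_checkpoint(checkpoint_id: str, metadata: dict) -> dict:
    """Serialize service metadata deterministically and attach a SHA-256 digest."""
    payload = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"id": checkpoint_id, "payload": payload, "sha256": digest}


def restore_checkpoint(checkpoint: dict) -> dict:
    """Verify the digest at the recovery site before using the metadata."""
    payload = checkpoint["payload"]
    actual = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    if actual != checkpoint["sha256"]:
        raise ValueError(f"checkpoint {checkpoint['id']} failed its integrity check")
    return json.loads(payload)


if __name__ == "__main__":
    # Hypothetical metadata gathered from the services listed above.
    snapshot = {
        "nova": {"flavors": ["m1.small"], "keypairs": ["ops-key"]},
        "cinder": {"volume_types": ["protected"]},
    }
    checkpoint = make_checkpoint("cp-0001", snapshot)
    print(restore_checkpoint(checkpoint) == snapshot)
```

Sorting the keys makes the serialized form, and therefore the digest, independent of the order in which the metadata was collected.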
Disaster Recovery is a complex task: different applications and use cases have different requirements, and some use cases can be supported easily while others are more complex. For that reason, this is targeted as a long-term effort with incremental steps.
Some APIs and features are expected to be integrated into existing projects such as Nova (DR features for compute). Some functionality, like DR orchestration, may be part of Heat, or a new project, or even outside the scope of OpenStack.
Enabling Cinder storage replication in the OpenStack Icehouse release is just the first step in protecting workloads running in OpenStack clouds to ensure business continuity while preparing for the worst case scenario.