As a principal engineer in Red Hat IT focused on container platforms, I know the benefits of using Red Hat OpenShift firsthand and the value it gives to our organization. Since our OpenShift services were first deployed in August 2016, we have seen many key improvements, such as shorter cycle time from code to production, higher density of applications, and better standardization of application architectures.
While these successes were important, we knew we had to face our next frontier. Other business drivers were pushing us to deploy OpenShift across a hybrid cloud environment. These factors included improved multi-site resiliency, the ability to support burst resources on public cloud, avoiding vendor lock-in, and being able to use the most cost-effective infrastructure possible. In addition, with OpenShift being the abstraction layer for public and private cloud, we also needed to offer our application teams an OpenShift interface to the entire hybrid cloud environment so that they could meet the evolving business requirements for their applications.
Before we began this transition, our application deployments on infrastructure were very specific to a particular cloud or datacenter. A deployment to Red Hat Virtualization VMs, for example, did not allow for the agility to move to our AWS EC2 environment.
Challenges and Benefits of OpenShift
But, once deployed in an OpenShift ecosystem, applications have improved cloud agility, meaning they can move across the hybrid cloud environment as needed, mostly without concern for the underlying infrastructure. This has increased efficiency on the part of our teams and their workloads, since it provides consistency across the hybrid cloud environment, easily allowing for triple-active application deployments. Other goals we established for this transition were:
New patterns for application teams to deploy in a triple-active manner across three sites in the hybrid cloud environment.
Removal of legacy environment patterns, so we could align to newer networking models for infrastructure.
A drive toward cloud-native application design with a hybrid cloud architecture.
Of course, we faced a few challenges along the way. For one thing, storage across the hybrid cloud environment is difficult, and few out-of-the-box products or tools provided storage management in this type of environment.
Secondly, applying an individual application’s patterns to other applications doesn’t always work, because business requirements differ from application to application. And, last but not least, the services that migrated applications depend on need to be rearchitected to support the hybrid cloud as well. If, for example, a service runs in only one site, or becomes a bottleneck there, every hybrid cloud application that depends on it inherits a single-site failure mode, no matter how many sites the application itself is deployed to.
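The dependency problem described above can be made concrete with a small sketch. The site names, service names, and application data below are hypothetical and exist only for illustration; they are not taken from our actual environment. The function checks, for each site, whether losing that site alone would take an application down, either directly or through one of its service dependencies.

```python
# Hypothetical site names and dependency data, for illustration only.
SITES = {"dc-east", "dc-west", "public-cloud"}

# Sites from which each shared service is served (assumed values).
SERVICE_SITES = {
    "auth": {"dc-east", "dc-west", "public-cloud"},
    "legacy-db": {"dc-east"},  # a service that lives in a single site
}

# Sites each application is deployed to, and the services it depends on.
APPS = {
    "storefront": {"sites": set(SITES), "deps": ["auth"]},
    "billing": {"sites": set(SITES), "deps": ["auth", "legacy-db"]},
}


def critical_sites(app):
    """Return the sites whose loss alone would take the application down."""
    critical = set()
    for failed in SITES:
        # The app survives if it still runs in at least one other site.
        app_survives = bool(app["sites"] - {failed})
        # Every dependency must also still be served from some other site.
        deps_survive = all(SERVICE_SITES[dep] - {failed} for dep in app["deps"])
        if not (app_survives and deps_survive):
            critical.add(failed)
    return critical


for name, app in sorted(APPS.items()):
    print(name, sorted(critical_sites(app)))
```

In this made-up data set, both applications are deployed to all three sites, yet only the one without the single-site dependency can survive the loss of any site. An inventory like this is one possible way to spot which services need rearchitecting first.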
Collaboration Overcomes Challenges
To address these challenges, we started focusing on our teams and their people. After all, technology is only as good as the people who deploy and manage it from the start. Knowing this, we organized work streams based on our multidisciplinary teams to focus on core aspects of the hybrid cloud environment (i.e., deployment tooling, monitoring and operations, application modernization, load balancing, and environments and infrastructure).
We also made sure that collaboration between the service providers and service clients was practiced with this multidisciplinary structure in mind. For example, the load balancing workstream had operational expertise, but also key input from the application teams that would be utilizing load balancing services.
Next, we focused on the efficiencies we could create for our teams as this project unfolded. We decided to build a self-service model into our hybrid cloud platform so that our application teams would have more flexibility in the project spaces they created, self-management of network firewalls, and on-demand storage. This removed bottlenecks in operations support, and since some apps now had three different sites to deploy to, self-service and automation were key to lowering our support burden.
In addition to an effective collaboration structure and added efficiencies through self-service modeling, we knew we needed to address hybrid cloud consistency to ensure our success as much as possible.
Rather than building monitoring around whatever each public or private cloud happened to offer at the time, we chose site-agnostic monitoring services, so that a solution built for one site could be replicated across all three.
In the end, we now have three consistent sites in our hybrid cloud environment: two private datacenter offerings across the continental US, and one public cloud offering which is also multi-region. Some applications have been able to adopt triple-active site deployments, providing exceptional resiliency. While whole sites have had some issues since, these multi-active applications have not had any disruption for their clients.
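To illustrate why a multi-active application can ride out a site issue, here is a minimal sketch of health-aware site selection, the kind of decision a global load balancer makes. It is a simplification under stated assumptions, not a description of our load balancing implementation: the site names, client regions, preference order, and the route function are all hypothetical.

```python
# Hypothetical sites: two private datacenters and one public cloud offering.
SITE_HEALTH = {"dc-east": True, "dc-west": True, "public-cloud": True}

# Assumed preference order per client region (closest site first).
PREFERENCE = {
    "us-east": ["dc-east", "public-cloud", "dc-west"],
    "us-west": ["dc-west", "public-cloud", "dc-east"],
}


def route(client_region, site_health, preference):
    """Return the first healthy site in the client's preference order."""
    for site in preference[client_region]:
        if site_health.get(site, False):
            return site
    raise RuntimeError("no healthy site available")


# All sites healthy: clients land on their preferred site.
print(route("us-east", SITE_HEALTH, PREFERENCE))

# Simulate a whole-site issue: traffic shifts to the next healthy site.
degraded = dict(SITE_HEALTH, **{"dc-east": False})
print(route("us-east", degraded, PREFERENCE))
```

Because the application is active in every site, a site failure only changes where requests land; clients of a single-site application would have nowhere else to go.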
We were also able to achieve improvements in flexibility for application teams as they deploy to the site that’s right for them, or, in some cases, multiple sites. As long as we have capacity and cost management, the application teams can make their own decisions about which site is best for deployment.
Arbitrary networking models have also been removed from legacy environments, so our application teams are no longer forced to deploy an app where it does not make sense from the client’s perspective. Beyond consistent operations and support across our sites, standardizing the tooling that supports the hybrid cloud environment was also a big win for us, as application teams are no longer dependent on site-specific tools or processes. Similar practices for monitoring, storage, networking, and other concerns can now be used across the hybrid cloud environment as well.
We did meet our original goals for the hybrid cloud, but that’s not to say we didn’t learn some hard lessons along the way. If we had to do it all again, knowing what we know now, these would be our main takeaways:
Some applications could not adopt the triple-active application architecture that a few applications developed. This was due to different business requirements for each of the applications in the hybrid cloud environment. It was better to focus on individual capabilities that applications can adopt (like global load balancing and multi-site pipelines) rather than replicating all of an application’s success across the board.
Not everything can be consistent in a hybrid cloud environment, and some things should specifically not be consistent. It’s important to call out these differences to end users (developers) so they can make better, more informed choices.
The entire stack needs to be ready for cloud-native application architecture and hybrid cloud design. If there are still legacy processes for managing the infrastructure, these will inevitably cause delays and inconsistencies in deploying the hybrid cloud environment, as well as decreased agility when it comes to meeting new and changing application requirements.
So, where do we go from here? Well, Red Hat IT is now working on building our next datacenter. As we build out more sites, we’re thinking about how some apps have dependencies on specific geographies. The more sites we have, the greater our adoption of the hybrid cloud environment can be for those apps in particular.
We’re also turning our attention to more multi-active application architectures, which can mean increased adoption of hybrid cloud because we’ll have better patterns for app teams to access and put into use. In addition, we know we need to focus on our multi-site management of persistent data, which enables more stateful workloads to adopt hybrid cloud architectures.
With a dependence on persistent data in one site, the application architecture is still fully dependent on a single cloud, which restricts some of the cloud-native design that applications can transition into.
And lastly, we’re working to increase brokered cloud/datacenter services (i.e., storage, network, and database) while keeping a consistent experience for the end user, our developers. This will promote an increased adoption of hybrid cloud and cloud-native architectures along with repeatable standard solutions for requesting network, storage, compute, and other integration needs.