With the advent of advanced DevOps and GitOps practices, migrating or redeploying applications from a self-managed Kubernetes cluster to a managed cluster on a cloud provider should be relatively straightforward. In reality, migrating a large number of applications to a managed cluster requires careful planning and coordination.
This article offers an architect's view of the essential elements required for an effective large-scale migration of workloads from a self-managed Kubernetes cluster to a vendor-managed cluster based on real-world experience gained in the field.
Plan the migration
Planning is the most important aspect of any large-scale workload migration. It involves getting commitments from all application teams on when they plan to start the migration and how long it will take for each team to move their workload to the managed cluster. To do that, the application teams must identify the changes required to their existing pipelines.
Identifying the key elements for each application is equally important, as this impacts the migration's timing and prioritization. You may wish to migrate first the applications with the fewest northbound and southbound integrations and dependencies. The key elements include:
- Dependencies between the workloads to be migrated, both inbound and outbound (such as exposed APIs and interfaces used by other systems, databases, other downstream systems, and so forth)
- Deployment methods, patterns, and tools used by different application teams (such as build, artifact repository, compliance, orchestration)
- Persistent volumes, if any, for stateful applications to be migrated
- Resources within each application team that will work on the migration
You also need a migration tracker that maps each application team to the applications it owns and records the potential start and end dates for each application migration. Such a tracker strengthens the migration plan. Together with a visual representation of the application dependency map, it provides visibility into the whole migration path and helps identify risks before the migration starts.
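The tracker and the dependency map can start as a small data structure before they become a spreadsheet or dashboard. The following is a minimal sketch, not a prescribed tool: the `AppMigration` fields, the application and team names, and the dates are all hypothetical. It orders applications by their number of tracked integrations and lists the dependencies that will probably still sit on the old cluster when a dependent application moves.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AppMigration:
    """One row of a hypothetical migration tracker."""
    name: str
    team: str
    start: date                     # planned migration start
    end: date                       # planned migration end
    depends_on: set[str] = field(default_factory=set)  # outbound dependencies
    stateful: bool = False          # True if the app uses persistent volumes


def integration_counts(apps: list[AppMigration]) -> dict[str, int]:
    """Count outbound plus inbound integrations for each tracked app."""
    counts = {app.name: len(app.depends_on) for app in apps}
    for app in apps:
        for dep in app.depends_on:
            if dep in counts:       # only tracked apps receive inbound counts
                counts[dep] += 1
    return counts


def migration_order(apps: list[AppMigration]) -> list[str]:
    """Fewest integrations first; stateless before stateful on ties."""
    counts = integration_counts(apps)
    ranked = sorted(apps, key=lambda a: (counts[a.name], a.stateful, a.name))
    return [a.name for a in ranked]


def cross_cluster_pairs(apps: list[AppMigration]) -> list[tuple[str, str]]:
    """(app, dependency) pairs where the app starts moving before the
    dependency has finished, so traffic may cross clusters for a while."""
    by_name = {a.name: a for a in apps}
    pairs = []
    for app in apps:
        for dep in sorted(app.depends_on):
            target = by_name.get(dep)
            if target is not None and app.start < target.end:
                pairs.append((app.name, dep))
    return pairs


# Hypothetical tracker content for illustration only.
tracker = [
    AppMigration("orders", "team-a", date(2024, 1, 8), date(2024, 1, 19),
                 {"inventory", "payments"}),
    AppMigration("inventory", "team-b", date(2024, 2, 5), date(2024, 2, 16),
                 stateful=True),
    AppMigration("payments", "team-a", date(2024, 1, 8), date(2024, 1, 12),
                 {"inventory"}),
    AppMigration("reports", "team-c", date(2024, 1, 22), date(2024, 1, 26)),
]

print(migration_order(tracker))      # ['reports', 'orders', 'payments', 'inventory']
print(cross_cluster_pairs(tracker))
```

The pairs returned by `cross_cluster_pairs` are candidates for a closer look during planning, because each one implies temporary connectivity between the two clusters.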
Check your preparation and readiness
Consider the following key elements from a preparation and readiness standpoint:
- Creating a managed cluster playground to understand the nuances involved in standing it up
- Forming a core team to run a truly representative pilot application on the managed cluster with the following goals:
  - Iron out connectivity issues from the CI/CD pipelines to the cluster.
  - Verify critical configurations, including:
    - Inbound and outbound connectivity from and to the pilot application running on the managed cluster
    - Node affinity and anti-affinity patterns
    - Streaming of application logs from the managed cluster to the relevant target
    - All other integration aspects, including interacting with application monitoring and alerting components
  - Create a knowledge base covering the steps adopted for the pilot workload migration, the main issues encountered, and how you resolved them. This knowledge base will provide a valuable reference point during the migration phase.
  - Identify opportunities to:
    - Improve and standardize the CI/CD pipelines to use standard Kubernetes objects for broader compatibility and simplicity.
    - Externalize the container image registry so that the availability and management of the container images are independent of the managed cluster's availability.
    - Streamline the overall application logging strategy (for example, using AWS CloudWatch log forwarding on the AWS platform to stream the application logs to the target log aggregator).
    - Reexamine the mechanisms used by applications to interact with AWS resources. For example, if the target managed cluster is ROSA (Red Hat OpenShift Service on AWS), it integrates with AWS Security Token Service (AWS STS). Using fine-grained IAM roles for service accounts (IRSA) would be a potential solution to simplify the identity and access management (IAM) functions required for accessing AWS resources.
  - Automate creating the necessary resources on the managed cluster for all migration candidates, including user groups, roles, role-based access control (RBAC), namespaces, outbound/inbound firewall rules, and policies, to lay the foundation for deploying the applications through CI/CD pipelines without significant changes.
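As one illustration of automating the per-application resources, the sketch below generates a `Namespace` and a `RoleBinding` for each migration candidate and bundles them into a single `List` object that could be piped to `kubectl apply -f -`. It assumes that each team maps to one identity-provider group and that binding the built-in `edit` ClusterRole is acceptable; the candidate names, group names, and the `team` label are hypothetical. It covers only namespaces and RBAC, not firewall rules or policies.

```python
import json

RBAC_API_GROUP = "rbac.authorization.k8s.io"


def namespace_manifest(name: str, team: str) -> dict:
    """Namespace labelled with the owning team (label key is an assumption)."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": {"team": team}},
    }


def rolebinding_manifest(namespace: str, group: str,
                         cluster_role: str = "edit") -> dict:
    """Bind a group to a ClusterRole inside a single namespace."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{group}-{cluster_role}", "namespace": namespace},
        "subjects": [
            {"kind": "Group", "name": group, "apiGroup": RBAC_API_GROUP},
        ],
        "roleRef": {
            "kind": "ClusterRole",
            "name": cluster_role,
            "apiGroup": RBAC_API_GROUP,
        },
    }


def onboarding_bundle(candidates: list[dict]) -> dict:
    """Build one List object holding the manifests for every candidate."""
    items = []
    for c in candidates:
        items.append(namespace_manifest(c["namespace"], c["team"]))
        items.append(rolebinding_manifest(c["namespace"], c["group"]))
    return {"apiVersion": "v1", "kind": "List", "items": items}


# Hypothetical migration candidates for illustration only.
candidates = [
    {"namespace": "orders", "team": "team-a", "group": "team-a-devs"},
    {"namespace": "reports", "team": "team-c", "group": "team-c-devs"},
]

# kubectl accepts JSON as well as YAML, so this output can be applied directly.
print(json.dumps(onboarding_bundle(candidates), indent=2))
```

Keeping the generator in the same repository as the migration tracker means that a new candidate needs only one new entry rather than hand-written manifests.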
Workload migration
With the cluster shakedown already done by the core team as a part of the readiness check, you can start the actual application migration relatively easily. One fundamental tenet of a successful large-scale workload migration is to manage the scale-out through the core team that has done the pilot workload migration. Application teams guided by the core team must pass on the knowledge harvested from pilot migrations to the other teams. The core team model, together with a living "migration cookbook" (covering the knowledge gained with each migration), can enable newer teams to onboard their workloads much faster.
The Migration Toolkit for Containers (MTC) is very handy for migrating stateful workloads. The OpenShift migration best practices repository from Red Hat's Community of Practice provides extensive details on planning and executing large-scale migrations.
Although opening the firewalls is covered mainly during the preparation phase (preferably through automatic replication of the existing on-premises Kubernetes-centric firewall rules to the managed cluster), you might need to create new firewall rules between the existing Kubernetes cluster and the managed cluster. For example, migrated workloads that are microservices or event-driven must be able to connect to the services and dependencies that still reside in on-premises clusters. Reliable connectivity between the managed and on-premises clusters therefore plays a pivotal role.
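A quick way to confirm that a new firewall rule works is to attempt a TCP connection from inside the managed cluster to each dependency that remains on-premises. The following is a minimal sketch using only the Python standard library. It checks TCP reachability only, not TLS, authentication, or application-level health, and the dependency list you pass in is your own.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def unreachable(dependencies: list[tuple[str, int]],
                timeout: float = 3.0) -> list[tuple[str, int]]:
    """Return the (host, port) pairs that could not be reached."""
    return [
        (host, port)
        for host, port in dependencies
        if not can_connect(host, port, timeout)
    ]
```

Run from a pod in the migrated application's namespace, a call such as `unreachable([("broker.onprem.example", 9092)])` (a hypothetical host and port) returns an empty list when every listed dependency is reachable.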
Verify the migration
Perform regular iterative testing for each workload migrated to the managed cluster to verify that applications work as intended.
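Before running functional tests against a migrated workload, it helps to confirm that its Deployment has finished rolling out. The sketch below assumes the input is the JSON produced by `kubectl get deployments -A -o json` and relies only on standard `apps/v1` Deployment fields. It is a readiness gate under those assumptions, not a substitute for the tests themselves.

```python
def rollout_complete(deployment: dict) -> bool:
    """True when the latest spec is observed and all replicas are ready."""
    meta = deployment.get("metadata", {})
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})

    desired = spec.get("replicas", 1)  # Kubernetes defaults replicas to 1
    # The controller must have processed the most recent spec change.
    observed = status.get("observedGeneration", 0) >= meta.get("generation", 0)

    return (
        observed
        and status.get("updatedReplicas", 0) == desired
        and status.get("readyReplicas", 0) == desired
        and status.get("availableReplicas", 0) == desired
    )


def pending_workloads(deployment_list: dict) -> list[str]:
    """Return 'namespace/name' for every Deployment not fully rolled out."""
    pending = []
    for item in deployment_list.get("items", []):
        if not rollout_complete(item):
            meta = item.get("metadata", {})
            pending.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return pending
```

An empty result from `pending_workloads` suggests the cluster-level rollout is done and the per-application tests can begin.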
Once most of the workloads are running on the managed cluster, conduct the usual end-to-end tests (functional, security, performance, and so forth) to iron out the operational aspects of the applications and support business continuity.
While there are other operational aspects to consider for large-scale workload migrations, this is a simplified baseline for a quick start.