How to use cloud hyperscalers to handle 5G traffic demand bursts

August 11, 2022 | Fatih E. Nar, Brandon Jozsa | 6-minute read

Telecom, media, and entertainment (TME) industries use the term burst to describe unexpected, unplanned, or peak interest in consumable services and products that existing capabilities and capacities aren't able to handle.


One way TME service providers try to address these demand bursts is by signing partnership agreements with hyperscalers. This enables them to avoid unnecessary capital investments and handle temporary consumption increases in their services portfolio.

TME service providers primarily use on-premises solutions as their application platform for various reasons, including:

Regulatory compliance (for example, data locality requirements)
Better total cost of ownership (TCO) and return on investment (ROI) for stable or saturated service consumption, where infrastructure and platform characteristics are already optimized
End-to-end ownership and administration from infrastructure to platform and application stacks

TME providers can use hyperscalers to address bursts without compromising on any of these requirements.

Things to consider in a 5G burst architecture

Bursts share a few key characteristics. They:

Can occur at unpredictable dates or times
Have a temporary or ephemeral duration
Are usually tied to low ROI on the capital expenditures (capex) and operational expenditures (opex) required to absorb them

Applications that are amenable to burst with hyperscaler resources are:

Truly cloud-native with ease of horizontal scalability
Easily integrated with consumer traffic flow

Therefore, our solution needs to provide on-demand horizontal scaling, ephemeral resources, the fastest time to market, and the lowest TCO, including for cloud spending and talent.

TL;DR: Implement bursting with the highest level of automation and the lowest infrastructure cost or investment possible.
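To make the ROI argument above concrete, here is a minimal Python sketch that compares owning peak capacity with renting it only while a burst lasts. The function names and every number in the usage note are hypothetical and purely illustrative; they are not taken from any provider's pricing.

```python
def owned_cost(capex: float, opex_per_hour: float, hours_in_period: float) -> float:
    """Cost of owning the extra capacity for the whole period, used or not."""
    return capex + opex_per_hour * hours_in_period


def burst_cost(cloud_rate_per_hour: float, burst_hours: float) -> float:
    """Cost of renting equivalent capacity only while the burst lasts."""
    return cloud_rate_per_hour * burst_hours


def break_even_burst_hours(
    capex: float,
    opex_per_hour: float,
    hours_in_period: float,
    cloud_rate_per_hour: float,
) -> float:
    """Burst hours per period above which owning becomes cheaper than renting."""
    return owned_cost(capex, opex_per_hour, hours_in_period) / cloud_rate_per_hour
```

With made-up inputs of 100,000 capex, 2 per hour opex, a one-year period (8,760 hours), and a cloud rate of 20 per hour, `break_even_burst_hours(100_000, 2, 8760, 20)` gives 5,876 hours. A burst that lasts a few hundred hours a year sits far below that threshold, which is the situation where renting ephemeral capacity tends to make sense.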


Options for bursting 5G

Two layers of a 5G deployment are subject to bursting: the 5G application stack, which implements 3rd Generation Partnership Project (3GPP) 5G standards, and the application platform that accommodates it. There are two options for bursting the application platform: (A) using a hyperscaler to expand the size of the existing platform or (B) adding ephemeral new clusters on a hyperscaler.

Expanding the platform towards hyperscaler infrastructure

Although technically possible, option A (expanding the existing platform) is not the recommended approach because it would increase the size of the failure domain and grow a common attack surface. The main cluster is already under heavy traffic, and adding new worker capacity will not relieve the cluster control plane; it will actually add to its load. Also, mixing different infrastructure types within a single cluster creates non-homogeneous configuration models (also called "snowflakes") for platform lifecycle management.

Note that cluster autoscaling is possible, and can be recommended, as long as it preserves the consistency of the serving infrastructure layer. However, it would not address bursting from on-premises to cloud or hyperscaler use cases.

Adding ephemeral clusters on a hyperscaler to the application platform farm

Additional ephemeral clusters come with the cost of an additional cluster control plane. However, isolating the control planes helps shrink the failure domain and segregate attack surfaces.

We picked the second option to build our recommended solution architecture.

Our solution architecture

The solution's main components include:

Application platform: Red Hat OpenShift Container Platform (RH-OCP) as the 5G application platform.
- For a detailed analysis of the 5G application platform, refer to our articles Expanding 5G with the 5G Open HyperCore architecture and Kubernetes deployment models for edge applications.
Platform management: Red Hat Advanced Cluster Management (RH-ACM) as the base for implementing burstable 5G application platform management with an internal policy engine to match declared available capacities.
- For a detailed analysis of the 5G observability vs. service placement, refer to our articles How we designed a 5G Core platform that scales well and Edge computing: How to architect distributed scalable 5G with observability.
5G stack operations: Red Hat Ansible for managing the configuration of 5G service components and updating traffic management patterns to distribute incoming 5G traffic accordingly.

1. Burst the platform

Platform management is the heart of the platform burst operation: it acts as an on-demand dispenser of 5G platforms (or clusters).

RH-ACM cluster pool functionality provides rapid and cost-effective access to configured RH-OCP clusters on demand and at scale. Cluster pools offer a configurable and scalable number of RH-OCP clusters on Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure that can be claimed when needed.

Figure: Create a cluster pool on AWS, GCP, or Azure

Cluster pools are especially powerful for providing or replacing cluster environments for development, continuous integration, and production scenarios, and for addressing on-demand capacity increases (bursting).

You can specify the number of clusters to keep running so that they are available to be claimed immediately for bursting to a cloud. The remaining clusters will be held in a hibernating state so that they can be resumed and claimed quickly (compared to cluster creation).
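As a rough illustration of the pool sizing described above, the following Python sketch builds the kind of ClusterPool and ClusterClaim manifests that RH-ACM manages through the Hive API. The field names (`size`, `runningCount`, `clusterPoolName`, and so on) reflect our understanding of the `hive.openshift.io/v1` API and should be verified against your RH-ACM version. The pool name, namespace, domain, image set, region, and secret names are hypothetical placeholders.

```python
import json


def cluster_pool_manifest(name, namespace, size, running_count,
                          base_domain, image_set, region):
    """Build a ClusterPool manifest as a plain dict (AWS variant)."""
    if not 0 <= running_count <= size:
        raise ValueError("running_count must be between 0 and size")
    return {
        "apiVersion": "hive.openshift.io/v1",
        "kind": "ClusterPool",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "size": size,                   # clusters kept ready to claim
            "runningCount": running_count,  # of those, how many stay running
            "baseDomain": base_domain,
            "imageSetRef": {"name": image_set},
            "platform": {
                "aws": {
                    "region": region,
                    # Hypothetical secret name holding cloud credentials.
                    "credentialsSecretRef": {"name": "aws-creds"},
                }
            },
            # Hypothetical secret name holding the image pull secret.
            "pullSecretRef": {"name": "pull-secret"},
        },
    }


def cluster_claim_manifest(name, namespace, pool_name):
    """Build a ClusterClaim manifest that requests one cluster from a pool."""
    return {
        "apiVersion": "hive.openshift.io/v1",
        "kind": "ClusterClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"clusterPoolName": pool_name},
    }


if __name__ == "__main__":
    pool = cluster_pool_manifest(
        "burst-5g", "burst-pools", size=3, running_count=1,
        base_domain="example.com", image_set="ocp-image-set", region="us-east-1",
    )
    print(json.dumps(pool, indent=2))
```

In this sketch, one of the three pooled clusters stays running for immediate claims while the other two hibernate, trading a slower resume for lower ongoing cost.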

When a cluster claim is requested, the pool assigns a running cluster to it. If no running clusters are available, a hibernating cluster is resumed to satisfy the claim; if none is hibernating either, a new cluster is provisioned.

Figure: Available versus hibernated clusters on AWS EC2


The cluster pool automatically creates new clusters and resumes hibernating clusters to maintain the specified pool size and number of available running clusters.

A claim is fulfilled once a cluster in the pool is running and ready. The pool then replenishes itself to keep meeting the requirements specified for it.
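The claim and replenishment behavior described above can be sketched as a small in-memory model. This is a deliberately simplified simulation and not RH-ACM or Hive code: the `Pool` class, its methods, and the cluster names are hypothetical, and real provisioning and resuming take minutes rather than happening instantly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pool:
    size: int            # unclaimed clusters the pool keeps on hand
    running_count: int   # how many of those should be running
    running: List[str] = field(default_factory=list)
    hibernating: List[str] = field(default_factory=list)
    _next_id: int = 0

    def _provision(self) -> str:
        """Pretend to install a brand-new cluster and return its name."""
        self._next_id += 1
        return f"cluster-{self._next_id}"

    def reconcile(self) -> None:
        """Restore the declared pool size and running count."""
        # Create clusters until the pool holds `size` unclaimed clusters.
        while len(self.running) + len(self.hibernating) < self.size:
            self.hibernating.append(self._provision())
        # Resume hibernating clusters until `running_count` are running.
        while len(self.running) < self.running_count and self.hibernating:
            self.running.append(self.hibernating.pop(0))

    def claim(self) -> str:
        """Hand out a cluster: running first, then hibernating, then new."""
        if self.running:
            cluster = self.running.pop(0)
        elif self.hibernating:
            cluster = self.hibernating.pop(0)  # resumed on demand
        else:
            cluster = self._provision()
        self.reconcile()  # the pool refills itself after every claim
        return cluster


if __name__ == "__main__":
    pool = Pool(size=3, running_count=1)
    pool.reconcile()
    print(pool.claim(), pool.running, pool.hibernating)
```

The ordering in `claim` mirrors the preference for the fastest available capacity, and calling `reconcile` afterwards models the pool topping itself back up for the next burst.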

When bursting ends (when traffic levels return to normal) and extra capacity is no longer needed, the system initiates destruction of the cluster pool. When a cluster pool is destroyed, all unclaimed hibernating clusters are destroyed and their resources are released. See the "burst management" section below for more information.

2. Burst the 5G stack

From the RH-ACM perspective, 5G Core is an application (with multiple 5G microservices inside). The application model is based on subscribing to one or more Kubernetes resource repositories (channel resources) that contain the resources deployed on managed clusters. Both single-cluster and multicluster (burst-case) applications use the same Kubernetes specifications, but multicluster applications involve more deployment and application management lifecycle automation.

Placement rules define the target clusters where resource templates can be deployed. You can use placement rules to facilitate multicluster deployment (bursting) of 5G Core deployments. Placement rules are also used for governance and risk policies. See multicloud-operators-placementrule and the documentation on placement rules for details on multicloud placement rules.
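As a hedged illustration of how a placement rule ties a 5G Core subscription to burst clusters, the Python sketch below builds PlacementRule and Subscription manifests as plain dictionaries. The API group and field names follow our understanding of the `apps.open-cluster-management.io/v1` resources and should be checked against the placement rule documentation. The `purpose: 5g-burst` label, the resource names, and the channel reference are hypothetical.

```python
API_VERSION = "apps.open-cluster-management.io/v1"


def placement_rule(name, namespace, match_labels, replicas=None):
    """Select available managed clusters that carry the given labels."""
    spec = {
        # Only consider clusters that RH-ACM reports as available.
        "clusterConditions": [
            {"type": "ManagedClusterConditionAvailable", "status": "True"}
        ],
        "clusterSelector": {"matchLabels": dict(match_labels)},
    }
    if replicas is not None:
        spec["clusterReplicas"] = replicas  # cap on how many clusters to pick
    return {
        "apiVersion": API_VERSION,
        "kind": "PlacementRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def subscription(name, namespace, channel, rule_name):
    """Subscribe to a channel and deploy wherever the placement rule points."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Subscription",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "channel": channel,  # "<channel-namespace>/<channel-name>"
            "placement": {
                "placementRef": {"kind": "PlacementRule", "name": rule_name}
            },
        },
    }


if __name__ == "__main__":
    rule = placement_rule("burst-clusters", "5g-core", {"purpose": "5g-burst"})
    sub = subscription("5g-core-sub", "5g-core", "5g-channels/5g-core", "burst-clusters")
    print(rule["spec"]["clusterSelector"], sub["spec"]["placement"])
```

Under these assumptions, labeling a newly claimed cluster with `purpose: 5g-burst` would be enough for the existing subscription to roll the 5G Core out to it, with no change to the application definition itself.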

Figure: 5G application stack deployment on RH-ACM

3. Burst the traffic

Once the additional platforms on a hyperscaler are ready and the 5G stack is deployed on them, consumer traffic management needs post-placement work: the additional 5G capacity must be plugged into the incoming traffic path. This work should account for ingress controllers, fully qualified domain names (FQDNs), and the reachability of microservices in the other cluster. It can be done in various ways (individually or in combination), and our group's next article will elaborate on this topic.

Leveraging service mesh (option X): Use Istio Ingress with Istio virtual services to load balance across multiple 5G deployments on multiple clusters with a federated mesh. See the "service mesh federation" section of Edge computing: How to architect distributed scalable 5G with observability for details.
Leveraging external DNS for Kubernetes (option Y): Adding a newly created 5G deployment on a hyperscaler to an existing DNS record resolution path allows seamless service scaling. Visit the ExternalDNS Kubernetes GitHub repository for details.

The latter approach (option Y) can be coupled with geoproximity information to serve 5G consumers from the nearest deployment location, which makes it our favorite option so far. See AWS's geolocation routing page for details.
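To illustrate the geoproximity idea only, here is a minimal Python sketch that picks the deployment closest to a consumer by great-circle (haversine) distance. In practice the DNS provider makes this decision, so this is not how ExternalDNS or AWS routing is implemented. The deployment names and coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_deployment(consumer, deployments):
    """Return the name of the deployment closest to the consumer's location."""
    return min(
        deployments,
        key=lambda name: haversine_km(*consumer, *deployments[name]),
    )


if __name__ == "__main__":
    # Hypothetical sites: one on-premises, one burst cluster on a hyperscaler.
    deployments = {
        "on-prem-dallas": (32.78, -96.80),
        "burst-us-east": (38.95, -77.45),
    }
    new_york = (40.71, -74.01)
    print(nearest_deployment(new_york, deployments))
```

A consumer near New York would be steered to the hypothetical `burst-us-east` site, while one near Dallas would stay on the on-premises deployment.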

Burst management

For the best price/performance operational model, you must be conscious of resource usage over time. So, when should you destroy a cluster versus a cluster pool? Here are some approaches:

When should you destroy a claimed cluster? When a burst instance completes (for example, a major holiday, such as Thanksgiving in the US) and traffic returns to normal for the time being, but the burst season is not over yet. The cluster pool stays up and ready to provide additional clusters when needed.

Figure: Destroying a cluster at the end of the burst period

When should you destroy a cluster pool? When the burst season completes (for example, the US holiday season, which spans the last two months of the year).
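The two rules above can be sketched as a small decision function. This is a minimal illustration under our own assumptions: the `Action` names and the boolean inputs are hypothetical, and how you detect "traffic is back to normal" or "the season is over" depends on your observability stack.

```python
from enum import Enum


class Action(Enum):
    KEEP_CLUSTER = "keep the claimed cluster serving traffic"
    DESTROY_CLUSTER = "destroy the claimed cluster, keep the pool"
    DESTROY_POOL = "destroy the claimed cluster and the cluster pool"


def burst_action(traffic_is_normal: bool, season_is_over: bool) -> Action:
    """Decide what to tear down once a burst has been served."""
    if not traffic_is_normal:
        # The burst is still in progress, so the extra capacity stays.
        return Action.KEEP_CLUSTER
    if season_is_over:
        # No further bursts are expected, so the pool is no longer needed.
        return Action.DESTROY_POOL
    # This burst instance is done, but more may follow this season.
    return Action.DESTROY_CLUSTER


if __name__ == "__main__":
    # Day after Thanksgiving: traffic is normal, holiday season continues.
    print(burst_action(traffic_is_normal=True, season_is_over=False).value)
```

Keeping the pool alive between burst instances preserves the fast claim path, and destroying it at the end of the season stops paying for capacity that is no longer expected to be used.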

Summary

We provided a lot of information here. Here is a simple flow diagram to summarize what we covered:

Remember there are multiple paths to fixing a problem or addressing a need. In sharing our solution, we included various choices with pros and cons. Our solution components will not fit every technical context or business reality. Therefore, it's important to remain open-minded and be able to adopt a better solution based on your needs.

This originally appeared on Medium as Burst OR not to burst! and is republished with permission.

About the authors

Fatih E. Nar

Chief Technologist

Fatih, known as "The Cloudified Turk," is a seasoned Linux, OpenStack, and Kubernetes specialist with significant contributions to the telecommunications, media, and entertainment (TME) sectors over multiple geos with many service providers.

Before joining Red Hat, he held noteworthy positions at Google, Verizon Wireless, Canonical Ubuntu, and Ericsson, honing his expertise in TME-centric solutions across various business and technology challenges.

With a robust educational background, holding an MSc in Information Technology and a BSc in Electronics Engineering, Fatih excels in creating synergies with major hyperscaler and cloud providers to develop industry-leading business solutions.

Fatih's thought leadership is evident through his widely appreciated technology articles (https://fnar.medium.com/) on Medium, where he consistently collaborates with subject matter experts and tech-enthusiasts globally.


Brandon Jozsa

As a principal solutions architect, Brandon brings over 25 years of telco industry experience to the NA TME Tiger team. For several years, Brandon has been contributing to the development of OpenStack and Kubernetes, and as the original architect and project technical lead (PTL) of OpenStack-Helm, he has been specifically targeting cloud-native solutions for performant, telecom-based workloads. Prior to joining Red Hat, Brandon was a chief architect at Mavenir and also previously served as a lead architect at Charter and AT&T.
