In the introductory article of this series, we discussed fault management and performance management (FM/PM) and why they form a critical domain for telco operators. An important part of FM/PM is observability, because you can't effectively manage what you can't see.
Observability requires a single-pane-of-glass view of the entire network and its services, including the hardware and software components. The challenge is integrating diverse monitoring, telemetry, and alerting systems into one interface that can provide real-time insights across the entire telecom infrastructure. In this article, we look at some of the requirements telco operators most often ask for in observability and monitoring solutions for their networks.
Real-time metrics feedback loop
For a 5G telecom service provider network to function optimally, events and metrics must be correlated in a timely manner across different layers and protocols in the telco network, including RAN, core, and transport entities. This correlation depends on how quickly faults are propagated to the management layer (local and remote) and how fast the response and feedback propagate back to the origin.
A good observability solution completes this propagation loop within the prescribed time budget. A real-time (or at least near-real-time) response is essential for the proper functioning and performance of a 5G network. Any degradation in the network must be quickly noticed by the operator monitoring it.
Real-time metrics feedback loops are the backbone of modern observability and resilience strategies, particularly in industries like telecom, where rapid adjustments are key to maintaining service quality.
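As a rough illustration of the time budget idea, the sketch below measures one trip around the loop and checks it against a budget. The `LoopTiming` fields, the sample timestamps, and the 1-second budget are all assumptions made for this example, not values taken from any standard or product.

```python
from dataclasses import dataclass


@dataclass
class LoopTiming:
    """Timestamps, in seconds, for one fault travelling around the loop."""
    fault_raised: float    # fault detected at the network function
    mgmt_received: float   # fault reaches the management layer
    action_applied: float  # corrective feedback lands back at the origin


def loop_latency(t: LoopTiming) -> float:
    """Total time from fault detection to applied feedback."""
    return t.action_applied - t.fault_raised


def within_budget(t: LoopTiming, budget_s: float) -> bool:
    """True when the full loop completed inside the time budget."""
    return loop_latency(t) <= budget_s


# Hypothetical sample: 0.25 s to reach management, 0.5 s for the response.
timing = LoopTiming(fault_raised=100.0, mgmt_received=100.25, action_applied=100.75)
print(loop_latency(timing))        # 0.75
print(within_budget(timing, 1.0))  # True
```

In practice the timestamps would come from synchronized clocks on different systems, so clock skew would need to be accounted for before comparing against a tight budget.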
On-prem long-term log storage
Telco networks generate massive amounts of real-time data, such as call detail records (CDR), packet flow, and signaling data. Collecting, storing, and processing this data in real time to identify and respond to issues is a challenge. Storing logs on-premises for the long term also enables a comprehensive view by correlating logs, metrics, and traces in one cohesive interface. This helps with root cause analysis and faster troubleshooting. In addition, regulations in many jurisdictions require telcos to store logs for months, if not years.
Logging helps with network optimization and SLA monitoring. It enriches data for AI/ML models for predictive analysis and anomaly detection. Last but not least, logs play an important role in troubleshooting and finding patterns for rare issues.
Identifying data retention requirements of various cloud applications can help choose the right storage solution, such as object storage like Ceph. Log management tools like ELK, Searchstack, and Loki need to be able to access, aggregate, ingest, and process logs with indexing or compression and encryption, as required. Telcos must treat long-term log storage as a regulatory requirement and a strategic asset for better observability, security, and optimization.
Metrics-based autoscaling
Horizontal pod autoscaling using custom metrics allows scaling pods based on metrics beyond the default CPU or memory usage, such as application-specific metrics like number of sessions, data throughput, or business metrics.
Consider this use case of dynamic scaling in a 5G network:
- Metrics collected: CPU and memory usage, throughput, and network-specific traffic patterns
- Analysis: If traffic exceeds 80% capacity on a specific slice, the system predicts a potential SLA breach
- Feedback action: Automatically spin up additional virtualized RAN nodes to handle the load
Such use cases need custom metrics to be observed by the telco cloud. Metric collectors, like Prometheus, can be configured to scrape custom metrics data.
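To make the scaling decision concrete, here is a minimal sketch of the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: the desired replica count is the current count scaled by the ratio of the observed metric to its target, rounded up. The function name, the sessions-per-pod metric, and the replica bounds are assumptions for illustration, and the sketch leaves out the tolerance band and stabilization behavior that the real autoscaler applies.

```python
import math


def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Scale replicas by the ratio of observed metric to target, rounded up."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_replicas * current_value / target_value)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical custom metric: average active sessions per pod, target 800.
print(desired_replicas(4, current_value=1200, target_value=800))  # 6
print(desired_replicas(4, current_value=400, target_value=800))   # 2
```

The same arithmetic works for any custom metric that a collector such as Prometheus exposes, as long as the metric is expressed as a per-pod average against a per-pod target.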
Variety of management interfaces
A typical telco service provider has its own customized (usually hierarchical) management interfaces. Management components expect incoming metrics, alerts, or logs to arrive in a specific format and through specific protocols and interfaces.
The challenge for cloud metric servers and their clients is to keep their implementation generic enough to cater to various management interfaces, yet easy to customize to each telco operator's needs. Retention times, secure interfaces, packet sizes, and protocol versions are some of the tunable parameters required to support these requirements.
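One way to picture "generic but customizable" is a single alert record rendered differently per management interface, driven by a small set of tunables. The sketch below is hypothetical throughout: `InterfaceConfig`, its field names, the two payload formats, and the sample values are assumptions for illustration, not the schema of any particular product.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceConfig:
    """Hypothetical per-interface tunables named after those in the text."""
    name: str
    payload_format: str     # "json" or "kv" (key=value pairs)
    retention_days: int
    tls_required: bool
    max_payload_bytes: int
    protocol_version: str


def render_alert(alert: dict, cfg: InterfaceConfig) -> bytes:
    """Render one generic alert in the format a given interface expects."""
    if cfg.payload_format == "json":
        body = json.dumps(alert, sort_keys=True)
    elif cfg.payload_format == "kv":
        body = " ".join(f"{key}={alert[key]}" for key in sorted(alert))
    else:
        raise ValueError(f"unsupported format: {cfg.payload_format}")
    data = body.encode("utf-8")
    if len(data) > cfg.max_payload_bytes:
        raise ValueError("alert exceeds the interface's maximum payload size")
    return data


alert = {"severity": "major", "source": "upf-1"}
regional = InterfaceConfig("regional-noc", "kv", 90, True, 512, "v2")
central = InterfaceConfig("central-oss", "json", 365, True, 4096, "v3")

print(render_alert(alert, regional))  # b'severity=major source=upf-1'
print(render_alert(alert, central))   # b'{"severity": "major", "source": "upf-1"}'
```

Keeping the alert itself format-neutral means a new management interface only needs a new configuration entry and, at most, one new rendering branch.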
Regulatory compliance audit support
Telecom observability and monitoring solutions must align with a variety of regulatory compliance standards to ensure proper auditing, data integrity, and lawful data management. These compliance standards aim to protect sensitive information, enable lawful interception, and provide sufficient transparency to meet legal, business, and operational requirements.
Some common requirements for observability systems include:
- Enable detailed logging of all intercepted communications
- Support secure and auditable access pathways for authorized entities
- Ensure the confidentiality, integrity, and availability of monitoring data
- Maintain audit logs that track system access, modifications, and operational changes
Telco networks generate massive amounts of observability data, making real-time compliance challenging.
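As one possible way to support the integrity and audit-trail requirements above, an audit log can be made tamper-evident by chaining each entry to the hash of the previous one. This is a general technique, not something a specific regulation mandates, and the entry fields (`actor`, `action`) are assumptions chosen for this sketch.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def append_entry(log: list, actor: str, action: str) -> None:
    """Append an entry whose hash covers its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"actor": actor, "action": action, "prev_hash": prev_hash}
    log.append({**record, "hash": _digest(record)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        record = {"actor": entry["actor"], "action": entry["action"],
                  "prev_hash": entry["prev_hash"]}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(record):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "operator-a", "login")
append_entry(audit_log, "operator-a", "changed alert threshold")
print(verify(audit_log))  # True

audit_log[0]["action"] = "logout"  # simulate tampering
print(verify(audit_log))  # False
```

A chain like this only detects tampering; access control, secure storage, and timestamps from a trusted source would still be needed alongside it.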
Granularity in parameters
The granularity of parameters in a telco monitoring solution significantly impacts its effectiveness, efficiency, and the value it provides. In this context, "granularity" refers to the level of detail or resolution at which data is collected, processed, and analyzed. Choosing the right level of granularity is critical for addressing the unique demands of a telco network.
Detailed metrics, such as per-second packet latency or per-user session throughput, offer deep visibility into network performance. This is useful for diagnosing specific issues, such as jitter in VoIP calls or individual user experience. Aggregated metrics provide a high-level overview, like average network latency per hour. These are suitable for identifying long-term trends but may overlook transient issues.
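The trade-off is easy to see with numbers. The sketch below uses made-up per-second latency samples for one hour, with a 5-second burst of high latency, and compares the hourly average with the per-second view. The sample values and the 100 ms threshold are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical per-second latency samples (ms) for one hour:
# a steady 10 ms, except for a 5-second burst at 500 ms.
samples_ms = [10] * 3600
for second in range(1800, 1805):
    samples_ms[second] = 500

# Coarse granularity: one aggregated value for the whole hour.
hourly_avg = mean(samples_ms)

# Fine granularity: every second that crossed a 100 ms threshold.
spike_seconds = [i for i, value in enumerate(samples_ms) if value > 100]

print(round(hourly_avg, 2))  # 10.68
print(len(spike_seconds))    # 5
```

The hourly average barely moves, while the per-second data shows the exact seconds during which users would have noticed the problem. The cost of that detail is 3,600 stored values per hour instead of one.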
Observability in telco
Observability is a requirement for telco networks, both to ensure compliance and to maintain quality of service. Getting observability right, however, isn't easy. Red Hat serves the telco industry and provides solutions that meet and exceed expectations. For more information about our telco services, visit our Telco industry page.
About the author
Deepak has worked at Red Hat since 2023 as Product Manager for Cloud Telco platforms. Before that, he was with Nokia and Ericsson, working in software development and solution architecture for radio and core network products. His recent interests are telco observability and the AI/ML technology and tooling involved.