
Network programmability is the use of software to deploy, manage, and troubleshoot network elements. A programmable network is driven by an intelligent software stack that can take action based on business requests or network events. Let’s discuss how network programmability can help communication service providers adapt to new trends such as the internet of things (IoT), 5G, and edge computing.

Complementary to network programmability is Software Defined Networking (SDN), which not only separates the control plane and forwarding plane of network elements but also provides application programming interfaces (APIs) to control and manage them. You can learn more about SDN in a previously published post.

History of network programmability

One of the early attempts to enable network engineers and operators to start making networks programmable was the Simple Network Management Protocol (SNMP) in 1990.

SNMP displays management data in the form of variables organized in a management information base (MIB), which describes the system status and configuration. These variables can then be remotely queried (and, in some circumstances, manipulated) by managing applications. SNMP also supports sending notifications, called SNMP traps, based on preconfigured settings.

With the rise of SDN since 2010, many new protocols have made network programmability easier. They changed the game and opened the door to autonomous networks.

The role of SDN

In my opinion, SDN is what enables network programmability. SDN makes the distributed system that is a network topology controllable and manageable through a set of APIs. Network programmability is the ability to consume and build a system around these APIs.

At a very high level, SDN and network programmability have four main components:

  • Applications: business logic based on use cases.

  • Northbound APIs: expose programmable access to network functionalities and resources.

  • A shim layer: translates requests between the northbound and southbound interfaces.

  • Southbound Interfaces: use SDN protocols to enable bi-directional communication between network elements and SDN controllers.

SDN brought a number of interesting protocols fostering the adoption of data models, which provide a model-driven approach to express a network element’s functionalities.

YANG: the data modeling language

Let’s start by saying that the Network Configuration Protocol (NETCONF), published as RFC 4741, is what democratized the use of data models with YANG (RFC 6020). NETCONF uses XML over SSH to communicate with network elements, while RESTCONF uses JSON or XML; either way, YANG rapidly became a de facto standard for expressing device functionalities. This led to the creation of marshallers and unmarshallers that convert XML/JSON payloads to and from YANG-modeled data.

As defined in RFC 6020, YANG "is a data modeling language used to model configuration and state data manipulated by the Network Configuration Protocol (NETCONF), NETCONF remote procedure calls, and NETCONF notifications."
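To make this concrete, here is a minimal sketch of the XML payload a NETCONF client would use to select one interface's configuration, built with only the Python standard library. In practice, a library such as ncclient constructs and sends this over SSH for you; the interface name `GigabitEthernet0/0` and the use of the ietf-interfaces namespace are illustrative assumptions.

```python
# Sketch: building a NETCONF subtree filter for one interface,
# encoded in XML as NETCONF requires. Standard library only; a real
# client (e.g., ncclient) would wrap this in an RPC and send it over SSH.
import xml.etree.ElementTree as ET

IETF_IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"

def build_interface_filter(name: str) -> str:
    """Return a subtree filter selecting a single interface by name."""
    interfaces = ET.Element("interfaces", xmlns=IETF_IF_NS)
    interface = ET.SubElement(interfaces, "interface")
    ET.SubElement(interface, "name").text = name
    return ET.tostring(interfaces, encoding="unicode")

filter_xml = build_interface_filter("GigabitEthernet0/0")
print(filter_xml)
```

The same data, serialized as JSON instead of XML, is what a RESTCONF client would exchange; the YANG model is what both encodings have in common.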

Device Model

A YANG model should be seen as a tree structure. In a nutshell, when looking at a device overall, the root of the tree represents the device's overall feature set. Each branch corresponds to a functionality, and each leaf corresponds to a specific parameter of that functionality. The granularity at which a functionality is broken down depends on the data model owner.
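The root/branch/leaf view can be sketched with a toy tree. The nesting below loosely mirrors an ietf-interfaces-style subtree; real models are far richer, and the specific names and values are illustrative, but the structure is the same.

```python
# Sketch: a device's YANG-modeled view as a tree of branches and leaves.
device = {                                   # root: the device feature set
    "interfaces": {                          # branch: a functionality
        "GigabitEthernet0/0": {              # list entry, keyed by name
            "enabled": True,                 # leaf: a parameter
            "ipv4-address": "192.0.2.1/30",  # leaf
        },
    },
    "routing": {                             # another branch
        "static-route": {"0.0.0.0/0": {"next-hop": "192.0.2.2"}},
    },
}

def leaves(tree, path=()):
    """Yield (path, value) for every leaf in the tree."""
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from leaves(value, path + (key,))
        else:
            yield path + (key,), value

for path, value in leaves(device):
    print("/".join(path), "=", value)
```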

In order to fulfill the needs of telecommunication operators, who wanted to abstract their networks, open source communities (OpenConfig and others) and standards bodies (mainly the IETF) started standardizing YANG models. This enables device interoperability and makes network controllers device-agnostic.

SDN was a new paradigm for network administrators and vendors to understand, and YANG was a new language for them to learn. Additionally, for network vendors that saw the need to offer such capabilities, it was a challenge: the data models they were building to represent a network equipment’s functionalities were rarely complete, often missing the one parameter or feature that was required.

Sometimes the device supported a feature but the model didn't expose it; sometimes the model exposed a parameter or functionality the device didn't support. This led to a lot of churn in vendor data models and a lot of testing in the software built to automate against them; but these days, things are more stable and the ecosystem has matured.

As functionalities evolved, their representation had to evolve as well, resulting in versioned YANG models. One of the most critical best practices is to keep newer versions backward compatible with older ones (which is true of any exposed API).

But of course, the versioning and backward-compatibility requirements weren't always well respected, which drove a lot of churn for network administrators and developers. As soon as a device was upgraded, the automation built around it broke and had to be redone, over and over.

Service Model

We discussed how YANG is used to abstract a network element's functionalities, but YANG is also used to create Service Models.

The intent of a Service Model is to provide another (in this case, service-oriented) abstraction layer: a curated view of the various network functionalities and related parameters required to deliver that service. This is very convenient for OSS/BSS systems, as you can more easily tie in the business aspects and correlate them for a given customer.

The IETF and its ecosystem started defining and standardizing network service models (L2VPN, L3VPN, etc.), and more are still in the works (BGP, SR, SRv6, etc.). Open source communities such as OpenConfig and Open ROADM are doing the same.

But this added another layer of mapping, as systems now need to translate from the Service Model to the Device Model: a 1:n mapping when consuming vendor-specific models, or a 1:1 mapping when vendors adopt standardized models such as OpenConfig.
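The service-to-device mapping can be sketched as follows: the service model captures intent once, and a per-vendor renderer produces each device's configuration. The vendor names, CLI syntax, and payload shapes below are hypothetical stand-ins, not any real vendor's format.

```python
# Sketch: mapping one service model onto n device configurations.
l3vpn_service = {  # service model: one customer VPN across two PE routers
    "customer": "acme",
    "sites": [
        {"device": "pe1", "vendor": "vendor-a", "vrf": "ACME", "rd": "65000:1"},
        {"device": "pe2", "vendor": "vendor-b", "vrf": "ACME", "rd": "65000:1"},
    ],
}

def render_vendor_a(site):
    # hypothetical vendor-a CLI dialect
    return f"vrf definition {site['vrf']}\n rd {site['rd']}"

def render_vendor_b(site):
    # hypothetical vendor-b set-style dialect
    return f"set routing-instances {site['vrf']} route-distinguisher {site['rd']}"

RENDERERS = {"vendor-a": render_vendor_a, "vendor-b": render_vendor_b}

def render_service(service):
    """Translate one service model into per-device configurations (1:n)."""
    return {site["device"]: RENDERERS[site["vendor"]](site)
            for site in service["sites"]}

configs = render_service(l3vpn_service)
```

With standardized models, the per-vendor renderers collapse into one, which is exactly why operators push for OpenConfig adoption.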

Basics of network configuration

When configuring a network element, there are three fundamental elements:

  • The configuration itself: This is often referred to as a template, containing either CLI commands, XML, or JSON, depending on the protocol/tool used to apply the configuration.

  • The notion of a golden template: This is a parameterized template that enables reusability. Dynamic parameters within the template are extracted and replaced with placeholders. When a system applies that template, it first renders the resulting configuration using externally provided parameters. Technologies such as Jinja or Velocity may be used here. There are two kinds of parameters:

    • The static ones: These can either be default values, or values that are rarely changed. Typically, these are hard-coded within the template.

    • The dynamic ones: These are values that need to be fetched or (de)allocated from a system, either internal or external (a database, an inventory, an IPAM, another network element, etc.).

  • The protocol used to apply the configuration: There are many protocols for this; the most common are SSH/CLI, NETCONF, RESTCONF, and gRPC.
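The golden-template idea can be sketched in a few lines. Jinja is the usual choice; to keep this dependency-free, the sketch below uses Python's built-in `string.Template` as a stand-in. The interface name, description, and address are illustrative dynamic parameters that would normally come from an inventory or IPAM.

```python
# Sketch: rendering a golden template. The MTU is a static parameter
# hard-coded in the template; the $-placeholders are the dynamic ones.
from string import Template

golden_template = Template(
    "interface $ifname\n"
    " description $description\n"
    " ip address $address\n"
    " mtu 9000\n"   # static parameter, hard-coded in the template
)

# Dynamic parameters, fetched from external systems in a real setup.
params = {
    "ifname": "GigabitEthernet0/1",
    "description": "uplink-to-core",
    "address": "192.0.2.5 255.255.255.252",
}

config = golden_template.substitute(params)
print(config)
```

The rendered `config` is then what gets pushed to the device over one of the protocols above.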

Pitfalls of Network Programmability

Any system that attempts network programmability must offer at least these three capabilities. But when looking at such systems, there are a few pitfalls to avoid.

When selecting a network vendor, based on some of the elements mentioned before, there are a few things to consider:

  • Are the vendor’s models complete?

  • Does the vendor offer good support for its models?

  • How widely adopted and how open are the vendor’s models?

  • Does the vendor support OpenConfig / other standard models?

  • What are the supported protocols to configure the network equipment? 

  • What is the typical upgrade cadence?

The goal is to understand how much you should invest in automation. If the answers aren't good enough, you'll constantly have to update the automation you build. This turns the problem into a software development one rather than a network operations one, which won't unlock the expected benefits.

If you plan on doing Java (or other JVM-based) development to control the network and ingest vendor-provided YANG models into your application, be ready to have a solid continuous integration/continuous delivery (CI/CD) process. This enables you to release your application at a high cadence and adjust to use cases and bugs as they arise.

A recommendation is to separate the content used to program the network from the platform enabling provisioning of the network. That way each has its own life cycle, and the content can evolve at a rapid cadence while the more stable platform stays untouched.

What technology to use

The goal of network programmability is to address a need by dynamically changing the current state of the network. This means three things for the software applying that change: 

  1. Ensure the current state is valid before applying the change.

  2. Apply the change (and cross your fingers it works).

  3. Validate the change was successful and no surrounding functionalities have been impacted.

Overall, this requires a pre-check and a post-check, hinting at the need for a workflow engine to sequence the various steps.
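The three steps above can be sketched as a minimal workflow. The check, apply, and rollback callables are placeholders for real device interactions (a NETCONF edit-config, show-command parsing, and so on); here a dict stands in for the device's running state.

```python
# Sketch: sequencing a network change with validation before and after.
def run_change(pre_check, apply_change, post_check, rollback):
    """Pre-check, apply, post-check; roll back if the intent isn't met."""
    if not pre_check():
        return "aborted: pre-check failed"
    apply_change()
    if not post_check():
        rollback()   # restore the last known-good state
        return "rolled back: post-check failed"
    return "success"

# Toy state standing in for a device's running configuration.
state = {"mtu": 1500}

result = run_change(
    pre_check=lambda: state["mtu"] == 1500,       # current state is as expected
    apply_change=lambda: state.update(mtu=9000),  # the change itself
    post_check=lambda: state["mtu"] == 9000,      # verify the intent is met
    rollback=lambda: state.update(mtu=1500),
)
print(result)   # -> success
```

A workflow engine generalizes this pattern: each step becomes a task, with retries, timeouts, and human approval gates where needed.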

In addition, as a developer you will need a programming/scripting language; Python and Ansible are the ones that have seen the best traction for network provisioning. They provide proven and widely adopted libraries/modules, such as Jinja for templating and ncclient for the NETCONF protocol, which greatly simplify the development process.

The development technologies you need likely already exist in open source and have decent communities sharing around them.

There has also been a lot of activity in the networking space within open source communities, whether building full-blown network orchestration platforms, such as ONAP and Open Source MANO, or simply providing programmable hardware interfaces to boost customizability, such as P4 and, more recently, eBPF (which is gaining momentum through its adoption in Cilium, a container network interface plugin for Kubernetes).

Finally, you can use the GitOps methodology: as you separate the content (templates, scripts, etc.) from the platform (Python/Ansible executors, workflow engine, etc.), the content gets its own life cycle. This method helps with version control and centralization of your code, and provides the source of truth and trigger for network provisioning actions.

Closing remarks

The further you dig into the subject of network programmability, the denser it gets, with many challenges to solve. The trend is heading towards autonomous networks: systems able to program the network, track its current state, and remediate when the state is not as expected.

But that latter part, "remediate when the state is not as expected," is difficult to achieve, and is currently driving work in artificial intelligence and machine learning. Kubernetes could make it easier to create a control plane that manages the life cycle of defined resources (thanks to the Operator Lifecycle Manager framework), and could become another way to drive network automation.

The components required for a successful network programmability journey don’t stop here. You will also need an inventory system to track network elements and provide information on how to access them, a monitoring system to analyze the actual state of the network, and a policy system to identify what remediation action should be taken in case of a detected event.

A foundational layer for network programmability that provides a workflow engine and inventory management, among other features, is Red Hat Ansible Automation Platform. It has seen a lot of traction in the Network Automation space. Check out our Network Automation use cases to learn more about how you can modernize your network with Red Hat.


About the author

Alexis de Talhouët is a Senior Solutions Architect at Red Hat, supporting North America telecommunication companies. He has extensive experience in the areas of software architecture and development, release engineering and deployment strategy, hybrid multi-cloud governance, and network automation and orchestration.