Kubernetes labels—the metadata tags attached to Kubernetes resources and objects—can be a cure to several headaches DevOps teams encounter. By using Kubernetes labels correctly, DevOps teams can more quickly troubleshoot issues along the application development pipeline, apply configuration changes en masse, and solve cost monitoring, allocation, and management challenges.
Using Kubernetes labels effectively benefits from an understanding of tagging methods, labeling use cases, and other recommended practices I will describe below.
Understanding Kubernetes tagging methods
Kubernetes includes two methods for tagging metadata to objects to organize cluster resources: labels and annotations.
Labels
Kubernetes labels are key-value pairs that can connect identifying metadata with Kubernetes objects. Kubernetes offers integrated support for using these labels to query objects and perform bulk operations on selected subsets. Many organizations use Kubernetes labels to share information within DevOps teams, such as labeling the owner of a particular pod or deployment.
Creating labels is straightforward:
"metadata": {
"labels": {
"key1" : "value1",
"key2" : "value2",
"key3" : "value3"
}
}
Annotations
Annotations are also key-value pairs, but they connect non-identifying metadata with objects. Kubernetes selectors do not work on annotations, so you do not use annotations for queries or to perform operations on object subsets.
Creating annotations is also straightforward:
"metadata": {
"annotations": {
"key1" : "value1",
"key2" : "value2",
"key3" : "value3"
}
}
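To show where the two fragments above sit in a complete object, here is a minimal sketch in Python that builds a pod manifest carrying both labels and annotations and prints it as JSON. The pod name, label values, annotation text, and container image are all hypothetical.

```python
import json

# Hypothetical pod manifest; every name and value here is illustrative.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "web-frontend",
        # Identifying metadata: usable in selectors and bulk operations.
        "labels": {"environment": "dev", "owner": "team-payments"},
        # Non-identifying metadata: free-form notes for people and tools.
        "annotations": {"description": "Front end for the checkout flow"},
    },
    "spec": {"containers": [{"name": "web", "image": "nginx"}]},
}

manifest_json = json.dumps(pod, indent=2)
print(manifest_json)
```

Saved to a file, output like this could be passed to kubectl apply -f, which accepts JSON as well as YAML.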
Understanding why and when to use Kubernetes labels
Organizations mainly use Kubernetes labels for grouping resources for queries or enabling bulk operations. I'll describe each below.
Grouping resources for queries
Applying a certain label to resources makes it easy to view them all with a quick query.
Take an example scenario where a DevOps team needs to quickly learn why a development environment is unavailable. If all pods for that environment include the label environment: dev, this kubectl command can immediately find their status:
kubectl get pods -l 'environment in (dev)'
NAME                            READY   STATUS         RESTARTS   AGE
cert-manager-6588898cb4-25n79   0/1     ErrImagePull   0          3d2h
Without proper labeling, the DevOps team would likely need to run a general kubectl get pods command and then painstakingly search through its output with grep. Advanced users may know tricks with jsonpath or Go templating to make that easier, but those are advanced skills. With the label query, the DevOps team in this example quickly sees that a dev pod encountered an issue pulling an image, which enables a faster resolution.
Enabling bulk operations
You can use label selectors to perform bulk operations. Take a scenario where a team deletes all dev and staging environments each night to reduce compute expenses. Kubernetes labels make it possible to automate that activity. For example, this command deletes all Deployments, Services, and StatefulSets labeled environment: dev or environment: sit:
kubectl delete deployment,services,statefulsets -l 'environment in (dev,sit)'
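If the nightly cleanup runs from a script, the selector and command can be assembled programmatically. The sketch below only builds the argument list for the command shown above and does not execute anything. The helper names set_selector and delete_command are hypothetical.

```python
def set_selector(key, values):
    """Build a set-based selector such as 'environment in (dev,sit)'."""
    if not values:
        raise ValueError("at least one value is required")
    return f"{key} in ({','.join(values)})"


def delete_command(kinds, selector):
    """Return the kubectl argument list; nothing is executed here."""
    return ["kubectl", "delete", ",".join(kinds), "-l", selector]


cmd = delete_command(
    ["deployment", "services", "statefulsets"],
    set_selector("environment", ["dev", "sit"]),
)
print(cmd)
```

A scheduled job could hand cmd to subprocess.run(cmd, check=True). Passing a list rather than a single string avoids shell quoting problems with the parentheses in the selector.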
8 Kubernetes labeling best practices to follow
The following tactics help DevOps teams optimize the benefits of Kubernetes labels and avoid common labeling errors.
1. Use correct syntax
A Kubernetes label is a key-value pair, and the key takes the format <prefix>/<name>.
The prefix is optional and must be a valid DNS subdomain (such as "company.com"). A key without a prefix is presumed to be private to the user, so automated components such as kubectl and kube-scheduler add a prefix to the labels they set. Applications installed with Helm usually include prefixes on their label keys. Prefixes also enable the use of multiple labels that would otherwise conflict, such as those in third-party packages. If you aren't distributing resources outside the company, you can usually skip the prefix without running into package conflicts.
The name is the arbitrary property name of the label. For example, you can use the name environment with label values production, testing, and development to label environment types effectively. A name can be up to 63 characters long and supports alphanumeric characters, dashes, underscores, and dot characters. One stipulation is that the first and last character must be alphanumeric (unless empty).
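The syntax rules above can be checked before a manifest ever reaches the cluster. The following is a minimal sketch of such a check, not the validation code Kubernetes itself uses. The function names are hypothetical, and the 253-character limit on the prefix comes from the Kubernetes documentation rather than from this article.

```python
import re

# Name and value: alphanumeric at both ends; dashes, underscores, dots inside.
NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?")
# Prefix: lowercase DNS subdomain, for example "company.com".
DNS_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
PREFIX_RE = re.compile(rf"{DNS_LABEL}(\.{DNS_LABEL})*")


def valid_key(key):
    """Check a label key of the form [<prefix>/]<name>."""
    prefix, sep, name = key.rpartition("/")
    if sep:
        # A slash is present, so the prefix must be a valid DNS subdomain.
        if not prefix or len(prefix) > 253 or not PREFIX_RE.fullmatch(prefix):
            return False
    return 0 < len(name) <= 63 and NAME_RE.fullmatch(name) is not None


def valid_value(value):
    """Check a label value; an empty value is allowed."""
    return value == "" or (
        len(value) <= 63 and NAME_RE.fullmatch(value) is not None
    )


print(valid_key("company.com/environment"), valid_value("production"))
```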
2. Know your label-selection options
You can select labeled objects with equality-based or set-based selectors.
Equality-based selections let you retrieve objects with labels equal or not equal to a certain value (or values). In the syntax, = and == represent equality, and != represents inequality. You can include multiple labels separated by commas, in which case all conditions need to match. For example, selecting environment=dev,release=nightly finds all resources that include both those labels.
Set-based selections let you find resources with multiple values at once. Sets are much like the IN keyword in SQL. For example, environment in (dev,uat) selects resources labeled with the name environment and values dev or uat.
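To make the matching rules concrete, here is a small Python model of how equality-based and set-based requirements combine. It is a sketch of the semantics only, not how the Kubernetes API server evaluates selectors, and the pod names and labels are made up for illustration.

```python
def matches(labels, equals=None, not_equals=None, in_sets=None):
    """Return True when labels satisfy every requirement (logical AND)."""
    for key, value in (equals or {}).items():
        if labels.get(key) != value:
            return False
    for key, value in (not_equals or {}).items():
        # As in Kubernetes, != also matches objects that lack the key.
        if labels.get(key) == value:
            return False
    for key, values in (in_sets or {}).items():
        # The key must be present and its value must be in the set.
        if labels.get(key) not in values:
            return False
    return True


# Hypothetical pods and their labels.
pods = {
    "api-1": {"environment": "dev", "release": "nightly"},
    "api-2": {"environment": "uat", "release": "stable"},
    "db-1": {"environment": "production"},
}


def select(**requirements):
    return [name for name, labels in pods.items() if matches(labels, **requirements)]


# environment=dev,release=nightly
print(select(equals={"environment": "dev", "release": "nightly"}))
# environment in (dev,uat)
print(select(in_sets={"environment": {"dev", "uat"}}))
# release!=nightly
print(select(not_equals={"release": "nightly"}))
```

The last query also returns db-1, which has no release label at all. That behavior is easy to overlook when writing cleanup commands with !=.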
3. Use Kubernetes-recommended labels
Kubernetes offers a list of recommended labels for grouping resource objects. The prefix app.kubernetes.io differentiates these recommended labels from your own company.com custom labels. For example, app.kubernetes.io/name, app.kubernetes.io/instance, and app.kubernetes.io/component labels are recommended to represent application names, instances, and components, respectively.
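One way to apply the recommended keys consistently is to generate them from a single helper instead of typing them by hand. This is a sketch only: the helper name, the sample values, and the company.com/owner key are hypothetical.

```python
def recommended_labels(name, instance, component):
    """Return the three recommended labels named above as a dict."""
    return {
        "app.kubernetes.io/name": name,
        "app.kubernetes.io/instance": instance,
        "app.kubernetes.io/component": component,
    }


# Illustrative values for a database deployment.
labels = recommended_labels("mysql", "mysql-abcxzy", "database")
# A custom label under your own prefix sits alongside the recommended ones.
labels["company.com/owner"] = "team-data"
print(labels)
```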
4. Standardize label naming conventions across your organization
Every team using Kubernetes resources needs to be on the same page by following strict labeling conventions, or the system can fall apart. Your development pipeline should perform static code analysis against resource config files to verify the presence of all required labels. When labels are improperly applied, automated processes may fail, and any monitoring tools you're using may provide false-positive alerts.
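A static check of this kind can be quite small. The sketch below assumes a hypothetical convention: three required labels and a fixed set of allowed values for two of them. Real pipelines would load the labels from resource config files; here a plain dict stands in for them.

```python
# Hypothetical organization-wide convention.
REQUIRED = {"environment", "release", "owner"}
ALLOWED_VALUES = {
    "environment": {"dev", "staging", "production"},
    "release": {"nightly", "beta", "stable"},
}


def lint_labels(labels):
    """Return a list of human-readable problems; empty means compliant."""
    problems = [
        f"missing required label: {key}"
        for key in sorted(REQUIRED - set(labels))
    ]
    for key, allowed in sorted(ALLOWED_VALUES.items()):
        value = labels.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key}={value} is not one of {sorted(allowed)}")
    return problems


# "prod" instead of "production" is the kind of drift this catches.
print(lint_labels({"environment": "prod", "owner": "team-a"}))
```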
5. Add required labels to pod templates
Including labels you consider essential in pod templates enables Kubernetes controllers to create pods with the consistent states you specify. These templates are part of workload resources, like Deployments and DaemonSets. You can begin with a small list of required labels in the template, such as environment, release, and owner.
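The labels that end up on pods come from the workload's spec.template.metadata.labels field, not from the workload's own metadata. The following sketch inspects that field in a Deployment represented as a Python dict. The Deployment contents, the image name, and the helper name are hypothetical.

```python
REQUIRED_LABELS = ("environment", "release", "owner")

# Hypothetical Deployment; only the fields relevant to labeling are shown.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "checkout"},
    "spec": {
        "selector": {"matchLabels": {"app": "checkout"}},
        "template": {
            "metadata": {
                "labels": {
                    "app": "checkout",
                    "environment": "dev",
                    "release": "nightly",
                    "owner": "team-payments",
                }
            },
            "spec": {
                "containers": [
                    {"name": "checkout", "image": "example.com/checkout:1.0"}
                ]
            },
        },
    },
}


def missing_template_labels(workload):
    """List required labels absent from the workload's pod template."""
    labels = workload["spec"]["template"]["metadata"].get("labels", {})
    return [key for key in REQUIRED_LABELS if key not in labels]


print(missing_template_labels(deployment))
```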
6. Do a lot of labeling
Thoroughness in labeling objects meaningfully increases your infrastructure visibility. Imagine scenarios where it's important to rapidly identify a process that is eating up resources or debug a critical issue. The more labeling you have in place, the easier it is to find precisely what you're looking for in Kubernetes.
7. Label cross-cutting concerns
It's valuable to label cross-cutting concerns in line with your organization's needs. Labels are intended to include cross-cutting metadata. For example, I recommend labeling the environment (dev, staging, production), service tiers (free, pro), release channels (nightly, stable, beta), specific tenants in multi-tenant scenarios, and appropriate support contacts. Kubernetes experts recommend using annotations for data like this, but I find keeping everything in labels to be useful.
8. Automate labeling
Within your continuous integration/continuous delivery (CI/CD) pipeline, you can automate some labels for cross-cutting concerns. Attaching labels automatically with CD tooling ensures consistency and spares developer effort. CI jobs should also enforce proper labeling by making a build fail and notifying the responsible team if a label is missing.
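As a rough illustration of both halves of this practice, the sketch below stamps pipeline-supplied labels onto a manifest and then computes an exit code a CI job could return. The function names, the required label list, and the sample Service are assumptions made for the example.

```python
import copy

REQUIRED = ("environment", "release", "owner")  # hypothetical policy


def with_pipeline_labels(manifest, pipeline_labels):
    """Return a copy of the manifest with pipeline labels merged in."""
    stamped = copy.deepcopy(manifest)
    labels = stamped.setdefault("metadata", {}).setdefault("labels", {})
    for key, value in pipeline_labels.items():
        labels.setdefault(key, value)  # labels already set by hand win
    return stamped


def ci_exit_code(manifests):
    """Return 1 if any manifest lacks a required label, otherwise 0."""
    failed = False
    for manifest in manifests:
        metadata = manifest.get("metadata", {})
        labels = metadata.get("labels", {})
        missing = [key for key in REQUIRED if key not in labels]
        if missing:
            print(f"{metadata.get('name', '<unnamed>')}: missing labels {missing}")
            failed = True
    return 1 if failed else 0


service = {
    "kind": "Service",
    "metadata": {"name": "checkout", "labels": {"owner": "team-payments"}},
}
stamped = with_pipeline_labels(service, {"environment": "dev", "release": "nightly"})
print(ci_exit_code([service]), ci_exit_code([stamped]))
```

A CI step could pass the result to sys.exit so that a missing label fails the build.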
3 Kubernetes labeling practices to avoid
When Kubernetes labeling initiatives go awry, it is often because of one or more of the following reasons.
1. Using labels to store data that often changes (without a good reason)
Labels shouldn't store frequently changing data. An example is tracking the size of a database by storing the number of rows as a label. Unless that database gets updated only at fixed times, you don't want to do this.
2. Storing application-level semantics
While Kubernetes labels can join resource objects with metadata, they aren't meant to act as a data store for applications. Because Kubernetes resources are often used for only a short time and are loosely associated with applications, labels soon become out of sync.
3. Getting loose with label names
Strict labeling conventions are a best practice for a reason. Loose label naming significantly increases the time and difficulty of querying the information you're looking for.
Get labeling
With labels, Kubernetes provides powerful capabilities to achieve infrastructure visibility, perform efficient operations, and respond quickly to issues. Organizations and their DevOps teams can leverage these labeling features and realize tremendous benefits by following best practices. Think about what labels you might add or which tools you would use to query such labeled resources to gain these advantages.