There are lots of formats for configuration files: a list of values, key and value pairs, INI files, YAML, JSON, XML, and many more. Of these, YAML sometimes gets cited as a particularly difficult one to handle for a few different reasons. While its ability to reflect hierarchical values is significant and its minimalism can be refreshing to some, its Python-like reliance upon syntactic whitespace can be frustrating.
However, the open source world is diverse and flexible enough that no one has to suffer through abrasive technology, so if you hate YAML, here are 10 things you can (and should!) do to make it tolerable. Starting with zero, as any sensible index should.
0. Make your editor do the work
Whatever text editor you use probably has plugins to make dealing with syntax easier. If you're not using a YAML plugin for your editor, find one and install it. The effort you spend on finding a plugin and configuring it as needed will pay off tenfold the very next time you edit YAML.
For example, the Atom editor comes with a YAML mode by default, and while GNU Emacs ships with minimal support, you can add additional packages like yaml-mode to help.
If your favorite text editor lacks a YAML mode, you can address some of your grievances with small configuration changes. For instance, the default text editor for the GNOME desktop, Gedit, doesn't have a YAML mode available, but it does provide YAML syntax highlighting by default and features configurable tab width:
With the drawspaces
Gedit plugin package, you can make white space visible in the form of leading dots, removing any question about levels of indentation.
Take some time to research your favorite text editor. Find out what the editor, or its community, does to make YAML easier, and leverage those features in your work. You won't be sorry.
1. Use a linter
Ideally, programming languages and markup languages use predictable syntax. Computers tend to do well with predictability, so the concept of a linter was invented in 1978. If you're not using a linter for YAML, then it's time to adopt this 40-year-old tradition and use yamllint
.
You can install yamllint
on Linux using your distribution's package manager. For instance, on Red Hat Enterprise Linux 8 or Fedora:
$ sudo dnf install yamllint
Invoking yamllint
is as simple as telling it to check a file. Here's an example of yamllint
's response to a YAML file containing an error:
$ yamllint errorprone.yaml
errorprone.yaml
23:10 error syntax error: mapping values are not allowed here
23:11 error trailing spaces (trailing-spaces)
That's not a time stamp on the left. It's the error's line and column number. You may or may not understand what error it's talking about, but now you know the error's location. Taking a second look at the location often makes the error's nature obvious. Success is eerily silent, so if you want feedback based on the lint's success, you can add a conditional second command with a double-ampersand (&&
). In a POSIX shell, &&
fails if a command returns anything but 0, so upon success, your echo
command makes that clear. This tactic is somewhat superficial, but some users prefer the assurance that the command did run correctly, rather than failing silently. Here's an example:
$ yamllint perfect.yaml && echo "OK"
OK
The reason yamllint
is so silent when it succeeds is that it returns 0 errors when there are no errors.
2. Write in Python, not YAML
If you really hate YAML, stop writing in YAML, at least in the literal sense. You might be stuck with YAML because that's the only format an application accepts, but if the only requirement is to end up in YAML, then work in something else and then convert. Python, along with the excellent pyyaml library, makes this easy, and you have two methods to choose from: self-conversion or scripted.
Self-conversion
In the self-conversion method, your data files are also Python scripts that produce YAML. This works best for small data sets. Just write your JSON data into a Python variable, prepend an import
statement, and end the file with a simple three-line output statement.
#!/usr/bin/python3
import yaml
d={
"glossary": {
"title": "example glossary",
"GlossDiv": {
"title": "S",
"GlossList": {
"GlossEntry": {
"ID": "SGML",
"SortAs": "SGML",
"GlossTerm": "Standard Generalized Markup Language",
"Acronym": "SGML",
"Abbrev": "ISO 8879:1986",
"GlossDef": {
"para": "A meta-markup language, used to create markup languages such as DocBook.",
"GlossSeeAlso": ["GML", "XML"]
},
"GlossSee": "markup"
}
}
}
}
}
f=open('output.yaml','w')
f.write(yaml.dump(d))
f.close
Run the file with Python to produce a file called output.yaml
file.
$ python3 ./example.json
$ cat output.yaml
glossary:
GlossDiv:
GlossList:
GlossEntry:
Abbrev: ISO 8879:1986
Acronym: SGML
GlossDef:
GlossSeeAlso: [GML, XML]
para: A meta-markup language, used to create markup languages such as DocBook.
GlossSee: markup
GlossTerm: Standard Generalized Markup Language
ID: SGML
SortAs: SGML
title: S
title: example glossary
This output is perfectly valid YAML, although yamllint
does issue a warning that the file is not prefaced with ---
, which is something you can adjust either in the Python script or manually.
Scripted conversion
In this method, you write in JSON and then run a Python conversion script to produce YAML. This scales better than self-conversion, because it keeps the converter separate from the data.
Create a JSON file and save it as example.json
. Here is an example from json.org:
{
"glossary": {
"title": "example glossary",
"GlossDiv": {
"title": "S",
"GlossList": {
"GlossEntry": {
"ID": "SGML",
"SortAs": "SGML",
"GlossTerm": "Standard Generalized Markup Language",
"Acronym": "SGML",
"Abbrev": "ISO 8879:1986",
"GlossDef": {
"para": "A meta-markup language, used to create markup languages such as DocBook.",
"GlossSeeAlso": ["GML", "XML"]
},
"GlossSee": "markup"
}
}
}
}
}
Create a simple converter and save it as json2yaml.py
. This script imports both the YAML and JSON Python modules, loads a JSON file defined by the user, performs the conversion, and then writes the data to output.yaml
.
#!/usr/bin/python3
import yaml
import sys
import json
OUT=open('output.yaml','w')
IN=open(sys.argv[1], 'r')
JSON = json.load(IN)
IN.close()
yaml.dump(JSON, OUT)
OUT.close()
Save this script in your system path, and execute as needed:
$ ~/bin/json2yaml.py example.json
3. Parse early, parse often
Sometimes it helps to look at a problem from a different angle. If your problem is YAML, and you're having a difficult time visualizing the data's relationships, you might find it useful to restructure that data, temporarily, into something you're more familiar with.
If you're more comfortable with dictionary-style lists or JSON, for instance, you can convert YAML to JSON in two commands using an interactive Python shell. Assume your YAML file is called mydata.yaml
.
$ python3
>>> f=open('mydata.yaml','r')
>>> yaml.load(f)
{'document': 34843, 'date': datetime.date(2019, 5, 23), 'bill-to': {'given': 'Seth', 'family': 'Kenlon', 'address': {'street': '51b Mornington Road\n', 'city': 'Brooklyn', 'state': 'Wellington', 'postal': 6021, 'country': 'NZ'}}, 'words': 938, 'comments': 'Good article. Could be better.'}
There are many other examples, and there are plenty of online converters and local parsers, so don't hesitate to reformat data when it starts to look more like a laundry list than markup.
4. Read the spec
After I've been away from YAML for a while and find myself using it again, I go straight back to yaml.org to re-read the spec. If you've never read the specification for YAML and you find YAML confusing, a glance at the spec may provide the clarification you never knew you needed. The specification is surprisingly easy to read, with the requirements for valid YAML spelled out with lots of examples in chapter 6.
5. Pseudo-config
Before I started writing my book, Developing Games on the Raspberry Pi, Apress, 2019, the publisher asked me for an outline. You'd think an outline would be easy. By definition, it's just the titles of chapters and sections, with no real content. And yet, out of the 300 pages published, the hardest part to write was that initial outline.
YAML can be the same way. You may have a notion of the data you need to record, but that doesn't mean you fully understand how it's all related. So before you sit down to write YAML, try doing a pseudo-config instead.
A pseudo-config is like pseudo-code. You don't have to worry about structure or indentation, parent-child relationships, inheritance, or nesting. You just create iterations of data in the way you currently understand it inside your head.
Once you've got your pseudo-config down on paper, study it, and transform your results into valid YAML.
6. Resolve the spaces vs. tabs debate
OK, maybe you won't definitively resolve the spaces-vs-tabs debate, but you should at least resolve the debate within your project or organization. Whether you resolve this question with a post-process sed
script, text editor configuration, or a blood-oath to respect your linter's results, anyone in your team who touches a YAML project must agree to use spaces (in accordance with the YAML spec).
Any good text editor allows you to define a number of spaces instead of a tab character, so the choice shouldn't negatively affect fans of the Tab key.
Tabs and spaces are, as you probably know all too well, essentially invisible. And when something is out of sight, it rarely comes to mind until the bitter end, when you've tested and eliminated all of the "obvious" problems. An hour wasted to an errant tab or group of spaces is your signal to create a policy to use one or the other, and then to develop a fail-safe check for compliance (such as a Git hook to enforce linting).
7. Less is more (or more is less)
Some people like to write YAML to emphasize its structure. They indent vigorously to help themselves visualize chunks of data. It's a sort of cheat to mimic markup languages that have explicit delimiters.
Here's a good example from Ansible's documentation:
# Employee records
- martin:
name: Martin D'vloper
job: Developer
skills:
- python
- perl
- pascal
- tabitha:
name: Tabitha Bitumen
job: Developer
skills:
- lisp
- fortran
- erlang
For some users, this approach is a helpful way to lay out a YAML document, while other users miss the structure for the void of seemingly gratuitous white space.
If you own and maintain a YAML document, then you get to define what "indentation" means. If blocks of horizontal white space distract you, then use the minimal amount of white space required by the YAML spec. For example, the same YAML from the Ansible documentation can be represented with fewer indents without losing any of its validity or meaning:
---
- martin:
name: Martin D'vloper
job: Developer
skills:
- python
- perl
- pascal
- tabitha:
name: Tabitha Bitumen
job: Developer
skills:
- lisp
- fortran
- erlang
8. Make a recipe
I'm a big fan of repetition breeding familiarity, but sometimes repetition just breeds repeated stupid mistakes. Luckily, a clever peasant woman experienced this very phenomenon back in 396 AD (don't fact-check me), and invented the concept of the recipe.
If you find yourself making YAML document mistakes over and over, you can embed a recipe or template in the YAML file as a commented section. When you're adding a section, copy the commented recipe and overwrite the dummy data with your new real data. For example:
---
# - <common name>:
# name: Given Surname
# job: JOB
# skills:
# - LANG
- martin:
name: Martin D'vloper
job: Developer
skills:
- python
- perl
- pascal
- tabitha:
name: Tabitha Bitumen
job: Developer
skills:
- lisp
- fortran
- erlang
9. Use something else
I'm a fan of YAML, generally, but sometimes YAML isn't the answer. If you're not locked into YAML by the application you're using, then you might be better served by some other configuration format. Sometimes config files outgrow themselves and are better refactored into simple Lua or Python scripts.
YAML is a great tool and is popular among users for its minimalism and simplicity, but it's not the only tool in your kit. Sometimes it's best to part ways. One of the benefits of YAML is that parsing libraries are common, so as long as you provide migration options, your users should be able to adapt painlessly.
If YAML is a requirement, though, keep these tips in mind and conquer your YAML hatred once and for all!
About the author
Seth Kenlon is a Linux geek, open source enthusiast, free culture advocate, and tabletop gamer. Between gigs in the film industry and the tech industry (not necessarily exclusive of one another), he likes to design games and hack on code (also not necessarily exclusive of one another).
Browse by channel
Automation
The latest on IT automation for tech, teams, and environments
Artificial intelligence
Updates on the platforms that free customers to run AI workloads anywhere
Open hybrid cloud
Explore how we build a more flexible future with hybrid cloud
Security
The latest on how we reduce risks across environments and technologies
Edge computing
Updates on the platforms that simplify operations at the edge
Infrastructure
The latest on the world’s leading enterprise Linux platform
Applications
Inside our solutions to the toughest application challenges
Original shows
Entertaining stories from the makers and leaders in enterprise tech
Products
- Red Hat Enterprise Linux
- Red Hat OpenShift
- Red Hat Ansible Automation Platform
- Cloud services
- See all products
Tools
- Training and certification
- My account
- Customer support
- Developer resources
- Find a partner
- Red Hat Ecosystem Catalog
- Red Hat value calculator
- Documentation
Try, buy, & sell
Communicate
About Red Hat
We’re the world’s leading provider of enterprise open source solutions—including Linux, cloud, container, and Kubernetes. We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.
Select a language
Red Hat legal and privacy links
- About Red Hat
- Jobs
- Events
- Locations
- Contact Red Hat
- Red Hat Blog
- Diversity, equity, and inclusion
- Cool Stuff Store
- Red Hat Summit