Easing into automation with Ansible
At the end of 2015 and the beginning of 2016, we decided to use Red Hat Enterprise Linux (RHEL) as our third operating system, next to Solaris and Microsoft Windows. I was part of the team that tested RHEL, among other distributions, and that would later operate the new OS. Anticipating a fast-growing number of RHEL systems, I realized that I needed a tool to automate things, because without automation the number of hosts I can manage is limited.
I had experience with Puppet back in the day but did not like that tool because of its complexity. We had more modules and classes than hosts to manage back then. So, in July 2016, I took a look at Ansible.
What I liked about Ansible, and still do, is that it is push-based. On a target node, only Python and SSH access are needed to control the node and push configuration settings to it. No agent needs to be removed if you decide that Ansible isn’t the right tool for you. The YAML syntax is easy to read and write, and the option to use playbooks as well as ad hoc commands makes Ansible a flexible solution that helps save time in our day-to-day business. So, at the end of 2016, we decided to evaluate Ansible in our environment.
As a rule of thumb, you should begin automating things that you have to do on a daily or at least a regular basis. That way, automation saves time for more interesting or more important things. I followed this rule by using Ansible for the following tasks:
- Set a baseline configuration for newly provisioned hosts (set DNS, time, network, sshd, etc.).
- Set up patch management to install Red Hat Security Advisories (RHSAs).
- Test how useful the ad hoc commands are, and where we could benefit from them.
Baseline Ansible configuration
For us, baseline configuration is the configuration every newly provisioned host gets. This practice makes sure the host fits into our environment and is able to communicate on the network. Because the same configuration steps have to be made for each new host, this is a great place to get started with automation.
The following are the tasks I started with:
- Register Red Hat Enterprise Linux and attach a subscription with Ansible
- Configure DNS with Ansible
- Synchronize time across hosts
- Configure the repos our hosts would use
- Make sure a certain set of packages is installed
- Configure Postfix to be able to send mail in our environment
- Configure firewalld
- Configure SELinux
(Some of these steps are already published here on Enable Sysadmin, and others might follow soon.)
All of these tasks have in common that they are small and easy to start with, letting you gather experience with using different kinds of Ansible modules, roles, variables, and so on. You can run each of these roles and tasks standalone, or tie them all together in one playbook that sets the baseline for your newly provisioned system.
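To give an idea of what "tying them all together in one playbook" can look like, here is a minimal sketch covering just two of the steps above (packages and services). The file name, the `new_hosts` group, and the package and service lists are assumptions made for illustration, not the configuration described in this article. The fully qualified module names assume a reasonably recent Ansible release.

```yaml
---
# baseline.yml (hypothetical file name)
# The host group, package list, and service list below are illustrative
# assumptions; adapt them to your own environment.
- name: Apply a baseline configuration to newly provisioned hosts
  hosts: new_hosts            # assumed inventory group
  become: true
  vars:
    baseline_packages:        # assumed package set
      - chrony
      - postfix
      - firewalld
    baseline_services:        # assumed services to keep running
      - chronyd
      - postfix
      - firewalld
  tasks:
    - name: Make sure the baseline packages are installed
      ansible.builtin.package:
        name: "{{ baseline_packages }}"
        state: present

    - name: Make sure the baseline services are enabled and running
      ansible.builtin.service:
        name: "{{ item }}"
        state: started
        enabled: true
      loop: "{{ baseline_services }}"
```

You could run such a playbook with `ansible-playbook -i <INVENTORY> baseline.yml`. As the playbook grows, each step (DNS, time, Postfix, and so on) is a natural candidate for its own role.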
Red Hat Enterprise Linux Server patch management with Ansible
As I explained on my GitHub page for ansible-role-rhel-patchmanagement, in our environment, we deploy Red Hat Enterprise Linux Servers for our operating departments to run their applications.
This role was written to provide a mechanism to install Red Hat Security Advisories on target nodes once a month. In our particular use case, only RHSAs are installed, to ensure a minimum security level. The installation is enforced once a month. The advisories are grouped into "Patch-Sets." This ensures that the same advisories are used for all stages during a patch cycle.
The nodes in the Ansible inventory are assigned to one of the following groups, each of which defines when a node is scheduled for patch installation:
- [rhel-patch-phase1] - On the second Tuesday of a month.
- [rhel-patch-phase2] - On the third Tuesday of a month.
- [rhel-patch-phase3] - On the fourth Tuesday of a month.
- [rhel-patch-phase4] - On the fourth Wednesday of a month.
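As an illustration, the four groups above could be laid out in a YAML inventory like the following sketch. The group names come from the list above. The host names are placeholders, and the file name is an assumption. Note that newer Ansible releases may warn about hyphens in group names.

```yaml
---
# inventory.yml (hypothetical file name); all host names are placeholders.
all:
  children:
    rhel-patch-phase1:        # patched on the second Tuesday of a month
      hosts:
        host01.example.com:
    rhel-patch-phase2:        # patched on the third Tuesday of a month
      hosts:
        host02.example.com:
    rhel-patch-phase3:        # patched on the fourth Tuesday of a month
      hosts:
        host03.example.com:
    rhel-patch-phase4:        # patched on the fourth Wednesday of a month
      hosts:
        host04.example.com:
```

Moving a host to a different patch window then only means moving one line to another group.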
If packages were updated on a target node, the host reboots afterward.
Because the production systems are most important, they are divided into two separate groups (phase3 and phase4) to decrease the risk of failure and service downtime due to advisory installation.
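The following is a simplified sketch of a patch run for a single phase, not the role itself. The role works with predefined patch sets, whereas this sketch simply installs whatever security updates are available at run time and reboots only when something changed. The file name is an assumption, and the `security` option and the `reboot` module assume a reasonably recent Ansible release.

```yaml
---
# patch-phase1.yml (hypothetical file name)
# Simplified stand-in for the role: installs all currently available
# security updates instead of a fixed patch set.
- name: Install security updates on the hosts of one patch phase
  hosts: rhel-patch-phase1
  become: true
  tasks:
    - name: Install only updates that are marked as security related
      ansible.builtin.yum:
        name: "*"
        state: latest
        security: true
      register: patch_result

    - name: Reboot the host only if packages were updated
      ansible.builtin.reboot:
      when: patch_result is changed
```

Scheduling such a playbook on the right day of the month is left to whatever runs it, for example cron on the control node.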
You can find more about this role in my GitHub repo: https://github.com/Tronde/ansible-role-rhel-patchmanagement.
Updating and patch management are tasks every sysadmin has to deal with. With these roles, Ansible gets this task done for me every month, and I don’t have to think about it anymore. Only when a system is not reachable, or yum has a problem, do I get an email report telling me to take a look. So far I have been lucky and have not received a single mail report in the last couple of months. (Yes, of course, the system is able to send mail.)
Ad hoc commands
The ability to run ad hoc commands for quick (and dirty) tasks was one of the reasons I chose Ansible. You can use these commands to gather information when you need it, or to get things done without having to write a playbook first.
I used ad hoc commands in cron jobs until I found the time to write playbooks for them. But, with time comes practice, and today I try to use playbooks and roles for every task that has to run more than once.
Here are small examples of ad hoc commands that provide quick information about your nodes.
Query package version
ansible all -m command -a'/usr/bin/rpm -qi <PACKAGE NAME>' | grep 'SUCCESS\|Version'
Query OS release
ansible all -m command -a'/usr/bin/cat /etc/os-release'
Query running kernel version
ansible all -m command -a'/usr/bin/uname -r'
Query DNS servers in use by nodes
ansible all -m command -a'/usr/bin/cat /etc/resolv.conf' | grep 'SUCCESS\|nameserver'
Hopefully, these samples give you an idea of what ad hoc commands can be used for.
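Once a query like these has to run more than once, it can be moved into a playbook. The sketch below is one possible way to do that: instead of calling `uname` or `cat`, it reads the same kind of information from the facts Ansible gathers at the start of a play. The file name is an assumption, and the `ansible_facts` variable assumes a reasonably recent Ansible release.

```yaml
---
# node-info.yml (hypothetical file name)
- name: Report basic information about every node
  hosts: all
  gather_facts: true
  tasks:
    - name: Show OS release, running kernel, and DNS servers in use
      ansible.builtin.debug:
        msg:
          - "OS: {{ ansible_facts['distribution'] }} {{ ansible_facts['distribution_version'] }}"
          - "Kernel: {{ ansible_facts['kernel'] }}"
          - "DNS servers: {{ ansible_facts['dns']['nameservers'] | default([]) | join(', ') }}"
```

Because the values come from gathered facts, no extra command has to be executed on the nodes, and the output has the same structure for every host.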
It’s not hard to start with automation. Just look for small and easy tasks you do every single day, or even more than once a day, and let Ansible do these tasks for you.
Eventually, you will be able to solve more complex tasks as your automation skills grow. But keep things as simple as possible. You gain nothing if you spend three days troubleshooting a playbook that solves a task you could have done in an hour.