
Ansible is a simple and powerful open source automation tool that can streamline many of your IT infrastructure operations. You can automate simple tasks like installing packages, or complex workflows such as deploying a clustered solution with multiple nodes or patching your operating system with many steps. Whether the workflows are simple or complex, you need to integrate appropriate optimization techniques into the Ansible playbook content.
This article covers some of the major optimization methods available in Ansible for speeding up playbook execution.
A task might look simple and still be the reason a playbook runs slowly. You can enable callback plugins such as timer, profile_tasks, and profile_roles to measure how long each task takes and identify which ones are slowing down your plays.
Configure ansible.cfg with the plugins:
[defaults]
inventory = ./hosts
callbacks_enabled = timer, profile_tasks, profile_roles
Now execute the ansible-playbook command:
$ ansible-playbook site.yml
PLAY [Deploying Web Server] ************
TASK [Gathering Facts] **********************
Thursday 23 December 2021 22:55:58 +0800 (0:00:00.055) 0:00:00.055
Thursday 23 December 2021 22:55:58 +0800 (0:00:00.054) 0:00:00.054
ok: [node1]
TASK [Deploy Web service] *******************
Thursday 23 December 2021 22:56:00 +0800 (0:00:01.603) 0:00:01.659
Thursday 23 December 2021 22:56:00 +0800 (0:00:01.603) 0:00:01.658
...<output removed>...
PLAY RECAP **********************************
node1: ok=9 changed=4 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
Playbook run took 0 days, 0 hours, 0 minutes, 14 seconds
Thursday 23 December 2021 22:56:12 +0800 (0:00:00.541) 0:00:14.100 *****
===============================================================================
deploy-web-server : Install httpd and firewalld ------- 5.42s
deploy-web-server : Git checkout ---------------------- 3.40s
Gathering Facts --------------------------------------- 1.60s
deploy-web-server : Enable and Run Firewalld ---------- 0.82s
deploy-web-server : firewalld permitt httpd service --- 0.72s
deploy-web-server : httpd enabled and running --------- 0.55s
deploy-web-server : Set Hostname on Site -------------- 0.54s
deploy-web-server : Delete content & directory -------- 0.52s
deploy-web-server : Create directory ------------------ 0.41s
Deploy Web service ------------------------------------ 0.04s
Thursday 23 December 2021 22:56:12 +0800 (0:00:00.541) 0:00:14.099
=====================================================================
deploy-web-server ------------------------- 12.40s
gather_facts ------------------------------- 1.60s
include_role ------------------------------- 0.04s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
total ------------------------------------- 14.04s
The output details the time it took for each task, role, and so on. This information helps you identify which task takes more time than the others.
When a playbook executes, each play runs a hidden task, called gathering facts, using the setup module. This task collects information about the remote node you're automating and makes the details available under the variable ansible_facts. If your playbook never uses these details, the time spent gathering them is wasted. You can disable this operation by setting gather_facts: False in the play.
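As a minimal sketch, a play with fact gathering turned off could look like the following. The play name and host are taken from the sample output in this article; the single task is a hypothetical placeholder and not part of the original playbook.

```yaml
---
- name: Deploying Web Server
  hosts: node1
  gather_facts: False   # skip the implicit setup task for this play
  tasks:
    # Hypothetical placeholder task; it does not rely on ansible_facts
    - name: Check connectivity
      ansible.builtin.ping:
```

Any task that reads ansible_facts would fail in such a play, so this setting only suits plays that do not depend on gathered facts.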
With gathering facts enabled:
$ time ansible-playbook site.yml
PLAY [Deploying Web Server] *********************
TASK [Gathering Facts] **************************
ok: [node1]
...<output removed>...
PLAY RECAP **************************************
node1: ok=9 changed=4 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
ansible-playbook site.yml 3.03s user 0.93s system 25% cpu 15.526 total
With fact gathering disabled (gather_facts: False), the run finishes slightly faster:
$ time ansible-playbook site.yml
PLAY [Deploying Web Server] ****************
...<output removed>...
PLAY RECAP **************************************
node1: ok=8 changed=4 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
ansible-playbook site.yml 2.96s user 1.00s system 26% cpu 14.992 total
The more nodes you have, the more time you save by disabling fact gathering.
Ansible uses batches for task execution, which are controlled by a parameter called forks. The default value for forks is 5, which means Ansible executes a task on the first five hosts, waits for the task to complete, and then takes the next batch of five hosts, and so on. Once all hosts finish the task, Ansible moves to the next task with a batch of five hosts again.
You can increase the value of forks in ansible.cfg, enabling Ansible to execute a task on more hosts in parallel:
[defaults]
inventory = ./hosts
forks=50
You can also change the value of forks dynamically while executing a playbook by using the --forks option (-f for short):
$ ansible-playbook site.yml --forks 50
A word of warning: The more managed nodes Ansible works on in parallel, the more computing resources (CPU and memory) it uses on the control node. Configure forks appropriately and responsibly, based on the capacity of your Ansible control node.
Establishing a secure shell (SSH) connection is a relatively slow process that runs in the background. The global execution time increases significantly when you have more tasks in a playbook and more managed nodes to execute the tasks.
You can use the ControlMaster and ControlPersist features in ansible.cfg (in the ssh_connection section) to mitigate this issue.
ControlPersist=60s keeps an idle master connection open for 60 seconds so that later tasks can reuse it instead of opening a new connection:
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
By default, Ansible checks and verifies SSH host keys to safeguard against server spoofing and man-in-the-middle attacks. This also consumes time. If your environment contains immutable managed nodes (virtual machines or containers), then the key is different when the host is reinstalled or recreated. You can disable host key checking for such environments by adding the host_key_checking parameter in your ansible.cfg file and setting it to False:
[defaults]
host_key_checking = False
I don't recommend this outside of a controlled environment. Make sure you have a clear understanding of the implications of this action before you use it in critical environments.
When Ansible uses SSH, several SSH operations happen in the background for copying files, scripts, and other execution commands. You can reduce the number of SSH connections by enabling the pipelining parameter (it's disabled by default) in ansible.cfg:
# ansible.cfg
[ssh_connection]
pipelining = True
By default, Ansible waits for every host to finish a task before moving to the next task, which is called the linear strategy.
If you don't have dependencies on tasks or managed nodes, you can change strategy to free, which allows Ansible to execute tasks on managed hosts until the end of the play without waiting for other hosts to finish their tasks:
- hosts: production_servers
  strategy: free
  tasks:
You can develop or use more strategy plugins as needed, such as Mitogen, which uses Python-based executions and connections.
When a task executes, Ansible waits for it to complete before closing the connection to the managed node. This can become a bottleneck when you have tasks with longer execution times (such as disk backups, package installation, and so on) because it increases global execution time. You can run such a task in async mode so that Ansible starts it in the background instead of holding the connection open. With a positive poll value, as in the following example, Ansible checks on the task at that interval until it finishes or the async time limit expires. If the following tasks do not depend on the long-running task, set poll to 0 to tell Ansible not to wait and to proceed with the next tasks:
---
- name: Async Demo
  hosts: nodes
  tasks:
    - name: Initiate custom snapshot
      shell: "/opt/diskutils/snapshot.sh init"
      async: 120  # Maximum allowed time in seconds
      poll: 5     # Polling interval in seconds
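To let the play continue while the snapshot runs, one option is to set poll to 0 and check on the job later with the async_status module. The following is a sketch under assumptions, not the author's playbook: the variable names snapshot_job and snapshot_result, the placeholder debug task, and the retry values are all hypothetical.

```yaml
---
- name: Async fire-and-forget sketch
  hosts: nodes
  tasks:
    - name: Start the snapshot without waiting for it
      ansible.builtin.shell: "/opt/diskutils/snapshot.sh init"
      async: 120               # maximum allowed time in seconds
      poll: 0                  # do not wait; move on to the next task
      register: snapshot_job   # hypothetical variable name

    # Hypothetical placeholder for work that does not depend on the snapshot
    - name: Run independent tasks in the meantime
      ansible.builtin.debug:
        msg: "Snapshot is running in the background"

    - name: Wait for the snapshot to finish
      ansible.builtin.async_status:
        jid: "{{ snapshot_job.ansible_job_id }}"
      register: snapshot_result   # hypothetical variable name
      until: snapshot_result.finished
      retries: 24                 # 24 checks, 5 seconds apart (assumed values)
      delay: 5
```

Checking the job status at the end keeps a failed snapshot from going unnoticed, which a bare poll: 0 task would not report.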
The global execution time of Ansible playbooks relies on multiple configurations. You can do your infrastructure a favor by finding the best combination of configuration parameters for your needs.
This isn't a complete list, of course. You can use many other parameters to control and optimize Ansible playbook execution, such as serial, throttle, run_once, and more. Refer to the documentation to learn more and apply the settings based on your Ansible environment.
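As a rough illustration of those three keywords, the sketch below combines them in one play. The group name webservers, the script path, and the chosen values are hypothetical and only show where each keyword goes.

```yaml
---
- name: Rolling update sketch
  hosts: webservers          # hypothetical inventory group
  serial: 2                  # run the whole play on two hosts at a time
  tasks:
    - name: Announce the rollout
      ansible.builtin.debug:
        msg: "Starting rollout"
      run_once: true         # runs on one host only (once per serial batch)

    - name: Resource-heavy step
      ansible.builtin.command: /usr/local/bin/rebuild-index   # hypothetical script
      throttle: 1            # at most one host runs this task at a time
```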
Gineesh Madapparambath is a Platform & DevOps Consultant at Red Hat Singapore, specializing in automation and containerization with Ansible and OpenShift.