How to cache Ansible facts with Redis
Persisting Ansible fact data improves execution time, particularly when running different playbooks in sequence or targeting many hosts.
Ansible facts are data collected by Ansible from the target system and stored in a dictionary for further reference. Facts include data about the operating system, IP addresses, attached filesystems, and more. You can access this data by using the ansible_facts dictionary variable. For more information, please refer to An introduction to Ansible facts.
Retrieving this information is often expensive because, without a persistent cache, the control node has to gather it again from every target on each run. Ansible has cache plugins that allow it to store gathered facts for later use.
Fact caching is always enabled with the default cache plugin—a memory plugin that caches the data for the current Ansible execution. The memory plugin is ephemeral and does not persist data.
To prevent Ansible from always fetching the data during execution, other plugins with persistent storage are available to allow caching the data across runs. However, only one fact cache plugin can be active at a time.
As I'll explain in this article, you can use Redis to store the gathered facts. Redis is an in-memory store used to hold strings, hashes, lists, sets, sorted sets, streams, and more. Redis is also capable of preserving data even after reboots and system crashes.
Use the following command to list the supported plugins:
$ ansible-doc -t cache -l
If Redis is not listed, install the community.general collection with this command:
$ ansible-galaxy collection install community.general
You can install Redis on the local control node or a remote server. In this article, I'll install and configure it on the control node, as Redis will be used solely for fact caching.
Follow the steps below for a basic Redis setup. You can also deploy Redis with additional configurations depending on your requirements, but these settings are outside the scope of this article. I'm using a Red Hat Enterprise Linux (RHEL) 9 machine with the BaseOS and AppStream repositories enabled for this example. This machine works as both the Ansible control node and the Redis server.
Install Redis and the Redis Python client using DNF:
$ sudo dnf install -y redis python3-redis
Enable and start Redis:
$ sudo systemctl enable --now redis
Before integrating Redis, run the setup module with the time command and save the result for comparison:
$ time ansible localhost -m setup
You can now select the cache plugin for fact caching in the Ansible configuration. You can use the environment variable or the configuration file.
The ANSIBLE_CACHE_PLUGIN environment variable looks like this:
$ export ANSIBLE_CACHE_PLUGIN=redis
You can also set it in the ansible.cfg configuration file:
[defaults]
fact_caching = redis
fact_caching_timeout = 7200
fact_caching_connection = localhost:6379:0
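To double-check what a given ansible.cfg will ask Ansible to do, the settings can be read back with Python's standard configparser module. This is only a sketch: the helper name read_cache_settings and the inline sample text are illustrative, and the fallback values mirror the defaults described in this article (the memory plugin and an 86400-second timeout).

```python
import configparser

# Sample configuration text; in practice you would read your own ansible.cfg.
ANSIBLE_CFG = """\
[defaults]
fact_caching = redis
fact_caching_timeout = 7200
fact_caching_connection = localhost:6379:0
"""


def read_cache_settings(cfg_text):
    """Return the fact-caching settings found in the [defaults] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    defaults = parser["defaults"]
    return {
        # Fallbacks are assumptions taken from the defaults named in the article.
        "plugin": defaults.get("fact_caching", fallback="memory"),
        "timeout": defaults.getint("fact_caching_timeout", fallback=86400),
        "connection": defaults.get("fact_caching_connection", fallback=None),
    }


settings = read_cache_settings(ANSIBLE_CFG)
print(settings)
print(f"Facts expire after {settings['timeout'] / 3600:.1f} hours")
```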
The configuration values represent the following:
- fact_caching indicates the cache plugin in use.
- fact_caching_timeout sets the expiration timeout in seconds for the cache plugin data. To ensure data does not expire, set this option to 0. However, it's important to set this option to a sensible value in production, depending on how your data changes. By default, it is set to 86400s, which is 24 hours. You can set it to a lower value and automate the process to refresh the data automatically. I set it to two hours in this example.
- Once the fact_caching_timeout period has passed and the cached data has expired, playbooks that rely on facts but have gather_facts set to false will fail because the facts are no longer available. To make them work again, refresh the fact data in the cache.
- fact_caching_connection is a colon-separated string representing Redis connection information.
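The format is host:port:db:password. For example, the value localhost:6379:0 in the configuration above points to database 0 of a Redis server listening on localhost, port 6379, with no password.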
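The host:port:db:password layout can be illustrated with a small Python sketch that splits such a string into named fields. This is not the plugin's own parsing code; the RedisConnection type, the parse_connection helper, and the example-password value are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class RedisConnection(NamedTuple):
    host: str
    port: int
    db: int
    password: Optional[str]


def parse_connection(value: str) -> RedisConnection:
    """Split a host:port:db[:password] string into its parts."""
    parts = value.split(":")
    if len(parts) not in (3, 4):
        raise ValueError(f"expected host:port:db[:password], got {value!r}")
    # The password field is optional; treat a missing one as None.
    password = parts[3] if len(parts) == 4 else None
    return RedisConnection(parts[0], int(parts[1]), int(parts[2]), password)


# The value from the ansible.cfg example, which has no password.
print(parse_connection("localhost:6379:0"))
# A hypothetical password-protected server.
print(parse_connection("localhost:6379:0:example-password"))
```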
Test the fact cache
Run the setup module on target nodes:
$ ansible localhost -m setup
You can now use a playbook with gathering facts disabled, and it still works. For example, write the playbook ansible_facts.yml like this:
---
- name: Testing fact cache
  hosts: localhost
  gather_facts: false
  tasks:
    - debug: var=ansible_facts
Use the time command to run the playbook and determine how long it takes to retrieve the facts:
$ time ansible-playbook ansible_facts.yml
This playbook should run faster than the setup command you executed before enabling fact caching.
You can also check whether the facts are cached by querying the Redis keys using the redis-cli command:
$ redis-cli
127.0.0.1:6379> keys *
1) "ansible_cache_keys"
2) "ansible_factslocalhost"
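The same check can be scripted with the Redis Python client installed earlier. The sketch below rests on two assumptions that you should verify in your environment: that each host's facts live under a key made of the ansible_facts prefix followed by the hostname (as the key listing above suggests), and that the stored value is a JSON document. The helper name cached_facts is illustrative.

```python
import json


def cached_facts(client, prefix="ansible_facts"):
    """Return {hostname: facts} for every cached host.

    `client` is any object offering scan_iter(match=...) and get(key),
    such as a redis.Redis instance.
    """
    result = {}
    for key in client.scan_iter(match=prefix + "*"):
        name = key.decode() if isinstance(key, bytes) else key
        raw = client.get(key)
        if raw is None:
            # The entry expired between the scan and the read.
            continue
        # Assumption: the cached value is JSON-encoded.
        result[name[len(prefix):]] = json.loads(raw)
    return result
```

On the control node you could call it as cached_facts(redis.Redis(host="localhost", port=6379, db=0)) after importing redis, then look at the hostnames in the returned dictionary.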
Playbooks can also clear the fact cache for every host in the inventory by using the --flush-cache option. This is important when the cache must be cleared before the timeout value expires. After executing the playbook with the --flush-cache option, the Redis keys command returns an empty result:
$ redis-cli
127.0.0.1:6379> keys *
(empty array)
Fact caching using Redis allows persisting Ansible fact data across different playbook executions. It improves execution time, particularly when running different playbooks in sequence or targeting many hosts.