How to cache Ansible facts with Redis

March 1, 2023 | Evans Amoany | 3-minute read

Ansible facts are data collected by Ansible from the target system and stored in a dictionary for further reference. Facts include data about the operating system, IP addresses, attached filesystems, and more. You can access this data by using the ansible_facts dictionary variable. For more information, please refer to An introduction to Ansible facts.

Gathering this information is often expensive because the control node must connect to each target and collect the data again on every run. Ansible has cache plugins that allow it to store gathered facts for later use.

Fact caching is always enabled with the default cache plugin—a memory plugin that caches the data for the current Ansible execution. The memory plugin is ephemeral and does not persist data.

To prevent Ansible from always fetching the data during execution, other plugins with persistent storage are available to allow caching the data across runs. However, only one fact cache plugin can be active at a time.

As I'll explain in this article, you can use Redis to store the gathered facts. Redis is an in-memory store used to hold strings, hashes, lists, sets, sorted sets, streams, and more. Redis is also capable of preserving data even after reboots and system crashes.

Use the following command to list the supported plugins:

$ ansible-doc -t cache -l

If Redis is not listed, install the community.general collection with this command:

$ ansible-galaxy collection install community.general

Install Redis

You can install Redis on the local control node or a remote server. In this article, I'll install and configure it on the control node, as Redis will be used solely for fact caching.

Follow the steps below for a basic Redis setup. You can also deploy Redis with additional configuration depending on your requirements, but those settings are outside the scope of this article. For this example, I'm using a Red Hat Enterprise Linux (RHEL) 9 machine with the BaseOS and AppStream repositories enabled. This machine serves as both the Ansible control node and the Redis server.

Install Redis and the Redis Python client using DNF:

$ sudo dnf install -y redis python3-redis

Enable and start Redis:

$ sudo systemctl enable --now redis

Configure Ansible

Before integrating Redis, run the setup module with the time command and save the result for comparison:

$ time ansible localhost -m setup

You can now select the cache plugin for fact caching in the Ansible configuration. You can use the environment variable or the configuration file.

The ANSIBLE_CACHE_PLUGIN environment variable looks like this:

$ export ANSIBLE_CACHE_PLUGIN=redis

You can also set it in the ansible.cfg configuration file:

[defaults]
fact_caching = redis
fact_caching_timeout = 7200
fact_caching_connection = localhost:6379:0

The configuration values represent the following:

fact_caching indicates the cache plugin in use.
fact_caching_timeout sets the expiration timeout, in seconds, for the cached data. The default is 86400 seconds (24 hours). Setting it to 0 means the data never expires, but in production you should choose a value that matches how often your hosts change, and consider automating a periodic refresh. I set it to two hours (7200 seconds) in this example.
- When the cached facts expire, any playbook that relies on facts but has gather_facts set to false stops working until you refresh the facts in the cache.
fact_caching_connection is a colon-separated string representing Redis connection information.

The format is host:port:db:password. For example:

localhost:6379:0:password
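To make the settings concrete, here is a minimal Python sketch that reads a [defaults] section like the one in this article and splits the connection string into its fields. The parse_connection helper is hypothetical and written only for illustration; it is not the plugin's own parser, and the password value is a placeholder.

```python
import configparser

# The [defaults] section from this article, plus a password field
# (the password value is a placeholder, not a real credential).
ANSIBLE_CFG = """
[defaults]
fact_caching = redis
fact_caching_timeout = 7200
fact_caching_connection = localhost:6379:0:password
"""


def parse_connection(value):
    """Split host:port:db[:password] into a dict (illustrative helper)."""
    # maxsplit=3 keeps any colons inside the password intact; that is a
    # choice made for this sketch, not a statement about the plugin.
    parts = value.split(":", 3)
    if len(parts) < 3:
        raise ValueError("expected at least host:port:db")
    return {
        "host": parts[0],
        "port": int(parts[1]),
        "db": int(parts[2]),
        "password": parts[3] if len(parts) == 4 else None,
    }


parser = configparser.ConfigParser()
parser.read_string(ANSIBLE_CFG)
defaults = parser["defaults"]

settings = parse_connection(defaults["fact_caching_connection"])
timeout = defaults.getint("fact_caching_timeout")

print(defaults["fact_caching"], timeout, settings)
```

Splitting the string this way makes it easy to spot a missing database number or a non-numeric port before Ansible tries to connect.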

Test the fact cache

Run the setup module against the target nodes to populate the cache:

$ ansible localhost -m setup

You can now run a playbook with fact gathering disabled, and it still has access to the facts. For example, write the playbook ansible_facts.yml like this:

---
- name: Testing fact cache
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Print the cached facts
      ansible.builtin.debug:
        var: ansible_facts

Use the time command to run the playbook to determine how long it takes to retrieve the facts:

$ time ansible-playbook ansible_facts.yml

This playbook should run faster than the setup command you executed before enabling fact caching.
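A single run of time can be noisy, so it may help to average a few runs. The following is a rough sketch, not part of the original workflow: the time_command helper is a hypothetical name, and it measures wall-clock time for whatever command you pass on the command line.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Return the average wall-clock seconds over several runs of argv."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a broken run is never counted as a fast one.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(f"{time_command(sys.argv[1:]):.2f}s on average")
```

Saved under a name of your choice (for example time_cmd.py, a hypothetical filename), it could be run as python3 time_cmd.py ansible-playbook ansible_facts.yml, once before and once after enabling the cache.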

You can also check whether the facts are cached by listing the Redis keys with the redis-cli command:

$ redis-cli
127.0.0.1:6379> keys *
1) "ansible_cache_keys"
2) "ansible_factslocalhost"
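The same check can be sketched in Python with the Redis client installed earlier. This is an illustration under two assumptions: the key is the ansible_facts prefix followed by the hostname, as in the output above, and the stored value is a JSON document. The read_cached_facts helper is a hypothetical name.

```python
import json


def read_cached_facts(client, host, prefix="ansible_facts"):
    """Return (facts, ttl_seconds) for one host from the Redis fact cache.

    `client` is any object with redis-py style get() and ttl() methods.
    """
    key = f"{prefix}{host}"  # for example "ansible_factslocalhost"
    raw = client.get(key)
    if raw is None:
        return None, None  # nothing cached, or already expired
    # Assumption: the cached value is a JSON document.
    facts = json.loads(raw)
    # Redis TTL returns -1 when the key has no expiry set.
    return facts, client.ttl(key)


def main():
    # Imported here so the helper above can be used without redis-py.
    import redis

    client = redis.Redis(host="localhost", port=6379, db=0)
    facts, ttl = read_cached_facts(client, "localhost")
    if facts is None:
        print("no cached facts for localhost")
    else:
        print(f"{len(facts)} top-level facts, expires in {ttl}s")
```

Calling main() on the control node prints how many facts are cached for localhost and how many seconds remain before fact_caching_timeout removes them.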

The --flush-cache option of ansible-playbook clears the fact cache for every host in the inventory. This is useful when the cache must be cleared before the timeout expires. After you run the playbook with --flush-cache, the Redis keys command returns an empty result:

$ redis-cli
127.0.0.1:6379> keys *
(empty array)

Wrap up

Fact caching using Redis allows persisting Ansible fact data across different playbook executions. It improves execution time, particularly when running different playbooks in sequence or targeting many hosts.

About the author

Evans Amoany

I work as a Unix/Linux administrator with a passion for high-availability systems and clusters. I am a student of system performance, optimization, and DevOps. I have a passion for anything IT related, most importantly automation, high availability, and security.
