Internet of Things (IoT) devices are great. They can be deployed out in the field, attached to a building or a truck, and send back all this amazing data. If things go really well, you might have hundreds, no, thousands of devices deployed.

Great, we have thousands of devices deployed but how do we manage them? How do we configure and update them without having to send someone out and touch the device? Using Red Hat Ansible Automation Platform and Red Hat Ansible Tower we can centrally manage these remotely deployed devices.

Ansible Over Cellular Connections

If you are lucky, your deployed devices will have wired network connections and you can treat the device as you would any other server or network device. 

Most likely this is not the case and your connectivity is through a cell network. Cell networks can present several challenges when trying to manage a device with Ansible: they can drop connections, they can have inconsistent latency, and their data transfer rates vary from device to device. All these potential issues can affect the performance of your Ansible playbooks.

SSH Tuning

At the heart of every Ansible playbook is an SSH connection to a device. There are a few SSH options that can be used to improve the robustness of the SSH connection over a cell network.

Keep Alives

One issue is that when a cell connection is dropped, the underlying network connection is not properly closed. SSH thinks the connection is still up, and the playbook hangs until the kernel finally closes the connection, which can take hours. To prevent the playbook from hanging, SSH client keep alives should be configured.

The SSH option ServerAliveInterval sets the interval, in seconds, between keep alives, and ServerAliveCountMax sets the number of keep alives the server can miss before the client drops the connection. I like to set these SSH options in the ansible.cfg file.
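
A minimal sketch of that configuration follows. It assumes the default OpenSSH connection plugin. Setting ssh_args replaces Ansible's default SSH arguments rather than adding to them, so the default compression and connection sharing options are repeated alongside the keep alive options.

```ini
# ansible.cfg
[ssh_connection]
# ssh_args replaces Ansible's defaults (-C, ControlMaster, ControlPersist),
# so they are listed again here together with the keep alive options.
ssh_args = -C -o ControlMaster=auto -o ControlPersist=60s -o ServerAliveInterval=300 -o ServerAliveCountMax=2
```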

With ServerAliveInterval set to 300 and ServerAliveCountMax set to 2, a keep alive is sent every 5 minutes and the server is allowed to miss 2 of them. With these settings it would take approximately 15 minutes for the client SSH connection to be closed.

Ansible Pipelining

A general Ansible performance recommendation is to enable pipelining. Pipelining reduces the number of SSH operations needed to run each module during the execution of a playbook, which means fewer round trips over a high-latency link.
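
As a sketch, pipelining can be turned on in the same ansible.cfg file. One assumption to check first: when playbooks use sudo for privilege escalation, requiretty must not be enabled in the device's sudoers configuration, or pipelined tasks will fail.

```ini
# ansible.cfg
[ssh_connection]
# Send modules over the already open SSH session instead of
# copying them to the device as temporary files first.
pipelining = True
```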

Playbook Design

When designing playbooks to manage devices over cell connections, there are a few additional considerations that should be taken into account.

Fact Caching

Enabling fact caching speeds up playbooks by not having to run setup on every playbook run, and it also saves on data usage by not pulling the same data on every run. This is where Red Hat Ansible Tower really helps.

A job can be scheduled once a day to run setup on all the hosts and cache the facts for use by later job runs. Then all your playbooks can set gather_facts: false. If a playbook needs to have the latest facts, you can enable gather_facts on that specific playbook.
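
The two playbooks below are a rough sketch of that pattern. They assume fact caching is enabled on both job templates in Ansible Tower, and the iot_devices group and file names are hypothetical. Each playbook would live in its own file.

```yaml
---
# gather_facts.yml: scheduled once a day to refresh the fact cache.
- name: Refresh cached facts
  hosts: iot_devices          # hypothetical inventory group
  gather_facts: true

---
# report_os.yml: a later job that reads facts from the cache only.
- name: Use cached facts without running setup again
  hosts: iot_devices
  gather_facts: false
  tasks:
    - name: Show a fact that came from the cache
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} runs {{ ansible_facts['distribution'] | default('unknown') }}"
```

The default filter keeps the second playbook from failing on a host whose facts have not been cached yet.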

Stage Data

One scenario is the need to update firmware on the device. A typical playbook may copy the firmware and then execute the upgrade in a single run. A better approach over a cell connection is to have one playbook copy the firmware to the device and another execute the install.

The advantage is that you can stage large files over several hours or days and then quickly run playbooks that use the files later.
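
A sketch of the split, assuming the device can run the copy module. The paths, the iot_devices group, and the install command are all hypothetical placeholders. Each playbook would live in its own file.

```yaml
---
# stage_firmware.yml: run ahead of time; it may take hours over a slow link.
- name: Stage firmware on the device
  hosts: iot_devices                     # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Copy the firmware image to the device
      ansible.builtin.copy:
        src: files/firmware.bin          # hypothetical local path
        dest: /var/tmp/firmware.bin      # hypothetical staging path
        mode: "0600"

---
# install_firmware.yml: run later, once the file is in place.
- name: Install previously staged firmware
  hosts: iot_devices
  gather_facts: false
  become: true
  tasks:
    - name: Run the device's firmware installer
      ansible.builtin.command:
        # hypothetical installer command
        cmd: /usr/local/sbin/install-firmware /var/tmp/firmware.bin
```

Because the copy module compares checksums before transferring, the staging playbook can be rerun after a dropped connection, and a file that already arrived intact is not sent again.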

Validate Staged Data

All playbooks that use previously staged files should validate that the files are correct. The best approach is to validate the file checksum. Not all devices provide a method to compute checksums, so at a minimum file sizes should be checked.
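
One way to sketch this check is with the stat and assert modules, assuming the device can run stat. The staging path and the expected_sha256 and expected_size variables are hypothetical; the expected values would have to be supplied at run time, for example as extra vars.

```yaml
---
# Validation play to run before any play that uses the staged file.
- name: Validate staged firmware before using it
  hosts: iot_devices                       # hypothetical inventory group
  gather_facts: false
  vars:
    firmware_path: /var/tmp/firmware.bin   # hypothetical staging path
    # expected_sha256 and expected_size are hypothetical variables that
    # must be passed in at run time; they are not defined here.
  tasks:
    - name: Read the checksum and size of the staged file
      ansible.builtin.stat:
        path: "{{ firmware_path }}"
        checksum_algorithm: sha256
      register: staged

    - name: Stop unless the staged file matches the expected values
      ansible.builtin.assert:
        that:
          - staged.stat.exists
          - staged.stat.size == (expected_size | int)
          - staged.stat.checksum == expected_sha256
        fail_msg: "{{ firmware_path }} is missing or does not match the expected size and checksum"
```

On a device that cannot compute checksums, the checksum condition can be dropped and get_checksum set to false on the stat task, leaving the size comparison as the minimum check.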

Limiting System Resource Usage

Oftentimes the deployed devices are not the most powerful machines. While managing these devices, we do not want the Ansible automation to use all the system resources, which could prevent the device from operating as expected.

I recommend using a separate Ansible user and using cgroups to put constraints on the amount of system resources the user can use.
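
There are several ways to apply cgroup limits. The sketch below is one of them, under some assumptions: the device uses systemd, SSH logins are placed in per-user slices through pam_systemd, and cgroup v2 is in use (MemoryMax is the cgroup v2 setting). The user name and the limit values are illustrative only.

```yaml
---
- name: Constrain resources for the automation user
  hosts: iot_devices                  # hypothetical inventory group
  become: true
  gather_facts: false
  vars:
    automation_user: ansible-mgmt     # hypothetical dedicated account
  tasks:
    - name: Ensure the dedicated automation user exists
      ansible.builtin.user:
        name: "{{ automation_user }}"
        state: present
      register: automation_account

    - name: Create the drop-in directory for the user's systemd slice
      ansible.builtin.file:
        path: "/etc/systemd/system/user-{{ automation_account.uid }}.slice.d"
        state: directory
        mode: "0755"

    - name: Limit CPU and memory for everything the user runs
      ansible.builtin.copy:
        dest: "/etc/systemd/system/user-{{ automation_account.uid }}.slice.d/50-limits.conf"
        mode: "0644"
        content: |
          [Slice]
          CPUQuota=25%
          MemoryMax=128M
      notify: Reload systemd

  handlers:
    - name: Reload systemd
      ansible.builtin.systemd:
        daemon_reload: true
```

This playbook would be run once with an administrative account; later automation then connects as the constrained user.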

Working with Embedded Devices

Up until now it has been assumed the deployed device is running a full operating system like Red Hat Enterprise Linux. This is often not the case. Many devices are custom built embedded hardware running a bare bones operating system. 

Some of these devices may not even support executing commands without a terminal (TTY). We can still use Ansible to manage these devices by treating them as Ansible network devices. Doing so requires writing custom Ansible networking modules for the device.

Once the modules are written you can manage the remote device using Ansible like you would manage any other network switch or router.
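
As a rough illustration, the variables and playbook below show what managing such a device might look like once the custom plugins exist. They assume the ansible.netcommon collection is installed. The platform name acme.sensor.sensoros, the admin account, and the show version command are all hypothetical, and the platform name would need to match the terminal and cliconf plugins written for the device.

```yaml
---
# group_vars/embedded_devices.yml
ansible_connection: ansible.netcommon.network_cli
# Hypothetical platform name; it must match the custom plugins
# written for the device.
ansible_network_os: acme.sensor.sensoros
ansible_user: admin                      # hypothetical account

---
# show_version.yml
- name: Query an embedded device over its CLI
  hosts: embedded_devices                # hypothetical inventory group
  gather_facts: false
  tasks:
    - name: Run a command in the device's interactive shell
      ansible.netcommon.cli_command:
        command: show version            # hypothetical device command
      register: version_output

    - name: Display the raw output
      ansible.builtin.debug:
        var: version_output.stdout
```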

Conclusion

Ansible makes it easy to manage IoT devices the same way you manage servers and network devices. Following these tips while developing Ansible playbooks will make your deployments and management of IoT devices easier and more successful.

Want to get started with Ansible? Contact Red Hat Consulting today!