Imagine you have hundreds or thousands of hosts to manage from your Ansible Automation Platform (AAP) controller, but you cannot reach some of them. It could be because firewalls are blocking you, or maybe the service account or sudo is not yet configured on your managed nodes. Another possibility is that the environment changed, and suddenly you cannot automate some of your nodes.
Check the documentation for more information about what AAP requires to connect to its targets.
If you have only a handful of exceptions, you can just grab the output of your Ansible playbook and check them case by case.
But what if you have dozens or hundreds of cases to investigate? Wouldn't it be nice to have a summary of all these exceptions that you could open in a spreadsheet and distribute to your fellow sysadmins and network subject matter experts to help you?
Read on to learn how I solved this issue in three steps.
1. Check connectivity to the targets
First, I wrote a playbook to check connectivity to my targets:
---
- name: Check Connectivity and Report
  hosts: nodes
  gather_facts: false
  tasks:
    - name: 01 - Test Connectivity
      ansible.builtin.ping:
      register: connectivity
      ignore_unreachable: true
    - name: 02 - Save summary of connectivity check
      ansible.builtin.set_fact:
        summary: "{{ (summary | default([])) + [ item + ';' + _result ] }}"
      vars:
        _result: "{{ (hostvars[item]['connectivity']['msg'] | default('OK')).splitlines() | join() }}"
      loop: "{{ ansible_play_hosts }}"
      delegate_to: localhost
      run_once: true
    - name: 03 - Show result
      ansible.builtin.debug:
        msg: "{{ summary }}"
      delegate_to: localhost
      run_once: true
    - name: 04 - Save result to csv file
      ansible.builtin.copy:
        content: "{{ (summary | sort | join('\n')) + '\n' }}"
        dest: /tmp/connectivity_test.csv
      delegate_to: aapwork
      run_once: true
...
The playbook runs against all my nodes, and I explicitly set gather_facts to false because I want to accomplish the connectivity test in a task with a special flag (ignore_unreachable).
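To see why task 02 can fall back to 'OK', it helps to picture what the registered connectivity variable might hold for each host. The sketch below is an approximation, not captured output: the host names node1 and node2 are hypothetical, the message text is elided, and real results can carry additional keys.

```yaml
# Approximate shape of hostvars[<host>]['connectivity'] after task 01.
# Host names are hypothetical; real results may include more keys.
node1:                # reachable host: the ping succeeded
  changed: false
  ping: pong          # no 'msg' key here, so default('OK') applies in task 02
node2:                # unreachable host, kept in the play by ignore_unreachable
  changed: false
  unreachable: true
  msg: "Failed to connect to the host via ssh: ..."  # text elided
```

Only the msg key matters for the summary: when it is present, its text becomes the second field of the line; when it is absent, the line reads OK.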
Some comments about the tasks:
- 01 - Test Connectivity: ignore_unreachable is set to true. Without this, the playbook would not execute the remaining tasks for this node. Notice that the next tasks run on localhost, but that is all I need to use the connectivity test's results for my summary.
- 02 - Save summary of connectivity check: This executes after all nodes are tested. I run a loop based on ansible_play_hosts (an Ansible magic variable containing a list of all hosts processed in this playbook). For each host, I add an element into the array/list named summary. I used some Jinja2 filters to handle cases where a line feed appears in the output. This summarization task includes:
  - delegate_to: localhost, so it runs on the localhost (AAP controller or Ansible controller).
  - run_once: true, so the loop processes the whole list of hosts, but I only invoke the task once (instead of running the loop once per host).
- 03 - Show result: This is a simple display of the array/list accumulated in the previous task. Also, it's executed only once and on the localhost. (And yes, these two tasks could be coded as a block.)
- 04 - Save result to a CSV file: The last task writes the array/list containing the summary to a file on an external server (aapwork in my example). Here are some important aspects:
  - I executed this in my AAP, so the localhost is my execution environment (EE). This is why I want to write the file to a server I can connect to later to retrieve the output file. Saving a file in an EE and retrieving it from there would require additional steps, which are not necessary for this use case.
  - If you run this playbook from the command line, it is OK to use localhost as the delegated host in this task because it is easy to get the file manually.
  - The Jinja2 expression sorts the output and converts each list item to a line in the file.
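Tasks 02 and 03 share the same delegate_to and run_once settings, so they could be grouped in a block, as mentioned above. The following is a minimal sketch of that variant, not the playbook I ran: the block name is my own, and task 04 is switched to localhost, which only suits the command-line case.

```yaml
---
- name: Check Connectivity and Report (block variant, illustrative)
  hosts: nodes
  gather_facts: false
  tasks:
    - name: 01 - Test Connectivity
      ansible.builtin.ping:
      register: connectivity
      ignore_unreachable: true

    # delegate_to and run_once are set once and inherited by both tasks.
    - name: 02/03 - Build and show the summary
      delegate_to: localhost
      run_once: true
      block:
        - name: 02 - Save summary of connectivity check
          ansible.builtin.set_fact:
            summary: "{{ (summary | default([])) + [ item + ';' + _result ] }}"
          vars:
            _result: "{{ (hostvars[item]['connectivity']['msg'] | default('OK')).splitlines() | join() }}"
          loop: "{{ ansible_play_hosts }}"

        - name: 03 - Show result
          ansible.builtin.debug:
            msg: "{{ summary }}"

    # Command-line variant: the file lands on the machine running ansible-playbook.
    - name: 04 - Save result to csv file
      ansible.builtin.copy:
        content: "{{ (summary | sort | join('\n')) + '\n' }}"
        dest: /tmp/connectivity_test.csv
      delegate_to: localhost
      run_once: true
...
```

The loop stays on task 02 because loop is a task keyword, not a block keyword.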
2. Execute the playbook
Here's a look at the playbook's execution in AAP:
Notice that the playbook finished successfully even though some hosts could not be reached, because I had the ignore_unreachable option set to true.
Also, in my limited inventory, I had one case of "Invalid/incorrect password" and another case of "Failed to connect to the host via ssh."
In a more realistic environment, I would have many more hosts and issues to analyze, which is where this playbook could be really useful.
3. View the output in a spreadsheet
In the last task, I wrote a CSV file, which I grabbed and opened using a spreadsheet application.
Open the CSV file using the import steps for your favorite spreadsheet tool. Remember to select the semicolon character (and only it) as the field separator, because task 02 - Save summary of connectivity check joins each host name and its result with that character.
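If you would rather check the file from Ansible before opening a spreadsheet, a follow-up play could parse it with the same separator. This is a sketch under assumptions: it relies on the community.general collection being installed, treats aapwork as a host defined in the inventory, and uses host and result as column names that I chose, since the file has no header row.

```yaml
---
- name: Read the connectivity report back (illustrative sketch)
  hosts: aapwork            # assumption: aapwork is defined in the inventory
  gather_facts: false
  tasks:
    - name: Parse the semicolon-separated report
      community.general.read_csv:
        path: /tmp/connectivity_test.csv
        delimiter: ";"
        fieldnames: [host, result]   # hypothetical names; the file has no header
      register: report

    - name: Show only the hosts that did not answer with OK
      ansible.builtin.debug:
        msg: "{{ report['list'] | rejectattr('result', 'equalto', 'OK') | list }}"
...
```

One caveat with this approach: an error message that itself contains a semicolon would be split into extra fields, so treat the parsed result as a quick filter rather than a faithful copy of each message.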
Wrap up
In a scenario where you could have many different issues for many hosts, having a summary like this in a spreadsheet might be really helpful.
Connectivity problems to your managed hosts can happen at the beginning of a project when groups of hosts are added (during the acquisition of another company, for example) or due to network, firewall, or security changes. If this happens to you, this troubleshooting method may help you identify the source of your problems more efficiently.
About the author
Roberto Nozaki (RHCSA/RHCE/RHCA) is an Automation Principal Consultant at Red Hat Canada where he specializes in IT automation with Ansible. He has experience in the financial, retail, and telecommunications sectors, having performed different roles in his career, from programming in mainframe environments to delivering IBM/Tivoli and Netcool products as a pre-sales and post-sales consultant.
Roberto has been a computer and software programming enthusiast for over 35 years. He is currently interested in hacking what he considers to be the ultimate hardware and software: our bodies and our minds.
Roberto lives in Toronto, and when he is not studying and working with Linux and Ansible, he likes to meditate, play the electric guitar, and research neuroscience, altered states of consciousness, biohacking, and spirituality.