Monitoring Red Hat Ansible Automation Platform using Performance Co-Pilot

January 30, 2025 | Nikhil Jain | 4-minute read

In this article, you’ll learn about the Performance Co-Pilot (PCP) tool and how we take advantage of it to implement system and application monitoring for Red Hat Ansible Automation Platform.

What is Performance Co-Pilot (PCP)?

PCP is an open source performance monitoring and analysis framework, originally created at SGI and now maintained with substantial contributions from Red Hat. It provides a suite of tools, libraries and services to monitor, retrieve and analyze performance metrics from different systems, services and applications. PCP is designed for scalability, enabling it to monitor anything from a single server to a large, distributed network of machines in real time.

Key features of PCP:

Scalability: PCP can be used to monitor both individual systems and distributed environments
Multisource data collection: It collects data from multiple sources, including the operating system (OS), databases, network interfaces and custom applications
Extensibility: New metrics can be added by developing custom agents or extensions
Storage and retrieval: PCP can store performance data for historical analysis and supports real-time data retrieval
Real-time monitoring: It provides real-time metrics, enabling live performance analysis
Graphical and command-line interfaces: PCP includes both graphical (e.g., pmchart) and command-line tools (e.g., pminfo, pmval and pmlogsummary) for monitoring and performance data analysis

Typical components:

Performance Metrics Collector Daemon (PMCD): The central daemon that gathers metrics from agents
Performance Metrics Name Space (PMNS): A hierarchical namespace that organizes the performance metrics
Performance Metrics Inference Engine (PMIE): A tool for generating alerts or actions based on real-time metric thresholds
PMLogger: For logging performance metrics for later analysis
PMProxy: Acts as a protocol proxy, enabling PCP monitoring clients to connect to one or more PMCD instances via PMProxy

Use cases:

System performance analysis: PCP can monitor CPU, memory, disk I/O, network usage and other system metrics
Application monitoring: PCP can monitor specific applications or services to understand their resource consumption and performance trends
Historical data analysis: It can store performance data over time for historical trend analysis or forensic analysis after system failures

Why monitor Ansible Automation Platform using PCP?

Monitoring Ansible Automation Platform using PCP is important for several reasons:

Performance insights: PCP provides detailed metrics and insights into the performance of your Ansible Automation Platform. This helps in identifying bottlenecks and optimizing resource usage
Proactive issue detection: By continuously monitoring performance metrics, you can detect potential issues before they escalate into significant problems, allowing for proactive troubleshooting
Resource management: Understanding resource utilization (CPU, memory, disk I/O) helps in effective capacity planning and ensures that your automation environment runs smoothly without resource contention
Scalability: As your automation needs grow, monitoring helps assess when and how to scale your Ansible Automation Platform infrastructure, ensuring it can handle increased workloads without degradation in performance
Compliance and auditing: Monitoring tools can help maintain compliance with internal and external regulations by providing a clear audit trail of automation activities and resource usage
Integration with other tools: PCP can integrate with other monitoring and alerting systems, providing a comprehensive view of your infrastructure and enabling better incident response
User experience: Ensuring that your automation tasks run efficiently improves the overall user experience for teams relying on Ansible Automation Platform for deployment and configuration management
Historical data analysis: PCP retains historical performance data, allowing you to analyze trends over time, which is essential for making informed decisions about future infrastructure changes or optimizations

In summary, using PCP to monitor Ansible Automation Platform enhances performance, reliability and efficiency, so that automation efforts contribute positively to organizational goals.

Set up monitoring on Ansible Automation Platform using PCP

Currently, monitoring setup in Ansible Automation Platform is supported for both traditional and containerized installations on virtual machines (VMs). To enable monitoring, you must set the setup_monitoring boolean to True in the setup inventory file under the [all:vars] section. For example:

[all:vars]
setup_monitoring = True
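
Before running the installer, it can be worth confirming that the variable really sits under [all:vars]. The following is a minimal sketch, not part of the installer: the function name monitoring_enabled is hypothetical, and it assumes an INI-style inventory like the one above.

```shell
# Hypothetical helper (not part of the installer): exit 0 if the inventory file
# sets setup_monitoring to True under [all:vars], non-zero otherwise.
monitoring_enabled() {
    awk '
        /^\[/ { in_vars = ($0 == "[all:vars]"); next }   # track the current section
        in_vars && /^[[:space:]]*setup_monitoring[[:space:]]*=/ {
            sub(/^[^=]*=[[:space:]]*/, "")               # keep only the value
            sub(/[[:space:]]+$/, "")                     # drop trailing whitespace
            value = tolower($0)
        }
        END { exit (value == "true" ? 0 : 1) }
    ' "$1"
}

# Example run against a throwaway inventory
inventory=$(mktemp)
printf '[all:vars]\nsetup_monitoring = True\n' > "$inventory"
if monitoring_enabled "$inventory"; then
    echo "monitoring is enabled"
else
    echo "monitoring is not enabled"
fi
rm -f "$inventory"
```

The check is deliberately strict about the section name, so a setup_monitoring line placed under a different group is reported as not enabled.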

When you run the installer, it executes the monitoring role to configure PCP on the Ansible Automation Platform cluster. This role installs and activates the necessary services, including pcp, pmcd and pmproxy. On the traditional RPM-based deployment, PCP is installed via DNF and run via systemd. On the containerized install, it runs in a container alongside all other Ansible Automation Platform components. Additionally, the installer sets up Performance Metrics Domain Agents (PMDAs), which are plug-ins that run as daemons for pmcd, to monitor key components such as nginx, redis, postgres and openmetrics on the Ansible Automation Platform hosts.

Furthermore, on the traditional installation, the installer designates the gateway node as the central hub for collecting PCP metrics from all nodes in the Ansible Automation Platform cluster to effectively archive metrics.

PCP exposes metrics to external tools on port 44322, the default pmproxy port. Please make sure that port 44322 is open in your security groups if applicable. If it is not, metrics will still be available on the host for local analysis with the PCP command-line tools, but external tools will not be able to aggregate them.
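
One rough way to test reachability from another machine is a plain TCP connection attempt. This is a sketch under assumptions: the helper name port_open is hypothetical, and it relies on bash (for its /dev/tcp pseudo-device) and the coreutils timeout command being installed. Replace 127.0.0.1 with the address of the host you want to probe.

```shell
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are available.
port_open() {
    host=$1
    port=${2:-44322}   # 44322 is the port named in the text above
    # Inside the child bash, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

if port_open 127.0.0.1; then
    echo "port 44322 is reachable"
else
    echo "port 44322 is not reachable"
fi
```

A successful connection only shows that something is listening on the port; it does not confirm that the listener is pmproxy.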

Once the setup is complete, you can log in via ssh to any of the gateway nodes and run the following command to check all the metrics that PCP is collecting.
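
The command itself does not appear at this point in the text. As a hedged stand-in, pminfo run without arguments lists the names of the metrics available from pmcd on the local host. The wrapper function list_live_metrics is hypothetical and only adds a check that the tool is installed; on a containerized install, the command may need to be run inside the PCP container instead.

```shell
# Assumption: the command the text refers to is not shown; pminfo is one candidate.
# Hypothetical wrapper: list metric names from the local pmcd if pminfo exists.
list_live_metrics() {
    if ! command -v pminfo >/dev/null 2>&1; then
        echo "pminfo not found; is PCP installed on this host?" >&2
        return 127
    fi
    pminfo
}

# Show only the first 20 metric names to keep the output short
list_live_metrics | head -n 20
```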

Retrieving archived metrics

You can use the PCP CLI tools to retrieve metrics from an archive file. In a traditional installation, the archives are located under /var/log/pcp/pmlogger/

For example:

/var/log/pcp/pmlogger/controller.example.com/20241004.00.10

In a containerized installation, the archives are located under /home/ansible/aap/pcp_archives

For example:

/home/ansible/aap/pcp_archives/controller.example.com/20241004.00.10
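
To find out which archives exist for a host, a small helper can list their base names. This is a sketch, not a PCP tool: the function name list_archives is hypothetical, and it relies on each archive having a <base>.meta file alongside its data volumes. Compressed metadata files (for example <base>.meta.xz) are not handled here.

```shell
# Hypothetical helper: print the base name of every archive in a directory.
# Relies on each PCP archive having a <base>.meta file next to its data volumes.
list_archives() {
    dir=$1
    for meta in "$dir"/*.meta; do
        [ -e "$meta" ] || continue        # the glob matched nothing
        printf '%s\n' "${meta%.meta}"     # strip the suffix to get the base name
    done
}

# Directory from the traditional-install example above; prints nothing if absent
list_archives /var/log/pcp/pmlogger/controller.example.com
```

Each printed path can be passed to the PCP tools wherever an archive location is expected.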

Examples

To list all metrics that were enabled when the archive file was created, enter the following command:
```
# pminfo --archive <ARCHIVE_FILE_LOCATION>
```
To view the host and time period covered by an archive file, enter the following command:
```
# pmdumplog -l <ARCHIVE_FILE_LOCATION>
```
To list disk writes for each partition over the time period covered by the archive file:
```
# pmval --archive <ARCHIVE_FILE_LOCATION> \
-f 1 disk.partitions.write
```
To list disk write operations per partition, with a 2-second interval, over the time period between 14:00 and 14:15:
```
# pmval --archive <ARCHIVE_FILE_LOCATION> \
-d -t 2sec \
-f 3 disk.partitions.write \
-S @14:00 -T @14:15
```
To list the average values of the selected performance metrics (here disk.partitions.write and mem.freemem), including the time and value of the minimum and maximum, over the time period between 14:00 and 14:30, formatted as a table:
```
# pmlogsummary <ARCHIVE_FILE_LOCATION> \
-HlfiImM \
-S @14:00 \
-T @14:30 \
disk.partitions.write \
mem.freemem
```
To display system metrics stored in an archive, starting from 14:00, in an interactive manner similar to the top tool:
```
# pcp --archive <ARCHIVE_FILE_LOCATION> \
-S @14:00 \
atop
```

Takeaways

Monitoring Ansible Automation Platform is essential for the reliability, performance and security of the services it supports. It helps detect and address issues like slow response times, server errors and security vulnerabilities in real time, minimizing downtime and potential disruptions to users. By continuously tracking key metrics, such as traffic, usage and resource consumption, monitoring enables the platform to operate at optimal efficiency.

Where to go next

For detailed information, check out Ansible Automation Platform documentation
To download and install the latest version, visit the Ansible Automation Platform installation guide
Interested in release notes? Visit Ansible Automation Platform release notes
For further reading and information, visit our e-books

About the author

Nikhil Jain

Nikhil Jain is a Principal Software Engineer with Red Hat’s Performance and Scale Engineering team who focuses on the testing, analysis and improvement of Red Hat Ansible Automation Platform products and services.
