In this article, we build on the basic setup from the previous article. One additional detail: if you use the pcp-zeroconf package in Red Hat Enterprise Linux 7.6, pmlogger is configured via /etc/sysconfig/pmlogger to log metrics every 10 seconds. The default interval of 60 seconds provides less granularity, but also stores less data.
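
To get a feeling for the trade-off between the two intervals, here is a small back-of-the-envelope sketch. It only counts samples per logged value per day; actual archive sizes also depend on metadata and on how many metrics and instances are logged, so treat the ratio as a rough guide.

```python
# Rough comparison of the two pmlogger intervals mentioned above.
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds):
    """Number of samples recorded per logged value in one day."""
    return SECONDS_PER_DAY // interval_seconds


zeroconf = samples_per_day(10)   # interval set up by pcp-zeroconf
default = samples_per_day(60)    # default interval
print(zeroconf, default, zeroconf // default)   # prints: 8640 1440 6
```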

The setup from the last article already enables us to use Performance Metrics Domain Agents (PMDAs), agents written specifically for use with Performance Co-Pilot (PCP). They are provided in the standard RHEL repositories:

[root@rhel7u5a ~]# yum search pcp-pmda
Loaded plugins: product-id, search-disabled-repos, subscription-manager
======================== N/S matched: pcp-pmda ===========================
pcp-pmda-activemq.x86_64 : Performance Co-Pilot (PCP) metrics for ActiveMQ
pcp-pmda-apache.x86_64 : Performance Co-Pilot (PCP) metrics for the Apache webserver
pcp-pmda-bash.x86_64 : Performance Co-Pilot (PCP) metrics for the Bash shell
pcp-pmda-bind2.x86_64 : Performance Co-Pilot (PCP) metrics for BIND servers
pcp-pmda-bonding.x86_64 : Performance Co-Pilot (PCP) metrics for Bonded network interfaces
pcp-pmda-cifs.x86_64 : Performance Co-Pilot (PCP) metrics for the CIFS protocol
[..]

 

Setting up the trivial PMDA

When we install pcp-pmda packages, they are deployed into /var/lib/pcp/pmdas. To get example code for programming custom metrics, we can run yum install pcp-devel. The code in /var/lib/pcp/pmdas/trivial is a quite minimal PMDA. Let's try it out:

[root@rhel7u5a ~]# yum -y install pcp-devel
[..]
[root@rhel7u5a ~]# cd /var/lib/pcp/pmdas/trivial
[root@rhel7u5a trivial]# ./Install
You will need to choose an appropriate configuration for installation of
the "trivial" Performance Metrics Domain Agent (PMDA).

  collector 	collect performance statistics on this system
  monitor   	allow this system to monitor local and/or remote systems
  both      	collector and monitor configuration for this system

Please enter c(ollector) or m(onitor) or b(oth) [b] c
Installing files ...
gcc -fPIC -fno-strict-aliasing -D_GNU_SOURCE -Wall -O2 -g -DPCP_VERSION=\"3.12.2\"   -c -o trivial.o trivial.c
[..]

Updating the Performance Metrics Name Space (PMNS) ...
Terminate PMDA if already installed ...
Updating the PMCD control file, and notifying PMCD ...
Check trivial metrics have appeared ... 1 metrics and 1 values
[root@rhel7u5a trivial]#

With this, we can access the new metric:

[root@rhel7u5a trivial]# pmrep trivial
  t.time
 	s/s
 	N/A
   0.997
   0.997

Customizing the trivial PMDA

I use custom metrics for a number of purposes, for example to monitor which countries my website is accessed from. In this article, we will modify the trivial PMDA to monitor temperature values from our system.

After installing the lm_sensors package, the sensor data is available via sensors -u. Availability of the sensor data via sensors -u is not tested as part of the Red Hat hardware certifications, so there are systems where lm_sensors cannot access that data.

In that case, you could for example try to access the temperature sensor of a hard disk, using smartctl -a /dev/sda.

Red Hat Enterprise Linux 7.5 comes with PCP 3.12, where trivial is implemented in C. PCP upstream additionally includes Python and Perl implementations; we will use the Perl one as our starting point. On my system, several metrics with sensor values are available, but for our example here we will implement just a single metric: the first temperature value from the output of sensors -u.
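
Before wiring this into a PMDA, the selection logic can be tried on its own. The following is a minimal Python sketch, not part of PCP: it picks the first positive number that follows a colon, which is the idea behind the metric described above. The sample text is an illustrative assumption about what sensors -u prints; real output differs from system to system, and first_positive_reading is a hypothetical helper name.

```python
import re

# Illustrative sample only; real `sensors -u` output varies per system.
SAMPLE_OUTPUT = """\
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:
  temp1_input: 42.000
  temp1_max: 100.000
  temp1_crit_alarm: 0.000
"""

# A number following ": ", as in "temp1_input: 42.000".
READING = re.compile(r": ([0-9.]+)")


def first_positive_reading(text):
    """Return the first value greater than zero, or None if there is none."""
    for line in text.splitlines():
        match = READING.search(line)
        if match is None:
            continue
        try:
            value = float(match.group(1))
        except ValueError:
            continue  # e.g. a lone "." matched the character class
        if value > 0:
            return value
    return None


print(first_positive_reading(SAMPLE_OUTPUT))   # prints: 42.0
```

On a real system, the text could come from subprocess.run(["/usr/bin/sensors", "-u"], capture_output=True, text=True).stdout instead of the sample string. If the first positive value on your hardware is a voltage or fan speed rather than a temperature, the pattern needs to be narrowed accordingly.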

Let’s deploy a file /var/lib/pcp/pmdas/trivial/pmdatrivial.perl like this:

#!/usr/bin/perl
#
# Copyright (c) 2012,2018 Red Hat.
# Copyright (c) 2008,2012 Aconex.  All Rights Reserved.
# Copyright (c) 2004 Silicon Graphics, Inc.  All Rights Reserved.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# for more details.
#


use strict;
use warnings;
use PCP::PMDA;
use vars qw( $pmda );


sub trivial_fetch_callback  	# must return array of value,status
{
    	my ($cluster, $item, $inst) = @_;
    	my $temperature;
    	foreach ( qx( /usr/bin/sensors -u ) ) {
            	next unless m/: ([0-9.]+)/;
            	next unless $1 > 0;
            	$temperature=$1;
            	last;	# stop at the first positive value instead of keeping the last one
    	}
    	if ($cluster == 0 && $item == 0) { return ($temperature, 1); }
    	return (PM_ERR_PMID, 0);
}


$pmda = PCP::PMDA->new('trivial', 250); # domain name and number
$pmda->connect_pmcd;


$pmda->add_metric(pmda_pmid(0,0), PM_TYPE_U32, PM_INDOM_NULL,
            	PM_SEM_INSTANT, pmda_units(0,0,0,0,0,0),
            	'trivial.temp1', 'temperature',
            	'temperature from first sensor in deg Celsius');


$pmda->set_fetch_callback( \&trivial_fetch_callback );
$pmda->set_user('pcp');
$pmda->run;

You might be curious about the number 250 used here. It is the domain number, which distinguishes PMDAs from each other, a bit like Object Identifiers (OIDs) in SNMP. pminfo -m lists the metric identifiers (PMIDs) in use on your system; the first component of each PMID is the domain number.
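
pminfo -m prints each identifier in the form domain.cluster.item, so our metric with pmda_pmid(0,0) in domain 250 shows up as 250.0.0. The sketch below illustrates how such a triple can be packed into and unpacked from a single integer. It assumes the layout of 9 bits for the domain, 12 bits for the cluster and 10 bits for the item, as described in the PCP headers; the function names are hypothetical and not part of any PCP module.

```python
# Assumed PMID layout: 9-bit domain, 12-bit cluster, 10-bit item.
DOMAIN_BITS, CLUSTER_BITS, ITEM_BITS = 9, 12, 10


def pmid_build(domain, cluster, item):
    """Pack domain, cluster and item into one integer."""
    fields = ((domain, DOMAIN_BITS, "domain"),
              (cluster, CLUSTER_BITS, "cluster"),
              (item, ITEM_BITS, "item"))
    for value, bits, name in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError("%s %d does not fit into %d bits" % (name, value, bits))
    return (domain << (CLUSTER_BITS + ITEM_BITS)) | (cluster << ITEM_BITS) | item


def pmid_split(pmid):
    """Unpack an integer into (domain, cluster, item)."""
    item = pmid & ((1 << ITEM_BITS) - 1)
    cluster = (pmid >> ITEM_BITS) & ((1 << CLUSTER_BITS) - 1)
    domain = (pmid >> (CLUSTER_BITS + ITEM_BITS)) & ((1 << DOMAIN_BITS) - 1)
    return domain, cluster, item


def pmid_str(pmid):
    """Format an integer in the dotted form shown by pminfo -m."""
    return "%d.%d.%d" % pmid_split(pmid)


print(pmid_str(pmid_build(250, 0, 0)))   # prints: 250.0.0
```

Because the domain is part of every identifier, two PMDAs that use the same domain number would collide, which is why each agent needs its own number.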

Now we need to update the file Install in the same directory so that it offers the Perl agent. Change the last part to look like this:

[..]
. $PCP_DIR/etc/pcp.env
. $PCP_SHARE_DIR/lib/pmdaproc.sh

iam=trivial
dso_opt=true
perl_opt=true
python_opt=true

pmdaSetup
pmdaInstall
exit

We will now remove the currently installed trivial PMDA, install our new version, and verify that we can access our custom value:

[root@rhel7u5a trivial]# ./Remove
[..]
[root@rhel7u5a trivial]# ./Install
You will need to choose an appropriate configuration for installation of
the "trivial" Performance Metrics Domain Agent (PMDA).
  collector 	collect performance statistics on this system
  monitor   	allow this system to monitor local and/or remote systems
  both      	collector and monitor configuration for this system
Please enter c(ollector) or m(onitor) or b(oth) [b] c
Install trivial as a daemon or python or perl or dso agent? [daemon] perl
Updating the Performance Metrics Name Space (PMNS) ...
Terminate PMDA if already installed ...
Updating the PMCD control file, and notifying PMCD ...
Check trivial metrics have appeared ... 1 metrics and 1 values
[root@rhel7u5a trivial]# pminfo trivial
trivial.temp1
[root@rhel7u5a trivial]# pmrep trivial
  t.temp1
   	42
   	42

Congratulations, your temperature value is available!

We used the trivial PMDA here. If you want to monitor multiple custom values, the simple PMDA is a good starting point.

Archiving custom metrics

So far, we have just accessed our custom metrics from the running pmcd daemon. When investigating the archive files in /var/log/pcp/pmlogger/<hostname> with pminfo -a, we see that the custom metric is not getting logged:

[root@rhel7u5a trivial]# pminfo \
-a /var/log/pcp/pmlogger/rhel7u5a/20180823.05.58.0 trivial
Error: trivial: Unknown metric name

This is because pmlogger is not yet configured for the new metric.

Let’s open the file "/var/lib/pcp/config/pmlogger/config.default" in an editor and jump to the end of the file. Right before the [access] section, we will insert a section with our metric:

[..]
log mandatory on every 60 seconds {
    	trivial.temp1
}

[access]
disallow .* : all;
disallow :* : all;
allow local:* : enquire;
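
If the same change has to be made on several systems, the edit can be scripted. The following is a minimal sketch, not a PCP tool: the hypothetical helper add_logging_stanza works on the configuration text as a string and places a logging block directly before the [access] line. The check for an existing entry is deliberately crude (a plain substring test), and the sample configuration is shortened for illustration.

```python
def add_logging_stanza(config_text, metric, interval_seconds=60):
    """Insert a 'log mandatory' block for metric before the [access] section."""
    if metric in config_text:
        return config_text  # crude check: metric name already appears somewhere
    stanza = (
        "log mandatory on every %d seconds {\n"
        "\t%s\n"
        "}\n\n" % (interval_seconds, metric)
    )
    lines = config_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.strip() == "[access]":
            return "".join(lines[:index]) + stanza + "".join(lines[index:])
    raise ValueError("no [access] section found")


# Shortened sample configuration, for illustration only.
SAMPLE_CONFIG = (
    "[access]\n"
    "disallow .* : all;\n"
    "disallow :* : all;\n"
    "allow local:* : enquire;\n"
)

print(add_logging_stanza(SAMPLE_CONFIG, "trivial.temp1"), end="")
```

Writing the result back to the configuration file is left out on purpose; keep a copy of the original file before changing it.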

We can now perform a syntax check with pmlogger -C -c /var/lib/pcp/config/pmlogger/config.default string and then restart pmlogger with systemctl restart pmlogger. Now a new archive file appears in directory /var/log/pcp/pmlogger/<hostname>, which also contains our metric:

[root@rhel7u5a rhel7u5a]# pminfo -a 20180827.07.44.0 trivial
trivial.temp1
[root@rhel7u5a rhel7u5a]# pmrep -a 20180827.07.44.0 trivial
  t.temp1
  	N/A
   	42
   	42
   	42
[...]

With this, our custom metrics get archived for future investigations.

If there are issues with the pmlogger configuration, then /var/log/pcp/pmlogger/<hostname>/pmlogger.log is a good place to start further investigations.

Graphical representation

Let’s turn our numbers into graphics!

Red Hat Enterprise Linux already has the required packages. After installing them, we can start the pmwebd daemon:

[root@rhel7u5a ~]# yum install -y pcp-webapp\* pcp-webapi pcp-webjs
[..]
[root@rhel7u5a ~]# systemctl start pmwebd
[root@rhel7u5a ~]# systemctl enable pmwebd

At this point, multiple frameworks to generate graphics from our PCP metrics are available when accessing URL http://<ip>:44323/ of this system with a web browser.

Graphite is especially interesting; it can be accessed via http://<ip>:44323/graphite . When using the search function of the Graphite interface, we notice that our metrics are presented with the hostname prepended, for example rhel7u5a.trivial.temp1 .
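
The Graphite endpoint could also be queried from a script instead of a browser. The sketch below only builds the request URL. It assumes that pmwebd's Graphite emulation accepts the render call of the regular Graphite API with the target, from and format parameters; this has not been verified against the pmwebd version used here. The IP address is a documentation placeholder and graphite_render_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def graphite_render_url(host, target, since="-1h", port=44323):
    """Build a Graphite-style render URL for one metric (assumed API)."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return "http://%s:%d/graphite/render?%s" % (host, port, query)


# 192.0.2.10 is a placeholder address; replace it with your own system.
print(graphite_render_url("192.0.2.10", "rhel7u5a.trivial.temp1"))
```

The resulting URL can then be fetched with urllib.request.urlopen or any other HTTP client.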

At this point, we have also installed a version of Grafana. Newer versions provide additional functionality, but as they are not part of Red Hat Enterprise Linux, they are not covered by Red Hat Support. In Grafana, Graphite can be selected as the data source, with http://<ip>:44323/graphite as the URL for accessing our PCP metrics. The following graph was created with this combination:

[Figure: Visualizing system monitoring]

Final thoughts

As you have seen, monitoring custom metrics with PCP is easy to set up. How do the I/O stats and temperatures of your systems vary over the day? How do they change after activating different tuned profiles? On a laptop, are temperature values different after running powertop --auto-tune? If you run multiple web servers behind a load balancer, you could now set up logging of the access statistics and verify whether they receive equal numbers of accesses per second.


About the author

Christian Horn is a Senior Technical Account Manager at Red Hat. After working with customers and partners since 2011 at Red Hat Germany, he moved to Japan, focusing on mission critical environments. Virtualization, debugging, performance monitoring and tuning are among the recurring topics of his daily work. He also enjoys diving into new technical topics, and sharing the findings via documentation, presentations or articles.
