What actions do you take when patching goes wrong?
Patching and updating systems is a key step in reducing possible attack vectors against your infrastructure. Systems in your environment that are not up to date with patches can expose attack vectors that you don’t know about and that could affect your entire organization. However, what plan do you have in place for when a patching event doesn’t go as expected?
For example, dependencies might not be met, there could be mismatched versions across i686 and x86_64 RPMs, new package versions might not work as expected, or something else might go wrong. When something goes wrong, it’s important to have a plan for how to proceed. This will reduce the stress level and ensure that everybody working on the task knows what the other people are doing.
Test patches
When patching, it’s important to first test the patches in a test environment that matches your production environment. If you test the patches on different hardware, with different versions of software, or with different workloads or processes than your production environment, there is no guarantee that the patching test results will reflect what will happen in production. Once you have a test environment where you can verify that a given patch bundle should be installed, you will greatly reduce the chances of issues when installing the updates in production.
Failed patches
If updates fail to install, the first thing to do is capture any output on the console or in the logs. This might be as simple as copying a log file to a separate location or copying the text displayed on the console screen. Depending on how patching was attempted, you might want to try rerunning the updates, this time with verbose output enabled. Once you have the error output, look for any differences between your test environment and production. You also need to verify that all of the patches were applied in testing and that no patches or errors were accidentally missed. One essential item to check is the list of installed RPMs on your test server, which you can compare with the versions on your production server.
For example, on the production server:
# rpm -qa --queryformat '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort &> /tmp/rpm-qa.prod.output.txt
You could then compare that against the output collected on your test server in its /tmp/rpm-qa.dev.output.txt.
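One way to make that comparison is with comm, which prints the lines unique to each of two sorted files. The following is a minimal sketch, assuming both listings have been copied to the same machine. The function name compare_rpm_lists is hypothetical and used only for illustration.

```shell
# Hypothetical helper: report packages that appear in only one of two
# `rpm -qa` listings. $1 is the production list, $2 is the test list.
compare_rpm_lists() {
    prod_sorted=$(mktemp) || return 1
    test_sorted=$(mktemp) || return 1

    # comm requires sorted input; re-sort with a fixed locale in case
    # the two hosts used different collation rules.
    LC_ALL=C sort "$1" > "$prod_sorted"
    LC_ALL=C sort "$2" > "$test_sorted"

    echo "Only in production:"
    LC_ALL=C comm -23 "$prod_sorted" "$test_sorted"

    echo "Only in test:"
    LC_ALL=C comm -13 "$prod_sorted" "$test_sorted"

    rm -f "$prod_sorted" "$test_sorted"
}
```

With both files on one host, you could run it as compare_rpm_lists /tmp/rpm-qa.prod.output.txt /tmp/rpm-qa.dev.output.txt. Any package listed under either heading is a version or architecture mismatch worth investigating.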
You should also check to make sure that the available yum repositories are the same on both systems. You can do this in three simple steps.
First, clear the cache:
# yum clean all
# rm -rf /var/cache/yum/*
Next, refresh subscription-manager:
# subscription-manager refresh
Third, list the repositories in yum with the -v argument so that you can see extra information such as the date the repository was last updated and the number of packages in each repository:
# yum repolist -v
In the following example, we’ll look at the rhel-8-for-x86_64-appstream-rpms repository used by a client to my Red Hat Satellite server:
Repo-id : rhel-8-for-x86_64-appstream-rpms
Repo-name : Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
Repo-revision: 1605844838
Repo-updated : Thu 19 Nov 2020 11:02:03 PM EST
Repo-pkgs : 13,502
Repo-size : 31 G
Repo-baseurl : https://opendemo.usersys.redhat.com/pulp/repos/opendemoorg/Library/content/dist/rhel8/8/x86_64/appstream/os
Repo-expire : 1 second(s) (last: Wed 31 Dec 1969 07:00:00 PM EST)
Repo-filename: /etc/yum.repos.d/redhat.repo
The key lines here are Repo-id, Repo-updated, Repo-pkgs, and Repo-baseurl. If your test and production systems show different information for their upstream repositories, then there is a chance that dependencies might not be met or something else might fail. If that is the case, you would need to investigate why the systems are seeing different information.
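To compare only those key lines between systems, you could filter the verbose output before saving it. This is a minimal sketch. The function name repo_key_fields is hypothetical, and it assumes the field labels appear at the start of each line, as in the example output above.

```shell
# Hypothetical helper: keep only the repository fields worth comparing
# between hosts. Reads `yum repolist -v` output on standard input.
repo_key_fields() {
    grep -E '^Repo-(id|updated|pkgs|baseurl)[[:space:]]*:'
}
```

For example, you might run yum repolist -v | repo_key_fields on each system, save the results to a file, and compare the two files with diff.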
Other settings
Suppose the test and production systems have the expected RPMs and same repositories, but patching is still failing. In that case, other possible causes could be a misapplied security setting, low disk space, or maybe incorrect user permissions. To investigate those, checking logs such as /var/log/messages, /var/log/secure, and /var/log/audit/audit.log might be helpful, as well as using the command df -h to check disk space. Also, Red Hat customers are welcome to open a support ticket for assistance in resolving the issue.
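As a rough illustration of the disk space check, the sketch below flags file systems at or above a usage threshold. The function name full_filesystems and the default threshold of 90 percent are assumptions for this example. It reads df -P output, which uses a fixed column layout, and it does not handle mount points that contain spaces.

```shell
# Hypothetical helper: print mount points whose usage is at or above a
# threshold percentage (default 90). Reads `df -P` output on stdin.
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5                  # capacity column, e.g. "95%"
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6, $5
        }'
}
```

You could run it as df -P | full_filesystems 90. A file system that appears in the output, especially /var or /usr, might not have enough room for the updated packages.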
Wrap up
There are many possible causes for patches that fail to install correctly, but being able to compare your test environment against your production environment will make troubleshooting much easier. Configurations, dependencies, workloads, and repositories should all be the same in the two environments.
Peter Gervase
I am a Senior Principal Security Architect at Verizon. Before that, I worked at Red Hat in various roles, including consulting and as a Solutions Architect, where I specialized in Smart Management, Ansible, and OpenShift.