Not applying updates? You're doing it wrong
All your excuses for not doing updates—from fear of downtime to concerns about testing—are wrong.
I once worked in an IT department where making zero changes to production applications was the norm. That all changed when the company's newly formed information security team did a risk assessment.
The logic behind the department's policy was that any change, any change at all, could affect what was a known-good working system. As a result, zero changes were authorized. We would stand up new systems, but after their initial configuration and testing, they were frozen in amber, never to be changed again.
[ Learn how to manage your Linux environment for success. ]
This was not the first place I worked where applying updates was viewed with skepticism. I have heard a variety of reasons over the years why organizations do not apply updates, including:
- Updates might break something.
- Our vendor is certified on this specific version.
- The updates aren't tested.
- We cannot take downtime.
- The system doesn't connect to the internet.
- Our other security practices mean updates aren't needed.
However, here's the thing: While some of those statements contain a kernel of truth, none of them holds up as a reason to skip updates entirely.
Bad reasons to skip updates
I'll consider each of these objections and offer my counterpoint below.
1. It might break something or it hasn't been tested
Whenever there is a change to a working system, there is a potential that something may no longer work as expected. But that is true of many technology tasks. If someone changes firewall rules, the firewall may block or allow traffic that it did not before. If users change database content, such as adding records, the database is fundamentally not the same as it was previously. Change happens constantly in IT environments. Attempting to ignore changes does not mean they are not happening.
Often when I hear this argument, it is because the organization does not have testing or validation procedures, or the procedures it has are ineffective. Robust testing and validation procedures can significantly reduce the risk to production workloads. Don't get me wrong; I'm not saying that testing will remove all possible risk, but it will significantly reduce it. Furthermore, the more practiced an organization becomes at testing, the more expertise it develops and the better its results.
2. Our vendor is certified on this specific version
I hear this one a lot. First, your vendor wants to keep you as a customer. If they are not meeting your business needs, you need to provide that feedback to them. Red Hat Enterprise Linux (RHEL) offers several extended lifecycle options that allow systems to maintain a specific update release while receiving critical security updates. Extended Update Support (EUS) is available for RHEL releases in a full-support lifecycle. This add-on service allows systems to stay on specified releases longer.
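On a RHEL system with an EUS-entitled subscription, pinning to a specific minor release is done with subscription-manager. A sketch (the 8.6 release number below is only an example; use an EUS-eligible release for your environment):

```shell
# Pin the system to a specific minor release (requires an
# EUS-enabled subscription; 8.6 is an illustrative example).
subscription-manager release --set=8.6

# Confirm the pinned release, then clear cached repository
# metadata so future transactions resolve against it.
subscription-manager release --show
dnf clean all
```

Clearing the release pin later with `subscription-manager release --unset` returns the system to tracking the latest minor release.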
[ Try this no-cost online course: Red Hat Enterprise Linux technical overview. ]
3. We cannot take downtime
This is my least favorite contention because it ultimately smacks of poor architecture or management. Everything requires maintenance. Everything. Business-critical assets like fuel pipelines, aircraft, trucks, ships, and buildings all receive regular maintenance. Without maintenance, these critical business assets would be at risk of random breakdowns (which can still occur), but with regular inspections and maintenance, their longevity and reliability are far improved.
Computers are no different. If this computer system is so business critical that it cannot be down for any amount of time, you are just waiting for a critical failure to disable it. Eventually, you will have a bad disk, power supply, memory stick, motherboard, or countless other things that can go wrong with it. You will take downtime; it is only a question of when. Regular, scheduled maintenance allows updated software to be applied, environments validated, and, if needed, additional maintenance procedures to be performed.
[ Get the guide to installing applications on Linux. ]
4. The system doesn't connect to the internet or our security practices mean it's not needed
Security practices are often layered. How many locks does your front door have? If you drive a car, do you both lock the doors and take the key with you? When you withdraw money from an ATM, don't you insert your unique card and enter a secret code to access your accounts?
System security shouldn't be different. Maybe this system does not connect to the internet, but what about the systems it connects to? As mentioned above, changes happen all the time in computing environments. Even if the system isn't supposed to have access to the internet, if someone makes a mistake, such as connecting it to the wrong VLAN during a switch update, what then?
How to implement a consistent update strategy
There is a reason security standards include system updates within compliance requirements. Without applying at least some of the updates released for your system, you are increasing the risk that something will happen that causes you to have an unplanned outage or, worse, some sort of system compromise. In the case of the employer I mentioned above, adopting a security policy and the auditing that accompanied it caused the company to reverse course on updates. An audit of the systems in production showed many unmitigated vulnerabilities, many of which had known exploits that allowed attackers to access systems or data.
Once we identified that updates were required, we set about determining what updates we should target. Ultimately, we settled on Critical and Important security updates because those are produced throughout RHEL's lifecycle. Further, the security policy's intention was to close vulnerabilities that could pose a risk to the systems or business, so bug fixes or feature enhancement updates did not meet this goal. Lastly, the company did not include Low- or Moderate-rated updates in the strategy because the vulnerabilities they addressed were not considered risky enough to justify additional changes to the environment while RHEL was in its full-support lifecycle phase.
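A severity-scoped policy like this maps directly onto yum/dnf's security options. A sketch (verify the exact flag spelling with `dnf upgrade --help` on your release):

```shell
# List pending security advisories along with their severity ratings
dnf updateinfo list --security

# Apply only Critical- and Important-rated security updates,
# leaving bug-fix and enhancement errata untouched
dnf upgrade --security \
    --sec-severity=Critical \
    --sec-severity=Important
```

The same flags work for reporting first and patching later, which makes it easy to generate the list of applicable advisories during change-review before the maintenance window.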
After deciding the scope of updates, the next task was determining the frequency. Our security policy mandated applying updates within 30 days of vendor publication.
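As a quick sanity check during planning, GNU date can compute the patch-by deadline from an advisory's publication date. A minimal sketch (the date below is made up):

```shell
# Compute the 30-day patch deadline from an advisory's
# publication date (GNU date syntax).
published="2024-05-01"     # example publication date
deadline=$(date -d "$published +30 days" +%Y-%m-%d)
echo "Patch by: $deadline"
```

Folding a check like this into reporting scripts makes it obvious which pending advisories are approaching their compliance deadline.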
[ Learn about enabling live kernel patching on Linux. ]
Since we were interacting with the systems more frequently, we tried to combine update procedures and any downtime they included (like reboots) with hardware updates. Beyond RHEL, we also had to consider things like firmware updates from our system suppliers and updates provided by layered software, like database vendors. Often, we could combine all of these events in a single maintenance window for systems. If possible, we would also schedule hardware replacements in this window. As an added benefit, since we were inspecting and interacting with the machines regularly, we could better track the hardware and software inventory and get a picture of our overall population makeup and health.
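One way to decide whether a given window actually needs a reboot is the needs-restarting utility that ships with dnf-utils/yum-utils. A sketch:

```shell
# needs-restarting -r exits 0 when no reboot is required
# and 1 when the running kernel or core libraries are stale.
if needs-restarting -r; then
    echo "No reboot required; the window can be package-only."
else
    echo "Reboot required; schedule it in the maintenance window."
fi
```

Running this after applying updates, rather than rebooting unconditionally, keeps short windows short while still catching the cases where a restart is genuinely necessary.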
During our first round of environment updates, we found some unexpected systems: a couple of Red Hat Linux 7 systems (not Red Hat Enterprise Linux, but its predecessor) and a mission-critical system responsible for producing tens of millions of dollars of revenue that had been running for 10 years on a single server, with no backups and no disaster recovery plan if it were to fail.
You wouldn't fly on an airplane without maintenance for a decade for fear of a catastrophic failure. Why wouldn't you protect your business-critical applications from a similar fate?
For more insight, watch this Into the Terminal episode about CVE mitigation.
An automated patch-management system helps keep your server infrastructure patched and maintained in a timely manner.
Kernel live patching is a great way to keep your infrastructure updated while minimizing manual work and avoiding system restarts.
Use automation to reduce the time IT teams spend deploying patches and apply updates consistently across systems.
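On RHEL 8 and later, the kernel live patching mentioned above is delivered through kpatch. A minimal setup sketch (package and plugin names assume the kpatch-dnf plugin is available for your release; consult your RHEL documentation):

```shell
# Install the live-patching tooling and the dnf plugin
dnf install -y kpatch kpatch-dnf

# Subscribe the running kernel to live-patch updates so
# applicable kpatch-patch packages are pulled in automatically
dnf kpatch auto

# Show which live patches are currently loaded
kpatch list
```

Live patches cover selected Critical and Important kernel CVEs between reboots; they complement, rather than replace, the scheduled maintenance windows described above.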