When we first launched in-place upgrades using the new Leapp framework at Red Hat Summit in 2019, we asked our panel of Red Hat Accelerators if it was something they’d actually use.  

They ALL said, "probably not." They viewed major version Red Hat Enterprise Linux (RHEL) upgrades as a chance to replace hardware and to clear up technical debt. But things seem to have changed. Here's what our customers are telling us in 2023:

Approaches to updating a Red Hat Enterprise Linux instance -- 65% responded "Use Red Hat upgrade tools on OS image in place", 32% responded "Deploy a new OS image and reinstall applications"

Source: Annual Linux market report for 2021

What caused this change? Almost all of our customers have started thinking about hardware upgrades and software upgrades as separate challenges. Our solutions architects in the financial sector have been reporting that all of the big banks are planning on using in-place upgrades for 100% of their systems going forward.

It’s something to ponder as RHEL 7 approaches End of Maintenance in June 2024.

While major version upgrades have been a major pain point historically, there’s great upside for all RHEL users in this growing trend. Big customers have taught themselves to fully automate the process (most frequently using Ansible Playbooks and Leapp actors). The logic that drives these automations is increasingly available to the entire Red Hat community, where even more modest deployments can benefit.

What does an automated upgrade at scale look like?

Red Hatter Bob Mader was at one of those large banks in 2019, where he piloted an effort to automate 80,000 in-place upgrades from RHEL 6 to 7. End of support was approaching rapidly (sound familiar?). His team developed a high-level automation process.

RHEL In-place Upgrade Automation - Key Features to Succeed at Scale (slide)

Bob and his team then automated everything, end-to-end, with a process that culminated in a push button that the application owners themselves could click to get an automated in-place upgrade. 

The most important risk-reduction element, and the idea that enabled the most buy-in from the apps teams, was having a snapshot rollback capability. It removed the concern that the upgrade is too risky. If something went wrong (which happened less than 1% of the time), they just rolled back. 

At the heart of any automation, you’ll need to create custom modules (usually Ansible Playbooks). The Leapp framework doesn't magically enable flawless upgrades. Instead, Leapp updates the Red Hat-signed packages to the packages needed with the newer kernel. If you’re like most organizations, you've got tools and agents on your standard RHEL builds (security scanners, backup and recovery tools, Chef, Puppet, etc.). All of these tools will likely need to be upgraded, re-registered or reconfigured separately.

Finally, Bob’s team learned that it smooths out the process if you build a reporting dashboard. A lot of customers will use a tool like Splunk for this, but it could also be done with an open source solution. This dashboard is a way to visualize the upgrade process as well as the upgrade data and the upgrades being completed over time.

This reporting piece is so important. It helps you navigate from “Why can't your people manage the life cycle of this platform?” or “We're going to get fined by the regulators because we're out of compliance!!!” to your leadership seeing lots and lots of upgrades happening every month and a compliance report that starts looking much, much better. If you can demonstrate you’re upgrading a thousand hosts a month, the boss will be really, really happy.
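
As a rough illustration of the "upgrades completed over time" view, here is a minimal Python sketch that counts completed upgrades per month. The function name and the input format (a plain list of completion dates) are assumptions made for this example; a real dashboard would pull the data from whatever reporting tool you use.

```python
from collections import Counter
from datetime import date


def upgrades_per_month(completed: list[date]) -> dict[str, int]:
    """Count completed upgrades per calendar month, keyed as 'YYYY-MM'.

    `completed` is a hypothetical input: one date per host whose
    in-place upgrade has finished.
    """
    counts = Counter(d.strftime("%Y-%m") for d in completed)
    # Sort by month key so the result reads as a timeline.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [date(2023, 5, 2), date(2023, 6, 1), date(2023, 5, 30)]
    for month, total in upgrades_per_month(sample).items():
        print(month, total)
```

A per-month count like this is the kind of number that can sit next to a compliance report to show the trend.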

The completed implementation

The final portrait of the upgrades infrastructure you’ll create will look like this:

The completed implementation illustrated, with an Analysis Phase, an Upgrade Phase, and a Commit Phase

This workflow is for actual Linux system engineers. The little icons in the corners indicate which of these are automated with Ansible Playbooks. There are three phases:

The first phase is building the pre-upgrade report. This report allows you to assess if there are any upgrade blockers. If there are, you will receive advice on how to apply recommended remediations.  After you address these, you run the report again, and iterate through that sequence until the blockers are removed. You should also automate remediations where possible so you don't have to deal with manual steps on every system. Once automated remediations are built and executed, most pre-upgrade reports will reflect that your systems are ready to upgrade. 
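
The report, remediate, repeat loop described above can be sketched in Python. This is a sketch under stated assumptions, not Leapp's API: it assumes the pre-upgrade report has been loaded into a dictionary with an "entries" list, and that blocking entries carry an "inhibitor" tag in a "groups" or "flags" list (check the report your Leapp version actually produces). `run_report` and `remediate` are hypothetical callables standing in for your own automation.

```python
from typing import Callable


def find_inhibitors(report: dict) -> list[str]:
    """Return the titles of report entries tagged as upgrade inhibitors."""
    titles = []
    for entry in report.get("entries", []):
        # Assumption: the tag lives in "groups" or "flags", depending on version.
        tags = set(entry.get("groups", [])) | set(entry.get("flags", []))
        if "inhibitor" in tags:
            titles.append(entry.get("title", "<untitled>"))
    return titles


def iterate_until_clean(
    run_report: Callable[[], dict],
    remediate: Callable[[list[str]], None],
    max_rounds: int = 5,
) -> int:
    """Re-run the report and apply remediations until no inhibitors remain.

    Returns the number of report runs it took. Raises RuntimeError if
    inhibitors are still present after `max_rounds` runs.
    """
    for round_number in range(1, max_rounds + 1):
        blockers = find_inhibitors(run_report())
        if not blockers:
            return round_number
        remediate(blockers)
    raise RuntimeError(f"inhibitors still present after {max_rounds} rounds")
```

Capping the number of rounds keeps a host with a blocker that automation cannot fix from looping forever; such hosts are the ones that need a manual look.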

If you use Red Hat Satellite (in this case, to provide a communication path to your hosts), you can create a consolidated report using Red Hat Insights tasks.

The second phase is the upgrade phase. With your pre-upgrade report in hand, and your remediations developed, you can schedule the maintenance window to do your upgrade. There are usually two playbooks for this step:

  • The first creates the recoverable snapshot (we're not going to make any changes to the host before we create this!) 
  • The second is the upgrade playbook, which runs our custom modules (the pieces that have to run before and after the upgrade) at the appropriate times. 

In the middle, the automations will perform the Leapp upgrade itself, which is what actually brings the host up to the next major version.
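
The ordering described above (snapshot first, then pre-upgrade tasks, the Leapp upgrade, and post-upgrade tasks) can be sketched as follows. In practice these steps would be Ansible Playbooks; the Python callables here are hypothetical stand-ins used only to show the sequence, and none of the names come from Leapp or Ansible.

```python
from typing import Callable, Sequence

# A step is any callable that takes a hostname and raises on failure.
Step = Callable[[str], None]


def upgrade_host(
    host: str,
    create_snapshot: Step,
    pre_tasks: Sequence[Step],
    leapp_upgrade: Step,
    post_tasks: Sequence[Step],
) -> list[str]:
    """Run the upgrade phase for one host and return the steps completed."""
    completed: list[str] = []

    def run(name: str, step: Step) -> None:
        step(host)  # an exception here stops the whole sequence
        completed.append(name)

    # The snapshot comes first: no changes are made to the host before it exists.
    run("snapshot", create_snapshot)
    for index, task in enumerate(pre_tasks):
        run(f"pre[{index}]", task)
    run("leapp", leapp_upgrade)
    for index, task in enumerate(post_tasks):
        run(f"post[{index}]", task)
    return completed
```

Because any failing step raises, a snapshot that cannot be created stops the run before anything on the host is touched.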

The third phase is the commit phase. The upgrade has completed, the system is up on the new OS version, and it's time to take a look and see if the applications are still working. 99% of the time the systems will work. RHEL system libraries, which have really good compatibility attributes, help ensure that applications run as expected. If you get through the upgrade and you realize, “Oh my goodness, the application is not working,” you have the rollback button to take you right back to the prior state, without causing an outage. 

In this final phase, you typically let the systems burn in for a while. Once they seem solid and everyone says, “Yeah, I think everything's working,” you can execute the commit phase, which deletes the snapshots, recovers the storage space, and you’re done.
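
The decision at the end of this phase can be written down as a small rule. The sketch below is illustrative only: the burn-in length, the health flag, and all names are assumptions, and how you actually judge application health is up to your apps teams.

```python
from enum import Enum


class Action(Enum):
    ROLLBACK = "rollback"  # restore the snapshot, back to the prior state
    WAIT = "wait"          # keep the snapshot, continue the burn-in
    COMMIT = "commit"      # delete the snapshot and recover the storage


def next_action(app_healthy: bool, days_elapsed: int, burn_in_days: int) -> Action:
    """Pick the next step for an upgraded host that still has its snapshot.

    `burn_in_days` is a hypothetical policy value chosen by your team.
    """
    if not app_healthy:
        return Action.ROLLBACK
    if days_elapsed < burn_in_days:
        return Action.WAIT
    return Action.COMMIT
```

Checking health before elapsed time means a broken application triggers a rollback at any point in the burn-in, and a commit only happens once the host has been healthy for the whole period.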

In conclusion

This probably feels like a lot of steps, but they can be handled incrementally. And you don't have to create everything from scratch. Red Hat has a lot of RHEL upgrade experts at this point, and we’re releasing a ton of content upstream. If you're a smaller company, you might be thinking, “Wow, this sounds pretty hard for a group like ours,” but you’ll just follow the same steps, using a smaller version of this process with fewer playbooks. The basic principles still apply.  And, of course, we’re here to help. 

More about RHEL upgrades

Documentation

Videos

Community automations and helpers

https://github.com/oamg/ansible-leapp

This upstream repo has the collection of Ansible roles for automating RHEL in-place upgrades using Leapp. These roles provide standardized methods for using the Leapp framework to perform pre-upgrade analysis and the RHEL upgrade itself. When you are ready to develop your own custom playbooks to run upgrades for your enterprise, you should consider using roles from this Ansible collection to make your job easier. 

https://github.com/oamg/leapp-supplements

The Leapp-supplements repo is where you can find example custom Leapp actors for handling common third-party products and specific requirements. These actors can act as pre-upgrade checks in addition to those included in the mainstream Leapp framework. Custom actors may also implement custom automation that can't be done with Ansible when tight integration with the RHEL upgrade tooling is required. It also has the Makefile for custom actor RPM packaging.

We’re here to help

As automated upgrades have evolved over the past several years, Red Hat Consulting Services  has been instrumental in assisting many customers with very large enterprise conversions. If the thought of converting a large environment has you feeling overwhelmed or unsure where to begin, Red Hat Consulting Services can share their expertise and guidance to help you get there, and possibly save you time and money in the process.


About the authors

Bob Mader is an industry veteran with a lifetime of experience in IT dating back to the 1980s. Before coming to Red Hat in 2022, he held software consulting roles at DEC/HP and later moved to the banking industry as a pioneer leading Wall Street's early adoption of Linux. Today as a member of Red Hat's Customer-led Open Innovation team, he is committed to growing the community that's developing automation to make RHEL in-place upgrades successful at enterprise scale.


Bob Handlin has helped build and promote products in various parts of the tech industry for more than 20 years. He currently focuses on RHEL migrations and upgrades, but also assists with storage technologies and live patching.
