Issue #10 August 2005

Deploying RHN: One sysadmin does more with less

The problem

We had the classic IT problem: we had to find a way to do more with less. We had 35 servers and four system administrators, and life was good. The servers were stable, and the applications were running great. Then, as things in the IT world are apt to do, things changed. The number of projects was growing, and we had to increase the number of servers to keep up with this new demand for our services. However, we had to do it with fewer resources, because now there were only two system administrators. As both the IT manager and one of the system administrators, I was one of the people held responsible for finding a new technology platform that we could administer with only two people. Our old technology solution had been good, not perfect, but good. Yet we needed something better. We could no longer afford the luxury of having additional system administrators; we would have to find a solution that would automate more of our administration duties. The solution would also have to be scalable, stable, and a supported platform for our software, which is predominantly Sun® ONE™. Sneaking in behind all these problems was the tail end of every IT issue: downtime would have to be kept to a minimum.

I am employed by the Center for the Application of Information Technologies (CAIT), a part of Western Illinois University. CAIT develops, deploys, and supports online learning systems and applications for educational entities, businesses, public agencies, and not-for-profit organizations. These learning systems support users all over the world. This requires us to keep our systems up 24/7 and to stay as close as possible to five nines of uptime. We have users depending on us to stay operational and to keep their information secure.

To find a solution, we first had to examine which aspects of administration took the most time. Our technology at the time was composed of a mix of different servers. Our original plan called for a heterogeneous environment of Sun® Solaris™ UNIX® 8 and 9 and Debian® Linux®. The core application servers were all Solaris. They were running either Sun ONE Web Server, Sun ONE Directory Server, or MySQL. To complement these servers and to save money, we had Debian Linux servers. These servers were mainly running supporting applications such as email, help desk ticket software, listservs, and storage systems. As we examined our methodologies, we noticed that we were losing the most time in package management and security fixes. We also lost some time in configuring and troubleshooting the Solaris servers, as they tended to be more complex than our Debian Linux servers.

We spent hours upon hours trying to keep our servers up-to-date. Every week we would spend time poring over security and patch emails for the Solaris servers, trying to find out whether any of them applied to us. If we found any, we would determine how critical each one was for us and how soon we would have to patch or upgrade. As soon as our system downtime came around, we would patch the servers and reboot if necessary. However, we had no centralized methodology, so we would have to go to every server that had the issue and upgrade or patch them one at a time. We put the patches on a centralized NFS mount to save the time of copying them to every server. The Debian servers' package management was easier. We still spent hours poring over security and patch information every week, and we had to log in to every server to patch it, but the upgrading and patching did not require us to download anything beforehand. We could run a simple apt-get dist-upgrade and see all of our upgrades and fixes download and install themselves automatically. This did save considerable time over the Solaris servers; however, we still wasted time going to each and every server. There was still no central package management system. We did investigate alternative solutions such as cfengine. It is open source software that lets you run or apply commands to multiple systems at once, so it could theoretically let us patch all the systems at once if we wanted to. It would take time to set up, though, and it would be an additional piece of third-party software we would have to install on our servers. I wanted a more encompassing solution that would address all of these issues, with package management that was an integrated part of the operating system rather than an add-on.
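To make the per-server cost concrete, here is a minimal sketch of the kind of loop a two-person team might script by hand in the absence of a central package manager. It is not what we ran at CAIT: the host names, the root login, the use of ssh, and the -y flag are all assumptions made for illustration. It only builds and prints the commands; it does not execute anything.

```python
import shlex

# Hypothetical host names; the article does not list the actual servers.
DEBIAN_HOSTS = ["mail.example.org", "helpdesk.example.org", "lists.example.org"]

# The upgrade command named in the article. The -y flag is an assumption,
# added so the command would not stop for confirmation on each host.
UPGRADE_COMMAND = ["apt-get", "-y", "dist-upgrade"]


def build_upgrade_commands(hosts, remote_user="root"):
    """Return one ssh argument list per host, in the order given."""
    # Quote each part so the remote shell sees the command exactly as written.
    remote = " ".join(shlex.quote(part) for part in UPGRADE_COMMAND)
    return [["ssh", f"{remote_user}@{host}", remote] for host in hosts]


if __name__ == "__main__":
    # Print the commands instead of running them, so an administrator
    # can review the list before touching any server.
    for argv in build_upgrade_commands(DEBIAN_HOSTS):
        print(" ".join(argv))
```

Even scripted this way, the work still scales with the number of hosts, and nothing here reports which servers actually need a given patch. That gap is what a central service would have to fill.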

The solution

The solution to our problem ended up being right under our noses. Months before the changes that would cause our IT problems, we had discussed, at one of our system administrators' meetings, the strengths and weaknesses of various flavors of UNIX and Linux. During this conversation one of our system administrators advocated switching some services to Red Hat® Enterprise Linux®. He pointed out various benefits of Red Hat Enterprise Linux and Red Hat Network (RHN). I was impartial toward Red Hat at the time; it had its strengths and weaknesses, but I had never used it in conjunction with RHN. My experience with Red Hat was mainly with its involvement in Fedora™. We agreed to order a few new servers and install Red Hat Enterprise Linux on them. We would then migrate some services over and evaluate further deployment afterward. After the installation I logged into RHN for the first time and was very impressed. Our two servers were listed along with a status report on them. They were obviously up-to-date, which the graphics indicated with a blue check symbol. What impressed me most was that I could browse each system's packages and see which ones were installed and which were not, and then I could push out the packages or updates that the server needed. I could even schedule these events to happen when I needed them to. It was not long after we evaluated RHN for the first time that we were left with only two system administrators.

When the time came to look for a solution to our problems, we came back to our Red Hat Enterprise Linux servers. Red Hat Network and Red Hat Enterprise Linux used together offered us a means to manage and deploy packages and updates to multiple servers at one time, with only a click of a button. We could then schedule those upgrades and patches during downtime or at our convenience. RHN also emailed us whenever there were relevant updates and patches. Migrating all of our servers would save us tremendous time in keeping our servers managed. We would still research any possible security fixes, but being notified about them, along with a notice that a patch is available, is extremely useful. I also liked that, like Debian, Red Hat is very easy to configure and manage, with the Red Hat graphical tools making some management duties even easier. Best of all, Red Hat Enterprise Linux is a supported platform for our Sun ONE software. I ordered twenty licenses for the initial rollout. This was our solution.

I was very excited when we received the initial email confirming our license purchase. Our first rollout would be a replacement for our aging MySQL server. We ordered all brand-new servers to help with the upgrade, so that we would not have to shut down the old servers until the new ones were 100% operational. The installation went smoothly; even the update after the install had no issues. The next step was installing and configuring MySQL, which was handled by one of our chief technologists. Again there were no issues, as MySQL came up and responded to our queries. Over the course of the next year we gradually migrated servers to our new technology solution.

The results

Today RHN and Red Hat Enterprise Linux have become a staple of our administration strategy. In the morning we receive our emails from RHN letting us know the status of our servers. They tell us which servers need a security patch or an upgrade, and whether any server is not checking in with RHN. After reviewing the information, one of us will log into RHN and review the list again. Then, using our server group, which has all our servers in it, we fix all the problems with a single click of a button. Depending on the issue, we will occasionally schedule the upgrade to occur during off hours. We also use RHN when one of us is on the road or at home. It lets us check up on the servers and keep up with patches while we are away.

Almost all twenty licenses are deployed, and we have had few to no problems. Our servers are running smoothly and our applications are running great. With both the software and hardware changes, we are running faster than ever. RHN and Red Hat Enterprise Linux freed up enough time for just two of us to effectively manage and administer the same number of servers that used to take four of us.

We had the IT problem and we solved it! We are doing more with less. The only regret I have is that we can't put the few non-Red Hat servers we have left on RHN to manage their packages.

About the author

David Kirlin is the IT Manager at the Center for the Application of Information Technologies. He has over 11 years of Linux experience and is an advocate for intelligent solutions. His free time is filled with writing and various forms of activism.