A few weeks ago, I talked with the venerable Ken Hess on the "Red Hat Enterprise Linux Presents …" live stream. The topic of discussion was general systems administration practices, and it became clear that Ken and I have very different opinions of what that work entails.
Both Ken and I worked in what I can only describe as the Gilded Age of systems administration. In that era, administrators would lovingly handcraft the systems they administered. There was literally a guild, which still exists today: the System Administrators Guild. Systems were also incredibly expensive during this period, particularly Unix compute hardware, so an administrator would often manage only 5-20 servers. My Silicon Graphics Indy workstation, pictured below, was about $26,000 when new.
In that era, we needed different skills to be effective administrators, and we spent a lot of our time on tasks like:
Storage planning: Large disk drives were 1GB, and filesystems didn't support features like resizing. When you installed a system and set up its storage, the sizes you chose for the different filesystems and their placement on the disk were important for ensuring the machine's longevity. If you chose poorly, you'd find yourself redoing the whole thing months later and restoring content from backup.
Software management: Packaging of software was almost non-existent. Generally, you downloaded a source archive, compiled it, and installed it on the machine. Because that software wasn't packaged, you, as the administrator, then got to maintain it as well. That meant watching the upstream project (Apache, for example) for new releases, then downloading, compiling, and installing each update yourself. What fun, right?
Recompiling the kernel: If you were lucky, when you needed an extra device on your machine, like a tape library, scanner, or optical storage, the system's kernel already had the drivers for it. Oftentimes, it didn't. That meant you got to recompile your kernel to add the driver or, again if you were lucky, build it as a loadable kernel module. What a great way to spend a day at work!
Managing individual processes: These systems were often not single-purpose. You might have one machine serving web content while also running data analytics or rendering jobs, acting as the mail server, providing DNS for the organization, and serving files. Because the system did so many things, a runaway BIND process or Apache daemon could drastically impact your organization. That meant checking on the systems and looking at their processes frequently to catch problems early, or writing your own scripts to run from cron to notify you of potential issues. Unlike today, we didn't have comprehensive monitoring applications; we had to write them ourselves (a minimal sketch of that kind of script follows this list).
Managing users: Because systems were multi-purpose, they were used by multiple people doing different things, and those people needed access to different systems. That meant you also managed individual accounts across your fleet. There were some central user services, like NIS, but at the time they offered little control over which systems a particular user could access: anyone in the central user service could log in to any system in the organization. If you worked somewhere less open than that, you got to spend your time running useradd and userdel to maintain who had access to which systems.
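To give a flavor of the kind of watchdog script we ended up writing ourselves, here is a minimal sketch of a cron-run check in Python. The daemon names, the CPU threshold, and the use of Python rather than the shell or Perl of the day are all illustrative assumptions; it relies on a Linux-style ps and on cron mailing a job's output to its owner.

```python
#!/usr/bin/env python3
"""Cron-run watchdog sketch: flag watched daemons that are eating CPU."""
import subprocess

CPU_THRESHOLD = 80.0          # hypothetical: flag anything above 80% CPU
WATCHED = {"httpd", "named"}  # hypothetical daemons worth watching

# Ask ps for each process's command name and CPU usage, with no header line.
out = subprocess.run(
    ["ps", "-eo", "comm,%cpu", "--no-headers"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines():
    parts = line.rsplit(None, 1)
    if len(parts) != 2:
        continue
    comm, cpu = parts
    if comm in WATCHED and float(cpu) > CPU_THRESHOLD:
        # cron mails anything written to stdout to the job's owner.
        print(f"WARNING: {comm} is using {cpu}% of a CPU")
```

Dropped into a crontab entry, anything the script prints lands in the administrator's mailbox, which was about as much alerting as most of us had.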
Clearly, today, we have many technologies that have made these tasks obsolete, from central user management and monitoring services to packaging formats and better software and hardware ecosystems. That also means we spend our time at work on different tasks. All these improvements to our technology over the years mean that we now administer much larger populations of systems. If before was the Gilded Age, now is the Industrial Age of system administration. Larger populations of systems and deployment models like cloud mean that we operate at a speed and efficiency that would have been impossible in the days of yore.
Today, I'd suggest that the most critical skills are the ones that let administrators work more efficiently and at a larger scale. Skills such as:
Standardization: Earlier, I talked about systems that served multiple purposes. Dedicating systems to a single purpose instead means you can administer them as a group. If one needs an update, they probably all need that update. If one is getting a new configuration setting, they probably all need that setting.
Automation: Automation goes hand in hand with standardization. If you discover that you need to apply an nginx update to all your web servers, you need a way to actually do that. Whether that means rolling your own tooling and scripts or using a framework like Ansible, you need an efficient, repeatable way to accomplish tasks across your systems (a minimal sketch of the roll-your-own approach follows this list).
Monitoring: With larger populations of systems to manage, you can't check them all by hand. Monitoring lets you identify problems earlier, and when combined with standardization and automation, it lets you catch and resolve what could have become a cascading failure. For example, if one of your web servers is low on disk space on one of its filesystems, many of your systems of that type are probably in a similar state (though maybe not yet over the alert threshold). You can use your automation tooling to fix the filesystem issue and apply that fix across the population to head off the same problem elsewhere (a per-host check of this kind is sketched after this list).
Reporting: With larger populations, you can't look at every system individually. You need to collect data from them about their configuration, installed packages, and other attributes. Again, combined with automation and standardization, this is a powerful tool: you can apply updates across the vast swaths of your population that need them, and, equally important, you know what is in that population. Recently, I was asked whether we used a specific vendor's software in our environment. Because I regularly collect data about what is deployed, I was able to report, with confidence, that we did not. I could also provide additional data, such as when systems last had maintenance and what that maintenance involved. If needed, I can supply a history of actions performed on systems from the population level down to individual boxes (a simple collection-and-query sketch follows this list).
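To make the automation point concrete, here is a minimal sketch of a roll-your-own fleet runner in Python. The inventory filename and the example command are hypothetical, and it assumes key-based, password-less SSH to each host; in practice an Ansible playbook covers the same ground with far less code.

```python
#!/usr/bin/env python3
"""Fleet-runner sketch: run one command on every host in an inventory."""
import subprocess
import sys

INVENTORY = "webservers.txt"  # hypothetical file: one hostname per line

def run_everywhere(command):
    """Run the same shell command on each host over SSH and report status."""
    with open(INVENTORY) as inventory:
        hosts = [line.strip() for line in inventory if line.strip()]
    for host in hosts:
        result = subprocess.run(["ssh", host, command],
                                capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else f"FAILED ({result.returncode})"
        print(f"{host}: {status}")

if __name__ == "__main__":
    # Example: ./run_everywhere.py "sudo dnf -y update nginx"
    run_everywhere(sys.argv[1])
```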
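On the monitoring side, the per-host check can be just as small. This sketch flags any watched filesystem above a threshold; the mount points and the 85% figure are assumptions, and a real deployment would feed an alerting system rather than print to stdout.

```python
#!/usr/bin/env python3
"""Per-host disk-space check sketch: warn before a filesystem fills up."""
import shutil

THRESHOLD = 0.85                    # hypothetical: alert above 85% used
MOUNTS = ["/", "/var", "/var/www"]  # hypothetical filesystems to watch

for mount in MOUNTS:
    try:
        usage = shutil.disk_usage(mount)
    except FileNotFoundError:
        continue  # this mount point doesn't exist on this host
    used_fraction = usage.used / usage.total
    if used_fraction > THRESHOLD:
        print(f"ALERT {mount}: {used_fraction:.0%} used")
```

Paired with the fleet runner above, the same check can be pushed to every web server to see how close the rest of the population is to the threshold.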
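And for reporting, data collection can be as simple as pulling package lists into one place and querying them. This sketch assumes RPM-based hosts, the same hypothetical inventory file, and a made-up vendor string to search for; dedicated inventory tooling does this better, but the principle is the same.

```python
#!/usr/bin/env python3
"""Reporting sketch: collect installed packages per host, then query them."""
import subprocess

INVENTORY = "webservers.txt"   # hypothetical host list, one hostname per line
VENDOR_HINT = "examplevendor"  # hypothetical string to look for in package names

def installed_packages(host):
    """Return the list of installed RPMs on a remote host via SSH."""
    result = subprocess.run(["ssh", host, "rpm", "-qa"],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

# Collect the data once...
report = {}
with open(INVENTORY) as inventory:
    for host in (line.strip() for line in inventory if line.strip()):
        report[host] = installed_packages(host)

# ...then answer questions like "do we run this vendor's software anywhere?"
for host, packages in report.items():
    hits = [p for p in packages if VENDOR_HINT in p.lower()]
    print(f"{host}: {', '.join(hits) if hits else 'none found'}")
```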
Wrapping up
As countless industries have moved from individual, small-scale practitioners to industrialized processes, so must system administrators adapt. As this article's title suggests: system administration of the Gilded Age is dead, long live system administration in the Industrial Age!