Curiosity solves problems
There is an old (and, I think, incredibly stupid) saying that "curiosity killed the cat." I heard this plenty as a kid, though fortunately not from my parents. I personally think this dumb saying is used mostly to stifle kids and adults when their inquisitiveness takes them to places that some parents, teachers, managers, and caregivers would rather not deal with. This is one of the ways in which boxes are built around us early on.
Another terrible saying along the same lines is, "You can't teach an old dog new tricks." This one is usually used as an excuse by or for people who don't like to learn new things. Sometimes others construct that box around us, and sometimes we construct it around ourselves.
Author's note: Portions of this article are based on sections of my book, "The Linux Philosophy for SysAdmins."
Let's take a trip in my Wayback machine to 1970 in Toledo, Ohio. I was working at a chemical plant in a very boring job as a tester, along with seven or eight other people. We would take chemical formulations dreamed up by our chemists, compound them into vinyl, and press them into various types of fabrics used in the automotive industry for seats and vinyl roofs. Our jobs were to test the resulting raw vinyl and coated fabrics to see if they met all of the specifications supplied by the auto company that had ordered them.
Never seen a vinyl roof on a car? Yeah, this was that long ago!
One of my co-workers, Charlie, was a negative sort of guy. His complaining was incessant. He would complain about the working conditions—we had lots of volatile chemicals around, and it was pretty easy to get high and stay there if that is what you wanted, but that was also dangerous. He complained about the danger and about how boring the job was, but we all did some of that—it was part of being in that type of job, which was, in fact, boring and dangerous. But Charlie complained about everything from the moment he walked in until the ending whistle blew in the afternoon.
One day we had a conversation that went something like this.
At about 8:30 that morning, Charlie said, "I hate this job. Quitting time can't get here fast enough."
I was getting pretty fed up with his negativity, so I said, "Charlie, if you hate this job so much, why don't you find another job?"
"I don't know how to do anything else."
So I said, "Well, why don't you learn something new? I'll be going back to university next term, and I'm not going to stay here for long after I get my degree. I plan to get a better job."
"That's easy for you—you're young. I'm old, and you can't teach an old dog new tricks."
I asked him, "How old are you, Charlie?"
"Thirty-six," he said.
Even then, in my early twenties and seemingly invulnerable and immortal in my own mind, I knew that thirty-six was not old. Charlie had built this tiny box around himself, most certainly based on things that those around him were saying, even if not directly to him.
Right then, I vowed to myself that I would never stop learning—that I would learn something new every day. And I have kept that vow. Of course, that has been pretty easy, what with both my vocation and avocation being computers for most of the last 50 years—there is plenty about which to be curious.
My personal motto is that "curiosity solves problems." Following our curiosity leads us to places that are outside the box, places that enable us to learn new things, places that allow us to solve our problems in ways that we could not otherwise. Sometimes curiosity can lead me directly to the cause of a problem, and other times, the connection is indirect.
I recently wrote about one such case in another article here at Enable Sysadmin. I won't repeat all the details, but here are the highlights:
I discovered that one of the hard drives on my primary workstation was quite hot. My initial reaction was to replace that hard drive, but I was curious, so I first attempted to mitigate the problem by moving one of the more heavily used filesystems to another hard drive. I also checked the airflow around the drive, and there was essentially none, but there was a place to install a fan that would force cooling air over the drive; over all of the drives, in fact. That significantly reduced the temperature of the drive and gave me time to pursue the problem further with less risk of damage to the hard drive.
But I was still curious; I wanted to know why the drive was overheating in the first place. So, I used glances and htop to determine that some software, which was part of the GUI desktop I was using at the time, was generating constant I/O on the /home filesystem. I killed that process and ensured it wouldn't start again, which eliminated the root cause of the problem.
I could have been content to install the fan and let it be because the temperature was down to a reasonable level. But my curiosity led me to locate the root cause of the problem for a more reliable fix.
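The kind of question that glances and htop answered in this story (which process keeps the disk busy?) can also be asked directly of the kernel. What follows is only a minimal sketch, not the method I used at the time. It assumes a Linux system that exposes per-process counters in /proc/<pid>/io, and the function names parse_proc_io, snapshot, and busiest are hypothetical names chosen for this illustration. It reports which processes moved the most bytes to and from storage, not which filesystem they touched.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def snapshot(proc_root="/proc"):
    """Return {pid: (name, read_bytes + write_bytes)} for every readable process."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # Only numeric directory names are process IDs.
        try:
            with open(os.path.join(proc_root, entry, "io")) as f:
                io = parse_proc_io(f.read())
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
        except OSError:
            # The process exited, or we lack permission to read its counters.
            continue
        total = io.get("read_bytes", 0) + io.get("write_bytes", 0)
        result[int(entry)] = (name, total)
    return result


def busiest(before, after, top=5):
    """Rank processes by storage bytes transferred between two snapshots."""
    deltas = []
    for pid, (name, total) in after.items():
        if pid in before:
            deltas.append((total - before[pid][1], pid, name))
    deltas.sort(reverse=True)  # Largest transfer first.
    return deltas[:top]
```

To use it, take one snapshot, wait a few seconds, take a second snapshot, and pass both to busiest. Run it as root to see the counters of processes owned by other users. A process that sits at the top of the list on every run, even when the desktop is idle, is a good candidate for closer inspection.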
Sitting down on the job
In another instance, I fixed an IBM System/3 computer by sitting on it. In 1976, I was an IBM Customer Engineer (CE) in Lima, Ohio. Two of us were installing the IBM System/3, which was smaller than an IBM mainframe such as a 360 or 370 but still large enough to need a room of its own, high-voltage power, and significant air cooling.
We had assembled the main CPU and had started to attach the IBM 1403 line printer controller, which was contained in a slightly lower than desktop-height unit to the left of the CPU. That nice large work surface was just the right height to sit on.
We had just bolted the printer controller to the frame of the CPU and were doing one of the many, many checks built into the installation instructions. We connected the leads of an ohmmeter between the frame of the CPU and a specific terminal on the power supply of the printer controller. The result was supposed to be an open circuit; that is, infinite resistance, which would indicate that the hot leads of the power supply were not shorted to the frame. In this case, the meter showed a short (zero resistance), which was bad.
There would not have been a spectacular display of noise and fireworks like you see on TV, but it would have been a problem, as it would have prevented the computer from powering up. Best to catch this while the machine was still being assembled.
After an hour of searching, we still had not found the problem. We called the support center for the System/3 in Boca Raton, Florida, and Vern guided us through several more unsuccessful problem determination steps.
A bit frustrated, I sat on the printer control unit. Out of the corner of my eye, I saw the needle on the ohmmeter swing very briefly to indicate an open circuit and then settle back into the short circuit indication. That little flicker piqued my curiosity, and it gave us a place to start looking.
I mentioned this to the other CE and to Vern in Boca Raton (who would later be one of my own mentors when I went down there for a few years as a Course Development Representative). We removed the top, where I had perched, from the printer control unit, and, with a bit of luck, found that one of the bolts holding the top to the frame of the printer controller had come loose and fallen into the power supply and caused the short. When I sat on the top of the controller, the frame moved just enough to cause the bolt to no longer make the contact required to produce the short. Removing that loose bolt from the power supply fixed the problem.
Vern, who was responsible for System/3 support at that time, made some changes to the instructions to cover this problem in case it happened again. He also worked with the manufacturing people to ensure that it did not happen again, putting in place a requirement to check that the bolt was properly tightened during the build process.
The thing to remember is to really observe what is going on in all parts of the system and to be curious about anything that seems unusual. Pay attention to everything and don't ignore the slightest event. Sometimes watching top, glances, or one of the other utilities used to monitor the internal functioning of the kernel or the network can provide a momentary glimpse of something that gets us started in the right direction.
And sometimes it takes just a bit of luck, like sitting on the printer control unit.
Failure is an option
I have not failed. I've just found 10,000 ways that won't work.
—Thomas A. Edison
Although thousands of specific combinations of individual materials and fabrication technologies did not lead to a viable light bulb during testing, Edison continued to experiment. In the same way, failing to resolve a problem or to create code that performs its defined task does not mean that the project or overall goal will fail. It means only that the specific tool or approach did not result in a successful outcome.
I have learned much more through my failures than I have in almost any other manner. I am especially glad for those failures that have been self-inflicted. Not only did I have to correct the problems I caused myself, but I also still had to find and fix the original problem. This always led to a great deal of research that caused me to learn much more than if I had solved the original problem quickly.
This is just my nature, and I think it is the nature of all good sysadmins to exercise our curiosity and to look upon these situations as learning opportunities. I spent many years as a trainer, and some of the most fun experiences were when demonstrations, experiments, and lab projects would fail while I was teaching. Those were fantastic learning experiences for me as well as for the students in my class. Sometimes I even incorporated those accidental failures into later classes because they enabled me to teach something important.
Build it, and you will learn
Everyone learns in their own way. As a trainer, I saw this every time I taught a class, regardless of the subject. What we share, however, is curiosity: we all have that spark that inspires us to seek and discover more. Our methods may not be the same, but they will lead us all to greater knowledge and skill.
I have always preferred building my own computers rather than purchasing them off the shelf. One reason is that I get exactly the hardware that I want without paying for an operating system I don't want and will never use. I do this primarily because I am always curious about hardware. It also takes me back to my days of fixing hardware for IBM, which I found to be fun. As I write this, I am awaiting the arrival of a motherboard and processor with which I'll rebuild one of my older computers. Geeks just want to have fun!
I started with Linux in about 1996 by installing it on all of my computers at home. This forced me to learn Linux and not look back. I had several computers and created a complete internal network in my home office. Over the years, my network has grown and changed, and I have learned more with every alteration. Much of this was driven by my curiosity rather than any specific need.
I have static IP addresses from my ISP and two firewalls to provide outside access and protect my internal network. One of these firewalls is a Raspberry Pi with CentOS on it. I have had Intel boxes with Fedora and CentOS on them over the years. I learned a lot about using both in roles as a firewall and a router. I have a server that runs DHCP, HTTP, SMTP, IMAP, NTP, DNS, and other services to provide them to my internal network and to make some of those services available to the outside world, such as my website and incoming email. I have learned a great deal about using Linux in a server role in general, and an incredible amount about implementing and managing each of these services.
I have several desktop workstations and a laptop connected to my wired network. The laptop, my Kindle, and my mobile phone can also connect using one of my wireless routers. I don't use the wireless provided by my ISP, both because of the monthly cost and because it would not give me the opportunity to learn about configuring wireless routers. Learning how to set up my email server to best work with these devices while doing my best to provide those services in a secure manner has been challenging and fun.
To me, curiosity is the driving force behind learning. I have been curious about computers and especially programming and operating systems since I encountered my first computer. I am still curious about how things work—my current series of articles for Opensource.com resulted from my own curiosity about systemd and why people have such strong opinions about it.
By using my home network for indulging my curiosity, I have lots of safe space in which to fail catastrophically and to learn the best ways to recover from that. And there are lots of ways to fail, so I am learning a lot. I learn when I accidentally break things, but I also learn a great deal when I intentionally bork things. In these instances, I know what I want to learn and can target the breakage in ways that enable me to learn about those specific things.
When working on real-world problems, I continue to ask "why" until I find the root cause of the problem or the authoritative answer to whatever question(s) I have.
Be the curious sysadmin. It works for me.