The accidental forkbomb: How a *nix script goes bad

When brilliant strategies go wrong, they can go really wrong, and sometimes your opponent has to come to your rescue.

Posted: November 13, 2020 by Jason Frisvold (Sudoer alumni)

One of the first industry jobs I had was at a small regional ISP. At the time, 56k modems were shiny and new. We had a couple dozen PoPs (points of presence) where we installed banks of modems and fed the data back to our main office via a series of full and fractional T1 lines.

We provided the typical slew of services—email, net news, and general internet access. Of course, to provide these services, we needed servers of our own. The solution was to set up a cluster of SCO Unix systems. Yes, *that* SCO. It's been quite a while, but a cluster setup like this is hard to forget. The servers were set up in such a way that they had dependencies on each other. If one failed, it didn't cause everything to come crashing down, but getting that one server back up generally required restarting everything.

The general setup was that the servers NFS-mounted each other on startup. This, of course, caused a race condition at boot. The engineers had written a detailed document that explained the steps required to bring the entire cluster up after a failure. The entire process usually took 30-45 minutes.

At the time, I was a lowly member of tech support, spending the majority of my time hand-holding new customers through the process of installing the software necessary to get online. I was relatively new to the world of Unix and high-speed networking, and I was soaking up as much knowledge as I could.

One of the folks I worked with, Brett, taught me a lot. He wrote the network monitoring system we used and split his time between that and keeping the network up and running. He was also a bit of a prankster at times.

At the end of one pretty typical day, I happened to be on the Unix cluster. Out of the blue, my connection failed, and I was booted back to my local OS. This was a bit weird, but it did happen occasionally, so I simply logged back in. Within a few seconds, I was booted out again.

I started doing a bit of debugging, trying to figure out what was going on. I don't remember everything that I did, but I do remember putting together some quick scripts to log in, check various processes, and try to figure out what was happening. At some point, I determined that I was being booted off the system by another user—Brett.

Once I figured out what was going on, I had to fight back. So I started playing around with shell scripting, trying to figure out how to identify the PID of his shell so I could boot him offline. This went back and forth for a bit, each of us escalating the attacks. We started using other services to regain access, launch attacks, etc.

Finally, I launched what I thought would be the ULTIMATE attack. I wrote a small shell script that searched for his login, identified the shell, and subsequently killed his access. Pretty simple, but I added the ultimate twist. After the script ran, it ran a copy of itself. BOOM. No way he can get back on now.

And it worked! Brett lost his access and simply could not gain a foothold over the next five minutes or so. And, of course, I had backgrounded the task so I could interact with the console and verify that he was beaten. I had won. I had proven that I could beat the experienced engineer, and damn, I felt good about it.

Until...

ksh: fork: Resource temporarily unavailable

The beginning of the end

I had never seen such an error before. What was this? Why was the system doing this? And why was it streaming all over my console, making it impossible for me to do anything?

It took a few moments, but Brett noticed the problem as well. He came out to see what had happened. I explained my brilliant strategy, and he simply sighed, smiled, and told me I'd have to handle rebooting and re-syncing the servers. And then he took the time to explain to me what I had done wrong. That was the day I learned about "exec" and how important it is.
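
To make the "exec" lesson concrete, here is a minimal, bounded sketch. It is not the original script, and respawn.sh, MODE, and LEFT are hypothetical names invented for the demo. The helper script re-runs itself a fixed number of times, either as a plain child process or through exec, and the outer script counts how many distinct PIDs each approach uses.

```shell
#!/bin/sh
# Bounded sketch: a script that re-runs itself, with and without exec.
# respawn.sh, MODE, and LEFT are hypothetical names made up for this demo.

demo_dir=$(mktemp -d)
script="$demo_dir/respawn.sh"

cat > "$script" <<'EOF'
echo "$$"                       # report the PID of this generation
[ "$LEFT" -gt 0 ] || exit 0     # hard stop so the chain cannot run away
LEFT=$((LEFT - 1)); export LEFT
if [ "$MODE" = exec ]; then
    exec sh "$0"                # replace this process; the PID is reused
else
    sh "$0"                     # start a child and wait; this process stays alive
fi
EOF

# Count the distinct PIDs used by six generations in each mode.
plain_count=$(MODE=plain LEFT=5 sh "$script" | sort -u | wc -l | tr -d ' ')
exec_count=$(MODE=exec LEFT=5 sh "$script" | sort -u | wc -l | tr -d ' ')

echo "without exec: $plain_count processes"
echo "with exec:    $exec_count process"

rm -rf "$demo_dir"
```

Without exec, every generation waits for the copy it started, so the processes pile up one per generation. With exec, each copy replaces the one before it and the count stays at one. Take away the LEFT counter and the non-exec version keeps growing until the process limit is reached and fork starts failing with "Resource temporarily unavailable".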

Unfortunately, Brett passed away about a decade or so after this. He was a great friend, a great mentor, and I miss him.

Topics: Linux Scripting Troubleshooting

Jason Frisvold

Jason is a 25+ year veteran of Network and Systems Engineering. He spent the first 20 years of his career slaying the fabled lag beast and ensuring the passage of the all-important bits. For the past 5 years he has transitioned into the DevOps world, doing the same thing he used to, but now wit…
