No one wants their systems to be breached. No one wants their data stolen. And no one wants to recover a breached system. Neither discovery nor recovery is a fun activity. In fact, I'd have to say that of all the horrible tasks facing you as a sysadmin, recovering a system from a malicious attack is the worst. In this article, I give you 12 steps to system recovery, some post-mortem tips, and a last-resort option for when you find that one or more of your systems has been compromised or breached.
12 steps to system recovery
- Disconnect the system from the network
- Make an offline copy of the disk(s)
- Take a snapshot of running processes
- Run antimalware checks (See: 3 antimalware solutions for Linux systems)
- Remove unauthorized user accounts
- Force password changes for all users
- Change the root password
- Secure SSH (See: Locking down sshd)
- Restore the system from an offline backup
- Set up and enforce host-based firewalls
- Limit access via iptables and /etc/hosts.deny and /etc/hosts.allow (See: Give your Linux system's firewall a security boost)
- Test the system in a protected network
Take every precaution and implement every best practice that you can. If I were to offer up a 13th step, I'd say implement multi-factor authentication to further block future attempts on this and your other systems.
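As one illustration of the "Remove unauthorized user accounts" step in the list above, here is a minimal Python sketch that flags accounts worth a closer look. It assumes the standard seven-field /etc/passwd format and a list of account names you already know to be legitimate. The names `parse_passwd`, `suspicious_accounts`, and `audit_passwd_file` are hypothetical helpers made up for this example, not part of any existing tool, and the sketch only reports candidates. It does not remove anything.

```python
from typing import Iterable, List, NamedTuple


class Account(NamedTuple):
    name: str
    uid: int
    shell: str


def parse_passwd(text: str) -> List[Account]:
    """Parse passwd-format text (name:pw:uid:gid:gecos:home:shell)."""
    accounts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip malformed entries rather than guessing
        try:
            uid = int(fields[2])
        except ValueError:
            continue
        accounts.append(Account(fields[0], uid, fields[6]))
    return accounts


def suspicious_accounts(accounts: Iterable[Account],
                        known_good: Iterable[str]) -> List[Account]:
    """Flag accounts that are not on the allowlist, plus any UID 0
    account that is not named root (even if it is on the allowlist)."""
    known = set(known_good)
    flagged = []
    for acct in accounts:
        extra_root = acct.uid == 0 and acct.name != "root"
        unknown = acct.name not in known
        if extra_root or unknown:
            flagged.append(acct)
    return flagged


def audit_passwd_file(known_good: Iterable[str],
                      path: str = "/etc/passwd") -> List[Account]:
    """Convenience wrapper: read a passwd file and return flagged accounts."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return suspicious_accounts(parse_passwd(fh.read()), known_good)
```

Calling `audit_passwd_file(["root", "daemon", "alice"])` with your own allowlist returns the entries to review by hand. Building the allowlist from a known-clean backup or your configuration management data, rather than from the compromised host itself, is the safer assumption.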
Five post-mortem tips
Post-mortem is one of the dirty sysadmin terms because we know that it means meetings, calls, reports, and blame. Lots of blame. But post-mortem is also a chance to explore how to mitigate problems, such as getting breached. The question that always comes up in the first post-mortem call is, "How could this have been prevented?" How you answer it will affect your future as a trusted sysadmin. Trust me on that. The following is a list of tips to help you in post-mortem discussions.
- Stay calm and professional
- Present the facts, not opinion or speculation
- Take notes
- Listen with interest to all parties
- Don't chide or blame coworkers, vendors, or hosting companies
Take the post-mortem as an opportunity to grow and learn. I know it's hard. You really want to go off on a rant about third parties, poor choices, and how you "told them this was going to happen." You have to resist and stick to the above points. The arc of your career is far more important than your ability to toss some blame over the fence to someone else or to have a momentary gloat about how you "knew this was going to happen."
The last resort
If you can't satisfactorily clean up your damaged systems, then your last resort is to reimage the system from scratch. Wipe the system and reinstall it from media. Restore users and data from a clean backup and then test, test, and test to be sure the system is hardened.
Recovering from a breach is time-consuming. Even if you can spin up a new virtual machine in five minutes to replace the compromised one, you will still spend hours and hours performing forensics on the breached system. Do not allow the compromised VM to connect to your production network again. Either keep it offline or only allow limited host-only connectivity.
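Because forensics on the offline disk copy can stretch over many hours, it can help to record a checksum of the image when you make it and recheck it before each analysis session, so you know you are still working on an unmodified copy. The following Python sketch shows one way to do that with the standard library's `hashlib`. The function names and the idea of storing the digest alongside your notes are assumptions for illustration, not a prescribed procedure.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_unchanged(path: str, recorded_digest: str) -> bool:
    """Compare the file's current digest with one recorded earlier."""
    return sha256_of_file(path) == recorded_digest.strip().lower()
```

You would call `sha256_of_file()` once on the fresh copy, keep the result somewhere off the compromised host, and call `image_unchanged()` with that value later. The image path you pass in is whatever you named the copy.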
Remember to stay calm and realize that there are groups of attackers out there that constantly probe for vulnerabilities. Your systems likely weren't specifically targeted but were part of a sweep that happened to return a positive result. One of the most valuable lessons I ever learned is that even if you did tell everyone that there were vulnerabilities or problems with a system or setup, pointing out those warnings after the fact makes you look as bad as the attacker. And no matter how good you are, or think you are, no one, and I mean no one, wants to hear it.