It is always an exciting day for systems administrators when we get to decommission a system. It means one less resource to maintain and potentially a successful upgrade somewhere else in the environment.
In our rational exuberance, however, we can't jump in and send the resource to the great data center in the sky. Too often, an "unused" system actually serves a heretofore undocumented business function, holds critical information that folks will need down the line, or provides cross-network connectivity that nobody on staff remembers configuring.
Therefore, having a documented resource decommissioning plan is vital to making sure this process goes off without a hitch. I'll go through some of the steps to ensure a smooth decommissioning process.
Verify the resource's function
Once you get a decommissioning request from stakeholders, verify that the resource actually is unused. Perform an independent cross-check by looking at access logs, deployment directories, timestamps, and network logs. Ask around to see if folks have an infrequent but vital access pattern that might not show up in access logs, such as using backup files to generate a report.
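One way to cross-check timestamps is to scan the resource's deployment directories for files that changed recently. The following is a minimal sketch, not a complete audit: the function name `find_recent_files`, the 90-day threshold, and the `/var/www` path are all assumptions chosen for illustration, and modification times alone will not reveal read-only access patterns.

```python
import os
import time
from pathlib import Path


def find_recent_files(root, max_age_days, now=None):
    """Return (path, age_in_days) for files under root modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.stat().st_mtime
            except OSError:
                # File vanished or is unreadable; skip it rather than abort the scan.
                continue
            if mtime >= cutoff:
                recent.append((str(path), (now - mtime) / 86400))
    # Most recently modified files first.
    recent.sort(key=lambda item: item[1])
    return recent


if __name__ == "__main__":
    # /var/www is a placeholder; substitute the directories your resource serves.
    for path, age in find_recent_files("/var/www", max_age_days=90):
        print(f"{age:6.1f} days ago  {path}")
```

An empty result supports the claim that the resource is unused, but it does not prove it, so it is still worth asking around.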
Make a backup and rollback plan
Before scheduling your decommissioning window, document the current status of the resource. As an intermediate step, can you disable the services running on the system instead of powering it off entirely? Is it easy to take a long-term backup of the resource for later spin-up, if needed?
For a resource like a database, taking a database dump for offline storage and retrieval is fairly straightforward, whereas this may be more difficult for hardware appliances. A documented backup and recovery plan is vital if the resource turns out to be less unused than you thought. Make sure to test your backups as well! You do not want to find out in six months that your backups failed after you have already wiped the drives.
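As a first sanity check on a backup, you can confirm that the stored copy still matches a checksum recorded when the backup was taken. The sketch below assumes you saved a SHA-256 digest alongside the backup file; the function names `sha256_of` and `verify_backup` are hypothetical. A matching checksum only shows the file is intact, so a real restore test is still needed.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large dumps do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path, expected_sha256):
    """Return True if the file at path matches the recorded digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

Running `verify_backup` against the offline copy before wiping any drives catches silent corruption or a truncated transfer early.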
Once this due diligence is complete, schedule a decommissioning date and time. Even though the resource is supposedly unused, select a window that would have minimal impact because it's best to assume the system may still be in use. Notify the appropriate people at your organization that the resource will be shut down. Use multiple channels, including email, instant messaging, and calendar notifications. Send the notifications more than once so that it is difficult for people to miss them.
Decommission the resource
On decommissioning day, I like to have a shared document that explicitly states, in order, every step that will be taken and the individual or team responsible for each step. I often sit on a videoconference with the folks performing the work, so we can work through the document together. This process ensures that no steps are missed and makes it easy to communicate progress to everyone involved.
Depending on the criticality of the resource, I might stop the service that is running, like Nginx, and let that sit for some period of time. If no issue reports surface, I move forward with powering off the host and, depending on whether this is a virtual or physical resource, fully deleting the instance or unracking the server.
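The "stop the service, wait, then power off" gate can be written down as an explicit check so nobody proceeds early. This is a minimal sketch under assumptions of my own: the function name `ready_to_power_off`, the idea of a fixed soak period, and representing issue reports as a plain list are all illustrative, not a prescribed process.

```python
from datetime import datetime, timedelta


def ready_to_power_off(stopped_at, soak_period, issue_reports, now=None):
    """Return True once the soak period has elapsed with no issue reports."""
    now = datetime.now() if now is None else now
    soak_complete = now - stopped_at >= soak_period
    # Any report at all blocks the power-off until someone investigates it.
    return soak_complete and len(issue_reports) == 0


if __name__ == "__main__":
    stopped = datetime(2024, 1, 1, 9, 0)
    check_time = datetime(2024, 1, 9, 9, 0)
    # A 7-day soak is an arbitrary example; pick a period that covers
    # infrequent access patterns such as monthly reports.
    print(ready_to_power_off(stopped, timedelta(days=7), [], now=check_time))
```

The length of the soak period should reflect how critical the resource is and how infrequent its possible users are.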
Through smart collaboration with your stakeholders and the business, you can ensure that your decommissioning process goes off without a hitch. Have a plan, overcommunicate, and make more backups than you think you need.