In this final entry of my three-part mini-series on Linux housekeeping, I cover virtual machine storage sprawl. At first glance, you might think that virtual machine storage sprawl is the same problem as virtual machine sprawl, but it isn't. The two are related, but there are key differences. Virtual machine (VM) sprawl is the problem of creating virtual machines and then never decommissioning them. They're created for whatever reason and then left to linger for an indeterminate amount of time until an angry sysadmin decides to burn everyone's eyes with a white-hot email describing the issue in painful detail. VM storage sprawl is the copying, storing, and ignoring of VM disk images on shared spaces such as public drives or random network shares. That makes it a housekeeping problem and, specifically, a Linux sysadmin problem.
The problem clearly defined
To the casual user, there's no problem. They've copied a 75GB virtual machine image to a shared space on the network for "safekeeping," and now it's your problem. The bigger problem is that multiple users have copied VM images ranging from 10GB to 250GB in size to the "network." Together, that's 1TB or more of VM images that you now have to manage.
But that's what shared storage space is for, right? The problem isn't the use of the space for legitimate data storage; the problem is that every one of those images is now part of your backups of those shared spaces, which will take hours longer to complete.
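Before changing anything, it helps to know how much space the images actually occupy. The following is a minimal sketch, not a finished tool: it walks a share and totals VM image sizes per top-level directory. The function name `vm_image_usage` and the extension list are assumptions made for illustration; adjust the extensions to match the hypervisors in use.

```python
import os
from collections import defaultdict
from pathlib import Path

# Assumed set of common VM disk image extensions; adjust for your hypervisors.
VM_EXTENSIONS = {".vmdk", ".vdi", ".qcow2", ".vhd", ".vhdx", ".ova"}


def vm_image_usage(share: Path) -> dict[str, int]:
    """Return bytes of VM images found under each top-level entry of share."""
    usage: dict[str, int] = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(share):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() not in VM_EXTENSIONS:
                continue
            try:
                size = path.stat().st_size
            except OSError:
                # Broken symlink or file removed mid-scan; skip it.
                continue
            # Attribute the file to the first path component below the share.
            top_level = path.relative_to(share).parts[0]
            usage[top_level] += size
    return dict(usage)
```

Dividing each total by `1024**3` gives a per-directory figure in GiB that can be sorted to show who is storing the most.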
One potential solution
System administrators are problem solvers by nature. When presented with a problem, we seek out a solution that we think will work for everyone. However, we also know from experience that the best technical solution doesn't always satisfy personal desires. My potential solution to the problem of users copying their gigantic VM images to shared spaces is to exclude these files from backups. We can set our backup job to ignore files over 5GB in size, which will exclude all but the smallest VM images.
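Most backup tools have their own option for a size limit, so check the documentation for the one in use. As a tool-neutral sketch of the idea, the code below splits the files under a directory into those at or below a threshold and those above it. The name `split_by_size` is hypothetical, and reading "5GB" as 5 GiB is an assumption.

```python
import os
from pathlib import Path

# Assumed interpretation of the 5GB limit as 5 GiB.
MAX_BYTES = 5 * 1024**3


def split_by_size(root: Path, max_bytes: int = MAX_BYTES):
    """Return (to_back_up, skipped) lists of files found under root."""
    to_back_up, skipped = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                # Unreadable entry (for example, a broken symlink); skip it.
                continue
            if size > max_bytes:
                skipped.append(path)
            else:
                to_back_up.append(path)
    return sorted(to_back_up), sorted(skipped)
```

The `skipped` list is worth logging on every run, so that nobody is surprised later about which files were left out.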
This solution seems to work logically and practically. However, once it's public knowledge that VM images are not part of the standard nightly backup plan, chaos ensues. Users demand that their idle VM images be included in standard backup activities because a loss would be devastating, which is why they copied them to a network share in the first place. There is another way.
An alternative solution
With a little extra cost in time and money, you can set up a separate backup storage location for virtual machine images, which won't affect your standard backups in time or in space used. Here's how I would approach the problem: I would create a new subdirectory named VM_Images in the shared space. Users can then drop their giant files into this shared space specifically designated for that use. Each user will create a user directory for themselves and then place their files within it.
"How does this solve the problem?" you ask.
Excluding the VM_Images directory from regular backups solves the problem of your regular backups not completing on time. Set up a separate backup job to copy the VM_Images directory and all subdirectories and files to a different location, separate from your standard backups.
Problem solved.
The VM images are being backed up as desired, and standard network share backups are not affected.
This should be an adequate solution that satisfies everyone's needs.
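The two-job split described above can be sketched as follows. This is an illustration under stated assumptions, not a production backup: `shutil.copytree` stands in for whatever backup tool is really in use, and the function names and destination paths are hypothetical. Only the top-level VM_Images directory is excluded from the regular job.

```python
import shutil
from pathlib import Path

VM_DIR_NAME = "VM_Images"  # directory name taken from the article


def backup_regular(share: Path, dest: Path) -> None:
    """Copy the share to dest, leaving out the top-level VM_Images directory."""
    share = share.resolve()

    def skip_vm_dir(current_dir, names):
        # Ignore VM_Images only when visiting the root of the share.
        if Path(current_dir).resolve() == share and VM_DIR_NAME in names:
            return [VM_DIR_NAME]
        return []

    shutil.copytree(share, dest, ignore=skip_vm_dir, dirs_exist_ok=True)


def backup_vm_images(share: Path, dest: Path) -> None:
    """Copy only the VM_Images tree to a separate destination."""
    source = share / VM_DIR_NAME
    if source.is_dir():
        shutil.copytree(source, dest, dirs_exist_ok=True)


# Example with hypothetical paths:
# backup_regular(Path("/srv/share"), Path("/backup/nightly/share"))
# backup_vm_images(Path("/srv/share"), Path("/backup/vm/VM_Images"))
```

Because the jobs are independent, the VM image copy can run on its own schedule, for example weekly, without delaying the nightly run.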
Going one step further
In the past, to solve a similar issue, I also included ISO images and virtual machine-related files such as VirtualBox installers in the alternate backup location. All of these files add up. Counting ISOs, installers, and VM images, I had one client that had collected more than 1.5TB of these types of files. Once I separated the backups, life for everyone was much better. Backups took a reasonable amount of time. VM images and associated files were backed up to an alternate location, and superfluous files such as ISOs and installers were now separated from company data.
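Finding those extra files by hand is tedious. As a rough sketch, a file name can be sorted into a category so that ISOs and installers are routed to the alternate location along with VM images. The function name `categorize`, the extension sets, and the rule that an installer is a file whose name starts with "virtualbox" are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumed extension lists; extend them for the tools your users rely on.
VM_IMAGE_EXTENSIONS = {".vmdk", ".vdi", ".qcow2", ".vhd", ".vhdx", ".ova"}
INSTALLER_EXTENSIONS = {".exe", ".run", ".dmg", ".rpm", ".deb"}


def categorize(path) -> Optional[str]:
    """Return 'vm_image', 'iso', 'installer', or None for ordinary files."""
    name = Path(path).name.lower()
    suffix = Path(name).suffix
    if suffix in VM_IMAGE_EXTENSIONS:
        return "vm_image"
    if suffix == ".iso":
        return "iso"
    # Hypothetical rule: treat VirtualBox-named packages as installers.
    if name.startswith("virtualbox") and suffix in INSTALLER_EXTENSIONS:
        return "installer"
    return None
```

Anything that returns a category is a candidate for the separate backup job; everything else stays with company data.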
Wrapping up
As a system administrator with a long history of making users angry, I can tell you honestly that there's almost no solution that satisfies everyone. The best you can do is adhere to company policy, security policy, and best practices when approaching any problem that you face. The only problem with that philosophy is that someone will petition for an exception and get it. I have made too many enemies of staff and management alike by trying to quote regulations, best practices, and security policies.
Perhaps a better approach is to enlist a member of each team to resolve such issues. If your organization is large enough, include representatives from storage, network, and security to help settle disputes and to do what makes the most sense collectively for users, management, and IT.
About the author
Ken has used Red Hat Linux since 1996 and has written ebooks, whitepapers, actual books, thousands of exam review questions, and hundreds of articles on open source and other topics. Ken also has 20+ years of experience as an enterprise sysadmin with Unix, Linux, Windows, and Virtualization.
Follow him on Twitter: @kenhess for a continuous feed of Sysadmin topics, film, and random rants.
In the evening after Ken replaces his red hat with his foil hat, he writes and makes films with varying degrees of success and acceptance. He is an award-winning filmmaker who constantly tries to convince everyone of his Renaissance Man status, also with varying degrees of success and acceptance.