If you work in the world of automation and DevOps, you have probably come across the term "Infrastructure as Code" (IaC). Per Wikipedia:
"Infrastructure as code (IaC) is the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools."
For the past year, I've been managing infrastructure: Red Hat Virtualization (RHV) and proprietary virtualization technology environments. Early on, I automated some tasks, but most of my work was manual. I attempted to automate the deployment and configuration of vCenter and ESXi. It was not as big a hit as I wanted, and it certainly was not a pipeline of any sort. After that, I automated a few other smaller infrastructure tasks.
However, over the past three or four months, I've started investing more time in automating entire deployments. That's because I was asked to work on automating the deployment of the RHV environment we use for testing. Our infrastructure contains Red Hat OpenStack, RHV, and some VMware environments; we manage about 20TB of storage and 23 bare-metal hosts; and I manage more than 50 virtual machines.
I'm also a Red Hat Certified Specialist in Ansible Automation. So when I was asked to take on this automation project, I decided to use some of the Ansible roles already written by the oVirt team. To make them work, I had to add some custom playbooks and shell scripts, and putting all of that together as a Jenkins pipeline made it work seamlessly.
After working on this project, I can say I know what it means to have your infrastructure as code. Today, two of our four RHV environments are documented in code and maintained in a version-controlled Git repository.
Through my experience, I have learned the IaC approach has a lot of pros and some cons that you would be wise to take into account.
Pros of IaC
- I am relatively relaxed because I know that if something goes wrong, I can run my Jenkins pipeline to re-deploy my env with one click.
  - Before automation, if something went wrong, I had to meticulously redeploy and configure my entire environment. It was a challenge to bring back the identical environment. That's because anytime there is a change in the environment, the systems that interact with the environment also need updates. One such example is making sure the data center, cluster, network, and data store names are exactly the same after re-deployment.
- I can share this code with other teams or people, and they can utilize it to set up their env.
  - Most of the code I write has a common base that takes a list of input variables, so you can customize the deployment for your needs. Hence, it is shareable code that others can re-use.
- My env will be identical every time I run the pipeline.
  - Because my pipeline uses a prescribed set of parameters for the deployment, every run produces exactly the same environment: the same number of hosts, networks, data centers, clusters, and data stores, all with identical names.
- My pipeline is scalable.
  - My code is generalized so that passing a configuration file with the right parameters lets me deploy the env to one, two, or 100 servers, because some of the steps loop over parameter lists that can hold any number of entries.
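To make the last two points concrete, here is a minimal Python sketch of a parameter-driven deployment. It is not the pipeline described in this article (which is built from Ansible roles, playbooks, shell scripts, and Jenkins); the `EnvConfig` class, the `build_plan` function, and every name passed to them are hypothetical and exist only for illustration. The sketch shows how one set of input variables can yield the same ordered steps on every run, and how a loop over the host list lets the same code cover one host or 100.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EnvConfig:
    """Hypothetical input variables describing one environment."""
    data_center: str
    cluster: str
    network: str
    data_store: str
    hosts: Tuple[str, ...] = ()


def build_plan(cfg: EnvConfig) -> List[str]:
    """Turn a config into an ordered list of deployment steps.

    The plan depends only on the config, so the same input always
    produces the same names in the same order.
    """
    plan = [
        f"create data center {cfg.data_center}",
        f"create cluster {cfg.cluster} in {cfg.data_center}",
        f"create network {cfg.network} in {cfg.data_center}",
        f"attach data store {cfg.data_store} to {cfg.data_center}",
    ]
    # The host step runs in a loop, so the list can hold any number of hosts.
    for host in cfg.hosts:
        plan.append(f"add host {host} to cluster {cfg.cluster}")
    return plan


if __name__ == "__main__":
    # Illustrative values only; a real run would load these from a config file.
    cfg = EnvConfig(
        data_center="dc1",
        cluster="cl1",
        network="net1",
        data_store="data1",
        hosts=("host1", "host2"),
    )
    for step in build_plan(cfg):
        print(step)
```

Sharing the code with another team then amounts to handing over the same logic with a different set of input variables, and scaling up means lengthening the `hosts` list rather than changing the code.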
Cons of IaC
- Sometimes it's hard to maintain all of this code. As software versions change, the code may need to be updated. This is definitely an overhead.
  - I already have about 8,500 lines of code in my RHV deployment pipeline repo, and there are still a few things on my to-do list that I need to update; some are features, some are technical debt.
- When execution fails somewhere, it may not be easy to restart from the exact point of failure, and re-executing from scratch may take a long time.
- If you're using code someone else has written, you may need to spend a lot of time understanding it, which can be difficult.
  - For example, one of the Ansible roles I was using did not have some of my required features. I spent a lot of time first understanding, then debugging, then adding new features, and then getting it merged into the repo by the maintainers (where we had more cycles of review–feedback–update–repeat).
IaC has changed my career for the better in many ways, but there are some moments when it's challenging to handle as a team of one.
What do you think? What are your best IaC practices or pain points?