Immutable Infrastructure

Introduction

What is normal behavior when a server goes haywire? SSH into it and see what’s going on, right? How do you normally update a server? SSH into it and (on Ubuntu) run sudo apt-get upgrade, right? How do you normally deploy your applications? SSH into your server and run git pull or docker pull, right? That is how a traditional mutable infrastructure behaves. Mutable means these servers change after their original deployment.


What if I told you there’s another paradigm to consider? In the paragraphs ahead, I’ll explain the benefits of immutable infrastructure and give examples of how to implement it. Immutable means servers do not change after they’re deployed. Need to update a server? Deploy a new one. Need to deploy your code? Deploy a new server. Need to fix a server? Nope, shoot it in the head and deploy a new one.

Benefits of Immutable Infrastructure

The biggest benefit of immutable infrastructure is that it pushes you into infrastructure as code. To be able to deploy a new server at any time, you’ll be forced to automate, and you’ll be forced to keep automating. Other benefits include:

  • Forcing you to aggregate all logging information (you can’t retrieve logs from a dead server)
  • No more configuration drift. You’ll no longer have differences between servers that are alike because 1. they don’t change after being deployed and 2. they’re being deployed with the same automation.
  • No more snowflake servers. No more servers that only Joe Blow, who used to be on your team but is now on another team, knows how to configure.
  • Troubleshooting becomes easier. Delete the server and deploy a new one. Your servers are no longer pets. They’re cattle.
  • The ability to test new server configurations before they’re deployed via staging environments.
  • The ability to quickly roll back to a known working state if a new server update breaks things.

Tools of the Trade

The number one thing needed for immutable infrastructure is “cloud” infrastructure. Wait, does that mean I have to pay for the cloud in order to have immutable infrastructure? No, on-prem you can use a number of VM technologies or you can use Kubernetes (sigh, yes, you’re not escaping this post without me talking about this cult revolution). On-prem VM solutions that have an API for deploying servers include OpenStack, VMware, and Proxmox. I use OpenStack at work and at home. It can look very daunting at first, but it’s much easier than it seems. I will expand on that in a future blog post.

Alternatively, you can use a public cloud: AWS, GCP, Azure, OVHcloud, and DigitalOcean, to name a few.

Configuration Management

Now that you can easily deploy servers, it’s time to talk about automation to configure them.

Everyone starts this conversation off by ignoring user-data. If your VM platform supports it (OpenStack, AWS, GCP, OVHcloud, and DigitalOcean, to name a few), user-data is a script that you pass to your VM provisioner and that runs when the VM first boots. You can technically automate your entire infrastructure with user-data and Docker. I recommend doing initial configuration this way before moving on to the more complicated tools I suggest below.
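To make the user-data idea concrete, here is a minimal sketch in Python that renders a user-data shell script for one server role. It assumes an Ubuntu image where Docker is available as the docker.io package. The function name render_user_data, the role name, and the image URL are hypothetical, made up purely for illustration.

```python
import shlex


def render_user_data(role: str, image: str) -> str:
    """Return a shell script to pass as user-data for one server role.

    Assumes an Ubuntu base image; `role` and `image` are illustrative values.
    """
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        # Install Docker from the Ubuntu package repositories.
        "apt-get update",
        "apt-get install -y docker.io",
        # Start the container for this role and restart it if it exits.
        f"docker run -d --restart=always --name {shlex.quote(role)} {shlex.quote(image)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical role and registry, for illustration only.
    print(render_user_data("web", "registry.example.com/web:1.0.0"))
```

The resulting string is what you would hand to your provisioner as the user-data for that server. Generating it from a function keeps every server of the same role identical.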

Other common configuration management tools include Chef, Puppet, SaltStack, and Ansible. Nothing against the others, they’re fine tools if you design your code properly, but I recommend Ansible for two reasons. First, it’s a push model: you push Ansible configuration to a server instead of the server pulling it. Second, it’s agentless: you do not need a long-running “master” that servers pull from, and you do not need an agent on the servers to pull from that “master”.

Simple Immutable Infrastructure

Again, user-data combined with Docker can be very powerful. I recommend the following:

  1. Create a user-data script for each “role” of server in your environment.
  2. Package applications for each “role” into Docker images.
  3. Your user-data script will handle installing Docker and running the Docker image for that role.
  4. Use a tool like Terraform or use Bash to deploy all of your instances with the user-data script passed to them. Terraform’s advantage over Bash is that it is declarative and fits the immutable model. If it detects a configuration change that cannot be applied in place (like a new network added or a change in image, depending on the provider), it will handle deleting that server and replacing it with a new one for you.
  5. Ansible can be used after the server is deployed for any extra configuration. Normally, this is needed for things like load balancers that need to know about other servers in the environment (like your app servers).
  6. Profit
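The replace-instead-of-update behavior from step 4 can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how Terraform is actually implemented. ServerSpec, plan, and all the server names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of one server; frozen so it cannot be mutated."""
    role: str
    image: str
    network: str


def plan(desired: dict[str, ServerSpec],
         running: dict[str, ServerSpec]) -> list[tuple[str, str]]:
    """Compare desired and running servers and return the actions to take.

    There is deliberately no "update" action: any difference in a spec
    means the server is deleted and replaced with a new one.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in running:
            actions.append(("create", name))
        elif running[name] != spec:
            actions.append(("replace", name))
    for name in sorted(running):
        if name not in desired:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    running = {
        "web-1": ServerSpec("web", "ubuntu-22.04", "private"),
        "old-1": ServerSpec("web", "ubuntu-20.04", "private"),
    }
    desired = {
        "web-1": ServerSpec("web", "ubuntu-24.04", "private"),
        "web-2": ServerSpec("web", "ubuntu-24.04", "private"),
    }
    for action, name in plan(desired, running):
        print(action, name)
```

The point of the sketch is the missing branch: a changed image never results in patching the running server, only in a replacement.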

Enter Kubernetes

Yes, another blog post talking about the Cult that is Kubernetes. Why am I talking about Kubernetes in a post about immutable infrastructure? Because Kubernetes deployments are actually immutable.

Kubernetes operates with a master/worker paradigm. You apply YAML definitions to tell Kubernetes the state you want your infrastructure to be in, and the masters spread containers out to the workers to create that state. If a worker goes down, the master notices and spreads new containers out to other workers without your intervention. You can then kill off the worker that went down, create a new worker, and add it to the Kubernetes cluster.
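If the “master notices and reschedules” part sounds like magic, here is a toy sketch of the idea in Python. It is a simplified illustration under my own assumptions and not the real Kubernetes scheduler. The function reconcile and the container and worker names are hypothetical.

```python
def reconcile(placements: dict[str, str], healthy: set[str]) -> dict[str, str]:
    """Return new container-to-worker placements after workers have failed.

    Containers on healthy workers stay where they are. Containers that
    were on a failed worker are placed on the least-loaded healthy worker.
    """
    if not healthy:
        raise RuntimeError("no healthy workers available")

    # Keep every container whose worker is still healthy.
    result = {c: w for c, w in placements.items() if w in healthy}

    # Count how many containers each healthy worker already runs.
    load = {w: 0 for w in healthy}
    for worker in result.values():
        load[worker] += 1

    # Reschedule the containers that lost their worker.
    for container in sorted(placements):
        if container not in result:
            target = min(sorted(load), key=lambda w: load[w])
            result[container] = target
            load[target] += 1
    return result


if __name__ == "__main__":
    placements = {"app-1": "worker-a", "app-2": "worker-b", "app-3": "worker-b"}
    # worker-b has gone down and worker-c has joined the cluster.
    print(reconcile(placements, {"worker-a", "worker-c"}))
```

In this example app-1 stays on worker-a, while app-2 and app-3 are moved off the dead worker-b. Nobody had to SSH anywhere to make that happen.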

Within the past year I have actually moved away from the “simple immutable infrastructure” model above and moved to Kubernetes. Why? Because the model above just pushes Docker images around. Kubernetes was written to do just that, and its YAML deployment definitions are much more portable if you want to switch cloud providers or even use a different provider as a failover site. Instead of rewriting step 4 for a different cloud, you can just apply your already written YAML to a new Kubernetes cluster in a different cloud.

Every major cloud provider that supports user-data also offers Kubernetes as a service if you do not want to deploy it yourself: OpenStack, AWS, GCP, Azure, and DigitalOcean, to name a few. There are also plenty of on-prem deployment options. One I highly recommend is Rancher.

If you know nothing about Kubernetes, you can get up to speed by reading a great book called Kubernetes: Up and Running. This book goes over deploying a simple cluster on-prem and all of the API pieces that help you deploy your apps onto Kubernetes.

Final Thoughts

Infrastructure is not something you want changing out from underneath you while it’s running live. Creating immutable infrastructure simplifies troubleshooting in many ways, allows you opportunities to test your infrastructure before putting it in place, forces good practice like infrastructure as code, automation, and aggregated logging, and also allows you to more easily roll back to a previous state if need be.

Thank you for reading and happy infrastructuring!

Jacob Cody Wimer