“Why do I keep on drifting? Yes, I wish I knew why? I am not aware of the reason myself. Why do I keep on drifting?”
Avijeet Das
In a previous blog post I talked about gap analysis, the process of determining which assets you own are under Infrastructure-as-Code (IaC) management. Once you have identified those assets and brought them under IaC, there are still issues that can occur. One of those is “drift”, which is when the code model of your resources in IaC differs from the reality of the deployed resources.
Drift may occur for many reasons, some of which are:
- A change has been made to the IaC model but not deployed yet.
- A change has been manually made to the deployed asset.
- An implicit change has occurred, such as when a new version of software becomes available to deploy.
- The Terraform provider has changed, which may happen if it is a third-party library such as the AWS or Google Cloud (GCP) providers. Often such a change is a new API field being added.
Terraform has the plan command, which helps us determine what has changed. I like to run all my Terraform plans at least once a week to generate a report on all the drift in my environments. Terraform helps with this, as there is a toggle to ask for JSON output from Terraform plans. Using this JSON output is not trivial, however, as it is pretty verbose. With a little post-processing we can extract the needed information and aggregate it into a single data model that covers every environment.
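As a minimal sketch of that post-processing step, the Python below reduces a plan's JSON to one small record per changed resource. It assumes the JSON comes from `terraform show -json` run against a saved plan file, which exposes a `resource_changes` list with `address`, `type`, and `change.actions` / `change.before` / `change.after`. The function names, the `tfplan` file name, and the record layout are illustrative choices of mine, not part of Terraform.

```python
import json
import subprocess


def load_plan(workdir):
    """Run a plan in `workdir` and return its JSON form (needs the terraform CLI)."""
    subprocess.run(
        ["terraform", "plan", "-input=false", "-out=tfplan"],
        cwd=workdir, check=True,
    )
    shown = subprocess.run(
        ["terraform", "show", "-json", "tfplan"],
        cwd=workdir, check=True, capture_output=True, text=True,
    )
    return json.loads(shown.stdout)


def changed_keys(before, after):
    """Top-level attributes whose values differ; nested changes show up as the parent key."""
    before = before or {}
    after = after or {}
    return sorted(k for k in set(before) | set(after) if before.get(k) != after.get(k))


def summarize_plan(plan, environment):
    """Reduce a verbose plan document to one compact record per changed resource."""
    drift = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        actions = change.get("actions", [])
        if actions in (["no-op"], ["read"]):
            continue  # nothing would be modified for this resource
        drift.append({
            "environment": environment,
            "address": rc["address"],
            "type": rc.get("type"),
            "actions": actions,
            "changed": changed_keys(change.get("before"), change.get("after")),
        })
    return drift


# A hand-written fragment in the shape of the plan JSON, so the sketch runs without Terraform.
sample_plan = {
    "resource_changes": [
        {
            "address": "aws_s3_bucket.logs",
            "type": "aws_s3_bucket",
            "change": {
                "actions": ["update"],
                "before": {"bucket": "logs", "tags": {"team": "ops"}},
                "after": {"bucket": "logs", "tags": {"team": "platform"}},
            },
        },
        {
            "address": "aws_iam_role.ci",
            "type": "aws_iam_role",
            "change": {"actions": ["no-op"], "before": {"name": "ci"}, "after": {"name": "ci"}},
        },
    ]
}

drift = summarize_plan(sample_plan, "staging")
print(json.dumps(drift, indent=2))
```

Calling `summarize_plan(load_plan(path), name)` for each environment and concatenating the lists gives one aggregated model for the weekly report.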
Having marshaled the complete set of drift in our environments as JSON, it is then a standard case of rendering the JSON into a usable report, which I do using the Python template language Jinja2. As in the previous case of gap analysis, we want some changes whitelisted because we do not care about them (e.g. GKE versions change very rapidly, so we usually only want to know about major version changes).
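The whitelisting and rendering steps might look roughly like the sketch below. Everything in it is an assumption made for illustration: the record layout (a `changed` mapping of attribute name to a before/after pair), the list of GKE version attributes, and the reading of "major version" as the first two components of the version string (1.27 versus 1.28, since Kubernetes versions all start with 1). The Jinja2 import sits inside `render_report` so the filtering helpers run without the library installed.

```python
# Hypothetical set of GKE cluster attributes whose changes are usually noise.
VERSION_FIELDS = {"min_master_version", "master_version", "node_version"}

REPORT_TEMPLATE = """\
Drift report
{% for e in entries %}
[{{ e.environment }}] {{ e.address }} ({{ e.actions | join(", ") }})
{% for attr, pair in e.changed.items() %}
    {{ attr }}: {{ pair[0] }} -> {{ pair[1] }}
{% endfor %}
{% else %}
No drift detected.
{% endfor %}
"""


def release(version, parts=2):
    """'1.27.3-gke.100' -> ('1', '27'); the part we treat as the major release."""
    return tuple(version.split("-")[0].split(".")[:parts])


def is_ignorable(entry):
    """True when a GKE cluster differs only by a patch-level version bump."""
    if entry["type"] != "google_container_cluster":
        return False
    changed = entry["changed"]
    if not changed or not set(changed) <= VERSION_FIELDS:
        return False
    return all(
        isinstance(old, str) and isinstance(new, str) and release(old) == release(new)
        for old, new in changed.values()
    )


def filter_drift(entries):
    """Drop whitelisted entries, keeping everything worth reporting."""
    return [e for e in entries if not is_ignorable(e)]


def render_report(entries):
    """Render the remaining drift as plain text (requires the jinja2 package)."""
    from jinja2 import Environment

    env = Environment(trim_blocks=True, lstrip_blocks=True)
    return env.from_string(REPORT_TEMPLATE).render(entries=entries)


# Made-up entries in the assumed layout.
entries = [
    {
        "environment": "prod",
        "address": "google_container_cluster.main",
        "type": "google_container_cluster",
        "actions": ["update"],
        "changed": {"node_version": ("1.27.3-gke.100", "1.27.8-gke.1067")},
    },
    {
        "environment": "prod",
        "address": "google_storage_bucket.logs",
        "type": "google_storage_bucket",
        "actions": ["update"],
        "changed": {"force_destroy": (False, True)},
    },
]

reportable = filter_drift(entries)
```

With Jinja2 installed, `render_report(reportable)` returns the report as a string, ready to be written to a file or posted to a chat channel. Keeping the whitelist as a function rather than inside the template means the same rule can be reused for other output formats.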
This technique can also be applied to impact analysis, which is the process of identifying the changes which will be made by your IaC tooling if a deployment is applied.
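For impact analysis the same plan JSON can be condensed into a short pre-deployment summary. The sketch below assumes, as before, the `resource_changes` structure emitted by `terraform show -json`. The function name and the choice to flag any action list containing `delete` (which includes replacements) are my own illustration.

```python
from collections import Counter


def impact_summary(plan):
    """Count planned actions and collect resources that would be destroyed or replaced."""
    counts = Counter()
    destructive = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if actions in (["no-op"], ["read"]):
            continue
        counts["+".join(actions)] += 1
        if "delete" in actions:
            destructive.append(rc["address"])
    return counts, destructive


# A hand-written fragment in the shape of the plan JSON.
sample_plan = {
    "resource_changes": [
        {"address": "aws_sqs_queue.jobs", "change": {"actions": ["create"]}},
        {"address": "aws_instance.web", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_iam_role.ci", "change": {"actions": ["no-op"]}},
    ]
}

counts, destructive = impact_summary(sample_plan)
print(dict(counts))
print(destructive)
```

A deployment pipeline could refuse to apply automatically whenever `destructive` is non-empty and ask for a human review instead.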
Techniques like gap analysis and drift analysis are powerful ways to know what your assets are and to keep your IaC model of those assets in sync with the reality of what is deployed. Impact analysis allows us to run deployment plans with confidence, because we know beforehand what changes will be applied.
Of course, these techniques work with tools other than Terraform, such as ArgoCD, Ansible, Helm, etc.