How to manage Terraform state

If you have used Terraform to create and update resources, you may have noticed that each time you ran terraform plan or terraform apply, Terraform was able to find the resources it had created previously and update them accordingly. But how did Terraform know which resources it was supposed to manage? You could have all sorts of infrastructure in your AWS account, deployed in a variety of ways (manually, using Terraform, via the CLI), so how does Terraform tell which infrastructure it is responsible for?

In this post, we will look at how Terraform tracks the state of your infrastructure and how that affects the organisation, security, and isolation of your Terraform project’s files.

What is Terraform state?

Every time you run Terraform, it records information about the infrastructure it created in a Terraform state file. By default, when you run Terraform in the folder /foo/bar, it creates the file /foo/bar/terraform.tfstate. This file uses a custom JSON format that records the mapping between the Terraform resources defined in your configuration files and their real-world counterparts. Take the following Terraform configuration as an example:

resource "aws_instance" "example" {
  ami           = "ami-40d28157"
  instance_type = "t2.micro"
}

After running terraform apply, here is a small snippet of the contents of the terraform.tfstate file:

{
  "aws_instance.example": {
    "type": "aws_instance",
    "primary": {
      "id": "i-66ba8957",
      "attributes": {
        "ami": "ami-40d28157",
        "availability_zone": "us-east-1d",
        "id": "i-66ba8957",
        "instance_state": "running",
        "instance_type": "t2.micro",
        "network_interface_id": "eni-7c4fcf6e",
        "private_dns": "ip-172-31-23-109.ec2.internal",
        "private_ip": "172.31.23.109",
        "public_dns": "ec2-54-159-88-79.compute-1.amazonaws.com",
        "public_ip": "54.159.88.79",
        "subnet_id": "subnet-3b29db10"
      }
    }
  }
}

Using this simple JSON format, Terraform knows that aws_instance.example corresponds to an EC2 Instance in your AWS account with ID i-66ba8957. Every time you run Terraform, it can fetch the latest status of this EC2 Instance from AWS and compare that to what’s in your Terraform configurations to determine what changes need to be applied.
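
The attributes recorded in the state are also what Terraform makes available to the rest of your configuration. As a minimal sketch, assuming the same aws_instance.example resource as above (the output name public_ip is an arbitrary choice for illustration), an output variable could surface one of them:

```hcl
# Same instance as in the example above.
resource "aws_instance" "example" {
  ami           = "ami-40d28157"
  instance_type = "t2.micro"
}

# Expose the public IP address that Terraform recorded for the instance.
output "public_ip" {
  value = "${aws_instance.example.public_ip}"
}
```

After terraform apply, running terraform output public_ip should print the recorded address, with no need to open terraform.tfstate yourself.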

The State File Is a Private API

The state file format is a private API that changes with every release of Terraform and is intended only for use within Terraform itself. Never edit Terraform state files by hand, and never write code that reads them directly. If, for some reason, you do need to manipulate the state, use the terraform import command or the terraform state commands. This should be a rare event.

If you’re using Terraform for a personal project, storing state in a local terraform.tfstate file works fine. However, if you want to use Terraform as a team on a real product, you will run into several problems:

Shared storage for state files

To use Terraform to keep your infrastructure up to date, every member of your team needs access to the same Terraform state files. That means you have to store those files in a shared location.

Locking state files

As soon as data is shared, you run into a new problem: locking. Without locking, if two team members run Terraform at the same time, several Terraform processes may make concurrent updates to the state files and you can hit race conditions. This can result in conflicts, data loss, and corruption of the state file.

Isolating state files

When making changes to your infrastructure, it is good practice to isolate different environments. For instance, when you make a change in a testing or staging environment, you want to be sure that there is no chance of accidentally breaking production.

Shared Storage for State Files

The most common technique for allowing multiple team members to access a common set of files is to put them in version control (e.g., Git). With Terraform state, it is a bad idea for two reasons:

Manual error

It’s too easy to forget to get the latest changes from version control before running Terraform or to push your latest changes to version control after running Terraform. It’s only a matter of time before someone on your team runs Terraform with out-of-date state files and, as a result, rolls back or duplicates previous deployments.

Secrets

All of the information in Terraform state files is stored in plain text. This is a problem because some Terraform resources need to store sensitive data. For example, if you use the aws_db_instance resource to create a database, Terraform will store the user name and password for the database in plain text in a state file. Putting plain-text secrets anywhere, including version control, is a bad idea. As of November 2016, this is still an open issue in the Terraform community, but there are some reasonable workarounds, which I will talk about shortly.
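
To make the problem concrete, here is a minimal sketch of a database definition. The variable name db_password and the engine, storage size, instance class, and user name are illustrative assumptions, not values from a real deployment:

```hcl
# Hypothetical input variable; supply the value at apply time
# (for example via -var or a TF_VAR_db_password environment variable).
variable "db_password" {
  description = "Master password for the example database"
}

resource "aws_db_instance" "example" {
  engine            = "mysql"
  allocated_storage = 10
  instance_class    = "db.t2.micro"
  username          = "admin"

  # Keeping the password out of the .tf files does not keep it out of
  # the state: Terraform records the resolved value in terraform.tfstate.
  password = "${var.db_password}"
}
```

Even though the password never appears in the configuration files, anyone who can read the resulting state file can read the password.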

Use Terraform’s built-in support for Remote State Storage instead of version control to handle shared storage for state files. Using the terraform remote config command, you can set up Terraform to fetch and store state data from a remote store every time it runs. Several remote stores, such as Amazon S3, Azure Storage, HashiCorp Consul, and HashiCorp Terraform Pro and Terraform Enterprise, are supported.

For the following reasons, I normally suggest Amazon S3 (Simple Storage Service), Amazon’s managed file store:

Since it is a managed service, no additional infrastructure deployment or management is required.
It is designed for 99.999999999% durability and 99.99% availability, so it is very unlikely to suffer an outage or lose your data.
It supports encryption, which reduces the risk of keeping sensitive data in state files. This is only a partial solution, since anybody on your team with access to that S3 bucket can still see the state files in unencrypted form, but at least the data will be encrypted at rest (S3 supports server-side encryption using AES-256) and in transit (Terraform uses SSL to read and write data in S3).
It supports versioning, so every revision of your state file is stored and you can roll back to an older version if something goes wrong.
It is inexpensive: most Terraform usage fits within the free tier.

Because S3 is an eventually consistent file store, changes may take a few seconds to propagate. If you have a big, geographically distributed team that makes frequent updates to the same Terraform state, there is a very small chance that you will end up with stale state. For these use cases, you may wish to use a different remote state store, such as Terraform Pro or Terraform Enterprise.

Creating an S3 Bucket to Store the State File

To enable remote state storage with S3, the first step is to create an S3 bucket. Create a main.tf file in a new folder and at the top of the file, specify AWS as the provider.

provider "aws" {
  region = "us-east-1"
}

Next, create an S3 bucket by using the aws_s3_bucket resource:

resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-up-and-running-state"

  versioning {
    enabled = true
  }

  lifecycle {
    prevent_destroy = true
  }
}

This code sets three parameters:

bucket

This is the name of the S3 bucket. Note that it must be globally unique. Remember this name, along with the AWS region you’re using, since you’ll need both later on.

versioning

This block enables versioning on the S3 bucket, so that every update to a file in the bucket actually creates a new version of that file. This allows you to see older versions of the file and revert to those older versions at any time.

prevent_destroy

prevent_destroy is a lifecycle setting (another is create_before_destroy). When you set prevent_destroy to true on a resource, any attempt to delete that resource (for example, by running terraform destroy) will result in an error. This is a good way to avoid accidentally deleting an important resource, such as this S3 bucket, which contains all of your Terraform state.

Run terraform plan, and if everything looks OK, create the bucket by running terraform apply. After this completes, you will have an S3 bucket, but your Terraform state is still stored locally. To configure Terraform to store the state in your S3 bucket (with encryption), run the following command, filling in your own values where specified:

> terraform remote config \
    -backend=s3 \
    -backend-config="bucket=(YOUR_BUCKET_NAME)" \
    -backend-config="key=global/s3/terraform.tfstate" \
    -backend-config="region=us-east-1" \
    -backend-config="encrypt=true"

Remote configuration updated
Remote state configured and pulled.

From now on, Terraform automatically pushes and pulls state data to and from S3, and S3 stores every revision of the state file, which is handy for troubleshooting and for rolling back to an earlier version if something goes wrong.
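
In Terraform 0.9 and later, the terraform remote config command was replaced by backend blocks declared in the configuration itself. As a rough sketch of how the same settings could look there, assuming the bucket name used earlier in this post:

```hcl
terraform {
  # Equivalent of the -backend=s3 and -backend-config flags.
  backend "s3" {
    bucket  = "terraform-up-and-running-state"
    key     = "global/s3/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}
```

With a block like this in place, running terraform init configures the backend, so the settings live in version control rather than in a command each developer has to type.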

Locking State Files

Enabling remote state fixes the problem of sharing state files with peers, but it introduces two additional issues:

1. For each Terraform project, each developer on your team must remember to run the terraform remote config command. It’s easy to make a mistake or to forget to run this lengthy command.

2. While Terraform remote state storage guarantees that your state is kept in a shared place, it does not offer locking for that location. As a result, race conditions are still possible if two developers run Terraform on the same state files at the same time.

There are three common ways to address these issues:

Terraform Pro or Terraform Enterprise

Terraform itself is an open-source project, but HashiCorp, the company that created it, offers commercial products called Terraform Pro and Terraform Enterprise, both of which include locking for state files.

Build server

You can remove the need for locking entirely by enforcing a rule in your team that no one may run Terraform locally to modify a shared environment (e.g., staging, production). Instead, all changes must be applied automatically by a build server such as Jenkins or CircleCI, which you can configure to never apply more than one change at a time. Using a build server to automate deployments is a good idea regardless of the locking strategy you use, because it lets you catch bugs and enforce compliance requirements by running automated tests before any change is applied.

Terragrunt

Terragrunt is a lightweight, open-source wrapper for Terraform that configures remote state automatically and supports locking by using Amazon DynamoDB. Since DynamoDB is included in the AWS free tier, most teams should be able to use it for locking at no cost.

Isolating State Files

Sharing and locking state remotely makes teamwork much easier, but one problem remains: isolation. When you first start using Terraform, it’s tempting to define all of your infrastructure in a single Terraform file or in a set of Terraform files in a single folder. The drawback of this approach is that all of your Terraform state ends up in a single file, so a mistake anywhere can break everything.

For example, while trying to deploy a new version of your app in staging, you might accidentally break the app in production. Worse, if you don’t use locking or you hit a rare Terraform bug, you might corrupt the state file and break the infrastructure in every environment.

If you are controlling all your environments from a single set of Terraform configurations, you are defeating the purpose of having distinct environments in the first place.

To achieve isolation, store the Terraform configuration files for each environment in a separate folder: for example, a stage folder containing the configurations for the staging environment and a prod folder containing the configurations for the production environment. Then, if something goes wrong in one environment, it won’t affect the others, because Terraform uses a separate state file for each folder.
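
As a sketch of what that separation might look like, assuming Terraform 0.9 or later (where remote state is declared in a backend block rather than with terraform remote config) and a hypothetical layout with stage/main.tf and prod/main.tf, each folder can point at its own state file through a distinct key:

```hcl
# stage/main.tf (hypothetical file): state for the staging environment only.
terraform {
  backend "s3" {
    bucket  = "terraform-up-and-running-state"
    key     = "stage/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}

# prod/main.tf would contain the same block with a different key, e.g.
#   key = "prod/terraform.tfstate"
# so that Terraform runs in stage read and write a different state file
# than runs in prod.
```

The bucket name is the one created earlier in this post; the two key paths are illustrative and can follow whatever naming scheme your team prefers.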

Conclusion

Infrastructure as code (IaC) requires more careful thought about isolation, locking, and state than conventional application code, because the trade-offs are different. Most bugs you introduce while writing code for a typical app are fairly minor and affect only a small part of that app. Bugs in the code that manages your infrastructure can have far-reaching effects, breaking not only individual applications but also data stores, networks, and other components.
