Posted Nov 2020

Advanced Tips on Managing Multi-Account Setup on AWS with Terraform

Written by Serkan Özal


Founder and CTO of Thundra

In this article, we'll look at how to use Terraform, HashiCorp's popular infrastructure-as-code tool, to our advantage on AWS, specifically to manage multiple accounts, which is the approach AWS recommends to its customers. Although managing a DevOps team gets more complicated in proportion to the size of the organization, Terraform's support for multiple accounts, along with the tools AWS has released recently, helps us overcome this complexity.

Considering Multiple AWS Accounts

One of the main reasons to use multiple accounts on AWS is to get more granular security and resource segregation throughout your organization. Having this option gives you the ability to define access policies for whatever resource you might have.

While having one big AWS account might have its benefits, depending on your organization's size, managing IAM permissions becomes a pain and certainly isn't best practice. One of the big advantages of using different accounts for your needs is that, unless access is explicitly granted, accounts cannot access or interfere with each other.

The difficult part here is the billing side, especially if the accounts are not linked to each other. Deciding how to consolidate the bills of each and every one of your AWS accounts definitely makes your purchasing department's job a bit more difficult. To overcome this issue, AWS introduced AWS Organizations, which allows you to operate a hierarchy of accounts and consolidate their billing. AWS also provides a service called AWS Control Tower to set up and provision multi-account AWS environments quickly with built-in best practices.

Trendier Approach

The approach AWS suggests for organizations is to use multiple accounts, mainly to prevent administrator privileges from being used by default.

Dealing with IAM permissions to grant least-privilege access to the individuals in the organization is a daunting task and a never-ending effort. Instead of dealing with this complexity, creating an account for every individual seems like a more practical solution. It also implies security by default because if you are an administrator in one account, you can't access other accounts even if you're in the same organization.

Generally speaking, IaC tools are designed to work against a single AWS account. However, Terraform can accommodate the multi-account trend described above. The ability to declare multiple "providers" in a single configuration lets you access not only multiple AWS accounts but also resources from different cloud vendors.

Infrastructure-as-Code with Multiple Accounts

Managing Resources in Different Accounts

There are a couple of ways to achieve this. As we've already mentioned, Terraform can manage resources in different accounts. To do so, we declare multiple "provider" blocks, typically one per AWS account. By default, an AWS provider block uses your current credentials to connect to the AWS API and therefore only works in the account those credentials belong to, so Terraform applies its changes to that user's account. In this case, the Terraform code would simply look like this:

provider "aws" {
  region = "us-east-1"
}

However, if you need to create resources in a different account, the "provider" block has the "assume_role" option for this purpose. This option lets you assume a role in another account and gain access to it. In practice, the Terraform code would look like this:

provider "aws" {
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/iac"
  }
}
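When a single configuration has to touch more than one account, each additional provider block needs an `alias`, and each resource selects its provider explicitly. The following is a minimal sketch rather than a definitive setup: the alias name and the bucket names are made up for illustration, and only the `iac` role ARN comes from the example above.

```hcl
# Default provider: uses the caller's own credentials and account.
provider "aws" {
  region = "us-east-1"
}

# Aliased provider: assumes the "iac" role in account 123456789012.
provider "aws" {
  alias  = "workloads" # hypothetical alias name
  region = "us-east-1"

  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/iac"
  }
}

# Created in the caller's account through the default provider.
resource "aws_s3_bucket" "local_logs" {
  bucket = "example-local-logs" # hypothetical bucket name
}

# Created in account 123456789012 through the aliased provider.
resource "aws_s3_bucket" "workload_logs" {
  provider = aws.workloads
  bucket   = "example-workload-logs" # hypothetical bucket name
}
```

Resources that omit the `provider` argument fall back to the un-aliased provider, so it helps to decide deliberately which account plays that default role.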

To assume a role in another account, you might need to set up IAM permissions that allow your user to assume the given role. AWS has very thorough documentation on how to achieve this. You can read it here.
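On the caller's side, those permissions could be expressed in Terraform roughly as follows. This is a sketch under assumptions: the policy name and the `terraform` user name are hypothetical, and the user is assumed to exist already. The role itself must also trust the caller in its own trust policy, which is configured in the target account.

```hcl
# Policy that allows its holder to assume the "iac" role in account 123456789012.
resource "aws_iam_policy" "assume_iac" {
  name = "assume-iac-role" # hypothetical policy name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = "sts:AssumeRole"
        Resource = "arn:aws:iam::123456789012:role/iac"
      }
    ]
  })
}

# Attach the policy to the IAM user that runs Terraform.
resource "aws_iam_user_policy_attachment" "terraform_assume_iac" {
  user       = "terraform" # hypothetical user name; must already exist
  policy_arn = aws_iam_policy.assume_iac.arn
}
```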

Intermediate Role to Access a Third Account

Another use case extends the previous one a bit further by adding a third AWS account. In this scenario, the assumed role has access to a third account and basically becomes the middleman between the user account and the final account.

Figure 1: Flow control for an intermediate role to access a final AWS account

Although the security benefits are limited in this case, such a setup might be worth your time in specific situations. Bear in mind that this flow of control can become quite complicated over time, and managing all the IAM permissions can be cumbersome and difficult to debug.

Indeed, instead of targeting account 333333333333 in the above diagram, attackers would try to gain access to account 222222222222 in order to gain control of resources located in account 333333333333. If account 222222222222 is used to control resources in other accounts beyond 333333333333, you could argue that the security is weaker because gaining access to account 222222222222 would open up even wider access.
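One way to keep the second hop narrow is to have the role in the final account trust a single named role in the intermediate account instead of the whole account. The sketch below assumes hypothetical role names (`final-access` and `intermediate`) and would be applied in account 333333333333. It does not remove the risk described above; it only limits which identity in account 222222222222 can make the hop.

```hcl
# Applied in the final account (333333333333).
resource "aws_iam_role" "final" {
  name = "final-access" # hypothetical role name

  # Trust policy: only the intermediate role in account 222222222222
  # is allowed to assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = "sts:AssumeRole"
        Principal = {
          AWS = "arn:aws:iam::222222222222:role/intermediate" # hypothetical role name
        }
      }
    ]
  })
}
```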

Other Advanced Terraform Strategies

Multi-States Approach

It's almost inevitable that the number of resources you manage will grow large. At this point, it would be wise to split your Terraform scripts into multiple states, as managing everything in a single state has certain drawbacks:

  • Every time you apply even tiny changes, you will fear that Terraform will touch some foundational resources you don’t want touched.
  • Erroneous changes in the foundational resources applied blindly through continuous deployment could be devastating.
  • The IAM permissions required to apply the Terraform script would be wide-ranging and certainly more than necessary for a CD setup.
  • Quite a lot of time is required to apply the changes because Terraform will need to fetch the state of all the resources managed by the state, even if the vast majority won’t change.
  • The impact of a failed deployment could be wide-ranging.

We can split the resources into two categories: those that change rarely and those that change frequently. In the first category, there are things like VPCs, VPNs, databases, gateways, etc. These resources get updated quite rarely, and even when they do, the changes are typically made by a human rather than by a CD pipeline. In the second category, we have EC2 instances, EKS deployments, ECS services, etc. Updating these resources is generally done by machines, i.e., CD pipelines. Such resources can be placed in a different state than the ones in the first category. This distinction lets Terraform manage only the small, fast-changing fraction of your resources in the pipeline, which then needs only a minimal set of permissions to run these scripts. This kind of segregation is ideal when you are running automated deployments.

One interesting example of this type of setup is a Kubernetes cluster managed by the foundational stack, where the Kubernetes deployments are managed by the CD stack.
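A split like this could be sketched with two separate configurations, each with its own backend key, where the CD stack reads the foundational stack's outputs through the `terraform_remote_state` data source. The directory layout, bucket name, and resource names below are assumptions for illustration only.

```hcl
# foundation/main.tf (hypothetical path): changes rarely, applied by a human.
terraform {
  backend "s3" {
    bucket = "example-terraform-states" # hypothetical bucket name
    key    = "foundation/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Exposed so that other states can read it.
output "vpc_id" {
  value = aws_vpc.main.id
}
```

```hcl
# cd/main.tf (hypothetical path): changes often, applied by the CD pipeline.
terraform {
  backend "s3" {
    bucket = "example-terraform-states" # same hypothetical bucket, different key
    key    = "cd/terraform.tfstate"
    region = "us-east-1"
  }
}

# Read-only view of the outputs stored in the foundational state.
data "terraform_remote_state" "foundation" {
  backend = "s3"

  config = {
    bucket = "example-terraform-states"
    key    = "foundation/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.terraform_remote_state.foundation.outputs.vpc_id
}
```

With this layout, the CD pipeline needs write access only to its own state and resources, plus read access to the foundational state file.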

Modularity

Terraform also supports modules, which help you organize your scripts and divide them into different parts. To do that, you can simply put your code in a different directory and reference it from a "module" block like this:

provider "aws" {
  region = "us-east-1"
}

module "mymodule" {
  source    = "./modules/my_module"
  variable1 = "value1"
  variable2 = "value2"
}
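For the call above to work, the module directory has to declare the inputs it accepts. A minimal sketch of what `./modules/my_module` might contain is shown here; the variable types, the descriptions, and the `joined` output are assumptions, since the article does not show the module's contents.

```hcl
# modules/my_module/variables.tf
variable "variable1" {
  type        = string
  description = "First input passed by the caller."
}

variable "variable2" {
  type        = string
  description = "Second input passed by the caller."
}

# modules/my_module/outputs.tf
output "joined" { # hypothetical output name
  value = "${var.variable1}-${var.variable2}"
}
```

The calling configuration could then read the result as `module.mymodule.joined`.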

You can also publish your modules for public use, and you can find modules published by the Terraform community in the Terraform Registry. However, it would be wise to check that the modules you want to use comply with your security policies.

Keeping to the DRY (Don't Repeat Yourself) principle would be ideal, but there is still some boilerplate code that you can't modularize.
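As an illustration, a community module from the registry can be referenced by its registry address and pinned to a version range, so that only releases you have reviewed get pulled in. The module and version constraint chosen here are just examples, not a recommendation from the article.

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # example constraint; pin to whatever version you have reviewed

  name = "example" # illustrative values
  cidr = "10.0.0.0/16"
}
```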

Environments

Ideally, your organization manages multiple environments like testing, staging, production, etc., and these environments look as similar as possible to each other to avoid broken deployments and have everything working together across all environments.

Terraform has a solution called "workspaces" for this issue. However, it requires you to switch between environments. When you do this manually, it becomes error-prone: eventually someone forgets to switch environments and makes a deployment that was not intended. This can be catastrophic, as you might guess.

You can, however, have a different set of scripts for each environment, but this doesn't comply with the DRY principle. Even if you use modules, there would be a lot of duplication between environments, and managing them all together would be tiresome. There is another tool called Terragrunt, which was developed specifically for this issue. You can read more about it here.
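Inside the configuration, the name of the selected workspace is available as `terraform.workspace`, which can drive per-environment settings. The sketch below assumes workspaces named `testing`, `staging`, and `production` (created with `terraform workspace new <name>` and chosen with `terraform workspace select <name>`); the instance types and the AMI ID are placeholders.

```hcl
locals {
  # Per-environment settings keyed by workspace name (illustrative values).
  instance_types = {
    testing    = "t3.micro"
    staging    = "t3.small"
    production = "t3.large"
  }
}

resource "aws_instance" "app" {
  ami           = "ami-12345678" # placeholder AMI ID
  instance_type = local.instance_types[terraform.workspace]

  tags = {
    Environment = terraform.workspace
  }
}
```

Because the map has no entry for the `default` workspace, a plan run there fails on the lookup instead of deploying anything, which offers a small guard against forgetting to select an environment. It does not protect against selecting the wrong one.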
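To give a rough idea of how Terragrunt reduces the duplication, each environment gets a small `terragrunt.hcl` that points at a shared module and supplies only the values that differ. The directory layout, the parent file name `root.hcl`, the module path, and the input names below are all assumptions for illustration.

```hcl
# live/staging/app/terragrunt.hcl (hypothetical layout)

# Pull in shared settings, such as remote state configuration,
# from a parent folder's root.hcl (assumed file name).
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# The Terraform module this environment instantiates.
terraform {
  source = "../../../modules/app"
}

# Only the values that differ between environments live here.
inputs = {
  environment   = "staging"
  instance_type = "t3.small"
}
```

A production folder would hold the same file with different `inputs`, while the module code itself exists only once.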

Conclusion

To sum up, it is possible to run Terraform in large organizations to manage not only multi-account AWS setups but multiple "providers" as well. We've also covered some advanced techniques to help you organize and tidy up your Terraform code. Using these tactics, it's easier to achieve a centralized and automated way of managing multiple accounts on AWS.