I recently had to deploy a fairly large infrastructure on AWS, with hundreds of resources of many types. This infrastructure was written in Terraform and split across many repositories, as we had various reasons not to use a mono-repo. On top of that, some teams depended on the infrastructure we had to build, while remaining responsible for their own.
Managing infrastructure layers and dependencies
Building infrastructure that others depend on implies defining strong conventions that should not change. Other teams will reference your infrastructure through Terraform Data Sources, passing a name or an ID from your own infra, so they need to know the names of the resources they are referencing. Resources may also be looked up with tags or other patterns. A common use case is finding the private subnets in a VPC, in which case the tag names and values on your resources must be present and well defined. Any change to these values or conventions requires you to synchronize with the teams that depend on them before redeploying any infrastructure or service. You may also want cost dashboards per team, per environment or per any other group that is relevant to you, and these usually rely on tags. These are a few examples of why changing even a single resource name in your infrastructure can be tedious.
Testing your IaC …
We had many discussions about best practices, naming conventions, tagging strategies, security and so on but, you know, whatever we agreed on, we sometimes still failed to apply the rules. So, after running into issues caused by not following these shared practices, we decided to give terraform-compliance a try, to strengthen our Terraform scripts and enforce our own conventions.
Terraform-compliance is an open source BDD test framework that makes it easy to test Terraform scripts. It uses radish (which is Gherkin compatible) to write scenarios in feature files, so you can check your IaC before it is deployed. Interesting, isn’t it? Other tools provide similar functionality, like Sentinel, which is the official framework from HashiCorp, or Chef InSpec, which is also open source. They seem to be really good products as well, but they may not be as easy to integrate as terraform-compliance (disclaimer: I could be wrong since I haven’t tested these tools, and I would love to hear from you if you have already given them a try).
We started by defining some simple naming convention and tag validation scenarios, directly inspired by the examples on their website:
Scenario Outline: Naming Standard on all available resources
  Given I have <resource_name> defined
  When it contains <name_key>
  Then its value must match the "myproject-(prod|uat|dev)-someapplication-.*" regex

  Examples:
    | resource_name          | name_key |
    | AWS EC2 instance       | name     |
    | AWS ELB resource       | name     |
    | AWS RDS instance       | name     |
    | AWS S3 Bucket          | bucket   |
    | AWS EBS volume         | name     |
    | AWS Auto-Scaling Group | name     |
    | aws_key_pair           | key_name |
    | aws_ecs_cluster        | name     |
For each resource in the table above, the scenario checks that the name_key is present (the key differs depending on which Terraform resource you are checking) and that its value matches the given regex.
This led us to a more consistent definition of our tags (mostly based on the tagging strategies from the AWS documentation), and to adding security scenarios to our BDD tests. For instance, checking that no policy allows writing to all our buckets:
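To get a feel for what the naming convention accepts before wiring it into a feature file, the regex can be tried locally. The snippet below is only a sketch: the resource names are made up, and the use of `re.match` (anchored at the start of the name) is an assumption on my side, not a description of how terraform-compliance evaluates the step internally.

```python
import re

# Same pattern as in the scenario; anchoring at the start of the string
# via re.match is an assumption made for this sketch.
NAMING_PATTERN = re.compile(r"myproject-(prod|uat|dev)-someapplication-.*")


def follows_convention(name: str) -> bool:
    """Return True when the name starts with the agreed prefix."""
    return NAMING_PATTERN.match(name) is not None


# Hypothetical resource names, for illustration only.
candidates = [
    "myproject-prod-someapplication-api",
    "myproject-staging-someapplication-api",  # unknown environment
    "someapplication-dev-db",                 # missing project prefix
]

for name in candidates:
    verdict = "ok" if follows_convention(name) else "rejected"
    print(f"{name}: {verdict}")
```

Only the first name passes. The other two fail on the environment and on the project prefix respectively.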
Scenario: Reject if a policy allows to write to any bucket
  Given I have role_policy defined
  When it contains policy
  And it contains Statement
  And its Effect is Allow
  And its Action includes "s3:PutObject","s3:PutObjectAcl"
  And it contains resource
  Then its value must not match the "\*" regex
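The intent of this scenario can also be expressed as a small Python sketch working on a policy document. It is not what terraform-compliance runs. The function name and the example policy are hypothetical, and the check is deliberately narrow: it only flags a resource that is exactly "*", whereas a regex step may behave differently on values such as a bucket ARN ending with a wildcard.

```python
import json

# Write actions taken from the scenario.
WRITE_ACTIONS = {"s3:PutObject", "s3:PutObjectAcl"}


def as_list(value):
    """IAM accepts either a single value or a list for these fields."""
    return value if isinstance(value, list) else [value]


def wildcard_write_statements(policy_json: str) -> list:
    """Return the Allow statements granting a write action on resource "*"."""
    policy = json.loads(policy_json)
    offending = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = set(as_list(statement.get("Action", [])))
        resources = as_list(statement.get("Resource", []))
        if actions & WRITE_ACTIONS and "*" in resources:
            offending.append(statement)
    return offending


# Hypothetical policy: only the first statement should be reported.
example_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:PutObject", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": "arn:aws:s3:::myproject-dev-someapplication-logs/*",
        },
    ],
})

for statement in wildcard_write_statements(example_policy):
    print("wildcard write access:", statement)
```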
Getting started with these examples was really quick, and integrating them in our pipelines was simple. We did something like this (kept short for readability):
pip install terraform-compliance          # get dependency
terraform plan -out plan.out              # generate the plan to be tested
terraform-compliance -p plan.out -f dir   # run your BDD tests
You may of course use a Python virtualenv, and reference your features from a git repository with the ‘git’ prefix (after the -f option) so you can easily share your BDDs among teams. The run gives you an overview of the results, along with detailed information about which scenario(s) and resource(s) do not comply with your BDDs. In our case, for example, a tag was missing on an IAM role resource, so the value check could not be performed.
So you just have to fix it and you’re good to go.
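To illustrate the kind of finding described above (a required tag missing on a resource), here is a minimal sketch that inspects a plan exported as JSON, for instance with `terraform show -json plan.out > plan.json`. It is not how terraform-compliance works internally. The required tag names, the function name and the sample plan are hypothetical, and only the `resource_changes` part of the plan JSON is read.

```python
# Hypothetical tagging convention; adapt to your own.
REQUIRED_TAGS = {"Name", "Environment"}


def resources_missing_tags(plan: dict, required=frozenset(REQUIRED_TAGS)) -> dict:
    """Map each resource address to the sorted list of required tags it lacks."""
    missing = {}
    for change in plan.get("resource_changes", []):
        after = (change.get("change") or {}).get("after") or {}
        if "tags" not in after:
            # Assumption: no "tags" key means the resource type is not taggable.
            continue
        tags = after.get("tags") or {}
        absent = set(required) - set(tags)
        if absent:
            missing[change["address"]] = sorted(absent)
    return missing


# Trimmed-down, made-up plan with one compliant and one faulty resource.
sample_plan = {
    "resource_changes": [
        {
            "address": "aws_iam_role.app",
            "type": "aws_iam_role",
            "change": {"after": {"name": "myproject-dev-someapplication-role",
                                 "tags": {"Name": "myproject-dev-someapplication-role"}}},
        },
        {
            "address": "aws_s3_bucket.logs",
            "type": "aws_s3_bucket",
            "change": {"after": {"bucket": "myproject-dev-someapplication-logs",
                                 "tags": {"Name": "logs", "Environment": "dev"}}},
        },
    ]
}

for address, tags in resources_missing_tags(sample_plan).items():
    print(f"{address} is missing tags: {', '.join(tags)}")
```

With a real plan, you would load the exported file with `json.load` and pass the resulting dictionary to the same function.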
… helps improve your infra overall
Using this tool helped us a lot to share better infrastructure standards across teams, and it strengthened the integration between our various infrastructure layers and components. It also greatly improved the quality of our Terraform scripts and resources, as we can no longer misname a resource or forget to add important tags to it, for instance. We also felt that it really improved our infrastructure security, since we remove bad patterns (like wildcard policies) as we add more BDDs.
Terraform-compliance is a really good option if you don’t have a dedicated team to manage your infrastructure, or if you don’t use a cloud managed by a big company with rules and conventions already in place. It lets you set up your BDD tests quickly and easily. The most important part remains deciding which tests make sense in your context.
Sources :
Image by Susanne Jutzeler, suju-foto from Pixabay
Logo from terraform-compliance from github
https://terraform-compliance.com/
https://www.inspec.io/
https://www.hashicorp.com/sentinel/
http://radish-bdd.io/
https://aws.amazon.com/answers/account-management/aws-tagging-strategies/
https://docs.aws.amazon.com/fr_fr/AmazonS3/latest/dev/using-with-s3-actions.html