The fewer facilities a programming language has,
the more important rigour is in everyday use to
prevent emergence of catastrophic complexity
Slide 3
Slide 3 text
Bio
• CTO at Event Store, a company which builds an open source stream database
• Terraform Core maintainer at HashiCorp from vEarly to ~v0.8
• Member of the first SRE team at HashiCorp, developing patterns for scalable
Terraform and running advanced Terraform training classes
• Terraform AWS provider maintainer until ~September 2019
• Pulumi contributor since June 2018
• Tweet @jen20 primarily about Rust and Brexit (ffs)
Slide 5
Slide 5 text
Survey Questions…
Slide 7
Slide 7 text
Infrastructure as Code
Slide 8
Slide 8 text
Infrastructure as Code?
Slide 9
Slide 9 text
Infrastructure as Code
Infrastructure as Software
Slide 10
Slide 10 text
Ancient History…
Slide 11
Slide 11 text
Terraform 0.1
Slide 12
Slide 12 text
Terraform 0.1
• No modules - everything lived in one namespace
• No terraform_remote_state resource
• No remote state at all!
• Very “forgiving” HCL dialect (pre-HCL 1.0)
By Terraform 0.6, we had the
tools to avoid creating a mess…
Slide 21
Slide 21 text
… but people already had large
amounts of Terraform code, and the
‘best practices’ were already in the wild.
Slide 22
Slide 22 text
Scalable Infrastructure Patterns
Slide 23
Slide 23 text
Microservices
Slide 24
Slide 24 text
Microservices
• This is a tortuous analogy because I don’t actually like the name Microservices, or
many of the concepts that it has come to embody.
• Perhaps Service Oriented Architecture (SOA) is a better name
• The gestalt is that we prefer to keep services (“applications”):
• Small, so they can easily be understood (“micro”)
• Independent, so that issues in one service are less likely to affect another
• Similar to the UNIX philosophy of applications doing one thing (and doing it well).
Slide 25
Slide 25 text
Terraform
• We should also prefer to keep Terraform “applications” as small as possible:
• An “application” is the module where we type terraform {plan,apply}
• One “application” represents a single state file
• A single state file is:
• A Security boundary (RBAC in TFE, )
• A Blast Radius boundary in terms of accidental destruction
Slide 26
Slide 26 text
Terraform
• One state file for an entire infrastructure is A Bad Time waiting to happen
• Since Terraform 0.4, we don’t have to do this!
• Prefer to keep configuration for each aspect of a system in a
separate “application”
• Aspects of the system live in a hierarchy
• Refer to outputs of “applications” lower in the hierarchy by using the
terraform_remote_state data source
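A minimal sketch of this linkage, written from the point of view of an "application" higher in the hierarchy. It assumes Terraform 0.12 or later syntax and an S3 backend; the bucket, key, region and output name are hypothetical placeholders, not taken from the talk.

```hcl
# Web site "application": read the outputs published by the network
# "application" lower in the hierarchy.
# Assumption: the network state is stored in an S3 backend; bucket, key and
# region below are hypothetical.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Assumption: the network "application" declares a root-level output named
# private_subnet_ids. Only root-level outputs are visible through
# terraform_remote_state.
locals {
  private_subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids
}
```

Each "application" keeps its own state file, so a mistake while applying the web site cannot destroy the network it depends on.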
Slide 27
Slide 27 text
[Diagram: the Bastion Host “Application” and the Web Site “Application” both sit above the Network “Application” and read its outputs (private subnet IDs, public subnet IDs) via terraform_remote_state]
Slide 28
Slide 28 text
Prefer to have more state files
which are individually smaller
Slide 29
Slide 29 text
“organizations which design systems...
are constrained to produce designs
which are copies of the communication
structures of these organizations.”
— Melvin Conway, 1967
Slide 30
Slide 30 text
The “Composition Root”
Pattern
Slide 32
Slide 32 text
The “Composition Root” Pattern
• “Where should we compose module graphs?”
• “As close as possible to the application’s entry point”
• “A Composition Root is a (preferably) unique location in an application where
modules are composed together.”
• “Only at the entry point of the application is the entire module graph finally
composed”
• “Only applications should have composition roots. Libraries and frameworks
shouldn’t have them”
Slide 33
Slide 33 text
Applying this to Terraform…
Slide 34
Slide 34 text
Modules should either manage
resources or compose other modules
Slide 35
Slide 35 text
There should be one composition
root per application, at the root,
parameterised for environments
Slide 36
Slide 36 text
Composition Root Configuration
• Composition Roots should consist of very little code and minimal logic:
• Variables - only things that need to change per deployment or workspace at the
current time!
• State Storage Configuration - to configure the backend.
• Data Sources - to query the state of the cloud based on variables passed in, and to
link to other “applications” in the same system.
• Module Declarations - to instantiate the modules which create resources.
• Outputs - only the things that need to be provided to other stacks at the current time!
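The five ingredients above might look like this in a single composition root. This is a sketch assuming Terraform 0.12 or later and an S3 backend; the bucket, keys, region, the ./modules/web_site path, and all variable and output names are hypothetical.

```hcl
# Variables: only what changes per deployment or workspace.
variable "environment" {
  type        = string
  description = "Deployment environment, for example staging or production"
}

# State storage configuration: one state file for this "application".
terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "website/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Data sources: link to another "application" in the same system.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Module declarations: instantiate resource modules and pass them plain
# values. No resources are declared directly in the composition root.
module "web_site" {
  source = "./modules/web_site" # hypothetical resource module

  environment = var.environment
  subnet_ids  = data.terraform_remote_state.network.outputs.private_subnet_ids
}

# Outputs: only what other stacks need at the current time.
output "web_site_dns_name" {
  value = module.web_site.dns_name
}
```

The root contains wiring only; all the logic that creates resources lives in the module it instantiates.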
Slide 38
Slide 38 text
Resource Modules
Slide 39
Slide 39 text
“SOLID” Principles
Slide 40
Slide 40 text
Resource Modules
• The modules instantiated by composition roots should be:
• Simple - no “clever” hacks unless they absolutely cannot be avoided.
• Flat - no nested module dependencies.
• Pure - treat them like functions.
• More importantly they should be:
• Cohesive - resources related to the same aspect of a system should live in the same module.
• Decoupled - no dependencies on other modules, only on the variables they are passed (and
potentially limited data sources).
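As an illustration of these properties, here is a sketch of a small resource module, assuming Terraform 0.12 or later and the AWS provider. The module name, file layout, variables and output are hypothetical and are not the code shown in the talk.

```hcl
# modules/web_site/variables.tf
# The variables are the module's inputs, like function parameters.
variable "environment" {
  type        = string
  description = "Deployment environment name, used in resource names"
}

variable "subnet_ids" {
  type        = list(string)
  description = "Subnets for the load balancer, passed in by the composition root"
}

# modules/web_site/main.tf
# Resources for one aspect of the system. There are no nested module blocks
# and no references to other modules, only to the variables passed in.
resource "aws_lb" "web" {
  name               = "web-${var.environment}"
  internal           = true
  load_balancer_type = "application"
  subnets            = var.subnet_ids
}

# modules/web_site/outputs.tf
# The outputs are the module's "return value".
output "dns_name" {
  description = "DNS name of the load balancer"
  value       = aws_lb.web.dns_name
}
```

Because the module depends only on its variables, it can be reused from any composition root that can supply a list of subnet IDs.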
Slide 41
Slide 41 text
Resource Module Structure
Slide 42
Slide 42 text
Module Design - An Interface
Slide 43
Slide 43 text
Module Design - Resources
[Figure: module resources split across the files repo.tf, dns.tf and stage.tf]
Slide 44
Slide 44 text
Resource Module Anti-Patterns
• Modules to create an individual resource or small group of resources which expose
every option as a configuration point
• Modules are an abstraction - treat them that way
• Modules which instantiate other modules
• State file gets complex, as do resource paths (especially pre-Terraform 0.12)
• Refactoring is hard
• Complexity spirals out of control
• Golden rule: Only composition roots instantiate modules
Slide 45
Slide 45 text
This approach scales well both with
infrastructure size and number of
infrastructure developers
Slide 46
Slide 46 text
Enforcing this level of rigour in your
organisation helps, because the language
itself does not provide much.
Slide 47
Slide 47 text
Testing
• Testing has been a thing in application development for some time now. Why don’t
we apply it to infrastructure?
• The “testing pyramid” defines various stages of testing for software:
• Unit Tests - Fast, do not communicate with external resources. Verify results
“in the small”. Lots of them.
• The unit is the test NOT the test subject.
• Acceptance Tests - Slow(er), communicate with external resources and verify
results throughout a system. Relatively fewer of them.
Slide 48
Slide 48 text
Applying this to Terraform
• HashiCorp Sentinel
• Gruntwork Terratest
• Serverspec
• Pulumi Testing Framework and runner
• Detailed analysis is out of scope for this talk; other talks at this
conference cover it in more depth (similarly at FOSDEM, with videos)
Slide 49
Slide 49 text
For a step-change in the usability of
Infrastructure as Code, we need to
instead think of Infrastructure as Software
Slide 50
Slide 50 text
Resources
• Any software engineering book covering functional or object oriented design.
• Terraform Design Guide on the HashiCorp website
• HashiDays NYC 2017 Talk - Open source code illustrating this pattern in AWS for
a wide variety of infrastructure pieces:
• Code: https://github.com/jen20/hashidays-nyc
• For Terraform 0.11
• Pull Requests Accepted™