Terraform Without The Mess

James Nugent
February 03, 2020

The way Terraform configuration is written today is heavily shaped by the history of the project, and by when features essential to minimising complexity became available.

In this talk we'll look at some of the history of the project and make specific recommendations about how to structure Terraform configuration in order to make changes simple, predictable and safe.

Transcript

  1. Terraform Without The Mess
    James Nugent
    @jen20

  2. The fewer facilities a programming language has,
    the more important rigour is in everyday use to
    prevent emergence of catastrophic complexity

  3. Bio
    • CTO at Event Store, a company which builds an open source stream database
    • Terraform Core maintainer at HashiCorp from vEarly to ~v0.8
    • Member of the first SRE team at HashiCorp, developing patterns for scalable
    Terraform and running advanced Terraform training classes
    • Terraform AWS provider maintainer until ~September 2019
    • Pulumi contributor since June 2018
    • Tweet @jen20 primarily about Rust and Brexit (ffs)

  4. Survey Questions…

  5. Infrastructure as Code

  6. Infrastructure as Code?

  7. Infrastructure as Code
    Infrastructure as Software

  8. Ancient History…

  9. Terraform 0.1

  10. Terraform 0.1
    • No modules - everything lived in one namespace
    • No terraform_remote_state resource
    • No remote state at all!
    • Very “forgiving” HCL dialect (pre-HCL 1.0)

  11. Terraform 0.3

  12. Terraform 0.3

  13. Terraform 0.3.5

  14. Terraform 0.4.0

  15. Terraform 0.6.0

  16. Terraform 0.6.0

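  17. // Outputs could only be strings, so the list of subnet IDs is joined into one.
    output "private_subnets" {
      value = "${join(",", aws_subnet.private.*.id)}"
    }
    resource "aws_instance" "servers" {
      // other fields
      // The consumer splits the string back into a list and picks one ID per instance.
      subnet_id = "${element(split(",", var.private_subnets), count.index)}"
    }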

  18. By Terraform 0.6, we had the
    tools to avoid creating a mess…

  19. … but people already had large
    amounts of Terraform code, and the
    ‘best practices’ were already in the wild.

  20. Scalable Infrastructure Patterns

  21. Microservices

  22. Microservices
    • This is a tortuous analogy because I don’t actually like the name Microservices, or many of the concepts that it has come to embody.
    • Perhaps Service Oriented Architecture (SOA) is a better name
    • The gestalt is that we prefer to keep services (“applications”):
      • Small, so they can easily be understood (“micro”)
      • Independent, so that issues in one service are less likely to affect another
    • Similar to the UNIX philosophy of applications doing one thing (and doing it well).

  23. Terraform
    • We should also prefer to keep Terraform “applications” as small as possible:
      • An “application” is the module where we type terraform {plan,apply}
      • One “application” represents a single state file
    • A single state file is:
      • A Security boundary (RBAC in TFE, )
      • A Blast Radius boundary in terms of accidental destruction

  24. Terraform
    • One state file for an entire infrastructure is A Bad Time waiting to happen
    • Since Terraform 0.4, we don’t have to do this!
    • Prefer to keep configuration for each aspect of a system in a
    separate “application”
    • Aspects of the system live in a hierarchy
    • Refer to outputs of “applications” lower in the hierarchy by using the
    terraform_remote_state data source
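
    A minimal sketch of such a reference, in Terraform 0.12 syntax. The S3 backend settings, the output name public_subnet_ids, the variable bastion_ami_id, and the bastion resource itself are illustrative assumptions, not taken from the talk:

    ```hcl
    # Read the outputs of the (hypothetical) network "application".
    # Bucket, key and region are placeholders for wherever that state lives.
    data "terraform_remote_state" "network" {
      backend = "s3"

      config = {
        bucket = "example-terraform-state"
        key    = "network/terraform.tfstate"
        region = "us-east-1"
      }
    }

    # Hypothetical input: the AMI to launch.
    variable "bastion_ami_id" {
      type = string
    }

    resource "aws_instance" "bastion" {
      ami           = var.bastion_ami_id
      instance_type = "t3.micro"

      # Assumes the network "application" declares an output named
      # public_subnet_ids whose value is a list of subnet IDs.
      subnet_id = data.terraform_remote_state.network.outputs.public_subnet_ids[0]
    }
    ```

    The lookup only resolves if the network "application" declares a matching output, so its outputs act as the contract between the two state files.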

  25. [Diagram: the Bastion Host “Application” and the Web Site “Application” each use
    terraform_remote_state to read outputs of the Network “Application”
    (private subnet IDs and public subnet IDs).]
  26. Prefer to have more state files
    which are individually smaller

  27. “organizations which design systems...
    are constrained to produce designs
    which are copies of the communication
    structures of these organizations.”
    — Melvin Conway, 1967

  28. The “Composition Root”
    Pattern

  29. The “Composition Root” Pattern
    • “Where should we compose object module graphs?”
    • “As close as possible to the application’s entry point”
    • “A Composition Root is a (preferably) unique location in an application where
    modules are composed together.”
    • “Only at the entry point of the application is the entire module graph finally
    composed”
    • “Only applications should have composition roots. Libraries and frameworks
    shouldn’t have them”

  30. Applying this to Terraform…

  31. Modules should either manage
    resources or compose other modules

  32. There should be one composition
    root per application, at the root,
    parameterised for environments

  33. Composition Root Configuration
    • Composition Roots should consist of very little code and minimal logic:
      • Variables - only things that need to change per deployment or workspace at the current time!
      • State Storage Configuration - to configure the backend.
      • Data Sources - to query the state of the cloud based on variables passed in, and to link to other “applications” in the same system.
      • Module Declarations - to instantiate the modules which create resources.
      • Outputs - only the things that need to be provided to other stacks at the current time!
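
    The five parts listed above might fit in a single short file. This is a sketch in Terraform 0.12 syntax under stated assumptions: the bucket and key names, the ./modules/website path, its inputs and its url output are all hypothetical, and using workspaces to separate environments is one option rather than the talk's prescribed method:

    ```hcl
    # State Storage Configuration. Backend blocks cannot reference variables,
    # so these placeholder values are literals.
    terraform {
      backend "s3" {
        bucket = "example-terraform-state"
        key    = "website/terraform.tfstate"
        region = "us-east-1"
      }
    }

    # Variables: only what changes per deployment or workspace.
    variable "instance_count" {
      type    = number
      default = 2
    }

    # Data Sources: link to another "application" in the same system.
    data "terraform_remote_state" "network" {
      backend   = "s3"
      workspace = terraform.workspace

      config = {
        bucket = "example-terraform-state"
        key    = "network/terraform.tfstate"
        region = "us-east-1"
      }
    }

    # Module Declarations: the only place modules are instantiated.
    module "website" {
      source = "./modules/website"

      environment    = terraform.workspace
      instance_count = var.instance_count
      subnet_ids     = data.terraform_remote_state.network.outputs.private_subnet_ids
    }

    # Outputs: only what other "applications" need right now.
    output "website_url" {
      value = module.website.url
    }
    ```

    Nothing here creates a resource directly; the file only wires inputs to a module and republishes one of its outputs.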

  34. Resource Modules

  35. “SOLID” Principles

  36. Resource Modules
    • The modules instantiated by composition roots should be:
      • Simple - no “clever” hacks unless they absolutely cannot be avoided.
      • Flat - no nested module dependencies.
      • Pure - treat them like functions.
    • More importantly they should be:
      • Cohesive - resources related to the same aspect of a system should live in the same module.
      • Decoupled - no dependencies on other modules, only on the variables they are passed (and potentially limited data sources).
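
    As a rough illustration of "pure" and "decoupled", here is a sketch of a small resource module in Terraform 0.12 syntax. The module path, file names, variable names and the choice of a load balancer are hypothetical. Everything the module needs arrives through variables, and everything it offers leaves through outputs:

    ```hcl
    # modules/website/variables.tf (hypothetical path): the module's inputs.
    variable "environment" {
      type        = string
      description = "Name of the environment, used to name resources."
    }

    variable "subnet_ids" {
      type        = list(string)
      description = "Subnets in which to place the load balancer."
    }

    # modules/website/lb.tf: the resources themselves. No module blocks and
    # no remote state lookups appear inside a resource module.
    resource "aws_lb" "website" {
      name               = "website-${var.environment}"
      load_balancer_type = "application"
      subnets            = var.subnet_ids
    }

    # modules/website/outputs.tf: the module's return values.
    output "url" {
      value = "http://${aws_lb.website.dns_name}"
    }
    ```

    Because the subnet IDs are passed in rather than looked up, the same module can be instantiated against any network without modification.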

  37. Resource Module Structure

  38. Module Design - An Interface

  39. Module Design - Resources
    repo.tf dns.tf
    stage.tf

  40. Resource Module Anti-Patterns
    • Modules to create an individual resource or small group of resources which expose every option as a configuration point
      • Modules are an abstraction - treat them that way
    • Modules which instantiate other modules
      • State file gets complex, as do resource paths (especially pre-Terraform 0.12)
      • Refactoring is hard
      • Complexity spirals out of control
    • Golden rule: Only composition roots instantiate modules

  41. This approach scales well both with
    infrastructure size and number of
    infrastructure developers

  42. Enforcing this level of rigour in your
    organisation helps, because the language
    itself does not provide much.

  43. Testing
    • Testing has been a thing in application development for some time now. Why don’t we apply it to infrastructure?
    • The “testing pyramid” defines various stages of testing for software:
      • Unit Tests - Fast, do not communicate with external resources. Verify results “in the small”. Lots of them.
        • The unit is the test NOT the test subject.
      • Acceptance Tests - Slow(er), communicate with external resources and verify results throughout a system. Relatively fewer of them.

  44. Applying this to Terraform
    • HashiCorp Sentinel
    • Gruntwork Terratest
    • ServerSpec
    • Pulumi Testing Framework and runner
    • Detailed analysis is out of scope for this talk. Other talks at this
    conference cover it in more detail (as do talks at FOSDEM, with videos)

  45. For a step-change in the usability of
    Infrastructure as Code, we need to
    instead think of Infrastructure as Software

  46. Resources
    • Any software engineering book covering functional or object oriented design.
    • Terraform Design Guide on the HashiCorp website
    • HashiDays NYC 2017 Talk - Open source code illustrating this pattern in AWS for
    a wide variety of infrastructure pieces:
    • Code: https://github.com/jen20/hashidays-nyc
    • For Terraform 0.11
    • Pull Requests Accepted™
