Can You Test Your Infrastructure as Code?

Originally presented at Infrastructure & Ops Superstream: Platform Automation and Infrastructure as Code.

When you update your infrastructure as code modules and configurations and quickly push production changes, how do you know that your changes didn't break the system? Rosemary Wang discusses how to think about your infrastructure as code testing strategy and shows how some types of tests can flag potential misconfigurations in infrastructure as code before you push to production. You'll learn which kinds of tests are most effective for infrastructure as code and when it's worth the time to write them.

Rosemary Wang

January 31, 2024
Transcript

  1. Can You Test Your Infrastructure as Code?
    Rosemary Wang | January 31, 2024
    Infrastructure & Ops Superstream
  2. 👩‍💻 $ git commit -m "Deploy a virtual machine"
    $ git push
    Pipeline: dry run
    👩‍💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
    👩‍💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address, we should only deploy it in a private network.
  3. 👩‍💻 $ git commit -m "Deploy a virtual machine"
    $ git push
    👩‍💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
    👩‍💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address, we should only deploy it in a private network.
    👩‍💻 $ git commit -m "Refactor after code review"
    $ git push
    Pipeline: dry run, dry run
  4. 👩‍💻 $ git commit -m "Deploy a virtual machine"
    $ git push
    👩‍💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
    👩‍💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address, we should only deploy it in a private network.
    👩‍💻 $ git commit -m "Refactor after code review"
    $ git push
    👩‍💻 ✅ approved changes: LGTM!
    Pipeline: dry run, dry run, deploy
  5. 👩‍💻 $ git commit -m "Deploy a virtual machine"
    $ git push
    👩‍💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
    👩‍💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address, we should only deploy it in a private network.
    👩‍💻 $ git commit -m "Refactor after code review"
    $ git push
    👩‍💻 ✅ approved changes: LGTM!
    Pipeline: dry run, dry run, deploy
    👩‍💻 $ curl https://app.hashidemo.com/
    503 Server Unavailable
  6. 👩‍💻 $ git commit -m "Deploy a virtual machine"
    $ git push
    👩‍💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
    👩‍💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address, we should only deploy it in a private network.
    👩‍💻 $ git commit -m "Refactor after code review"
    $ git push
    👩‍💻 ✅ approved changes: LGTM!
    Pipeline: dry run, dry run, deploy
    👩‍💻 $ curl https://app.hashidemo.com/
    503 Server Unavailable
    Annotations: Test for variables; Test this API call; Test for network requirements; Test VM status
  7. When are tests worth writing?
    • Turn unknown knowns (siloed knowledge) into known knowns
    • Communicate expectations for infrastructure
    • Standardize infrastructure as code
    • Automate checks for system function after changes
  8. Some notes…
    • Pyramid is guideline, not rule
    • Choose a programming / domain-specific language
    • You will have multiple testing frameworks
    • Infrastructure will have flaky tests
  9. Example: Database engine should use PostgreSQL.
    resource "aws_db_instance" "database" {
      engine = "postgres"
      ## omitted for clarity
    }
    run "database" {
      command = plan
      assert {
        condition     = aws_db_instance.database.engine == "postgres"
        error_message = "Database engine should use postgres"
      }
    }
  10. Example: This function should create subnets with a /24 subnet mask.
    locals {
      subnets = cidrsubnets("10.0.0.0/16", 8, 8, 8, 8, 8, 8, 8, 8, 8)
    }
    module "vpc" {
      ## omitted for clarity
      private_subnets = slice(local.subnets, 0, 3)
      public_subnets  = slice(local.subnets, 3, 6)
    }
    def test_vpc_subnets_have_correct_netmask(vpc_subnets):
        wrong_subnets = [
            subnet['values'].get('id')
            for subnet in vpc_subnets
            if not subnet['values'].get('cidr_block').endswith('/24')
        ]
        assert len(wrong_subnets) == 0, \
            "subnets {} should have /24 CIDR block".format(wrong_subnets)
  11. Unit Tests
    • Check attributes
    • Result of iterative logic / functions
    • Dependencies between resources
    • Any expected values, variables, or outputs
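The unit test examples in this deck receive fixtures such as `vpc_subnets`, but the slides do not show how those fixtures are built. One possible approach, sketched here under the assumption that the plan was exported with `terraform show -json` and parsed into a dictionary, is to walk the `planned_values` tree and filter by resource type. The helper names and the sample plan excerpt are hypothetical and exist only for illustration.

```python
"""Sketch: collect planned subnets from a Terraform plan rendered as JSON."""


def walk_modules(module):
    """Yield every resource in a module and, recursively, in its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from walk_modules(child)


def planned_resources(plan, resource_type):
    """Return the planned resources of one type from a parsed plan document."""
    root = plan.get("planned_values", {}).get("root_module", {})
    return [r for r in walk_modules(root) if r.get("type") == resource_type]


# Hypothetical, hand-written plan excerpt (not output from a real run).
SAMPLE_PLAN = {
    "planned_values": {
        "root_module": {
            "child_modules": [
                {
                    "address": "module.vpc",
                    "resources": [
                        {"type": "aws_subnet", "values": {"cidr_block": "10.0.0.0/24"}},
                        {"type": "aws_subnet", "values": {"cidr_block": "10.0.1.0/24"}},
                        {"type": "aws_vpc", "values": {"cidr_block": "10.0.0.0/16"}},
                    ],
                }
            ]
        }
    }
}

# The same check as the netmask unit test, run against the sample plan.
vpc_subnets = planned_resources(SAMPLE_PLAN, "aws_subnet")
wrong_subnets = [
    subnet["values"].get("cidr_block")
    for subnet in vpc_subnets
    if not subnet["values"].get("cidr_block", "").endswith("/24")
]
assert wrong_subnets == [], f"subnets {wrong_subnets} should have /24 CIDR block"
```

In a pytest suite, `planned_resources` could back a fixture that loads the real plan file once per session, so each unit test only states the expectation it cares about.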
  12. Example: Check that two CIDR blocks do not overlap for network peering
    variable "peering_requester_vpc_cidr_block" {
      type        = string
      description = "CIDR block for VPC requesting to peer"
    }
    variable "peering_accepter_vpc_cidr_block" {
      type        = string
      description = "CIDR block for VPC accepting peering connection"
    }
    def test_peered_networks_do_not_overlap(vpcs):
        # The peered VPCs must not share the same CIDR block.
        assert vpcs['prod_us_east_1'].get('cidr_block') != vpcs['prod_us_east_2'].get('cidr_block')
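Comparing CIDR strings directly only detects identical blocks; a /16 and a /24 carved out of it are different strings but still overlap. A stricter check can be sketched with Python's standard `ipaddress` module. The function name and the `vpcs` values below are hypothetical and only illustrate the idea.

```python
import ipaddress


def cidr_blocks_overlap(first, second):
    """Return True when two CIDR blocks share at least one address."""
    return ipaddress.ip_network(first).overlaps(ipaddress.ip_network(second))


# Hypothetical values standing in for the parsed VPC configuration.
vpcs = {
    "prod_us_east_1": {"cidr_block": "10.0.0.0/16"},
    "prod_us_east_2": {"cidr_block": "10.1.0.0/16"},
}


def test_peered_networks_do_not_overlap():
    requester = vpcs["prod_us_east_1"]["cidr_block"]
    accepter = vpcs["prod_us_east_2"]["cidr_block"]
    assert not cidr_blocks_overlap(requester, accepter), (
        f"peered VPCs {requester} and {accepter} overlap"
    )


test_peered_networks_do_not_overlap()
```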
  13. Example: Check that two CIDR blocks do not overlap for network peering
    resource "hcp_hvn" "main" {
      hvn_id         = var.name
      cloud_provider = "aws"
      region         = local.hcp_region
      cidr_block     = var.hcp_cidr_block
      lifecycle {
        precondition {
          condition     = var.hcp_cidr_block != var.vpc_cidr_block
          error_message = "Peered VPCs cannot have overlapping CIDR blocks"
        }
      }
    }
  14. [Diagram] Configuration, Dry Run, Unit Test, Deploy, Contract Test, Mocks, Infrastructure API. Callout: Catch the mistake here!
  15. [Diagram] Configuration, Dry Run, Unit Test, Deploy, Contract Test, Mocks, Infrastructure API. Callout: Catch the mistake here! Rather than deploying and waiting for an error.
  16. Contract Tests
    • Validate input variables
    • Check identifier, password, or naming standards
    • Verify dependency metadata (e.g., passing in private networks)
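As a rough illustration of the contract test bullets above, the following sketch validates a module's inputs before anything is deployed. The naming pattern, the input keys (`name`, `subnet_cidr_blocks`), and the function name are all assumptions made for this example; an actual standard would come from your own organization.

```python
import ipaddress
import re

# Hypothetical naming standard: lowercase letter first, then 2 to 30
# lowercase letters, digits, or hyphens.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]{2,30}$")


def validate_module_inputs(inputs):
    """Return a list of contract violations for a module's input variables."""
    errors = []

    name = inputs.get("name", "")
    if not NAME_PATTERN.fullmatch(name):
        errors.append(f"name {name!r} does not follow the naming standard")

    # Dependency metadata: every network passed in should be private.
    for cidr in inputs.get("subnet_cidr_blocks", []):
        if not ipaddress.ip_network(cidr).is_private:
            errors.append(f"subnet {cidr} is not in a private address range")

    return errors


good_inputs = {"name": "prod-db", "subnet_cidr_blocks": ["10.0.1.0/24"]}
bad_inputs = {"name": "Prod_DB", "subnet_cidr_blocks": ["8.8.8.0/24"]}

assert validate_module_inputs(good_inputs) == []
assert len(validate_module_inputs(bad_inputs)) == 2
```

Returning a list of violations rather than failing on the first one lets a single test run report every broken expectation at once.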
  17. Example: Check that a database is available after creation
    run "database" {
      command = apply
      assert {
        condition     = aws_db_instance.database.status == "available"
        error_message = "Database in module should be available"
      }
    }
  18. Example: Check that a Kubernetes cluster is not latest version
    import requests
    from packaging import version
    def test_kubernetes_cluster_version_should_not_be_latest(kubernetes_cluster):
        # Look up the latest Kubernetes release tag from the GitHub API.
        kubernetes_release_details = requests.get(
            'https://api.github.com/repos/kubernetes/kubernetes/releases/latest',
            headers={
                'Accept': 'application/vnd.github+json',
                'X-GitHub-Api-Version': '2022-11-28'
            })
        kubernetes_latest = version.parse(
            kubernetes_release_details.json().get('tag_name'))
        cluster_version = version.parse(
            kubernetes_cluster['change']['after'].get('version'))
        assert cluster_version < kubernetes_latest
  19. Example: Check that a Kubernetes cluster is not latest version
    data "http" "kubernetes_version" {
      url = "https://api.github.com/repos/kubernetes/kubernetes/releases?per_page=1"
      request_headers = {
        Accept               = "application/vnd.github+json"
        X-GitHub-Api-Version = "2022-11-28"
      }
      lifecycle {
        postcondition {
          condition     = tonumber(module.eks.cluster_version) < tonumber(regex("[0-9]+\\.[0-9]+", jsondecode(self.response_body)[0].tag_name))
          error_message = "Kubernetes cluster version should not be latest"
        }
      }
    }
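One caveat when comparing versions numerically: converting a "major.minor" string to a single number treats 1.10 as 1.1, so a cluster on 1.9 would look newer than a 1.10 release. A minimal sketch of a safer comparison, assuming only that both strings contain a "major.minor" pair, compares integer tuples instead. The function names and version strings below are illustrative, not taken from the talk.

```python
import re


def major_minor(tag):
    """Extract (major, minor) as integers from a string such as 'v1.10.2'."""
    match = re.search(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"no major.minor version found in {tag!r}")
    return int(match.group(1)), int(match.group(2))


def cluster_is_behind_latest(cluster_version, latest_tag):
    """Return True when the cluster's minor version is older than the latest."""
    return major_minor(cluster_version) < major_minor(latest_tag)


# Numeric comparison gets this case wrong: 1.9 is not less than 1.1.
assert not float("1.9") < float("1.10")

# Tuple comparison orders the same versions correctly.
assert cluster_is_behind_latest("1.9", "v1.10.0")
assert not cluster_is_behind_latest("1.10", "v1.10.3")
```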
  20. [Diagram] Configuration, Dry Run, Deploy, Unit Test, Contract Test, Integration Test, Mocks, Infrastructure API
  21. Integration Tests
    • Requires active resources
    • Check resource status
    • Verify default attributes supplied by infrastructure API (e.g., versions, encryption key identifiers)
  22. Example: Check that a sample application endpoint is still healthy
    import pytest
    import requests
    @pytest.fixture(scope='session')
    def url(servers):
        # Build the URL from the first server's external (NAT) IP address.
        return f"http://{servers[0]['networkInterfaces'][0]['accessConfigs'][0]['natIP']}"
    def test_url_for_service_returns_running_page(url):
        response = requests.get(url)
        assert "Welcome" in response.text
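Integration tests against live resources are a common source of the flaky tests mentioned earlier in the deck, because a newly created endpoint may need time before it responds. One way to reduce that flakiness, sketched here as an assumption rather than the speaker's method, is to poll the health check with exponential backoff. The `wait_until` helper and its parameters are hypothetical; the `sleep` argument is injectable so the demonstration does not pause in real time.

```python
import time


def wait_until(check, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call check() until it returns True or attempts run out.

    The wait doubles after each failed attempt (1s, 2s, 4s, ...).
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds * (2 ** attempt))
    return False


# Demonstration with a fake health check that succeeds on the third call.
responses = iter([False, False, True])
recorded_sleeps = []

healthy = wait_until(
    lambda: next(responses),
    attempts=5,
    delay_seconds=1.0,
    sleep=recorded_sleeps.append,  # record the waits instead of sleeping
)

assert healthy
assert recorded_sleeps == [1.0, 2.0]
```

In a real test, `check` could wrap the HTTP request and return whether the expected text is in the response, so a slow start does not fail the test but a persistent outage still does.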
  23. [Diagram] Configuration, Dry Run, Deploy, Unit Test, Contract Test, Integration Test, End-to-End Test, Mocks, Infrastructure API
  24. End-to-End Tests
    • Requires a full set of active resources
    • Run on live environment
    • Verify end-to-end connectivity and functionality (e.g., network peering, application deployment workflow)
  25. [Diagram] Test levels in slide order: Manual Tests, End-to-End Tests, Integration Tests, Contract Tests, Unit Tests. Labels: Modules, Configuration
  26. In conclusion…
    • Use tests as a form of education
    • Evolve tests for additional functionality, remove flaky ones
    • Shorten feedback loop from IaC to deployment
    • Assess cost of testing environment versus testing in production
    • Infrastructure mocks? Test coverage?