Can You Test Your Infrastructure as Code?
Rosemary Wang | January 31, 2024
Infrastructure & Ops Superstream


(But I don’t want to write tests.)

What’s worth testing?

Rosemary Wang @joatmon08

👩💻 $ git commit -m "Deploy a virtual machine"
👩💻 $ git push (dry run)
👩💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
👩💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address; we should only deploy it in a private network.
👩💻 $ git commit -m "Refactor after code review"
👩💻 $ git push (dry run)
👩💻 ✅ approved changes: LGTM! (deploy)
👩💻 $ curl
503 Service Unavailable

Test for variables
Test this API call
Test for network requirements
Test VM status

When are tests worth writing?
• Turn unknown knowns (siloed knowledge) into known knowns
• Communicate expectations for infrastructure
• Standardize infrastructure as code
• Automate checks for system function after changes

[Diagram: test pyramid, top to bottom: Manual Tests, End-to-End Tests, Integration Tests, Contract Tests, Unit Tests. Cost (time and money) increases toward the top.]

Some notes…
• The pyramid is a guideline, not a rule
• Choose a programming or domain-specific language
• You will have multiple testing frameworks
• Infrastructure will have flaky tests

Unit Tests
Check attributes without configuring active resources

Example: Database engine should use PostgreSQL.

resource "aws_db_instance" "database" {
  engine = "postgres"
  ## omitted for clarity
}

run "database" {
  command = plan

  assert {
    condition     = aws_db_instance.database.engine == "postgres"
    error_message = "Database engine should use postgres"
  }
}

Example: This function should create subnets with a /24 subnet mask.

locals {
  subnets = cidrsubnets("", 8, 8, 8, 8, 8, 8, 8, 8, 8)
}

module "vpc" {
  ## omitted for clarity
  private_subnets = slice(local.subnets, 0, 3)
  public_subnets  = slice(local.subnets, 3, 6)
}

def test_vpc_subnets_have_correct_netmask(vpc_subnets):
    wrong_subnets = [
        subnet['values'].get('id')
        for subnet in vpc_subnets
        if not subnet['values'].get('cidr_block').endswith('/24')
    ]
    assert len(wrong_subnets) == 0, \
        "subnets {} should have /24 CIDR block".format(wrong_subnets)
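The netmask arithmetic behind this example can be checked by hand with Python's standard ipaddress module. The sketch below assumes a hypothetical base range of 10.0.0.0/16 (the slide does not show the real one) and a hypothetical helper named carve_subnets. It mirrors what cidrsubnets does when every newbits argument is 8: a /16 extended by 8 bits yields consecutive /24 networks.

```python
import ipaddress

# Hypothetical base range; the original slide does not show the real one.
BASE_CIDR = "10.0.0.0/16"


def carve_subnets(base_cidr, newbits, count):
    """Return the first `count` subnets of base_cidr, each extended by newbits."""
    base = ipaddress.ip_network(base_cidr)
    subnets = base.subnets(prefixlen_diff=newbits)  # lazy generator
    return [str(next(subnets)) for _ in range(count)]


subnets = carve_subnets(BASE_CIDR, newbits=8, count=9)
private_subnets = subnets[0:3]  # same split as slice(local.subnets, 0, 3)
public_subnets = subnets[3:6]   # same split as slice(local.subnets, 3, 6)

print(private_subnets)  # ['10.0.0.0/24', '10.0.1.0/24', '10.0.2.0/24']
print(public_subnets)   # ['10.0.3.0/24', '10.0.4.0/24', '10.0.5.0/24']
```

Every address ends in /24 because 16 + 8 = 24, which is exactly the property the unit test asserts.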

[Diagram: workflow stages Configuration → Dry Run, covered by Unit Test; annotation: Mocks Infrastructure API]

Unit Tests
• Check attributes
• Result of iterative logic / functions
• Dependencies between resources
• Any expected values, variables, or outputs
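As one way to check attributes without creating resources, a test can read the JSON form of a dry run. The sketch below is a minimal illustration: the embedded JSON is a trimmed, made-up sample shaped like the planned_values section of `terraform show -json` output, and the helper name planned_resources is hypothetical.

```python
import json

# Trimmed, hypothetical excerpt of `terraform show -json <planfile>` output.
PLAN_JSON = """
{
  "planned_values": {
    "root_module": {
      "resources": [
        {
          "address": "aws_db_instance.database",
          "type": "aws_db_instance",
          "name": "database",
          "values": {"engine": "postgres"}
        }
      ]
    }
  }
}
"""


def planned_resources(plan, resource_type):
    """Return planned root-module resources of one type (child modules ignored)."""
    resources = plan["planned_values"]["root_module"].get("resources", [])
    return [r for r in resources if r["type"] == resource_type]


def test_database_engine_is_postgres():
    plan = json.loads(PLAN_JSON)
    databases = planned_resources(plan, "aws_db_instance")
    assert databases, "plan should contain a database"
    for database in databases:
        assert database["values"].get("engine") == "postgres"


test_database_engine_is_postgres()
```

Because the test reads only the plan, it runs in seconds and costs nothing, which is what keeps unit tests at the base of the pyramid.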

Contract Tests
Check inputs match what is expected for a “module”

module: a collection of resources, defined via infrastructure as code, that is distributed for other people to use

Example: Check that two CIDR blocks do not overlap for network peering

variable "peering_requester_vpc_cidr_block" {
  type        = string
  description = "CIDR block for VPC requesting to peer"
}

variable "peering_accepter_vpc_cidr_block" {
  type        = string
  description = "CIDR block for VPC accepting peering connection"
}

def test_peered_networks_do_not_overlap(vpcs):
    # The peered VPCs must not use the same CIDR block.
    assert vpcs['prod_us_east_1'].get('cidr_block') != vpcs['prod_us_east_2'].get('cidr_block')
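Comparing CIDR strings only detects identical blocks: 10.0.0.0/16 and 10.0.1.0/24 differ as strings yet share addresses. A stricter contract test could use the standard ipaddress module, as in this sketch. The CIDR values and the helper name networks_overlap are hypothetical stand-ins, not taken from the talk.

```python
import ipaddress


def networks_overlap(cidr_a, cidr_b):
    """True if the two CIDR blocks share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def test_peered_networks_do_not_overlap():
    # Hypothetical CIDR blocks standing in for the two VPCs.
    vpcs = {
        "prod_us_east_1": {"cidr_block": "10.0.0.0/16"},
        "prod_us_east_2": {"cidr_block": "10.1.0.0/16"},
    }
    assert not networks_overlap(
        vpcs["prod_us_east_1"]["cidr_block"],
        vpcs["prod_us_east_2"]["cidr_block"],
    )


test_peered_networks_do_not_overlap()
```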

Example: Check that two CIDR blocks do not overlap for network peering

resource "hcp_hvn" "main" {
  hvn_id         =
  cloud_provider = "aws"
  region         = local.hcp_region
  cidr_block     = var.hcp_cidr_block

  lifecycle {
    precondition {
      condition     = var.hcp_cidr_block != var.vpc_cidr_block
      error_message = "Peered VPCs cannot have overlapping CIDR blocks"
    }
  }
}

How does someone know how to use the module?

[Diagram: workflow stages Configuration → Dry Run → Deploy, with Unit Test and Contract Test; annotation: Mocks Infrastructure API. Callout: Catch the mistake here, rather than deploying and waiting for an error.]

Contract Tests
• Validate input variables
• Check identifier, password, or naming standards
• Verify dependency metadata (e.g., passing in private networks)
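The checks in this list can be sketched as a single validation function that a module's tests call before anything is deployed. Everything specific here is an assumption made for illustration: the naming pattern, the input keys name and subnets, and the function name validate_module_inputs.

```python
import re

# Hypothetical naming standard: lowercase letters, digits, and hyphens,
# starting with a letter, 3 to 32 characters in total.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9-]{2,31}")


def validate_module_inputs(inputs):
    """Return a list of human-readable problems; empty means inputs pass."""
    problems = []
    name = inputs.get("name", "")
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"name {name!r} does not match the naming standard")
    # Dependency metadata: every subnet passed in must be private.
    for subnet in inputs.get("subnets", []):
        if subnet.get("public", False):
            problems.append(f"subnet {subnet.get('id')!r} must be private")
    return problems


good = {"name": "payments-db", "subnets": [{"id": "subnet-1", "public": False}]}
bad = {"name": "Payments_DB", "subnets": [{"id": "subnet-2", "public": True}]}

print(validate_module_inputs(good))       # []
print(len(validate_module_inputs(bad)))   # 2
```

Returning every problem at once, rather than failing on the first, gives the module's user a complete list to fix in one pass.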

Integration Tests
Check attributes on active resources

Example: Check that a database is available after creation

run "database" {
  command = apply

  assert {
    condition     = aws_db_instance.database.status == "available"
    error_message = "Database in module should be available"
  }
}

Example: Check that a Kubernetes cluster is not latest version

import requests
from packaging import version

def test_kubernetes_cluster_version_should_not_be_latest(kubernetes_cluster):
    # Fetch release details from the GitHub API.
    kubernetes_release_details = requests.get(
        '',
        headers={
            'Accept': 'application/vnd.github+json',
            'X-GitHub-Api-Version': '2022-11-28'
        })
    kubernetes_latest = version.parse(
        kubernetes_release_details.json().get('tag_name'))
    # Version that the planned change would give the cluster.
    cluster_version = version.parse(
        kubernetes_cluster['change']['after'].get('version'))
    assert cluster_version < kubernetes_latest

Example: Check that a Kubernetes cluster is not latest version

data "http" "kubernetes_version" {
  url = ""

  request_headers = {
    Accept               = "application/vnd.github+json"
    X-GitHub-Api-Version = "2022-11-28"
  }

  lifecycle {
    postcondition {
      condition     = tonumber(module.eks.cluster_version) < tonumber(regex("[0-9]+.[0-9]+", jsondecode(self.response_body).0.tag_name))
      error_message = "Kubernetes cluster version should not be latest"
    }
  }
}
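One detail worth checking in any version test is how versions are compared. Read as decimal numbers, 1.10 equals 1.1 and therefore sorts below 1.9. The sketch below compares major and minor components as integers using only the standard library. The tag format v<major>.<minor>.<patch> and both function names are assumptions for illustration.

```python
import re


def major_minor(tag):
    """Extract (major, minor) as integers from a tag such as 'v1.30.2'."""
    match = re.search(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"no version found in {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_older_than_latest(cluster_version, latest_tag):
    """True if the cluster's major.minor is strictly below the latest release."""
    return major_minor(cluster_version) < major_minor(latest_tag)


print(is_older_than_latest("1.9", "v1.10.0"))  # True: (1, 9) < (1, 10)
print(float("1.9") < float("1.10"))            # False: 1.10 parses as 1.1
```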

[Diagram: workflow stages Configuration → Dry Run → Deploy, with Unit Test, Contract Test, and Integration Test; annotation: Mocks Infrastructure API]

Integration Tests
• Requires active resources
• Check resource status
• Verify default attributes supplied by infrastructure API (e.g., versions, encryption key identifiers)

End-to-End Tests
Check the system works as expected

Example: Check that a sample application endpoint is still healthy

import pytest
import requests

@pytest.fixture(scope='session')
def url(servers):
    # Build the URL from the first server's NAT IP address.
    return f"http://{servers[0]['networkInterfaces'][0]['accessConfigs'][0]['natIP']}"

def test_url_for_service_returns_running_page(url):
    response = requests.get(url)
    assert "Welcome" in response.text

[Diagram: workflow stages Configuration → Dry Run → Deploy, with Unit Test, Contract Test, Integration Test, and End-to-End Test; annotation: Mocks Infrastructure API]

End-to-End Tests
• Requires a full set of active resources
• Run on live environment
• Verify end-to-end connectivity and functionality (e.g., network peering, application deployment workflow)
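End-to-end checks against a freshly deployed environment can fail on the first request simply because the service is still starting, which is one common source of flaky tests. A minimal sketch of a bounded retry follows. The function name wait_until_healthy, its parameters, and the choice of "Welcome" as the health marker (borrowed from the sample application example) are all assumptions.

```python
import time


def wait_until_healthy(fetch, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch() until it returns text containing 'Welcome'.

    fetch is any zero-argument callable that returns the page body, or raises
    an exception while the service is still starting. Returns the number of
    the attempt that succeeded.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            if "Welcome" in fetch():
                return attempt
        except Exception as error:  # e.g. connection refused during startup
            last_error = error
        if attempt < attempts:
            sleep(delay_seconds)
    raise AssertionError(
        f"service not healthy after {attempts} attempts: {last_error}")
```

In a real test, fetch could be something like `lambda: requests.get(url, timeout=5).text`. Keeping the number of attempts small and fixed means a genuinely broken deployment still fails the test quickly.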

[Diagram: test pyramid, top to bottom: Manual Tests, End-to-End Tests, Integration Tests, Contract Tests, Unit Tests, with Modules and Configuration labels alongside the layers]

In conclusion…
• Use tests as a form of education
• Evolve tests for additional functionality, remove flaky ones
• Shorten feedback loop from IaC to deployment
• Assess cost of testing environment versus testing in production
• Infrastructure mocks? Test coverage?

Thank you! @joatmon08