Slide 1

Slide 1 text

Can You Test Your Infrastructure as Code?
Rosemary Wang | January 31, 2024
Infrastructure & Ops Superstream

Slide 2

Slide 2 text

Yes.

Slide 3

Slide 3 text

(But I don’t want to write tests.)

Slide 4

Slide 4 text

What’s worth testing?

Slide 5

Slide 5 text

Rosemary Wang @joatmon08 joatmon08.github.io

Slide 6

Slide 6 text

👩💻 $ git commit -m "Deploy a virtual machine"
$ git push
dry run

Slide 7

Slide 7 text

👩💻 $ git commit -m "Deploy a virtual machine"
$ git push
dry run
👩💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
👩💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address; we should only deploy it in a private network.

Slide 8

Slide 8 text

👩💻 $ git commit -m "Deploy a virtual machine"
$ git push
dry run
👩💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
👩💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address; we should only deploy it in a private network.
👩💻 $ git commit -m "Refactor after code review"
$ git push
dry run

Slide 9

Slide 9 text

👩💻 $ git commit -m "Deploy a virtual machine"
$ git push
dry run
👩💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
👩💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address; we should only deploy it in a private network.
👩💻 $ git commit -m "Refactor after code review"
$ git push
dry run
👩💻 ✅ approved changes: LGTM!
deploy

Slide 10

Slide 10 text

👩💻 $ git commit -m "Deploy a virtual machine"
$ git push
dry run
👩💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
👩💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address; we should only deploy it in a private network.
👩💻 $ git commit -m "Refactor after code review"
$ git push
dry run
👩💻 ✅ approved changes: LGTM!
deploy
👩💻 $ curl https://app.hashidemo.com/
503 Server Unavailable

Slide 11

Slide 11 text

👩💻 $ git commit -m "Deploy a virtual machine"
$ git push
dry run
👩💻 🔄 requested changes: Rather than hard-code the image version, let’s move it to a variable.
👩💻 🔄 requested changes: Do we deploy this virtual machine to a private or public subnet? The configuration deploys it with a public IP address; we should only deploy it in a private network.
👩💻 $ git commit -m "Refactor after code review"
$ git push
dry run
👩💻 ✅ approved changes: LGTM!
deploy
👩💻 $ curl https://app.hashidemo.com/
503 Server Unavailable
• Test for variables
• Test this API call
• Test for network requirements
• Test VM status

Slide 12

Slide 12 text

When are tests worth writing?
• Turn unknown knowns (siloed knowledge) into known knowns
• Communicate expectations for infrastructure
• Standardize infrastructure as code
• Automate checks for system function after changes

Slide 13

Slide 13 text

Manual Tests | End-to-End Tests | Integration Tests | Contract Tests | Unit Tests
Cost Increases (Time & Money)

Slide 14

Slide 14 text

Some notes…
• Pyramid is guideline, not rule
• Choose a programming / domain-specific language
• You will have multiple testing frameworks
• Infrastructure will have flaky tests

Slide 15

Slide 15 text

Unit Tests
Check attributes without configuring active resources

Slide 16

Slide 16 text

Example: Database engine should use PostgreSQL.

resource "aws_db_instance" "database" {
  engine = "postgres"
  ## omitted for clarity
}

run "database" {
  command = plan

  assert {
    condition     = aws_db_instance.database.engine == "postgres"
    error_message = "Database engine should use postgres"
  }
}

Slide 17

Slide 17 text

Example: This function should create subnets with a /24 subnet mask.

locals {
  subnets = cidrsubnets("10.0.0.0/16", 8, 8, 8, 8, 8, 8, 8, 8, 8)
}

module "vpc" {
  ## omitted for clarity
  private_subnets = slice(local.subnets, 0, 3)
  public_subnets  = slice(local.subnets, 3, 6)
}

def test_vpc_subnets_have_correct_netmask(vpc_subnets):
    wrong_subnets = [
        subnet['values'].get('id')
        for subnet in vpc_subnets
        if not subnet['values'].get('cidr_block').endswith('/24')
    ]
    assert len(wrong_subnets) == 0, \
        "subnets {} should have /24 CIDR block".format(wrong_subnets)
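To see why /24 is the expected mask, the allocation can be reproduced outside Terraform. This is a minimal sketch that assumes every subnet adds the same number of bits, as on the slide; `cidr_subnets` is a hypothetical helper, not Terraform's implementation.

```python
import ipaddress
from typing import List


def cidr_subnets(prefix: str, new_bits: int, count: int) -> List[str]:
    """Return the first `count` subnets of `prefix`, each extended by `new_bits` bits.

    Hypothetical helper: it only mirrors the equal-newbits case shown on the slide.
    """
    network = ipaddress.ip_network(prefix)
    subnets = network.subnets(prefixlen_diff=new_bits)
    return [str(next(subnets)) for _ in range(count)]


# A /16 prefix extended by 8 bits gives /24 subnets.
subnets = cidr_subnets("10.0.0.0/16", 8, 9)
private_subnets = subnets[0:3]
public_subnets = subnets[3:6]
```

The unit test on the slide then only needs to confirm that each planned subnet ends in /24.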

Slide 18

Slide 18 text

Configuration | Unit Test

Slide 19

Slide 19 text

Configuration | Dry Run | Unit Test

Slide 20

Slide 20 text

Configuration | Dry Run | Unit Test | Mocks Infrastructure API

Slide 21

Slide 21 text

Unit Tests
• Check attributes
• Result of iterative logic / functions
• Dependencies between resources
• Any expected values, variables, or outputs
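One way to check attributes without active resources is to read the JSON form of a plan (for example, the output of `terraform show -json`). This is a minimal sketch that assumes the plan's `planned_values` layout; `planned_resources` and the sample plan are hypothetical.

```python
def planned_resources(plan, resource_type):
    """Collect planned resources of one type from the root module and any child modules."""
    def walk(module):
        for resource in module.get("resources", []):
            if resource.get("type") == resource_type:
                yield resource
        for child in module.get("child_modules", []):
            yield from walk(child)

    root = plan.get("planned_values", {}).get("root_module", {})
    return list(walk(root))


# Hypothetical, heavily trimmed plan used only for illustration.
sample_plan = {
    "planned_values": {
        "root_module": {
            "child_modules": [{
                "address": "module.vpc",
                "resources": [
                    {"address": "module.vpc.aws_vpc.this[0]", "type": "aws_vpc",
                     "values": {"cidr_block": "10.0.0.0/16"}},
                    {"address": "module.vpc.aws_subnet.private[0]", "type": "aws_subnet",
                     "values": {"cidr_block": "10.0.0.0/24"}},
                    {"address": "module.vpc.aws_subnet.public[0]", "type": "aws_subnet",
                     "values": {"cidr_block": "10.0.3.0/25"}},
                ],
            }]
        }
    }
}

vpc_subnets = planned_resources(sample_plan, "aws_subnet")
wrong_subnets = [
    subnet["address"] for subnet in vpc_subnets
    if not subnet["values"]["cidr_block"].endswith("/24")
]
```

In a pytest suite, a fixture could load the real plan file and return the result of `planned_resources`, so each test receives only the resources it checks.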

Slide 22

Slide 22 text

Contract Tests
Check that inputs match what is expected for a “module”

Slide 23

Slide 23 text

module: a collection of resources, defined via infrastructure as code, that is distributed for other people to use

Slide 24

Slide 24 text

Example: Check that two CIDR blocks do not overlap for network peering

variable "peering_requester_vpc_cidr_block" {
  type        = string
  description = "CIDR block for VPC requesting to peer"
}

variable "peering_accepter_vpc_cidr_block" {
  type        = string
  description = "CIDR block for VPC accepting peering connection"
}

def test_peered_networks_do_not_overlap(vpcs):
    # Identical CIDR blocks always overlap, so the two blocks must differ
    assert vpcs['prod_us_east_1'].get('cidr_block') != vpcs['prod_us_east_2'].get('cidr_block')

Slide 25

Slide 25 text

Example: Check that two CIDR blocks do not overlap for network peering

resource "hcp_hvn" "main" {
  hvn_id         = var.name
  cloud_provider = "aws"
  region         = local.hcp_region
  cidr_block     = var.hcp_cidr_block

  lifecycle {
    precondition {
      condition     = var.hcp_cidr_block != var.vpc_cidr_block
      error_message = "Peered VPCs cannot have overlapping CIDR blocks"
    }
  }
}
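An inequality check catches identical CIDR blocks, but two different blocks can still overlap when one contains the other. This is a minimal sketch of a stricter check using Python's standard `ipaddress` module; `cidr_blocks_overlap` and the sample blocks are hypothetical.

```python
import ipaddress


def cidr_blocks_overlap(first: str, second: str) -> bool:
    """Return True when the two CIDR blocks share at least one address."""
    return ipaddress.ip_network(first).overlaps(ipaddress.ip_network(second))


# Different strings, but the /24 sits inside the /16.
nested = cidr_blocks_overlap("10.0.0.0/16", "10.0.1.0/24")
# Separate private ranges that can be peered safely.
disjoint = cidr_blocks_overlap("10.0.0.0/16", "172.16.0.0/16")
```

A contract test could assert `not cidr_blocks_overlap(...)` on the two peering inputs before anything is deployed.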

Slide 26

Slide 26 text

How does someone know how to use the module?

Slide 27

Slide 27 text

Configuration | Dry Run | Unit Test | Deploy | Contract Test | Mocks Infrastructure API

Slide 28

Slide 28 text

Configuration | Dry Run | Unit Test | Deploy | Contract Test | Mocks Infrastructure API
Catch the mistake here!

Slide 29

Slide 29 text

Configuration | Dry Run | Unit Test | Deploy | Contract Test | Mocks Infrastructure API
Catch the mistake here! Rather than deploying and waiting for an error.

Slide 30

Slide 30 text

Contract Tests
• Validate input variables
• Check identifier, password, or naming standards
• Verify dependency metadata (e.g., passing in private networks)
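The checks above can be sketched as a single validation function that runs before any deployment. The naming pattern, the input names (`name`, `cidr_block`), and `validate_module_inputs` are all hypothetical; a real module would encode its own standards.

```python
import ipaddress
import re

# Hypothetical naming standard: lowercase letter first, then 2-30 lowercase letters, digits, or hyphens.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9-]{2,30}")


def validate_module_inputs(inputs):
    """Return a list of contract violations for a module's inputs (empty when valid)."""
    errors = []

    name = inputs.get("name", "")
    if not NAME_PATTERN.fullmatch(name):
        errors.append(f"name {name!r} does not match the naming standard")

    try:
        network = ipaddress.ip_network(inputs.get("cidr_block", ""))
    except ValueError:
        errors.append("cidr_block is not a valid CIDR block")
    else:
        # Dependency metadata check: the module expects a private network.
        if not network.is_private:
            errors.append("cidr_block should be a private network")

    return errors


valid_errors = validate_module_inputs({"name": "prod-hvn", "cidr_block": "10.0.0.0/16"})
invalid_errors = validate_module_inputs({"name": "Prod HVN", "cidr_block": "8.8.8.0/24"})
```

Returning every violation at once, rather than failing on the first, gives module users the full list of expectations in one run.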

Slide 31

Slide 31 text

Integration Tests
Check attributes on active resources

Slide 32

Slide 32 text

Example: Check that a database is available after creation

run "database" {
  command = apply

  assert {
    condition     = aws_db_instance.database.status == "available"
    error_message = "Database in module should be available"
  }
}

Slide 33

Slide 33 text

Example: Check that a Kubernetes cluster is not latest version

import requests
# version.parse is assumed to come from the packaging library
from packaging import version

def test_kubernetes_cluster_version_should_not_be_latest(kubernetes_cluster):
    # Look up the latest Kubernetes release tag from the GitHub API
    kubernetes_release_details = requests.get(
        'https://api.github.com/repos/kubernetes/kubernetes/releases/latest',
        headers={
            'Accept': 'application/vnd.github+json',
            'X-GitHub-Api-Version': '2022-11-28'
        })
    kubernetes_latest = version.parse(
        kubernetes_release_details.json().get('tag_name'))
    # Read the cluster version from the planned change
    cluster_version = version.parse(
        kubernetes_cluster['change']['after'].get('version'))
    assert cluster_version < kubernetes_latest

Slide 34

Slide 34 text

Example: Check that a Kubernetes cluster is not latest version

data "http" "kubernetes_version" {
  url = "https://api.github.com/repos/kubernetes/kubernetes/releases?per_page=1"

  request_headers = {
    Accept               = "application/vnd.github+json"
    X-GitHub-Api-Version = "2022-11-28"
  }

  lifecycle {
    postcondition {
      condition     = tonumber(module.eks.cluster_version) < tonumber(regex("[0-9]+\\.[0-9]+", jsondecode(self.response_body)[0].tag_name))
      error_message = "Kubernetes cluster version should not be latest"
    }
  }
}
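Comparing versions as plain numbers reads 1.10 as 1.1, which sorts below 1.9. This is a minimal sketch of a component-wise comparison that avoids this; `major_minor` and `cluster_is_behind_latest` are hypothetical helpers, and the version strings are illustrative.

```python
import re


def major_minor(tag: str):
    """Extract (major, minor) as integers from a tag such as 'v1.29.1' or '1.28'."""
    match = re.search(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"no major.minor version in {tag!r}")
    return int(match.group(1)), int(match.group(2))


def cluster_is_behind_latest(cluster_version: str, latest_tag: str) -> bool:
    """Return True when the cluster's minor release is older than the latest release."""
    return major_minor(cluster_version) < major_minor(latest_tag)


# Tuples compare component by component, so (1, 9) < (1, 10).
behind = cluster_is_behind_latest("1.9", "v1.10.0")
# The same comparison on floats gives the opposite answer: 1.9 < 1.1 is False.
float_says_behind = float("1.9") < float("1.10")
```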

Slide 35

Slide 35 text

Configuration | Dry Run | Deploy | Unit Test | Contract Test | Integration Test | Mocks Infrastructure API

Slide 36

Slide 36 text

Integration Tests
• Requires active resources
• Check resource status
• Verify default attributes supplied by infrastructure API (e.g., versions, encryption key identifiers)
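Resource status can change asynchronously after creation, so a status check that runs once may fail intermittently. This is a minimal sketch of a polling helper under that assumption; `wait_for_status` and its parameters are hypothetical, and `get_status` stands in for whatever call reads the status from the infrastructure API.

```python
import time


def wait_for_status(get_status, expected, timeout=300.0, interval=10.0):
    """Poll get_status() until it returns `expected` or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated status sequence standing in for an infrastructure API.
statuses = iter(["creating", "backing-up", "available"])
became_available = wait_for_status(lambda: next(statuses), "available",
                                   timeout=5.0, interval=0.0)
```

An integration test could then assert on the return value, with the timeout chosen to match how long the resource normally takes to become ready.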

Slide 37

Slide 37 text

End-to-End Tests
Check the system works as expected

Slide 38

Slide 38 text

Example: Check that a sample application endpoint is still healthy

import pytest
import requests

@pytest.fixture(scope='session')
def url(servers):
    # Build the URL from the first server's external (NAT) IP address
    return f"http://{servers[0]['networkInterfaces'][0]['accessConfigs'][0]['natIP']}"

def test_url_for_service_returns_running_page(url):
    response = requests.get(url)
    assert "Welcome" in response.text

Slide 39

Slide 39 text

Configuration | Dry Run | Deploy | Unit Test | Contract Test | Integration Test | End-to-End Test | Mocks Infrastructure API

Slide 40

Slide 40 text

End-to-End Tests
• Requires a full set of active resources
• Run on live environment
• Verify end-to-end connectivity and functionality (e.g., network peering, application deployment workflow)

Slide 41

Slide 41 text

Manual Tests | End-to-End Tests | Integration Tests | Contract Tests | Unit Tests
Modules | Modules | Modules
Configuration | Configuration | Configuration

Slide 42

Slide 42 text

In conclusion…
• Use tests as a form of education
• Evolve tests for additional functionality, remove flaky ones
• Shorten feedback loop from IaC to deployment
• Assess cost of testing environment versus testing in production
• Infrastructure mocks? Test coverage?

Slide 43

Slide 43 text

Thank you! @joatmon08 joatmon08.github.io