Testing Your Cloud Infrastructure (as Code)

Rosemary Wang (@joatmon08)
OdysseyConf | March 31, 2021
Copyright © 2020 HashiCorp

Migrate to public cloud infrastructure.

Infrastructure as Code
Declare the configuration you want and store it in source control.

Immutable
Create a new resource for every update, delete the old resource.

On-Demand
Create and delete resources when you need them (or don’t).

We wrote some tests for our cloud infrastructure.
Testing environments cost us 70% of our cloud provider bill.
We pushed a change that affected production.
We deactivated tests and deleted the testing environments.

Rosemary Wang (She/Her)
Developer Advocate at HashiCorp
@joatmon08
joatmon08.github.io

Infrastructure tests to catch problems before production.
Cost of environments for running tests.

Figure: test pyramid with the levels Unit Tests, Contract Tests, Integration Tests, End-to-End Tests. Axis: Cost* (Time, $$$).

Unit Tests

Unit Tests
Test configuration or state metadata
▪ Lint and check syntax
▪ Any testing framework or language that parses data formats
  – Programming languages
  – Behavior-driven development (BDD)
  – Other tools (e.g., CloudFormation linter, HashiCorp Sentinel, terraform-compliance, Helm unittest plugin)

INFRASTRUCTURE AS CODE CONFIGURATION
You write some configuration.

LIST OF CHANGES TO INFRASTRUCTURE STATE
Tool uses configuration to generate changes to infrastructure state.

UNIT TESTS
Parse configuration or list of changes.
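As a rough illustration of a unit test that parses the list of changes, the sketch below assumes the plan has been exported as JSON (for Terraform, via `terraform show -json`) and checks one planned attribute without creating any infrastructure. The sample document, the resource address, and the `planned_resources` helper are hypothetical and only mimic the shape of such an export.

```python
import json

# Hypothetical excerpt of a plan exported as JSON; the values are made up
# for illustration and do not come from the talk.
PLAN_JSON = """
{
  "resource_changes": [
    {
      "address": "aws_lb_listener_rule.example",
      "type": "aws_lb_listener_rule",
      "change": {
        "actions": ["create"],
        "after": {"priority": 1}
      }
    }
  ]
}
"""


def planned_resources(plan, resource_type):
    """Return the planned values of every resource of the given type."""
    return [
        change["change"]["after"]
        for change in plan.get("resource_changes", [])
        if change["type"] == resource_type
    ]


def test_listener_rule_priority_is_in_range():
    plan = json.loads(PLAN_JSON)
    rules = planned_resources(plan, "aws_lb_listener_rule")
    assert len(rules) == 1
    assert 1 <= rules[0]["priority"] <= 50000
```

Because the test only reads a file, it runs in seconds and costs nothing, which is why it sits at the cheap end of the cost scale.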

Unit Tests
Cost assessment
▪ Worth writing a few…
  – Communicate expectations, including security
  – No infrastructure required
▪ Do not…
  – Work offline (i.e., retrieve remote state)
  – Prevent system failure

Contract Tests

Contract Tests
Test configuration inputs and outputs
▪ Validate inputs or outputs
  – Password requirements
  – Special values (e.g., port ranges, constants)
▪ Public cloud APIs constantly change
▪ Check infrastructure dependencies
▪ Any testing framework or language for validation

    variable "listener_rule_priority" {
      type        = number
      default     = 1
      description = "Priority of listener rule between 1 to 50000"

      validation {
        # Both bounds are inclusive: 1 and 50000 are valid priorities.
        condition     = var.listener_rule_priority >= 1 && var.listener_rule_priority <= 50000
        error_message = "The priority of listener rule must be between 1 and 50000."
      }
    }
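The same kind of contract can also be checked from a general-purpose test framework. The sketch below is one possible Python version, assuming the valid range is 1 to 50000 inclusive; the function name and error wording are illustrative and not taken from the talk.

```python
def validate_listener_rule_priority(priority):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(priority, bool) or not isinstance(priority, int):
        errors.append("The priority of listener rule must be a whole number.")
    elif not 1 <= priority <= 50000:
        errors.append("The priority of listener rule must be between 1 and 50000.")
    return errors


def test_accepts_boundary_priorities():
    for good_value in (1, 50000):
        assert validate_listener_rule_priority(good_value) == []


def test_rejects_invalid_priorities():
    for bad_value in (0, -1, 50001, "10", None, True):
        assert validate_listener_rule_priority(bad_value) != []
```

Testing the boundary values explicitly is what makes this a contract: a consumer of the module knows exactly which inputs fail fast.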

Contract Tests
Cost assessment
▪ Worth writing a few…
  – (Usually) no infrastructure required
  – Fail fast for invalid input values
▪ Do not…
  – Work offline (i.e., retrieve remote state)
  – Prevent system failure

Integration Tests

Integration Tests
Test if the configuration creates infrastructure
▪ Can you even apply the changes to infrastructure?
▪ Verify dependencies with real infrastructure
▪ Any testing framework or language with setup, test, teardown
  – Integration testing frameworks (e.g., TaskCat, kitchen, terratest, goss, InSpec)
  – Programming language
  – Add to delivery pipeline

INFRASTRUCTURE AS CODE CONFIGURATION
You push some configuration to source control.

INTEGRATION TESTS
Pipeline sets up resources in testing environment.
Pipeline tears down resources in testing environment.

    import os

    import pytest

    # test_utils, generate_json, NodeState, TEST_SERVER_NAME, and
    # SERVER_CONFIGURATION_FILE are not shown on the slide.


    @pytest.fixture(scope='session')
    def apply_changes():
        # SETUP: generate the configuration and apply it once per session.
        generate_json(TEST_SERVER_NAME)
        assert os.path.exists(SERVER_CONFIGURATION_FILE)
        assert test_utils.initialize() == 0
        yield test_utils.apply()
        # TEARDOWN: destroy the resources and remove the generated file.
        assert test_utils.destroy() == 0
        os.remove(SERVER_CONFIGURATION_FILE)


    # TEST: check the apply output, then the state of the real server.
    def test_changes_should_add_1_resource(apply_changes):
        output = apply_changes[1].decode(encoding='utf-8').split('\n')
        assert 'Apply complete! Resources: 1 added, 0 changed, 0 destroyed' \
            in output[-2]


    def test_server_is_in_running_state(apply_changes):
        gcp_server = test_utils.get_server(TEST_SERVER_NAME)
        assert gcp_server.state == NodeState.RUNNING

Integration Tests
Cost assessment
▪ Worth writing…
  – Test dependencies
  – Build confidence in “successful” changes
▪ Do not…
  – Eliminate cost of resources
  – Speed up feedback loop

End-to-End Tests

End-to-End Tests
Test if infrastructure supports the workflows
▪ Can you create or access resources?
▪ Verify end-to-end functionality
▪ Run after you apply changes in testing / production
▪ Includes smoke tests
▪ Any testing framework or language with enough access to create or read resources

    $ kitchen test
    -----> Starting Kitchen (v2.3.3)
    …
    Profile: End-to-End Tests for Application (default)

      ✔  db: Database: check routing from public to private subnet
         ✔  Host 10.128.0.43 port 27017 proto tcp should be reachable
         ✔  Host 10.128.0.43 port 27017 proto tcp should be resolvable
         ✔  Host 10.128.0.43 port 80 proto tcp should not be reachable
      ✔  outbound: Public Subnet: check routing out to public internet
         ✔  HTTP GET on https://hashicorp.com status should cmp == 301

    Profile Summary: 2 successful controls, 0 control failures, 0 controls skipped
    Test Summary: 3 successful, 0 failures, 0 skipped

End-to-End Tests
Cost assessment
▪ Worth writing…
  – Test functionality
  – Check system works before “release”
▪ Do not…
  – Eliminate cost of resources
  – Speed up feedback loop

Look at our cloud bill!

Figure: test pyramid with the levels Unit Tests, Contract Tests, Integration Tests, End-to-End Tests. Axis: Cost* (Time, $$$). Annotations: REDUCE ERRORS IN CONFIGURATION; RUN ON EXISTING INFRASTRUCTURE.

Use Cheaper Resources
✅ Smaller size
✅ Shorter lifecycle
❌ Not always accurate
❌ Drift between testing & production

Use Infrastructure API Mocks
✅ No infrastructure
✅ Can run offline
❌ Dependencies?
❌ Drift between mock & actual APIs

Delete Long-Lived Environments
✅ Elasticity
✅ Shorter lifecycle
❌ Confidence?
❌ Time spent creating environments

How much is really from testing?
Use resource tagging to answer this question.

Resource Tagging
Identify why you have them in the first place
▪ Environment (testing/production)
▪ Test Type (integration/end-to-end)
▪ Repository (joatmon08/terraform-aws-listenerrule-nia)
▪ Teardown (true/false)
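A tagging scheme like this is easy to enforce with a small check. The sketch below assumes resources are available as plain dictionaries; the tag keys follow the list above (with "TestType" as an assumed spelling of "Test Type"), while the resource names and helper functions are hypothetical.

```python
# Tag keys taken from the list above; "TestType" is an assumed spelling.
REQUIRED_TAGS = {"Environment", "TestType", "Repository", "Teardown"}

# Hypothetical inventory, for illustration only.
RESOURCES = [
    {
        "name": "test-load-balancer",
        "tags": {
            "Environment": "testing",
            "TestType": "integration",
            "Repository": "joatmon08/terraform-aws-listenerrule-nia",
            "Teardown": "true",
        },
    },
    {"name": "shared-network", "tags": {"Environment": "testing", "Teardown": "false"}},
]


def missing_tags(tags):
    """Return the required tag keys that are absent, in sorted order."""
    return sorted(REQUIRED_TAGS - set(tags))


def teardown_candidates(resources):
    """Return the names of resources explicitly marked for teardown."""
    return [r["name"] for r in resources if r["tags"].get("Teardown") == "true"]
```

Here `missing_tags(RESOURCES[1]["tags"])` reports the two absent keys, and `teardown_candidates(RESOURCES)` selects only the load balancer.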

AWS Listener Rule Integration Tests (02/2021)

Cost      Resource                       Details
$15.12    1 Application Load Balancer    us-east-1, 672 hours
$0.00     1 Elastic IP                   us-east-1
$0.82     1 EC2 Instance                 us-east-1, t2.micro, 72 hours
$15.94    Total                          us-east-1
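Once resources are tagged, a bill like this one can be grouped by tag to see how much comes from testing. The sketch below reuses the three cost figures from the slide; the tag values attached to each line item and the `cost_by_tag` helper are assumptions for illustration, not the output of a real billing API.

```python
from collections import defaultdict
from decimal import Decimal

# Costs are from the slide; the tags on each item are assumed.
TEST_TAGS = {"Environment": "testing", "TestType": "integration"}
LINE_ITEMS = [
    {"resource": "Application Load Balancer", "cost": Decimal("15.12"), "tags": TEST_TAGS},
    {"resource": "Elastic IP", "cost": Decimal("0.00"), "tags": TEST_TAGS},
    {"resource": "EC2 Instance", "cost": Decimal("0.82"), "tags": TEST_TAGS},
]


def cost_by_tag(line_items, tag_key):
    """Sum costs per value of tag_key; items without the tag count as 'untagged'."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for item in line_items:
        totals[item["tags"].get(tag_key, "untagged")] += item["cost"]
    return dict(totals)
```

Calling `cost_by_tag(LINE_ITEMS, "TestType")` attributes the full $15.94 to integration tests. `Decimal` is used instead of `float` so the cents add up exactly.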

Solution: Use elasticity!
Create and delete resources before and after testing.

Problem: Assumes immutability.
You might make some changes in-place.

Reality: Some long-lived resources.
e.g., networking, databases, Kubernetes control planes

Time to Change            Shorter = Setup & Teardown
Number of Dependencies    Fewer = Setup & Teardown
Frequency of Change       Less = Setup & Teardown
Statefulness              Less = Setup & Teardown

We write some tests for our cloud infrastructure.
We assess the cost based on tagging.
We push the change to production.
We can replace the long-lived resources for this set of tests.

References
Infrastructure testing is a heuristic.
▪ github.com/joatmon08/tdd-infrastructure
▪ puppet.com/blog/hitchhikers-guide-to-testing-infrastructure-as-and-code
▪ github.com/joatmon08/terraform-aws-listenerrule-nia
▪ hashicorp.com/resources/testing-your-hcl-modules-in-terraform

Thank you!
Rosemary Wang
@joatmon08
joatmon08.github.io
