Slide 1

Slide 1 text

Testing your Cloud Infrastructure (as Code)
Rosemary Wang (@joatmon08)
OdysseyConf | March 31, 2021
Copyright © 2020 HashiCorp

Slide 2

Slide 2 text

Migrate to public cloud infrastructure. @JOATMON08 2

Slide 3

Slide 3 text

Infrastructure as Code: Declare the configuration you want and store it in source control.
Immutable: Create a new resource for every update, delete the old resource.
On-Demand: Create and delete resources when you need them (or don’t).

Slide 4

Slide 4 text

We wrote some tests for our cloud infrastructure.
Testing environments cost us 70% of our cloud provider bill.
We pushed a change that affected production.
We deactivated tests and deleted the testing environments.

Slide 5

Slide 5 text

Rosemary Wang (She/Her)
Developer Advocate at HashiCorp
@joatmon08

Slide 6

Slide 6 text

Infrastructure tests to catch problems before production.
Cost of environments for running tests.

Slide 7

Slide 7 text

[Figure: test pyramid. Tiers, in the order the talk covers them: Unit Tests, Contract Tests, Integration Tests, End-to-End Tests. Axis: Cost* (Time, $$$).]

Slide 8

Slide 8 text

Unit Tests

Slide 9

Slide 9 text

Unit Tests
Test configuration or state metadata
▪ Lint and check syntax
▪ Any testing framework or language that parses data formats
  – Programming languages
  – Behavior-driven development (BDD)
  – Other tools (e.g., CloudFormation linter, HashiCorp Sentinel, terraform-compliance, Helm unittest plugin)

Slide 10

Slide 10 text

You write some configuration.
INFRASTRUCTURE AS CODE CONFIGURATION
Tool uses configuration to generate changes to infrastructure state.
LIST OF CHANGES TO INFRASTRUCTURE STATE
UNIT TESTS PARSE CONFIGURATION OR LIST OF CHANGES.
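A unit test of this kind could be sketched in Python as follows. The sketch assumes the list of changes has been exported as JSON in the shape that `terraform show -json` produces (a `resource_changes` list whose entries carry `change.actions`). The sample plan, the resource address, and the helper name `actions_by_address` are made up for illustration and do not come from the talk.

```python
import json

# Hypothetical, trimmed-down plan in the shape of `terraform show -json` output.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {
      "address": "aws_lb_listener_rule.example",
      "type": "aws_lb_listener_rule",
      "change": {"actions": ["create"], "after": {"priority": 1}}
    }
  ]
}
""")


def actions_by_address(plan):
    """Map each resource address to the list of actions planned for it."""
    return {
        change["address"]: change["change"]["actions"]
        for change in plan.get("resource_changes", [])
    }


def test_plan_only_creates_the_listener_rule():
    # No infrastructure is needed: the test only reads the list of changes.
    actions = actions_by_address(SAMPLE_PLAN)
    assert actions == {"aws_lb_listener_rule.example": ["create"]}


def test_listener_rule_priority_is_set():
    rule = SAMPLE_PLAN["resource_changes"][0]
    assert rule["change"]["after"]["priority"] == 1
```

Because the tests read only a data structure, they run in seconds and cost nothing, which is why they sit at the cheapest end of the test types in this talk.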

Slide 11

Slide 11 text

Unit Tests
Cost assessment
▪ Worth writing a few…
  – Communicate expectations, including security
  – No infrastructure required
▪ Do not…
  – Work offline (i.e., they still retrieve remote state)
  – Prevent system failure

Slide 12

Slide 12 text

Contract Tests

Slide 13

Slide 13 text

Contract Tests
Test configuration inputs and outputs
▪ Validate inputs or outputs
  – Password requirements
  – Special values (e.g., port ranges, constants)
▪ Public cloud APIs constantly change
▪ Check infrastructure dependencies
▪ Any testing framework or language for validation
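As a rough illustration of validating inputs, the Python sketch below checks a port range and a password before any infrastructure is touched. The function names, the 16-character minimum, and the digit rule are assumptions chosen for the example, not requirements stated in the talk.

```python
def validate_port_range(from_port, to_port):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    for name, port in (("from_port", from_port), ("to_port", to_port)):
        if not 1 <= port <= 65535:
            errors.append(f"{name} must be between 1 and 65535, got {port}")
    if from_port > to_port:
        errors.append("from_port must not be greater than to_port")
    return errors


def validate_password(password, minimum_length=16):
    """Check a hypothetical password policy: minimum length plus one digit."""
    errors = []
    if len(password) < minimum_length:
        errors.append(f"password must have at least {minimum_length} characters")
    if not any(character.isdigit() for character in password):
        errors.append("password must contain at least one digit")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a test report every invalid input in one run.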

Slide 14

Slide 14 text

CODE EDITOR

variable "listener_rule_priority" {
  type        = number
  default     = 1
  description = "Priority of listener rule between 1 and 50000"

  validation {
    # Both bounds are inclusive, matching the description.
    condition     = var.listener_rule_priority >= 1 && var.listener_rule_priority <= 50000
    error_message = "The priority of listener rule must be between 1 and 50000."
  }
}

Slide 15

Slide 15 text

Contract Tests
Cost assessment
▪ Worth writing a few…
  – (Usually) no infrastructure required
  – Fail fast for invalid input values
▪ Do not…
  – Work offline (i.e., they still retrieve remote state)
  – Prevent system failure

Slide 16

Slide 16 text

Integration Tests

Slide 17

Slide 17 text

Integration Tests
Test if the configuration creates infrastructure
▪ Can you even apply the changes to infrastructure?
▪ Verify dependencies with real infrastructure
▪ Any testing framework or language with setup, test, teardown
  – Integration testing frameworks (e.g., TaskCat, kitchen, terratest, goss, InSpec)
  – Programming language
  – Add to delivery pipeline

Slide 18

Slide 18 text

You push some configuration to source control.
INFRASTRUCTURE AS CODE CONFIGURATION
Pipeline sets up resources in testing environment.
INTEGRATION TESTS
Pipeline tears down resources in testing environment.

Slide 19

Slide 19 text

CODE EDITOR

import os

import pytest

# generate_json, test_utils, NodeState, TEST_SERVER_NAME, and
# SERVER_CONFIGURATION_FILE come from the project's own modules (not shown).


@pytest.fixture(scope='session')
def apply_changes():
    # Setup: generate the configuration and apply it once per test session.
    generate_json(TEST_SERVER_NAME)
    assert os.path.exists(SERVER_CONFIGURATION_FILE)
    assert test_utils.initialize() == 0
    yield test_utils.apply()
    # Teardown: destroy the resources and remove the generated file.
    assert test_utils.destroy() == 0
    os.remove(SERVER_CONFIGURATION_FILE)


def test_changes_should_add_1_resource(apply_changes):
    output = apply_changes[1].decode(encoding='utf-8').split('\n')
    assert 'Apply complete! Resources: 1 added, 0 changed, 0 destroyed' \
        in output[-2]


def test_server_is_in_running_state(apply_changes):
    gcp_server = test_utils.get_server(TEST_SERVER_NAME)
    assert gcp_server.state == NodeState.RUNNING
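The slide does not show `test_utils`. One way such a helper could look, assuming it wraps the Terraform CLI with `subprocess`, is sketched below. The module layout, the `binary` and `cwd` parameters, and the choice to merge stderr into stdout are assumptions for illustration; only the return shapes (an exit code for `initialize` and `destroy`, a tuple whose second element is bytes for `apply`) are inferred from how the fixture uses them.

```python
import subprocess


def run(command, cwd=None):
    """Run a command and return (exit code, combined stdout/stderr as bytes)."""
    completed = subprocess.run(
        command, cwd=cwd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    return completed.returncode, completed.stdout


def initialize(cwd=None, binary="terraform"):
    """Download providers and modules; returns the exit code."""
    return run([binary, "init", "-input=false"], cwd)[0]


def apply(cwd=None, binary="terraform"):
    """Create the resources; returns (exit code, output) so tests can read both."""
    return run([binary, "apply", "-auto-approve", "-input=false"], cwd)


def destroy(cwd=None, binary="terraform"):
    """Delete the resources created by apply; returns the exit code."""
    return run([binary, "destroy", "-auto-approve", "-input=false"], cwd)[0]
```

Keeping `apply` and `destroy` in one session-scoped fixture means the resources are created once, shared by every test, and torn down even when an assertion fails.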

Slide 20

Slide 20 text

Integration Tests
Cost assessment
▪ Worth writing…
  – Test dependencies
  – Build confidence in “successful” changes
▪ Do not…
  – Eliminate cost of resources
  – Speed up feedback loop

Slide 21

Slide 21 text

End-to-End Tests

Slide 22

Slide 22 text

End-to-End Tests
Test if infrastructure supports the workflows
▪ Can you create or access resources?
▪ Verify end-to-end functionality
▪ Run after you apply changes in testing / production
▪ Includes smoke tests
▪ Any testing framework or language with enough access to create or read resources

Slide 23

Slide 23 text

CODE EDITOR

$ kitchen test
-----> Starting Kitchen (v2.3.3)
…
Profile: End-to-End Tests for Application (default)

  ✔  db: Database: check routing from public to private subnet
     ✔  Host port 27017 proto tcp should be reachable
     ✔  Host port 27017 proto tcp should be resolvable
     ✔  Host port 80 proto tcp should not be reachable
  ✔  outbound: Public Subnet: check routing out to public internet
     ✔  HTTP GET on status should cmp == 301

Profile Summary: 2 successful controls, 0 control failures, 0 controls skipped
Test Summary: 3 successful, 0 failures, 0 skipped

Slide 24

Slide 24 text

End-to-End Tests
Cost assessment
▪ Worth writing…
  – Test functionality
  – Check system works before “release”
▪ Do not…
  – Eliminate cost of resources
  – Speed up feedback loop

Slide 25

Slide 25 text

Look at our cloud bill!

Slide 26

Slide 26 text

[Figure: test pyramid. Tiers, in the order the talk covers them: Unit Tests, Contract Tests, Integration Tests, End-to-End Tests. Axis: Cost* (Time, $$$). Annotations: REDUCE ERRORS IN CONFIGURATION; RUN ON EXISTING INFRASTRUCTURE.]

Slide 27

Slide 27 text

Use Cheaper Resources
✅ Smaller size
✅ Shorter lifecycle
❌ Not always accurate
❌ Drift between testing & production

Use Infrastructure API Mocks
✅ No infrastructure
✅ Can run offline
❌ Dependencies?
❌ Drift between mock & actual APIs

Delete Long-Lived Environments
✅ Elasticity
✅ Shorter lifecycle
❌ Confidence?
❌ Time spent creating environments
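To make the API mock option concrete, here is a minimal sketch using Python's standard `unittest.mock`. The `client.get_server` call and the `"RUNNING"` state string are hypothetical stand-ins for whatever provider SDK a project uses.

```python
from unittest import mock


def server_is_running(client, name):
    """Ask a (hypothetical) cloud client for a server and check its state."""
    server = client.get_server(name)
    return server is not None and server.state == "RUNNING"


def test_server_is_running_with_mocked_api():
    # The mock stands in for the provider API, so nothing is created or billed.
    client = mock.Mock()
    client.get_server.return_value = mock.Mock(state="RUNNING")
    assert server_is_running(client, "test-server")
    client.get_server.assert_called_once_with("test-server")


def test_missing_server_is_not_running():
    client = mock.Mock()
    client.get_server.return_value = None
    assert not server_is_running(client, "test-server")
```

The trade-off listed on the slide shows up directly: the test passes as long as the mock returns what the code expects, even if the real API has since changed its response.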

Slide 28

Slide 28 text

How much is really from testing?
Use resource tagging to answer this question.

Slide 29

Slide 29 text

Resource Tagging
Identify why you have the resources in the first place
▪ Environment (testing/production)
▪ Test Type (integration/end-to-end)
▪ Repository (joatmon08/terraform-aws-listenerrule-nia)
▪ Teardown (true/false)
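Once resources carry these tags, a short script can flag untagged resources and list the ones that are safe to delete. The sketch below works on a hypothetical in-memory inventory; the tag keys `TestType` and the resource names are assumptions, and a real inventory would come from the provider's API or a billing export.

```python
REQUIRED_TAGS = ("Environment", "TestType", "Repository", "Teardown")

# Hypothetical inventory for illustration only.
RESOURCES = [
    {
        "name": "listener-rule-alb",
        "tags": {
            "Environment": "testing",
            "TestType": "integration",
            "Repository": "joatmon08/terraform-aws-listenerrule-nia",
            "Teardown": "true",
        },
    },
    {
        "name": "shared-network",
        "tags": {"Environment": "testing", "Teardown": "false"},
    },
]


def missing_tags(resource):
    """Return the required tag keys that a resource does not carry."""
    return [tag for tag in REQUIRED_TAGS if tag not in resource["tags"]]


def teardown_candidates(resources):
    """Return names of testing resources that are marked for teardown."""
    return [
        resource["name"]
        for resource in resources
        if resource["tags"].get("Environment") == "testing"
        and resource["tags"].get("Teardown") == "true"
    ]
```

For the sample inventory, `teardown_candidates(RESOURCES)` returns only the load balancer, while `missing_tags` reports that the shared network lacks its test type and repository.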

Slide 30

Slide 30 text

AWS Listener Rule Integration Tests (02/2021)

Cost     Resource                       Details
$15.12   1 Application Load Balancer    us-east-1, 672 hours
$0.00    1 Elastic IP                   us-east-1
$0.82    1 EC2 Instance                 us-east-1, t2.micro, 72 hours
$15.94   Total                          us-east-1
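The arithmetic behind this bill can be checked with a few lines of Python. The figures are the ones on the slide; the variable names are illustrative.

```python
from decimal import Decimal

# Line items from the February 2021 bill shown above.
LINE_ITEMS = {
    "Application Load Balancer": Decimal("15.12"),
    "Elastic IP": Decimal("0.00"),
    "EC2 Instance": Decimal("0.82"),
}

total = sum(LINE_ITEMS.values(), Decimal("0"))
shares = {name: cost / total for name, cost in LINE_ITEMS.items()}

for name, cost in LINE_ITEMS.items():
    # Convert to float only for percentage formatting.
    print(f"{name}: ${cost} ({float(shares[name]):.0%} of ${total})")
```

The load balancer ran for all 672 hours of the month and accounts for roughly 95% of the total, while the instance ran for only 72 hours. The resource that stayed up between test runs dominates the cost.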

Slide 31

Slide 31 text

Solution: Use elasticity!
Create resources before testing and delete them afterward.

Slide 32

Slide 32 text

Problem: Assumes immutability.
You might make some changes in-place.

Slide 33

Slide 33 text

Reality: Some long-lived resources.
e.g., networking, databases, Kubernetes control planes

Slide 34

Slide 34 text

Time to Change: Shorter = Setup & Teardown
Number of Dependencies: Fewer = Setup & Teardown
Frequency of Change: Less = Setup & Teardown
Statefulness: Less = Setup & Teardown

Slide 35

Slide 35 text

We write some tests for our cloud infrastructure.
We assess the cost based on tagging.
We push the change to production.
We can replace the long-lived resources for this set of tests.

Slide 36

Slide 36 text

Infrastructure testing is a heuristic.

Slide 37

Slide 37 text

Thank you!
Rosemary Wang
@joatmon08