Slide 1

Slide 1 text

Test Infrastructure in Production with Terraform
WWCode Cloud | Dec. 18, 2019
Rosemary Wang | @joatmon08
Copyright © 2019 HashiCorp

Slide 2

Slide 2 text

Network engineers live dangerously.

Slide 3

Slide 3 text

My users = developers
▪ Push and deliver code at any time
▪ Availability of application depends on system

Slide 4

Slide 4 text

How do we change infrastructure without impacting applications?

Slide 5

Slide 5 text

Infrastructure-as-Code*
* Not really.

Slide 6

Slide 6 text

Let’s talk software development.

Slide 7

Slide 7 text

How do we work?
▪ Feature release vs. Continuous Delivery
▪ Feature branching vs. Trunk-based
▪ Mono- vs. Multi-Repository

Slide 8

Slide 8 text

Approaches
▪ Shift-left testing (e.g., staging)
▪ Feature Toggles
▪ Canary Testing
▪ A/B Testing

Slide 9

Slide 9 text

Shift-Left Testing
▪ Test before production
▪ Assess change impact
▪ Can apply Test-Driven Development

Slide 10

Slide 10 text

[Figure: “Ideal” Testing Pyramid. Layers, base to top: Unit Tests, Contract Tests, Integration Tests, End-to-End Tests, Manual Testing. Axis: Cost (Time, $$$).]

Slide 11

Slide 11 text

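Unit / Contract Testing

    # Assumed package name; conftest evaluates the `main` package by default.
    package main

    # True when the parsed variables carry the expected CIDR and region
    # values and an owner is set.
    contains_variables(variables) {
        variables[_].vpc_cidr[0].value = "10.128.0.0/25"
        variables[_].region[0].value = "eu-central-1"
        variables[_].owner[0]
    }

    # Fail the policy check when the expected variables are missing.
    deny[msg] {
        not contains_variables(input.variables)
        msg = "Variables are not populated with expected values"
    }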

Slide 12

Slide 12 text


Slide 13

Slide 13 text

Challenges
▪ Cost of lower environments
▪ Imperfect indicator
  – Net new each time?
  – Dependencies?

Slide 14

Slide 14 text

Feature Toggles
▪ Preserve state, if possible
▪ Inject with roll forward mindset
▪ Don’t write toggles at the start

Slide 15

Slide 15 text

Feature Toggles

    resource "aws_instance" "example_bionic" {
      # Toggle: create the instance only when enable_new_ami is true.
      count         = var.enable_new_ami ? 1 : 0
      instance_type = "t2.micro"
      ami           = data.aws_ami.ubuntu_bionic.id

      tags = {
        Terraform  = "true"
        Owner      = var.owner
        Has_Toggle = var.enable_new_ami
      }
    }
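The toggle above depends on input variables that the slide does not show. A minimal sketch of how they might be declared follows; the names `enable_new_ami` and `owner` come from the slide, while the types, descriptions, and the `false` default are assumptions chosen so the new resource stays off until someone opts in.

```hcl
# Hypothetical declarations for the variables referenced on the slide.
# Types, descriptions, and defaults are assumptions, not the speaker's code.
variable "enable_new_ami" {
  type        = bool
  description = "Feature toggle: create the instance from the new AMI when true."
  default     = false
}

variable "owner" {
  type        = string
  description = "Value for the Owner tag on created resources."
}
```

With a declaration like this, the toggle could be flipped per run, for example with `terraform apply -var="enable_new_ami=true"`, and rolled back by applying again with the default.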

Slide 16

Slide 16 text

Canary Testing
▪ Smoke test before release
▪ Easier with container architectures
  – e.g., VM images for Kubernetes worker nodes

Slide 17

Slide 17 text

Canary Testing

    resource "aws_instance" "canary" {
      # Create the canary only while the new (green) network is under test.
      count                  = var.enable_new_network ? 1 : 0
      instance_type          = "t2.micro"
      ami                    = data.aws_ami.ubuntu.id
      vpc_security_group_ids = [aws_security_group.instances_green.id]
      subnet_id              = aws_subnet.public_green.id

      tags = {
        Name  = "${var.prefix}-canary"
        Owner = var.owner
      }
    }

Slide 18

Slide 18 text

[Diagram: VPC (blue), 10.128.0.0/24, and VPC (green), 10.128.0.0/28, each hosting APP instances, with a KITCHEN INSTANCE and a CANARY instance. The question the canary answers: "Can I connect?"]
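The canary resource on the previous slide references `aws_subnet.public_green` and `aws_security_group.instances_green`, which the deck does not show. One way the green network could be sketched is below. The CIDR is taken from the diagram; everything else (the `green` VPC name, the SSH rule, the `test_source_cidr` variable) is a hypothetical choice for illustration, and the internet gateway and route table a public subnet would need are omitted.

```hcl
# Hypothetical inputs; `prefix` appears on the slide, `test_source_cidr` does not.
variable "prefix" {
  type = string
}

variable "test_source_cidr" {
  type        = string
  description = "CIDR the connectivity check originates from (assumed)."
}

# Green VPC, using the range shown in the diagram.
resource "aws_vpc" "green" {
  cidr_block = "10.128.0.0/28"

  tags = {
    Name = "${var.prefix}-green"
  }
}

# Single subnet spanning the whole green VPC range.
resource "aws_subnet" "public_green" {
  vpc_id                  = aws_vpc.green.id
  cidr_block              = "10.128.0.0/28"
  map_public_ip_on_launch = true
}

# Security group that lets the test source reach the canary over SSH,
# so a "can I connect?" check has something to connect to.
resource "aws_security_group" "instances_green" {
  name   = "${var.prefix}-instances-green"
  vpc_id = aws_vpc.green.id

  ingress {
    description = "Connectivity check from the test source"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.test_source_cidr]
  }
}
```

If the canary cannot be reached through this network, the failure is confined to the green VPC and the applications in the blue VPC are unaffected.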

Slide 19

Slide 19 text

[Diagram: a Kubernetes control plane with two node groups, one on an insecure OS and one on a secure OS, running workloads labeled INTERNAL and EXTERNAL.]

kubectl taint nodes external=true:NoExecute

Slide 20

Slide 20 text

A/B Testing
▪ Infrastructure that affects upstream Service Level Objectives
▪ Hypotheses:
  – Does X batch process more quickly than Y?
  – Does X cost more than Y?

Slide 21

Slide 21 text

[Diagram: APPLICATIONs feeding Kafka + FaaS + “Data Lake” versus APPLICATIONs feeding Kafka + Spark + “Data Lake”.]

Does the Spark + Kafka architecture process faster with lower cost?

Slide 22

Slide 22 text

[Diagram: APPLICATIONs on a Regular VM versus APPLICATIONs on a Network Optimized VM.]

Does a new instance reduce latency?
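The two-variant comparison above could be expressed in Terraform roughly as follows. This is a sketch under assumptions: the variable names, the `ab` resource name, and the instance types (`m5.large` as the regular VM, `m5n.large` as the network-optimized one) are illustrative and not from the deck. Terraform only stands up both variants side by side; measuring latency has to happen outside it.

```hcl
# Hypothetical inputs for the experiment.
variable "ami_id" {
  type        = string
  description = "AMI shared by both variants so only the instance type differs."
}

variable "ab_variants" {
  type        = map(string)
  description = "Variant label => instance type (illustrative values)."
  default = {
    a = "m5.large"  # regular VM
    b = "m5n.large" # network-optimized VM
  }
}

# One instance per variant; removing a key from the map removes that variant.
resource "aws_instance" "ab" {
  for_each      = var.ab_variants
  ami           = var.ami_id
  instance_type = each.value

  tags = {
    Name    = "ab-${each.key}"
    Variant = each.key
  }
}

# Addresses to point the latency measurement at, keyed by variant.
output "ab_private_ips" {
  value = { for variant, inst in aws_instance.ab : variant => inst.private_ip }
}
```

Keeping every attribute except the instance type identical is what makes the latency comparison meaningful; once the hypothesis is settled, the losing variant can be dropped from `ab_variants`.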

Slide 23

Slide 23 text

Conclusions
▪ Test in production organizes infrastructure blast radius
▪ Risk mitigation over risk aversion
▪ “Infrastructure-as-Code” is a heuristic

Slide 24

Slide 24 text

Resources
▪ learn.hashicorp.com/terraform
▪ hashicorp.com/blog/terraform-feature-toggles-blue-green-deployments-canary-test
▪ hashicorp.com/resources/test-driven-development-tdd-for-infrastructure
▪ discuss.hashicorp.com
▪ app.terraform.io

Slide 25

Slide 25 text

Rosemary Wang (she/her)
Developer Advocate at HashiCorp
joatmon08.github.io
@joatmon08
joatmon08
linkedin.com/in/rosemarywang/