Test in Production: Infrastructure Edition

Rosemary Wang
November 04, 2019

Originally presented at the Test in Production (Berlin) meetup on November 4, 2019.

How do we feature toggle, canary test, and A/B test on production infrastructure? In this talk, we'll give brief examples of each approach.

CORRECTIONS

- The initial example in the talk briefly showed a feature toggle for a network. The updated example frames more clearly what the feature toggle looks like.
- The updated example for the canary explicitly calls out a "green" VPC, as denoted in the diagram.

Transcript

  1. Copyright © 2019 HashiCorp. Test in Production: Infrastructure Edition. Test in Production Berlin | Nov. 4, 2019. Rosemary Wang | @joatmon08
  2. My users = developers
     ▪ Push and deliver code at any time
     ▪ Availability of application depends on system
  3. Feature Toggles
     ▪ Preserve state, if possible
     ▪ Inject with roll forward mindset
     ▪ Don’t write toggles at the start
  4. Feature Toggles (code editor)

         # The instance is created only when the toggle is on.
         resource "aws_instance" "example_bionic" {
           count                  = var.enable_new_ami ? 1 : 0
           instance_type          = "t2.micro"
           ami                    = data.aws_ami.ubuntu_bionic.id
           vpc_security_group_ids = [aws_security_group.instances.1.id]
           subnet_id              = aws_subnet.public.1.id

           tags = {
             Terraform  = "true"
             Owner      = var.owner
             Has_Toggle = var.enable_new_ami
           }
         }
  5. Canary Testing
     ▪ Smoke test before release
     ▪ Easier with container architectures
       – e.g., VM images for Kubernetes worker nodes
  6. Canary Testing (code editor)

         # The canary lands in the "green" VPC's subnet and security group.
         resource "aws_instance" "canary" {
           count                  = var.enable_new_network ? 1 : 0
           instance_type          = "t2.micro"
           ami                    = data.aws_ami.ubuntu.id
           vpc_security_group_ids = [aws_security_group.instances_green.id]
           subnet_id              = aws_subnet.public_green.id

           tags = {
             Name  = "${var.prefix}-canary"
             Owner = var.owner
           }
         }
  7. Diagram: VPC (blue), 10.128.0.0/24, and VPC (green), 10.128.0.0/28, with APP instances, a Kitchen instance, and a canary. Caption: Can I connect?
  8. Diagram: Kubernetes control plane with two node groups, one on an insecure OS and one on a secure OS, hosting workloads labeled internal and external. Command shown: kubectl taint nodes external=true:NoExecute
  9. A/B Testing
     ▪ Infrastructure that affects upstream Service Level Objectives
     ▪ Hypotheses:
       – Does X batch process more quickly than Y?
       – Does X cost more than Y?
  10. Diagram: applications sending data through Kafka, FaaS, and a “Data Lake” versus Kafka, Spark, and a “Data Lake”. Question: Does the Spark + Kafka architecture process faster with lower cost?
  11. Conclusions
     ▪ Test in production organizes infrastructure blast radius
     ▪ Risk mitigation over risk aversion
     ▪ “Infrastructure-as-Code” is heuristic
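
The resources on slides 4 and 6 reference `var.enable_new_ami` and `var.enable_new_network` without showing how those toggles are declared. Below is a minimal sketch of what the declarations might look like, assuming Terraform 0.12 or later. The descriptions, the `false` defaults, and the `canary_private_ips` output are illustrative assumptions and do not come from the talk. The fragment is meant to sit alongside the `aws_instance.canary` resource from slide 6.

```hcl
# Hypothetical declarations for the toggles used on slides 4 and 6.
# Defaulting to false keeps existing infrastructure unchanged until
# someone deliberately flips the toggle.
variable "enable_new_ami" {
  description = "Feature toggle: create the instance built from the new AMI."
  type        = bool
  default     = false
}

variable "enable_new_network" {
  description = "Canary toggle: create the canary instance in the green VPC."
  type        = bool
  default     = false
}

# A resource that uses count is addressed as a list. The splat expression
# yields an empty list when the toggle is off, so this output stays valid
# in both states.
output "canary_private_ips" {
  description = "Private IPs of the canary, for use by a connectivity test (assumed name)."
  value       = aws_instance.canary[*].private_ip
}
```

Under these assumptions, running `terraform apply -var="enable_new_network=true"` would create the canary, and applying again with the toggle set back to `false` would remove it.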