Building the World's Largest Websites

Seth Vargo
September 11, 2015

Today we are plagued by hundreds of choices when architecting a modern data center. Should our machines be virtual or physical? Should we use containers or Docker? Should we use a public cloud provider or a private cloud provider? Which configuration management tool is best to use? What about IaaS, PaaS, and SaaS? It would be manageable if these were binary choices; however, we often find ourselves in a hybrid environment. As more operations choices are added to your data center, whether through company acquisitions, a growing development team, or general technical debt, managing complexity between legacy and new systems becomes a nightmare. Yet the end goal is still the same: safely deploy your application to your infrastructure. We need to tame our data centers by managing change across systems, enforcing policies, and establishing a workflow in which developers and operations engineers can build collaboratively. This talk will discuss the problems faced in the modern data center, and how a set of innovative open source tools can be used to tame the rising complexity curve. We will discuss the tools and tactics employed by some of the largest web-based companies using open source tools. Join me on an adventure with Vagrant, Consul, Terraform, and more as we take your data center from chaos to control.

Transcript

  1. RISING DATACENTER COMPLEXITY
    [Diagram: one datacenter (DC) holding 16 VM boxes and a large number of C boxes]
  2. RISING DATACENTER COMPLEXITY
    [Diagram: two datacenters (DC-01, DC-02) holding several VM boxes and many C boxes]
  3. CONSUL-TEMPLATE: Template Example
    global
        daemon
        maxconn {{key "haproxy/maxconn"}}

    defaults
        mode {{key "haproxy/mode"}}{{range ls "haproxy/timeouts"}}
        timeout {{.Key}} {{.Value}}{{end}}

    listen http-in
        bind *:8000{{range service "release.web"}}
        server {{.Node}} {{.Address}}:{{.Port}}{{end}}
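As a rough illustration of what the template on this slide produces, the following Python sketch renders the same HAProxy layout from plain dictionaries. It does not use consul-template or talk to Consul: `render_haproxy` and the sample `kv`, `timeouts`, and `services` values are hypothetical stand-ins for what the `key`, `ls`, and `service` template functions would return.

```python
def render_haproxy(kv, timeouts, services):
    """Render an HAProxy config shaped like the slide's template.

    kv, timeouts, and services are hypothetical stand-ins for the data
    that the key, ls, and service template functions would supply.
    """
    lines = [
        "global",
        "    daemon",
        f"    maxconn {kv['haproxy/maxconn']}",
        "",
        "defaults",
        f"    mode {kv['haproxy/mode']}",
    ]
    # One "timeout" line per entry, like the range over haproxy/timeouts.
    for name, value in sorted(timeouts.items()):
        lines.append(f"    timeout {name} {value}")
    lines += ["", "listen http-in", "    bind *:8000"]
    # One "server" line per service instance, like the range over the service.
    for svc in services:
        lines.append(f"    server {svc['node']} {svc['address']}:{svc['port']}")
    return "\n".join(lines) + "\n"


# Made-up sample data for illustration only.
kv = {"haproxy/maxconn": "256", "haproxy/mode": "http"}
timeouts = {"connect": "5000ms", "client": "50000ms"}
services = [
    {"node": "web1", "address": "10.0.1.4", "port": 8000},
    {"node": "web2", "address": "10.0.1.5", "port": 8000},
]
config = render_haproxy(kv, timeouts, services)
print(config)
```

When a service instance is added or removed, re-rendering with the new `services` list changes only the `server` lines, which is the property the template relies on.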
  4. CONSUL-TEMPLATE: Execute (as a service)
    $ consul-template \
        -consul demo.consul.io \
        -template "haproxy.ctmpl:/etc/haproxy/haproxy.conf:restart haproxy" -dry
  5. STEP BY STEP
    1. Config management tooling lays down configuration template
    2. consul-template runs as a service
    3. Edge triggers config changes, restarts service
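The "edge trigger" in the last step can be sketched as follows: the config is rewritten and the service restarted only when the rendered output actually changes, not on every poll. This is a minimal Python analogy, not how consul-template is implemented; `EdgeTriggeredWriter` and its `write` and `restart` callbacks are hypothetical names.

```python
import hashlib


class EdgeTriggeredWriter:
    """Write a rendered config and run a restart hook only on change."""

    def __init__(self, write, restart):
        self._write = write          # hypothetical callback: persist the config
        self._restart = restart      # hypothetical callback: restart the service
        self._last_digest = None

    def apply(self, rendered):
        """Return True if the config changed and the service was restarted."""
        digest = hashlib.sha256(rendered.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            return False  # same content as last time: no write, no restart
        self._write(rendered)
        self._restart()
        self._last_digest = digest
        return True


# In-memory stand-ins for the file write and the restart command.
written, restarts = [], []
writer = EdgeTriggeredWriter(written.append, lambda: restarts.append("restart haproxy"))
results = [
    writer.apply("maxconn 256\n"),  # first render: written
    writer.apply("maxconn 256\n"),  # unchanged: skipped
    writer.apply("maxconn 512\n"),  # changed: written again
]
print(results, len(restarts))
```

Comparing digests rather than restarting on a timer keeps an unchanged cluster from causing needless service restarts.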
  6. RESILIENCY
    Low-TTL DNS records
    Ensures availability even if Consul is unavailable
    Required for short-held connections since DNS lookup overhead is too high with zero TTL
  7. OPTION #1: CONSUL SETTINGS
    Per-service, stale reads on non-leaders
    [Diagram labels: WEB PROCESS, DNS query, CONSUL AGENT, CONSUL LEADER, CONSUL STANDBY]
  8. OPTION #2: DNSMASQ + CONSUL
    Global, works if Consul is down
    [Diagram labels: WEB PROCESS, DNS query, DNSMASQ, CONSUL AGENT, CONSUL LEADER, CONSUL STANDBY]
  10. OPTION #3: APPLICATION-LEVEL CACHE
    Works if almost everything is down, strict control over cache times
    [Diagram labels: WEB PROCESS, IN-MEM CACHE, DNS query, CONSUL AGENT, CONSUL LEADER, CONSUL STANDBY]
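An application-level cache of the kind named on this slide might look like the Python sketch below: answers are cached for a fixed TTL, and if a refresh fails the last known answer is served instead. `CachedResolver`, `fake_lookup`, and the injected clock are hypothetical; a real process would pass in an actual DNS or Consul lookup function.

```python
import time


class CachedResolver:
    """In-memory lookup cache with a TTL and stale fallback on failure."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # hypothetical callable: name -> list of addresses
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}           # name -> (expires_at, addresses)

    def resolve(self, name):
        now = self._clock()
        entry = self._cache.get(name)
        if entry and entry[0] > now:
            return entry[1]        # fresh cache hit, no lookup performed
        try:
            addresses = self._lookup(name)
        except OSError:
            if entry:
                return entry[1]    # backend is down: serve the stale answer
            raise
        self._cache[name] = (now + self._ttl, addresses)
        return addresses


# Fake clock and fake backend so the behaviour is deterministic.
now = [0.0]
backend = {"up": True}
calls = []


def fake_lookup(name):
    calls.append(name)
    if not backend["up"]:
        raise OSError("resolver unavailable")
    return ["10.0.1.4", "10.0.1.5"]


resolver = CachedResolver(fake_lookup, ttl_seconds=5, clock=lambda: now[0])
first = resolver.resolve("web.service.consul")   # cache miss, real lookup
second = resolver.resolve("web.service.consul")  # cache hit
now[0] = 10.0                                    # entry has expired
backend["up"] = False                            # and the backend is down
stale = resolver.resolve("web.service.consul")   # falls back to the stale entry
print(first, second, stale, len(calls))
```

The trade-off is the one the slide hints at: the process keeps working through an outage, at the cost of possibly routing to addresses that are no longer current.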
  11. CONSUL MONITORING
    Removes unhealthy nodes from service discovery layer
    [Diagram labels: CONSUL, WEB 1, WEB 2, WEB N]
    dig web.service.consul
    10.0.1.4
    10.0.1.5
    10.0.1.6
  13. CONSUL MONITORING
    Removes unhealthy nodes from service discovery layer
    [Diagram labels: CONSUL, WEB 1, WEB 2, WEB N]
    dig web.service.consul
    10.0.1.5
    10.0.1.6
  14. CONSUL MONITORING
    Removes unhealthy nodes from service discovery layer
    [Diagram labels: CONSUL, WEB 1, WEB 2, WEB N]
    dig web.service.consul
    10.0.1.5
    10.0.1.6
    host: web.service.consul
  18. CONSUL MONITORING
    Removes unhealthy nodes from service discovery layer
    [Diagram labels: CONSUL, WEB 1, WEB 2, WEB N]
    dig web.service.consul
    10.0.1.4
    10.0.1.5
    10.0.1.6
    host: web.service.consul
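The behaviour shown across these monitoring slides, where 10.0.1.4 drops out of the `dig` answer and later returns, can be modelled with a small Python sketch. It is a simplified model that assumes a node is hidden while any of its checks is `critical`; the node names and the `discoverable_addresses` helper are hypothetical, and no Consul API is called.

```python
def discoverable_addresses(nodes):
    """Return addresses of nodes that have no failing (critical) check.

    Simplified, assumed model of health-aware service discovery.
    """
    return [node["address"] for node in nodes if "critical" not in node["checks"]]


# Addresses are taken from the slides; names and check states are made up.
nodes = [
    {"name": "web1", "address": "10.0.1.4", "checks": ["passing"]},
    {"name": "web2", "address": "10.0.1.5", "checks": ["passing"]},
    {"name": "webN", "address": "10.0.1.6", "checks": ["passing"]},
]

before = discoverable_addresses(nodes)   # all three nodes are answered
nodes[0]["checks"] = ["critical"]        # web1 starts failing its check
during = discoverable_addresses(nodes)   # web1 is filtered out
nodes[0]["checks"] = ["passing"]         # web1 recovers
after = discoverable_addresses(nodes)    # web1 is answered again
print(before, during, after)
```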
  19. CONSUL LOCK
    Allows for a new kind of "HA"
    $ consul lock [options] prefix child...
  20. ROLLING RESTARTS/UPGRADES
    [Diagram: five VM boxes, each with four C boxes]
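One way to picture a rolling restart is the Python sketch below: nodes are restarted a batch at a time, and the rollout stops as soon as a restarted node fails its health check. This is a centrally coordinated simplification under stated assumptions, not the `consul lock` mechanism itself; `rolling_restart`, `restart`, and `is_healthy` are hypothetical names.

```python
def rolling_restart(nodes, restart, is_healthy, batch_size=1):
    """Restart nodes batch by batch, aborting if a restarted node is unhealthy.

    restart and is_healthy are hypothetical callbacks supplied by the caller.
    """
    done = []
    for start in range(0, len(nodes), batch_size):
        batch = nodes[start:start + batch_size]
        for node in batch:
            restart(node)
        failed = [node for node in batch if not is_healthy(node)]
        if failed:
            # Stop here so the remaining nodes keep serving traffic.
            raise RuntimeError(f"aborting rollout, unhealthy after restart: {failed}")
        done.extend(batch)
    return done


# Happy path: five made-up VMs, two at a time.
restarted = []
completed = rolling_restart(
    ["vm1", "vm2", "vm3", "vm4", "vm5"],
    restarted.append,
    lambda node: True,
    batch_size=2,
)

# Failure path: vm1 never becomes healthy, so vm2 is never touched.
touched = []
try:
    rolling_restart(["vm1", "vm2"], touched.append, lambda node: node != "vm1")
    aborted = False
except RuntimeError:
    aborted = True
print(completed, aborted, touched)
```

Limiting how many nodes restart at once is what keeps capacity available during the upgrade; a distributed lock or semaphore can serve the same purpose without a central coordinator.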
  21. CONSUL EXEC
    Run arbitrary commands on nodes
    [Diagram labels: CONSUL, WEB 1, WEB 2, DATABASE]
    $ consul exec -service=web ./script.sh
  22. CONSUL WATCH
    Wait for event, then do something
    $ consul watch -type=event -name=deploy ./deploy.sh
  25. WHAT IF I ASKED YOU TO...
    CREATE AN EPHEMERAL ENVIRONMENT (STAGING, ETC)?
    UPDATE AN EXISTING COMPLEX APPLICATION?
    DOCUMENT YOUR INFRASTRUCTURE ARCHITECTURE?
    DELEGATE SOME OPS TO SMALLER TEAMS (CORE VS. APP IT)?
  26. DIGITAL OCEAN DROPLET WITH DNS USING DNS SIMPLE
    resource "digitalocean_droplet" "web" {
      name   = "tf-web"
      size   = "512mb"
      image  = "centos-5-8-x32"
      region = "sfo1"
    }

    resource "dnsimple_record" "hello" {
      domain = "example.com"
      name   = "test"
      value  = "${digitalocean_droplet.web.ipv4_address}"
      type   = "A"
    }
  30. TERRAFORM PLAN
    What are you going to do?
    $ terraform plan
    + digitalocean_droplet.web
        backups:              "" => "<computed>"
        image:                "" => "centos-5-8-x32"
        ipv4_address:         "" => "<computed>"
        ipv4_address_private: "" => "<computed>"
        name:                 "" => "tf-web"
        private_networking:   "" => "<computed>"
        region:               "" => "sfo1"
        size:                 "" => "512mb"
        status:               "" => "<computed>"
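The `"" => "value"` lines in the plan output on this slide are a diff between current and desired attributes. A minimal Python sketch of that idea follows; it is an illustration only, not Terraform's implementation, and the `plan` helper is a hypothetical name. The attribute values are the ones from the droplet on the slide.

```python
def plan(current, desired):
    """Return {attribute: (old, new)} for every attribute that would change."""
    changes = {}
    for key in sorted(set(current) | set(desired)):
        old, new = current.get(key, ""), desired.get(key, "")
        if old != new:
            changes[key] = (old, new)
    return changes


# Desired attributes of the droplet shown on the slide.
desired = {
    "name": "tf-web",
    "size": "512mb",
    "image": "centos-5-8-x32",
    "region": "sfo1",
}

changes = plan({}, desired)        # nothing exists yet, so everything is new
for key, (old, new) in changes.items():
    print(f'{key}: "{old}" => "{new}"')

no_changes = plan(desired, desired)  # already converged: empty plan
```

Seeing the diff before anything is applied is what lets a change be reviewed rather than discovered afterwards.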
  31. TERRAFORM GRAPH
    What order are you going to do things?
    $ terraform graph
    digraph {
      compound = "true"
      newrank = "true"
      subgraph "root" {
        "[root] aws_instance.haproxy" [label = "aws_instance.haproxy", shape = "box"]
        "[root] aws_instance.web" [label = "aws_instance.web", shape = "box"]
        "[root] aws_internet_gateway.terraform-tutorial" [label = "aws_internet_gateway.terraform-tutorial", shape = "box"]
        "[root] aws_route_table.terraform-tutorial" [label =
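The question "what order are you going to do things?" comes down to a topological sort of the dependency graph. The Python sketch below uses the standard library's `graphlib` (Python 3.9+) on the four resource names visible in the slide's output. The edges are hypothetical, since the transcript cuts off before any are listed, and this is not Terraform's own graph walker.

```python
from graphlib import TopologicalSorter

# Maps each resource to the resources it depends on.
# Node names come from the slide; the dependencies are assumed for illustration.
dependencies = {
    "aws_internet_gateway.terraform-tutorial": set(),
    "aws_route_table.terraform-tutorial": {"aws_internet_gateway.terraform-tutorial"},
    "aws_instance.web": {"aws_internet_gateway.terraform-tutorial"},
    "aws_instance.haproxy": {"aws_instance.web"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
for step, resource in enumerate(order, start=1):
    print(step, resource)
```

Resources with no path between them, such as the route table and the web instance here, have no required relative order and could be created in parallel.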
  32. TERRAFORM MODULES
    module "consul" {
      source  = "github.com/hashicorp/consul/terraform/aws"
      servers = 5
      version = "0.4.0"
    }

    resource "dnsimple_record" "consul" {
      domain = "example.com"
      name   = "consul"
      value  = "${module.consul.ip_address}"
      type   = "A"
    }
  33. TERRAFORM REMOTE STATE
    resource "terraform_remote_state" "consul" {
      backend = "atlas"
      config {
        path = "hashicorp/consul-prod"
      }
    }

    output "consul-address" {
      value = "${terraform_remote_state.consul.addr}"
    }
  34. SERVICE COMPOSITION
    Modern infrastructures are almost always "multi-provider": DNS in CloudFlare, compute in AWS, etc.
    Infrastructure change requires composing data from multiple services, executing change in multiple services
  35. SERVICE COMPOSITION
    // Terraform allows you to combine multiple external providers and
    // their outputs into a single pipeline
    resource "aws_instance" "web" {
      // Existing resource attributes
    }

    resource "cloudflare_record" "www" {
      domain = "foo.com"
      name   = "www"
      value  = "${aws_instance.web.private_ip}"
      type   = "A"
    }
  36. LOGICAL RESOURCES
    // In addition to physical resources, Terraform also has logical
    // resources such as templates
    resource "template_file" "data" {
      filename = "data.tpl"
      vars {
        address = "${var.addr}"
      }
    }

    resource "aws_instance" "web" {
      user_data = "${template_file.data.rendered}"
    }
  37. INFRASTRUCTURE COLLABORATION
    Approve plans - similar to pull requests, but for infrastructure
    Infrastructure change review