The Journey to #GIFEE

Rob
March 22, 2016

Walking through the 3 changes required to run a Google-like infrastructure stack on CoreOS + Kubernetes.

Transcript

  1. Rob Szumski @robszumski | rob.szumski@coreos.com The Journey to #GIFEE

  2. Mission: Secure the Internet

  3. None
  4. 3 changes: Application packaging, Linux at scale, Clustering

  5. #GIFEE: Borg/Omega, ChromeOS, Chubby

  6. #GIFEE: Borg/Omega, ChromeOS, Chubby

  7. #GIFEE: Borg/Omega, ChromeOS, Chubby

  8. Part 1: Linux at Scale

  9. Patches to the OS and kernel are hard. Large scale: rolling update tools, diverse hardware. Small scale: safer to leave it alone, no one owns security.

  10. Auto-updating browsers fixed security. We got HTML5 at the same time.
  11. Atomic operating system updates

  12. Atomic operating system updates

  13. PXE/diskless; quick reboots; easy to boot, install, and manage; secure by default; cross-cloud
  14. None
  15. Part 2: Application Packaging

  16. Abstract away app from the OS (diagram labels: OS, App)

  17. None
  18. None
  19. Protect apps from each other: isolated network namespace; isolated file system namespace; mixed versions of dependencies, e.g. Python 3.4 & Python 2.7

  20. Base software managed by CoreOS: systemd, kernel, OpenSSH

  21. Easily move apps between machines: easy scale out; recover from failure; painless OS software update

  22. Perfect touch-point for security: sign artifact from CI; scan containers at rest; audit trail
  23. None
  24. A security-minded, standards-based container engine

  25. Specification for “application containers”

  26. Universal Container Format: packaged, downloaded, executed

  27. apt-get for containers: local mirrors; distributed namespace (DNS); serve over HTTPS, no complex software

  28. Jetpack (FreeBSD/Go), Kurma (Linux/Go), rkt (Linux/Go). Independent GitHub organization. Contributions from Cloud Foundry, Mesosphere, Google, Red Hat and many others.
  29. None
  30. Composable: designed for init systems; standard Unix process; separate build tool

  31. Composable: no central daemon; not a “platform”

  32. Diagram comparing process models. Labels: systemd, app; systemd, app, docker run redis, docker engine daemon

  33. rkt run
    $ sudo rkt run coreos.com/etcd:v2.0.0
    $ sudo rkt run coreos.com/etcd:v2.0.0 \
        --cpu=750m --memory=128M
    $ sudo rkt run --net=host coreos.com/etcd:v2.0.0
  34. None
  35. Pods built-in: deployed together; share local network; share volumes

  36. Pods built-in: rktnetes

  37. Tunable isolation: match your workload; 3 isolation levels; make your own stage1

  38. stage0, stage1, stage2

  39. stage0 (of stage0, stage1, stage2): the rkt binary. Fetch ACI, verify; set up pod filesystem; unpack stage1 and stage2 ACIs

  40. stage1 (of stage0, stage1, stage2): set up execution env. Create cgroups, namespaces, & mounts; read pod manifest; start systemd-nspawn

  41. stage2 (of stage0, stage1, stage2): your application!

  42. Benefit from standard packaging, signing and distribution at all isolation levels. Privileged, e.g. Kubelet; container/cgroup, e.g. webapp; virtual machine, e.g. untrusted code
  43. rkt run a pod
    $ sudo rkt run \
        example.com/worker -- --loglevel verbose --- \
        example.com/syncer -- --interval 30s
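The command on slide 43 packs two apps into one pod: each image may be followed by `--` and its own arguments, and `---` closes one app's arguments before the next image. A minimal Python sketch of that grouping, assuming no rkt-level flags are mixed in (the function name `split_pod_args` is made up for illustration and is not part of rkt):

```python
def split_pod_args(argv):
    """Group rkt-style arguments into (image, app_args) pairs.

    Simplified: assumes argv holds only image names, "--" and "---"
    separators, and per-app arguments (no rkt flags such as --net).
    """
    apps = []
    i = 0
    while i < len(argv):
        image = argv[i]
        i += 1
        app_args = []
        if i < len(argv) and argv[i] == "--":
            i += 1  # skip the "--" that opens this app's arguments
            while i < len(argv) and argv[i] != "---":
                app_args.append(argv[i])
                i += 1
            if i < len(argv):
                i += 1  # skip the "---" that closes this app's arguments
        apps.append((image, app_args))
    return apps


pod = split_pod_args([
    "example.com/worker", "--", "--loglevel", "verbose", "---",
    "example.com/syncer", "--", "--interval", "30s",
])
for image, args in pod:
    print(image, args)
```

Each tuple pairs one image with the arguments that would be passed to that app inside the pod.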
  44. Unique rkt features: sensible, best practice security; ease of use for Ops

  45. Garbage Collection: run as a cron job, customizable grace period
    $ sudo rkt gc
    Moving pod "81627cc6" to garbage
    Moving pod "cd642877" to garbage
    Moving pod "d65abad6" to garbage
    Pod "81627cc6" not removed: still within grace period (30m0s)
    Pod "cd642877" not removed: still within grace period (30m0s)
    Pod "d65abad6" not removed: still within grace period (30m0s)
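The output on slide 45 suggests a two-step scheme: pods are first moved to garbage and only removed once they have been there longer than the grace period. A rough Python sketch of that decision, assuming each pod records when it was marked (the `collect` function, the timestamps, and the pod ID `0ldp0d00` are hypothetical, not taken from rkt):

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(minutes=30)  # the 30m0s shown on the slide


def collect(pods, now, grace=GRACE_PERIOD):
    """Split pods into (kept, removed) lists of pod IDs.

    `pods` maps a pod ID to the time it was moved to garbage.
    """
    kept, removed = [], []
    for pod_id, marked_at in pods.items():
        if now - marked_at < grace:
            kept.append(pod_id)      # still within the grace period
        else:
            removed.append(pod_id)   # grace period has elapsed
    return kept, removed


now = datetime(2016, 3, 22, 12, 0)
pods = {
    "81627cc6": now,                          # just moved to garbage
    "cd642877": now - timedelta(minutes=10),  # hypothetical timestamp
    "0ldp0d00": now - timedelta(hours=2),     # hypothetical older pod
}
kept, removed = collect(pods, now)
print("kept:", kept)
print("removed:", removed)
```

Running the collector from cron, as the slide suggests, simply repeats this check until every garbage pod has aged out.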
  46. Tools for trust: easily control what runs on your server
    $ sudo rkt trust --prefix=storage.coreos.com
    $ sudo rkt trust --prefix=coreos.com/etcd
    $ sudo rkt trust --root ~/aci-pubkeys.gpg

  47. Tools for trust: easily control what runs on your server
    $ find /etc/rkt/trustedkeys/
    /etc/rkt/trustedkeys/
    /etc/rkt/trustedkeys/prefix.d
    /etc/rkt/trustedkeys/prefix.d/coreos.com
    /etc/rkt/trustedkeys/prefix.d/coreos.com/etcd
    /etc/rkt/trustedkeys/prefix.d/coreos.com/etcd/8b86de38890ddb7291867b025210bd8888182190
    /etc/rkt/trustedkeys/root.d
    /etc/rkt/trustedkeys/root.d/d8685c1eff3b2276e5da37fd65eea12767432ac4
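The directory listing on slide 47 hints at how trust is scoped: keys under `root.d` and keys under `prefix.d/<prefix>`. The Python sketch below models one plausible reading, in which root keys apply to every image and prefix keys apply only to images at or below that prefix. The matching rule is an assumption made for illustration; the slides do not spell out how rkt matches prefixes, and `trusted_fingerprints` is a made-up helper.

```python
# Mirrors the layout from the slide: prefix.d/<prefix>/<fingerprint>
# and root.d/<fingerprint>.
TRUSTED_KEYS = {
    "prefix": {
        "coreos.com/etcd": {"8b86de38890ddb7291867b025210bd8888182190"},
    },
    "root": {"d8685c1eff3b2276e5da37fd65eea12767432ac4"},
}


def trusted_fingerprints(image_name, keys=TRUSTED_KEYS):
    """Return the key fingerprints allowed to sign `image_name`.

    Assumed rule (not confirmed by the slides): a prefix matches the
    image name itself or any name below it, split on "/".
    """
    allowed = set(keys["root"])
    for prefix, fingerprints in keys["prefix"].items():
        base = prefix.rstrip("/")
        if image_name == base or image_name.startswith(base + "/"):
            allowed |= fingerprints
    return allowed


print(sorted(trusted_fingerprints("coreos.com/etcd")))
print(sorted(trusted_fingerprints("example.com/worker")))
```

Under this reading, an image outside every trusted prefix can only be verified against the root keys.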
  48. Fetch ACI as unprivileged user: don’t have to download as root
    $ rkt fetch quay.io/coreos/alpine-sh
    ...
    $ sudo rkt run quay.io/coreos/alpine-sh

  49. Run Docker containers with rkt: use a more secure runtime without changing images
    $ sudo rkt run --insecure-options=image --interactive \
        docker://busybox -- /bin/sh

  50. Part 3: Clustering

  51. Scale out workloads; everyone’s goal is #GIFEE; enables automation; cloud = distributed systems

  52. When do you need cluster coordination? Leader election; cluster-wide semaphores; service discovery; dynamic configuration

  53. Hard computer science problem: ?

  54. Hard computer science problem: Chubby

  55. A distributed, reliable key-value store for the most critical data of a distributed system.
  56. None
  57. Why build etcd? No existing “cloud native” solutions; high availability from the beginning; dynamic reconfiguration

  58. Simple key/value: “Distributed etc”; feels like a file system, e.g. directories
  59. Set a value
    $ etcdctl set /foo bar
    bar
    $ etcdctl ls /config
    /config/verbosity
    /config/ratelimit
    $ etcdctl get /foo
    bar
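To illustrate the "feels like a file system" idea behind the `set`, `get`, and `ls` commands, here is a toy in-memory model in Python. It is not etcd and involves no networking or replication; the class `ToyStore` and the values stored under `/config` are invented for the example.

```python
class ToyStore:
    """In-memory stand-in for a key-value store with directory-like keys."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return value

    def get(self, key):
        return self._data[key]

    def ls(self, directory):
        """List the immediate children of `directory`, like a file system."""
        prefix = directory.rstrip("/") + "/"
        children = set()
        for key in self._data:
            if key.startswith(prefix):
                first = key[len(prefix):].split("/", 1)[0]
                children.add(prefix + first)
        return sorted(children)


store = ToyStore()
store.set("/foo", "bar")
store.set("/config/verbosity", "debug")  # hypothetical value
store.set("/config/ratelimit", "100")    # hypothetical value
print(store.get("/foo"))
print(store.ls("/config"))
```

Keys are plain strings; the directory behavior comes entirely from treating `/` as a path separator when listing.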
  60. Simple interface: easily write clients; use curl if you want; already maintain TLS infra.
  61. Watch a value: service discovery; reconfiguration; locking; cluster scheduler
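Watching a value means being told when a key changes rather than polling for it, which is what makes service discovery and reconfiguration practical. A small single-process Python sketch of the pattern follows; the class `WatchableStore`, the key `/services/web`, and the addresses are hypothetical, and a real etcd watch works over the network rather than through in-process callbacks.

```python
from collections import defaultdict


class WatchableStore:
    """Toy key-value store that notifies callbacks when a key is set."""

    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)

    def watch(self, key, callback):
        """Register callback(key, old_value, new_value) for changes to key."""
        self._watchers[key].append(callback)

    def set(self, key, value):
        old = self._data.get(key)
        self._data[key] = value
        for callback in self._watchers[key]:
            callback(key, old, value)


events = []
store = WatchableStore()
store.watch("/services/web", lambda k, old, new: events.append((k, old, new)))
store.set("/services/web", "10.0.0.5:8080")  # hypothetical address
store.set("/services/web", "10.0.0.6:8080")  # hypothetical address
for event in events:
    print(event)
```

A load balancer using this pattern would rewrite its backend list inside the callback instead of appending to a list.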

  62. Cluster-wide reboot lock - “locksmith”; distributed init system - “fleet”; leader election - “fleet”

  63. locksmith
    $ locksmithctl status
    Available: 1
    Max: 1
    $ sudo locksmithctl reboot
    $ locksmithctl status
    Available: 0
    Max: 1
    MACHINE ID
    7f9ccde3cff9441f8b506785
    $ sudo locksmithctl reboot
    Error locking: semaphore is at 0
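The locksmith session above behaves like a counting semaphore: a machine must take a slot before rebooting, and the request fails once no slots remain. The Python sketch below models only that bookkeeping in a single process. In a cluster the count would live in a shared store and each change would need to be atomic (for example via a compare-and-swap), which this toy version does not attempt. The class name and the second machine ID are made up.

```python
class RebootSemaphore:
    """Toy model of a cluster-wide reboot lock with a fixed number of slots."""

    def __init__(self, max_holders=1):
        self.max = max_holders
        self.holders = set()

    @property
    def available(self):
        return self.max - len(self.holders)

    def lock(self, machine_id):
        if self.available == 0:
            raise RuntimeError("Error locking: semaphore is at 0")
        self.holders.add(machine_id)

    def unlock(self, machine_id):
        self.holders.discard(machine_id)


sem = RebootSemaphore(max_holders=1)
sem.lock("7f9ccde3cff9441f8b506785")       # first machine takes the slot
try:
    sem.lock("hypothetical-second-machine")  # made-up ID for illustration
except RuntimeError as err:
    print(err)
sem.unlock("7f9ccde3cff9441f8b506785")     # slot is free again after reboot
print("Available:", sem.available, "Max:", sem.max)
```

Raising `max_holders` would let several machines reboot at once, trading update speed against cluster capacity.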
  64. Industry adoption: 500+ projects on GitHub

  65. 3 changes: Application packaging, Linux at scale, Clustering

  66. Minimal, secure Linux OS; containers for app packaging; self-updating cluster; distributed systems tools
  67. None
  68. Sounds good, but... Is anyone successful with CoreOS in prod?

  69. Publicly traded options exchange

  70. Containers on CoreOS are powering ISE's high-throughput, low-latency financial exchange. Running in production; bare metal & AWS; billions of transactions a day; 150 million req/sec
  71. Chart labels: time; patching OS; new machine deployment

  72. Invisible Infrastructure

  73. “We really look at that [CoreOS] number growing significantly over this next year. We did some of these benchmarks to see if our production trading systems could leverage this type of infrastructure, and it was highly successful for us, and we look forward to using it more in our other environments. On the Linux side, everything in AWS is CoreOS. On the physical side, 20% is CoreOS, and growing.” (Robert Cornish, CTO; Paul Morgan, Systems Architect)
  74. None
  75. Kubernetes is our recommended orchestration platform

  76. Guides & tools: coreos.com/kubernetes; kube-aws; cloud-configs

  77. Upstream: rktnetes; auth/OIDC; node self-signed TLS

  78. Scaling: 15x scheduler performance; 30k pods on 1k nodes; SIG-scale

  79. None
  80. Off-the-shelf #GIFEE

  81. Enhances Kubernetes: included tools; 24/7 support; enhanced security

  82. Quay Enterprise

  83. Tectonic Console

  84. CoreUpdate

  85. Distributed Trusted Computing: only possible with #GIFEE

  86. Trusted Computing: it’s in your pocket right now

  87. Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)

  88. Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)
  89. Customer key embedded in the firmware. Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)

  90. Verify integrity of the OS release; customer key embedded in the firmware. Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)

  91. Verify integrity of the OS release; customer key embedded in the firmware; verify configuration state; verify images with trusted keys. Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)

  92. Verify integrity of the OS release; customer key embedded in the firmware; verify configuration state; verify images with trusted keys; only attested machines are allowed to join. Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)

  93. Verify integrity of the OS release; customer key embedded in the firmware; verify configuration state; verify images with trusted keys; only attested machines are allowed to join; tamper-proof audit log (TPM). Stack diagram: Cluster (Kubernetes), Containers (rkt), OS (CoreOS Linux), Hardware (Firmware & TPM)
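The tamper-proof audit log builds on the TPM "extend" operation, where a register can only be updated by hashing its old value together with a new measurement, so the final value depends on every event and on their order. A Python sketch of that hash chain, assuming SHA-256 and an all-zero starting register (the event names are invented and do not describe what CoreOS actually measures):

```python
import hashlib


def extend(register: bytes, event: bytes) -> bytes:
    """Fold one event into the register: H(register || H(event))."""
    return hashlib.sha256(register + hashlib.sha256(event).digest()).digest()


def replay(events):
    """Recompute the register value from an event log."""
    register = bytes(32)  # assumed all-zero initial value
    for event in events:
        register = extend(register, event)
    return register


# Hypothetical boot events, for illustration only.
log = [b"firmware", b"bootloader", b"os-release", b"container-image"]
expected = replay(log)

tampered = [b"firmware", b"evil-bootloader", b"os-release", b"container-image"]
print(replay(log) == expected)       # an honest replay matches
print(replay(tampered) == expected)  # a modified log does not
```

A verifier that holds the expected register value can therefore detect a log that was edited, truncated, or reordered after the fact.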
  94. Identify attacks: visibility into new classes of attacks (firmware, OS, images, rootkits)
  95. Inverting DRM: your company is in control

  96. You hold the keys: only software your company allows will run; you are in control of the hardware (diagram label: Key)
  97. New level of security: run in third party or hostile data centers with zero trust; prevent invisible attacks; verifiable audit log for when things go wrong. Putting you in control: your company is in cryptographic control of your environment
  98. The Journey to #GIFEE

  99. coreos.com/fest - @coreosfest. May 9 & 10, 2016 - Berlin, Germany
  100. Thank You. Rob Szumski, Product Design Lead, CoreOS, @robszumski