Kubernetes at GitHub

Jesse Newland
December 08, 2017

An overview of the on-premises Kubernetes deployments that power 20% of GitHub's production services, and a review of the challenges GitHub faced and overcame during their Kubernetes journey.

Presented at KubeCon in Austin. Slides with presenter notes are available here:

https://schd.ws/hosted_files/kccncna17/44/kubernetes-at-github.pdf

Transcript

  1. Kubernetes at
    GitHub
    Jesse Newland
    @jnewland
    Principal Site Reliability Engineer

  3. 4 years ago

  7. Substrate

  17. 20% of services
    run on Kubernetes

  20. GitHub dot com,
    the website

  21. $ kubectl get ns github-production
    NAME                STATUS   AGE
    github-production   Active   168d

  22. $ kubectl get ns
    NAME                STATUS   AGE
    github-production   Active   168d
    kube-system         Active   169d

  25. [Diagram: Clusters A, B and C, each with 3x kube-apiserver and many kube-node hosts.
    Node count labels: 45x, 67x, 37x.
    Capacity labels: 1460 CPUs / 5.7 TB RAM; 1540 CPUs / 5.4 TB RAM; 1580 CPUs / 6.9 TB RAM]

  32. $ kubectl -n github-production get deployment
    NAME                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    unicorn                 190       190       190          190         168d
    unicorn-api             164       164       164          164         168d
    consul-service-router   2         2         2            2           168d

  33. unicorn
    kind: Deployment
    metadata:
      name: unicorn
      labels:
        service: unicorn
        role: production
    spec:
      replicas: 190
    [Diagram labels: nginx, unicorn, failbot; requests via unix socket; exceptions]

  36. unicorn-api
    kind: Deployment
    metadata:
      name: unicorn-api
      labels:
        service: unicorn-api
        role: production
    spec:
      replicas: 164
    [Diagram labels: nginx, unicorn, failbot; requests via unix socket; exceptions]

  37. consul-service-router
    Metal services: mysql, gpgverify, search, hookshot, spokes, memcached
    github-production Namespace:
      kind: Deployment
      metadata:
        name: unicorn
      kind: Deployment
      metadata:
        name: consul-service-router
      kind: Service
      metadata:
        name: consul-service-router
    [Diagram labels: haproxy, unicorn]

  40. Cluster A
      kind: Namespace
      metadata:
        name: github-production
      kind: Service
      metadata:
        name: unicorn
      spec:
        type: NodePort
    Cluster B
      kind: Namespace
      metadata:
        name: github-production
      kind: Service
      metadata:
        name: unicorn
      spec:
        type: NodePort
    Cluster C
      kind: Namespace
      metadata:
        name: github-production
      kind: Service
      metadata:
        name: unicorn
      spec:
        type: NodePort

  41. Tools to support operations
    • kube-testlib
      • Continuously running suite of conformance tests
    • kube-health-proxy
      • Adjust weight of incoming traffic, disable entire clusters at load balancer level
    • kube-namespace-defaults
      • Creates default resources in each new namespace, configures imagePullSecrets
    • kube-pod-patrol
      • Detects and deletes stuck pods, sets NodeConditions if a node has repeated trouble starting pods
    • node-problem-healer
      • Detects NodeConditions, heals them by rebooting nodes
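
As a rough illustration of the kube-pod-patrol idea on this slide, here is a minimal Python sketch of the detection step only. The talk does not show the tool's code, so the `PodStatus` record, both thresholds, and the rule that "stuck" means a pod that has been Pending for a long time are all assumptions made for this example. A real tool would read pod state from the Kubernetes API rather than from a hard-coded list.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical thresholds; the talk does not give real values.
STUCK_AFTER_SECONDS = 300
NODE_TROUBLE_THRESHOLD = 2


@dataclass(frozen=True)
class PodStatus:
    """Simplified, made-up view of a pod for this sketch."""
    name: str
    node: str
    phase: str
    seconds_in_phase: int


def find_stuck_pods(pods):
    """Return pods that have been Pending longer than STUCK_AFTER_SECONDS."""
    return [
        p for p in pods
        if p.phase == "Pending" and p.seconds_in_phase > STUCK_AFTER_SECONDS
    ]


def nodes_with_repeated_trouble(stuck_pods):
    """Return sorted names of nodes with at least NODE_TROUBLE_THRESHOLD stuck pods."""
    counts = Counter(p.node for p in stuck_pods)
    return sorted(n for n, c in counts.items() if c >= NODE_TROUBLE_THRESHOLD)


# Made-up sample data.
pods = [
    PodStatus("unicorn-1", "node-a", "Pending", 900),
    PodStatus("unicorn-2", "node-a", "Pending", 600),
    PodStatus("unicorn-3", "node-b", "Pending", 30),
    PodStatus("unicorn-4", "node-b", "Running", 5000),
]
stuck = find_stuck_pods(pods)
print([p.name for p in stuck])
print(nodes_with_repeated_trouble(stuck))
```

In this sketch the stuck pods would be candidates for deletion, and the flagged nodes would be candidates for a NodeCondition that something like node-problem-healer could act on.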

  42. A platform
    for builders

  48. GitHub Flow

  49. Conventions

  53. $ docker build -t $service:$sha1 .
    $ kubectl create ns $service-$environment
    $ deploy -Rf ./config/kubernetes/$environment | \
        kubectl -n $service-$environment apply -f -
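
The conventions on this slide boil down to deriving every name from three inputs: the service, the commit SHA, and the environment. A minimal Python sketch of that derivation follows. The function name `deploy_names`, the dictionary keys, and the sample values are hypothetical and only mirror the patterns shown in the commands.

```python
def deploy_names(service, sha1, environment):
    """Derive the image tag, namespace and config directory from the conventions."""
    return {
        "image": f"{service}:{sha1}",
        "namespace": f"{service}-{environment}",
        "config_dir": f"./config/kubernetes/{environment}",
    }


# Made-up inputs for illustration.
names = deploy_names("example-service", "abc123", "production")
for key, value in names.items():
    print(f"{key}: {value}")
```

Because everything follows from the same three values, tooling can build, create the namespace, and apply the configuration for any service without per-service settings.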

  54. Create a branch

  55. Add some commits

  56. Open a pull request

  57. Containers built on push, tagged with commit

  58. Iterate and review

  60. # config/kubernetes/review-lab
    # updates image field value to $service:$sha1
    # injects a Secret
    # injects an Ingress

  61. $ kubectl create ns review-lab-$branch
    $ kubectl apply -n review-lab-$branch -f -
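
One practical detail behind a namespace per branch: Kubernetes namespace names must be valid DNS labels (lowercase alphanumerics and `-`, at most 63 characters), while Git branch names can contain slashes, underscores, and uppercase letters. The talk does not say how branch names are mapped, so the following Python sketch is only an assumption about how such a mapping could look. The function name `review_lab_namespace` is hypothetical.

```python
import re

MAX_LABEL_LENGTH = 63  # namespace names must fit in a DNS label
PREFIX = "review-lab-"


def review_lab_namespace(branch):
    """Map a Git branch name to a namespace-safe name (hypothetical scheme)."""
    # Lowercase, then collapse every run of disallowed characters into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    if not slug:
        raise ValueError(f"branch {branch!r} has no usable characters")
    # Truncate to the label limit and avoid ending on a dash.
    return (PREFIX + slug)[:MAX_LABEL_LENGTH].rstrip("-")


print(review_lab_namespace("feature/Add_Canary"))
```

Note that truncation can make two long branch names collide; a real scheme would likely need a disambiguating suffix.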

  62. Deploy

  66. Steady state
    kind: Service
    metadata:
      name: unicorn
    spec:
      selector:
        service: unicorn
    kind: Pod
    metadata:
      name: unicorn
      labels:
        service: unicorn
        role: production
    [Diagram label: unicorn]

  67. Canary deploy
    kind: Service
    metadata:
      name: unicorn
    spec:
      selector:
        service: unicorn
    kind: Pod
    metadata:
      name: unicorn
      labels:
        service: unicorn
        role: production
    [Diagram label: unicorn]
    kind: Pod
    metadata:
      name: unicorn-canary
      labels:
        service: unicorn
        role: canary
    [Diagram label: unicorn]
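
The canary setup on this slide works because the Service selects only on `service: unicorn`, so both the production and the canary pods receive traffic. A small Python sketch of equality-based label matching shows the effect. The helper names `selector_matches` and `endpoints` are made up for this example, and the pod data is copied from the slide.

```python
def selector_matches(selector, labels):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(selector, pods):
    """Return sorted names of the pods whose labels satisfy the selector."""
    return sorted(
        name for name, labels in pods.items()
        if selector_matches(selector, labels)
    )


pods = {
    "unicorn": {"service": "unicorn", "role": "production"},
    "unicorn-canary": {"service": "unicorn", "role": "canary"},
}

# The Service from the slide: selects on the service label only.
service_selector = {"service": "unicorn"}
print(endpoints(service_selector, pods))

# A narrower selector, e.g. for inspecting only the canary pods.
print(endpoints({"service": "unicorn", "role": "canary"}, pods))
```

Since the `role` label is not part of the Service selector, the share of traffic the canary sees is governed by how many canary pods run relative to production pods.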

  69. $ kubectl apply \
        --namespace github-production \
        -Rf config/kubernetes/production

  71. All of the other services deployed
    to our Kubernetes clusters can now
    use this canary workflow

  72. Adopting Kubernetes as a
    standard platform has made it
    easier for GitHub SREs to build
    features that apply to all services,
    not just github/github

  73. We're encouraging the
    decomposition of the monolith
    by providing a first-class
    experience for newer, smaller
    services

  74. 2018

  76. State

  78. Distributed systems often use
    replication to provide fault tolerance,
    and can therefore tolerate node failures.
    However, data gravity is preferred for
    reducing replication traffic and cold
    startup latencies.

  81. Changing our OSS habits

  83. @jnewland

