Kubernetes manifests management and operation in Mercari

by @babarot

Slide 1

Slide 1 text

@b4b4r07 (Apr 22, 2019) / Kubernetes Meetup Tokyo #18 Kubernetes manifests management & operation in Mercari

Slide 2

Slide 2 text

BABAROT / @b4b4r07 Mercari, Inc.  SRE, Microservices Platform Blog / tellme.tokyo

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Monolith Our current status Microservices

Slide 5

Slide 5 text

100+ microservices Our current status

Slide 6

Slide 6 text

200+ contributors Our current status

Slide 7

Slide 7 text

Our current status
Central GKE cluster: Service A namespace, Service A pods, RBAC for Service A team
•Each namespace is managed by its microservices team
•The GKE cluster itself is managed by the platform team

Slide 8

Slide 8 text

Agenda
1. Kubernetes YAML
  1. GitHub Pull Requests
  2. GitOps
  3. Monorepo
  4. Directories
2. Repository Ecosystem
3. Recap

Slide 9

Slide 9 text

Kubernetes YAML

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
      namespace: x-echo-jp-dev
    spec:
      containers:
      - name: nginx-container
        image: nginx
        ports:
        - containerPort: 80

Slide 10

Slide 10 text

How do you manage your YAML and apply it to your Kubernetes clusters?

Slide 11

Slide 11 text

Write Apply

Slide 12

Slide 12 text

Write Apply Some points…

Slide 13

Slide 13 text

Write Apply How should we review it? How should we apply it?

Slide 14

Slide 14 text

Write Apply How should we review it? How should we manage it? How should we apply it? How should we control it?

Slide 15

Slide 15 text

Write Apply How should we review it? How should we manage it? How should we apply it? How should we control it? YAML Operations YAML Management

Slide 16

Slide 16 text

Write Apply How should we review it? How should we manage it? How should we apply it? How should we control it? YAML Operations YAML Management Our practices are…

Slide 17

Slide 17 text

Write Apply How should we review it? How should we manage it? How should we apply it? How should we control it? 2. GitOps w/ kubectl 1. Pull Requests 3. Monorepo 4. Directories

Slide 18

Slide 18 text

Let's look at each of them in detail

Slide 19

Slide 19 text

1. GitHub Pull Requests

Slide 20

Slide 20 text

Kubernetes Object Management Imperative commands Imperative  object  configuration Declarative  object  configuration https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/

Slide 21

Slide 21 text

Kubernetes Object Management Imperative  object  configuration Declarative  object  configuration https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/ kubectl run nginx --image nginx

Slide 22

Slide 22 text

Kubernetes Object Management Declarative  object  configuration https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/ kubectl run nginx --image nginx kubectl create -f nginx.yaml

Slide 23

Slide 23 text

Kubernetes Object Management
•Imperative commands: kubectl run nginx --image nginx
•Imperative object configuration: kubectl create -f nginx.yaml
•Declarative object configuration: kubectl apply -f manifests/
https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/

Slide 24

Slide 24 text

Kubernetes Object Management: Declarative object configuration
•It can be stored in a VCS such as Git.
•It can integrate with processes such as reviewing changes.
•It has better support for operating on directories and automatically detecting operation types per object.
https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/

Slide 25

Slide 25 text

Pull Requests
•Easy to review
•Easy to track operations and changes
•Easy to recover from unexpected problems
  • Just revert

Slide 26

Slide 26 text

2. GitOps + -

Slide 27

Slide 27 text

GitOps - Operations by Pull Requests

Slide 28

Slide 28 text

https://www.weave.works/blog/gitops-operations-by-pull-request

Slide 29

Slide 29 text

Start 6 servers There are 6 servers

Slide 30

Slide 30 text

Start 6 servers There are 6 servers <

Slide 31

Slide 31 text

Let's look at the Weaveworks case

Slide 32

Slide 32 text

GitHub Repository Kubernetes Cluster

Slide 33

Slide 33 text

Live Objects Source of Truth

Slide 34

Slide 34 text

Live Objects Source of Truth

Slide 35

Slide 35 text

Live Objects apply Source of Truth

Slide 36

Slide 36 text

Live Objects apply kubectl create Source of Truth

Slide 37

Slide 37 text

Live Objects apply kubectl create diff Source of Truth

Slide 38

Slide 38 text

Live Objects apply kubectl create diff Source of Truth unknown objects

Slide 39

Slide 39 text

Live Objects apply kubectl create diff Git is the source of truth Source of Truth unknown objects

Slide 40

Slide 40 text

GitHub Repository Kubernetes Cluster diff & sync Pull strategy

Slide 41

Slide 41 text

Let's look at our case

Slide 42

Slide 42 text

GitHub Repository Kubernetes Cluster CI

Slide 43

Slide 43 text

+ - GitHub Repository Kubernetes Cluster CI diff

Slide 44

Slide 44 text

+ - GitHub Repository Kubernetes Cluster CI merge diff

Slide 45

Slide 45 text

+ - apply GitHub Repository Kubernetes Cluster CI merge diff

Slide 46

Slide 46 text

+ - apply GitHub Repository Kubernetes Cluster CI merge diff kubectl create

Slide 47

Slide 47 text

+ - apply GitHub Repository Kubernetes Cluster CI merge diff kubectl create not implemented yet  (described later)

Slide 48

Slide 48 text

GitHub Repository Kubernetes Cluster CI merge & apply Push strategy
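The push flow on these slides (diff on a pull request, apply after merge) could be wired into CI roughly as follows. This is only a sketch under assumptions: the function names ci_action and run_ci are hypothetical and not taken from the talk, and the slides do not say that the diff step is implemented with kubectl diff.

```shell
# Sketch of a push-style CI step. ci_action and run_ci are hypothetical names.

# Decide what CI should do for a given branch name.
ci_action() {
  # $1: branch name
  if [ "$1" = "master" ]; then
    echo "apply"   # merged: push the change to the cluster
  else
    echo "diff"    # pull request: only show what would change
  fi
}

# Run the chosen action against one manifest directory.
run_ci() {
  # $1: branch name, $2: manifest directory
  case "$(ci_action "$1")" in
    apply)
      kubectl apply -f "$2"
      ;;
    diff)
      # kubectl diff exits with 1 when it finds differences,
      # so only statuses greater than 1 are treated as failures here.
      kubectl diff -f "$2" || [ "$?" -eq 1 ]
      ;;
  esac
}
```

Keeping the branch decision in its own function makes the plan/apply split easy to check without a cluster.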

Slide 49

Slide 49 text

Let's look at the differences

Slide 50

Slide 50 text

Pull (Weaveworks case)
•Git can be the single source of truth
•A bit difficult to implement the sync pipeline
Push (Mercari case)
•Difficult to find divergence from the source
•A common way of applying changes

Slide 51

Slide 51 text

Pull (Weaveworks case)
•Git can be the single source of truth
•A bit difficult to implement the sync pipeline
We know they recommend the "Pull" strategy
https://www.weave.works/blog/kubernetes-anti-patterns-let-s-do-gitops-not-ciops

Slide 52

Slide 52 text

Push (Mercari case)
•Difficult to find divergence from the source
•A common way of applying changes
Why did we choose the "Push" strategy?
•Simple
•Easy to implement
•Good enough to start with

Slide 53

Slide 53 text

Why did we choose the "Push" strategy?
•Simple
•Easy to implement
•Good enough to start with
Another, even bigger reason is... Spinnaker

Slide 54

Slide 54 text

Good enough to start with / Spinnaker
Why do we also use Spinnaker?
•Our CI pipeline, which runs "kubectl apply" based on the changes, is triggered by merging pull requests
•However, for some resources (Job, etc.) we want to apply at a time of our choosing
•Spinnaker "Provider V2" can handle Kubernetes manifests declaratively
https://www.spinnaker.io/reference/providers/kubernetes-v2/

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

+ - apply GitHub Repository Kubernetes Cluster CI merge diff kubectl create not implemented yet  (described later) Spinnaker kick

Slide 57

Slide 57 text

3. Monorepo

Slide 58

Slide 58 text

Two types of repository styles

Slide 59

Slide 59 text

One repository Multiple repositories

Slide 60

Slide 60 text

Monorepo Polyrepo

Slide 61

Slide 61 text

Monorepo Polyrepo Service A Service B Service A Service C Service B Service C

Slide 62

Slide 62 text

https://medium.com/@adamhjk/monorepo-please-do-3657e08a4b70 https://medium.com/@mattklein123/monorepos-please-dont-e9a279be011b

Slide 63

Slide 63 text

Monorepo
•Advantages
  • Easy to share YAML code
  • Easy to be reviewed by a central team
  • Easy to be managed by a central team
  • Easy to set up a CI pipeline
•Disadvantages
  • Need to take repository growth into account
  • Need to take delegation of authority into account
  • CI/CD: no independence for each team

Slide 64

Slide 64 text

Polyrepo
•Advantages
  • Each team builds its own CI/CD pipeline
  • No dependency on outside systems
  • Easy for teams to scale up on their own
  • Easy to change the pipeline cycle
•Disadvantages
  • Not easy to share YAML code
  • Difficult for a central team to review
  • Troublesome to build a CI/CD pipeline yourself

Slide 65

Slide 65 text

•We chose the "Monorepo" style.
•It was a good option for starting small.
•The concrete reasons are shown in the next section.

Slide 66

Slide 66 text

4. Directories

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

microservice

Slide 70

Slide 70 text

environment microservice

Slide 71

Slide 71 text

environment kind microservice

Slide 72

Slide 72 text

microservice environment kind resource

Slide 73

Slide 73 text

Spinnaker case

Slide 74

Slide 74 text

No content

Slide 75

Slide 75 text

microservice

Slide 76

Slide 76 text

microservice environment

Slide 77

Slide 77 text

microservice environment pipeline

Slide 78

Slide 78 text

microservice environment pipeline

Slide 79

Slide 79 text

Pipeline

Slide 80

Slide 80 text

Pipeline

Slide 81

Slide 81 text

How do we apply the manifest changes?

Slide 82

Slide 82 text

Just run "kubectl" https://groups.google.com/forum/#!msg/kubernetes-sig-cli/M6t40JP6n0g/U6Snz-bsFQAJ

Slide 83

Slide 83 text

kubectl
https://groups.google.com/forum/#!msg/kubernetes-sig-cli/M6t40JP6n0g/U6Snz-bsFQAJ
•Microservices developers not only develop but also operate their services themselves
•So they are basically familiar with kubectl
•That means a lower learning cost than introducing other tools

Slide 84

Slide 84 text

Slide 85

Slide 85 text

Slide 86

Slide 86 text

Let's say we add a new manifest

Slide 87

Slide 87 text

added

Slide 88

Slide 88 text

added

Slide 89

Slide 89 text

added: manifests/microservices/mercari-echo-jp/development/PodDisruptionBudget
Helper bash scripts detect the changed directories from the PR

Slide 90

Slide 90 text

    changed_files() {
      declare basedir="${1}"
      declare current_branch="$(git rev-parse --abbrev-ref @)"

      if [[ ${current_branch} == "master" ]]; then
        # (apply)
        # In the master branch, when listing files edited
        # you need to compare with previous merge commit
        git diff --name-only "HEAD^" "HEAD" "${basedir}"
      else
        # (plan)
        # In the topic branch, when listing files edited in the branch,
        # you need to compare with the commit at the time
        # the branch was created
        # https://git-scm.com/docs/git-merge-base
        git diff --name-only "$(git merge-base origin/HEAD HEAD)" "${basedir}"
      fi
    }

Slide 91

Slide 91 text

    changed_dirs() {
      # Note:
      # If these files are changed
      # - manifests/microservices/x/development/Ingress
      # - manifests/microservices/x/development/PersistentVolumeClaim
      # - manifests/microservices/y/production/PersistentVolumeClaim
      # - manifests/microservices/y/production/Pod
      # - manifests/microservices/y/production/PodDisruptionBudget
      # The files we have to pass to script/apply are only two dirs
      # - manifests/microservices/x/development
      # - manifests/microservices/y/production
      declare basedir="${1}"

      for file in $(changed_files "${basedir}")
      do
        get_target_dir "${file}"
      done | sort | uniq
    }
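The helper get_target_dir is called above but not shown on the slides. A minimal sketch of what it might do, assuming changed paths follow the layout manifests/microservices/<microservice>/<environment>/<kind>/... that the comment above suggests; the body is an illustration, not the original implementation.

```shell
# Hypothetical stand-in for the get_target_dir helper used by changed_dirs.
# It keeps the first four path components, i.e. up to the environment directory.
get_target_dir() {
  # $1: path of a changed file, relative to the repository root
  printf '%s\n' "$1" | cut -d/ -f1-4
}
```

With this definition, both an Ingress and a PersistentVolumeClaim change under manifests/microservices/x/development map to the same directory, which is why changed_dirs pipes the results through sort and uniq.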

Slide 92

Slide 92 text

Slide 93

Slide 93 text

Slide 94

Slide 94 text

    // InstallOrder is the order in which manifests should be installed (by Kind).
    // Those occurring earlier in the list get installed before those occurring later in the list.
    var InstallOrder SortOrder = []string{
        "Namespace",
        "ResourceQuota",
        "PodSecurityPolicy",
        "Secret",
        "ConfigMap",
        "PersistentVolume",
        "PersistentVolumeClaim",
        "CustomResourceDefinition",
        "Role",
        "RoleBinding",
        "Service",
        "DaemonSet",
        "Pod",
        "ReplicaSet",
        "Deployment",
        "StatefulSet",
        "Job",
        "CronJob",
        "Ingress",
    }

https://github.com/helm/helm/blob/v2.10.0/pkg/tiller/kind_sorter.go

kind_install_order.txt

    ResourceQuota
    Secret
    ConfigMap
    PersistentVolume
    PersistentVolumeClaim
    ServiceAccount
    Role
    RoleBinding
    Service
    DaemonSet
    Pod
    ReplicaSet
    Deployment
    StatefulSet
    Job
    CronJob
    Ingress
    HorizontalPodAutoscaler
    NetworkPolicy
    PodDisruptionBudget

Slide 95

Slide 95 text

    sort_kinds_by_install_order() {
      kinds=( $(cat "kind_install_order.txt") )
      args=( "${@}" )

      for kind in "${kinds[@]}"
      do
        for arg in "${args[@]}"
        do
          if [[ $(get_kind "${arg}") == ${kind} ]]; then
            echo "${arg}"
          fi
        done
      done
    }

Input directories:

    .../development/Deployment
    .../development/ConfigMap
    .../development/PodDisruptionBudget

Order in kind_install_order.txt:

    ConfigMap
    Deployment
    PodDisruptionBudget
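The get_kind helper used in the loop is not shown on the slides either. A sketch under the same layout assumption (manifests/microservices/<microservice>/<environment>/<kind>/...), where the kind is simply the fifth path component; the body is hypothetical.

```shell
# Hypothetical stand-in for the get_kind helper.
# Works for a kind directory or for any file below it.
get_kind() {
  # $1: path such as manifests/microservices/x/development/Deployment/app.yaml
  printf '%s\n' "$1" | cut -d/ -f5
}
```

Deriving the kind from the directory name rather than parsing the YAML keeps the sort cheap, at the cost of trusting that files are placed in the right directory.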

Slide 96

Slide 96 text

Slide 97

Slide 97 text

It (kustomize) would be easy to introduce into our design, but it brings new concepts such as "overlay".

Slide 98

Slide 98 text

•Our move to microservices is still in progress
•So developers are still in the middle of becoming microservices developers
•They have a lot to learn:
  • Kubernetes, Kubernetes YAML itself, Spinnaker, etc.
•So now was not the time to introduce kustomize into our pipeline. Maybe in the future.

Slide 99

Slide 99 text

How do we apply PodDisruptionBudget?
•Some resource kinds (e.g., PodDisruptionBudget) cannot be updated in place
•This means we cannot update an existing PodDisruptionBudget or StatefulSet with kubectl apply.

Slide 100

Slide 100 text

    kubectl_apply() {
      # ...
      for manifest in $(sort_kinds_by_install_order "${manifests[@]}"); do
        case ${kind} in
          PodDisruptionBudget)
            # Need to be recreated if it already exists
            if kubectl get -n "${namespace}" "${kind}" "${resource}"; then
              kubectl delete -n "${namespace}" "${kind}" "${resource}"
            fi
            kubectl apply -n "${namespace}" -f "${manifest}"
            ;;
          Secret)
            # Secrets are stored encrypted; decrypt and pipe to kubectl
            ansible-vault view "${manifest}" \
              | kubectl apply -n "${namespace}" -f -
            ;;
          *)
            kubectl apply -n "${namespace}" -f "${manifest}"
            ;;
        esac
      done
    }

To deal with these special kinds, we prepared a simple wrapper script around kubectl.

Slide 101

Slide 101 text

No content

Slide 102

Slide 102 text

The delegation of directory authority

Slide 103

Slide 103 text

The mercari-echo-jp team should not be able to change the mercari-xxx-jp team's code, and vice versa

Slide 104

Slide 104 text

GitHub CODEOWNERS feature https://blog.github.com/2017-07-06-introducing-code-owners/

Slide 105

Slide 105 text

GitHub CODEOWNERS feature https://help.github.com/articles/about-codeowners/
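With one directory per microservice, CODEOWNERS entries can be derived mechanically from the directory layout. The sketch below prints one entry per service; the function name, the manifests/microservices/ prefix as the ownership boundary, and the team handle in the usage note are all assumptions for illustration, not details from the talk.

```shell
# Hypothetical generator for a CODEOWNERS entry.
# Output format: "<path pattern> <owner>", as used by GitHub CODEOWNERS files.
codeowners_line() {
  # $1: microservice directory name, $2: GitHub team handle (placeholder)
  printf '/manifests/microservices/%s/ %s\n' "$1" "$2"
}
```

For example, codeowners_line mercari-echo-jp @example-org/echo-team prints a line that makes that (made-up) team the owner of its own directory. Owner review is only enforced when the branch protection setting that requires review from code owners is enabled.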

Slide 106

Slide 106 text

Repository Ecosystem

Slide 107

Slide 107 text

Repository Ecosystem
•Our microservices-kubernetes repository has some useful tools that make it easier and handier to maintain, forming an ecosystem
•One of those tools is Stein, a linter for Kubernetes YAML
Stein / Stein documentation

Slide 108

Slide 108 text

Repository Ecosystem
•For example, let's say you want to stop developers from omitting the metadata.namespace field in their YAML, to prevent manifests from being applied to an unexpected namespace
•However, is there a way to do that with existing tools...?

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
      namespace: x-echo-jp-dev
    spec:
      containers:
      - name: nginx-container
        image: nginx
        ports:
        - containerPort: 80

Highlighted field: metadata.namespace: x-echo-jp-dev

Slide 109

Slide 109 text

Stein can do that.

Slide 110

Slide 110 text

Stein configuration

    rule "namespace_specification" {
      description = "Check namespace name is not empty"

      conditions = [
        "${jsonpath("metadata.namespace") != ""}",
      ]

      report {
        level   = "ERROR"
        message = "Namespace is not specified"
      }
    }

Stein lets you enforce rules that you define, based on your own policy, on your YAML.

Slide 111

Slide 111 text

Slide 112

Slide 112 text

Stein configuration

    rule "namespace_specification" {
      description = "Check namespace name is not empty"

      conditions = [
        "${jsonpath("metadata.namespace") != ""}",
      ]

      report {
        level   = "ERROR"
        message = "Namespace is not specified"
      }
    }

•Stein can interpret HCL, like Terraform
•Stein supports many built-in functions, like Terraform

Slide 113

Slide 113 text

    $ stein apply x-echo-jp/development/Pod/test.yaml
    [ERROR] rule.namespace_specification Namespace is not specified
    =====================
    7 error(s), 2 warn(s)

•Stein checks the policy files and applies them to your config files. If any rule is violated, Stein returns exit code 1.
•Stein works well as a linter, for example in a CI step.
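Because the exit code signals violations, a lint step in CI can be a thin loop around the command shown above. A sketch, assuming stein is invoked once per file exactly as in the example; the function name lint_manifests and its status variable are hypothetical.

```shell
# Hypothetical CI lint step: run "stein apply" on every given manifest
# and report failure if any of them violates a rule.
lint_manifests() {
  # $@: manifest paths to check
  lint_status=0
  for manifest in "$@"; do
    # Keep going after a failure so all violations are reported in one run.
    stein apply "${manifest}" || lint_status=1
  done
  return "${lint_status}"
}
```

Continuing after the first failure is a design choice: the pull request author sees every violation at once instead of fixing them one CI run at a time.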

Slide 114

Slide 114 text

Policy as Code
•Stein's concepts and design come from HashiCorp Sentinel.
•"Policy as Code" (PaC) is an idea put forward by HashiCorp with Sentinel.
•PaC means describing "ideal configuration files" as policy and enforcing it on the real configuration files.
IaC: infrastructure as code / PaC: policy as code
References: Policy as Code - Sentinel by HashiCorp; Why Policy as Code? - HashiCorp Blog

Slide 115

Slide 115 text

+ - apply GitHub Repository Kubernetes Cluster CI merge diff kubectl create not implemented yet  (described later)

Slide 116

Slide 116 text

+ - apply GitHub Repository Kubernetes Cluster CI merge diff kubectl create
Stein: Admission Controller
•TODO: Stein can also work as an admission controller.
  • By doing so, it can check whether YAML that violates the rules is about to be applied.
  • It is compatible with the "Push" strategy.

Slide 117

Slide 117 text

Recap

Slide 118

Slide 118 text

1. Write manifests 2. Send Pull Request •kubectl pipeline •stein lint step •Dir base delegation •apply when merged 3. Run apply (dry-run)

Slide 119

Slide 119 text

1. Write manifests 2. Send Pull Request •kubectl pipeline •stein lint step •Dir base delegation •apply when merged 3. Run apply (dry-run) We can provide the common resources to all microservices

Slide 120

Slide 120 text

•By using the monorepo style,
  • we can provide common guardrails for teams starting to develop & operate their own microservices
    • apply pipeline
    • review by central team
    • common lint step
  • Of course, it also has disadvantages
  • It's a trade-off for scaling up

Slide 121

Slide 121 text

•Thank you