Slide 1

Slide 1 text

Inside of Kubernetes Controller 2019/09/27 Kubernetes Meetup Tokyo #23 Operator Deep Dive

Slide 2

Slide 2 text

Who am I
Name: Kenta Iso (@go_vargo)
Job: Infrastructure Engineer
Kubernetes lover
Mission: Improve Kubernetes environments
CKA & CKAD

Slide 3

Slide 3 text

Purpose & Target ɾPeople who know Kubernetes Resources can understand Controller mechanism(include Implement). ɾKubernetes Controller Concept ɾKubernetes Controller mechanism, inside Controller Implement ɾComponents support Kubernetes Controller = Controller’s High Level Layer ~ Low Level Layer Target Purpose

Slide 4

Slide 4 text

Not Targeting ɾKubernetes Custom Controller + CRD Detail ※ However, Understanding controller mechanism Helps you to build custom controller. ɾFramework & SDK for Kubernetes Custom Controller e.g. KubebuilderɺOperator SDK controller-runtime, controller-tools Not Target

Slide 5

Slide 5 text

Note
This slide deck is inspired by Programming Kubernetes.
Book review blog (Japanese): https://go-vargo.hatenablog.com/entry/2019/08/05/201546
"Programming Kubernetes", published by O’Reilly Media, Inc. Authors: Stefan Schimanski, Michael Hausenblas
Chap 1~2: Kubernetes Concept, API Object
Chap 3: client-go
Chap 4: CRD
Chap 5: code-generator
Chap 6~7: Custom Controller
Chap 8~9: Custom API Server, CRD Advanced
https://programming-kubernetes.info/

Slide 6

Slide 6 text

Agenda ɾ What is Kubernetes Controllerʁ ɾ Control Loop(Reconciliation Loop) ɾController Library & Components ɾClient-Go - Informer - WorkQueue ɾController’s Cycle, Main Logic ɾController Summary - Reference

Slide 7

Slide 7 text

What is a Kubernetes Controller? ~ High Level Architecture ~

Slide 8

Slide 8 text

Kubernetes Architecture
(Diagram: one Master Node and three Worker Nodes)

Slide 9

Slide 9 text

Kubernetes High Level Architecture
(Diagram) Master: etcd, api-server, scheduler, controller-manager, cloud-controller-manager (connected to the cloud)
(Diagram) Worker: three Nodes, each running kubelet, kube-proxy, and a Container Runtime

Slide 10

Slide 10 text

Kubernetes Architecture
Kubernetes has a distributed architecture built from distributed components.
Master: general management (AuthN/AuthZ, Resource (API Object) management, container scheduling)
Worker: container execution

Slide 11

Slide 11 text

Kubernetes High Level Architecture
(Diagram) Master: etcd, api-server, scheduler, controller-manager, cloud-controller-manager (connected to the cloud)
(Diagram) Worker: three Nodes, each running kubelet, kube-proxy, and a Container Runtime

Slide 12

Slide 12 text

api-server / controller-manager
api-server: receives CREATE, UPDATE, and DELETE (CRUD) requests for API Objects and executes them. The resulting Object data is persisted to etcd (the data store).
※ api-server is the only component that accesses etcd.
controller-manager: a Controller manages a Resource (such as Deployment, Service…). controller-manager is a group of multiple controllers.

Slide 13

Slide 13 text

controller-manager
controller-manager: multiple controllers in one binary
(Diagram: Deployment, ReplicaSet, DaemonSet, Service, Job, CronJob, Endpoint, StatefulSet, …)

Slide 14

Slide 14 text

Controller and Resource
A Controller manages Resources, e.g. the ReplicaSet Controller.
(Diagram: the ReplicaSet Controller runs a Control Loop that manages Resource, Resource, … Resource N, each created from a Manifest with kind: ReplicaSet, metadata: name: xxxxxx, spec: …)

Slide 15

Slide 15 text

Controller and Resource
One Controller manages one Resource.
(Diagram: the Deployment Controller manages Deployments and creates ReplicaSets; the ReplicaSet Controller manages ReplicaSets and creates Pods)
Reference: OwnerReference https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/
A mechanism called "ownerReference" tags each child resource with its parent resource. When the parent resource is deleted, the child resources are deleted by Garbage Collection (GC).
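The parent/child cleanup described on this slide can be sketched in plain Go. This is only an illustration: the `Object` type, its `Owner` field, and `collectGarbage` are hypothetical names invented here, and the real garbage collector also supports deletion policies that this sketch ignores.

```go
package main

import "fmt"

// Object is a hypothetical, simplified API object. Owner holds the name of
// the parent object (the role ownerReference plays), or "" for a root object.
type Object struct {
	Name  string
	Owner string
}

// collectGarbage removes every object whose owner no longer exists. It
// repeats until nothing changes, so grandchildren are removed as well.
func collectGarbage(objs map[string]Object) {
	for changed := true; changed; {
		changed = false
		for name, o := range objs {
			if o.Owner == "" {
				continue // root objects are never collected
			}
			if _, ok := objs[o.Owner]; !ok {
				delete(objs, name)
				changed = true
			}
		}
	}
}

func main() {
	objs := map[string]Object{
		"deploy": {Name: "deploy"},
		"rs":     {Name: "rs", Owner: "deploy"},
		"pod-a":  {Name: "pod-a", Owner: "rs"},
		"other":  {Name: "other"},
	}
	delete(objs, "deploy") // delete the parent
	collectGarbage(objs)
	fmt.Println(len(objs)) // prints: 1 (only "other" remains)
}
```

Deleting the Deployment stand-in removes the ReplicaSet stand-in and, through it, the Pod stand-in, while the unrelated object survives.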

Slide 16

Slide 16 text

Control Loop (Reconciliation Loop)

Slide 17

Slide 17 text

Controller’s Concept
Concept: Control Loop (Reconciliation Loop)
(Diagram: a Control Loop cycling through observe, analyze, act. Observe reads the real resources (Actual State), analyze compares them with the Desired State, and act changes internal/external resources.)

Slide 18

Slide 18 text

Control Loop (Reconciliation Loop)
The Control Loop is the core concept of a Controller. ※ It is also called the Reconciliation Loop.
Control Loop flow:
1. Read the Resource's actual state
2. Change the Resource's state toward the desired state
3. Update the Resource's status
The declarative API realizes immutable infrastructure through the Control Loop.
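The three-step flow on this slide can be sketched as a tiny Go program. It is a minimal sketch under stated assumptions: `State`, `reconcile`, and `converge` are hypothetical names, and incrementing a counter stands in for creating or deleting a Pod.

```go
package main

import "fmt"

// State is a hypothetical, simplified stand-in for a ReplicaSet:
// Desired mirrors spec.replicas, Actual is the number of running Pods.
type State struct {
	Desired int
	Actual  int
	Status  string
}

// reconcile performs one pass of the loop: read the actual state,
// move it one step toward the desired state, then update the status.
func reconcile(s *State) {
	switch {
	case s.Actual < s.Desired:
		s.Actual++ // stands in for "create a Pod"
	case s.Actual > s.Desired:
		s.Actual-- // stands in for "delete a Pod"
	}
	s.Status = fmt.Sprintf("%d/%d ready", s.Actual, s.Desired)
}

// converge runs reconcile until actual == desired and returns the pass count.
func converge(s *State) int {
	passes := 0
	for s.Actual != s.Desired {
		reconcile(s)
		passes++
	}
	return passes
}

func main() {
	s := &State{Desired: 3, Actual: 1}
	passes := converge(s)
	fmt.Println(passes, s.Status) // prints: 2 3/3 ready
}
```

Note that `reconcile` never looks at what changed, only at the difference between actual and desired state, which is what makes the loop safe to run repeatedly.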

Slide 19

Slide 19 text

ReplicaSet Control Loop Example
(Diagram: the ReplicaSet Controller observes the Replicas of the Desired State and the Replicas of the Actual State, then acts to make them match)

Slide 20

Slide 20 text

ReplicaSet Apply ~ Container Execution
(Diagram: User; Master with etcd, api-server, scheduler, ReplicaSet Controller; Worker Node with kubelet, kube-proxy, Container Runtime)
Manifest: kind: ReplicaSet, metadata: name: xxxxxx, spec: …
⓪ Apply the ReplicaSet Resource: kubectl apply -f manifest.yaml
① ReplicaSet is created

Slide 21

Slide 21 text

ReplicaSet Apply ~ Container Execution
② ReplicaSet Controller detects ReplicaSet creation
②' The Controller creates an empty Pod, which does not have Spec.nodeName yet
※ Not yet delivered to a Worker Node

Slide 22

Slide 22 text

ReplicaSet Apply ~ Container Execution
③ Scheduler detects Pod creation
③' Scheduler enqueues the Pod to its scheduling queue because pod.Spec.nodeName is empty

Slide 23

Slide 23 text

ReplicaSet Apply ~ Container Execution
④ kubelet also detects Pod creation
④' kubelet skips the Pod because pod.Spec.nodeName is empty

Slide 24

Slide 24 text

ReplicaSet Apply ~ Container Execution
⑤ Scheduler pops the Pod from the queue
⑤' Scheduler schedules the Pod to a node that it can be assigned to
⑤'' Scheduler updates pod.Spec.nodeName

Slide 25

Slide 25 text

ReplicaSet Apply ~ Container Execution
⑥ kubelet detects the Pod update
⑥' kubelet starts up the container

Slide 26

Slide 26 text

ReplicaSet Apply ~ Container Execution
⑦ kubelet sends a request to api-server, and api-server updates pod.Status (status.conditions)

Slide 27

Slide 27 text

ReplicaSet Apply ~ Container Execution
Now suppose the container died for some reason.

Slide 28

Slide 28 text

ReplicaSet Apply ~ Container Execution
⑧ kubelet sends a request to api-server, and api-server updates pod.Status (status.conditions)
The Pod is Terminating.

Slide 29

Slide 29 text

ReplicaSet Apply ~ Container Execution
The Pod is Terminating.
⑨ ReplicaSet Controller detects the Pod update
⑨' ReplicaSet Controller deletes the Pod

Slide 30

Slide 30 text

ReplicaSet Apply ~ Container Execution
⑩ ReplicaSet Controller reconciles in order to reach the Desired State: the Controller recreates the Pod
And so this loops… (② ~ ⑩ repeat)

Slide 31

Slide 31 text

Appendix) Source Code
② https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L487
③ https://github.com/kubernetes/kubernetes/blob/v1.16.0/pkg/scheduler/eventhandlers.go#L436
④ https://github.com/kubernetes/kubernetes/blob/v1.16.0/pkg/kubelet/config/apiserver.go#L33
⑤ https://github.com/kubernetes/kubernetes/blob/v1.16.0/pkg/scheduler/scheduler.go#L535
⑥ https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L803
⑦ https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/kubelet/kubelet.go#L1527
⑧ https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/kubelet/kubelet.go#L2006
⑨ https://github.com/kubernetes/kubernetes/blob/v1.16.0/pkg/controller/replicaset/replica_set.go#L535
⑩ https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L487

Slide 32

Slide 32 text

Controller and Components Each Component’s Responsibility ɾapi-server: Resource CRUD ɾscheduler: Resource Scheduling ɾkubelet: Container starts up ɾcontroller: Reconcile Resource Each component concentrates on its responsibilities. = There is no Orchestra conductor who controls the whole and gives instructions.

Slide 33

Slide 33 text

Controller and Harmony
In Kubernetes, this does not mean that the components deliberately cooperate with each other. Even though no component is commanded, overall consistency is maintained.
So… let’s think of each component as a single Controller. Each Controller concentrates on running its own Control Loop.
As a result, strangely enough, you can see that Kubernetes is in harmony as a whole.

Slide 34

Slide 34 text

Kubernetes is jazz improv
Kubernetes is not orchestration. As in jazz improv, the whole emerges from the players (Controllers) each concentrating on their own play (Control Loop).
Kubernetes is more jazz improv than orchestration. (Joe Beda, Co-founder)
Core Kubernetes: Jazz Improv over Orchestration https://blog.heptio.com/core-kubernetes-jazz-improv-over-orchestration-a7903ea92ca

Slide 35

Slide 35 text

Why Controller executes Control Loop? When we think why Controller executes Control Loop, ʮEventʯis key factor. ※ e.g. Added, Modified, Deleted, Error… For Kubernetes, which consists of distributed components, Event is very important. It flows between each component. Reference: Events, the DNA of Kubernetes There are two way of how we think event triggers ɾEdge-driven Triggers ɾLevel-driven Triggers https://www.mgasch.com/post/k8sevents/

Slide 36

Slide 36 text

Appendix) Edge vs. Level
Edge-driven Triggers: trigger when an event occurs (a transition up or down)
Level-driven Triggers: trigger when in a specific state (low or high)
Reference: Level Triggering and Reconciliation in Kubernetes https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d

Slide 37

Slide 37 text

Why does a Controller execute the Control Loop?
To see why a controller executes the control loop, suppose that Kubernetes used a procedural process instead of Reconcile.
Actual Kubernetes: Reconcile
Assumption: Procedure
(Diagram: a manifest with kind: xxxxxxxxx, metadata: name: xxxxxx, spec: …)

Slide 38

Slide 38 text

If the Controller is procedural
If the Controller takes a procedural approach, it is driven by edge-driven triggers.
This seems to be no problem, but it has weaknesses.
(Diagram: each Event changes the current number of Replicas: → Scale Up, → Scale Down)

Slide 39

Slide 39 text

If there is no Reconcile
When there is a temporary network outage or a bug, Event information is lost.
(Diagram: an Event falls into an Outage, so the current number of Replicas misses a Scale Up or Scale Down)

Slide 40

Slide 40 text

Resync Interval and Reconcile
The Controller reconciles at every resync interval, so it can bring the state closer to the desired state even after an Event was lost.
Kubernetes = Edge-driven Trigger + Level-driven Trigger
(Diagram: Events and an Outage on a timeline of the current number of Replicas (Scale Up, Scale Down), with a Reconcile at each resync interval)
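The difference between the two trigger styles, and why a periodic resync repairs a lost Event, can be sketched as follows. This is an illustrative model only: `Event`, `edgeDriven`, and `levelDriven` are hypothetical names, and a `nil` entry is an assumed way to represent an Event lost during an outage.

```go
package main

import "fmt"

// Event is a hypothetical scale event carrying a replica delta (+1 or -1).
type Event struct{ Delta int }

// edgeDriven applies only the events it actually receives. A nil entry
// models an event lost during an outage.
func edgeDriven(start int, events []*Event) int {
	replicas := start
	for _, e := range events {
		if e != nil {
			replicas += e.Delta
		}
	}
	return replicas
}

// levelDriven ignores individual events: on resync it compares the actual
// count with the desired count and corrects any difference.
func levelDriven(actual, desired int) int {
	if actual != desired {
		actual = desired
	}
	return actual
}

func main() {
	// Desired replicas go 3 -> 4 (scale up) -> 3 (scale down),
	// but the scale-down event is lost.
	events := []*Event{{Delta: +1}, nil}
	afterEdge := edgeDriven(3, events)
	fmt.Println("edge-driven only:", afterEdge)             // prints: edge-driven only: 4
	fmt.Println("after resync:", levelDriven(afterEdge, 3)) // prints: after resync: 3
}
```

The edge-driven path drifts permanently once an Event is missed, while the level-driven check converges as soon as it runs, which is the motivation for combining both.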

Slide 41

Slide 41 text

Components which support Controller ~ Middle Level Architecture ~

Slide 42

Slide 42 text

Terminology
Kind: the kind of an API Object (e.g. Deployment, Service)
Resource: used with the same meaning as Kind, but as the name of the HTTP endpoint. A Resource is written in lower case and plural form (e.g. pods, services).
Object: a created instance of an API Object. It is persisted in etcd.

Slide 43

Slide 43 text

Library under the Controller
Library: client-go. Components: Informer, Lister, WorkQueue
Library: api-machinery. Components: runtime.Object, Scheme
Library: code-generator
(Some of these are marked "Out of range to explain" on the slide.)

Slide 44

Slide 44 text

Appendix) Custom Controller SDK
Framework: Kubebuilder, Operator SDK
Library (High Level): controller-runtime, controller-tools
Library (Low Level): client-go, api-machinery, etc…
Component: Informer, Lister, Scheme, runtime.Object, WorkQueue, etc…

Slide 45

Slide 45 text

Library under the Controller
client-go: the official Kubernetes client library. It is used for Kubernetes development.
api-machinery: a library for Kubernetes API Objects and Kubernetes-API-like Objects, e.g. conversion, decoding, encoding, etc… A Controller manages API Objects, so it needs this library.
code-generator: a source code generator for Informers, Listers, clientsets, and DeepCopy. It is mainly used for Custom Controller development.

Slide 46

Slide 46 text

Component under Controller
Details of each component are described later.
Informer: watches Object Events and stores data in the in-memory cache
Lister: gets object data from the in-memory cache
WorkQueue: a queue that stores Control Loop items
runtime.Object: the API Object interface (out of scope for this talk)
Scheme: associates Go types with the Kubernetes API (out of scope for this talk)

Slide 47

Slide 47 text

client-go Informer ~ Low Level Architecture ~

Slide 48

Slide 48 text

client-go and Informer
Library: client-go
Component: Informer (Reflector, DeltaFIFO, Indexer, Store, Lister)

Slide 49

Slide 49 text

Informer
An Informer watches Object Events (Added, Updated, Deleted…).
If a controller queried api-server for object status every time it wanted to monitor Object changes, api-server would be heavily loaded.
⇒ The Informer stores object data in an in-memory cache. Controllers read from the cache, which solves this problem.
(Diagram: Controllers read from the Informer's in-memory-cache; the Informer watches api-server)

Slide 50

Slide 50 text

Appendix) Informer Sample Code
    func main() {
        ...
        clientset, err := kubernetes.NewForConfig(config)

        // Create InformerFactory
        informerFactory := informers.NewSharedInformerFactory(clientset, time.Second*30)

        // Create pod informer by informerFactory
        podInformer := informerFactory.Core().V1().Pods()

        // Add EventHandler to informer
        podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
            AddFunc:    func(new interface{}) { log.Println("Added") },
            UpdateFunc: func(old, new interface{}) { log.Println("Updated") },
            DeleteFunc: func(old interface{}) { log.Println("Deleted") },
        })

        // Start Go routines
        informerFactory.Start(wait.NeverStop)
        // Wait until finish caching with List API
        informerFactory.WaitForCacheSync(wait.NeverStop)

        // Create Pod Lister
        podLister := podInformer.Lister()
        // Get List of pods
        _, err = podLister.List(labels.Nothing())
        …
    }
https://github.com/govargo/kubecontorller-book-sample-snippet/blob/master/02/podinformer/podinformer.go

Slide 51

Slide 51 text

Appendix) Shared Informer
In practice we do not use an Informer by itself; we use a Shared Informer instead. A Shared Informer shares the cache for the same Resource within a single binary.
(Diagram: inside kube-controller-manager, the Deployment, ReplicaSet, DaemonSet, Service, Job, … controllers use a SharedInformer, a shared cache for the same resource)

Slide 52

Slide 52 text

Reference: Informer and WorkQueue Overview https://github.com/kubernetes/sample-controller/blob/master/docs/controller-client-go.md

Slide 53

Slide 53 text

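Detail of Informer
(Diagram: inside the Informer, the Reflector does List & Watch (e.g. core/v1 Pod) and feeds the DeltaFIFO queue; HandleDeltas passes items through the Indexer into the Store (in-memory-cache); the Lister serves Get or List requests)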

Slide 54

Slide 54 text

Detail of Informer ~Cache Flow~
① ListAndWatch (Reflector)

Slide 55

Slide 55 text

Detail of Informer ~Cache Flow~
② Pop (from the DeltaFIFO queue)

Slide 56

Slide 56 text

Detail of Informer ~Cache Flow~
③ HandleDeltas

Slide 57

Slide 57 text

Detail of Informer ~Cache Flow~
④ indexer.Add (into the in-memory-cache)

Slide 58

Slide 58 text

Detail of Informer ~Cache Flow~
② Pop, ③ HandleDeltas (repeated for the next item)

Slide 59

Slide 59 text

Detail of Informer ~Cache Flow~
④ indexer.Add (repeated for the next item)

Slide 60

Slide 60 text

Detail of Informer ~Cache Flow~
Lister: Get or List request

Slide 61

Slide 61 text

Detail of Informer ~Cache Flow~
Lister: a Get or List request is served through indexer.GetByKey or indexer.List

Slide 62

Slide 62 text

Appendix) Informer cache Source Code
① https://github.com/kubernetes/client-go/blob/release-13.0/tools/cache/reflector.go#L188
② https://github.com/kubernetes/client-go/blob/release-13.0/tools/cache/controller.go#L153
③ https://github.com/kubernetes/client-go/blob/release-13.0/tools/cache/shared_informer.go#L455
④ https://github.com/kubernetes/client-go/blob/release-13.0/tools/cache/shared_informer.go#L464

Slide 63

Slide 63 text

Informer and Component
Informer: watches Object Events and stores data in the in-memory cache
Reflector: runs ListAndWatch against api-server
DeltaFIFO: a FIFO queue that temporarily holds object data
Indexer: getter / setter for the in-memory cache
Store: the in-memory cache
Lister: gets object data from the in-memory cache via the Indexer
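How these components hand data to each other can be sketched without client-go. The following is a simplified model, not the library's implementation: `Delta`, `Store`, `handleDeltas`, and `get` are hypothetical names, a slice stands in for DeltaFIFO, and a map stands in for the Indexer plus Store.

```go
package main

import "fmt"

// Delta is a hypothetical change record, loosely modeled on what the
// Reflector pushes into DeltaFIFO.
type Delta struct {
	Type string // "Added", "Updated" or "Deleted"
	Key  string // e.g. "default/nginx"
	Obj  string // simplified object payload
}

// Store plays the role of the in-memory cache behind the Indexer.
type Store map[string]string

// handleDeltas pops deltas in FIFO order and applies them to the store,
// which is roughly the job of HandleDeltas in the diagram.
func handleDeltas(fifo []Delta, store Store) {
	for _, d := range fifo {
		switch d.Type {
		case "Added", "Updated":
			store[d.Key] = d.Obj
		case "Deleted":
			delete(store, d.Key)
		}
	}
}

// get is the Lister side: it reads only from the cache, never from the server.
func get(store Store, key string) (string, bool) {
	obj, ok := store[key]
	return obj, ok
}

func main() {
	fifo := []Delta{
		{"Added", "default/nginx", "nginx:1.16"},
		{"Added", "default/redis", "redis:5"},
		{"Updated", "default/nginx", "nginx:1.17"},
		{"Deleted", "default/redis", ""},
	}
	store := Store{}
	handleDeltas(fifo, store)
	obj, ok := get(store, "default/nginx")
	fmt.Println(obj, ok, len(store)) // prints: nginx:1.17 true 1
}
```

Reads cost nothing on the server side because every lookup is answered from the map that the delta handler keeps up to date.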

Slide 64

Slide 64 text

client-go WorkQueue ~ Low Level Architecture ~

Slide 65

Slide 65 text

client-go and WorkQueue
Library: client-go
Component: WorkQueue (RateLimitingQueue, DelayedQueue, …)

Slide 66

Slide 66 text

WorkQueue
The WorkQueue is a separate queue from DeltaFIFO. It stores the items of the Control Loop. Reconcile is executed once for each item stored in the WorkQueue.
Basically, the Controller enqueues an item to the WorkQueue when an Event occurs.
(Diagram: Events (Added, Updated, Deleted) → Controller → WorkQueue)
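A queue of Control Loop items can be sketched in a few lines of Go. This is a hypothetical `Queue` type written for illustration, not client-go's workqueue; it models only two properties: FIFO order, and collapsing a key that is already waiting so that one Reconcile covers several Events for the same object.

```go
package main

import "fmt"

// Queue is a hypothetical FIFO of keys that ignores a key already waiting.
type Queue struct {
	items   []string
	pending map[string]bool
}

// NewQueue returns an empty queue.
func NewQueue() *Queue { return &Queue{pending: map[string]bool{}} }

// Add enqueues key unless it is already waiting to be processed.
func (q *Queue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.items = append(q.items, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *Queue) Get() (key string, ok bool) {
	if len(q.items) == 0 {
		return "", false
	}
	key, q.items = q.items[0], q.items[1:]
	delete(q.pending, key)
	return key, true
}

// Len reports how many keys are waiting.
func (q *Queue) Len() int { return len(q.items) }

func main() {
	q := NewQueue()
	for _, k := range []string{"default/nginx", "default/nginx", "default/redis"} {
		q.Add(k)
	}
	fmt.Println(q.Len()) // prints: 2
	k, _ := q.Get()
	fmt.Println(k) // prints: default/nginx
}
```

Storing keys rather than whole objects keeps the queue small, and the worker can always fetch the latest object state from the cache when it processes the key.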

Slide 67

Slide 67 text

Appendix) WorkQueue Sample Code
    func main() {
        ...
        clientset, err := kubernetes.NewForConfig(config)

        // Create InformerFactory
        informerFactory := informers.NewSharedInformerFactory(clientset, time.Second*30)

        // Create pod informer by informerFactory
        podInformer := informerFactory.Core().V1().Pods()

        // Create RateLimitQueue
        queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
        // shutdown when process ends
        defer queue.ShutDown()

        // Add EventHandler to informer
        podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
            AddFunc: func(obj interface{}) {
                // Build the "namespace/name" key and enqueue it
                key, err := cache.MetaNamespaceKeyFunc(obj)
                if err != nil {
                    runtime.HandleError(err)
                    return
                }
                queue.Add(key)
                log.Println("Added: " + key)
            },
            UpdateFunc: func(old, new interface{}) { … },
            DeleteFunc: func(old interface{}) { … },
        })
        …
    }
https://github.com/govargo/kubecontorller-book-sample-snippet/blob/master/02/workqueue/enqueuePod.go

Slide 68

Slide 68 text

Detail of WorkQueue
(Diagram: Informer (Reflector with List & Watch, DeltaFIFO queue, Indexer, in-memory-cache store) → Resource Event Handler → WorkQueue → Process Item (Controller Logic))

Slide 69

Slide 69 text

Detail of WorkQueue ~Enqueue~
An Event arrives through List & Watch.

Slide 70

Slide 70 text

Detail of WorkQueue ~Enqueue~
① Watch

Slide 71

Slide 71 text

Detail of WorkQueue ~Enqueue~
② Pop

Slide 72

Slide 72 text

Detail of WorkQueue ~Enqueue~
③ AddFunc / UpdateFunc / DeleteFunc

Slide 73

Slide 73 text

Detail of WorkQueue ~Enqueue~
④ workqueue.Add
The enqueued key has the form namespace/name, e.g. default/nginx.
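The namespace/name key shown on this slide can be built and split with a few lines of Go. This is a stdlib-only sketch: `makeKey` and `splitKey` are hypothetical helpers written for illustration (the sample code in this deck obtains the key from `cache.MetaNamespaceKeyFunc` instead), and treating an empty namespace as a cluster-scoped object is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// makeKey builds a "namespace/name" key; objects without a namespace
// use the bare name.
func makeKey(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "/" + name
}

// splitKey reverses makeKey and rejects keys with more than one slash.
func splitKey(key string) (namespace, name string, err error) {
	parts := strings.Split(key, "/")
	switch len(parts) {
	case 1:
		return "", parts[0], nil
	case 2:
		return parts[0], parts[1], nil
	}
	return "", "", fmt.Errorf("unexpected key format: %q", key)
}

func main() {
	key := makeKey("default", "nginx")
	ns, name, err := splitKey(key)
	fmt.Println(key, ns, name, err) // prints: default/nginx default nginx <nil>
}
```

A worker that takes a key from the queue would split it this way before looking the object up in the cache.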

Slide 74

Slide 74 text

Detail of WorkQueue ~Enqueue~
③ AddFunc / UpdateFunc / DeleteFunc (next Event)

Slide 75

Slide 75 text

Detail of WorkQueue ~Enqueue~
④ workqueue.Add (next Event)

Slide 76

Slide 76 text

Detail of WorkQueue ~Dequeue~
⑤ workqueue.Get

Slide 77

Slide 77 text

Appendix) Informer enqueue Source Code
① https://github.com/kubernetes/client-go/blob/master/tools/cache/reflector.go#L267
② https://github.com/kubernetes/client-go/blob/release-13.0/tools/cache/controller.go#L153
③ https://github.com/kubernetes/client-go/blob/release-13.0/tools/cache/controller.go#L198
   https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L153 ※ in the case of the ReplicaSet Controller
④ https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L417 ※ in the case of the ReplicaSet Controller
⑤ https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L438 ※ in the case of the ReplicaSet Controller

Slide 78

Slide 78 text

Appendix) Informer Resync Period
Resync Period is an option of the InformerFactory. The Informer watches object events from api-server. Every time the Resync Period passes, UpdateFunc is called back regardless of which events have occurred. As a result, Reconcile is executed again.
※ Resync reads from the in-memory cache (not from api-server). Resync (cache sync) and Relist (list from api-server) are different.
(Diagram: after informer.Start, the Reflector does List and then Watch; Added and Updated Events call AddFunc and UpdateFunc on the Handler, and UpdateFunc is also called at each resync period)

Slide 79

Slide 79 text

Controller’s Cycle Main Logic

Slide 80

Slide 80 text

Controller’s Cycle
Let’s confirm the Reconcile flow: Informer + WorkQueue + Controller.
(Diagram: Informer → Resource Event Handler → WorkQueue → Process Item (Controller Main Logic))

Slide 81

Slide 81 text

Controller’s Cycle
[Assumption] There are two Pods in etcd. An Update Event occurs from here.
(Diagram: the Informer watches api-server)

Slide 82

Slide 82 text

Controller’s Cycle
An Update Event reaches the Informer through watch.

Slide 83

Slide 83 text

Controller’s Cycle
② addFunc / updateFunc / deleteFunc

Slide 84

Slide 84 text

Controller’s Cycle
③ workQueue.Add

Slide 85

Slide 85 text

Controller’s Cycle
③ workQueue.Add (continued)

Slide 86

Slide 86 text

Controller’s Cycle
④ workQueue.Get

Slide 87

Slide 87 text

Controller’s Cycle
④ workQueue.Get (continued)

Slide 88

Slide 88 text

Controller’s Cycle
⑤ SyncHandler (Reconcile)

Slide 89

Slide 89 text

Controller’s Cycle
When Reconcile finishes successfully:
⑥ workQueue.Forget
⑦ workQueue.Done
The item of the Control Loop is removed from the WorkQueue completely.

Slide 90

Slide 90 text

Controller’s Cycle
When Reconcile ends with an Error:
⑤' workQueue.AddRateLimited
The Controller requeues the item to the WorkQueue, and Reconcile will be executed again.

Slide 91

Slide 91 text

Controller’s Cycle
When Reconcile ends with an Error (continued):
⑤' workQueue.AddRateLimited
The Controller requeues the item to the WorkQueue, and Reconcile will be executed again.

Slide 92

Slide 92 text

Controller’s Cycle
(Left side of this slide) Every time an event occurs, items keep being stored in the WorkQueue.
(Right side of this slide) The Controller processes the items in the WorkQueue and executes Reconcile.
This loop continues endlessly until the Controller stops.

Slide 93

Slide 93 text

Controller’s Basic Strategy
Read from the in-memory cache. Write to api-server.
※ However, if we updated an object in the cache directly, it would be very difficult to guarantee consistency. So when we update an object, we use DeepCopy (get cloned data).
e.g. kubernetes/pkg/controller/replicaset/replica_set.go
    rs = rs.DeepCopy()
    newStatus := calculateStatus(rs, filteredPods, manageReplicasErr)

    // Always updates status as pods come up or die.
    updatedRS, err := updateReplicaSetStatus(rsc.kubeClient.AppsV1().ReplicaSets(rs.Namespace), rs, newStatus)
https://github.com/kubernetes/kubernetes/blob/release-1.15/pkg/controller/replicaset/replica_set.go#L611
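Why the copy matters can be shown with a small stand-alone example. This is a sketch under assumptions: the `ReplicaSet` struct here is a hypothetical, trimmed-down type with a hand-written `DeepCopy`, whereas real API types get their DeepCopy methods from code-generator.

```go
package main

import "fmt"

// ReplicaSet is a hypothetical, trimmed-down object. The map field is what
// makes a plain struct copy unsafe: both copies would share the same map.
type ReplicaSet struct {
	Name   string
	Labels map[string]string
	Ready  int
}

// DeepCopy returns a copy that shares no memory with the original.
func (rs *ReplicaSet) DeepCopy() *ReplicaSet {
	out := *rs
	out.Labels = make(map[string]string, len(rs.Labels))
	for k, v := range rs.Labels {
		out.Labels[k] = v
	}
	return &out
}

func main() {
	// cache stands in for the Informer's in-memory cache.
	cache := map[string]*ReplicaSet{
		"default/web": {Name: "web", Labels: map[string]string{"app": "web"}},
	}

	cached := cache["default/web"]
	work := cached.DeepCopy() // mutate the copy, never the cached object
	work.Ready = 3
	work.Labels["app"] = "changed"

	fmt.Println(cached.Ready, cached.Labels["app"]) // prints: 0 web
}
```

Because the cached object is shared by every reader of the cache, leaving it untouched means no other reader ever sees a half-finished modification.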

Slide 94

Slide 94 text

Controller’s Main Logic
worker: an endless loop of processNextWorkItem
processNextWorkItem: operates the WorkQueue (Get, Add) and calls the Reconcile logic
syncHandler: this is the Reconcile logic itself
syncHandler reconciles regardless of the Event type (Add, Update, Delete).
(Diagram: worker → processNextWorkItem → syncHandler)
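The relationship between worker, processNextWorkItem, and syncHandler can be sketched with plain Go. This is a simplified model, not the ReplicaSet controller's code: the `controller` struct, its slice-based queue, the `retries` map, and the `failFirst` helper are all hypothetical. Unlike a real worker, which blocks until the queue is shut down, this worker returns once the queue is empty so the example terminates.

```go
package main

import (
	"errors"
	"fmt"
)

// controller is a hypothetical stand-in: queue holds keys, retries counts
// how often each key has been requeued after a failed sync.
type controller struct {
	queue       []string
	retries     map[string]int
	syncHandler func(key string) error
}

// processNextWorkItem takes one key, runs the sync handler, and requeues
// the key on error. It returns false when the queue is empty.
func (c *controller) processNextWorkItem() bool {
	if len(c.queue) == 0 {
		return false
	}
	key := c.queue[0]
	c.queue = c.queue[1:]
	if err := c.syncHandler(key); err != nil {
		c.retries[key]++               // real code would rate-limit here
		c.queue = append(c.queue, key) // requeue for another Reconcile
		return true
	}
	delete(c.retries, key) // success: forget the retry history
	return true
}

// worker loops until there is nothing left to process.
func (c *controller) worker() {
	for c.processNextWorkItem() {
	}
}

// failFirst returns a sync handler that fails its first n calls; it exists
// only to exercise the requeue path.
func failFirst(n int, calls *int) func(string) error {
	return func(key string) error {
		*calls++
		if *calls <= n {
			return errors.New("transient failure")
		}
		return nil
	}
}

func main() {
	calls := 0
	c := &controller{
		queue:       []string{"default/nginx"},
		retries:     map[string]int{},
		syncHandler: failFirst(2, &calls),
	}
	c.worker()
	fmt.Println(calls, len(c.queue)) // prints: 3 0
}
```

The sync handler receives only a key, not the Event type, which matches the slide's point that Reconcile runs the same way for Add, Update, and Delete.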

Slide 95

Slide 95 text

Appendix) ReplicaSet Controller Source Code
ReplicaSet Controller, Kubernetes v1.16 (worker → processNextWorkItem → syncReplicaSet)
worker: https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L432
processNextWorkItem: https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L437
syncReplicaSet: https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go#L562

Slide 96

Slide 96 text

Appendix) Sync of in-memory-cache and etcd
The Informer synchronizes object data from etcd to the in-memory cache. You may wonder what happens when the in-memory cache data differs from the data in etcd. This is not a problem: every Object has a resourceVersion.
If the resourceVersion in etcd and in the in-memory cache differ, an error occurs when the Controller updates the object. The Controller requeues the item and reconciles until Reconcile finishes successfully.
(Diagram: the cache holds Rv1 while etcd holds Rv2; an Update based on Rv1 fails with "version too old", the item is requeued, and Reconcile is retried against Rv2)
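The version check and retry described here can be sketched as optimistic concurrency in plain Go. Everything in this block is an assumption made for illustration: `Server`, `Object`, `errConflict`, and `reconcile` are hypothetical, an integer stands in for resourceVersion (real resourceVersions are opaque strings), and refreshing the cached copy directly from the server stands in for the Informer catching up through watch.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("conflict: resourceVersion is stale")

// Object is a hypothetical stored object with an integer resourceVersion.
type Object struct {
	ResourceVersion int
	Replicas        int
}

// Server stands in for api-server + etcd holding a single object.
type Server struct{ obj Object }

// Update succeeds only when the caller's copy is based on the stored version.
func (s *Server) Update(o Object) error {
	if o.ResourceVersion != s.obj.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.obj = o
	return nil
}

// reconcile tries an update from the cached copy; on conflict it refreshes
// the copy and retries, which plays the role of requeue + Reconcile.
func reconcile(s *Server, cached Object, replicas int) (attempts int) {
	for {
		attempts++
		cached.Replicas = replicas
		if err := s.Update(cached); err == nil {
			return attempts
		}
		cached = s.obj // the cache catches up with the server
	}
}

func main() {
	s := &Server{obj: Object{ResourceVersion: 2, Replicas: 1}}
	stale := Object{ResourceVersion: 1, Replicas: 1} // cache still at Rv1
	n := reconcile(s, stale, 3)
	fmt.Println(n, s.obj.ResourceVersion, s.obj.Replicas) // prints: 2 3 3
}
```

The first attempt is rejected because it is based on Rv1, and the second attempt succeeds against Rv2, so a stale cache costs one extra Reconcile rather than a lost or overwritten update.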

Slide 97

Slide 97 text

Terminology (review)
Informer: watches Object Events and stores object data in the in-memory cache. It also adds Control Loop items to the WorkQueue via the EventHandler.
Lister: gets object data from the in-memory cache via the Indexer
WorkQueue: a queue that stores Control Loop items. These items are the targets of the Reconcile logic. If Reconcile ends with an error, the Controller requeues the item to the WorkQueue and executes Reconcile again.

Slide 98

Slide 98 text

Controller Summary

Slide 99

Slide 99 text

Controller Summary ɾController realizes declarative API by Control Loop (Reconciliation Loop) ɾKubernetes has distributed component. Event associates each component. ɾclient-go, apimachinery, code-generator are Library for Controller. ɾInformer has two important role. ᶃ Store object data to in-memory-cache ᶄ Add items to WorkQueue via EventHandler ɾItems which are stored in WorkQueue is processed by Reconcile.

Slide 100

Slide 100 text

Step up to Deep Dive ɾSample Controller ɾKubernetes/Kubernetes - Deployment Controller - ReplicaSet Controller ɾMake Custom Controller(+ CRD) Kubebuilder: https://book.kubebuilder.io/ Operator SDK: https://github.com/operator-framework/operator-sdk Link: https://github.com/kubernetes/sample-controller mkdir -p $GOPATH/src/k8s.io && cd $GOPATH/src/k8s.io && git clone https://github.com/kubernetes/sample-controller.git export GO111MODULE=on go build -o sample-controller . ./sample-controller -kubeconfig $HOME/.kube/config https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/deployment/deployment_controller.go https://github.com/kubernetes/kubernetes/blob/release-1.16/pkg/controller/replicaset/replica_set.go

Slide 101

Slide 101 text

Thank you

Slide 102

Slide 102 text

Reference

Slide 103

Slide 103 text

Reference ɾWeb Article - https://kubernetes.io/docs/concepts/architecture/nodes/ - https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/ - https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md - A deep dive into Kubernetes controllers (https://engineering.bitnami.com/articles/a-deep-dive-into-kubernetes-controllers.html) - Core Kubernetes: Jazz Improv over Orchestration (https://blog.heptio.com/core-kubernetes-jazz-improv-over-orchestration-a7903ea92ca) - Events, the DNA of Kubernetes(https://www.mgasch.com/post/k8sevents/) ɾPresentation(Japanes) - Kubernete Meetup Tokyo #18 - Kubebuilder/controller-runtime ೖ໳ (https://www.slideshare.net/pfi/kubernete-meetup-tokyo-18-kubebuildercontrollerruntime) - KubernetesͷιʔείʔυϦʔσΟϯάೖ໳ (https://speakerdeck.com/smatsuzaki/kubernetesfalsesosukodorideinguru-men) ɾBook - Programming Kubernetes (https://programming-kubernetes.info/)

Slide 104

Slide 104 text

Reference ɾRepository - Kubernetes(https://github.com/kubernetes/kubernetes) - Sample Controller(https://github.com/kubernetes/sample-controller) - client-go(https://github.com/kubernetes/client-go) - apimachinery(https://github.com/kubernetes/apimachinery) - codegenerator(https://github.com/kubernetes/code-generator) - what-happens-when-k8s(https://github.com/jamiehannaford/what-happens-when-k8s)