touch your system during normal operations, you have a bug. The definition of normal changes as your systems grow.
Carla Geisser, Google SRE
Chapter 5 - Eliminating Toil
Maintained Generic Automation
04 Internally Maintained System-specific Automation
05 Systems That Don’t Need Any Automation

Automation Evolution
Site Reliability Engineering, Chapter 7 - A Hierarchy of Automation Classes
What’s the Difference between the Procedural Model and the Reconciliation Model?

[Figure: Procedural Model: Run Script, Deploy, Server v2. Reconciliation Model: Update Desired State, Reconciler (Observe, Reconcile), Server v2.]

Advantages of k8s #01

[Figure: Failure. Procedural Model: Run Script, Failure, Script: Fix B. Reconciliation Model: Failure, Reconciler (Observe, Reconcile).]

[Figure: Recovery. Procedural Model: Run Script, Script: Fix B, Script: +Sc. Reconciliation Model: Desired and Current state (Server v3, Server v2), Reconciler (Observe, Reconcile), Recovery.]
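The contrast on these slides can be sketched in a few lines of Go. This is a minimal illustration rather than how Kubernetes implements it: `State`, `flakyDeployer`, `runProcedural`, and `runReconciler` are hypothetical names, and the versions (current v2, desired v3) are assumed for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a hypothetical stand-in for a resource: just a server version.
type State struct {
	Version int
}

// flakyDeployer is a hypothetical deploy action that fails a fixed number of times.
type flakyDeployer struct {
	failures int
}

// Deploy moves current to version v unless a simulated failure is pending.
func (d *flakyDeployer) Deploy(current *State, v int) error {
	if d.failures > 0 {
		d.failures--
		return errors.New("deploy failed")
	}
	current.Version = v
	return nil
}

// runProcedural runs the "script" once. If it fails, nothing retries it;
// someone has to notice and run a fix script by hand.
func runProcedural(d *flakyDeployer, current *State, target int) error {
	return d.Deploy(current, target)
}

// runReconciler repeats observe -> compare -> act. The desired state is kept,
// so a failed attempt is simply retried on the next loop.
func runReconciler(d *flakyDeployer, desired State, current *State, maxLoops int) bool {
	for i := 0; i < maxLoops; i++ {
		if current.Version == desired.Version { // observe and compare
			return true
		}
		if err := d.Deploy(current, desired.Version); err != nil { // act
			fmt.Println("reconcile error, will retry:", err)
		}
	}
	return current.Version == desired.Version
}

func main() {
	p := State{Version: 2}
	err := runProcedural(&flakyDeployer{failures: 1}, &p, 3)
	fmt.Println("procedural: version", p.Version, "err:", err)

	r := State{Version: 2}
	ok := runReconciler(&flakyDeployer{failures: 1}, State{Version: 3}, &r, 5)
	fmt.Println("reconciliation: version", r.Version, "converged:", ok)
}
```

With one injected failure, the procedural run stops at v2 with an error, while the reconciler reaches v3 on a later loop without anyone writing a fix script.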
The Automation Platform Can Be Further Extended

[Figure: clients Call the Platform API at the API Server Endpoints; a CRD Defines a Custom API, which adds Extended API endpoints.]

API: /apis/apps/v1/namespaces/default/deployments
Extended API: /apis/<Group>/<Version>/namespaces/default/<Kind>
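The endpoint pattern above can be written as a small helper. This is only a sketch: `resourcePath` is a hypothetical function, and the `example.com` group and `mysqlclusters` resource are made up for illustration. Note that in real clusters the last path segment is the lowercase plural resource name (as in `deployments`), not the Kind itself.

```go
package main

import (
	"fmt"
	"path"
)

// resourcePath builds the namespaced collection path for a resource in a
// named API group. The resource argument is the lowercase plural name.
func resourcePath(group, version, namespace, resource string) string {
	return "/" + path.Join("apis", group, version, "namespaces", namespace, resource)
}

func main() {
	// Built-in resource from the slide.
	fmt.Println(resourcePath("apps", "v1", "default", "deployments"))

	// A hypothetical CRD-backed resource follows the same pattern.
	fmt.Println(resourcePath("example.com", "v1alpha1", "default", "mysqlclusters"))
}
```

The first call yields the Deployment path shown on the slide; a custom resource gets a path of the same shape as soon as its CRD is registered.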
Automate Kubernetes Cluster Management Using k8s (Z Lab)

[Figure: a Kubernetes Cluster CRD is Registered with the API Server; Reconcilers Call the API to Observe Kubernetes Cluster CRs and Reconcile the Kubernetes Clusters.]
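[Figure: CRD, CR, Reconciler, API Server: Register, Call API, Observe, Reconcile.]

If you build on Kubernetes, you do not need to develop an API server or client, so you can concentrate on the Reconciler, which holds the business logic, and on designing the CRD it processes.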
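Map

Optimistic Resource Lock

[Figure: candidates try to update the CM; one gets "Success to update CM", the others get "ResourceVersion is too Old..".]

Only the Leader starts the Controller; the remaining Followers become hot standbys. Pay attention to the permissions on the CM.

Just set LE configuration to Manager:

    ctrlOpts := ctrl.Options{
        LeaderElection:          leConfig,
        LeaderElectionNamespace: leNamespace,
        LeaderElectionID:        leName,
    }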
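Server. Only these 2 binaries are started, so it is lightweight, but it can only verify behavior at the "API level". There is no ControllerManager or Kubelet, so the tests end up being unit-test-like.

How to Set Up a Kubernetes Cluster for Tests?

[Figure: API server (Testing Framework), Container (Kind), Full Cluster (Your Real Platform).]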
Events are handy for getting a rough picture of behavior: collecting them during tests as the upstream k8s e2e tests do, checking them for a first lead when a failure occurs, or watching how resources move while a cluster is being created.

    $ kubectl get event
    LAST SEEN   TYPE     REASON              OBJECT                       MESSAGE
    5s          Normal   Scheduled           pod/cndt-cbb75cdc5-mws7l     Successfully assigned default/cndt-cbb75cdc5-mws7l to worker3
    4s          Normal   Pulling             pod/cndt-cbb75cdc5-mws7l     Pulling image "gcr.io/hello-minikube-zero-install/hello-node"
    5s          Normal   SuccessfulCreate    replicaset/cndt-cbb75cdc5    Created pod: cndt-cbb75cdc5-mws7l
    5s          Normal   ScalingReplicaSet   deployment/cndt              Scaled up replica set cndt-cbb75cdc5 to 1
Metric Format

    $ curl http://localhost:8080/metrics
    # HELP controller_runtime_reconcile_errors_total Total number of reconcile errors per controller
    # TYPE controller_runtime_reconcile_errors_total counter
    controller_runtime_reconcile_errors_total{controller="mysql-controller"} 10
    # HELP controller_runtime_reconcile_queue_length Length of reconcile queue per controller
    # TYPE controller_runtime_reconcile_queue_length gauge
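To make the format above concrete, here is a small sketch that renders one counter sample by hand. It is for illustration only: `formatCounter` is a hypothetical helper, and a real controller would normally leave this to its metrics library rather than build the text itself.

```go
package main

import (
	"fmt"
	"strings"
)

// formatCounter renders a single counter sample with one label as
// HELP, TYPE, and sample lines. It assumes a simple label value with
// no characters that need special escaping.
func formatCounter(name, help, labelKey, labelValue string, value int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n", name, help)
	fmt.Fprintf(&b, "# TYPE %s counter\n", name)
	fmt.Fprintf(&b, "%s{%s=%q} %d\n", name, labelKey, labelValue, value)
	return b.String()
}

func main() {
	fmt.Print(formatCounter(
		"controller_runtime_reconcile_errors_total",
		"Total number of reconcile errors per controller",
		"controller", "mysql-controller", 10,
	))
}
```

Each metric is a `# HELP` line, a `# TYPE` line, and then one sample line per label combination, which is what makes the endpoint easy to scrape and to read with `curl`.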