- Pagination through API Server
- Limit CRD usage
- etcd optimizations need to be upstreamed
  - B-tree freelist management improvements -> support 10x data size
  - Concurrent read support -> reduce write latency 10x
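The pagination bullet can be illustrated with a minimal sketch. The Kubernetes list API accepts `limit` and `continue` parameters; everything else here (`STORE`, `list_page`, `list_all`, and the token format) is a hypothetical in-memory stand-in, not the real client or server code.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for the objects behind an API Server list endpoint.
STORE: List[str] = [f"pod-{i}" for i in range(25)]


def list_page(limit: int, continue_token: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
    """Return up to `limit` items plus a token for the next page (None when done)."""
    # Assumption: the token is simply the next start offset encoded as a string.
    start = int(continue_token) if continue_token else 0
    end = min(start + limit, len(STORE))
    next_token = str(end) if end < len(STORE) else None
    return STORE[start:end], next_token


def list_all(limit: int) -> List[str]:
    """Walk every page instead of issuing one unbounded list call."""
    items: List[str] = []
    token: Optional[str] = None
    while True:
        page, token = list_page(limit, token)
        items.extend(page)
        if token is None:
            return items
```

Each call returns a bounded number of objects, so no single request has to materialize the whole collection.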
- Not super CPU efficient though
- Objects are heavily cached
  - A lot of memory
- Lack of custom index support
  - The code to support indexes is half-baked
  - Not hard to add a custom index by modifying the API Server codebase
to end optimization to support at least 10x traffic with minimal additional resources
- End to end stress testing to ensure reliability and to reduce risk

Requirements
- Application upgrades should not change container placements
- Minor resource updates should not change container placements
Set “in-place” annotation to “created” & update Pod requests/limits
   b. APIServer:
      i. Admission to check annotation
   c. Scheduler:
      i. Check if in-place update is possible
         1. otherwise set “in-place update fail”, user falls back
      ii. Set “in-place” annotation to “accepted”
   d. Kubelet:
      i. Update container resources (CRI), cgroups manager, cpu manager, clear annotation
2. v2:
   a. A joint-effort KEP (kep #686) with community on upstream
      i. reviews are welcome!
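The v1 annotation handshake above can be sketched as a small state machine. This is only an illustration under stated assumptions: the `Pod` class, the function names, and the rule that an update is "possible" when the extra CPU fits in the node's free CPU are all hypothetical; only the annotation values come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict

ANNOTATION = "in-place"  # annotation name as written on the slide


@dataclass
class Pod:
    requested_cpu: int  # millicores requested in the Pod spec
    applied_cpu: int    # millicores currently applied on the node
    annotations: Dict[str, str] = field(default_factory=dict)


def user_request_update(pod: Pod, new_cpu: int) -> None:
    """User: update requests/limits and mark the Pod as 'created'."""
    pod.requested_cpu = new_cpu
    pod.annotations[ANNOTATION] = "created"


def admission_check(pod: Pod) -> bool:
    """APIServer: assumed to only validate the annotation and the request."""
    return pod.annotations.get(ANNOTATION) == "created" and pod.requested_cpu > 0


def scheduler_review(pod: Pod, node_free_cpu: int) -> None:
    """Scheduler: accept if the change fits on the current node, else mark failure."""
    if pod.annotations.get(ANNOTATION) != "created":
        return
    extra = pod.requested_cpu - pod.applied_cpu
    if extra <= node_free_cpu:
        pod.annotations[ANNOTATION] = "accepted"
    else:
        pod.annotations[ANNOTATION] = "in-place update fail"  # user falls back


def kubelet_apply(pod: Pod) -> None:
    """Kubelet: apply accepted resources, then clear the annotation."""
    if pod.annotations.get(ANNOTATION) == "accepted":
        # Stands in for the CRI, cgroups manager, and CPU manager updates.
        pod.applied_cpu = pod.requested_cpu
        del pod.annotations[ANNOTATION]
```

The container never moves in either branch: an accepted update is applied on the same node, and a failed one leaves the applied resources untouched.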
   i. main branch -> feature branch -> test sets -> Code Review -> main branch -> CI pipeline -> PASS
   ii. main branch -> release -> CI pipeline -> PASS
b. Release-build
   i. release -> generate build -> test build (*.rpm) -> prod build (*.rpm)
c. Test-build
   i. test build -> test cluster -> monitoring & dashboard

a. Rollout
   i. prod build -> service template -> rollout plan
b. Run rollout plan
   i. E.g. Cluster X, a total of 3000+ nodes, batch interval 6hr, 12hr
      1. Batch 1: 2, 5, 10
      2. Batch 2: 20, 50, 100
      3. Batch 3: 200, 200, 200
      4. Batch 4: 400, 400, 400
      5. Batch 5: 500, 500, 500
c. Rollback (just in case)
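To make the batch arithmetic in the Cluster X example concrete, here is a minimal sketch that walks the plan and tracks cumulative coverage. The batch sizes are copied from the plan; the `rollout_steps` helper is hypothetical, and the 6hr/12hr intervals are left out because the plan does not say how they map to steps.

```python
from typing import List, Tuple

# Batch sizes copied from the example rollout plan for Cluster X.
BATCHES: List[List[int]] = [
    [2, 5, 10],
    [20, 50, 100],
    [200, 200, 200],
    [400, 400, 400],
    [500, 500, 500],
]


def rollout_steps(batches: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Return (batch number, step size, cumulative nodes upgraded) per step."""
    total = 0
    steps: List[Tuple[int, int, int]] = []
    for batch_no, sizes in enumerate(batches, start=1):
        for size in sizes:
            total += size
            steps.append((batch_no, size, total))
    return steps


if __name__ == "__main__":
    for batch_no, size, total in rollout_steps(BATCHES):
        print(f"batch {batch_no}: +{size} nodes, {total} upgraded so far")
```

The 15 steps add up to 3487 nodes, consistent with the "3000+ nodes" figure, and only 17 nodes are touched before the first batch completes, which keeps the blast radius of a bad build small.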
- Constraint violations
  - No unexpected resource over-commit
  - No affinity/anti-affinity violations
  - …
- State cross checking
  - API Server pod state == pod running state queried through Kubelet
  - Controller replica == number of running pods queried through Kubelet
- Warnings
  - Too many soft affinity/anti-affinity violations
  - ...
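The two state cross-checks above could look roughly like the following. This is a sketch under assumptions: the inputs are plain dictionaries that some hypothetical collector has already filled from the API Server and from each Kubelet, and the `cross_check` name and message format are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def cross_check(
    api_state: Dict[str, str],      # pod name -> phase recorded by the API Server
    kubelet_state: Dict[str, str],  # pod name -> phase reported by the Kubelet
    replicas: Dict[str, int],       # controller name -> desired replica count
    owners: Dict[str, str],         # pod name -> owning controller name
) -> List[str]:
    """Return a human-readable list of inconsistencies (empty when consistent)."""
    problems: List[str] = []

    # Check 1: API Server pod state == pod state seen by the Kubelet.
    for name in sorted(set(api_state) | set(kubelet_state)):
        api_phase = api_state.get(name, "Missing")
        node_phase = kubelet_state.get(name, "Missing")
        if api_phase != node_phase:
            problems.append(f"pod {name}: API Server={api_phase}, Kubelet={node_phase}")

    # Check 2: controller replicas == number of pods the Kubelet reports as running.
    running = Counter(
        owners[name]
        for name, phase in kubelet_state.items()
        if phase == "Running" and name in owners
    )
    for controller, want in sorted(replicas.items()):
        have = running[controller]
        if have != want:
            problems.append(f"controller {controller}: replicas={want}, running={have}")

    return problems
```

For example, if the API Server records `web-1` as `Running` while the Kubelet reports it as `Failed`, the result contains one pod mismatch and one replica mismatch for the owning controller.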