and Co-Founder of Komodor
• A big believer in dev empowerment and moving fast!
• Backend Developer turned DevOps
• Worked at eBay, Forter, Rookout (first developer)
• K8s fanboy 😃
clusters per organization is growing every year
• In 2019, only 10% of organizations had 50+ clusters; in 2024, having hundreds of K8s clusters is no longer considered an outlier
• The inherent complexities of managing a K8s cluster (lifecycle, monitoring, maintenance, troubleshooting, etc.) are multiplied
• The sheer scale of these multi-cluster deployments poses a new and unique set of challenges, heightened by the popularity of K8s on Edge
• To address these challenges, the concept of Fleet Management was invented
to logically group and normalize Kubernetes clusters, helping you uplevel management from individual clusters to entire groups of clusters.” - GKE Enterprise
Efficiency • Security • Standardization
& Resource Utilization
• Keeping the cost low across multi-cluster/cloud/on-prem & hybrid
• Efficient resource utilization on Edge nodes
Reliability & Resiliency
• Resolving issues across different envs & AZs
• RCA is endless
• Knowledge gaps between Dev & Ops create bottlenecks
Access Management
• RBAC for cluster access
• JiT access
• Edge locations
Governance & Standardization
• Enforcing standards across the fleet
• Policy enforcement
• Security compliance
Cross-Cluster Visibility
• Hard to correlate between issues
• Deviations in service performance
What’s So Hard About Fleet Management?
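To make the "Deviations in service performance" point concrete, here is a minimal sketch of a cross-cluster check. It assumes you already collect one latency figure per cluster for the same service; the function name `find_deviations`, the cluster names, and the 50% tolerance are all hypothetical and not taken from any particular tool.

```python
from statistics import median


def find_deviations(latency_ms_by_cluster, tolerance=0.5):
    """Return clusters whose latency is far from the fleet median.

    A cluster is flagged when its value differs from the median by more
    than `tolerance` times the median (an assumed, illustrative rule).
    """
    baseline = median(latency_ms_by_cluster.values())
    return sorted(
        name
        for name, value in latency_ms_by_cluster.items()
        if abs(value - baseline) > tolerance * baseline
    )


# Hypothetical p95 latency (ms) of one service across four clusters.
SAMPLE = {"prod-us": 100, "prod-eu": 110, "prod-ap": 105, "edge-17": 400}
print(find_deviations(SAMPLE))
```

Comparing each cluster against the fleet median rather than a fixed threshold is one way to spot the odd one out without tuning a limit per service.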
the organization has a different mindset and approach
• Different requirements and KPIs for different teams
• Different permissions and access required per persona or per use-case (JiT)
• Knowledge and skills gaps (K8s has a steep learning curve)
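The per-persona, per-use-case access mentioned above (JiT) boils down to grants that expire on their own. The sketch below illustrates that idea only; `Grant`, `grant_access`, and `is_allowed` are hypothetical names, and a real setup would back this with Kubernetes RBAC rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    user: str
    cluster: str
    role: str
    expires_at: datetime


def grant_access(user, cluster, role, minutes, now):
    """Create a time-bound grant that expires `minutes` after `now`."""
    return Grant(user, cluster, role, now + timedelta(minutes=minutes))


def is_allowed(grants, user, cluster, role, now):
    """True only if a matching grant exists and has not yet expired."""
    return any(
        g.user == user and g.cluster == cluster and g.role == role
        and now < g.expires_at
        for g in grants
    )


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
grants = [grant_access("dev-alice", "prod-eu", "view", 30, start)]
print(is_allowed(grants, "dev-alice", "prod-eu", "view", start + timedelta(minutes=10)))
print(is_allowed(grants, "dev-alice", "prod-eu", "view", start + timedelta(minutes=31)))
```

Passing `now` explicitly instead of reading the clock inside the functions keeps the expiry logic easy to test.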
2 Region N (By Region 👉)
Production, Staging, Development (By Environment 👉)
AWS, Azure, Google Cloud (By Cloud Provider 👉)
NS: frontend, NS: backend, NS: auth (By Namespace 👉)
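The four grouping dimensions on this slide (region, environment, cloud provider, namespace) can be thought of as different views over the same cluster inventory. The following is a rough sketch under that assumption; the `Cluster` record, the `group_by` helper, and the sample fleet are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Cluster:
    name: str
    region: str
    environment: str
    provider: str
    namespaces: Tuple[str, ...]


def group_by(clusters, dimension):
    """Group cluster names by one attribute of the Cluster record.

    Tuple-valued attributes (namespaces) place a cluster in every
    group it belongs to.
    """
    groups = defaultdict(list)
    for cluster in clusters:
        value = getattr(cluster, dimension)
        keys = value if isinstance(value, tuple) else (value,)
        for key in keys:
            groups[key].append(cluster.name)
    return dict(groups)


# Hypothetical inventory; names and regions are made up.
FLEET = [
    Cluster("c1", "region-1", "Production", "AWS", ("frontend", "backend")),
    Cluster("c2", "region-2", "Staging", "Azure", ("backend", "auth")),
    Cluster("c3", "region-1", "Development", "Google Cloud", ("frontend",)),
]

for dim in ("region", "environment", "provider", "namespaces"):
    print(dim, group_by(FLEET, dim))
```

The same inventory answers "which clusters are in region-1?" and "which clusters run the backend namespace?" without maintaining separate lists per view.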
and bubble up relevant data in the right context (i.e., simplify K8s and reduce cognitive load on non-experts)
• Automate away toil in a manner that prevents human errors
• Template services, deployments, etc. (i.e., enforce governance and standardization)
• Empower developers and other stakeholders to own K8s (i.e., manage their workloads on K8s without having to learn K8s)
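As one possible illustration of templating plus governance, the sketch below renders a Deployment manifest from a handful of inputs and then checks it against fleet-wide rules. The manifest fields follow the standard Kubernetes `apps/v1` Deployment shape; the rules themselves (required labels, mandatory resource limits, no `:latest` tag), the default limits, and the function names are assumptions chosen for the example.

```python
REQUIRED_LABELS = ("team", "environment")  # hypothetical fleet standard


def render_deployment(name, image, team, environment, replicas=2):
    """Build a Deployment manifest with fleet defaults filled in."""
    labels = {"app": name, "team": team, "environment": environment}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    # Assumed default limits; tune per workload.
                    "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
                }]},
            },
        },
    }


def policy_violations(manifest):
    """Return a list of human-readable violations (empty if compliant)."""
    violations = []
    labels = manifest.get("metadata", {}).get("labels", {})
    for key in REQUIRED_LABELS:
        if key not in labels:
            violations.append(f"missing label: {key}")
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        cname = container.get("name", "<unnamed>")
        if not container.get("resources", {}).get("limits"):
            violations.append(f"container {cname}: no resource limits")
        if container.get("image", "").endswith(":latest"):
            violations.append(f"container {cname}: image uses :latest tag")
    return violations


good = render_deployment("checkout", "registry.example/checkout:1.4.2",
                         "payments", "Production")
print(policy_violations(good))
```

Because developers only supply a name, image, team, and environment, the template carries the standards for them, and the same check can be run against hand-written manifests to catch drift.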