Versatility of Energy-Aware Workload Allocation Optimizer (WAO) for Kubernetes
Shunsuke Ise, Chizuko Mizumoto, Ying-Feng Hsu, Kazuhiro Matsuda and Morito Matsuoka
IEEE JCC 2025 | 21‒24 July 2025 | Tucson, AZ, USA

Motivation
software-based energy-aware approach?
■ Data center energy use is soaring
■ Hardware upgrades are costly and slow
■ Built on Kubernetes → widely usable
Figure: Estimated electricity demand from traditional data centres, dedicated AI data centres and cryptocurrencies, 2022 and 2026, base case. IEA. CC BY 4.0.
Source: International Energy Agency, "Electricity 2024," [Online]. Available: https://www.iea.org/reports/electricity-2024

WAO Concept
placement
■ Predicted incremental power draw as a placement criterion
■ Uses per-server power models and environmental metrics
■ Applicable to both workload placement and load balancing
Reference: Y.-F. Hsu, C. Mizumoto, K. Matsuda, and M. Matsuoka, "Sustainable data center energy management through server workload allocation optimization and HVAC system," in Proc. IEEE Cloud Summit, 2024, pp. 17-23.

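The placement criterion above can be illustrated with a minimal sketch: each candidate node is scored by how much its predicted power would rise if the pod landed there, and the node with the smallest rise wins. The `Node` class, the two power curves, and the function names are hypothetical and are not taken from WAO; environmental metrics such as inlet temperature are left out to keep the example short.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    name: str
    threads: int                           # total hardware threads
    used: int                              # threads already occupied
    power_model: Callable[[float], float]  # CPU utilization (0..1) -> watts


def incremental_power(node: Node, pod_threads: int) -> float:
    """Predicted extra watts if the pod is placed on this node."""
    before = node.power_model(node.used / node.threads)
    after = node.power_model((node.used + pod_threads) / node.threads)
    return after - before


def place(nodes: List[Node], pod_threads: int) -> Node:
    """Pick the feasible node with the smallest predicted power increase."""
    candidates = [n for n in nodes if n.used + pod_threads <= n.threads]
    if not candidates:
        raise RuntimeError("no node has enough free threads")
    best = min(candidates, key=lambda n: incremental_power(n, pod_threads))
    best.used += pod_threads
    return best


# Hypothetical power curves (not measured data): one linear, one concave.
nodes = [
    Node("node-a", threads=96, used=0, power_model=lambda u: 100 + 200 * u),
    Node("node-b", threads=96, used=48, power_model=lambda u: 100 + 200 * math.sqrt(u)),
]
chosen = place(nodes, pod_threads=8)
print(chosen.name)
```

With these made-up curves the half-loaded node with the concave curve is chosen, because its marginal power cost at 50% utilization is lower than the idle linear node's. The same score could in principle rank targets for load balancing as well as for initial placement.
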
Testbed & Instrumentation
data center with 4 server models
■ Environmental and system metrics
  ◆ CPU utilization (from OS)
  ◆ Ambient (inlet) temperature (via Redfish or IPMI)
  ◆ Static pressure differential (via front/rear sensors)

Server  CPU                 Number of Threads
A       Intel Xeon Bronze   12
B       Intel Xeon Silver   32
C       Intel Xeon Gold     96
D       AMD EPYC            96

Power Consumption Models: Fundamentals and PCL
per-server models from measured metrics
■ Power Consumption Linearity (PCL) represents the potential for power optimization
  PCL = [formula garbled in the extracted slide text]
Figure: power model inputs and output. Labels: Server fan rotation management policy, CPU frequency governor, Ambient Temperature, Server Fan Rotation, Server Fan Management Policy, CPU Management Policy, Server, Air Conditioner; Explanatory variables; Objective variable.
*DVFS: Dynamic Voltage and Frequency Scaling
**CS: Context Switching

Power Consumption Models: Adapting to CPU Operating Modes
profiles vary across CPU frequency governors
■ Mode-aware modeling improves accuracy with per-governor training

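Per-governor training can be sketched as fitting one model per CPU frequency governor and selecting the model by the governor active at prediction time. The sketch below uses a one-variable least-squares line as a stand-in; the slides do not state the model family WAO uses, and the sample values are synthetic, chosen only so the arithmetic is easy to follow. `performance` and `powersave` are standard Linux cpufreq governor names.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, float, float]  # (governor, CPU utilization 0..1, watts)


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def train_per_governor(samples: Iterable[Sample]) -> Dict[str, Tuple[float, float]]:
    """Group samples by governor and fit a separate line for each group."""
    grouped = defaultdict(lambda: ([], []))
    for governor, util, watts in samples:
        grouped[governor][0].append(util)
        grouped[governor][1].append(watts)
    return {g: fit_line(xs, ys) for g, (xs, ys) in grouped.items()}


def predict(models: Dict[str, Tuple[float, float]], governor: str, util: float) -> float:
    """Predict power with the model trained for the active governor."""
    intercept, slope = models[governor]
    return intercept + slope * util


# Synthetic samples for illustration only (not measurements from the testbed).
samples = [
    ("performance", 0.0, 120.0), ("performance", 0.5, 220.0), ("performance", 1.0, 320.0),
    ("powersave", 0.0, 100.0), ("powersave", 0.5, 160.0), ("powersave", 1.0, 220.0),
]
models = train_per_governor(samples)
print(predict(models, "performance", 0.25), predict(models, "powersave", 0.25))
```

A single model trained on the pooled samples would average the two profiles and be off for both governors, which is the accuracy gap that mode-aware modeling is meant to close.
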
Scalability: Small-Scale
savings are similar regardless of server count
  ◆ Fewer servers reach peak at lower occupied threads
  ◆ More servers need higher occupied threads
Figure: Power Saving with WAO (%) against Number of Servers, Model C (Intel Xeon Gold: 96); series labels: #96, #384, #960, #1536, #3072.

Scalability: Large-Scale
reduces score computation from $\mathcal{O}(N \times P)$ to $\mathcal{O}(P)$
  $N$: Number of Nodes, $P$: Number of Pods
■ Maintains constant latency as cluster scales in node count

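One plausible reading of this reduction is that the expensive step is the power-model evaluation, and that a per-node score cache lets each scheduling decision refresh only the node whose state just changed. The sketch below counts model evaluations under that assumption. `CachedScorer`, `toy_predict`, and `schedule` are hypothetical names, and this is not necessarily how WAO implements its cache.

```python
from typing import Callable, Dict, List


class CachedScorer:
    """Keeps one cached score per node (hypothetical helper, not WAO's API)."""

    def __init__(self, predict: Callable[[str, int], float]) -> None:
        self._predict = predict
        self._cache: Dict[str, float] = {}
        self.model_calls = 0  # counts expensive power-model evaluations

    def score(self, node: str, used_threads: int) -> float:
        if node not in self._cache:
            self.model_calls += 1
            self._cache[node] = self._predict(node, used_threads)
        return self._cache[node]

    def invalidate(self, node: str) -> None:
        self._cache.pop(node, None)


def toy_predict(node: str, used_threads: int) -> float:
    # Toy stand-in for a power-model inference: cost grows with occupied threads.
    return float(used_threads)


def schedule(nodes: List[str], pods: int, scorer: CachedScorer) -> Dict[str, int]:
    used = {n: 0 for n in nodes}
    for _ in range(pods):
        best = min(nodes, key=lambda n: scorer.score(n, used[n]))
        used[best] += 1          # place a one-thread pod on the cheapest node
        scorer.invalidate(best)  # only this node's cached score is now stale
    return used


N, P = 50, 200
scorer = CachedScorer(toy_predict)
placement = schedule([f"node-{i}" for i in range(N)], P, scorer)
print(scorer.model_calls, "model calls with cache;", N * P, "without")
```

In this sketch the counter ends at N + P - 1: the first pod warms the cache with N evaluations and every later pod triggers exactly one, so the per-pod model cost no longer depends on the node count. The cheap cache lookups still touch every node per pod, so the saving applies only to the model evaluations.
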
Computational Performance: Energy‒Performance Trade-off
Figure (two panels, Model D (AMD EPYC: 96)): one panel with axes CPU Utilization (%) and Processing Time (s); one panel with axes Total Power Consumption (W) and Processing Time (s), with points labeled CPU: 10%, 20%, 50%, 80%. Series in both panels: Uniform; WAO (α = 0.5, balanced); WAO (α = 1.0, energy-aware).
Evaluation Score: μ = α (Power Consumption) + β (Processing Time)
■ Similar trends on both Model C (Xeon Gold) and Model D (EPYC)
■ Performance degrades beyond 50% CPU utilization, likely due to resource contention under SMT

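The evaluation score μ = α (Power Consumption) + β (Processing Time) can be sketched as a weighted sum over two normalized terms. Several details below are assumptions that the slide does not state: both terms are scaled to the range 0 to 1, a lower score is better, and β defaults to 1 - α. The candidate names and values are made up for illustration.

```python
from typing import Dict, Optional, Tuple


def evaluation_score(power: float, time: float, alpha: float,
                     beta: Optional[float] = None) -> float:
    """mu = alpha * power + beta * time, with both terms pre-normalized to 0..1."""
    if beta is None:
        beta = 1.0 - alpha  # assumption: the two weights sum to 1
    return alpha * power + beta * time


def pick(candidates: Dict[str, Tuple[float, float]], alpha: float) -> str:
    """Return the candidate with the lowest score (assumption: lower is better)."""
    return min(candidates, key=lambda name: evaluation_score(*candidates[name], alpha))


# Hypothetical (normalized power, normalized processing time) per candidate.
candidates = {
    "low-power-slow": (0.2, 0.9),
    "high-power-fast": (0.5, 0.3),
}
energy_aware = pick(candidates, alpha=1.0)  # only power counts
balanced = pick(candidates, alpha=0.5)      # power and time weighted equally
print(energy_aware, balanced)
```

With α = 1.0 the slow but frugal candidate wins because processing time carries no weight; at α = 0.5 the faster candidate wins, which mirrors the trade-off between the energy-aware and balanced settings in the legend.
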
Heterogeneous Environment: Mixed Servers
Models A/B/D into Model C servers
■ Similar PCL but different curve shape → “boost effect”
■ Large PCL gap (Model A) → diminished power-saving effect

Heterogeneous Environment: Extended Scoring
WAO scoring to support heterogeneous clusters
■ Evaluate groups of servers with per-model weighted scores
■ Enables consistent power-aware placement across mixed hardware

Workload allocation score (μ) for homogeneous servers:
  $\mu = \alpha + \beta$
Workload allocation score (μ) for heterogeneous servers:
  $\mu = \sum_{i} \zeta_i (\alpha_i + \beta_i)$
where:
  $i$: index over server models
  $\zeta_i$: contribution factor for server model $i$
  $\alpha_i$: power consumption score for server model $i$
  $\beta_i$: computational performance score for server model $i$

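The heterogeneous score can be sketched as a sum over server models of the contribution factor times the sum of that model's power and performance scores. Applying the contribution factor to both terms is an assumption about how the slide's formula is grouped, and the numeric values below are hypothetical.

```python
from typing import Iterable, Tuple


def heterogeneous_score(models: Iterable[Tuple[float, float, float]]) -> float:
    """Sum zeta_i * (alpha_i + beta_i) over server models i.

    Each tuple is (zeta, alpha, beta): contribution factor, power consumption
    score, and computational performance score for one server model.
    """
    return sum(zeta * (alpha + beta) for zeta, alpha, beta in models)


# One server model with contribution factor 1 reduces to mu = alpha + beta.
single = heterogeneous_score([(1.0, 0.3, 0.4)])

# Two server models with hypothetical contribution factors 0.75 and 0.25.
mixed = heterogeneous_score([(0.75, 0.3, 0.4), (0.25, 0.6, 0.2)])
print(single, mixed)
```

The single-model case shows that the extended form stays consistent with the homogeneous score when only one server model is present.
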
Conclusion
(Recap)
✓ How well does WAO scale?
  • Effect saturates at 5 to 10 servers
  • Matches kube-scheduler performance at large scale, thanks to caching
✓ Can it save power without slowing jobs?
  • α = 0.5 is the sweet spot: big savings, minimal slowdown
✓ What happens in heterogeneous environments?
  • “Boost effect” observed in moderately mixed setups
  • Works with extended scoring

Acknowledgement
is supported by the New Energy and Industrial Technology Development Organization (NEDO) under its "Program to Develop and Promote the Commercialization of Energy Conservation Technologies to Realize a Decarbonized Society" (JPNP21005).