Outline
• Why Data Centers (DC) in this Workshop?
• The DC in next-generation applications
• Energy consumption at the Data Center
• Insight on optimization strategies
• Conclusions
Motivation
• Energy consumption of data centers
  – 1.3% of worldwide energy production in 2010
  – USA: 80 million MWh/year in 2011 = 1.5 × NYC
  – 1 data center = 25,000 houses
• More than 43 million tons of CO2 emissions per year (2% worldwide)
• More water consumption than many industries (paper, automotive, petrol, wood, or plastic)
Jonathan Koomey. 2011. Growth in Data Center Electricity Use 2005 to 2010.
Motivation
José M. Moya | Madrid (Spain), July 27, 2012
• Total data center electricity use is expected to exceed 400 TWh/year by 2015.
• The energy required for cooling will continue to be at least as important as the energy required for computation.
• Energy optimization of future data centers will require a global and multidisciplinary approach.
[Figure: World server installed base (thousands), 2000 to 2010, split into high-end, mid-range and volume servers]
[Figure: Electricity use (billion kWh/year), 2000 to 2010, split into infrastructure, communications, storage, high-end servers, mid-range servers and volume servers]
5.75 million new servers per year
10% unused servers (CO2 emissions similar to 6.5 million cars)
Outline
• Why Data Centers (DC) in this Workshop?
• The DC in next-generation applications
• Energy consumption at the Data Center
• Insight on optimization strategies
• Our vision and future trends
The DC in next-generation applications
• Traditional uses of Data Centers:
  – Webmail, web search, databases, social networking or distributed storage, high-performance computing (HPC), cloud computing
• Next-generation applications:
  – Population monitoring applications: e-Health, Ambient Assisted Living
  – Smart cities
• Next-generation applications generate huge amounts of data
• This data must be stored and analyzed to generate knowledge
Global energy optimization
• Solution: GoingGreen!
• How: global energy optimization strategies
  – Proposal of a holistic energy optimization framework
  – Minimizing overall power consumption
  – Multi-level optimization: WBSN (wireless body sensor networks), Personal Servers and Data Centers
Global energy optimization
• Executing part of the workload in the Personal Servers
  – Classifying tasks depending on their demand
  – Resource management techniques based on fast run-time allocation algorithms executed on the Personal Servers
  – Executing some tasks in the Personal Servers instead of forwarding the load to the DC
  – Up to 10% energy savings and 15% execution time savings
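The idea of classifying tasks by demand and keeping the light ones on the Personal Server can be sketched as follows. This is only an illustration under stated assumptions, not the allocation algorithm used in the work: the `Task` type, the normalized `demand` value, the `light_threshold` and the capacity budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    demand: float  # hypothetical normalized compute demand of the task


def allocate(tasks, ps_capacity, light_threshold):
    """Split tasks between a Personal Server (PS) and the Data Center (DC).

    A task stays on the PS only if it is light (demand <= light_threshold)
    and the PS still has spare capacity; everything else is forwarded.
    """
    local, forwarded = [], []
    remaining = ps_capacity
    # Lightest tasks first, so that as many tasks as possible stay local.
    for task in sorted(tasks, key=lambda t: t.demand):
        if task.demand <= light_threshold and task.demand <= remaining:
            local.append(task.name)
            remaining -= task.demand
        else:
            forwarded.append(task.name)
    return local, forwarded


# Hypothetical e-Health workload.
tasks = [
    Task("ecg_filter", 0.2),
    Task("fall_detect", 0.3),
    Task("model_training", 5.0),
    Task("hr_average", 0.1),
]
local, forwarded = allocate(tasks, ps_capacity=0.5, light_threshold=1.0)
print("Personal Server:", local)
print("Data Center:", forwarded)
```

A single sorted pass keeps the decision cheap enough to run on a resource-constrained Personal Server, which is the property the slide asks of a fast run-time allocation algorithm.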
Outline
• Why Data Centers (DC) in this Workshop?
• The DC in next-generation applications
• Energy consumption at the Data Center
• Insight on optimization strategies
• Conclusions
Energy Consumption at the DC
Power consumption breakdown
• The major contributors to electricity costs are:
  – Cooling (around 50%)
  – Servers (around 30–40%)
• The most common metric to measure efficiency in Data Centers is PUE (Power Usage Effectiveness)
Power Usage Effectiveness (PUE)
• Average PUE ≈ 2
• State of the art: PUE ≈ 1.2
  – At this PUE, the important part is IT energy consumption
  – Current work in energy-efficient data centers is focused on decreasing PUE
  – Decreasing P_IT does not decrease PUE, but it has an impact on the electricity bill

$$\mathrm{PUE} = \frac{1}{\mathrm{DCiE}} = \frac{P_{\mathrm{total}}}{P_{\mathrm{IT}}} = \frac{P_{\mathrm{power}} + P_{\mathrm{cooling}} + P_{\mathrm{IT}}}{P_{\mathrm{IT}}} \approx \frac{P_{\mathrm{cooling}} + P_{\mathrm{IT}}}{P_{\mathrm{IT}}}$$
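A small numeric sketch can make the last point concrete: lowering IT power leaves PUE unchanged (or even makes it look worse) while the total power, and therefore the bill, goes down. The power figures in kW are made up for illustration, and the power-distribution term is set to zero for simplicity.

```python
def pue(p_it, p_cooling, p_power=0.0):
    """PUE = total facility power / IT power (all values in the same unit)."""
    return (p_power + p_cooling + p_it) / p_it


# Hypothetical scenarios: (IT power, cooling power) in kW.
scenarios = {
    "baseline": (100.0, 100.0),
    "IT -20%, cooling scales with IT": (80.0, 80.0),
    "IT -20%, cooling unchanged": (80.0, 100.0),
}

for name, (p_it, p_cooling) in scenarios.items():
    total = p_it + p_cooling
    print(f"{name}: total = {total:.0f} kW, PUE = {pue(p_it, p_cooling):.2f}")
```

In both reduced-IT scenarios the facility draws less total power than the baseline, yet PUE stays at 2.00 in one case and rises to 2.25 in the other.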
Research trends
• Higher levels of abstraction bring more benefits
• Some areas have brought more benefits than others
[Figure: solutions proposed by the state of the art, by abstraction level]
Outline
• Why Data Centers (DC) in this Workshop?
• The DC in next-generation applications
• Energy consumption at the Data Center
• Insight on optimization strategies
• Conclusions
Our approach
• Global strategy that uses multiple information sources to coordinate decisions in order to reduce the total energy consumption
• Use of knowledge about the energy demand characteristics of the applications, and about the characteristics of the computing and cooling resources, to implement proactive optimization techniques
Resource Management at the Room level
Leveraging heterogeneity – IT perspective
• Use heterogeneity to minimize energy consumption from a static/dynamic point of view
  – Static: finding the best data center set-up, given a number of heterogeneous machines
  – Dynamic: optimization of task allocation in the Resource Manager
• We show that the best solution implies a heterogeneous data center
  – Most data centers are heterogeneous (several generations of computers)
  – 5% to 22% energy savings for the static solution
  – 24% to 47% energy savings for the dynamic solution
M. Zapater, J.M. Moya, J.L. Ayala. Leveraging Heterogeneity for Energy Minimization in Data Centers, CCGrid 2012
Resource Management at the Room level
Leveraging heterogeneity – IT perspective
• Energy profiling of the tasks of the SPEC CPU 2006 benchmark
• Use of MILP (mixed-integer linear programming) algorithms to schedule each task on the server where it consumes the least energy
• Implemented in a real resource manager (SLURM)
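The allocation problem behind this slide can be illustrated on a toy instance. The sketch below replaces the MILP solver with an exhaustive search, which is only viable for a handful of tasks. The task names, server names, energy values and slot counts are all hypothetical and are not taken from the SPEC CPU 2006 profiling.

```python
from itertools import product

# Hypothetical profiled energy (joules) of each task on each server type.
ENERGY = {
    "task_a": {"server_x": 50.0, "server_y": 80.0},
    "task_b": {"server_x": 70.0, "server_y": 40.0},
    "task_c": {"server_x": 60.0, "server_y": 65.0},
}
# Hypothetical number of free slots (e.g. cores) per server.
SLOTS = {"server_x": 1, "server_y": 2}


def best_assignment(energy, slots):
    """Exhaustively search the task-to-server mapping with minimum energy.

    Stand-in for a MILP formulation: one server per task, and no server
    may receive more tasks than it has slots.
    """
    tasks = sorted(energy)
    servers = sorted(slots)
    best, best_cost = None, float("inf")
    for choice in product(servers, repeat=len(tasks)):
        if any(choice.count(s) > slots[s] for s in servers):
            continue  # capacity constraint violated
        cost = sum(energy[t][s] for t, s in zip(tasks, choice))
        if cost < best_cost:
            best, best_cost = dict(zip(tasks, choice)), cost
    return best, best_cost


assignment, total_energy = best_assignment(ENERGY, SLOTS)
print(assignment, total_energy)
```

Sending each task to its individually cheapest server would put both `task_a` and `task_c` on `server_x`, which has a single slot. The capacity constraint is what turns a per-task lookup into a combinatorial problem.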
Resource Management at the Room level
IT + Cooling perspective
• Generating a thermal model for the data room:
  – Data center environmental monitoring to gather temperature, humidity and differential pressure
  – Prediction of server temperature and room temperature
• Optimum resource management that takes both cooling and IT power into account
  – Real environment with heterogeneous servers
  – SLURM resource manager
Resource Management at the Server level
Leakage-temperature tradeoffs – Cooling
• Exploring the leakage-temperature tradeoffs at the server level
  – At higher temperatures, CPU power consumption increases due to leakage
  – To decrease CPU temperature, fan speed rises, increasing server cooling consumption
M. Zapater, J.L. Ayala, J.M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, “Leakage and temperature aware server control for improving energy efficiency in data centers”, DATE 2013
Resource Management at the Server level
Leakage-temperature tradeoffs – Cooling
• Implemented fan speed controllers that reduce server power consumption by 10%
• The controllers are based on an analytical model for leakage power, used to find the optimum fan speeds for varying utilization values
[Figure: Fig. 4. Test 3 temperature sensor readings for the three different controllers]
[Figure: Energy difference (kWh) between 1800 RPM and 2400 RPM for clustered allocation]
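The tradeoff between leakage and fan power can be sketched with a toy model: a faster fan costs more power but cools the CPU, which lowers leakage, so the sum has a minimum that moves with utilization. Every coefficient below is made up for illustration. The cubic fan law and the exponential leakage dependence are common approximations, not the analytical model of the cited paper.

```python
import math

FAN_SPEEDS_RPM = [1800, 2400, 3000, 3600, 4200]  # hypothetical fan settings


def fan_power_w(rpm):
    # Fan power grows roughly with the cube of the speed;
    # the 10 W reference at 3000 RPM is a made-up value.
    return 10.0 * (rpm / 3000.0) ** 3


def cpu_temp_c(rpm, utilization):
    # Made-up thermal model: 25 C inlet air plus a rise that grows with
    # utilization and shrinks as the airflow increases.
    return 25.0 + 40.0 * utilization * (3000.0 / rpm)


def leakage_power_w(temp_c):
    # Leakage grows roughly exponentially with temperature (made-up fit).
    return 5.0 * math.exp(0.03 * (temp_c - 50.0))


def best_fan_speed(utilization):
    """Return the fan speed that minimizes fan power plus leakage power."""
    def total_power(rpm):
        return fan_power_w(rpm) + leakage_power_w(cpu_temp_c(rpm, utilization))
    return min(FAN_SPEEDS_RPM, key=total_power)


for utilization in (0.25, 1.0):
    print(utilization, best_fan_speed(utilization))
```

With these coefficients the lowest speed wins at low utilization and a higher speed wins at full load, which is why a fixed fan setting cannot be optimal for every workload.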
Scheduling and resource allocation policies in MPSoCs
A. Coskun, T. Rosing, K. Whisnant and K. Gross, "Static and dynamic temperature-aware scheduling for multiprocessor SoCs", IEEE Trans. Very Large Scale Integr. Syst., vol. 16, no. 9, pp. 1127–1140, 2008
[Figure: Fig. 3. Distribution of thermal hot spots, with DPM (ILP)]
[Figure: Fig. 4. Distribution of spatial gradients, with DPM (ILP)]
Excerpt from the paper: "We refer to our static approach as Min-Th&Sp. [...] While Min-Th reduces the high spatial differentials above 15 °C, we observe a substantial increase in the spatial gradients above 10 °C. In contrast, our method achieves lower and more balanced temperature distribution in the die."
UCSD – System Energy Efficiency Lab
Scheduling and resource allocation policies in MPSoCs
• Energy characterization of applications makes it possible to define proactive scheduling and resource allocation policies that minimize hotspots
• Hotspot reduction makes it possible to raise the cooling temperature: +1 °C means around 7% cooling energy savings
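As a rough worked example of the 7% figure, the sketch below estimates cooling energy after raising the setpoint by a few degrees. It assumes the saving compounds per degree, which the slide does not state, and the 1000 kWh baseline is a made-up value.

```python
def cooling_energy(baseline_kwh, delta_c, saving_per_degree=0.07):
    """Estimate cooling energy after raising the setpoint by delta_c degrees.

    Assumes each extra degree saves the same fraction of what is left
    (compounding), which is a simplification.
    """
    return baseline_kwh * (1.0 - saving_per_degree) ** delta_c


for delta in (1, 2, 3):
    print(delta, round(cooling_energy(1000.0, delta), 1))
```

Under this assumption a 3 °C increase saves close to 20% of the cooling energy, which shows why even a modest hotspot reduction matters at data center scale.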
JIT Compilation in Virtual Machines
• Virtual machines compile applications into native code (JIT compilation) for performance reasons
• The optimizer is general-purpose and focused on performance optimization
Back-end JIT compilation for energy minimization
• Application-aware compiler
  – Energy characterization of applications and transformations
  – Application-dependent optimizer
  – Global view of the data center workload
• Energy optimizer
  – Currently, compilers for high-end processors are oriented to performance optimization
[Diagram: front-end → optimizer → code generator]
Energy saving potential for the compiler (MPSoCs)
T. Simunic, G. de Micheli, L. Benini, and M. Hans. “Source code optimization and profiling of energy consumption in embedded systems,” International Symposium on System Synthesis, pages 193–199, Sept. 2000
  – 77% energy reduction in an MP3 decoder
Fei, Y., Ravi, S., Raghunathan, A., and Jha, N. K. 2004. Energy-optimizing source code transformations for OS-driven embedded software. In Proceedings of the International Conference on VLSI Design. 261–266.
  – Up to 37.9% (mean 23.8%) energy savings in multiprocess applications running on Linux
Global Management of Low-power modes (DVFS)
• DVFS (Dynamic Voltage and Frequency Scaling) is based on the following:
  – As supply voltage decreases, power decreases quadratically
  – But delay increases (performance decreases) only linearly
  – The maximum frequency also decreases linearly
• Currently, low-power modes, if used, are activated by the server operating system on inactivity
• To minimize energy consumption, changes between modes should be minimized
• On the other hand, workload knowledge makes it possible to schedule low-power modes globally without any impact on performance
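The scaling argument can be written out with the usual first-order model of dynamic CMOS power. The symbols are assumptions introduced for this sketch and do not appear in the slides: $\alpha$ is the switching activity, $C$ the switched capacitance, $V$ the supply voltage, $f$ the clock frequency and $N$ the number of cycles of a task. Leakage power and minimum-voltage limits are ignored.

```latex
\begin{align*}
P_{\mathrm{dyn}} &= \alpha\, C\, V^{2} f, \qquad f_{\max} \propto V \\
t &= \frac{N}{f} \\
E &= P_{\mathrm{dyn}}\, t = \alpha\, C\, V^{2} N
\end{align*}
```

Under these idealized assumptions, halving both the voltage and the frequency doubles the execution time of a task but cuts its dynamic energy to one quarter. That is the margin a global scheduler can exploit when it knows which workloads tolerate the delay.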
Global Management of Low-power modes (DVFS)
• By using a thermal model, we can predict the behavior of a workload under each power mode
• We can use resource management algorithms to change the DVFS mode at run time, adapting to the workload
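A minimal sketch of workload-aware mode selection: given the known size of a task and its deadline, pick the power mode with the lowest estimated energy that still finishes on time. For simplicity it uses a first-order energy estimate proportional to V² instead of a thermal model. The list of (frequency, voltage) modes and the task sizes are hypothetical.

```python
# Hypothetical DVFS modes as (frequency in GHz, supply voltage in V).
MODES = [(1.0, 0.8), (1.5, 0.9), (2.0, 1.0), (2.5, 1.1)]


def pick_mode(giga_cycles, deadline_s, modes=MODES):
    """Return the (GHz, V) mode with minimum estimated energy that meets
    the deadline, or None if no mode is fast enough.

    Energy is estimated as V^2 * cycles (arbitrary units), ignoring leakage.
    """
    feasible = [
        (volts * volts * giga_cycles, freq, volts)
        for freq, volts in modes
        if giga_cycles / freq <= deadline_s
    ]
    if not feasible:
        return None
    _, freq, volts = min(feasible)
    return freq, volts


print(pick_mode(3.0, deadline_s=2.5))  # relaxed deadline
print(pick_mode(3.0, deadline_s=1.3))  # tight deadline
print(pick_mode(3.0, deadline_s=1.0))  # infeasible deadline
```

Because the decision uses the workload size and deadline rather than observed idleness, the mode can be chosen once per task, which also limits the number of mode changes.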
Potential energy savings with floorplanning
  – Up to 21 °C reduction of maximum temperature
  – Mean: −12 °C in maximum temperature
  – Better results in the most critical examples
Y. Han, I. Koren, and C. A. Moritz. Temperature Aware Floorplanning. In Proc. of the Second Workshop on Temperature-Aware Computer Systems, June 2005
Temperature-aware floorplanning in 3D chips
• 3D chips are attracting interest due to:
  – ↑↑ Scalability: reduces the 2D equivalent area
  – ↑↑ Performance: shorter wire length
  – ↑ Reliability: less wiring
• Drawback:
  – Huge increase in hotspots compared with 2D equivalent designs
Outline
• Why Data Centers (DC) in this Workshop?
• The DC in next-generation applications
• Energy consumption at the Data Center
• Insight on optimization strategies
• Conclusions
There is still much more to be done
• Smart Grids
  – Consume energy when everybody else does not
  – Decrease energy consumption when everybody else is consuming
• Reducing the electricity bill
  – Variable electricity rates
  – Reactive power coefficient
  – Peak energy demand
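With variable electricity rates, the same amount of energy can cost very different amounts depending on when it is consumed. The sketch below finds the cheapest contiguous window for a deferrable batch job. The hourly prices and the assumption of a constant 1 MWh drawn per hour are made up for illustration.

```python
def cheapest_start(prices, duration_h):
    """Return (start_hour, cost) of the cheapest contiguous window.

    prices: price per MWh for each hour; the job is assumed to draw
    1 MWh in every hour it runs.
    """
    starts = range(len(prices) - duration_h + 1)
    best = min(starts, key=lambda s: sum(prices[s:s + duration_h]))
    return best, sum(prices[best:best + duration_h])


# Hypothetical hourly prices (currency units per MWh) for hours 0 to 7.
prices = [30, 28, 25, 24, 26, 35, 50, 60]

start, cost = cheapest_start(prices, duration_h=3)
peak_cost = sum(prices[5:8])  # same job started at hour 5 instead
print(start, cost, peak_cost)
```

Both schedules consume the same 3 MWh, but the peak-hour schedule costs almost twice as much. The energy used is identical and only the bill changes.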
Conclusions
• Reducing PUE is not the same as reducing energy consumption
  – IT energy consumption dominates in state-of-the-art data centers
• Knowledge of applications and resources can be used effectively to define proactive policies that reduce the total energy consumption
  – At different levels
  – In different scopes
  – Taking into account cooling and computation at the same time
• Proper management of the knowledge of the data center thermal behavior can reduce reliability issues
• Reducing energy consumption is not the same as reducing the electricity bill
Thank you for your attention
Marina Zapater
[email protected]
http://greenlsi.die.upm.es
(+34) 91 549 57 00 x-4227
ETSI de Telecomunicación, B105
Avenida Complutense, 30
Madrid 28040, Spain
Thanks to our collaborators: