[My personal, biased view of] The Last Five Years of Energy Consumption Research

Gustavo Pinto
January 14, 2018

Transcript

  1. 2013 Ambitious Plan “propose a catalog of refactorings targeting some

    languages of the JVM platform.” SPLASH DOC SYMP’13
  3. 15 5M Questions Automatic Filter Manual Filter Final Data from

    2008 to 2013 325 Questions 558 Answers Base Group
  4. Research Questions 16 • RQ1: What are the most common

    energy-related problems faced by software developers? • RQ2: What are the main causes for software energy consumption problems? • RQ3: What solutions do developers employ or recommend to save energy?
  5. Energy-Related Problems 17 • Measurements (59/97 — Q/A) • General

    Knowledge (40/84 — Q/A) • Code design (36/133 — Q/A) • Context-specific (83/110 — Q/A) • Noise (107/134 — Q/A)
  6. 18 “I want to measure the energy consumption of my

    own application (which I can modify) [...] on Windows CE 5.0 and Windows Mobile 5/6. Is there some kind of API for this?” • Measurements (59/97 — Q/A) • General Knowledge (40/84 — Q/A) • Code design (36/133 — Q/A) • Context-specific (83/110 — Q/A) • Noise (107/134 — Q/A)
  7. 19 “Can a code optimized for least MCPS be guaranteed

    to have least power consumption as well?” • Measurements (59/97 — Q/A) • General Knowledge (40/84 — Q/A) • Code design (36/133 — Q/A) • Context-specific (83/110 — Q/A) • Noise (107/134 — Q/A)
  8. 20 “Are there any s/w high level design considerations [...]

    to make the code as power efficient as possible?” • Measurements (59/97 — Q/A) • General Knowledge (40/84 — Q/A) • Code design (36/133 — Q/A) • Context-specific (83/110 — Q/A) • Noise (107/134 — Q/A)
  9. 21 • Measurements (59/97 — Q/A) • General Knowledge (40/84

    — Q/A) • Code design (36/133 — Q/A) • Context-specific (83/110 — Q/A) • Noise (107/134 — Q/A) — Highest popularity — Highest A per Q ratio — Highest success rate Energy-Related Problems
  10. Energy-Related Causes 22 • Unnecessary resource usage (49 occurrences) •

    Faulty GPS behavior (42 occurrences) • Background activities (40 occurrences) • Excessive synchronization (32 occurrences) • Background wallpapers (17 occurrences) • Advertisement (11 occurrences)
  11. 23 • Unnecessary resource usage (49 occurrences) • Faulty GPS

    behavior (42 occurrences) • Background activities (40 occurrences) • Excessive synchronization (32 occurrences) • Background wallpapers (17 occurrences) • Advertisement (11 occurrences) “to have a background application that monitors device usage, identifies unused/idle resources, and acts appropriately”
  12. 24 • Unnecessary resource usage (49 occurrences) • Faulty GPS

    behavior (42 occurrences) • Background activities (40 occurrences) • Excessive synchronization (32 occurrences) • Background wallpapers (17 occurrences) • Advertisement (11 occurrences) “When there are bugs that keep the GPS turned on too long they go to the top of the list to get fixed”
  13. Energy-Related Solutions 25 • Keep IO to a minimum (29

    occurrences) • Bulk operations (24 occurrences) • Avoid polling (17 occurrences) • Hardware Coordination (11 occurrences) • Concurrent Programming (9 occurrences) • Race to idle (7 occurrences)
  14. 26 • Keep IO to a minimum (29 occurrences) •

    Bulk operations (24 occurrences) • Avoid polling (17 occurrences) • Hardware Coordination (11 occurrences) • Concurrent Programming (9 occurrences) • Race to idle (7 occurrences) “do not flood the output stream with null values”
  15. 27 • Keep IO to a minimum (29 occurrences) •

    Bulk operations (24 occurrences) • Avoid polling (17 occurrences) • Hardware Coordination (11 occurrences) • Concurrent Programming (9 occurrences) • Race to idle (7 occurrences) “Don’t transfer say 1 file, and then wait for a bit to do another transfer. Instead, transfer right after the other.”
  16. Do researchers agree? 28 • Keep IO to a minimum

    (29 occurrences) • Bulk operations (24 occurrences) • Avoid polling (17 occurrences) • Hardware Coordination (11 occurrences) • Concurrent Programming (9 occurrences) • Race to idle (7 occurrences)
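Two of the recommendations listed above, "Avoid polling" and "Bulk operations", can be illustrated with a small Java sketch. It is not taken from the talk: the class name, the `STOP` sentinel, and the sample items are hypothetical. The consumer blocks on a queue instead of waking up on a timer, and the producer hands over its items back to back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class AvoidPollingSketch {
    // Hypothetical sentinel value that tells the consumer to stop.
    static final String STOP = "__stop__";

    // Blocks in take() until an item arrives, instead of waking up on a
    // timer to check a queue that is usually empty.
    static List<String> consumeAll(BlockingQueue<String> queue) throws InterruptedException {
        List<String> received = new ArrayList<>();
        while (true) {
            String item = queue.take(); // the thread is parked while the queue is empty
            if (STOP.equals(item)) {
                return received;
            }
            received.add(item);
        }
    }

    static List<String> demo() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            // Bulk hand-off: enqueue all pending items back to back.
            queue.addAll(Arrays.asList("a", "b", "c"));
            queue.add(STOP);
        });
        producer.start();
        try {
            List<String> result = consumeAll(queue);
            producer.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Whether this saves energy in a given application still has to be measured; the sketch only shows the shape of the change.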
  18. 2014 30 • Explicit threading (the Thread-style): Using the java.lang.Thread

    class • Thread pooling (the Executor-style): Using the java.util.concurrent.Executor framework • Work stealing (the ForkJoin-style): Using the Fork/Join framework (java.util.concurrent.ForkJoinPool) OOPSLA’14
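To make the three thread management styles concrete, here is a minimal sketch that computes the same sum in each style. It is an illustration only, not the benchmark code from the study: the workload, the `WORKERS` constant, and the splitting threshold are assumptions chosen for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.AtomicLong;

final class ThreadStylesSketch {
    static final int WORKERS = 4; // hypothetical degree of parallelism

    // Unit of work shared by all three styles: sum of the integers in [from, to).
    static long sumRange(int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) sum += i;
        return sum;
    }

    // Thread-style: create, start, and join java.lang.Thread objects by hand.
    static long threadStyle(int n) {
        AtomicLong total = new AtomicLong();
        List<Thread> threads = new ArrayList<>();
        int chunk = (n + WORKERS - 1) / WORKERS;
        for (int w = 0; w < WORKERS; w++) {
            final int from = Math.min(n, w * chunk);
            final int to = Math.min(n, from + chunk);
            Thread t = new Thread(() -> total.addAndGet(sumRange(from, to)));
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    // Executor-style: submit the same chunks to a fixed-size thread pool.
    static long executorStyle(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(WORKERS);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + WORKERS - 1) / WORKERS;
            for (int w = 0; w < WORKERS; w++) {
                final int from = Math.min(n, w * chunk);
                final int to = Math.min(n, from + chunk);
                Callable<Long> task = () -> sumRange(from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // ForkJoin-style: split recursively and let idle workers steal subtasks.
    static final class SumTask extends RecursiveTask<Long> {
        private final int from, to;
        SumTask(int from, int to) { this.from = from; this.to = to; }

        @Override
        protected Long compute() {
            if (to - from <= 1_000) return sumRange(from, to); // small enough: run serially
            int mid = from + (to - from) / 2;
            SumTask left = new SumTask(from, mid);
            left.fork();                                 // queue the left half
            long right = new SumTask(mid, to).compute(); // compute the right half here
            return right + left.join();
        }
    }

    static long forkJoinStyle(int n) {
        ForkJoinPool pool = new ForkJoinPool(WORKERS);
        try {
            return pool.invoke(new SumTask(0, n));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println(threadStyle(n) + " " + executorStyle(n) + " " + forkJoinStyle(n));
    }
}
```

All three methods return the same value; what differs is who creates the threads and how work is divided, which is the dimension the talk compares for energy.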
  19. 31 • Embarrassingly parallel: spectralnorm, sunflow, n-queens • Leaning parallel:

    xalan, knucleotide, tomcat • Leaning serial: mandelbrot, largestImage • Embarrassingly serial: h2 Benchmarks
  20. 32 • Embarrassingly parallel: spectralnorm, sunflow, n-queens • Leaning parallel:

    xalan, knucleotide, tomcat • Leaning serial: mandelbrot, largestImage • Embarrassingly serial: h2 Benchmarks Micro-benchmarks DaCapo benchmarks
  21. 33 Experimental Environment 2×16-core AMD CPUs, running Debian Linux,

    64GB of memory, JDK version 1.7.0_11, build 21, “ondemand” governor
  23. 35 Experimental Environment 2×16-core AMD CPUs, running Debian Linux,

    64GB of memory, JDK version 1.7.0_11, build 21.
  24. 36 Experimental Environment 2×16-core AMD CPUs, running Debian Linux,

    64GB of DDR3 1600 memory, and JDK version 1.7.0_11, build 21.
  25. 45 The Λ Curve • More cores idle, CPU frequency at a lower level •

    More threads used, program completes sooner • The greater the ratio between speedup and power, the steeper the Λ
  26. 59 19M Repos Manual Filter Automatic Filter Commits dh7h3 md8ja

    j287h dij873 dj827h os837 82uan 28a08 2ja82 d0hk0 j29yd a7jf9 aio92 hnna2
  29. 62 19M Repos Manual Filter Automatic Filter Commits dh7h3 md8ja

    j287h dij873 dj827h os837 82uan 28a08 2ja82 d0hk0 j29yd a7jf9 aio92 hnna2 All commits were, at least, double-checked!
  30. Research Questions 64 • RQ1. What are the solutions that

    developers use to save energy in practice? • RQ2. What software quality attributes may be given precedence over energy consumption? • RQ3. How are energy-saving solutions distributed over the software stack? • RQ4. To what extent are software developers certain that their commits will save energy?
  31. RQ1: Solutions 65 • Frequency and voltage scaling (50 occurrences)

    • Use power efficient library/device (45 occurrences) • Disabling features or devices (42 occurrences) • Energy bug fix (26 occurrences) • Low power idling (22 occurrences) • Timing out (16 occurrences)
  33. RQ2: Quality Attributes 67 • Correctness (7 occurrences) • Responsiveness

    (6 occurrences) • Performance (3 occurrences) • No actual power saving (3 occurrences) • Miscellaneous (3 occurrences)
  34. 68 • Correctness (7 occurrences) • Responsiveness (6 occurrences) •

    Performance (3 occurrences) • No actual power saving (3 occurrences) • Miscellaneous (3 occurrences) RQ2: Quality Attributes
  35. 69 • Correctness (7 occurrences) • Responsiveness (6 occurrences) •

    Performance (3 occurrences) • No actual power saving (3 occurrences) • Miscellaneous (3 occurrences) RQ2: Quality Attributes
  36. 72 88 Commits Application includes embedded applications, desktop applications, and

    mobile applications. 42 Embedded RQ3: Software Stack
  37. 73 88 Commits Application includes embedded applications, desktop applications, and

    mobile applications. 42 Embedded 21 Arduino RQ3: Software Stack
  38. 78 142 Commits Operating System includes Kernels, Embedded Kernels, Drivers

    and Firmwares 69 — OS Kernel 54 — Drivers RQ3: Software Stack
  39. RQ4: Certain 79 “Hesitating” words • seem • might •

    doubt • could • hope • attempt • supposed • guess • likely
  40. RQ4: Certain 80 “Hesitating” words • seem • might •

    doubt • could • hope • attempt • supposed • guess • likely 18 hesitations!
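A simple way to flag commit messages that contain the "hesitating" words above is a case-insensitive word match. The sketch below is an assumption about how such a filter could look, not the procedure used in the study: the class name is hypothetical, and matching inflected forms such as "seems" or "hopefully" by prefix is a choice made here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class HesitationFilterSketch {
    // Word list taken from the slide.
    static final List<String> WORDS = Arrays.asList(
        "seem", "might", "doubt", "could", "hope", "attempt", "supposed", "guess", "likely");

    // Assumption: a word counts if it starts with one of the listed words,
    // so "seems" and "hopefully" match, while "unlikely" does not.
    static final Pattern HESITATION = Pattern.compile(
        "\\b(" + String.join("|", WORDS) + ")\\w*", Pattern.CASE_INSENSITIVE);

    static boolean isHesitant(String commitMessage) {
        return HESITATION.matcher(commitMessage).find();
    }

    static long countHesitant(List<String> commitMessages) {
        return commitMessages.stream().filter(HesitationFilterSketch::isHesitant).count();
    }

    public static void main(String[] args) {
        // Hypothetical commit messages, for illustration only.
        List<String> messages = Arrays.asList(
            "This might reduce power usage",
            "Lower CPU frequency when idle",
            "Hopefully saves some battery");
        System.out.println(countHesitant(messages));
    }
}
```

A keyword match like this only produces candidates; each hit would still need a manual read to decide whether the developer was actually uncertain.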
  41. RQ4: Certain 83 18 Reverted Commits 8/18 revert the power

    efficient work queue! There is no silver bullet! Fred Brooks
  42. 2016 84 ICSME’16 Bad programmers worry about the code. Good

    programmers worry about data structures and their relationships. Linus Torvalds
  43. 89 • ArrayList • LinkedList • Vector • Collections.synchronizedList() •

    CopyOnWriteArrayList List<Object> lists = …; Non Thread-Safe Thread-Safe
  44. 98 List<Object> lists = …; • ArrayList • LinkedList •

    Vector • Collections.synchronizedList() • CopyOnWriteArrayList Non Thread-Safe Thread-Safe
  45. 16 Collections 102 List ArrayList Vector Collections.syncList() CopyOnWriteArrayList Set LinkedHashSet

    Collections.syncSet() CopyOnWriteArraySet ConcurrentSkipListSet ConcurrentHashSet ConcurrentHashSetV8 Map LinkedHashMap Hashtable Collections.syncMap() ConcurrentSkipListMap ConcurrentHashMap ConcurrentHashMapV8
  46. 16 Collections 103 List ArrayList Vector Collections.syncList() CopyOnWriteArrayList Set LinkedHashSet

    Collections.syncSet() CopyOnWriteArraySet ConcurrentSkipListSet ConcurrentHashSet ConcurrentHashSetV8 Map LinkedHashMap Hashtable Collections.syncMap() ConcurrentSkipListMap ConcurrentHashMap ConcurrentHashMapV8 Non thread-safe Thread-safe
  47. 16 Collections 104 List ArrayList Vector Collections.syncList() CopyOnWriteArrayList Set LinkedHashSet

    Collections.syncSet() CopyOnWriteArraySet ConcurrentSkipListSet ConcurrentHashSet ConcurrentHashSetV8 Map LinkedHashMap Hashtable Collections.syncMap() ConcurrentSkipListMap ConcurrentHashMap ConcurrentHashMapV8 Java 7 Java 8
  48. 16 Collections 105 List ArrayList Vector Collections.syncList() CopyOnWriteArrayList Set LinkedHashSet

    Collections.syncSet() CopyOnWriteArraySet ConcurrentSkipListSet ConcurrentHashSet ConcurrentHashSetV8 Map LinkedHashMap Hashtable Collections.syncMap() ConcurrentSkipListMap ConcurrentHashMap ConcurrentHashMapV8 x 3 Operations Traversal Insertion Removal
  49. 2 Environments AMD CPU: A 2×16-core, running Debian, 2.4 GHz,

    64GB of memory, JDK version 1.7.0_11, build 21. 106 Intel CPU: A 2×8-core (32-cores w/ hyper-threading), running Debian, 2.60GHz, with 64GB of memory, JDK version 1.7.0_71, build 14.
  50. 2 Environments AMD CPU: A 2×16-core, running Debian, 2.4 GHz,

    64GB of memory, JDK version 1.7.0_11, build 21. 107 Intel CPU: A 2×8-core (32-cores w/ hyper-threading), running Debian, 2.60GHz, with 64GB of memory, JDK version 1.7.0_71, build 14. Hardware-based energy measurement Software-based energy measurement
  51. 119 Maps Intel CPU Traversal Insertion Removal AMD CPU Traversal

    Insertion Removal Less energy than the non thread-safe implementation!
  52. 121 Tomcat > A web server > More than 170K

    lines of Java code > More than 300 Hashtables
  53. 122 Tomcat > A web server > More than 170K lines of Java code >

    More than 300 Hashtables • Xalan > Parses XML in HTML documents > More than 188K lines of Java code > More than 140 Hashtables
  54. 123 For each Hashtable instance, change it for a ConcurrentHashMap

    one. Do it again for ConcurrentHashMapV8 Task:
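When the code is written against the `Map` interface, the replacement described in this task is a one-line change at the construction site. The sketch below illustrates that under stated assumptions: the class name, the "page" key, and the thread counts are hypothetical, and it relies on `merge` being atomic in both `Hashtable` and `ConcurrentHashMap` (Java 8 or later).

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class HashtableSwapSketch {
    // Increments one counter from several threads; only the Map
    // implementation passed in varies between the two runs.
    // Assumes threads >= 1 and perThread >= 1 so that the key exists at the end.
    static int countHits(Map<String, Integer> hits, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    hits.merge("page", 1, Integer::sum); // atomic in both classes
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.get("page");
    }

    public static void main(String[] args) {
        Map<String, Integer> before = new Hashtable<>();        // original code
        Map<String, Integer> after = new ConcurrentHashMap<>(); // after the swap
        System.out.println(countHits(before, 4, 1_000) + " " + countHits(after, 4, 1_000));
    }
}
```

Both runs produce the same count; the energy difference reported in the talk comes from how each class synchronizes internally, which this sketch does not measure.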
  55. 125 Tomcat Xalan Hashtable to CHM: -12.21% Hashtable to CHM8:

    -17.82% Hashtable to CHM: -5.82% Hashtable to CHM8: -9.32%
  56. 129 Hashtable implements Cloneable, Map • ConcurrentHashMap implements Map

    Map<X,Y> obj = new Hashtable<>(); obj.clone(); // works fine
  57. 130 Hashtable implements Cloneable, Map • ConcurrentHashMap implements Map

    Map<X,Y> obj = new Hashtable<>(); obj.clone(); // works fine • Map<X,Y> obj = new ConcurrentHashMap<>(); obj.clone(); // compiler error
  58. 131 Hashtable implements Cloneable, Map • ConcurrentHashMap implements Map

    Map<X,Y> obj = new Hashtable<>(); obj.clone(); // works fine • Map<X,Y> obj = new ConcurrentHashMap<>(); obj.clone(); // compiler error • Danny Dig Opportunity for improving refactoring tools!
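The `clone()` mismatch can be reproduced in a few lines. In this sketch the variables are declared with their concrete types, because `clone()` is not part of the `Map` interface; the class and method names are hypothetical, and the copy constructor is only one possible replacement, not necessarily the fix applied in the study.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CloneMigrationSketch {
    // Before: Hashtable exposes a public clone(), which returns Object.
    @SuppressWarnings("unchecked")
    static Map<String, Integer> copyOf(Hashtable<String, Integer> table) {
        return (Map<String, Integer>) table.clone();
    }

    // After: ConcurrentHashMap is not Cloneable, so map.clone() does not
    // compile. The copy constructor is used here instead.
    static Map<String, Integer> copyOf(ConcurrentHashMap<String, Integer> map) {
        return new ConcurrentHashMap<>(map);
    }

    // Checks that changing the original after copying leaves the copy untouched.
    static boolean copiesAreIndependent() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("k", 1);
        Map<String, Integer> tableCopy = copyOf(table);
        table.put("k", 2);

        ConcurrentHashMap<String, Integer> chm = new ConcurrentHashMap<>();
        chm.put("k", 1);
        Map<String, Integer> chmCopy = copyOf(chm);
        chm.put("k", 2);

        return tableCopy.get("k") == 1 && chmCopy.get("k") == 1;
    }

    public static void main(String[] args) {
        System.out.println(copiesAreIndependent());
    }
}
```

A refactoring tool that automates the Hashtable-to-ConcurrentHashMap swap would need to detect call sites like this and rewrite them as well.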
  59. 140 actor FIFO (Centralized) 1s 2s 1s 92s 1s 1s

    actor Decentralized 1s 2s 1s 92s 1s 1s If a message is being executed when another message arrives, fork the second message
  60. 141

  61. 142 #1. Copy on Fork #2. Copy on Join #3.

    Scattered Data #4. Exacting Intra-Task Synchronization #5. Bottleneck #6. Sleepy Workers
  62. 143