Defense talk: Energy-Efficient Mobile Web Computing

Yuhao Zhu
September 07, 2016

Transcript

  1. 1.

    Energy-Efficient Mobile Web Computing. Sept. 7th, 2016. Yuhao Zhu, Electrical and Computer Engineering Department, The University of Texas at Austin. Advisor: Vijay Janapa Reddi.
  5. 10.

    The Web Evolution (timeline): 1990 HTML; 1996 JavaScript; 2008 Mobile Web; 2012 Responsive Web; 2016 Watt Wise Web. Themes: Functionality, Performance, Energy.
  9. 15.

    Web: Mobile Overtaking Desktop

    [Figure: search volume (billions, 0 to 120) for Mobile and Desktop, 2011 to 2016, with a marker for when the work started. Source: BIA/Kelsey.]
  10. 21.

    The Scope of Mobile Web: Mobile Client, Cellular Network, Cloud Web Servers. [MICRO 2015] (Top Picks Honorable Mention)
  15. 29.

    Isn’t Mobile Web a Network Issue?

    [Figure: load time (s, 2 to 38) versus network RTT (ms, log scale, 10 to 1000), with network types marked: LTE, 3G, Adverse 3G, 2G, Wi-Fi.]

    ▸ Samsung Galaxy S4 smartphone.
    ▸ Hot webpages from Alexa [1].
    ▸ Time measured using Navigation Timing API [2].

    [1] http://www.alexa.com/
    [2] https://www.w3.org/TR/navigation-timing-2/
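As a minimal sketch of how a load time can be read from the Navigation Timing API cited on this slide: the talk does not say which timestamps were used, so measuring from `startTime` to `loadEventEnd` is an assumption here, and `pageLoadTimeMs` and `sampleEntry` are hypothetical names.

```javascript
// Compute page load time from a Navigation Timing Level 2 entry.
// Returns null while the load event has not finished (loadEventEnd is 0).
function pageLoadTimeMs(navEntry) {
  if (!navEntry || !(navEntry.loadEventEnd > 0)) {
    return null;
  }
  return navEntry.loadEventEnd - navEntry.startTime;
}

// In a browser, the entry would come from the Performance Timeline:
//   const [nav] = performance.getEntriesByType("navigation");
//   console.log(pageLoadTimeMs(nav));
// A hand-made object stands in for it here so the sketch also runs in Node.
const sampleEntry = { startTime: 0, loadEventEnd: 2350.5 };
console.log(pageLoadTimeMs(sampleEntry));
```

The null return matters in practice: reading the entry before the `load` event completes would otherwise report a negative or zero duration.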
  23. 40.

    Traditional Approach

    [Figure: layered Web stack with the labels Frameworks and Libraries; HTML, JavaScript, CSS; Language; Runtime; Styling, Security, Local Storage, User Input, Layout, Render; Application; Inputs; Architecture.]

    ▸ Parallelize browser computation
    ▸ Ignored!
    ▸ Voltage/frequency scaling on general-purpose processors
    ▸ End of Dennard Scaling!
    ▸ Diminishing return
  31. 48.

    My Approach

    [Figure: layered Web stack with the labels Frameworks and Libraries; HTML, JavaScript, CSS; Language; Runtime; Styling, Security, Local Storage, User Input, Layout, Render; Application; Inputs; Architecture.]

    ▸ WebCore: Web-specific Architecture
    ▸ GreenWeb: Language Extensions
    ▸ WebRT: Energy-aware Web Runtime

    Annotations shown on intermediate builds: Parallelize browser computation; Ignored!; Lost page-level diversity; Lost user QoS requirements.
  33. 50.

    My Approach: My Dissertation Work

    ▸ Application: GreenWeb, Language Extensions
    ▸ Runtime: WebRT, Energy-aware Web Runtime
    ▸ Architecture: WebCore, Web-specific Architecture

    Publications: [PLDI 2016] [ISCA 2014] [HPCA 2013] [HPCA 2015] [CAL 2014] (Best of CAL)
  35. 53.

    Thesis Statement

    Future mobile Web systems can achieve energy-efficiency without sacrificing responsiveness by incorporating:

    ▸ Programming language annotations to convey user QoS information
    ▸ Runtime scheduling mechanisms to exploit heterogeneous hardware
    ▸ Hardware accelerators specialized for the key computation kernel
  37. 55.

    My Dissertation Work

    ▸ Application: GreenWeb, Language Extensions
    ▸ Runtime: WebRT, Energy-aware Web Runtime
    ▸ Architecture: WebCore, Web-specific Architecture
  41. 59.

    Energy Concern Among Mobile Developers

    [Figure: survey responses (percentage, 0 to 100) from Mobile, Desktop, and Data Center developers to the statement “My applications have requirements about energy usage.” Answer categories: Never/Rarely, Sometimes, Often/Almost Always.]

    [ICSE 2016] Manotas et al., “An Empirical Study of Practitioners’ Perspectives on Green Software Engineering”
  44. 62.

    Developers are Willing to Make Trade-offs

    [Figure: survey responses (percentage, 0 to 100) from Mobile developers to the statement “I'm willing to sacrifice performance, etc. for reduced energy usage.” Answer categories: Never/Rarely, Sometimes, Often/Almost Always.]

    [ICSE 2016] Manotas et al., “An Empirical Study of Practitioners’ Perspectives on Green Software Engineering”
  47. 72.

    GreenWeb: Language for Energy-Efficiency

    ▸ Language abstractions for expressing QoS
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Results in 60% energy savings on real hardware/software implementations
  60. 88.

    Understanding Mobile Web QoS

    [Figure: QoS experience plotted against performance, with an energy curve added in later builds. Regions: Unusable, Tolerable, Imperceptible. Annotations: Too slow; Diminishing Returns; “Negative” Energy consumption.]

    [OSDI 1996] Y. Endo et al., “Using Latency to Evaluate Interactive System Performance.”
  64. 93.

    Abstracting Mobile Web QoS

    [Figure: QoS experience versus performance. Regions: Unusable, Tolerable, Imperceptible.]

    ▸ QoS Type: performance metric
      ▹ Frame latency vs. Frame throughput
    ▸ QoS Target: threshold performance values
      ▹ Imperceptible target vs. Usable target
  69. 99.

    Expressing Mobile Web QoS

    Expressing QoS at an event granularity. In the example, the element is the div and the event is ontouchend.

        <html>
        <head>
          <script>
            function animateMove() { /* Animation code omitted */ }
          </script>
        </head>
        <body>
          <div ontouchend="animateMove()"></div>
          <!-- other elements -->
        </body>
        </html>
  78. 108.

    Expressing Mobile Web QoS: Annotation

    Annotation form: element { event: Type, Target }

        <html>
        <head>
          <style>
            div { ontouchend: throughput, low; }
          </style>
          <script>
            function animateMove() { /* Animation code omitted */ }
          </script>
        </head>
        <body>
          <div ontouchend="animateMove()"></div>
          <!-- other elements -->
        </body>
        </html>

    ▸ Implementation independent: the callback can be swapped for function newAnimateMove() { /* New animation code */ } without touching the annotation.
    ▸ Non-interfering w.r.t. functionality
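To make the `element { event: Type, Target }` form concrete, here is a small sketch that parses one such rule into its four parts. It is not GreenWeb's implementation; `parseQosRule` is a hypothetical helper, and it only handles the single-declaration form shown on the slide.

```javascript
// Parse one GreenWeb-style rule of the form
//   selector { event: type, target }
// into a plain object. Returns null when the text does not match that form.
function parseQosRule(ruleText) {
  const pattern =
    /^\s*([^{]+?)\s*\{\s*([\w-]+)\s*:\s*([\w-]+)\s*,\s*([^;}]+?)\s*;?\s*\}\s*$/;
  const match = pattern.exec(ruleText);
  if (!match) {
    return null;
  }
  const [, selector, event, type, target] = match;
  return { selector, event, type, target };
}

// The two annotations that appear in the slides.
console.log(parseQosRule("div { ontouchend: throughput, low; }"));
console.log(parseQosRule("div {ontouchend: latency, 16 ms}"));
```

Because the rule names only an element and an event, the parsed result is the same whichever callback is attached, which is the sense in which the annotation is implementation independent.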
  84. 116.

    GreenWeb Annotation Process

    Automatic annotation?
    ▸ AutoGreen: automatically reasons about and inserts GreenWeb annotations

    [Figure: from the original application to the GreenWeb-annotated application. Stages shown: DOM Tree, Callback Instrumentation, Event Profiling (QoS Information), Annotation Generation.]
  86. 118.

    GreenWeb: Language for Energy-Efficiency

    ▸ Language abstractions for expressing QoS
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Results in 60% energy savings on real hardware/software implementations
  98. 133.

    GreenWeb Runtime Overview

    Runtime objective: enforcing event-level QoS at the frame level energy-efficiently.

    [Figure: timeline of events and the frames they produce, driven by QoS annotations.]

    ▸ QoS type: latency, QoS target: 16 ms. One frame follows the event within 16 ms.
    ▸ QoS type: throughput, QoS target: 16 ms. The event produces a sequence of frames spaced 16 ms apart.
    ▸ A later event with QoS target: 2 s produces its frame within 2 s.

    Runtime steps: (1) Frame Association, (2) Frame Scheduling.
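The difference between the two QoS types can be sketched as a check over frame timestamps. This is one reading of the timeline on the slide rather than the runtime's actual logic: `meetsQos` is a hypothetical function, and treating throughput as a bound on every frame-to-frame gap is an assumption.

```javascript
// Check whether the frames produced for one event satisfy its QoS annotation.
// "latency":    the first frame must appear within targetMs of the event.
// "throughput": every frame must follow the previous frame (or the event,
//               for the first frame) within targetMs.
function meetsQos(type, targetMs, eventTimeMs, frameTimesMs) {
  if (frameTimesMs.length === 0) {
    return false; // no frame was produced at all
  }
  if (type === "latency") {
    return frameTimesMs[0] - eventTimeMs <= targetMs;
  }
  if (type === "throughput") {
    let previous = eventTimeMs;
    for (const frameTime of frameTimesMs) {
      if (frameTime - previous > targetMs) {
        return false;
      }
      previous = frameTime;
    }
    return true;
  }
  throw new Error(`Unknown QoS type: ${type}`);
}

console.log(meetsQos("latency", 16, 0, [12]));
console.log(meetsQos("throughput", 16, 0, [16, 32, 48]));
console.log(meetsQos("throughput", 16, 0, [16, 40]));
```

A check like this operates per frame, which is why the runtime first has to associate each frame with the event that caused it.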
  103. 140.

    Frame Association

    [Figure: an event passes through the Browser Process, the Renderer Process (Main Thread and Compositor Thread), and the GPU Process to produce a frame. Pipeline stages are labeled S, S, L, P, C. Processes communicate via IPC; threads communicate via inter-thread messages.]

    Distribute QoS information along with the communication messages.
  105. 143.

    Choices of Energy-saving Techniques

    GreenWeb can support a range of energy-saving techniques:

    ▹ Dynamic resolution scaling [MobiCom 2015]
    ▹ Power-saving display colors [MobiSys 2012]
    ▹ Selective resource loading [NSDI 2015]
    ▹ ACMP-based hardware mechanism (WebRT)
  108. 148.

    Asymmetric Chip-multiprocessor (ACMP)

    [Figure: energy consumption versus performance across the frequency levels of the big core and the small core.]

    ▸ Offer a large performance-energy trade-off space
    ▸ Already used in commodity devices (e.g., Samsung Galaxy S6)
  113. 153.

    ACMP-based GreenWeb Runtime

    [Figure: energy consumption versus performance across the frequency levels of the big core and the small core, with the 16 ms target from div {ontouchend: latency, 16 ms} marked.]

    ▸ Provide just enough energy to meet QoS constraints
    ▸ Predict the performance and energy of each configuration!
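"Just enough energy" can be read as a selection problem: among the core and frequency configurations whose predicted time meets the target, take the one with the lowest predicted energy. The sketch below illustrates that reading only; `pickConfig`, the field names, and all numbers are made up, and the fallback to the fastest configuration is an assumption.

```javascript
// Pick the lowest-energy configuration whose predicted time meets the target.
// Falls back to the fastest configuration when none meets it.
function pickConfig(configs, targetMs) {
  const feasible = configs.filter((c) => c.predictedMs <= targetMs);
  if (feasible.length === 0) {
    return configs.reduce((best, c) =>
      c.predictedMs < best.predictedMs ? c : best
    );
  }
  return feasible.reduce((best, c) =>
    c.predictedMj < best.predictedMj ? c : best
  );
}

// Hypothetical predictions (time in ms, energy in mJ) for illustration only.
const configs = [
  { name: "small-low", predictedMs: 30, predictedMj: 4 },
  { name: "small-high", predictedMs: 15, predictedMj: 6 },
  { name: "big-high", predictedMs: 6, predictedMj: 14 },
];

console.log(pickConfig(configs, 16).name);
```

With a 16 ms target the slowest option misses the deadline and the fastest one overspends, so the middle configuration is chosen.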
  116. 159.

    Different Strategies for Different Events

    ▸ Events: Loading, Touching, Moving
    ▸ WebRT components: Proactive Mechanism, Adaptive Mechanism
    ▸ Annotation shown on an intermediate build: Repetitive in a usage session
  118. 165.

    Breaking Down the Computations

    Web Primitives:

    ▸ HTML (Structure): DOM Tree, Tag, Attribute
    ▸ CSS (Style): Selector, Property
  122. 169.

    Predicting Loading Performance & Energy

    Idea: predict load time and energy (responses) based on Web primitives (predictors).

    ▸ Identify Predictors: training using top 2,500 webpages. Predictors (HTML, CSS); Responses (Time, Energy).
    ▸ Model Construction & Refinement: Linear Regression, then refine the linear model (Mitigate Over-fitting, Model Non-Linearity).
    ▸ Model Validation: validating on another 2,500 webpages. Outputs: Loading Time Model, Energy Model.
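The starting point of the modeling flow above is linear regression. A deliberately simplified sketch follows: it fits a single predictor by ordinary least squares, whereas the talk describes multiple HTML/CSS predictors plus refinements that are not reproduced here. `fitLine`, `predict`, and the data points are hypothetical.

```javascript
// Fit y = intercept + slope * x by ordinary least squares (one predictor).
function fitLine(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("Need two or more paired samples");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    sxy += dx * (ys[i] - meanY);
    sxx += dx * dx;
  }
  if (sxx === 0) {
    throw new Error("All x values are identical");
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Apply a fitted model to a new predictor value.
function predict(model, x) {
  return model.intercept + model.slope * x;
}

// Made-up training data: DOM node count versus load time in ms.
const model = fitLine([100, 200, 300, 400], [300, 500, 700, 900]);
console.log(model, predict(model, 500));
```

In the talk's setting the same fit would be done once for load time and once for energy, giving the two models named on the slide.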
  123. 170.

    Predicting Loading Performance & Energy

    [Figure: prediction error distributions (0.00 to 0.20) for performance and for energy.]

    Median prediction error is less than 5%.
  131. 178.

    ACMP-based GreenWeb Runtime

    $T_{\text{exec}} = T_{\text{memory}} + T_{\text{cpu}} = T_{\text{memory}} + N_{\text{cycles}} / f$

    $E = T_{\text{exec}} \times P$

    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
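The two formulas on this slide translate directly into code. The sketch below uses hypothetical function names and made-up inputs; units (seconds, hertz, watts, joules) are an assumption, since the slide does not state them.

```javascript
// Execution Time = Tmemory + Ncycles / f
// tMemoryS: time spent waiting on memory (s), unaffected by CPU frequency.
// nCycles:  CPU cycles of compute; freqHz: CPU frequency in Hz.
function executionTimeS(tMemoryS, nCycles, freqHz) {
  return tMemoryS + nCycles / freqHz;
}

// Energy = Execution Time x Power
function energyJ(execTimeS, powerW) {
  return execTimeS * powerW;
}

// Made-up workload: 2 ms of memory time and 8 million CPU cycles.
const atOneGhz = executionTimeS(0.002, 8e6, 1e9);
const atHalfGhz = executionTimeS(0.002, 8e6, 0.5e9);
console.log(atOneGhz, atHalfGhz, energyJ(atOneGhz, 2));
```

Halving the frequency doubles only the compute term, so total time grows by less than 2x; that is why the memory term has to be modeled separately when predicting each configuration.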
  133. 180.

    GreenWeb: Language for Energy-Efficiency

    ▸ Language abstractions for expressing QoS
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Results in 60% energy savings on real hardware/software implementations
  137. 184.

    Real Hardware/Software Setup

    ▸ ODroid XU+E development board, which contains an Exynos 5410 SoC used in Samsung Galaxy S4.
    ▸ Big core cluster: ARM Cortex A15, out-of-order (OoO), 3-issue.
    ▸ Little core cluster: ARM Cortex A7, in-order, 2-issue.
    ▸ Overhead: frequency switch 100 us; core migration 20 us.
    ▸ Implementation incorporated into Chrome running on Android.
    ▸ UI-level record and replay for reproducibility. [ISPASS’15]
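  138. 185.

    Power and Energy Measurements

    [Figure: measurement circuit. A 15 mΩ sense resistor sits between the VRM and the SoC (labeled ARM Cortex A9); the voltage across it (Vin+, Vin-) feeds an amplifier with gain x50, whose output (Vout, GND) is probed by a data acquisition (DAQ) unit.]

    $P = \dfrac{V_{in+} - V_{in-}}{R_{\text{sense}}} \times V_{in-}$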
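The power formula on this slide can be sketched as follows. `powerW` follows the slide's equation directly; `powerFromAmplifiedW` assumes the DAQ samples the x50-amplified drop across the sense resistor, which is an interpretation of the figure rather than something the slide states. Both names are hypothetical.

```javascript
// Power = (Vin+ - Vin-) / Rsense * Vin-
// The drop across the sense resistor gives the current; Vin- is the
// voltage delivered to the load.
function powerW(vinPlus, vinMinus, rSenseOhm = 0.015) {
  const currentA = (vinPlus - vinMinus) / rSenseOhm;
  return currentA * vinMinus;
}

// Assumption: the probe sees the drop after amplification, so divide
// the sampled voltage by the gain to recover the drop.
function powerFromAmplifiedW(vOut, vinMinus, gain = 50, rSenseOhm = 0.015) {
  const currentA = vOut / gain / rSenseOhm;
  return currentA * vinMinus;
}

// Made-up readings: a 30 mV drop at a 1.0 V supply is 2 A, i.e. about 2 W.
console.log(powerW(1.03, 1.0));
console.log(powerFromAmplifiedW(1.5, 1.0));
```

Energy for a run would then be the sum of sampled power values multiplied by the sampling interval.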
  141. 188.

    Evaluation

    ▸ Baseline Mechanisms
      ▹ Highest performance (Perf): standard to guarantee responsiveness
      ▹ Interactive governor (Interactive): Android default
    ▸ Metrics
      ▹ Energy Saving
      ▹ QoS Violation
    ▸ Applications
      ▹ Top webpages (e.g., www.amazon.com)
      ▹ Web Apps based on popular frameworks (e.g., Todo List)
  147. 194.

    Evaluation Results

    [Figure: two bar charts over CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN. One shows normalized energy (0.0 to 1.0) for GreenWeb, Interactive, and Perf; the other shows QoS violations (%, 0.0 to 3.0). Chart annotation: No QoS Violations]

    ▸ 29.2% - 66.0% energy savings, 0.8% more QoS violations
  148. 195.

    Architecture Configuration Distribution

    [Figure: time distribution (%, 0 to 100) per application (CamanJS, Craigslist, Paperjs, Goo, Google, Todo, Cnet, BBC, LZMA-JS, Amazon, W3School, MSN) across architecture configurations. Legend: A15 and A7 cores at 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, and 1.6 GHz]
  152. 200.

    GreenWeb

    Programming language support for balancing energy-efficiency and QoS in mobile Web computing

    ▸ Abstraction: Express QoS constraints
    ▸ Runtime: Satisfy QoS specifications using energy saving techniques
    ▸ Effect: Significant energy savings
  153. 201.

    My Approach

    My Dissertation Work
    ▸ Application: GreenWeb (Language Extensions)
    ▸ Runtime: WebRT (Energy-aware Web Runtime)
    ▸ Architecture: WebCore (Web-specific Architecture)
  154. 205.

    WebCore: a Web-Specific Mobile Architecture

    [Figure: Energy versus Execution Time design space, with General-Purpose Designs and ASIC? marked]

    ASIC? Extremely challenging
    ‣ Chrome: 17M LoC, 29 languages
      ▹ cf. H264 codec: 0.13M LoC, 6 languages
    ‣ Code base is very irregular
      ▹ Not amenable to traditional specialization
  155. 207.
  156. 212.

    Specialization Target: Style Resolution Kernel

    [Figure: two pie charts over the categories Render, Style, Other, Layout, and DOM. Execution time breakdown, slice values as extracted: 10%, 13%, 17%, 25%, 35%. Energy breakdown, slice values as extracted: 12%, 14%, 16%, 18%, 40%]
  163. 219.

    Specialization Target: Style Resolution Kernel

    for (each rule in matchedRules) {        // Rule-level Parallelism (RLP)
      for (each property in rule) {          // Property-level Parallelism (PLP)
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }

    ▸ Exploit the parallelism to increase the arithmetic intensity
    ▸ Move operands closer to operations to sustain the computations
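The kernel above can be sketched as runnable Python to show where the two kinds of parallelism sit. This is an illustration under stated assumptions, not the WebCore hardware or Chrome's code: `handler`, the list-of-tuples rule encoding, and the `lanes` parameter are hypothetical, and Python threads only mimic the structure (they do not give a real speedup for this kind of work). Every (rule, property) pair is evaluated independently, and the results are then merged in the original rule order so that a later rule overrides an earlier one, which is what the serial loop does.

```python
from concurrent.futures import ThreadPoolExecutor


def handler(value, dom_node):
    """Hypothetical stand-in for the per-property handler on the slide."""
    return f"{value} on <{dom_node}>"


def resolve_serial(matched_rules, dom_node):
    """Direct transcription of the nested loop from the slide."""
    style = {}
    for rule in matched_rules:
        for prop_id, value in rule:
            style[prop_id] = handler(value, dom_node)
    return style


def resolve_parallel(matched_rules, dom_node, lanes=4):
    """Evaluate all (rule, property) pairs concurrently, then merge in order."""
    # Flatten both loops: rule-level and property-level work become one task list.
    tasks = [(prop_id, value) for rule in matched_rules for prop_id, value in rule]
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        # Executor.map returns results in task order, regardless of finish order.
        values = list(pool.map(lambda t: handler(t[1], dom_node), tasks))
    style = {}
    for (prop_id, _), resolved in zip(tasks, values):
        style[prop_id] = resolved  # in-order merge: later rules win conflicts
    return style


if __name__ == "__main__":
    rules = [[("font", "serif"), ("color", "black")], [("font", "sans")]]
    print(resolve_parallel(rules, "p"))
```

The in-order merge plays the role that a conflict-resolution step would need to play whenever two rules set the same property.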
  164. 220.

    Style Resolution Unit

    [Diagram components: Input Scratchpad (rules i and j with their ids and start/end pointers; properties k, l, and m), Compute Lanes, Conflict Resolution, Output Scratchpad (Style k, Style l, Style m)]
  173. 231.

    Evaluation Results

    [Figure: Energy (J) and Load Time (s) for three design points: A15-like design, Customization, and Specialization. Axis tick ranges as extracted: 0.55 to 1.1 and 1.6 to 2.4. Step labels on the chart: 18.6%, 22.2%, 9.2%, 22.2%]

    ▸ Fully synthesized using Synopsys 28 nm toolchain
    ▸ Cost of specialization: 0.59 mm² area overhead
      ▹ SoC die area is 122 mm² in Samsung Galaxy S4
      ▹ A15s’ area: 19 mm²
  174. 232.

    Evaluation Results

    [Figure: Energy (J) and Load Time (s) for three design points: A15-like design, Customization, and Specialization. Axis tick ranges as extracted: 0.55 to 1.1 and 1.6 to 2.4. Overall labels on the chart: 29.2%, 47.0%]

    ▸ Fully synthesized using Synopsys 28 nm toolchain
    ▸ Cost of specialization: 0.59 mm² area overhead
      ▹ SoC die area is 122 mm² in Samsung Galaxy S4
      ▹ A15s’ area: 19 mm²
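To put the area cost in proportion, the three figures on the slide can be turned into relative overheads. The snippet below is only arithmetic on the slide's numbers; the constant and function names are hypothetical.

```python
UNIT_AREA_MM2 = 0.59   # area overhead of the specialization, from the slide
SOC_AREA_MM2 = 122.0   # SoC die area in Samsung Galaxy S4, from the slide
A15_AREA_MM2 = 19.0    # area of the A15s, from the slide


def overhead_percent(added_mm2: float, base_mm2: float) -> float:
    """Added area as a percentage of a baseline area."""
    return 100.0 * added_mm2 / base_mm2


if __name__ == "__main__":
    print(f"vs. SoC die: {overhead_percent(UNIT_AREA_MM2, SOC_AREA_MM2):.2f}%")
    print(f"vs. A15s:    {overhead_percent(UNIT_AREA_MM2, A15_AREA_MM2):.2f}%")
```

By this arithmetic the added logic is roughly half a percent of the SoC die and about three percent of the A15s' area.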
  176. 236.

    Retrospective: Three Principles Learnt

    Layers: Runtime, Application, Architecture

    ▸ Empowering Web Developers
    ▸ Exploiting Application Diversity
    ▸ General-purpose vs. Specialization
  178. 238.

    The Web Evolution

    ▸ 1990: HTML
    ▸ 1996: JavaScript
    ▸ 2008: Mobile Web
    ▸ 2012: Responsive Web
    ▸ 2016: Watt Wise Web
    ▸ ???
  181. 241.

    [Diagram: a Web stack drawn five times, each containing Frameworks and Libraries, HTML, JavaScript, CSS, Language Runtime, Styling, Security, Local Storage, User Input, Layout, and Render]
  182. 243.
  183. 244.

    [ACM Queue] Yuhao Zhu, Vijay Janapa Reddi, “The Red Future of Mobile Web Computing”
    [PLDI 2016] Yuhao Zhu, Vijay Janapa Reddi, “GreenWeb: Language Extensions for Energy-Efficient Mobile Web Computing”
    [HPCA 2015] Yuhao Zhu, Matthew Halpern, Vijay Janapa Reddi, “Event-Based Scheduling for Energy-Efficient QoS (eQoS) in Mobile Web Applications”
    [HPCA 2013] Yuhao Zhu, Vijay Janapa Reddi, “High-Performance and Energy-Efficient Mobile Web Browsing on Big/Little Systems”
    [CAL 2012] Yuhao Zhu, Aditya Srikanth, Jingwen Leng, Vijay Janapa Reddi, “Exploiting Webpage Characteristics for Energy-Efficient Mobile Web Browsing” (Best of CAL)
    [ISCA 2014] Yuhao Zhu, Vijay Janapa Reddi, “WebCore: Architectural Support for Mobile Web Browsing”
    [IEEE MICRO 2015] Yuhao Zhu, Matthew Halpern, Vijay Janapa Reddi, “The Role of the CPU in Energy-Efficient Mobile Web Browsing”
    [HPCA 2016] Matthew Halpern, Yuhao Zhu, Vijay Janapa Reddi, “Mobile CPU’s Rise to Power: Quantifying the Impact of Generational Mobile CPU Design Trends on Performance, Energy, and User Satisfaction”

    Group labels on the slide: GreenWeb, WebRT, WebCore, Motivational Studies, Future Web
  184. 245.

    [DAC 2011] Yuhao Zhu, Yangdong Deng, Yubei Chen, “Hermes: An Integrated CPU/GPU Microarchitecture for IP Routing.”
    [DAC 2010] Bo Wang, Yuhao Zhu, Yangdong Deng, “Distributed Time, Conservative Parallel Logic Simulation on GPUs.”
    [TODAES 2011] Yuhao Zhu, Bo Wang, Yangdong Deng, “Massively Parallel Logic Simulation with GPUs.”
    [ISPASS 2015] Matthew Halpern, Yuhao Zhu, Ramesh Peri, and Vijay Janapa Reddi, “Mosaic: Cross-platform User-interaction Record and Replay for the Fragmented Android Ecosystem.”
    [IRPS 2014] Chen Zhou, Xiaofei Wang, Weichao Xu, Yuhao Zhu, Vijay Janapa Reddi, Chris Kim, “Estimation of Instantaneous Frequency Fluctuation in a Fast DVFS Environment Using an Empirical BTI Stress-Relaxation Model.”
    [MICRO 2015] Yuhao Zhu, Daniel Richins, Matthew Halpern, Vijay Janapa Reddi, “Microarchitectural Implications of Event-driven Server-side Web Applications” (Top Picks Honorable Mention)

    Group labels on the slide: GPGPU & IP Routing Architecture, Tools, Reliability, Server Microarch
  185. 246.

    Coursework

    Name                          Instructor          Semester     SUP  Grade
    COMPILERS                     Keshav Pingali      Fall 2010         A
    ADV EMBED MICROCONTROL SYS    Mark McDermott      Spring 2011       A-
    MEMORY MANAGEMENT             Kathryn McKinley    Spring 2011  Y    A
    VLSI I                        Jacob Abraham       Fall 2011         A-
    COMP ARCH: PARALLISM/LOCLTY   Mattan Erez         Fall 2011         A
    MICROARCHITECTURE             Yale Patt           Spring 2012       B
    DYNAMIC COMPILATION           Vijay Janapa Reddi  Spring 2012       A-
    COMP PERF EVAL/BENCHMARKING   Lizy John           Fall 2012         B+
    PARALLEL COMP ARCHITECTURE    Derek Chiou         Spring 2013       B+
    HUMAN COMPUT & CROWDSRCING    Matt Lease          Fall 2015    Y    A-