Defense talk: Energy-Efficient Mobile Web Computing

Yuhao Zhu
September 07, 2016

Transcript

  1. Energy-Efficient Mobile Web Computing
    Sept. 7th, 2016
    Yuhao Zhu
    Electrical and Computer Engineering Department, The University of Texas at Austin
    Advisor: Vijay Janapa Reddi

  2. The Web Evolution
    1990: HTML
    1996: JavaScript
    2008: Mobile Web
    2012: Responsive Web
    2016: Watt Wise Web
    Functionality, Performance, Energy

  11. Web: Mobile Overtaking Desktop
    [Figure: search volume (billions, 0 to 120) for Mobile and Desktop, 2011 to 2016, with a marker "When the work started". Source: BIA/Kelsey]

  16. Web ≈ Mobile Web

  17. The Scope of Mobile Web
    Mobile Client
    Cellular Network
    Cloud Web Servers
    [MICRO 2015] (Top Picks Honorable Mention)

  22. The Scope of Mobile Web
    Mobile Client
    Cellular Network

  24. Isn’t Mobile Web a Network Issue?
    Mobile Client
    Cellular Network

  25. Isn’t Mobile Web a Network Issue?
    [Figure: load time (s, axis 2 to 38) vs. network RTT (ms, log scale from 10 to 2000), with network types marked: LTE, 3G, Adverse 3G, 2G, Wi-Fi]
    ▸ Samsung Galaxy S4 smartphone.
    ▸ Hot webpages from Alexa [1].
    ▸ Time measured using Navigation Timing API [2].
    [1] http://www.alexa.com/
    [2] https://www.w3.org/TR/navigation-timing-2/
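
The load-time measurement named on this slide can be sketched with Navigation Timing Level 2 fields. This is a minimal sketch, assuming "load time" means `loadEventEnd` relative to the navigation's `startTime`; the deck does not say which timestamps were used, and the helper name `pageLoadTimeMs` is hypothetical.

```javascript
// Hypothetical helper: page load time in milliseconds from a
// PerformanceNavigationTiming-like entry (Navigation Timing Level 2).
// Assumption: "load time" = loadEventEnd - startTime; the talk does not
// state which timestamps were actually used.
function pageLoadTimeMs(entry) {
  if (!entry || entry.loadEventEnd <= 0) {
    // loadEventEnd stays 0 until the load event has finished.
    return null;
  }
  return entry.loadEventEnd - entry.startTime;
}

// In a browser, after the load event has completed:
//   const [nav] = performance.getEntriesByType("navigation");
//   console.log(pageLoadTimeMs(nav));
```

Taking the entry as a parameter keeps the arithmetic testable outside a browser; only the commented lines depend on the page environment.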

  30. Mobile Web is also a Compute Issue!
    Mobile Client
    Cellular Network
    My Work

  32. Traditional Approach
    [Diagram: Web software stack: Frameworks and Libraries; HTML, CSS, JavaScript; Language Runtime; Styling; Security; Local Storage; User Input; Layout; Render]
    ▸ Parallelize browser computation
    ▸ Ignored!
    Application
    Inputs
    Architecture
    ▸ Voltage/frequency scaling on general-purpose processors
    ▸ End of Dennard Scaling!
    ▸ Diminishing return

  41. My Approach
    [Diagram: Web software stack: Frameworks and Libraries; HTML, CSS, JavaScript; Language Runtime; Styling; Security; Local Storage; User Input; Layout; Render]
    ▸ Parallelize browser computation
    ▸ Ignored!
    Application
    Inputs
    Architecture
    ▸ Lost page-level diversity
    ▸ Lost user QoS requirements
    WebCore: Web-specific Architecture

  44. My Approach
    [Diagram: Web software stack: Frameworks and Libraries; HTML, CSS, JavaScript; Language Runtime; Styling; Security; Local Storage; User Input; Layout; Render]
    Application: GreenWeb (Language Extensions)
    Runtime: WebRT (Energy-aware Web Runtime)
    Architecture: WebCore (Web-specific Architecture)

  49. My Approach
    My Dissertation Work
    Application: GreenWeb (Language Extensions)
    Runtime: WebRT (Energy-aware Web Runtime)
    Architecture: WebCore (Web-specific Architecture)
    [PLDI 2016] [ISCA 2014] [HPCA 2013] [HPCA 2015] [CAL 2014] (Best of CAL)

  51. Thesis Statement
    Future mobile Web systems can achieve energy-efficiency without sacrificing responsiveness by incorporating:
    ▸ Programming language annotations to convey user QoS information
    ▸ Runtime scheduling mechanisms to exploit heterogeneous hardware
    ▸ Hardware accelerators specialized for the key computation kernel

  54. My Approach
    My Dissertation Work
    Application: GreenWeb (Language Extensions)
    Runtime: WebRT (Energy-aware Web Runtime)
    Architecture: WebCore (Web-specific Architecture)

  56. Energy Concern Among Mobile Developers
    [Figure: bar chart of the percentage (0 to 100) of Mobile, Desktop, and Data Center developers answering Never/Rarely, Sometimes, or Often/Almost Always to “My applications have requirements about energy usage.”]
    [ICSE 2016] Manotas et al., “An Empirical Study of Practitioners’ Perspectives on Green Software Engineering”

  60. Developers are Willing to Make Trade-offs
    [Figure: bar chart of the percentage (0 to 100) of Mobile developers answering Never/Rarely, Sometimes, or Often/Almost Always to “I'm willing to sacrifice performance, etc. for reduced energy usage.”]
    [ICSE 2016] Manotas et al., “An Empirical Study of Practitioners’ Perspectives on Green Software Engineering”

  63. Quality-of-service, Energy-efficiency
    Conflicting requirements

  66. GreenWeb
    Programming language support for balancing energy-efficiency and QoS in mobile Web computing

  69. GreenWeb: Language for Energy-Efficiency
    ▸ Language abstractions for expressing QoS
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Results in 60% energy savings on real hardware/software implementations

  73. What is QoS in mobile Web?

  74. Understanding Mobile Web QoS
    [Figure: QoS Experience vs. Performance; regions: Unusable, Tolerable, Imperceptible; callouts: “Too slow”, “Diminishing Returns”]
    [OSDI 1996] Y. Endo et al., “Using Latency to Evaluate Interactive System Performance.”

  82. Understanding Mobile Web QoS
    [Figure: QoS Experience and Energy vs. Performance; regions: Unusable, Tolerable, Imperceptible; callout: “Negative” Energy consumption]
    [OSDI 1996] Y. Endo et al., “Using Latency to Evaluate Interactive System Performance.”

  89. Abstracting Mobile Web QoS
    [Figure: QoS Experience vs. Performance; regions: Unusable, Tolerable, Imperceptible]
    ▸ Performance metric (QoS Type)
      ▹ Frame latency vs. Frame throughput
    ▸ Threshold performance values (QoS Target)
      ▹ Imperceptible target vs. Usable target

  94. Expressing Mobile Web QoS
    function animateMove() {
      /* Animation code omitted */
    }
    Labels: element, event
    Expressing QoS at an event granularity

  100. { : Type, Target}
    function animateMove() {
      /* Animation code omitted */
    }
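
To make the annotation shape concrete, here is a minimal sketch of a parser for rules written like the one that appears later in the deck, `div {ontouchend: latency, 16 ms}`. The grammar (a selector, an event name, a type of `latency` or `throughput`, and a target in `ms` or `s`) is an assumption inferred from the slides, not the GreenWeb specification, and `parseQosAnnotation` is a hypothetical name.

```javascript
// Hypothetical parser for annotations shaped like
//   selector {event: type, target}
// e.g. "div {ontouchend: latency, 16 ms}". The grammar is an assumption
// inferred from the slides, not the actual GreenWeb syntax definition.
function parseQosAnnotation(text) {
  const pattern =
    /^\s*([^{]+?)\s*\{\s*([\w-]+)\s*:\s*(latency|throughput)\s*,\s*(\d+(?:\.\d+)?)\s*(ms|s)\s*\}\s*$/;
  const match = pattern.exec(text);
  if (!match) {
    throw new SyntaxError(`Unrecognized QoS annotation: ${text}`);
  }
  const [, selector, event, type, value, unit] = match;
  // Normalize the QoS target to milliseconds.
  const targetMs = unit === "s" ? Number(value) * 1000 : Number(value);
  return { selector, event, type, targetMs };
}
```

Normalizing the target to milliseconds lets later stages compare a 16 ms and a 2 s target with the same arithmetic.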

  109. GreenWeb Annotation Process
    [Diagram: Original application to GreenWeb-annotated application]
    Manual Annotation
    Automatic Annotation?
    ▸ AutoGreen: automatically reasons about and inserts GreenWeb annotations
    DOM Tree

  113. GreenWeb Annotation Process
    ▸ AutoGreen: automatically reasons about and inserts GreenWeb annotations
    [Diagram: Callback Instrumentation; Event Profiling; QoS Information; Annotation Generation; GreenWeb-annotated application]

  117. GreenWeb: Language for Energy-Efficiency
    ▸ Language abstractions for expressing QoS
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Results in 60% energy savings on real hardware/software implementations

  119. GreenWeb Runtime Overview
    Runtime Objective: Enforcing event-level QoS at the frame-level energy-efficiently
    [Diagram: timeline of Events and the Frames they produce, driven by QoS Annotations]
    QoS type: latency, QoS target: 16 ms (Event to Frame: 16 ms)
    QoS type: throughput, QoS target: 16 ms (Frame, Frame, Frame: 16 ms each)
    QoS target: 2 s (Event to Frame: 2 s)
    1. Frame Association
    2. Frame Scheduling
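
The step from an event-level annotation to frame-level deadlines can be sketched as follows. This is a minimal illustration under stated assumptions: a `latency` event yields a single frame due one target interval after the event, and a `throughput` event yields a sequence of frames, each due one target interval after the previous one. The function name `frameDeadlines` and the `qos` object shape are hypothetical.

```javascript
// Hypothetical sketch: turn an event-level QoS annotation into per-frame
// deadlines (timestamps in ms). Assumptions: a "latency" event yields one
// frame due at eventTime + target; a "throughput" event yields frameCount
// frames, each due one target interval after the previous one.
function frameDeadlines(eventTimeMs, qos, frameCount = 1) {
  if (qos.type === "latency") {
    return [eventTimeMs + qos.targetMs];
  }
  if (qos.type === "throughput") {
    return Array.from(
      { length: frameCount },
      (_, i) => eventTimeMs + (i + 1) * qos.targetMs
    );
  }
  throw new Error(`Unknown QoS type: ${qos.type}`);
}
```

For example, `frameDeadlines(0, { type: "throughput", targetMs: 16 }, 3)` gives three deadlines spaced 16 ms apart, matching the three 16 ms frames drawn on the slide.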

  134. Frame Association
    [Diagram: an Event produces a Frame through the stages Scripting, Style, Layout, Paint, Composite]
    View Slide

  135. Frame Association
    [Diagram: an Event and a Frame with stages S, S, L, P, C across the Browser Process, Renderer Process, and GPU Process; threads: Main Thread, Compositor Thread; links: IPC, Inter-thread Message]

  139. Frame Association
    [Diagram: an Event and two Frames, each with stages S, S, L, P, C, across the Browser Process, Renderer Process, and GPU Process]
    Distribute QoS information along with the communication messages
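
One way to picture "distribute QoS information along with the communication messages" is to let every message between pipeline stages carry the QoS payload of the event that triggered it. The sketch below is illustrative only: the stage names come from the slides, while the message shape and the functions `makeEventMessage`, `forward`, and `produceFrame` are assumptions, not Chromium or GreenWeb APIs.

```javascript
// Hypothetical sketch of carrying QoS information along pipeline messages.
// Stage names follow the slides; the message shape and function names are
// assumptions for illustration, not Chromium or GreenWeb APIs.
const STAGES = ["Scripting", "Style", "Layout", "Paint", "Composite"];

function makeEventMessage(eventName, qos) {
  return { event: eventName, qos, trace: [] };
}

// Each stage forwards a new message that keeps the original QoS payload,
// so the final frame can still be associated with the triggering event.
function forward(message, stage) {
  return { ...message, trace: [...message.trace, stage] };
}

function produceFrame(eventName, qos) {
  return STAGES.reduce(forward, makeEventMessage(eventName, qos));
}
```

Because the payload travels with the message, the stage that finally emits the frame does not need a side table to look up which event, and which QoS target, the frame belongs to.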

  141. Choices of Energy-saving Techniques
    GreenWeb can support a range of energy saving techniques
    ▹ Dynamic resolution scaling [MobiCom 2015]
    ▹ Power-saving display colors [MobiSys 2012]
    ▹ Selective resource loading [NSDI 2015]
    ▹ ACMP-based hardware mechanism (WebRT)

  144. Asymmetric Chip-multiprocessor
    ▸ Offer a large performance-energy trade-off space
    ▸ Already used in commodity devices (e.g., Samsung Galaxy S6)
    [Figure: Energy Consumption vs. Performance for Big Core and Small Core across Frequency Levels]

  149. Energy Consumption
    Performance
    Big Core
    Small Core
    ACMP-based GreenWeb Runtime
    37
    ▸Provide just enough energy to meet QoS constraints
    Frequency
    Levels

    View Slide

  150. Energy Consumption
    Performance
    Big Core
    Small Core
    ACMP-based GreenWeb Runtime
    37
    ▸Provide just enough energy to meet QoS constraints
    div {ontouchend: latency, 16 ms}
    Frequency
    Levels

    View Slide

  151. Energy Consumption
    Performance
    Big Core
    Small Core
    ACMP-based GreenWeb Runtime
    37
    ▸Provide just enough energy to meet QoS constraints
    16 ms
    div {ontouchend: latency, 16 ms}

    View Slide

  152. Energy Consumption
    Performance
    Big Core
    Small Core
    ACMP-based GreenWeb Runtime
    37
    ▸Provide just enough energy to meet QoS constraints

    View Slide

  153. Energy Consumption
    Performance
    Big Core
    Small Core
    ACMP-based GreenWeb Runtime
    37
    ▸Provide just enough energy to meet QoS constraints
    Predict the performance
    and energy of each
    configuration!
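The selection step on this slide can be sketched as follows. This is a minimal illustration, assuming the runtime already holds a predicted latency and energy for every core/frequency configuration. The `Config` type, all numbers, and the fallback to the fastest configuration are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Config:
    core: str           # "big" or "small"
    freq_ghz: float
    latency_ms: float   # predicted latency (hypothetical value)
    energy_mj: float    # predicted energy (hypothetical value)


def pick_config(configs: Iterable[Config], qos_ms: float) -> Config:
    """Return the lowest-energy configuration predicted to meet the QoS target."""
    configs = list(configs)
    feasible = [c for c in configs if c.latency_ms <= qos_ms]
    if feasible:
        return min(feasible, key=lambda c: c.energy_mj)
    # Assumption: if nothing meets the target, fall back to the fastest option.
    return min(configs, key=lambda c: c.latency_ms)


CANDIDATES = [
    Config("small", 0.6, 30.0, 4.0),
    Config("small", 1.0, 18.0, 6.0),
    Config("big", 0.8, 14.0, 9.0),
    Config("big", 1.6, 7.0, 20.0),
]

# 16 ms matches the latency target in the slide's example annotation.
best = pick_config(CANDIDATES, qos_ms=16.0)
print(best.core, best.freq_ghz)
```

With these made-up numbers, two configurations meet the 16 ms target and the cheaper of the two is chosen, which is the "just enough energy" idea in miniature.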

    View Slide

  154. Different Strategies for Different Events
    38
    Loading
    Touching
    Moving
    Events

    View Slide

  155. Different Strategies for Different Events
    38
    Loading
    Touching
    Moving
    Events
    Once per
    usage session

    View Slide

  156. Different Strategies for Different Events
    38
    Loading
    Touching
    Moving
    Events
    Proactive
    Mechanism
    WebRT

    Component

    View Slide

  157. Different Strategies for Different Events
    38
    Loading
    Touching
    Moving
    Events
    Proactive
    Mechanism
    WebRT

    Component
    Repetitive in
    a usage session

    View Slide

  158. Different Strategies for Different Events
    38
    Loading
    Touching
    Moving
    Events
    Proactive
    Mechanism
    WebRT

    Component
    Adaptive
    Mechanism

    View Slide

  159. 39
    Loading
    Touching
    Moving
    Events
    Proactive
    Mechanism
    WebRT

    Component
    Adaptive
    Mechanism
    Different Strategies for Different Events

    View Slide

  160. 40
    Breaking Down the Computations
    40

    View Slide

  161. 40
    Breaking Down the Computations
    HTML (Structure)
    CSS (Style)
    40

    View Slide

  162. 40
    Breaking Down the Computations
    Tag
    Attribute
    HTML (Structure)
    CSS (Style)
    40

    View Slide

  163. 40
    Breaking Down the Computations
    Tag
    Attribute
    HTML (Structure)
    CSS (Style)
    Selector
    Property
    40

    View Slide

  164. 40
    Breaking Down the Computations
    DOM Tree
    Tag
    Attribute
    HTML (Structure)
    CSS (Style)
    Selector
    Property
    40

    View Slide

  165. 40
    Breaking Down the Computations
    DOM Tree
    Tag
    Attribute
    HTML (Structure)
    CSS (Style)
    Selector
    Property
    40
    Web Primitives

    View Slide

  166. Predicting Loading Performance & Energy
    41
    Idea: predict load time & energy (responses)
    based on Web primitives (predictors)

    View Slide

  167. Predicting Loading Performance & Energy
    41
    Identify Predictors
    Training using top
    2,500 webpages
    Predictors

    (HTML, CSS)
    Responses

    (Time, Energy)

    View Slide

  168. Predicting Loading Performance & Energy
    41
    Identify Predictors
    Training using top
    2,500 webpages
    Model Construction &
    Refinement
    Refine the linear model
    Predictors

    (HTML, CSS)
    Responses

    (Time, Energy)
    Mitigate Over-fitting
    Model Non-Linearity
    Linear Regression

    View Slide

  169. Predicting Loading Performance & Energy
    41
    Identify Predictors
    Training using top
    2,500 webpages
    Model Construction &
    Refinement
    Refine the linear model
    Model Validation
    Validating on another
    2,500 webpages
    Predictors

    (HTML, CSS)
    Responses

    (Time, Energy)
    Mitigate Over-fitting
    Model Non-Linearity
    Linear Regression
    Loading Time
    Model
    Energy Model
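The modelling flow above (predictors, linear regression, validation) can be sketched in a few lines. This is an illustrative sketch only: the two predictors, the synthetic data, and the use of a ridge penalty as the over-fitting mitigation are assumptions, since the slides do not name the exact predictors or refinement technique.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def fit(features: Sequence[Sequence[float]], responses: Sequence[float],
        ridge: float = 0.0) -> List[float]:
    """Least-squares weights [intercept, w1, ...] via the normal equations."""
    rows = [[1.0, *f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    for i in range(1, k):   # ridge penalty (assumed); intercept left unpenalized
        xtx[i][i] += ridge
    return solve(xtx, xty)


def predict(weights: Sequence[float], feature: Sequence[float]) -> float:
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Hypothetical training pages: (number of tags, number of CSS rules).
PAGES = [(100, 20), (200, 50), (400, 40), (800, 120), (1000, 90)]
# Synthetic load times in seconds, generated from a known linear relation.
LOAD_TIME = [0.5 + 0.002 * tags + 0.01 * rules for tags, rules in PAGES]

WEIGHTS = fit(PAGES, LOAD_TIME)
print(round(predict(WEIGHTS, (500, 60)), 3))
```

The same `fit` call would be repeated with energy as the response to obtain the second model; validation would then compare predictions against a held-out set of pages.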

    View Slide

  170. 42
    Predicting Loading Performance & Energy
    [Figure: prediction error plots for performance and energy; axis ticks from 0.00 to 0.20]
    Median prediction error is less than 5%

    View Slide

  171. 43
    Loading
    Touching
    Moving
    Interactions
    Proactive
    Mechanism
    WebRT

    Component
    Adaptive
    Mechanism
    Different Strategies for Different Events

    View Slide

  172. 43
    Loading
    Touching
    Moving
    Interactions
    Proactive
    Mechanism
    WebRT

    Component
    Adaptive
    Mechanism
    Different Strategies for Different Events

    View Slide

  173. Energy Consumption
    Performance
    ACMP-based GreenWeb Runtime
    44
    Execution
    Time
    =
    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
    Energy =

    View Slide

  174. Energy Consumption
    Performance
    ACMP-based GreenWeb Runtime
    44
    Execution
    Time
    = Tmemory
    +
    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
    Energy =

    View Slide

  175. Energy Consumption
    Performance
    ACMP-based GreenWeb Runtime
    44
    Execution Time = Tmemory + Tcpu
    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
    Energy =

    View Slide

  176. Energy Consumption
    Performance
    ACMP-based GreenWeb Runtime
    44
    Execution Time = Tmemory + Ncycles / f
    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
    Energy =

    View Slide

  177. Energy Consumption
    Performance
    ACMP-based GreenWeb Runtime
    44
    Execution Time = Tmemory + Ncycles / f
    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
    Energy =

    View Slide

  178. Energy Consumption
    Performance
    ACMP-based GreenWeb Runtime
    44
    Execution Time = Tmemory + Ncycles / f
    Energy = Execution Time x Power
    [PLDI 2003] Xie, et al., “Compile-Time Dynamic Voltage Scaling Settings: Opportunities and Limits”
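The two relations on this slide can be evaluated directly. The sketch below is only an illustration: the event size, the frequency levels, and the power numbers are hypothetical, and power is treated as a single average value per level.

```python
def execution_time_s(t_memory_s: float, n_cycles: float, freq_hz: float) -> float:
    """Execution time = Tmemory + Ncycles / f (memory time does not scale with f)."""
    return t_memory_s + n_cycles / freq_hz


def energy_j(t_memory_s: float, n_cycles: float, freq_hz: float, power_w: float) -> float:
    """Energy = execution time x power."""
    return execution_time_s(t_memory_s, n_cycles, freq_hz) * power_w


# Hypothetical event: 2 ms of memory time and 8 million CPU cycles.
T_MEM_S, N_CYCLES = 0.002, 8e6
# Hypothetical (frequency in Hz, average power in W) pairs.
LEVELS = [(0.8e9, 0.4), (1.6e9, 1.5)]

for freq_hz, power_w in LEVELS:
    t = execution_time_s(T_MEM_S, N_CYCLES, freq_hz)
    e = energy_j(T_MEM_S, N_CYCLES, freq_hz, power_w)
    print(f"{freq_hz / 1e9:.1f} GHz: {t * 1e3:.1f} ms, {e * 1e3:.2f} mJ")
```

Because only the `Ncycles / f` term shrinks when the frequency rises, doubling the frequency here does not halve the execution time, while the assumed power more than triples.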

    View Slide

  179. GreenWeb: Language for Energy-Efficiency
    45
    ▸ Language abstractions
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Result in 60% energy savings on real hardware/software implementations

    View Slide

  180. GreenWeb: Language for Energy-Efficiency
    45
    ▸ Language abstractions
    ▸ Runtime that saves energy while meeting the QoS constraints
    ▸ Result in 60% energy savings on real hardware/software implementations

    View Slide

  181. Real Hardware/Software Setup
    46
    ODroid XU+E development board,
    which contains an Exynos 5410 SoC
    used in Samsung Galaxy S4.

    View Slide

  182. Real Hardware/Software Setup
    46
    ODroid XU+E development board,
    which contains an Exynos 5410 SoC
    used in Samsung Galaxy S4.
    Little core cluster: ARM Cortex-A7, in-order, 2-issue
    Big core cluster: ARM Cortex-A15, out-of-order (OoO), 3-issue
    Overhead:
    ▸ Frequency switch: 100 µs
    ▸ Core migration: 20 µs

    View Slide

  183. Real Hardware/Software Setup
    47
    ODroid XU+E development board,
    which contains an Exynos 5410 SoC
    used in Samsung Galaxy S4.
    Implementation incorporated into
    Chrome running on Android.
    Little core cluster: ARM Cortex-A7, in-order, 2-issue
    Big core cluster: ARM Cortex-A15, out-of-order (OoO), 3-issue
    Overhead:
    ▸ Frequency switch: 100 µs
    ▸ Core migration: 20 µs

    View Slide

  184. Real Hardware/Software Setup
    47
    ODroid XU+E development board,
    which contains an Exynos 5410 SoC
    used in Samsung Galaxy S4.
    Implementation incorporated into
    Chrome running on Android.
    UI-level record and replay for
    reproducibility. [ISPASS’15]
    Little core cluster: ARM Cortex-A7, in-order, 2-issue
    Big core cluster: ARM Cortex-A15, out-of-order (OoO), 3-issue
    Overhead:
    ▸ Frequency switch: 100 µs
    ▸ Core migration: 20 µs

    View Slide

  185. Power and Energy Measurements
    48
    + -
    Vin+ Vin-
    Vout GND
    Sense resistor
    15mΩ
    SoC
    ARM Cortex A9
    VRM
    Gain
    x50
    Probe
    Data Acquisition
    (DAQ)
    Power = (Vin+ - Vin-) / Rsense * Vin-
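The power formula on this slide can be written out as a small helper. The sense-resistor value and the gain come from the slide; the assumption that the DAQ samples the amplified differential voltage, and the example voltages, are hypothetical.

```python
R_SENSE_OHM = 0.015   # 15 mOhm sense resistor (from the slide)
GAIN = 50.0           # probe gain (from the slide)


def power_w(v_in_plus: float, v_in_minus: float,
            r_sense: float = R_SENSE_OHM) -> float:
    """Power = (Vin+ - Vin-) / Rsense * Vin-."""
    current_a = (v_in_plus - v_in_minus) / r_sense   # Ohm's law across the resistor
    return current_a * v_in_minus


def power_from_amplified(v_out: float, v_in_minus: float) -> float:
    """Assumption: the DAQ samples Vout = GAIN * (Vin+ - Vin-)."""
    drop_v = v_out / GAIN
    return (drop_v / R_SENSE_OHM) * v_in_minus


# Hypothetical reading: a 15 mV drop at a 1.0 V rail is 1 A, so about 1 W.
print(round(power_w(1.015, 1.0), 3))
print(round(power_from_amplified(0.75, 1.0), 3))
```

Energy over a run then follows by summing power samples multiplied by the sampling interval.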

    View Slide

  186. Evaluation
    49
    ▸Baseline Mechanisms
    ▹Highest performance (Perf): standard to guarantee responsiveness
    ▹Interactive governor (Interactive): Android default

    View Slide

  187. Evaluation
    49
    ▸Baseline Mechanisms
    ▹Highest performance (Perf): standard to guarantee responsiveness
    ▹Interactive governor (Interactive): Android default
    ▸Metrics
    ▹Energy Saving
    ▹QoS Violation

    View Slide

  188. Evaluation
    49
    ▸Baseline Mechanisms
    ▹Highest performance (Perf): standard to guarantee responsiveness
    ▹Interactive governor (Interactive): Android default
    ▸Metrics
    ▹Energy Saving
    ▹QoS Violation
    ▸Applications
    ▹Top webpages (e.g., www.amazon.com)
    ▹Web Apps based on popular frameworks (e.g., Todo List)

    View Slide

  189. Evaluation Results
    50
    [Bar chart: normalized energy (0.0 to 1.0) under GreenWeb, Interactive, and Perf for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN]

    View Slide

  190. Evaluation Results
    51
    [Bar chart: normalized energy (0.0 to 1.0) under GreenWeb, Interactive, and Perf for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN]

    View Slide

  191. Evaluation Results
    52
    [Bar chart: normalized energy (0.0 to 1.0) under GreenWeb, Interactive, and Perf for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN]

    View Slide

  192. Evaluation Results
    53
    [Bar chart: normalized energy (0.0 to 1.0) under GreenWeb, Interactive, and Perf for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN]
    [Bar chart: QoS violations (0.0% to 3.0%) for the same applications]

    View Slide

  193. Evaluation Results
    54
    [Bar chart: normalized energy (0.0 to 1.0) under GreenWeb, Interactive, and Perf for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN]
    [Bar chart: QoS violations (0.0% to 3.0%) for the same applications]
    Annotation: No QoS Violations

    View Slide

  194. Evaluation Results
    54
    [Bar chart: normalized energy (0.0 to 1.0) under GreenWeb, Interactive, and Perf for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, CNet, BBC, LZMA-JS, Amazon, W3School, and MSN]
    [Bar chart: QoS violations (0.0% to 3.0%) for the same applications]
    Annotation: No QoS Violations
    29.2% - 66.0% energy savings, 0.8% more QoS violations

    View Slide

  195. Architecture Configuration Distribution
    55
    [Chart: time distribution (0% to 100%) across A15 and A7 frequency levels from 0.4 GHz to 1.6 GHz for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, Cnet, BBC, LZMA-JS, Amazon, W3School, and MSN]

    View Slide

  196. Architecture Configuration Distribution
    55
    [Chart: time distribution (0% to 100%) across A15 and A7 frequency levels from 0.4 GHz to 1.6 GHz for CamanJS, Craigslist, Paperjs, Goo, Google, Todo, Cnet, BBC, LZMA-JS, Amazon, W3School, and MSN]

    View Slide

  197. 56
    GreenWeb
    Programming language support for
    balancing energy-efficiency and QoS
    in mobile Web computing

    View Slide

  198. 56
    GreenWeb
    Programming language support for
    balancing energy-efficiency and QoS
    in mobile Web computing
    Abstraction: Express QoS constraints

    View Slide

  199. 56
    GreenWeb
    Programming language support for
    balancing energy-efficiency and QoS
    in mobile Web computing
    Abstraction: Express QoS constraints
    Runtime: Satisfy QoS specifications using energy saving techniques

    View Slide

  200. 56
    GreenWeb
    Programming language support for
    balancing energy-efficiency and QoS
    in mobile Web computing
    Abstraction: Express QoS constraints
    Runtime: Satisfy QoS specifications using energy saving techniques
    Effect: Significant energy savings

    View Slide

  201. Runtime
    57
    My Approach
    Architecture
    Application
    WebRT
    Energy-aware
    Web Runtime
    WebCore
    Web-specific Architecture
    GreenWeb
    Language Extensions
    My Dissertation Work

    View Slide

  202. 58
    Execution Time
    Energy
    General-Purpose
    Designs
    WebCore: a Web-Specific Mobile Architecture

    View Slide

  203. 58
    Execution Time
    Energy
    General-Purpose
    Designs
    WebCore: a Web-Specific Mobile Architecture
    Diminishing
    return

    View Slide

  204. 58
    Execution Time
    Energy
    ASIC?
    General-Purpose
    Designs
    WebCore: a Web-Specific Mobile Architecture

    View Slide

  205. 58
    Execution Time
    Energy
    ASIC?
    Extremely challenging

    ‣Chrome: 17M LoC, 29 languages
    ▹ cf. H.264 codec: 0.13M LoC, 6 languages
    ‣Code base is very irregular
    ▹ Not amenable to traditional specialization
    General-Purpose
    Designs
    WebCore: a Web-Specific Mobile Architecture

    View Slide

  206. 58
    Execution Time
    Energy
    ASIC?
    General-Purpose
    Designs
    WebCore: a Web-Specific Mobile Architecture
    Goal

    View Slide

  207. 58
    Execution Time
    Energy
    ???
    ASIC?
    General-Purpose
    Designs
    WebCore: a Web-Specific Mobile Architecture
    Goal

    View Slide

  208. WebCore: a Web-Specific Mobile Architecture
    59
    Execution Time
    Energy
    General-Purpose
    Designs
    Goal

    View Slide

  209. WebCore: a Web-Specific Mobile Architecture
    59
    Execution Time
    Energy
    General-Purpose
    Designs
    Customization
    Goal

    View Slide

  210. WebCore: a Web-Specific Mobile Architecture
    59
    Execution Time
    Energy
    General-Purpose
    Designs
    Customization
    Specialization
    Goal

    View Slide

  211. Specialization Target: Style Resolution Kernel
    60

    View Slide

  212. Specialization Target: Style Resolution Kernel
    60
    10%
    13%
    17%
    25%
    35%
    Render
    Style
    Other
    Layout
    DOM
    12%
    14%
    16%
    18%
    40%
    Render
    Style
    Other
    Layout
    DOM
    Execution time
    breakdown
    Energy
    breakdown

    View Slide

  213. Specialization Target: Style Resolution Kernel
    60
    10%
    13%
    17%
    25%
    35%
    Render
    Style
    Other
    Layout
    DOM
    12%
    14%
    16%
    18%
    40%
    Render
    Style
    Other
    Layout
    DOM
    Execution time
    breakdown
    Energy
    breakdown

    View Slide

  214. Specialization Target: Style Resolution Kernel
    60
    for (each rule in matchedRules) {
      for (each property in rule) {
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }

    View Slide

  215. Specialization Target: Style Resolution Kernel
    60
    for (each rule in matchedRules) {
      for (each property in rule) {
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }

    View Slide

  216. Specialization Target: Style Resolution Kernel
    60
    for (each rule in matchedRules) {
      for (each property in rule) {
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }
    Rule-level
    Parallelism (RLP)

    View Slide

  217. Specialization Target: Style Resolution Kernel
    60
    for (each rule in matchedRules) {
      for (each property in rule) {
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }
    Rule-level
    Parallelism (RLP)

    View Slide

  218. Specialization Target: Style Resolution Kernel
    60
    for (each rule in matchedRules) {
      for (each property in rule) {
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }
    Rule-level
    Parallelism (RLP)
    Property-level
    Parallelism (PLP)

    View Slide

  219. Specialization Target: Style Resolution Kernel
    60
    for (each rule in matchedRules) {
      for (each property in rule) {
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }
    Rule-level
    Parallelism (RLP)
    Property-level
    Parallelism (PLP)
    ▸ Exploiting the parallelism to increase the arithmetic intensity
    ▸ Move operands closer to operations to sustain the computations
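The kernel above can be mirrored in a short software sketch that separates the independent handler calls from the final merge. This is an illustration only: the handler table, the property names, and the rule that a later matched rule overrides an earlier one are assumptions, and the sketch runs sequentially rather than in parallel hardware lanes.

```python
from typing import Callable, Dict, List, Tuple

Rule = List[Tuple[str, str]]            # (property id, raw value) pairs
Handler = Callable[[str, dict], object]

# Hypothetical handlers, one per property id (the "switch" in the kernel).
HANDLERS: Dict[str, Handler] = {
    "font-size": lambda value, node: int(value.rstrip("px")),
    "color": lambda value, node: value.lower(),
}


def resolve_style(matched_rules: List[Rule], dom_node: dict) -> Dict[str, object]:
    # Phase 1: every (rule, property) pair is evaluated independently of the
    # others, which is where rule-level and property-level parallelism come from.
    evaluated = [
        (rule_index, prop_id, HANDLERS[prop_id](value, dom_node))
        for rule_index, rule in enumerate(matched_rules)
        for prop_id, value in rule
    ]
    # Phase 2: conflict resolution. Assumption: rules are ordered by increasing
    # precedence, so a later rule overrides an earlier one for the same property.
    style: Dict[str, object] = {}
    for _, prop_id, result in sorted(evaluated, key=lambda item: item[0]):
        style[prop_id] = result
    return style


RULES = [
    [("font-size", "12px"), ("color", "RED")],
    [("font-size", "16px")],
]
RESOLVED = resolve_style(RULES, dom_node={"tag": "div"})
print(RESOLVED)
```

Keeping evaluation and conflict resolution as separate phases is what lets the first phase be spread across lanes without ordering constraints.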

    View Slide

  220. Style Resolution Unit
    61
    [Diagram: Input Scratchpad holding rules (Rule i, Rule j) and their properties (Prop k, Prop l, Prop m) with start/end markers; Compute Lanes; Conflict Resolution; Output Scratchpad holding Style l, Style m, Style k]

    View Slide

  221. Evaluation Results
    62

    View Slide

  222. Evaluation Results
    62
    ▸Fully synthesized using
    Synopsys 28 nm toolchain

    View Slide

  223. Evaluation Results
    62
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  224. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  225. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    A15-like
    design
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  226. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    A15-like
    design
    Customization
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  227. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    18.6%
    A15-like
    design
    Customization
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  228. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    18.6%
    22.2%
    A15-like
    design
    Customization
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  229. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    18.6%
    22.2%
    A15-like
    design
    Customization
    Specialization
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  230. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    18.6%
    22.2%
    22.2%
    A15-like
    design
    Customization
    Specialization
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  231. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    18.6%
    22.2%
    9.2%
    22.2%
    A15-like
    design
    Customization
    Specialization
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2

    View Slide

  232. Evaluation Results
    62
    [Plot: Energy (J) versus Load Time (s); axis ticks 0.55 to 1.1 and 1.6 to 2.4]
    A15-like
    design
    Customization
    Specialization
    29.2%
    47.0%
    ▸Fully synthesized using
    Synopsys 28 nm toolchain
    ▸Cost of specialization:
    0.59 mm2 area overhead
    ▹ SoC die area is 122 mm2 in
    Samsung Galaxy S4
    ▹ A15s’ area: 19 mm2
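To put the area figures quoted on this slide in proportion, the overhead can be worked out directly. The three areas are taken from the slide; only the variable names are introduced here.

```python
UNIT_AREA_MM2 = 0.59    # specialization area overhead (from the slide)
SOC_AREA_MM2 = 122.0    # SoC die area in Samsung Galaxy S4 (from the slide)
A15_AREA_MM2 = 19.0     # area of the A15s (from the slide)

soc_overhead_pct = 100 * UNIT_AREA_MM2 / SOC_AREA_MM2
a15_overhead_pct = 100 * UNIT_AREA_MM2 / A15_AREA_MM2

print(f"{soc_overhead_pct:.2f}% of the SoC die")
print(f"{a15_overhead_pct:.1f}% of the A15 area")
```

Under these figures the added unit is well under 1% of the die and roughly 3% of the big-core area.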

    View Slide

  233. Retrospective: Three Principles Learnt
    63
    Runtime
    Application
    Architecture

    View Slide

  234. Retrospective: Three Principles Learnt
    63
    Runtime
    Application
    Architecture ▸ General-purpose vs. Specialization

    View Slide

  235. Retrospective: Three Principles Learnt
    63
    Runtime
    Application
    Architecture
    ▸ Exploiting Application Diversity
    ▸ General-purpose vs. Specialization

    View Slide

  236. Retrospective: Three Principles Learnt
    63
    Runtime
    Application
    Architecture
    ▸ Empowering Web Developers
    ▸ Exploiting Application Diversity
    ▸ General-purpose vs. Specialization

    View Slide

  237. 64
    1990
    HTML
    1996
    JavaScript
    2008
    Mobile Web
    2012
    Responsive
    Web
    2016
    Watt Wise Web
    The Web Evolution

    View Slide

  238. 64
    1990
    HTML
    1996
    JavaScript
    2008
    Mobile Web
    2012
    Responsive
    Web
    2016
    Watt Wise Web
    The Web Evolution
    ???

    View Slide

  239. 65

    View Slide

  240. 65

    View Slide

  241. 65
    Frameworks and Libraries
    HTML JavaScript
    CSS
    Language Runtime
    Styling
    Security
    Local Storage
    User Input
    Layout
    Render

    View Slide

  242. 66
    wattwiseweb.org

    View Slide

  243. Thank you!

    View Slide

  244. [ACM Queue] Yuhao Zhu, Vijay Janapa Reddi, “The Red Future of Mobile Web Computing”
    [PLDI 2016] Yuhao Zhu, Vijay Janapa Reddi, “GreenWeb: Language Extensions for Energy-Efficient Mobile Web Computing”
    [HPCA 2015] Yuhao Zhu, Matthew Halpern, Vijay Janapa Reddi, “Event-Based Scheduling for Energy-Efficient QoS (eQoS) in Mobile Web Applications”
    [HPCA 2013] Yuhao Zhu, Vijay Janapa Reddi, “High-Performance and Energy-Efficient Mobile Web Browsing on Big/Little Systems”
    [CAL 2012] Yuhao Zhu, Aditya Srikanth, Jingwen Leng, Vijay Janapa Reddi, “Exploiting Webpage Characteristics for Energy-Efficient Mobile Web Browsing” (Best of CAL)
    [ISCA 2014] Yuhao Zhu, Vijay Janapa Reddi, “WebCore: Architectural Support for Mobile Web Browsing”
    [IEEE MICRO 2015] Yuhao Zhu, Matthew Halpern, Vijay Janapa Reddi, “The Role of the CPU in Energy-Efficient Mobile Web Browsing”
    [HPCA 2016] Matthew Halpern, Yuhao Zhu, Vijay Janapa Reddi, “Mobile CPU’s Rise to Power: Quantifying the Impact of Generational Mobile CPU Design Trends on Performance, Energy, and User Satisfaction”
    Topic labels: GreenWeb, WebRT, WebCore, Motivational Studies, Future Web

    View Slide

  245. [DAC 2011] Yuhao Zhu, Yangdong Deng, Yubei Chen, “Hermes: An Integrated CPU/GPU Microarchitecture for IP Routing.”
    [DAC 2010] Bo Wang, Yuhao Zhu, Yangdong Deng, “Distributed Time, Conservative Parallel Logic Simulation on GPUs.”
    [TODAES 2011] Yuhao Zhu, Bo Wang, Yangdong Deng, “Massively Parallel Logic Simulation with GPUs.”
    [ISPASS 2015] Matthew Halpern, Yuhao Zhu, Ramesh Peri, and Vijay Janapa Reddi, “Mosaic: Cross-platform User-interaction Record and Replay for the Fragmented Android Ecosystem.”
    [IRPS 2014] Chen Zhou, Xiaofei Wang, Weichao Xu, Yuhao Zhu, Vijay Janapa Reddi, Chris Kim, “Estimation of Instantaneous Frequency Fluctuation in a Fast DVFS Environment Using an Empirical BTI Stress-Relaxation Model.”
    [MICRO 2015] Yuhao Zhu, Daniel Richins, Matthew Halpern, Vijay Janapa Reddi, “Microarchitectural Implications of Event-driven Server-side Web Applications” (Top Picks Honorable Mention)
    Topic labels: GPGPU & IP Routing Architecture, Tools, Reliability, Server Microarch

    View Slide

  246. Coursework
    70
    Name                         Instructor          Semester     SUP  Grade
    COMPILERS                    Keshav Pingali      Fall 2010         A
    ADV EMBED MICROCONTROL SYS   Mark McDermott      Spring 2011       A-
    MEMORY MANAGEMENT            Kathryn McKinley    Spring 2011  Y    A
    VLSI I                       Jacob Abraham       Fall 2011         A-
    COMP ARCH: PARALLISM/LOCLTY  Mattan Erez         Fall 2011         A
    MICROARCHITECTURE            Yale Patt           Spring 2012       B
    DYNAMIC COMPILATION          Vijay Janapa Reddi  Spring 2012       A-
    COMP PERF EVAL/BENCHMARKING  Lizy John           Fall 2012         B+
    PARALLEL COMP ARCHITECTURE   Derek Chiou         Spring 2013       B+
    HUMAN COMPUT & CROWDSRCING   Matt Lease          Fall 2015    Y    A-

    View Slide