WebCore: Architectural Support for Mobile Web Browsing

ISCA 2014 Main talk

Yuhao Zhu

June 18, 2014

Transcript

  1. WebCore: Architectural Support for Mobile Web Browsing
    Yuhao Zhu, Vijay Janapa Reddi
    Department of Electrical and Computer Engineering, The University of Texas at Austin
    ISCA Main Talk, June 18th, 2014
  7. Swift

  15. The Fundamental Challenges
    Achieving High Performance Demanded by End-User
    Conserving Energy Due to Limited Battery Capacity
    Conflicting requirements
    How to achieve high performance with low energy?
    WebCore: A mobile architecture
  22. Executive Summary
    [Plot: Time vs. Energy. Labels: General Purpose Designs (Diminishing return), ASIC?, ???, WebCore Goal]
    ASIC? Extremely challenging
    ‣Chrome: 7M LoC, 29 languages
    ‣Firefox: 10M LoC, 33 languages
  26. Executive Summary
    [Plot: Time vs. Energy. Labels: General Purpose Designs, WebCore Goal]
    Customizing µarch Parameters
    Specialized FU and Memory
  31. Agenda of Today’s Talk
    ▸Motivation of our work: energy-efficiency of the mobile Web
    ▸How does WebCore improve the energy-efficiency?
      ▹Customization
      ▹Specialization
    ▸Evaluation Results
    ▸Related Work
  39. Customization: Find the Ideal General Purpose Baseline Architecture
    ▸Why customization?!?
    ▸What is a proper general purpose baseline architecture?
      ▹Out-of-order (Silvermont, A15) or in-order (Saltwell, A7)?
      ▹Are existing general purpose mobile designs ideal?
    ▸Exhaustive design space exploration
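An exhaustive design space exploration yields one (load time, energy) point per candidate core configuration. The sketch below shows one way such points might be filtered, assuming the goal is a Pareto frontier over time and energy. The talk does not state the selection method, and the design names and numbers here are invented for illustration, not results from the slides.

```python
from typing import NamedTuple

class Design(NamedTuple):
    name: str        # hypothetical configuration label
    time_s: float    # simulated page load time in seconds
    energy_j: float  # simulated energy in joules

def pareto_frontier(designs):
    """Keep designs that no other design beats on both time and energy."""
    frontier = []
    best_energy = float("inf")
    # Sort by time (ties broken by energy). A design stays only if it uses
    # strictly less energy than every design that is at least as fast.
    for d in sorted(designs, key=lambda d: (d.time_s, d.energy_j)):
        if d.energy_j < best_energy:
            frontier.append(d)
            best_energy = d.energy_j
    return frontier

# Made-up sample points, chosen only to exercise the filter.
designs = [
    Design("in-order-small", 2.4, 0.60),
    Design("in-order-wide", 2.2, 0.70),
    Design("ooo-small", 1.8, 0.80),
    Design("ooo-large", 1.6, 1.10),
    Design("ooo-wasteful", 1.9, 1.00),
]
frontier_names = [d.name for d in pareto_frontier(designs)]
```

Here `ooo-wasteful` is dropped because `ooo-small` is both faster and cheaper in energy; the remaining four points trade time against energy.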
  41. Design Space Exploration (DSE) Setup
    ▸Integrated power (McPAT) and performance x86 full-system simulator (Marss86)
    ▸WebKit engine in the Chromium Web browser
  47. Design Space Exploration (DSE) Setup
    ▸Webpages selection using PCA
      ▹PCs calculated from webpage-inherent and µarch-dependent features (~400 in total)
    [Plot: PC1 (-5 to 5) vs. PC2 (log scale, 10^-4 to 10^1). Annotations added across builds: "dominated by # webpage elements", "dominated by IPC"]
  53. Design Space Exploration (DSE) Findings
    ▸Out-of-order µarchitecture is much more flexible
    ▸In-order cores are acceptable if end-users can tolerate latency
  62. Understand the Difference Using Kernel Knowledge
    [Charts: Execution time breakdown; panels: In-order design, Out-of-order design]
    ▸In-order designs show strong kernel variance
    ▸An Out-of-order design can accommodate kernel variance
  71. Customization: Identifying Major Sources of Energy Inefficiency
    [Plot labels: P1, P2]
    Parameter               P1     P2     ARM A15
    Issue width             1      3      3
    # Function units        2      3      8
    Load queue size         4      16     16
    Store queue size        4      16     16
    BTB size                1024   128    256
    ROB size                128    128    40+
    L1 I-$ size (KB)        64     128    32
    # Physical registers    128    140    ?
    L1 D-$ size (KB)        8      64     32
    L2-$ size (KB)          256    1024   <4096
    ▸Instruction delivery
    ▸Data feeding
  72. Agenda of Today’s Talk
    ▸Motivation of our work: energy-efficiency of the mobile Web
    ▸How does WebCore improve the energy-efficiency?
      ▹Customization
      ▹Specialization
        -Mitigating instruction delivery: Style resolution unit (SRU)
        -Improving data feeding: Browser engine cache
    ▸Evaluation Results
    ▸Related Work
  84. WebCore Specialization Overview
    Hardware Layer: Customized core (IF, ID, EX [ALU, MUL, FPU, SRU], MEM, WB), L1 D-cache, Browser Engine Cache
    API Layer: Style_apply(Id); DOM_LD(Id, &attr); DOM_ST(Id, &attr);
    Runtime Layer: Cache Management, SRU Access, Software Failsafe
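The slide names the API calls (Style_apply, DOM_LD, DOM_ST) and the runtime responsibilities (cache management, SRU access, software failsafe) but the transcript gives no semantics for them. The sketch below is one possible reading, not the authors' design: the browser engine cache is modeled as a tiny LRU map of DOM attributes, and the software failsafe as a fall-back to an ordinary software DOM store on a miss. The class name, capacity, and write-back policy are all assumptions, and Style_apply is not modeled here.

```python
from collections import OrderedDict

class BrowserEngineCacheModel:
    """Toy model: a small LRU cache of DOM attributes in front of a software DOM store."""

    def __init__(self, capacity=2):
        self.capacity = capacity      # assumed size, not taken from the talk
        self.lines = OrderedDict()    # node id -> attribute dict (the modeled cache)
        self.software_dom = {}        # backing store reached via the software failsafe
        self.hits = 0
        self.misses = 0

    def _fill(self, node_id, attr):
        # Insert or refresh an entry, then evict the least recently used one if full.
        self.lines[node_id] = attr
        self.lines.move_to_end(node_id)
        if len(self.lines) > self.capacity:
            evicted_id, evicted_attr = self.lines.popitem(last=False)
            self.software_dom[evicted_id] = evicted_attr  # write back on eviction

    def dom_st(self, node_id, attr):
        """Model of DOM_ST(Id, &attr): store attributes for a DOM node."""
        self._fill(node_id, dict(attr))

    def dom_ld(self, node_id):
        """Model of DOM_LD(Id, &attr): load attributes, using software on a miss."""
        if node_id in self.lines:
            self.hits += 1
            self.lines.move_to_end(node_id)
            return self.lines[node_id]
        self.misses += 1
        attr = self.software_dom[node_id]  # software failsafe path (KeyError if unknown)
        self._fill(node_id, attr)
        return attr

cache = BrowserEngineCacheModel(capacity=2)
cache.dom_st(1, {"tag": "div"})
cache.dom_st(2, {"tag": "p"})
cache.dom_st(3, {"tag": "a"})   # cache is full, so node 1 is written back to software
hit_result = cache.dom_ld(3)    # served from the modeled cache
miss_result = cache.dom_ld(1)   # served by the software failsafe, then refilled
```

The point of the layering in this sketch is that callers see the same result on a hit or a miss; only the hit and miss counters reveal which path was taken.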
  93. Style Resolution Unit
    ▸Style kernel is the most critical kernel
    [Charts: Execution time breakdown; Energy consumption breakdown]
    for (each rule in matchedRules) {        // Rule-level Parallelism (RLP)
      for (each property in rule) {          // Property-level Parallelism (PLP)
        switch (property.id) {
          case Font: Style[Font] = Handler(property.value, DOMNode); break;
          case N: ...
        }
      }
    }
    ▸Exploiting the parallelism to increase the arithmetic intensity and reduce instruction footprint
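The style kernel pseudocode on this slide can be rendered as a small runnable sketch. A dispatch table stands in for the switch over property ids; the handler functions, property names, and the rule representation (a list of id/value pairs) are placeholders chosen for illustration and do not come from the browser engine.

```python
def handle_font(value, dom_node):
    # Placeholder handler: a real engine would resolve the value against the node.
    return f"font:{value}"

def handle_padding(value, dom_node):
    # Placeholder handler for a second property id.
    return f"padding:{value}"

# Dispatch table standing in for the switch over property.id.
HANDLERS = {"font": handle_font, "padding": handle_padding}

def resolve_style(matched_rules, dom_node):
    """Apply every property of every matched rule; later rules overwrite earlier ones."""
    style = {}
    for rule in matched_rules:          # outer loop: one iteration per rule
        for prop_id, value in rule:     # inner loop: one iteration per property
            handler = HANDLERS[prop_id]
            style[prop_id] = handler(value, dom_node)
    return style

rules = [
    [("font", "serif"), ("padding", "0")],
    [("padding", "6px")],
]
resolved = resolve_style(rules, dom_node=None)
```

Each iteration does little work beyond selecting a handler, which is consistent with the slide's framing of the kernel as dominated by instruction delivery rather than arithmetic.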
  104. Style Resolution Unit (2)
    ▸A running example from www.cnn.com
    Style Rules
      Rule   Property 1 (id, value)   Property 2 (id, value)
      1      padding, 0               margin, 0
      2      padding, 6 px            width, 36 px
    [Label: High priority]
    Final Style Info (Property 1 to 3, id and value): padding 0 / 6 px, width 36 px, margin 0
    ▸Order Matters in RLP
    ▸Order Does Not Matter in PLP
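The two ordering claims can be checked on the slide's own example. This is a sketch under two assumptions that the transcript does not settle: rule 2 is taken to be the higher-priority rule, and a later-applied rule overwrites an earlier one.

```python
from itertools import permutations

# The two style rules from the www.cnn.com example.
RULE_1 = [("padding", "0"), ("margin", "0")]
RULE_2 = [("padding", "6 px"), ("width", "36 px")]

def apply_rules(rules):
    """Apply rules in the given order; a later rule overwrites an earlier one."""
    style = {}
    for rule in rules:
        for prop_id, value in rule:
            style[prop_id] = value
    return style

# Assumption: rule 2 has the higher priority, so it is applied last.
final_style = apply_rules([RULE_1, RULE_2])

# PLP: every ordering of the properties inside each rule gives the same result,
# because neither rule sets the same property twice.
plp_results = {
    tuple(sorted(apply_rules([list(p1), list(p2)]).items()))
    for p1 in permutations(RULE_1)
    for p2 in permutations(RULE_2)
}

# RLP: swapping the two rules changes which padding value survives.
swapped_style = apply_rules([RULE_2, RULE_1])
```

Only `padding` is set by both rules, so it is the only property whose final value depends on rule order; `margin` and `width` come out the same either way.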
  112. Style Resolution Unit (3)
    ▸Order Matters in RLP
    ▸Order Does Not Matter in PLP
    [Diagram stages: Input Scratchpad Memory, Conflict Resolution, Compute Lanes, Output Scratchpad Memory]
    [Diagram labels: Rule i and Rule j (ids with start/end pointers); Prop k, Prop m, Prop l, with Prop m present twice; Higher Priority; Style l, Style m, Style k]
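The four stages named on this slide can be mimicked in software to show why conflict resolution comes before the compute lanes. This is a behavioral sketch only, with several assumptions: rules sit in the input scratchpad in increasing priority order, conflict resolution keeps the highest-priority instance of each property, and a thread pool stands in for the compute lanes. The lane count, data layout, and handler are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def conflict_resolution(input_scratchpad):
    """Keep one (prop_id, value) per property, taken from the highest-priority rule.

    Assumption: rules are stored in increasing priority order, so a later rule wins.
    """
    winners = {}
    for rule_id, props in input_scratchpad:
        for prop_id, value in props:
            winners[prop_id] = (rule_id, value)
    return [(prop_id, value) for prop_id, (_, value) in winners.items()]

def compute_lane(prop):
    """Stand-in for one lane's handler; a real handler would depend on the property."""
    prop_id, value = prop
    return prop_id, f"style({value})"

def sru(input_scratchpad, lanes=2):
    survivors = conflict_resolution(input_scratchpad)
    # After conflict resolution every property id is unique, so lanes are independent.
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        output_scratchpad = dict(pool.map(compute_lane, survivors))
    return output_scratchpad

# Rule j is assumed to have the higher priority; Prop m appears in both rules.
scratchpad = [
    ("rule_i", [("k", "vk"), ("m", "vm_i")]),
    ("rule_j", [("m", "vm_j"), ("l", "vl")]),
]
styles = sru(scratchpad)
```

Resolving the duplicate `m` first is what lets the lanes run in any order: the rule-order dependence is handled once, up front, and what remains is order-independent property work.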
  116. Evaluations
    ▸Fully synthesized using Synopsys 28 nm toolchain
    ▸24 representative webpages
      www.amazon.com, www.cnn.com, www.msn.com, www.google.com.hk, www.twitter.com, www.espn.go.com, www.bbc.co.uk, www.slashdot.org, www.youtube.com, www.ebay.com, www.sina.com.cn, www.163.com
      Desktop and mobile versions
  125. Evaluations
    [Plot: Energy (J) vs. Load Time (s). Design points: A15-like design, Customization, Specialization]
    Annotations shown after the Customization point appears: 18.6%, 22.2%
    Annotations shown after the Specialization point appears: 9.2%, 22.2%
    Final build: 29.2%, 47.0%
  130. Evaluations
    [Plot: Energy (J) vs. Load Time (s). Design points: A15-like design, Customization, Specialization]
    Cost of specialization: 0.59 mm² area overhead
    Better than scaling-up approaches: I$, D$, I+D$
  135. Related Work
    [Chart axes: Hardware, Software; Focus on Performance, Focus on Energy-Efficiency]
    Parallelization: Algorithm-level, Zoomm, Mozilla Servo
    ASIC: Tegra 4 WebRTC accelerator, SiChrome
    System-level Optimizations: Redundancy Removal, Prefetching, Big/little Scheduling
    WebCore
  136. Conclusions
    The Web browser has become a general purpose platform that supports a wide range of mobile Web applications
    Customization allows us to find the ideal general-purpose baseline architecture
    Hardware/software collaborative specialization leverages application knowledge to mitigate inefficiencies in general-purpose architectures
  137. Thank you