WebCore: Architectural Support for Mobile Web Browsing

ISCA 2014 Main talk

Yuhao Zhu

June 18, 2014

Transcript

  1. WebCore:
    Architectural Support for Mobile Web Browsing
    Yuhao Zhu, Vijay Janapa Reddi
    Department of Electrical and Computer Engineering
    The University of Texas at Austin
    ISCA 2014 Main Talk, June 18th, 2014

  15. The Fundamental Challenges
    How to achieve high performance with low energy?
    ▸Achieving high performance demanded by end-users
    ▸Conserving energy due to limited battery capacity
    ▸Conflicting requirements
    WebCore: a mobile architecture

  26. Executive Summary
    [Figure: design points plotted on Time and Energy axes]
    ▸General-purpose designs: diminishing return
    ▸ASIC? Extremely challenging
    ▹Chrome: 7M LoC, 29 languages
    ▹Firefox: 10M LoC, 33 languages
    ▸WebCore goal
    ▹Customizing µarch parameters
    ▹Specialized FU and memory

  31. Agenda of Today’s Talk
    ▸Motivation of our work: energy-efficiency of the mobile Web
    ▸How does WebCore improve the energy-efficiency?
    ▹Customization
    ▹Specialization
    ▸Evaluation Results
    ▸Related Work

  39. Customization: Find the Ideal General Purpose Baseline Architecture
    ▸Why customization?
    ▸What is a proper general purpose baseline architecture?
    ▹Out-of-order (Silvermont, A15) or in-order (Saltwell, A7)?
    ▹Are existing general purpose mobile designs ideal?
    ▸Exhaustive design space exploration
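
The exhaustive exploration named on this slide can be pictured as enumerating every parameter combination and keeping the designs that are not beaten on both time and energy. The sketch below is only an illustration: the two-parameter grid and the analytic `toy_model` are made-up stand-ins for the simulator runs described in the talk, and none of the numbers come from the authors.

```python
from itertools import product

# Hypothetical parameter grid; a real exploration covers many more knobs.
ISSUE_WIDTHS = (1, 2, 3)
L1_ICACHE_KB = (32, 64, 128)


def toy_model(issue_width, icache_kb):
    """Made-up stand-in for one simulator run: returns (seconds, joules)."""
    time_s = 4.0 / issue_width + 16.0 / icache_kb
    energy_j = 0.3 * issue_width + 0.004 * icache_kb
    return time_s, energy_j


def explore():
    """Evaluate every configuration in the grid (exhaustive search)."""
    return {cfg: toy_model(*cfg) for cfg in product(ISSUE_WIDTHS, L1_ICACHE_KB)}


def pareto_frontier(points):
    """Keep designs that no other design beats on both time and energy."""
    frontier = []
    for cfg, (t, e) in points.items():
        dominated = any(
            (t2 <= t and e2 <= e) and (t2 < t or e2 < e)
            for other, (t2, e2) in points.items()
            if other != cfg
        )
        if not dominated:
            frontier.append(cfg)
    return sorted(frontier)
```

Calling `pareto_frontier(explore())` returns the configurations worth comparing further; every other point is slower and hungrier than some design on the frontier.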

  41. Design Space Exploration (DSE) Setup
    ▸Integrated power (McPAT) and performance x86 full-system simulator (Marss86)
    ▸WebKit engine in the Chromium Web browser

  47. Design Space Exploration (DSE) Setup
    ▸Webpage selection using PCA
    ▹PCs calculated from webpage-inherent and µarch-dependent features (~400 in total)
    ▹One PC is dominated by the number of webpage elements, the other by IPC
    [Figure: webpages plotted by PC1 (-5 to 5) against PC2 (log scale, 10^-4 to 10^1)]
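
As a rough picture of PCA-based page selection, the sketch below standardizes a small feature matrix, finds the first principal component by power iteration, and picks pages spread evenly along it. It is a minimal sketch under stated assumptions: the three features and their values in `PAGES` are hypothetical, and the evenly-spaced-by-rank picking rule in `pick_spread` is an assumption of this example, not the selection procedure used in the talk.

```python
import math

# Hypothetical per-page features: [DOM nodes, CSS rules, IPC].
PAGES = [
    [100, 20, 0.90],
    [400, 80, 0.80],
    [900, 150, 1.00],
    [1500, 310, 0.70],
    [2500, 480, 0.95],
]


def standardize(rows):
    """Center each feature column and scale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def first_principal_component(rows, iterations=200):
    """Power iteration on the covariance matrix of standardized rows."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[a] * r[b] for r in rows) / n for b in range(d)] for a in range(d)]
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def project(rows, component):
    """Score each page along the given component."""
    return [sum(x * w for x, w in zip(r, component)) for r in rows]


def pick_spread(scores, k):
    """Pick k (>= 2) page indices evenly spaced by rank along the component."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    step = (len(order) - 1) / (k - 1)
    return [order[round(i * step)] for i in range(k)]
```

With the toy data, the DOM-node and CSS-rule columns move together, so they receive same-signed weights in the first component, which mirrors the slide's remark that one component is dominated by the number of webpage elements.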

  53. Design Space Exploration (DSE) Findings
    ▸Out-of-order µarchitecture is much more flexible
    ▸In-order cores are acceptable if end-users can tolerate latency

  62. Understand the Difference Using Kernel Knowledge
    [Figures: execution time breakdown for an in-order design and for an out-of-order design]
    ▸In-order designs show strong kernel variance
    ▸An out-of-order design can accommodate kernel variance

  71. Customization: Identifying Major Sources of Energy Inefficiency
    [Figure: design points P1 and P2 marked on the design space plot]
    Parameter               P1      P2      ARM A15
    Issue width             1       3       3
    # Function units        2       3       8
    Load queue size         4       16      16
    Store queue size        4       16      16
    BTB size                1024    128     256
    ROB size                128     128     40+
    L1 I-$ size (KB)        64      128     32
    # Physical registers    128     140     ?
    L1 D-$ size (KB)        8       64      32
    L2-$ size (KB)          256     1024    <4096
    ▸Instruction delivery
    ▸Data feeding

  72. Agenda of Today’s Talk
    ▸Motivation of our work: energy-efficiency of the mobile Web
    ▸How does WebCore improve the energy-efficiency?
    ▹Customization
    ▹Specialization
    -Mitigating instruction delivery: Style resolution unit (SRU)
    -Improving data feeding: Browser engine cache
    ▸Evaluation Results
    ▸Related Work

  84. WebCore Specialization Overview
    ▸Hardware layer
    ▹Customized core: IF, ID, EX, MEM, WB pipeline, with the EX stage expanded into ALU, MUL, FPU and SRU
    ▹L1 D-cache and browser engine cache
    ▸API layer
    ▹Style_apply(Id);
    ▹DOM_LD(Id, &attr);
    ▹DOM_ST(Id, &attr);
    ▸Runtime layer
    ▹Cache management
    ▹SRU access
    ▹Software failsafe
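
To make the interplay of a DOM attribute cache, load/store calls, and a software fallback concrete, here is a toy software model. Everything in it is an assumption made for illustration: the class name, the method names `dom_ld` and `dom_st` (loosely echoing the DOM_LD and DOM_ST calls on the slide), the LRU replacement policy, and the write-back behavior. The slides do not specify how the real browser engine cache is organized.

```python
from collections import OrderedDict


class BrowserEngineCacheModel:
    """Toy model: a small LRU cache of DOM attributes backed by a software store."""

    def __init__(self, capacity, software_dom):
        self.capacity = capacity              # assumed to be at least 1
        self.software_dom = software_dom      # dict: node id -> attribute dict
        self.lines = OrderedDict()            # node id -> cached attribute dict
        self.hits = 0
        self.misses = 0

    def _fill(self, node_id):
        """Cache management: evict the least recently used entry, then fill."""
        if len(self.lines) >= self.capacity:
            victim, attrs = self.lines.popitem(last=False)
            self.software_dom[victim] = attrs  # write back before dropping
        self.lines[node_id] = dict(self.software_dom[node_id])

    def _touch(self, node_id):
        """Count a hit or a miss; on a miss, fall back to the software store."""
        if node_id in self.lines:
            self.hits += 1
            self.lines.move_to_end(node_id)
        else:
            self.misses += 1
            self._fill(node_id)

    def dom_ld(self, node_id):
        """Load the attributes of a node (returns a copy)."""
        self._touch(node_id)
        return dict(self.lines[node_id])

    def dom_st(self, node_id, attrs):
        """Store attributes into the cached copy of a node."""
        self._touch(node_id)
        self.lines[node_id].update(attrs)
```

The software store stands in for the failsafe path: any node that is not cached is still reachable, just through the slower route.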

  93. Style Resolution Unit
    ▸Style kernel is the most critical kernel
    [Figures: execution time breakdown and energy consumption breakdown]
    for (each rule in matchedRules) {          // Rule-level Parallelism (RLP)
      for (each property in rule) {            // Property-level Parallelism (PLP)
        switch (property.id) {
          case Font:
            Style[Font] = Handler(property.value, DOMNode);
            break;
          case N: ...
        }
      }
    }
    ▸Exploiting the parallelism to increase the arithmetic intensity and reduce instruction footprint
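
The loop nest on this slide can be written as a small runnable function. This is a sketch, not the WebKit code: a handler table replaces the `switch` on `property.id`, the `to_pixels` handler is hypothetical, and the rules are assumed to arrive sorted from lowest to highest priority so that later rules overwrite earlier ones.

```python
def resolve_style(matched_rules, handlers, dom_node):
    """Apply matched rules in order; later rules overwrite earlier ones."""
    style = {}
    for rule in matched_rules:                 # outer loop: one rule at a time
        for prop_id, value in rule.items():    # inner loop: one property at a time
            handler = handlers[prop_id]        # table lookup instead of a switch
            style[prop_id] = handler(value, dom_node)
    return style


def to_pixels(value, dom_node):
    """Hypothetical handler: parse '6 px' or '0' into an integer pixel count."""
    return int(value.split()[0])


# Hypothetical handler table covering only the properties used below.
HANDLERS = {"padding": to_pixels, "margin": to_pixels, "width": to_pixels}

# Two rules, lowest priority first (an assumption of this sketch).
RULES = [
    {"padding": "0", "margin": "0"},
    {"padding": "6 px", "width": "36 px"},
]
```

Each inner iteration touches a different property slot, which is where the property-level independence comes from; iterations of the outer loop may write the same slot, so their order has to be respected.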

  104. Style Resolution Unit (2)
    ▸A running example from www.cnn.com
    Style Rules
    Rule    Property 1           Property 2
            id        value      id        value
    1       padding   0          margin    0
    2       padding   6 px       width     36 px
    Final Style Info
    ▹Three properties: padding, width (36 px) and margin (0)
    ▹Both rules set padding (0 and 6 px); the value from the high-priority rule is kept
    ▸Order Matters in RLP
    ▸Order Does Not Matter in PLP
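
The two ordering claims can be checked mechanically on the www.cnn.com example. The sketch below assumes that rules are applied from lowest to highest priority and, purely for illustration, that rule 2 is the high-priority one; it also assumes no rule sets the same property twice.

```python
from itertools import permutations

RULE_1 = [("padding", "0"), ("margin", "0")]
RULE_2 = [("padding", "6 px"), ("width", "36 px")]


def apply_rules(rules):
    """rules: list of property lists, lowest priority first."""
    final = {}
    for props in rules:
        for prop_id, value in props:
            final[prop_id] = value
    return final


def property_order_is_irrelevant():
    """Shuffling properties inside each rule never changes the final style."""
    baseline = apply_rules([RULE_1, RULE_2])
    return all(
        apply_rules([list(p1), list(p2)]) == baseline
        for p1 in permutations(RULE_1)
        for p2 in permutations(RULE_2)
    )


def rule_order_is_relevant():
    """Swapping the two rules changes which padding value survives."""
    return apply_rules([RULE_1, RULE_2]) != apply_rules([RULE_2, RULE_1])
```

Only `padding` is contested here, so it is the only property whose final value depends on the rule order.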

  112. Style Resolution Unit (3)
    [Figure: SRU datapath. An input scratchpad memory holds Rule i and Rule j with their properties (Prop k, Prop l, Prop m), indexed by rule id with start and end pointers. A conflict resolution stage handles Prop m, which appears under both rules, in favor of the higher-priority rule. Compute lanes produce Style l, Style m and Style k, which go to an output scratchpad memory.]
    ▸Order Matters in RLP
    ▸Order Does Not Matter in PLP
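
The stages named in the figure (input scratchpad, conflict resolution, compute lanes, output scratchpad) can be mimicked in software to see how they fit together. This is a toy model under loud assumptions: the tuple layout of the scratchpad entries, the numeric priorities, the lane count, and the batch counter are all invented for this sketch and say nothing about the real hardware's timing or encoding.

```python
def conflict_resolution(input_scratchpad):
    """Keep, per property id, only the entry from the highest-priority rule."""
    winners = {}
    for priority, prop_id, value in input_scratchpad:
        if prop_id not in winners or priority > winners[prop_id][0]:
            winners[prop_id] = (priority, value)
    return [(prop_id, value) for prop_id, (_, value) in winners.items()]


def compute_lanes(work, num_lanes, handler):
    """Process surviving properties num_lanes at a time; count the batches."""
    output_scratchpad = {}
    batches = 0
    for start in range(0, len(work), num_lanes):
        for prop_id, value in work[start:start + num_lanes]:  # one lane each
            output_scratchpad[prop_id] = handler(value)
        batches += 1
    return output_scratchpad, batches


# Hypothetical contents: rule i (priority 1) sets k and m, rule j (priority 2) sets m and l.
INPUT_SCRATCHPAD = [
    (1, "k", "a"),
    (1, "m", "b"),
    (2, "m", "c"),
    (2, "l", "d"),
]
```

Resolving conflicts before the lanes run means each lane works on a distinct property, so the lanes never need to coordinate with one another.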

  116. Evaluations
    ▸Fully synthesized using Synopsys 28 nm toolchain
    ▸24 representative webpages: desktop and mobile versions of 12 sites
    ▹www.amazon.com
    ▹www.cnn.com
    ▹www.msn.com
    ▹www.google.com.hk
    ▹www.twitter.com
    ▹www.espn.go.com
    ▹www.bbc.co.uk
    ▹www.slashdot.org
    ▹www.youtube.com
    ▹www.ebay.com
    ▹www.sina.com.cn
    ▹www.163.com

  125. Evaluations
    [Figure: Energy (J) against Load Time (s) for three design points: the A15-like design, Customization, and Specialization]
    ▸Customization step: improvements of 18.6% and 22.2% annotated on the plot
    ▸Specialization step: improvements of 9.2% and 22.2% annotated on the plot
    ▸Overall annotations: 29.2% and 47.0%

  130. Evaluations
    [Figure: the same Energy (J) against Load Time (s) plot, with I$, D$ and I+D$ scaling-up points added]
    ▸Cost of specialization: 0.59 mm² area overhead
    ▸Better than scaling-up approaches (I$, D$, I+D$)

  135. Related Work
    [Figure: prior work arranged by hardware vs. software and by focus on performance vs. focus on energy-efficiency]
    ▸Parallelization: algorithm-level, Zoomm, Mozilla Servo
    ▸System-level optimizations: redundancy removal, prefetching, big/little scheduling
    ▸ASIC: Tegra 4 WebRTC accelerator, SiChrome
    ▸WebCore

  136. Conclusions
    ▸The Web browser has become a general-purpose platform that supports a wide range of mobile Web applications
    ▸Customization allows us to find the ideal general-purpose baseline architecture
    ▸Hardware/software collaborative specialization leverages application knowledge to mitigate inefficiencies in general-purpose architectures

  137. Thank you
