High-Performance and Energy-Efficient Mobile Web Browsing on Big/Little Systems

Yuhao Zhu
February 25, 2013

HPCA 2013

Transcript

  1. High-Performance and Energy-Efficient Mobile Web Browsing on Big/Little Systems

    Yuhao Zhu, Vijay Janapa Reddi
    Trinity Research Group, The University of Texas at Austin
  2. Web Browsing

    Browser 47.8%; Others 52.2% (SMS, Games, Music, Phone calls, etc.)
    Source: Flurry Analytics, 2011
  7. Mobile Web Browsing

    [Figure: Internet Users, vertical axis 0 to 2000, for 2007, 2009, 2011E, 2013E, 2015E]
    Source: Reproduced from Microsoft Tag, 2011
  10. Increasing Computational Intensity

    www.cnn.com, ARM Cortex A9 @ 1.2 GHz
    [Figure labels: Compute, Rendering Engine, Network]
    Webpages are becoming computationally intensive
  18. Increasing Computational Intensity

    www.cnn.com, ARM Cortex A9 @ 1.2 GHz
    Webpages are becoming computationally intensive
    Mobile systems demand high performance!
    Can we deploy a mobile system purely driven by performance objectives?
    No. Mobile devices are battery-constrained!
  23. Executive Summary

    Challenge: How to design the system architecture that guarantees both high performance and energy efficiency?
    Alternatives: Single big/little core; symmetric designs; asymmetric designs
    Big/Little systems
    • Different microarchitectures (Big, OoO + little, in-order)
    • Different operating points (DVFS)
    Key insight: Webpages have different characteristics that lead to load time and energy consumption variance
    Solution: Predict <core, frequency> configuration and schedule webpages accordingly
  24. Software Setup

    • We studied the Gecko rendering engine in Firefox
    • Excluded bootstrap and shut-down effects
    • Disabled browser cache
    • Hottest 5,000 webpages from www.alexa.com
    • Downloaded and mapped to the memory
    Independent of the particular browser
  26. Hardware Setup

    Big core: ARM Cortex A9, OoO with 4 issue (e.g. in Tegra 3-based tablets)
    PandaBoard ES Rev B1, 45 nm
    • DVFS: 350 MHz, 0.83 V
    • DVFS: 700 MHz, 1.01 V
    • DVFS: 920 MHz, 1.11 V
    • DVFS: 1.2 GHz, 1.27 V
    < 3% run-to-run variation across 10 runs; use the median
    Built current-sensing circuitry to measure the voltage and energy of the SoC (isolated from other board peripherals)
  27. Hardware Setup

    Big core: ARM Cortex A9, OoO with 4 issue (e.g. in Tegra 3-based tablets)
    PandaBoard ES Rev B1, 45 nm
    • DVFS: 350 MHz, 0.83 V
    • DVFS: 700 MHz, 1.01 V
    • DVFS: 920 MHz, 1.11 V
    • DVFS: 1.2 GHz, 1.27 V
    Little core: ARM Cortex A8, in-order with 2 issue (e.g. in Apple A4 -- iPhone 4)
    BeagleBoard xM, 45 nm
    • DVFS: 300 MHz, 0.94 V
    • DVFS: 600 MHz, 1.10 V
    • DVFS: 800 MHz, 1.26 V
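The operating points listed on the Hardware Setup slide define the configuration space that the rest of the talk schedules over. The sketch below records them as data and enumerates the resulting <core, frequency> pairs. The relative-power estimate is an illustration only: it assumes the textbook approximation that dynamic power scales with V² · f, which is not a claim made on the slides, and it compares points within one core only.

```python
# Operating points transcribed from the Hardware Setup slide: (frequency in MHz, voltage in V).
OPERATING_POINTS = {
    "A9 (big)": [(350, 0.83), (700, 1.01), (920, 1.11), (1200, 1.27)],
    "A8 (little)": [(300, 0.94), (600, 1.10), (800, 1.26)],
}


def relative_dynamic_power(points):
    """Return V^2 * f for each point, normalized to the first (lowest-frequency) point.

    Assumption (not from the slides): dynamic power scales as C * V^2 * f with the
    same switched capacitance C at every point of one core, so C cancels out.
    """
    base_f, base_v = points[0]
    base = base_v ** 2 * base_f
    return [round(v ** 2 * f / base, 2) for f, v in points]


def all_configurations():
    """Enumerate every <core, frequency> pair available to a scheduler."""
    return [(core, f) for core, pts in OPERATING_POINTS.items() for f, _ in pts]


if __name__ == "__main__":
    for core, pts in OPERATING_POINTS.items():
        print(core, relative_dynamic_power(pts))
    print(len(all_configurations()), "configurations")
```

Because the two cores differ in microarchitecture, the normalized values are not comparable across cores; measured energy, as used in the talk, is the only reliable basis for that comparison.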
  28. Why Big/Little Systems?

    [Figure: www.autoblog.com under A9 1.2GHz, A9 920MHz, A9 700MHz, A9 350MHz, A8 800MHz, A8 600MHz, A8 300MHz]
  30. Why Big/Little Systems?

    “Webpages variance” in load time and energy
    • Different operating frequencies
    • Different microarchitectures
    [Figure: www.adobe.com, www.newegg.com, www.autoblog.com]
  31. Our Approach

    Workload Characterization, Performance/Energy Prediction, Resource Management
    How to leverage the big/little system to capture the webpage variance?
  32. Our Approach

    Workload Characterization, Performance/Energy Prediction, Resource Management
    How to capture the dynamic behavior of web browsing?
  37. Webpage Characterization

    We treat webpages as the workload instead of the browser!
    • HTML (Structure): Tag (h3, li, table, img), Attribute
    • CSS (Style): Selector, Property
    • DOM Tree
  42. HTML Tag Analysis

    [Figures: # Tags for 5K webpages sorted by #tags; Number of Tags (K)]
    • Webpages have different tag counts (instruction counts)
    • Webpages have a few hot HTML tags (hot instructions)
  47. Tag Processing Overhead

    [Figure: Load time (ms) and Energy (mJ) for h3, table, img]
    • Webpages have different tag counts (instruction counts)
    • Webpages have a few hot HTML tags (hot instructions)
    • HTML tags have different processing overhead (time & energy)
  49. Characterization Conclusions

    1. Webpages have different HTML tag counts and mixes
    2. Individual HTML tags involve different processing overheads
    • Root cause of load time and energy variance
  50. Our Approach

    Workload Characterization, Performance/Energy Prediction, Resource Management
    How to predict the time and energy of webpage loading?
  51. Regression Modeling Strategy

    Idea: predict webpage load time and energy consumption (responses) using webpage characteristics (predictors)
  52. Regression Modeling Strategy

    • Identify Training Predictors and Responses: training using hottest 2,500 webpages
    • Model Construction and Refinement: start from the linear model and progressively refine it
    • Model Validation: validating on another 2,500 webpages
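The train-then-validate workflow on this slide can be sketched with an ordinary least-squares fit, which corresponds only to the "start from the linear model" step; the later refinements are not described in the transcript and are not attempted here. Everything about the data is hypothetical: the three predictors, the coefficients used to generate load times, and the noise level are made up for illustration, and the helper names are not from the talk.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear_model(features, responses):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, feature_row):
    """Evaluate intercept + sum(coefficient * predictor)."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature_row))


def make_synthetic_pages(n, rng):
    """Hypothetical pages: load time (ms) as a noisy linear function of three counts."""
    pages = []
    for _ in range(n):
        tags = rng.randint(100, 5000)
        dom_nodes = rng.randint(100, 8000)
        image_kb = rng.randint(0, 2000)
        load_ms = 200 + 0.3 * tags + 0.1 * dom_nodes + 0.5 * image_kb
        pages.append(((tags, dom_nodes, image_kb), load_ms * rng.uniform(0.95, 1.05)))
    return pages


if __name__ == "__main__":
    rng = random.Random(0)
    pages = make_synthetic_pages(5000, rng)
    train, validate = pages[:2500], pages[2500:]  # 2,500 to train, 2,500 to validate
    coeffs = fit_linear_model([f for f, _ in train], [y for _, y in train])
    errors = [abs(predict(coeffs, f) - y) / y for f, y in validate]
    print("mean validation error: %.1f%%" % (100 * sum(errors) / len(errors)))
```

Validating on pages that were never used for fitting is what makes the reported error meaningful; evaluating on the training pages would understate it.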
  53. Regression Modeling

    Group: Model Predictors
    • HTML: Number of each tag; Number of each attribute; Number of DOM tree nodes
    • CSS: Number of rules; Number of each selector pattern; Number of each property
    • Content-dependent: Total image size; Total webpage size
  61. Regression Modeling

    [Figure: F(webpages) % against Error (%)]
    73.0% webpages < 10% error
    5.7% median error rate
    [Figure: F(webpages) % against Error (%), labeled "Unoptimized models"]
  62. Regression Modeling

    [Figures: F(webpages) % against Error (%)]
    70.0% webpages < 10% error
    6.4% median error rate
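The two figures quoted on the regression slides, the share of webpages under 10% error and the median error rate, are simple summaries of per-page relative error. The sketch below shows one way to compute them, plus the points of a cumulative error curve like the F(webpages) plots. The load times in the example are hypothetical. The 94.3% and 93.6% accuracies on the Conclusions slide appear to be 100% minus the 5.7% and 6.4% median errors.

```python
from statistics import median


def relative_errors(predicted, measured):
    """Absolute prediction error of each page as a percentage of the measured value."""
    return [100.0 * abs(p - m) / m for p, m in zip(predicted, measured)]


def summarize(errors_pct, threshold_pct=10.0):
    """Return (median error, share of pages whose error is below the threshold)."""
    below = sum(1 for e in errors_pct if e < threshold_pct)
    return median(errors_pct), 100.0 * below / len(errors_pct)


def cdf_points(errors_pct):
    """Points (error, cumulative % of pages) for plotting a cumulative error curve."""
    ordered = sorted(errors_pct)
    n = len(ordered)
    return [(e, 100.0 * (i + 1) / n) for i, e in enumerate(ordered)]


if __name__ == "__main__":
    # Hypothetical load times in ms, for illustration only.
    measured = [1000.0, 2000.0, 1500.0, 800.0]
    predicted = [1050.0, 1900.0, 1800.0, 790.0]
    errs = relative_errors(predicted, measured)
    print(summarize(errs))
    print(cdf_points(errs))
```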
  63. Webpage-aware Scheduling

    Prediction (Minimal overhead): Predict the load time and energy for each <core, freq> configuration
    [Figure labels: Normal webpage rendering; Webpage-aware scheduling]
  64. Webpage-aware Scheduling

    Scheduling overhead
    • Big/little migration: ~20ms [1]
    • Frequency scaling: ~3ms considering both HW/SW time
    [Figure labels: Normal webpage rendering; Webpage-aware scheduling]
    [1] Big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7. http://goo.gl/7mgbL
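Given a predicted load time and energy for every <core, freq> configuration, a scheduler has to pick one. The sketch below shows one plausible selection rule, not necessarily the authors' policy: choose the lowest-energy configuration whose predicted load time, plus switching overhead, stays within a cut-off latency. The overhead constants come from the slide; the cut-off value, the predictions, and all names are hypothetical, and a core migration is assumed to cover any accompanying frequency change.

```python
from dataclasses import dataclass

MIGRATION_MS = 20.0   # big/little migration overhead quoted on the slide
FREQ_SCALE_MS = 3.0   # frequency scaling overhead quoted on the slide


@dataclass(frozen=True)
class Config:
    core: str       # "big" (Cortex A9) or "little" (Cortex A8)
    freq_mhz: int


@dataclass(frozen=True)
class Prediction:
    load_ms: float
    energy_mj: float


def switch_overhead_ms(current, target):
    """Overhead of moving from the current configuration to the target one."""
    if current.core != target.core:
        return MIGRATION_MS  # assumption: migration also covers a frequency change
    if current.freq_mhz != target.freq_mhz:
        return FREQ_SCALE_MS
    return 0.0


def choose_config(predictions, current, cutoff_ms):
    """Pick the lowest-energy configuration that still meets the cut-off.

    Falls back to the fastest configuration when none meets the cut-off.
    """
    total = {c: p.load_ms + switch_overhead_ms(current, c) for c, p in predictions.items()}
    feasible = [c for c in predictions if total[c] <= cutoff_ms]
    if feasible:
        return min(feasible, key=lambda c: predictions[c].energy_mj)
    return min(predictions, key=lambda c: total[c])


if __name__ == "__main__":
    # Hypothetical predictions for one webpage.
    predictions = {
        Config("big", 1200): Prediction(load_ms=1000.0, energy_mj=900.0),
        Config("little", 800): Prediction(load_ms=2500.0, energy_mj=300.0),
        Config("little", 300): Prediction(load_ms=5000.0, energy_mj=250.0),
    }
    print(choose_config(predictions, Config("big", 1200), cutoff_ms=3000.0))
```

With these made-up numbers the rule selects the little core at 800 MHz: the 300 MHz point uses less energy but misses the cut-off, and the 20 ms migration cost is small next to the load time.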
  69. Evaluation Methodology

    • Today’s system: performance-oriented strategy, i.e., highest frequency on the big core (baseline)
    • OS DVFS strategies (OS)
      • OS-Big
      • OS-Little
      • (Hypothetical) OS-Big/Little
    • Our proposal: Webpage-aware scheduling (WS)
  76. Results

    Perf-oriented strategy as the baseline
    [Figure: Energy Savings (%) and Cut-off Violations (%) for OS (Big), OS (Little), WS, and the hypothetical OS (Big+Little)]
  80. Results: Webpage-aware Scheduler

    • vs. performance-oriented strategy
      • 83% energy saving
      • 4.1% more cut-off violations
    • vs. (hypothetical) Big/Little OS DVFS
      • 8.6% energy saving with minimal additional cut-off violations
      • 4.0% performance improvement
  81. Conclusions

    Webpage-inherent characterization
    • Webpages are drastically different in load time and energy consumption
    Platform-dependent load time/energy prediction
    • 94.3% and 93.6% accuracy, respectively
    Big/little scheduling to effectively utilize the hardware resources
    • Significant energy saving over the performance-oriented strategy
    • Improve energy and performance over the Big/Little OS DVFS strategy
  82. High-Performance and Energy-Efficient Mobile Web Browsing on Big/Little Systems

    Yuhao Zhu, Vijay Janapa Reddi
    Trinity Research Group, The University of Texas at Austin
  83. Page Abandonment

    [Figure: Page abandonment rate (0% to 40%) against Webpage load time (0 to 10 s)]
    Source: Reproduced from Kissmetrics, 2011
    [1] RD2: “The three second rule”. http://goo.gl/pynBl
  84. Oracle Analysis

    Oracle Scheduler
    • 3.5% cut-off violations
    • ~80% average energy saving (over perf.-oriented mode)
  85. Oracle Analysis

    [Figure: # Webpages (%)]
    • 86% same as oracle
    • 4% under-prediction
    • 10% over-prediction
  86. Integrated Scheduler

    [Figure: F(webpages) against Normalized Energy, E(Integrated) / E(WS)]
  87. Feature Pruning

    [Figure: feature correlation plot; the axes list the model features (tag_*, attr_*, prop_* and related counts, plus time and energy), grouped as Tag, CSS, Tag’, CSS’]
    1. We find several correlations between the different features
    2. Some correlations are stronger than others
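Strong correlations between features suggest that some of them are redundant as predictors. The sketch below shows one common way to exploit this: greedily drop any feature whose Pearson correlation with an already-kept feature exceeds a threshold. It is an illustration under assumptions, not the pruning procedure used in the talk; the threshold and the per-page counts are hypothetical, and only the feature names are taken from the slide.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(vx * vy)


def prune_correlated(columns, threshold=0.9):
    """Keep a feature only if it is not strongly correlated with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical per-page counts; tag_td is exactly three times tag_tr here.
    columns = {
        "tag_tr": [1, 2, 3, 4, 5],
        "tag_td": [3, 6, 9, 12, 15],
        "tag_img": [5, 1, 4, 2, 3],
    }
    print(prune_correlated(columns))
```

The result depends on the order in which features are visited, so a real pruning pass would need a deliberate ordering, for example by correlation with the response.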
  88. Feature Pruning

    369 features, 167 features, 9 features
    [Figure: feature correlation plots; the axes list the model features (tag_*, attr_*, prop_* and related counts, plus time and energy)]
attr_cellpadding attr_valign attr_border attr_bgcolor attr_width attr_height tag_tr attr_align prop_min.width prop_margin prop_margin.bottom prop_margin.right prop_margin.left prop_margin.top prop_max.height prop_max.width prop_min.height prop_min.height prop_max.width prop_max.height prop_margin.top prop_margin.left prop_margin.right prop_margin.bottom prop_margin prop_min.width