Frontend Performance: Beginner to Expert to Insanity

There’s no such thing as fast enough. You can always make your website faster. This talk will show you how. The very first requirement of a great user experience is actually getting the bytes of that experience to the user before they get tired and leave.

In this talk we’ll start with the basics and get progressively insane. We’ll go over several frontend performance best practices, a few anti-patterns, the reasoning behind the rules, and how they’ve changed over the years. How should you tune TCP? Should you combine your JavaScript or split it into multiple files? Should you use a Cookieless domain for CSS or not? How does HTTP/2 affect performance? How do you use Wireshark and browser tools to debug performance issues? We’ll answer most of these questions during this talk.

https://phpconference.com/session/web-performance-beginner-to-expert-to-insanity/

Philip Tellis

May 31, 2016

Transcript

  1. FrontEnd Performance
    Beginner to Expert to Insanity

  2. Philip Tellis
    @bluesmoon
    http://tech.bluesmoon.info
    http://www.soasta.com/mpulse/

  4. Get the most benefit with the
    least effort

  5. 0
    Start with a really slow site

  6. 0.1 Start Measuring

  7. Or use RUM for real user data (boomerang/mPulse)

  8. 0.2 enable gzip
    • Pre-gzip static assets (gzip_static in nginx)
    • eg: foo.js.gz is served if Accept-Encoding: gzip is included
    • For dynamic content, use chunked transfers with
    gzipped chunks
    • You can do this by flushing buffers on the server
    http://www.slideshare.net/billwscott/improving-netflix-performance-experience
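
The pre-gzip idea can be sketched in Node.js as follows. This is a minimal illustration, not nginx's implementation: `acceptsGzip` and `pickStaticFile` are hypothetical helpers, the sample source text is made up, and wildcard (`*`) codings are ignored for brevity.

```javascript
const zlib = require("zlib");

// Build step: compress once, at maximum level, instead of on every request.
const source = Buffer.from("function hello() { return 'hello'; }\n".repeat(50));
const precompressed = zlib.gzipSync(source, { level: 9 });

// Returns true when the Accept-Encoding value lists gzip with a non-zero q.
function acceptsGzip(acceptEncoding) {
  if (!acceptEncoding) return false;
  return acceptEncoding
    .split(",")
    .map((part) => part.trim().toLowerCase())
    .some((part) => {
      const [coding, ...params] = part.split(";").map((s) => s.trim());
      if (coding !== "gzip") return false;
      const q = params.find((p) => p.startsWith("q="));
      return q === undefined || parseFloat(q.slice(2)) > 0;
    });
}

// Request time: serve foo.js.gz only to clients that asked for gzip.
function pickStaticFile(path, acceptEncoding) {
  return acceptsGzip(acceptEncoding) ? `${path}.gz` : path;
}

console.log(pickStaticFile("foo.js", "gzip, deflate")); // foo.js.gz
console.log(pickStaticFile("foo.js", undefined)); // foo.js
console.log(`${source.length} bytes raw, ${precompressed.length} bytes gzipped`);
```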

  9. 0.2 enable gzip… and compression
    • Understand newer compression formats like
    Zopfli, Brotli, and WebP.
    • But also understand the trade-off between
    better compression and complexity of image
    decoding on mobile devices
    http://google-opensource.blogspot.ca/2015/09/introducing-brotli-new-compression.html
    http://blog.codinghorror.com/zopfli-optimization-literally-free-bandwidth/
    https://developers.google.com/speed/webp/
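
To get a feel for the text-compression side of this (WebP is an image format and is not covered here), the following sketch compares gzip and Brotli on the same input using Node's built-in `zlib` module. The sample CSS string is made up, and real ratios depend heavily on the content, so no particular result is implied.

```javascript
const zlib = require("zlib");

// Made-up, highly repetitive sample; real assets will compress differently.
const text = Buffer.from("body { margin: 0; padding: 0; color: #333; }\n".repeat(200));

const gz = zlib.gzipSync(text, { level: 9 });
const br = zlib.brotliCompressSync(text, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 }, // highest quality, slowest
});

console.log(`raw:    ${text.length} bytes`);
console.log(`gzip:   ${gz.length} bytes`);
console.log(`brotli: ${br.length} bytes`);
```

Maximum-quality Brotli is slow to produce, which is one reason it tends to be applied to static assets ahead of time rather than on the fly.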

  10. In Switzerland, you
    can get gluten free
    compression!
    https://twitter.com/dougsillars/status/737558689297534976

  11. 0.3 Cache
    Cache-Control: public, max-age=31415926
    Do NOT set Last-Modified or ETag headers, to
    avoid conditional requests
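
A minimal sketch of that header policy, written as a plain function so it can sit in front of any Node.js response. `farFutureHeaders` is a hypothetical helper. It also assumes asset URLs change whenever their content changes, which the slide does not state but which a year-long cache lifetime normally relies on.

```javascript
// Strips validators and applies the far-future policy from the slide.
function farFutureHeaders(headers) {
  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    // Without validators the browser has nothing to send a conditional request with.
    if (lower === "etag" || lower === "last-modified") continue;
    if (lower === "cache-control") continue; // replaced below
    out[name] = value;
  }
  out["Cache-Control"] = "public, max-age=31415926";
  return out;
}

console.log(
  farFutureHeaders({
    "Content-Type": "text/css",
    ETag: '"abc123"',
    "Last-Modified": "Tue, 31 May 2016 00:00:00 GMT",
  })
);
```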

  12. 0 Congratulations
    You’ve just been promoted!

  13. 1
    What the Experts do

  14. 1.1 CDN
    • Serve your root domain through a CDN
    • Put CSS on the root domain
    • Chrome opens two TCP connections to the
    primary host, the second one is "just in case"
    http://www.jonathanklein.net/2014/02/revisiting-cookieless-domain.html

  15. 1.1 Google Chrome will open two TCP connections to the
    primary host, one for the page, and the second "just in case"

  16. 1.2 Split JavaScript
    • Critical: in the HEAD
    • Enhancements: loaded async
    • Flush buffers after the HEAD
    • Approximately 14.4 KB gzipped per file
    • For HTTP/2, these would have different priorities
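
A build-time check for the size guideline might look like the sketch below. The 14.4 KB figure is taken from the slide as-is. `gzippedSize` and `fitsBudget` are hypothetical helper names, and the budget constant is only an interpretation of "approximately".

```javascript
const zlib = require("zlib");

// Slide's guideline, in bytes (14.4 * 1024, rounded down).
const BUDGET_BYTES = Math.floor(14.4 * 1024);

// Size of the payload as it would travel over the wire with gzip.
function gzippedSize(source) {
  return zlib.gzipSync(Buffer.from(source)).length;
}

function fitsBudget(source, budget = BUDGET_BYTES) {
  return gzippedSize(source) <= budget;
}

const critical = "document.documentElement.className = 'js';\n".repeat(100);
console.log(`critical JS: ${gzippedSize(critical)} bytes gzipped, fits: ${fitsBudget(critical)}`);
```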

  17. 1.3 Parallelize downloads… or maybe don’t
    • You can have more bandwidth, but you cannot
    have lower latency
    • For HTTP/1.1, mitigate latency effects by
    parallelizing across multiple TCP sockets
    • But with HTTP/2, this rule is turned on its head
    since multiplexing and pipelining are built in
    http://www.soasta.com/blog/more-bandwidth-isnt-a-magic-bullet-for-web-performance/

  18. 1.4 Flush Early and Often
    Getting bytes to the client ASAP will:
    • Avoid TCP Slow Start
    • Speed up CSS
    • Help the browser’s lookahead parser
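
One way to flush early from a Node.js handler is to write the HEAD before starting any slow work. This is a sketch under assumptions: `sendPage` and `buildBody` are hypothetical names, `/site.css` is a placeholder URL, and `res` is anything with the `writeHead`, `write` and `end` methods of an `http.ServerResponse`.

```javascript
// Sends the HEAD immediately, then the body once it has been built.
async function sendPage(res, buildBody) {
  res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });

  // First chunk: lets the browser start fetching CSS while the server is still working.
  res.write('<!doctype html><html><head><link rel="stylesheet" href="/site.css"></head>');

  // Slow part (database queries, templating, ...) happens after the first chunk is out.
  const body = await buildBody();
  res.end(`<body>${body}</body></html>`);
}

// Wiring it up (not executed here):
//   require("http").createServer((req, res) => sendPage(res, renderBody)).listen(8080);
// where renderBody is your own async function returning the body HTML.
```

If a compression layer sits between the handler and the socket, it may buffer the first chunk, so its buffers would need flushing as well.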

  19. TCP Handshake & Congestion Control

  20. 1.5 Increase initcwnd
    Initial Congestion Window: Number of packets to send
    before waiting for an ACK
    http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
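
A rough model shows why the initial window matters. The sketch assumes idealised slow start (the window doubles every round trip, no loss, no receive-window limit) and a 1460-byte segment size; both are simplifying assumptions, not figures from the slides.

```javascript
// Number of round trips needed to deliver totalBytes under idealised slow start.
function roundTripsToSend(totalBytes, initcwnd, mss = 1460) {
  let segmentsLeft = Math.ceil(totalBytes / mss);
  let cwnd = initcwnd;
  let rounds = 0;
  while (segmentsLeft > 0) {
    segmentsLeft -= cwnd; // send a full window, then wait for ACKs
    cwnd *= 2; // window doubles each round trip
    rounds += 1;
  }
  return rounds;
}

console.log(roundTripsToSend(14600, 10)); // 1
console.log(roundTripsToSend(14600, 3)); // 3
console.log(roundTripsToSend(100 * 1024, 10)); // 4
console.log(roundTripsToSend(100 * 1024, 3)); // 5
```

On a high-latency link each saved round trip is worth a full RTT, which is where the gain from a larger initcwnd comes from.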

  21. 1.5 Increase initcwnd
    @mobtec on Twitter

  22. 1.5b Also…
    net.ipv4.tcp_slow_start_after_idle=0
    http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/

  23. 1.6 PageSpeed
    mod_pagespeed and ngx_pagespeed

  24. 1.7 Don’t just FastClick
    • FastClick fires a Click onTouchEnd
    • It might be better to initiate a TCP connect
    onTouchStart and fetch content normally onClick
    • Use

  25. 1.8 Use UserTiming to measure your code
    • The UserTiming API allows you to set
    performance timeline marks within your code
    • performance.mark("name")
    • performance.measure("name", "start_mark", "end_mark")
    http://www.html5rocks.com/en/tutorials/webperformance/usertiming/
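
A small example of the two calls, runnable in a recent Node.js, where the same API is exposed through `perf_hooks`. In a browser `performance` is a global, so the `require` line would be dropped. The mark names and the sorting workload are made up for illustration.

```javascript
const { performance } = require("perf_hooks");

performance.mark("sort_start");

// Some work worth measuring.
const data = Array.from({ length: 100000 }, (_, i) => (i * 7919) % 100003);
data.sort((a, b) => a - b);

performance.mark("sort_end");

// Creates a "measure" entry spanning the two marks.
performance.measure("sort", "sort_start", "sort_end");

const [entry] = performance.getEntriesByName("sort");
console.log(`${entry.name}: ${entry.duration.toFixed(2)} ms`);
```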

  26. 1.9 Avoid Pre-flighted XHR
    • Make sure any JS Library you use doesn’t
    automatically add an X-Requested-With header
    http://www.soasta.com/blog/options-web-performance-with-single-page-applications/
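
To check whether a cross-origin request would trigger a preflight, a simplified version of the CORS rules can be written as a predicate. This is a sketch, not the full algorithm from the Fetch standard: it ignores header value restrictions other than Content-Type, upload listeners, and the `Range` header. `needsPreflight` is a hypothetical helper.

```javascript
const SIMPLE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SAFELISTED_HEADERS = new Set([
  "accept",
  "accept-language",
  "content-language",
  "content-type",
]);
const SIMPLE_CONTENT_TYPES = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

// True when a cross-origin request with this method and these headers
// would be preceded by an OPTIONS request.
function needsPreflight(method, headers = {}) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (!SAFELISTED_HEADERS.has(lower)) return true;
    if (lower === "content-type") {
      const mime = String(value).split(";")[0].trim().toLowerCase();
      if (!SIMPLE_CONTENT_TYPES.has(mime)) return true;
    }
  }
  return false;
}

console.log(needsPreflight("GET", {})); // false
console.log(needsPreflight("GET", { "X-Requested-With": "XMLHttpRequest" })); // true
console.log(needsPreflight("POST", { "Content-Type": "application/json" })); // true
```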

  27. Relax
    https://www.flickr.com/photos/29825916@N05/5760905069/

  28. 2
    Things are gonna get insane!

  29. Sort in ascending order of signal latency
    • Electrons through copper
    • Light through fibre
    • Pulsars
    • Station Wagons
    • Smoke Signals

  30. Sort in ascending order of signal latency
    1.Pulsars (light through vacuum)
    2.Smoke Signals (light through air)
    3.Electrons through copper / Light through fibre
    4.Station Wagons (possibly highest bandwidth)

  31. 2.0 Bandwidth is different around the world

  32. 2.0 As are people

  33. 2.0 Study real user data
    Look for potential places to parallelize, predict and cache

  34. 2.1 Use RUM to determine optimum POP location
    • Use RUM to measure latency from user to
    multiple POPs
    • Pick POP based on lowest latency
    • Adapt to changes in network topology
    http://www.slideshare.net/rmaheshw/velocity-2015-pops-and-rum

  35. 2.1 Use RUM to determine best CDN
    • Use RUM to measure latency from user to
    multiple CDN providers
    • Dynamically pick CDN based on what works best
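
The selection step in both 2.1 slides can be sketched as "take the candidate with the lowest median latency". The POP names and latency samples below are invented, and using the median is an assumption. A real system would also weigh sample counts, freshness and availability.

```javascript
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// samplesByTarget maps a POP (or CDN) name to RUM latency samples in milliseconds.
function pickLowestLatency(samplesByTarget) {
  let best = null;
  for (const [target, samples] of Object.entries(samplesByTarget)) {
    if (samples.length === 0) continue; // no data, cannot judge
    const latency = median(samples);
    if (best === null || latency < best.latency) best = { target, latency };
  }
  return best;
}

console.log(
  pickLowestLatency({
    fra: [40, 42, 90],
    lhr: [30, 35, 31],
    sin: [200, 210],
  })
); // { target: 'lhr', latency: 31 }
```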

  36. 2.2 pre-browsing
    https://w3c.github.io/resource-hints/
    http://w3c.github.io/preload/
    http://filamentgroup.github.io/loadCSS/test/preload.html

  37. 2.2 dns-prefetch
    • Does a DNS lookup for hostname mentioned in
    URL
    • This could help reduce latency when the request
    shows up — for first page views at least
    • Your DNS TTL needs to be long enough to
    survive past a page load

  38. 2.2 prefetch
    • Tells the parser that this resource will be required
    later on
    • Browser can start downloading in the
    background if it has nothing better to do with its
    resources
    • no-cache header only applies to subsequent
    pages

  39. 2.2 preconnect
    • Tells the browser to open a TCP connection to
    this host, and hold on to it
    • Any CORS restrictions will apply to this
    connection

  40. 2.2 prerender
    • Tells the parser that this page is likely to be
    requested by the user
    • Browser downloads page, and all its resources,
    renders it, executes JavaScript and fires the
    onload event.
    • It’s like opening the page in a hidden Tab

  41. 2.2 prerender
    • When user follows the URL, the page just shows
    up (< 5ms latency)
    • This is actually faster than switching tabs in the
    browser
    • The onVisibilityChange event fires and
    visibilityState changes from “prerender” to
    “visible” or “hidden”

  42. 2.2 — Caveats
    • The page needs to be requested using GET
    • The page should not require Authentication (401
    response)
    • Prerender will be aborted if cookies, or
    localStorage change, or if the prerendered page
    has non-idempotent components
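
The resource hints specification linked from the pre-browsing slide expresses these hints as `<link>` elements. A server-side template could emit them with a helper like the one below. `hintTag` is a hypothetical function and the host names and paths are placeholders.

```javascript
const HINTS = new Set(["dns-prefetch", "preconnect", "prefetch", "preload", "prerender"]);

function escapeAttr(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Builds a <link> element for one resource hint.
function hintTag(rel, href, attrs = {}) {
  if (!HINTS.has(rel)) throw new Error(`unknown resource hint: ${rel}`);
  const extra = Object.entries(attrs)
    .map(([key, value]) => ` ${key}="${escapeAttr(value)}"`)
    .join("");
  return `<link rel="${rel}" href="${escapeAttr(href)}"${extra}>`;
}

console.log(hintTag("dns-prefetch", "//cdn.example.com"));
// <link rel="dns-prefetch" href="//cdn.example.com">
console.log(hintTag("preconnect", "https://api.example.com"));
// <link rel="preconnect" href="https://api.example.com">
console.log(hintTag("preload", "/css/site.css", { as: "style" }));
// <link rel="preload" href="/css/site.css" as="style">
console.log(hintTag("prerender", "/checkout"));
// <link rel="prerender" href="/checkout">
```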

  43. 2.2 onVisibilityChange
    And while you’re at it, don’t do expensive work if the
    page is hidden
    https://developer.mozilla.org/en-US/docs/Web/Guide/User_experience/Using_the_Page_Visibility_API
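
A sketch of deferring expensive work until the page is visible. The standard event name is `visibilitychange`, fired on `document`. `makeVisibilityGate` is a hypothetical helper that takes the document as a parameter so it can be exercised outside a browser.

```javascript
// Returns a function that runs `work` now if the page is visible,
// or once it becomes visible if it is currently hidden or prerendering.
function makeVisibilityGate(doc, work) {
  let pending = false;

  function run() {
    if (doc.visibilityState !== "visible") {
      pending = true; // remember the request, do nothing expensive yet
      return false;
    }
    pending = false;
    work();
    return true;
  }

  doc.addEventListener("visibilitychange", () => {
    if (pending && doc.visibilityState === "visible") run();
  });

  return run;
}

// In a page: const refresh = makeVisibilityGate(document, redrawCharts);
// where redrawCharts is your own expensive function.
```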

  44. 2.3 Post-load
    Fetch optional assets after onload

  45. 2.4 Detect broken accept-encoding
    Many Windows anti-viruses and firewalls disable gzip by
    munging the Accept-Encoding header
    http://www.lognormal.com/blog/2012/08/17/accept-encoding-stats/
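
On the server, a first step is simply to classify what arrived. The sketch below is an illustration, not the method from the linked article: `classifyAcceptEncoding` is a hypothetical helper, and the munged header names are examples of patterns that have been reported for intermediaries, so treat the list as an assumption.

```javascript
// Example rewritten header names; illustrative, not exhaustive.
const MUNGED_NAMES = new Set(["accept-encodxng", "x-cept-encoding"]);
// "Accept-Encoding" is 15 characters; some proxies mask it with a filler character.
const MASKED_NAME = /^(x{15}|-{15}|~{15})$/;

// Returns "ok", "no-compression", "munged" or "missing".
function classifyAcceptEncoding(headers) {
  const lower = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = String(value);
  }
  if ("accept-encoding" in lower) {
    return /\b(gzip|deflate|br)\b/i.test(lower["accept-encoding"]) ? "ok" : "no-compression";
  }
  const munged = Object.keys(lower).some(
    (name) => MUNGED_NAMES.has(name) || MASKED_NAME.test(name)
  );
  return munged ? "munged" : "missing";
}

console.log(classifyAcceptEncoding({ "Accept-Encoding": "gzip, deflate" })); // ok
console.log(classifyAcceptEncoding({ "Accept-EncodXng": "gzip, deflate" })); // munged
console.log(classifyAcceptEncoding({ Host: "example.com" })); // missing
```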

  46. 2.5 HTTP/2
    • Only one TCP connection per host
    • Do NOT use domain sharding
    • Do NOT use sprites
    • Do use Stream Multiplexing with Priorities
    • Do use Server Push
    http://chimera.labs.oreilly.com/books/1230000000545/ch12.html

  47. 2.6 Use 4:2:0 Chroma Subsampling
    Chroma Subsampling takes advantage of the fact that the
    human visual system is less sensitive to changes in colour
    than luminance
    Tim Kadlec: “4:2:0 subsampling of JPEGs gets a 62.5%
    memory savings”
    http://en.wikipedia.org/wiki/Chroma_subsampling
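
One way to arrive at the quoted 62.5% is to compare decoded 4:2:0 YUV data with 4-byte-per-pixel RGBA. That baseline is an assumption here, since the slide does not say what the saving is measured against. The arithmetic, assuming 8 bits per sample:

```javascript
// Average bytes per pixel for YUV data where each chroma sample is shared
// across a block of (horizontal x vertical) pixels.
function yuvBytesPerPixel(horizontal, vertical) {
  const luma = 1; // one Y sample per pixel
  const chroma = 2 / (horizontal * vertical); // U and V, shared across the block
  return luma + chroma;
}

const RGBA_BYTES_PER_PIXEL = 4;

function savingsVsRgba(bytesPerPixel) {
  return 1 - bytesPerPixel / RGBA_BYTES_PER_PIXEL;
}

console.log(savingsVsRgba(yuvBytesPerPixel(2, 2))); // 4:2:0 -> 0.625
console.log(savingsVsRgba(yuvBytesPerPixel(2, 1))); // 4:2:2 -> 0.5
console.log(savingsVsRgba(yuvBytesPerPixel(1, 1))); // 4:4:4 -> 0.25
```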

  48. 2.7 Resize Images for target Device Dimensions
    Resizing Images for specific screen sizes could be the
    difference between 1.5s and 30ms
    https://speakerdeck.com/tkadlec/mobile-image-processing-at-velocity-sc-2015

  49. 2.8 Decode large images in a Worker Thread
    fetch(imageURL)
      // Get the image as a blob.
      .then(response => response.blob())
      // Decode the image.
      .then(blobData => createImageBitmap(blobData))
      // Send it to the main thread
      .then(imageBitmap => {
        self.postMessage({ imageBitmap }, [imageBitmap]);
      }, err => {
        self.postMessage({ err });
      });
    https://github.com/googlechrome/offthread-image

  50. 2.9 Don’t force layout operations
    • DOM manipulations followed by a read of
    invalidated properties forces a layout
    • This has a huge CPU impact
    • Read before write
    • Batch update
    • Move operations into the HEAD
    Amiya Gupta @ Velocity 2015
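
The "read before write" rule can be demonstrated with a toy model that counts forced layouts. The fake elements below are not a real DOM. They only mimic the one behaviour that matters here: reading a layout property after a style write forces a layout. All names are hypothetical.

```javascript
// Toy layout engine: a read of offsetWidth while styles are dirty costs one layout.
function makeFakeDom(count) {
  const state = { dirty: false, layouts: 0 };
  const elements = Array.from({ length: count }, (_, i) => {
    let width = 100 + i;
    return {
      get offsetWidth() {
        if (state.dirty) {
          state.layouts += 1; // forced synchronous layout
          state.dirty = false;
        }
        return width;
      },
      style: {
        set width(value) {
          width = parseInt(value, 10);
          state.dirty = true; // invalidates layout
        },
      },
    };
  });
  return { state, elements };
}

// Anti-pattern: read, write, read, write, ...
function interleaved(elements) {
  for (const el of elements) el.style.width = `${el.offsetWidth * 2}px`;
}

// Read everything first, then write everything.
function batched(elements) {
  const widths = elements.map((el) => el.offsetWidth);
  elements.forEach((el, i) => {
    el.style.width = `${widths[i] * 2}px`;
  });
}

const a = makeFakeDom(100);
interleaved(a.elements);
const b = makeFakeDom(100);
batched(b.elements);
console.log(a.state.layouts, b.state.layouts); // 99 0
```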

  51. 2.10 Understand 3PoFs
    Use blackhole.webpagetest.org to test for 3rd party single
    points of failure
    http://blog.patrickmeenan.com/2011/10/testing-for-frontend-spof.html

  52. 2.10 3PoFs
    Request Map by Simon Hearne

  53. 2.11 Understand your SpeedIndex
    http://heatmap.webperf.tools/render/150527_SH_827d2295b66e4180892db766eaf8a492/10000

  54. 2.12 What does your site cost?
    https://whatdoesmysitecost.com/test/160531_68_43bd4bcf91f91ca0bed2b3dfeb5eb67e

  55. 2.13 Prioritize optimizations based on user impact
    Conversion Impact Score in mPulse DSWB

  56. 2.14 Become a WebPageTest power user
    • Check out the comparison view
    • Collect packet captures
    • Use Wireshark
    • Test out different network types

  57. References
    • WebPageTest — http://webpagetest.org
    • Boomerang — http://www.lognormal.com/boomerang/doc/
    • SOASTA mPulse — http://www.soasta.com/mpulse
    • Netflix gzip study — http://www.slideshare.net/billwscott/improving-netflix-performance-experience
    • Nginx gzip_static — http://wiki.nginx.org/HttpGzipStaticModule
    • ImageOptim — http://imageoptim.com/
    • uncss — https://github.com/giakki/uncss
    • grunt-uncss — https://github.com/addyosmani/grunt-uncss
    • Caching — http://www.mnot.net/cache_docs/
    • Same domain CSS — http://www.jonathanklein.net/2014/02/revisiting-cookieless-domain.html
    • initcwnd — http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
    • Linux TCP Tuning — http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/
    • Prerender — https://developers.google.com/chrome/whitepapers/prerender
    • DNS prefetching — https://developer.mozilla.org/en-US/docs/Controlling_DNS_prefetching
    • Preloading CSS — http://filamentgroup.github.io/loadCSS/test/preload.html
    • FE SPoF — http://blog.patrickmeenan.com/2011/10/testing-for-frontend-spof.html
    • Page Visibility API — https://developer.mozilla.org/en-US/docs/Web/Guide/User_experience/Using_the_Page_Vis...
    • HTTP/2 — http://chimera.labs.oreilly.com/books/1230000000545/ch12.html
    • More Bandwidth is not a Magic Bullet — http://performancebeacon.com/more-bandwidth-isnt-a-magic-bullet-for...
    • The UserTiming API — http://www.html5rocks.com/en/tutorials/webperformance/usertiming/
    • The 3.5s dash for attention — http://www.slideshare.net/buddybrewer/the-35s-dash-for-attention-and-other-stuff-we-...
    • POPs & RUM — http://www.slideshare.net/rmaheshw/velocity-2015-pops-and-rum
    • Optimizing Images for Mobile — https://speakerdeck.com/tkadlec/mobile-image-processing-at-velocity-sc-2015
    • Load Images in a Worker Thread — https://aerotwist.com/blog/the-hack-is-back/
    • Optimizing the MSN Homepage — Amiya Gupta @ Velocity 2015
    • Simon Hearne’s Webperf Tools — http://requestmap.webperf.tools and http://heatmap.webperf.tools
    • What does my site cost — http://whatdoesmysitecost.com
    • Reducing JPG File size — https://medium.com/@duhroach/reducing-jpg-file-size-e5b27df3257c

  58. Thank You

  59. Philip Tellis
    @bluesmoon
    http://tech.bluesmoon.info
    http://www.soasta.com/mpulse/

  60. Image Credits
    • Apple Pie
    http://www.flickr.com/photos/24609729@N00/3353226142/
    • Puppies 7 weeks
    https://www.flickr.com/photos/29825916@N05/5760905069/
    • Zöpfli
    https://twitter.com/dougsillars/status/737558689297534976
