Frontend Performance: Beginner to Expert to Insanity

Philip Tellis @bluesmoon http://tech.bluesmoon.info http://www.soasta.com/mpulse/

Get the most benefit with the least effort

0 Start with a really slow site

0.1 Start Measuring

Or use RUM for real user data (boomerang/mPulse)

0.2 Enable gzip
• Pre-gzip static assets (gzip_static in nginx)
• e.g. foo.js.gz is served if Accept-Encoding: gzip is included
• For dynamic content, use chunked transfers with gzipped chunks
• You can do this by flushing buffers on the server
http://www.slideshare.net/billwscott/improving-netflix-performance-experience
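The pre-gzip rule above can be sketched as a small decision function. This is only an illustration: `acceptsGzip`, `pickStaticVariant` and the `availableFiles` set are hypothetical names, the q-value parsing is deliberately simplified, and in practice nginx's gzip_static makes this choice for you.

```javascript
// Hypothetical helper: does the Accept-Encoding header allow gzip?
function acceptsGzip(acceptEncoding) {
  if (!acceptEncoding) return false;
  return acceptEncoding.split(",").some((part) => {
    const [coding, ...params] = part.trim().toLowerCase().split(";");
    if (coding.trim() !== "gzip") return false;
    // "gzip;q=0" means the client explicitly refuses gzip.
    const q = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
    return q === undefined || parseFloat(q.slice(2)) > 0;
  });
}

// Hypothetical helper: choose between foo.js and a pre-compressed foo.js.gz.
// `availableFiles` stands in for a lookup against the file system.
function pickStaticVariant(path, acceptEncoding, availableFiles) {
  const gz = path + ".gz";
  if (acceptsGzip(acceptEncoding) && availableFiles.has(gz)) {
    return {
      file: gz,
      headers: { "Content-Encoding": "gzip", Vary: "Accept-Encoding" },
    };
  }
  return { file: path, headers: { Vary: "Accept-Encoding" } };
}
```

Sending Vary: Accept-Encoding in both branches is a design choice so that shared caches keep the compressed and uncompressed responses apart.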

0.2 Enable gzip… and compression
• Understand newer compression formats like Zopfli, Brotli, and WebP
• But also understand the trade-off between better compression and complexity of image decoding on mobile devices
http://google-opensource.blogspot.ca/2015/09/introducing-brotli-new-compression.html
http://blog.codinghorror.com/zopfli-optimization-literally-free-bandwidth/
https://developers.google.com/speed/webp/

In Switzerland, you can get gluten free compression!
https://twitter.com/dougsillars/status/737558689297534976

0.3 Cache
Cache-Control: public, max-age=31415926
Do NOT set Last-Modified or ETag headers, to avoid conditional requests
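A minimal sketch of the caching rule above, applied to a plain object of response headers. The function name `longLivedCacheHeaders` is hypothetical, and the sketch assumes the asset has a fingerprinted file name (for example app.3f9a1c.js) so that a changed file gets a new URL; the slide does not state that assumption.

```javascript
const ONE_DAY_SECONDS = 24 * 60 * 60;

// Hypothetical helper: strip validators and set a far-future lifetime.
function longLivedCacheHeaders(upstreamHeaders) {
  const headers = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    const lower = name.toLowerCase();
    // Drop validators so the browser has nothing to revalidate with,
    // and drop any existing Cache-Control so only ours remains.
    if (lower === "etag" || lower === "last-modified" || lower === "cache-control") {
      continue;
    }
    headers[name] = value;
  }
  headers["Cache-Control"] = "public, max-age=31415926";
  return headers;
}

// 31415926 seconds is a little under a year.
const maxAgeDays = 31415926 / ONE_DAY_SECONDS; // about 363.6 days
```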

0 Congratulations You’ve just been promoted!

1 What the Experts do

1.1 CDN
• Serve your root domain through a CDN
• Put CSS on the root domain
• Chrome opens two TCP connections to the primary host, the second one is "just in case"
http://www.jonathanklein.net/2014/02/revisiting-cookieless-domain.html

1.1 Google Chrome will open two TCP connections to the primary host, one for the page, and the second "just in case"

1.2 Split JavaScript
• Critical: in the HEAD
• Enhancements: loaded async
• Flush buffers after the HEAD
• Approximately 14.4 KB gzipped per file
• For HTTP/2, these would have different priorities

1.3 Parallelize downloads… or maybe don’t
• You can have more bandwidth, but you cannot have lower latency
• For HTTP/1.1, mitigate latency effects by parallelizing across multiple TCP sockets
• But with HTTP/2, this rule is turned on its head since multiplexing and pipelining are built in
http://www.soasta.com/blog/more-bandwidth-isnt-a-magic-bullet-for-web-performance/
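To see why latency matters more than bandwidth here, a back-of-the-envelope model helps. The sketch below is an assumption-laden simplification (total time = round trips × RTT + bytes ÷ bandwidth); the function name and all numbers are made up for illustration, not measurements.

```javascript
// Simplified model (an assumption, not a measurement):
//   total time = round trips * RTT + bytes / bandwidth
function loadTimeMs({ bytes, roundTrips, rttMs, bandwidthBytesPerSec }) {
  const transferMs = (bytes * 1000) / bandwidthBytesPerSec;
  return roundTrips * rttMs + transferMs;
}

// Hypothetical page: 500 KB fetched over 20 sequential round trips.
const page = { bytes: 500000, roundTrips: 20 };

// 10 Mbit/s (1,250,000 bytes/s) at 100 ms RTT.
const baseline = loadTimeMs({ ...page, rttMs: 100, bandwidthBytesPerSec: 1250000 });
// Double the bandwidth, same RTT.
const doubleBandwidth = loadTimeMs({ ...page, rttMs: 100, bandwidthBytesPerSec: 2500000 });
// Same bandwidth, half the RTT.
const halfLatency = loadTimeMs({ ...page, rttMs: 50, bandwidthBytesPerSec: 1250000 });

console.log({ baseline, doubleBandwidth, halfLatency });
```

In this model doubling the bandwidth saves 200 ms, while halving the round-trip time saves 1000 ms. Since latency usually cannot be bought, the practical lever is reducing or overlapping round trips.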

1.4 Flush Early and Often
Getting bytes to the client ASAP will:
• Avoid TCP Slow Start
• Speed up CSS
• Help the browser’s lookahead parser

TCP Handshake & Congestion Control

1.5 Increase initcwnd
Initial Congestion Window: Number of packets to send before waiting for an ACK
http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
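The effect of the initial congestion window can be estimated with an idealized slow-start model. The sketch below assumes no packet loss, a window that doubles every round trip, and a 1460-byte segment size; `roundTripsToSend` is a hypothetical name, and real TCP stacks behave less tidily.

```javascript
// Idealized slow start: the window doubles every round trip, no loss.
// The MSS of 1460 bytes is an assumption (typical for Ethernet paths).
function roundTripsToSend(bytes, initcwnd, mss = 1460) {
  let remaining = Math.ceil(bytes / mss); // segments still to send
  let cwnd = initcwnd;
  let trips = 0;
  while (remaining > 0) {
    remaining -= cwnd; // send one full window
    cwnd *= 2;         // window grows after the ACKs arrive
    trips += 1;
  }
  return trips;
}

console.log(roundTripsToSend(14600, 3));  // 3
console.log(roundTripsToSend(14600, 10)); // 1
```

Under these assumptions a response of roughly 14 KB needs three round trips with an initcwnd of 3 but only one with an initcwnd of 10.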

1.5 Increase initcwnd @mobtec on Twitter

1.5b Also…
net.ipv4.tcp_slow_start_after_idle=0
http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/

1.6 PageSpeed mod_pagespeed and ngx_pagespeed

1.7 Don’t just FastClick
• FastClick fires a Click onTouchEnd
• It might be better to initiate a TCP connect onTouchStart and fetch content normally onClick
• Use <link rel="preconnect">
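One possible way to wire up the touchstart idea above is to inject a preconnect hint the first time a link is touched and leave the click to happen normally. `preconnectOnTouchStart` is a hypothetical helper, and `doc` is passed in only so the sketch can be exercised outside a browser.

```javascript
// Sketch: warm up the connection on touchstart, then let the normal
// click-driven navigation or fetch proceed as usual.
function preconnectOnTouchStart(doc, anchor) {
  const origin = new URL(anchor.href).origin;
  let done = false;
  anchor.addEventListener(
    "touchstart",
    () => {
      if (done) return; // one hint per link is enough
      done = true;
      const hint = doc.createElement("link");
      hint.rel = "preconnect";
      hint.href = origin;
      doc.head.appendChild(hint);
    },
    { passive: true } // never blocks scrolling
  );
}
```

In a page this might be called as `preconnectOnTouchStart(document, someAnchorElement)` for links that point at another origin.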

1.8 Use UserTiming to measure your code
• The UserTiming API allows you to set performance timeline marks within your code
• performance.mark("name")
• performance.measure("name", "start_mark", "end_mark")
http://www.html5rocks.com/en/tutorials/webperformance/usertiming/
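A small sketch of how the two calls above fit together. The wrapper name `timed` and the mark naming scheme are assumptions made for illustration; the sketch also assumes a runtime with a global `performance` object (current browsers and recent Node.js versions).

```javascript
// Wrap a synchronous function in a pair of marks and one measure.
function timed(name, fn) {
  const start = name + ":start";
  const end = name + ":end";
  performance.mark(start);
  try {
    return fn();
  } finally {
    // Runs even if fn throws, so the measure is always recorded.
    performance.mark(end);
    performance.measure(name, start, end);
  }
}

const total = timed("sum-loop", () => {
  let sum = 0;
  for (let i = 1; i <= 1000; i++) sum += i;
  return sum;
});

// Measures can be read back later, for example by a RUM library.
const [entry] = performance.getEntriesByName("sum-loop", "measure");
console.log(total, entry.duration);
```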

1.9 Avoid Pre-flighted XHR
• Make sure any JS Library you use doesn’t automatically add an X-Requested-With header
http://www.soasta.com/blog/options-web-performance-with-single-page-applications/
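To check whether a cross-origin request is likely to be preflighted, the CORS "simple request" rules can be approximated as below. This is a simplified sketch: `needsPreflight` is a hypothetical name, and the check ignores several details of the Fetch specification (such as limits on header values).

```javascript
// Simplified version of the CORS "simple request" rules.
const SIMPLE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SAFELISTED_HEADERS = new Set([
  "accept",
  "accept-language",
  "content-language",
  "content-type",
]);
const SIMPLE_CONTENT_TYPES = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

// Returns true if a cross-origin request with this method and these
// author-set headers would trigger an OPTIONS preflight.
function needsPreflight(method, headers = {}) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (!SAFELISTED_HEADERS.has(lower)) return true; // e.g. X-Requested-With
    if (lower === "content-type") {
      const mime = String(value).split(";")[0].trim().toLowerCase();
      if (!SIMPLE_CONTENT_TYPES.has(mime)) return true;
    }
  }
  return false;
}

console.log(needsPreflight("GET", { "X-Requested-With": "XMLHttpRequest" })); // true
console.log(needsPreflight("GET", { Accept: "application/json" }));           // false
```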

Relax https://www.flickr.com/photos/29825916@N05/5760905069/

2 Things are gonna get insane!

Sort in ascending order of signal latency
• Electrons through copper
• Light through fibre
• Pulsars
• Station Wagons
• Smoke Signals

Sort in ascending order of signal latency
1. Pulsars (light through vacuum)
2. Smoke Signals (light through air)
3. Electrons through copper / Light through fibre
4. Station Wagons (possibly highest bandwidth)

2.0 Bandwidth is different around the world

2.0 As are people

2.0 Study real user data
Look for potential places to parallelize, predict and cache

2.1 Use RUM to determine optimum POP location
• Use RUM to measure latency from user to multiple POPs
• Pick POP based on lowest latency
• Adapt to changes in network topology
http://www.slideshare.net/rmaheshw/velocity-2015-pops-and-rum

2.1 Use RUM to determine best CDN
• Use RUM to measure latency from user to multiple CDN providers
• Dynamically pick CDN based on what works best
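The selection step for POPs or CDNs can be sketched as "pick the candidate with the lowest median latency". The function names and the shape of `samplesByProvider` are assumptions for illustration; a real system would also segment by network and geography and refresh the data continuously.

```javascript
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// `samplesByProvider` maps a provider (or POP) name to RUM latency
// samples in milliseconds. Returns null if there is no data at all.
function pickLowestLatency(samplesByProvider) {
  let best = null;
  for (const [name, samples] of Object.entries(samplesByProvider)) {
    if (samples.length === 0) continue; // no data yet, cannot rank it
    const medianMs = median(samples);
    if (best === null || medianMs < best.medianMs) {
      best = { name, medianMs };
    }
  }
  return best;
}
```

The median is used here rather than the mean so that a single very slow sample does not rule out an otherwise fast provider.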

2.2 pre-browsing
<link rel="prerender" href="url">
<link rel="preload" href="url" as="type">
<link rel="dns-prefetch" href="url">
<link rel="preconnect" href="url">
https://w3c.github.io/resource-hints/
http://w3c.github.io/preload/
http://filamentgroup.github.io/loadCSS/test/preload.html

2.2 <link rel="dns-prefetch">
• Does a DNS lookup for hostname mentioned in URL
• This could help reduce latency when the request shows up, for first page views at least
• Your DNS TTL needs to be long enough to survive past a page load

2.2 <link rel="preload">
• Tells the parser that this resource will be required later on
• Browser can start downloading in the background if it has nothing better to do with its resources
• no-cache header only applies to subsequent pages

2.2 <link rel="preconnect">
• Tells the browser to open a TCP connection to this host, and hold on to it
• Any CORS restrictions will apply to this connection

2.2 <link rel="prerender">
• Tells the parser that this page is likely to be requested by the user
• Browser downloads page, and all its resources, renders it, executes JavaScript and fires the onload event
• It’s like opening the page in a hidden Tab

2.2 <link rel="prerender">
• When user follows the URL, the page just shows up (< 5ms latency)
• This is actually faster than switching tabs in the browser
• The onVisibilityChange event fires and visibilityState changes from "prerender" to "visible" or "hidden"

2.2 <link rel="prerender">: Caveats
• The page needs to be requested using GET
• The page should not require Authentication (401 response)
• Prerender will be aborted if cookies, or localStorage change, or if the prerendered page has non-idempotent components

2.2 onVisibilityChange
And while you’re at it, don’t do expensive work if the page is hidden
https://developer.mozilla.org/en-US/docs/Web/Guide/User_experience/Using_the_Page_Visibility_API
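A rough sketch of pausing periodic work while the page is hidden or still prerendering. The DOM event is named `visibilitychange`; `runWhileVisible` is a hypothetical helper, and `doc` is injected so the sketch can be tried without a browser.

```javascript
// Run `tick` on an interval, but only while the page is visible.
// Any other state ("hidden", "prerender") stops the timer.
function runWhileVisible(doc, tick, intervalMs) {
  let timer = null;
  const update = () => {
    if (doc.visibilityState === "visible") {
      if (timer === null) timer = setInterval(tick, intervalMs);
    } else if (timer !== null) {
      clearInterval(timer);
      timer = null;
    }
  };
  doc.addEventListener("visibilitychange", update);
  update(); // apply the current state immediately
  return { isRunning: () => timer !== null };
}
```

In a page this might be used as `runWhileVisible(document, refreshTicker, 5000)`, where `refreshTicker` is whatever expensive update the page performs.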

2.3 Post-load Fetch optional assets after onload
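One way to fetch optional assets after onload is to inject async script tags from a load handler. `loadAfterOnload` is a hypothetical helper, and `win` and `doc` are parameters only so that the sketch also runs outside a browser.

```javascript
// Inject optional scripts once the load event has fired.
function loadAfterOnload(win, doc, urls) {
  const inject = () => {
    for (const url of urls) {
      const script = doc.createElement("script");
      script.src = url;
      script.async = true;
      doc.head.appendChild(script);
    }
  };
  if (doc.readyState === "complete") {
    inject(); // the load event has already happened
  } else {
    win.addEventListener("load", inject, { once: true });
  }
}
```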

2.4 Detect broken accept-encoding
Many Windows anti-viruses and firewalls disable gzip by munging the Accept-Encoding header
http://www.lognormal.com/blog/2012/08/17/accept-encoding-stats/

2.5 HTTP/2
• Only one TCP connection per host
• Do NOT use domain sharding
• Do NOT use sprites
• Do use Stream Multiplexing with Priorities
• Do use Server Push
http://chimera.labs.oreilly.com/books/1230000000545/ch12.html

2.6 Use 4:2:0 Chroma Subsampling
Chroma Subsampling takes advantage of the fact that the human visual system is less sensitive to changes in colour than luminance
“4:2:0 subsampling of JPEGs gets a 62.5% memory savings” (Tim Kadlec)
http://en.wikipedia.org/wiki/Chroma_subsampling

2.7 Resize Images for target Device Dimensions
Resizing Images for specific screen sizes could be the difference between 1.5s and 30ms
https://speakerdeck.com/tkadlec/mobile-image-processing-at-velocity-sc-2015

2.8 Decode large images in a Worker Thread
    fetch(imageURL)
      // Get the image as a blob.
      .then(response => response.blob())
      // Decode the image.
      .then(blobData => createImageBitmap(blobData))
      // Send it to the main thread
      .then(imageBitmap => {
        self.postMessage({ imageBitmap }, [imageBitmap]);
      }, err => {
        self.postMessage({ err });
      });
https://github.com/googlechrome/offthread-image

2.9 Don’t force layout operations
• DOM manipulations followed by a read of invalidated properties forces a layout
• This has a huge CPU impact
• Read before write
• Batch update
• Move operations into the HEAD
Amiya Gupta @ Velocity 2015
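The "read before write" and "batch update" advice can be sketched as follows: collect every layout read first, then apply every style write. `equalizeHeights` is a hypothetical example task, chosen only because it needs both reads and writes.

```javascript
// Make all elements as tall as the tallest one.
// Interleaving the reads and writes (read one, write one, read the next)
// could force the browser to recompute layout once per element.
function equalizeHeights(elements) {
  // Phase 1: reads only. Layout is computed at most once.
  const heights = elements.map((el) => el.offsetHeight);
  const tallest = Math.max(0, ...heights);

  // Phase 2: writes only. Nothing reads layout in between.
  for (const el of elements) {
    el.style.height = tallest + "px";
  }
  return tallest;
}
```

In a page this might be called as `equalizeHeights([...document.querySelectorAll(".card")])`, where `.card` is a made-up selector.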

2.10 Understand 3PoFs
Use blackhole.webpagetest.org to test for 3rd party single points of failure
http://blog.patrickmeenan.com/2011/10/testing-for-frontend-spof.html

2.10 3PoFs Request Map by Simon Hearne

2.11 Understand your SpeedIndex http://heatmap.webperf.tools/render/150527_SH_827d2295b66e4180892db766eaf8a492/10000

2.12 What does your site cost? https://whatdoesmysitecost.com/test/160531_68_43bd4bcf91f91ca0bed2b3dfeb5eb67e

2.13 Prioritize optimizations based on user impact
Conversion Impact Score in mPulse DSWB

2.14 Become a WebPageTest power user
• Check out the comparison view
• Collect packet captures
• Use Wireshark
• Test out different network types

References
• WebPageTest: http://webpagetest.org
• Boomerang: http://www.lognormal.com/boomerang/doc/
• SOASTA mPulse: http://www.soasta.com/mpulse
• Netflix gzip study: http://www.slideshare.net/billwscott/improving-netflix-performance-experience
• Nginx gzip_static: http://wiki.nginx.org/HttpGzipStaticModule
• ImageOptim: http://imageoptim.com/
• uncss: https://github.com/giakki/uncss
• grunt-uncss: https://github.com/addyosmani/grunt-uncss
• Caching: http://www.mnot.net/cache_docs/
• Same domain CSS: http://www.jonathanklein.net/2014/02/revisiting-cookieless-domain.html
• initcwnd: http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
• Linux TCP Tuning: http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/
• Prerender: https://developers.google.com/chrome/whitepapers/prerender
• DNS prefetching: https://developer.mozilla.org/en-US/docs/Controlling_DNS_prefetching
• Preloading CSS: http://filamentgroup.github.io/loadCSS/test/preload.html
• FE SPoF: http://blog.patrickmeenan.com/2011/10/testing-for-frontend-spof.html
• Page Visibility API: https://developer.mozilla.org/en-US/docs/Web/Guide/User_experience/Using_the_Page_Vis...
• HTTP/2: http://chimera.labs.oreilly.com/books/1230000000545/ch12.html
• More Bandwidth is not a Magic Bullet: http://performancebeacon.com/more-bandwidth-isnt-a-magic-bullet-for...
• The UserTiming API: http://www.html5rocks.com/en/tutorials/webperformance/usertiming/
• The 3.5s dash for attention: http://www.slideshare.net/buddybrewer/the-35s-dash-for-attention-and-other-stuff-we-...
• POPs & RUM: http://www.slideshare.net/rmaheshw/velocity-2015-pops-and-rum
• Optimizing Images for Mobile: https://speakerdeck.com/tkadlec/mobile-image-processing-at-velocity-sc-2015
• Load Images in a Worker Thread: https://aerotwist.com/blog/the-hack-is-back/
• Optimizing the MSN Homepage: Amiya Gupta @ Velocity 2015
• Simon Hearne’s Webperf Tools: http://requestmap.webperf.tools and http://heatmap.webperf.tools
• What does my site cost: http://whatdoesmysitecost.com
• Reducing JPG File size: https://medium.com/@duhroach/reducing-jpg-file-size-e5b27df3257c

Thank You

Image Credits
• Apple Pie: http://www.flickr.com/photos/24609729@N00/3353226142/
• Puppies 7 weeks: https://www.flickr.com/photos/29825916@N05/5760905069/
• Zöpfli: https://twitter.com/dougsillars/status/737558689297534976
