Slide 1

Slide 1 text

FrontEnd Performance Beginner to Expert to Insanity

Slide 2

Slide 2 text

Philip Tellis @bluesmoon http://tech.bluesmoon.info http://www.soasta.com/mpulse/

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Get the most benefit with the least effort

Slide 5

Slide 5 text

0 Start with a really slow site

Slide 6

Slide 6 text

0.1 Start Measuring

Slide 7

Slide 7 text

Or use RUM for real user data (boomerang/mPulse)

Slide 8

Slide 8 text

0.2 enable gzip
• Pre-gzip static assets (gzip_static in nginx)
• e.g. foo.js.gz is served if the request includes Accept-Encoding: gzip
• For dynamic content, use chunked transfers with gzipped chunks
• You can do this by flushing buffers on the server
http://www.slideshare.net/billwscott/improving-netflix-performance-experience
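A minimal sketch of the pre-gzip idea, assuming a Node.js build step. The helper names `preGzip` and `pickVariant` are made up for illustration, and the Accept-Encoding parsing ignores q-values for brevity; nginx's gzip_static does the serving part for you.

```javascript
const zlib = require('zlib');

// Compress an asset once, at build time, at the highest gzip level.
// The result would be written next to the original as "<name>.gz".
function preGzip(source) {
  return zlib.gzipSync(Buffer.from(source), {
    level: zlib.constants.Z_BEST_COMPRESSION,
  });
}

// Decide which file to serve for a given Accept-Encoding request header.
// Simplification: q-values (e.g. "gzip;q=0") are ignored here.
function pickVariant(fileName, acceptEncoding) {
  const encodings = (acceptEncoding || '')
    .split(',')
    .map(token => token.trim().split(';')[0].toLowerCase());
  return encodings.includes('gzip') ? fileName + '.gz' : fileName;
}

const original = 'function foo() { return "bar"; }\n'.repeat(200);
const compressed = preGzip(original);
console.log(`${original.length} bytes -> ${compressed.length} bytes`);
console.log(pickVariant('foo.js', 'gzip, deflate'));
```

Compressing ahead of time means the cost of the slowest compression level is paid once per deploy rather than once per request.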

Slide 9

Slide 9 text

0.2 enable gzip… and compression
• Understand newer compression formats like Zopfli, Brotli, and WebP.
• But also understand the trade-off between better compression and the complexity of image decoding on mobile devices
http://google-opensource.blogspot.ca/2015/09/introducing-brotli-new-compression.html
http://blog.codinghorror.com/zopfli-optimization-literally-free-bandwidth/
https://developers.google.com/speed/webp/

Slide 10

Slide 10 text

In Switzerland, you can get gluten free compression!
https://twitter.com/dougsillars/status/737558689297534976

Slide 11

Slide 11 text

0.3 Cache
Cache-Control: public, max-age=31415926
Do NOT set Last-Modified or ETag headers, to avoid conditional requests
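As a rough sketch, the caching rule could be applied to a response's header map like this. `longLivedCacheHeaders` is a hypothetical helper, and it assumes asset URLs change whenever their content changes, since nothing will revalidate for about a year.

```javascript
// max-age from the slide: 31415926 seconds is roughly 363.6 days.
const MAX_AGE_SECONDS = 31415926;

// Return a copy of `headers` with validators removed and a long-lived
// Cache-Control added, so browsers never make conditional requests.
function longLivedCacheHeaders(headers) {
  const result = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (lower === 'last-modified' || lower === 'etag') continue;
    result[name] = value;
  }
  result['Cache-Control'] = `public, max-age=${MAX_AGE_SECONDS}`;
  return result;
}

console.log(longLivedCacheHeaders({
  'Content-Type': 'text/css',
  'ETag': '"abc123"',
  'Last-Modified': 'Tue, 31 May 2016 00:00:00 GMT',
}));
```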

Slide 12

Slide 12 text

0 Congratulations You’ve just been promoted!

Slide 13

Slide 13 text

1 What the Experts do

Slide 14

Slide 14 text

1.1 CDN
• Serve your root domain through a CDN
• Put CSS on the root domain
• Chrome opens two TCP connections to the primary host, the second one is "just in case"
http://www.jonathanklein.net/2014/02/revisiting-cookieless-domain.html

Slide 15

Slide 15 text

1.1 Google Chrome will open two TCP connections to the primary host, one for the page, and the second "just in case"

Slide 16

Slide 16 text

1.2 Split JavaScript
• Critical: in the HEAD
• Enhancements: loaded async
• Flush buffers after the HEAD
• Approximately 14.4KB gzipped per file
• For HTTP/2, these would have different priorities

Slide 17

Slide 17 text

1.3 Parallelize downloads… or maybe don’t
• You can have more bandwidth, but you cannot have lower latency
• For HTTP/1.1, mitigate latency effects by parallelizing across multiple TCP sockets
• But with HTTP/2, this rule is turned on its head since multiplexing and pipelining are built in
http://www.soasta.com/blog/more-bandwidth-isnt-a-magic-bullet-for-web-performance/

Slide 18

Slide 18 text

1.4 Flush Early and Often
Getting bytes to the client ASAP will:
• avoid TCP Slow Start
• speed up CSS
• help the browser’s lookahead parser

Slide 19

Slide 19 text

TCP Handshake & Congestion Control

Slide 20

Slide 20 text

1.5 Increase initcwnd
Initial Congestion Window: the number of packets to send before waiting for an ACK
http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
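To see why initcwnd matters, here is a back-of-the-envelope model. It assumes idealised slow start (the window doubles every round trip, no loss, no receiver-window limit) and a 1460-byte segment size; both are simplifications and `roundTrips` is an illustrative name, not a real API.

```javascript
// Count the round trips needed to deliver `bytes`, starting with a
// congestion window of `initcwnd` segments that doubles each round trip.
function roundTrips(bytes, initcwnd, mss = 1460) {
  let remaining = Math.ceil(bytes / mss); // segments still to send
  let cwnd = initcwnd;
  let trips = 0;
  while (remaining > 0) {
    remaining -= cwnd; // send one full window
    cwnd *= 2;         // slow start doubles the window
    trips += 1;
  }
  return trips;
}

// A 100,000 byte response with a small and a larger initial window.
console.log(roundTrips(100000, 3));  // 5
console.log(roundTrips(100000, 10)); // 3
```

Under these assumptions, raising the window from 3 to 10 segments saves two round trips on a 100,000 byte response, which on a high-latency link is the dominant cost.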

Slide 21

Slide 21 text

1.5 Increase initcwnd @mobtec on Twitter

Slide 22

Slide 22 text

1.5b Also… net.ipv4.tcp_slow_start_after_idle=0 http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/

Slide 23

Slide 23 text

1.6 PageSpeed mod_pagespeed and ngx_pagespeed

Slide 24

Slide 24 text

1.7 Don’t just FastClick
• FastClick fires a Click onTouchEnd
• It might be better to initiate a TCP connect onTouchStart and fetch content normally onClick
• Use

Slide 25

Slide 25 text

1.8 Use UserTiming to measure your code
• The UserTiming API allows you to set performance timeline marks within your code
• performance.mark("name")
• performance.measure("name", "start_mark", "end_mark")
http://www.html5rocks.com/en/tutorials/webperformance/usertiming/
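A small sketch of the two calls working together. `measureSync` is a hypothetical wrapper, and it assumes a runtime where `performance.measure` returns the measure entry (User Timing Level 3, as in current browsers and recent Node.js); older runtimes would need `performance.getEntriesByName` instead.

```javascript
// Bracket a synchronous function with two marks and return the measure.
function measureSync(name, fn) {
  performance.mark(`${name}_start`);
  fn();
  performance.mark(`${name}_end`);
  return performance.measure(name, `${name}_start`, `${name}_end`);
}

// Example workload: sort 50,000 numbers.
const entry = measureSync('sort_numbers', () => {
  const numbers = Array.from({ length: 50000 }, (_, i) => (i * 7919) % 50000);
  numbers.sort((a, b) => a - b);
});

console.log(`${entry.name}: ${entry.duration.toFixed(2)}ms`);
```

Because the marks land on the performance timeline, a RUM library can pick them up alongside navigation and resource timings.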

Slide 26

Slide 26 text

1.9 Avoid Pre-flighted XHR
• Make sure any JS Library you use doesn’t automatically add an X-Requested-With header
http://www.soasta.com/blog/options-web-performance-with-single-page-applications/
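The check below is a simplified sketch of when a cross-origin request triggers an OPTIONS preflight. It covers only the method and header rules (not, for example, upload listeners or the Range header), and `needsPreflight` is a made-up helper rather than a browser API.

```javascript
const SIMPLE_METHODS = new Set(['GET', 'HEAD', 'POST']);
const SAFELISTED_HEADERS = new Set([
  'accept', 'accept-language', 'content-language', 'content-type',
]);
const SIMPLE_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded', 'multipart/form-data', 'text/plain',
]);

// True if a cross-origin request with this method and these
// author-set headers would be preceded by a preflight request.
function needsPreflight(method, headers = {}) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  return Object.entries(headers).some(([name, value]) => {
    const lower = name.toLowerCase();
    if (!SAFELISTED_HEADERS.has(lower)) return true;
    if (lower === 'content-type') {
      const mime = String(value).split(';')[0].trim().toLowerCase();
      return !SIMPLE_CONTENT_TYPES.has(mime);
    }
    return false;
  });
}

console.log(needsPreflight('GET', { 'X-Requested-With': 'XMLHttpRequest' })); // true
console.log(needsPreflight('GET', { Accept: 'application/json' }));           // false
```

A single custom header is enough to add an extra round trip before every such request.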

Slide 27

Slide 27 text

Relax https://www.flickr.com/photos/29825916@N05/5760905069/

Slide 28

Slide 28 text

2 Things are gonna get insane!

Slide 29

Slide 29 text

Sort in ascending order of signal latency
• Electrons through copper
• Light through fibre
• Pulsars
• Station Wagons
• Smoke Signals

Slide 30

Slide 30 text

Sort in ascending order of signal latency
1. Pulsars (light through vacuum)
2. Smoke Signals (light through air)
3. Electrons through copper / Light through fibre
4. Station Wagons (possibly highest bandwidth)

Slide 31

Slide 31 text

2.0 Bandwidth is different around the world

Slide 32

Slide 32 text

2.0 As are people

Slide 33

Slide 33 text

2.0 Study real user data Look for potential places to parallelize, predict and cache

Slide 34

Slide 34 text

2.1 Use RUM to determine optimum POP location
• Use RUM to measure latency from user to multiple POPs
• Pick POP based on lowest latency
• Adapt to changes in network topology
http://www.slideshare.net/rmaheshw/velocity-2015-pops-and-rum

Slide 35

Slide 35 text

2.1 Use RUM to determine best CDN
• Use RUM to measure latency from user to multiple CDN providers
• Dynamically pick CDN based on what works best
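One possible shape for the selection step, for POPs or CDNs alike. The endpoint names are invented, and using the median of each endpoint's latency samples is an assumption of this sketch; the talk only says to pick the lowest latency.

```javascript
// Median is less sensitive to a single slow sample than the mean.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Given latency samples (ms) per endpoint, return the endpoint with
// the lowest median latency. Endpoints without samples are skipped.
function pickFastest(samplesByEndpoint) {
  let best = null;
  for (const [endpoint, samples] of Object.entries(samplesByEndpoint)) {
    if (samples.length === 0) continue;
    const latency = median(samples);
    if (best === null || latency < best.latency) best = { endpoint, latency };
  }
  return best;
}

console.log(pickFastest({
  'pop-east': [80, 90, 400], // one outlier, median 90
  'pop-west': [120, 110, 115],
  'pop-eu': [],
}));
```

Re-running the selection on fresh samples is what lets the choice adapt when the network topology changes.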

Slide 36

Slide 36 text

2.2 pre-browsing
https://w3c.github.io/resource-hints/
http://w3c.github.io/preload/
http://filamentgroup.github.io/loadCSS/test/preload.html

Slide 37

Slide 37 text

2.2
• Does a DNS lookup for hostname mentioned in URL
• This could help reduce latency when the request shows up, for first page views at least
• Your DNS TTL needs to be long enough to survive past a page load

Slide 38

Slide 38 text

2.2
• Tells the parser that this resource will be required later on
• Browser can start downloading in the background if it has nothing better to do with its resources
• no-cache header only applies to subsequent pages

Slide 39

Slide 39 text

2.2
• Tells the browser to open a TCP connection to this host, and hold on to it
• Any CORS restrictions will apply to this connection

Slide 40

Slide 40 text

2.2
• Tells the parser that this page is likely to be requested by the user
• Browser downloads page, and all its resources, renders it, executes JavaScript and fires the onload event.
• It’s like opening the page in a hidden Tab

Slide 41

Slide 41 text

2.2
• When user follows the URL, the page just shows up (< 5ms latency)
• This is actually faster than switching tabs in the browser
• The onVisibilityChange event fires and visibilityState changes from “prerender” to “visible” or “hidden”

Slide 42

Slide 42 text

2.2 Caveats
• The page needs to be requested using GET
• The page should not require Authentication (401 response)
• Prerender will be aborted if cookies or localStorage change, or if the prerendered page has non-idempotent components
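Assuming the hints described on the preceding 2.2 slides are the dns-prefetch, prefetch, preconnect and prerender link relations from the Resource Hints draft linked above, the markup could be generated like this. `hintTag` and the example URLs are illustrative only.

```javascript
// Link relations defined by the Resource Hints draft.
const HINTS = ['dns-prefetch', 'preconnect', 'prefetch', 'prerender'];

// Build a <link> tag for a resource hint, escaping the href attribute.
function hintTag(rel, href) {
  if (!HINTS.includes(rel)) throw new Error(`unknown hint: ${rel}`);
  const safeHref = href.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
  return `<link rel="${rel}" href="${safeHref}">`;
}

// Hypothetical hosts and pages, cheapest hint first.
console.log(hintTag('dns-prefetch', '//cdn.example.com'));
console.log(hintTag('preconnect', 'https://api.example.com'));
console.log(hintTag('prefetch', '/js/checkout.js'));
console.log(hintTag('prerender', 'https://www.example.com/next-page'));
```

The hints are ordered from cheapest to most expensive: a wrong dns-prefetch costs a lookup, while a wrong prerender costs a full page load.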

Slide 43

Slide 43 text

2.2 onVisibilityChange
And while you’re at it, don’t do expensive work if the page is hidden
https://developer.mozilla.org/en-US/docs/Web/Guide/User_experience/Using_the_Page_Visibility_API
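A sketch of deferring work until the page is visible. `createVisibilityGate` is a hypothetical helper; it takes the document as a parameter so it can be exercised outside a browser, and it relies only on `visibilityState` and the `visibilitychange` event from the Page Visibility API.

```javascript
// Returns a function that runs `work` immediately if the page is
// visible, or once it next becomes visible (e.g. after a prerendered
// or background page is shown).
function createVisibilityGate(doc, work) {
  let pending = false;

  function attempt() {
    if (doc.visibilityState === 'visible') {
      pending = false;
      work();
    } else {
      pending = true; // remember that work was requested while hidden
    }
  }

  doc.addEventListener('visibilitychange', () => {
    if (pending) attempt();
  });

  return attempt;
}

// In a browser (startAnimations is a placeholder for your own function):
//   const startAnimations = createVisibilityGate(document, () => { /* ... */ });
//   startAnimations();
```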

Slide 44

Slide 44 text

2.3 Post-load Fetch optional assets after onload

Slide 45

Slide 45 text

2.4 Detect broken accept-encoding
Many Windows anti-virus products and firewalls disable gzip by munging the Accept-Encoding header
http://www.lognormal.com/blog/2012/08/17/accept-encoding-stats/
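A server-side check along these lines could feed such stats. This is a sketch: `clientAcceptsGzip` is a made-up helper, wildcard codings are ignored, and the misspelled header name in the example is a hypothetical illustration of munging, not a value taken from the linked post.

```javascript
// True if the request headers advertise gzip with a non-zero q-value.
// Header names are matched case-insensitively; a renamed or missing
// Accept-Encoding header yields false.
function clientAcceptsGzip(headers) {
  const entry = Object.entries(headers)
    .find(([name]) => name.toLowerCase() === 'accept-encoding');
  if (!entry) return false;

  return String(entry[1]).split(',').some(token => {
    const [coding, ...params] = token.trim().toLowerCase().split(';');
    const q = params.map(p => p.trim()).find(p => p.startsWith('q='));
    const weight = q === undefined ? 1 : parseFloat(q.slice(2));
    return coding.trim() === 'gzip' && weight > 0;
  });
}

console.log(clientAcceptsGzip({ 'Accept-Encoding': 'gzip, deflate' })); // true
// Hypothetical munged header name: the browser sent gzip, a proxy hid it.
console.log(clientAcceptsGzip({ 'Accept-EncodXng': 'gzip, deflate' })); // false
```

Logging the share of requests where this returns false, by user agent, is one way to spot browsers that support gzip but are not being served it.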

Slide 46

Slide 46 text

2.5 HTTP/2
• Only one TCP connection per host
• Do NOT use domain sharding
• Do NOT use sprites
• Do use Stream Multiplexing with Priorities
• Do use Server Push
http://chimera.labs.oreilly.com/books/1230000000545/ch12.html

Slide 47

Slide 47 text

2.6 Use 4:2:0 Chroma Subsampling
Chroma Subsampling takes advantage of the fact that the human visual system is less sensitive to changes in colour than luminance
“4:2:0 subsampling of JPEGs gets a 62.5% memory savings” (Tim Kadlec)
http://en.wikipedia.org/wiki/Chroma_subsampling
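One way to arrive at the quoted figure, assuming 8-bit samples and a baseline of a decoded 4-byte-per-pixel RGBA bitmap. The baseline is an assumption of this sketch; the slide does not state it.

```javascript
// Bytes per pixel for 8-bit Y'CbCr with J:a:b subsampling.
// A block J pixels wide and 2 rows high holds 2*J luma samples and
// (a + b) samples for each of the two chroma channels.
function bytesPerPixel(j, a, b) {
  const luma = 2 * j;
  const chroma = 2 * (a + b);
  return (luma + chroma) / (2 * j);
}

// Assumed baseline: a decoded RGBA bitmap at 4 bytes per pixel.
const RGBA_BYTES_PER_PIXEL = 4;

function savingsVersusRgba(j, a, b) {
  return 1 - bytesPerPixel(j, a, b) / RGBA_BYTES_PER_PIXEL;
}

console.log(bytesPerPixel(4, 4, 4));     // 3   (no subsampling)
console.log(bytesPerPixel(4, 2, 0));     // 1.5
console.log(savingsVersusRgba(4, 2, 0)); // 0.625, i.e. 62.5%
```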

Slide 48

Slide 48 text

2.7 Resize Images for target Device Dimensions
Resizing Images for specific screen sizes could be the difference between 1.5s and 30ms
https://speakerdeck.com/tkadlec/mobile-image-processing-at-velocity-sc-2015
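A minimal sketch of choosing which pre-resized variant to serve. The list of available widths and the helper name `pickImageWidth` are assumptions for illustration; in markup, `srcset` and `sizes` let the browser make a similar choice itself.

```javascript
// Pick the smallest pre-generated width that still covers the device's
// physical pixels; fall back to the largest variant if none is big enough.
function pickImageWidth(cssWidth, devicePixelRatio, availableWidths) {
  const needed = Math.ceil(cssWidth * devicePixelRatio);
  const sorted = [...availableWidths].sort((a, b) => a - b);
  const fit = sorted.find(width => width >= needed);
  return fit !== undefined ? fit : sorted[sorted.length - 1];
}

const variants = [320, 640, 1024, 2048]; // hypothetical resized copies
console.log(pickImageWidth(360, 2, variants));  // 1024 (needs 720px)
console.log(pickImageWidth(320, 1, variants));  // 320
console.log(pickImageWidth(1500, 2, variants)); // 2048 (largest available)
```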

Slide 49

Slide 49 text

2.8 Decode large images in a Worker Thread

    fetch(imageURL)
      // Get the image as a blob.
      .then(response => response.blob())
      // Decode the image.
      .then(blobData => createImageBitmap(blobData))
      // Send it to the main thread
      .then(imageBitmap => {
        self.postMessage({ imageBitmap }, [imageBitmap]);
      }, err => {
        self.postMessage({ err });
      });

https://github.com/googlechrome/offthread-image

Slide 50

Slide 50 text

2.9 Don’t force layout operations
• DOM manipulations followed by a read of invalidated properties forces a layout
• This has a huge CPU impact
• Read before write
• Batch update
• Move operations into the HEAD
Source: Amiya Gupta @ Velocity 2015
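The difference between interleaved and batched access can be shown with a stand-in for the DOM. `FakeElement` below is not a real DOM class; it only counts how often a read of `offsetWidth` follows a write, which is when a browser would have to run layout.

```javascript
// Stand-in for a DOM element: reading offsetWidth while the page is
// dirty (after a style write) counts as one forced layout.
class FakeElement {
  constructor(page, width) {
    this.page = page;
    this.width = width;
  }
  get offsetWidth() {
    if (this.page.dirty) {
      this.page.layouts += 1;
      this.page.dirty = false;
    }
    return this.width;
  }
  setWidth(px) {
    this.width = px;
    this.page.dirty = true;
  }
}

function makePage(count) {
  const page = { dirty: false, layouts: 0, elements: [] };
  for (let i = 0; i < count; i++) {
    page.elements.push(new FakeElement(page, 100 + i));
  }
  return page;
}

// Read, write, read, write: every read after the first forces a layout.
function halveInterleaved(elements) {
  for (const el of elements) el.setWidth(el.offsetWidth / 2);
}

// Read before write: collect all widths first, then apply all updates.
function halveBatched(elements) {
  const widths = elements.map(el => el.offsetWidth);
  elements.forEach((el, i) => el.setWidth(widths[i] / 2));
}

const slowPage = makePage(10);
halveInterleaved(slowPage.elements);
const fastPage = makePage(10);
halveBatched(fastPage.elements);
console.log(slowPage.layouts, fastPage.layouts); // 9 0
```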

Slide 51

Slide 51 text

2.10 Understand 3PoFs
Use blackhole.webpagetest.org to test for 3rd party single points of failure
http://blog.patrickmeenan.com/2011/10/testing-for-frontend-spof.html

Slide 52

Slide 52 text

2.10 3PoFs Request Map by Simon Hearne

Slide 53

Slide 53 text

2.11 Understand your SpeedIndex http://heatmap.webperf.tools/render/150527_SH_827d2295b66e4180892db766eaf8a492/10000

Slide 54

Slide 54 text

2.12 What does your site cost? https://whatdoesmysitecost.com/test/160531_68_43bd4bcf91f91ca0bed2b3dfeb5eb67e

Slide 55

Slide 55 text

2.13 Prioritize optimizations based on user impact Conversion Impact Score in mPulse DSWB

Slide 56

Slide 56 text

2.14 Become a WebPageTest power user
• Check out the comparison view
• Collect packet captures
• Use Wireshark
• Test out different network types

Slide 57

Slide 57 text

References
• WebPageTest: http://webpagetest.org
• Boomerang: http://www.lognormal.com/boomerang/doc/
• SOASTA mPulse: http://www.soasta.com/mpulse
• Netflix gzip study: http://www.slideshare.net/billwscott/improving-netflix-performance-experience
• Nginx gzip_static: http://wiki.nginx.org/HttpGzipStaticModule
• ImageOptim: http://imageoptim.com/
• uncss: https://github.com/giakki/uncss
• grunt-uncss: https://github.com/addyosmani/grunt-uncss
• Caching: http://www.mnot.net/cache_docs/
• Same domain CSS: http://www.jonathanklein.net/2014/02/revisiting-cookieless-domain.html
• initcwnd: http://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
• Linux TCP Tuning: http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/
• Prerender: https://developers.google.com/chrome/whitepapers/prerender
• DNS prefetching: https://developer.mozilla.org/en-US/docs/Controlling_DNS_prefetching
• Preloading CSS: http://filamentgroup.github.io/loadCSS/test/preload.html
• FE SPoF: http://blog.patrickmeenan.com/2011/10/testing-for-frontend-spof.html
• Page Visibility API: https://developer.mozilla.org/en-US/docs/Web/Guide/User_experience/Using_the_Page_Vis...
• HTTP/2: http://chimera.labs.oreilly.com/books/1230000000545/ch12.html
• More Bandwidth is not a Magic Bullet: http://performancebeacon.com/more-bandwidth-isnt-a-magic-bullet-for...
• The UserTiming API: http://www.html5rocks.com/en/tutorials/webperformance/usertiming/
• The 3.5s dash for attention: http://www.slideshare.net/buddybrewer/the-35s-dash-for-attention-and-other-stuff-we-...
• POPs & RUM: http://www.slideshare.net/rmaheshw/velocity-2015-pops-and-rum
• Optimizing Images for Mobile: https://speakerdeck.com/tkadlec/mobile-image-processing-at-velocity-sc-2015
• Load Images in a Worker Thread: https://aerotwist.com/blog/the-hack-is-back/
• Optimizing the MSN Homepage: Amiya Gupta @ Velocity 2015
• Simon Hearne’s Webperf Tools: http://requestmap.webperf.tools and http://heatmap.webperf.tools
• What does my site cost: http://whatdoesmysitecost.com
• Reducing JPG File size: https://medium.com/@duhroach/reducing-jpg-file-size-e5b27df3257c

Slide 58

Slide 58 text

Thank You

Slide 59

Slide 59 text

Philip Tellis @bluesmoon http://tech.bluesmoon.info http://www.soasta.com/mpulse/

Slide 60

Slide 60 text

Image Credits
• Apple Pie: http://www.flickr.com/photos/24609729@N00/3353226142/
• Puppies 7 weeks: https://www.flickr.com/photos/29825916@N05/5760905069/
• Zöpfli: https://twitter.com/dougsillars/status/737558689297534976