Embracing the Network – ColdFront, September 2015

Patrick Hamann
September 03, 2015

The network is intrinsically unreliable. Moreover, the network is out of your control as a developer. Therefore, we must design systems which embrace the unpredictability of the network and defend against it at all costs. How can you prioritise the delivery of your core content? What best practices can you use to optimise your assets? How can we design interfaces which adapt and respond to changing network conditions? And finally, how are new APIs such as ServiceWorker changing the way we think about the network?

During this talk Patrick will share his experiences delivering high-performance websites to millions of users over the past 3 years at The Guardian and Financial Times: websites which, most importantly, are resilient to the network.

Transcript

  1. Embracing the network: modern techniques for building resilient front ends

    Patrick Hamann (@patrickhamann), ColdFront Conference, September 2015
  2. None
  3. None
  4. Why?

  5. https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing <a href="http://next.ft.com">FT.com</a>

  6. None
  7. None
  8. None
  9. The fallacies of distributed computing

    1. The network is reliable.
    2. Latency is zero.
    3. Bandwidth is infinite.
    4. The network is secure.
    5. Topology doesn't change.
    6. There is one administrator.
    7. Transport cost is zero.
    8. The network is homogeneous.

    Peter Deutsch, Sun Microsystems, 1994. https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
  10. Multiple points of failure

    • Internal latency: Firewalls, Load Balancers, Servers
    • Internet routing latency: CDNs, ISPs, Caches, Proxies
    • Control plane latency

    ~600ms on average UK 3G connection ~200ms

    Dmytrii Shchadei, http://www.slideshare.net/metrofun/reduce-mobile-latency
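  11. The radio state machine http://developer.android.com/training/efficient-downloads/efficient-network-access.html

    • States: Radio standby, Radio low power, Radio full power
    • Radio standby to Radio full power: 2s latency
    • Radio low power to Radio full power: 1.5s latency
    • Radio full power to Radio low power: Radio idle for 5 seconds
    • Radio low power to Radio standby: Radio idle for 12 seconds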
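To make the cost of the radio state machine concrete, here is a minimal sketch that adds up the wake-up latency paid by a series of requests. It uses the timings from the slide, but how they chain together is an assumption: the radio is taken to drop to low power after 5 seconds idle and to standby after a further 12 seconds, and each request is treated as instantaneous. The function names are hypothetical.

```javascript
// Timings copied from the slide; the way they chain together is an assumption.
const IDLE_BEFORE_LOW_POWER_MS = 5000;  // full power -> low power
const IDLE_BEFORE_STANDBY_MS = 12000;   // low power -> standby (assumed to count from entering low power)
const WAKE_FROM_LOW_POWER_MS = 1500;    // low power -> full power
const WAKE_FROM_STANDBY_MS = 2000;      // standby -> full power

// Hypothetical helper: the state the radio has dropped to after idleMs of silence.
function radioStateAfterIdle(idleMs) {
  if (idleMs < IDLE_BEFORE_LOW_POWER_MS) return 'full';
  if (idleMs < IDLE_BEFORE_LOW_POWER_MS + IDLE_BEFORE_STANDBY_MS) return 'low';
  return 'standby';
}

// Sum the wake-up latency paid by a series of requests.
// requestTimesMs must be sorted ascending; the radio starts in standby.
function totalWakeLatency(requestTimesMs) {
  let total = 0;
  let previous = null;
  for (const time of requestTimesMs) {
    const state = previous === null ? 'standby' : radioStateAfterIdle(time - previous);
    if (state === 'standby') total += WAKE_FROM_STANDBY_MS;
    if (state === 'low') total += WAKE_FROM_LOW_POWER_MS;
    previous = time;
  }
  return total;
}

console.log(totalWakeLatency([0, 100, 200]));     // batched requests: 2000
console.log(totalWakeLatency([0, 20000, 40000])); // spread out requests: 6000
```

Under these assumptions, three requests sent together pay for one wake-up, while the same three requests spread 20 seconds apart pay for three.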
  12. Request Map: http://requestmap.webperf.tools/

  13. Failure is inevitable

    • 1,000,000 page views a day
    • 3 years product life-time
    • (1000000*360)*3 = 1,080,000,000
    • 1 billion opportunities for something to go wrong.
  14. Baking in the assumption that everything can and will fail leads you to think differently about how you solve problems.

    Sam Newman, Building Microservices, O'Reilly 2015 http://shop.oreilly.com/product/0636920033158.do
  15. 1. Testing and monitoring failure 2. Designing for failure 3. Embracing failure 4. The future
  16. Testing and monitoring failure

  17. A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system.

    https://en.wikipedia.org/wiki/Single_point_of_failure
  18. Front end SPOFs http://www.stevesouders.com/blog/2010/06/01/frontend-spof/

                               Chrome       Firefox      IE             Opera        Safari
    External script            Blank below  Blank below  Blank below    Blank below  Blank below
    Stylesheet                 Flash        Flash        Blank below    Flash        Blank below
    Inlined @font-face         Delayed      Flash        Flash          Flash        Delayed
    Stylesheet @font-face      Delayed      Flash        Totally blank  Flash        Delayed
    Script then @font-face     Delayed      Flash        Totally blank  Flash        Delayed
  19. http://uk.businessinsider.com/doubleclick-for-publishers-down-2014-11

  20. http://uk.businessinsider.com/doubleclick-for-publishers-down-2014-11

  21. http://uk.businessinsider.com/doubleclick-for-publishers-down-2014-11

  22. None
  23. https://twitter.com/scottjehl/status/636303029533282304

  24. https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing

  25. Jeremy Keith — https://flic.kr/p/o9thWy

  26. Resource timing API http://www.w3.org/TR/resource-timing/

    var resourceList = window.performance.getEntriesByType('resource');
    for (var i = 0; i < resourceList.length; i++) {
      var type = resourceList[i].initiatorType;
      // Only report resources requested by <link> and <script> elements
      if (type === 'link' || type === 'script') {
        // sendBeacon() cannot send a plain object, so serialise it first
        navigator.sendBeacon('https://stats.ft.com/', JSON.stringify({
          resource: resourceList[i].name,
          timing: resourceList[i].responseEnd - resourceList[i].startTime
        }));
      }
    }
  27. None
  28. None
  29. Testing and monitoring failure: summary

    • Test 1st and 3rd party dependencies for SPOFs
    • Use network shaping to simulate mobile conditions
    • Monitor all response times, timeouts and errors
    • Alert if metrics pass their thresholds
    • Consider creating chaos in production
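As an illustration of the "alert if metrics pass their thresholds" point, here is a minimal sketch that checks one monitoring window against two thresholds. The shapes of `metrics` and `thresholds`, the choice of the 95th percentile and the threshold values are all assumptions made for this example, not something taken from the talk.

```javascript
// Nearest-rank percentile of a list of numbers (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Compare one monitoring window against thresholds and return alert messages.
// `metrics` and `thresholds` are hypothetical shapes chosen for this sketch.
function checkThresholds(metrics, thresholds) {
  const alerts = [];
  const p95 = percentile(metrics.responseTimesMs, 95);
  if (p95 > thresholds.p95Ms) {
    alerts.push('p95 response time ' + p95 + 'ms exceeds ' + thresholds.p95Ms + 'ms');
  }
  const failed = metrics.timeouts + metrics.errors;
  const failureRate = metrics.requests === 0 ? 0 : failed / metrics.requests;
  if (failureRate > thresholds.failureRate) {
    alerts.push('failure rate ' + failureRate + ' exceeds ' + thresholds.failureRate);
  }
  return alerts;
}

const alerts = checkThresholds(
  { responseTimesMs: [120, 180, 240, 300, 4200], requests: 5, timeouts: 1, errors: 0 },
  { p95Ms: 1000, failureRate: 0.05 }
);
console.log(alerts); // two alerts: slow p95 and high failure rate
```

In practice the response times could come from the Resource Timing beacons, with the check running wherever those beacons are collected.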
  30. 1. Testing and monitoring failure 2. Designing for failure 3. Embracing failure 4. The future
  31. Designing for failure

  32. None
  33. There seems to be an excessive amount of waiting around for pages to refresh and load - it doesn't seem as quick as the previous version.

    Luke Wroblewski, http://www.lukew.com/ff/entry.asp?1797
  34. None
  35. None
  36. None
  37. None
  38. Time and perception

    Delay          User perception
    0 - 100ms      Instant
    100 - 300ms    Small perceptible delay
    300 - 1000ms   Machine is working
    1000+ ms       Likely mental context switch
    10,000+ ms     Task is abandoned
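The bands above can be sketched as a small lookup, which may be handy when labelling measured delays in monitoring data. The function name is hypothetical, and because the bands on the slide share their edges, the handling of exact boundary values (100, 300, 1000 and 10,000 ms) is an assumption.

```javascript
// Map a delay in milliseconds to the perception bands from the slide.
// Boundary handling is an assumption: 100, 300 and 1000 ms fall into the
// faster band, and 10,000 ms counts as abandoned.
function perceivedAs(delayMs) {
  if (delayMs <= 100) return 'Instant';
  if (delayMs <= 300) return 'Small perceptible delay';
  if (delayMs <= 1000) return 'Machine is working';
  if (delayMs < 10000) return 'Likely mental context switch';
  return 'Task is abandoned';
}

console.log(perceivedAs(80));    // Instant
console.log(perceivedAs(750));   // Machine is working
console.log(perceivedAs(12000)); // Task is abandoned
```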
  39. None
  40. None
  41. None
  42. Designing for failure: summary

    • Respond within 1000ms
    • Avoid loading spinners
    • Consider using skeleton screens
    • Re-use familiar UI patterns within error messaging
    • Use human friendly copy, especially in error states
    • Visualise the state of the network
  43. 1. Testing and monitoring failure 2. Designing for failure 3. Embracing failure 4. The future
  44. Embracing failure

  45. Pre-connect (diagram: socket setup)

    Ilya Grigorik, Eliminating roundtrips with pre-connect: https://www.igvita.com/2015/08/17/eliminating-roundtrips-with-preconnect/
  46. Pre-connect

    <link href='https://media.ft.com/' rel='preconnect'>
    <link href='https://api.livefyre.com/' rel='preconnect' crossorigin>

    function preconnectTo(url) {
      var hint = document.createElement("link");
      hint.rel = "preconnect";
      hint.href = url;
      document.head.appendChild(hint);
    }

    Ilya Grigorik, Eliminating roundtrips with pre-connect: https://www.igvita.com/2015/08/17/eliminating-roundtrips-with-preconnect/
  47. None
  48. Caching: Libraries, Utilities, Application (scale: Rarely change to Frequent change)

  49. Caching: Libraries, Utilities, Application (scale: Rarely change to Frequent change)

  50. Caching: Libraries, Utilities, Application (scale: Rarely change to Frequent change); bundles: core.js, app.js
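One way to read the caching slides is that each bundle gets a cache lifetime matched to how often it changes. The sketch below is only an illustration of that idea: the split of contents between core.js and app.js, the max-age values and the `cacheControlFor` helper are all assumptions, and the long lifetime for core.js presumes its URL changes whenever its content does.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Assumed split: core.js holds libraries and utilities, app.js holds application code.
// The max-age values are illustrative, not taken from the talk.
const bundles = {
  'core.js': { contents: ['libraries', 'utilities'], maxAgeSeconds: 365 * SECONDS_PER_DAY },
  'app.js': { contents: ['application'], maxAgeSeconds: 1 * SECONDS_PER_DAY }
};

// Hypothetical helper: the Cache-Control header value to serve for a bundle.
function cacheControlFor(bundleName) {
  if (!Object.prototype.hasOwnProperty.call(bundles, bundleName)) {
    return 'no-cache'; // unknown files are revalidated on every use
  }
  return 'public, max-age=' + bundles[bundleName].maxAgeSeconds;
}

console.log(cacheControlFor('core.js')); // public, max-age=31536000
console.log(cacheControlFor('app.js'));  // public, max-age=86400
```

With this split, a change to application code only invalidates the small, frequently changing bundle.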
  51. HTTP/2 R.U.M data sampled over 1 month from https://next.ft.com/

  52. Alternative protocols

  53. Alternative protocols

    Method               Communication                 Support
    AJAX                 request → response            All major browsers
    Long poll            request → wait → response     All major browsers
    WebSockets           client ↔ server               Very good
    WebRTC               peer ↔ peer                   Medium
    Server-sent events   client ← server               Good, except IE
  54. None
  55. Loading fonts https://tabatkins.github.io/specs/css-font-rendering/

                Chrome  Opera  Safari      Firefox  IE
    Timeout     3s      3s     no timeout  3s       0
    Fallback    Yes     Yes    n/a         Yes      Yes
    Swap        Yes     Yes    n/a         Yes      Yes
  56. CSS Font Rendering Controls Module https://tabatkins.github.io/specs/css-font-rendering/

    body {
      font-display: swap;
    }

    .ft-promo {
      font-display: block;
    }
  57. ServiceWorker

  58. ServiceWorker

  59. ServiceWorker

  60. Cache only

    self.addEventListener('install', function(event) {
      event.waitUntil(
        caches.open(cacheName).then(function(cache) {
          return cache.addAll([
            'http://fonts.ft.com/serif',
            'http://fonts.ft.com/sans-serif'
          ]);
        })
      );
    });

    https://jakearchibald.com/2014/offline-cookbook
  61. Cache only https://jakearchibald.com/2014/offline-cookbook

    self.addEventListener('fetch', function(event) {
      // If this is a request for a font file (test the URL, not the Request object)
      if (/fonts\.ft\.com/.test(event.request.url)) {
        // If a match isn't found in the cache, the response
        // will look like a connection error
        event.respondWith(caches.match(event.request));
      }
    });
  62. Cache only
  63. Loading CSS https://developers.google.com/web/fundamentals/performance/critical-rendering-path/analyzing-crp?hl=en

    Timeline diagram (Network and Render rows): Request page, GET html, response, Build DOM, GET css, GET js, response, response, Build CSSOM, Run JS, Render page. Labels: idle, idle, Render blocking, Render blocking.
  64. Loading CSS https://developers.google.com/web/fundamentals/performance/critical-rendering-path/analyzing-crp?hl=en

    Timeline diagram (Network and Render rows): Request page, GET html, response, Build DOM, GET css, GET js, response, response, Build CSSOM, Run JS, Render page. Labels: idle, idle, Async, Render blocking.
  65. https://github.com/filamentgroup/loadCSS

  66. Loading CSS https://developers.google.com/web/fundamentals/performance/critical-rendering-path/analyzing-crp?hl=en

    Timeline diagram (Network and Render rows): Request page, GET html, response, Build DOM, GET css, response, Build CSSOM, Render page, Reflow. Labels: idle, idle, Async.
  67. Stale-while-revalidate

    self.addEventListener('fetch', function(event) {
      // Get stale from cache
      event.respondWith(
        caches.open(cacheName).then(function(cache) {
          return cache.match(event.request).then(function(response) {
            // Revalidate against the network, whether or not a copy is cached
            var network = fetch(event.request).then(function(networkResponse) {
              if (networkResponse.status < 400) {
                cache.put(event.request, networkResponse.clone());
              }
              // Pass the response on so it can be used when nothing is cached
              return networkResponse;
            });
            // Return stale or network if none
            return response || network;
          });
        })
      );
    });

    https://jakearchibald.com/2014/offline-cookbook
  68. Stale-while-revalidate

  69. Stale-if-error

    self.addEventListener('fetch', function(event) {
      // Always fetch response from the network
      event.respondWith(
        fetch(event.request).then(function(response) {
          return caches.open(cacheName).then(function(cache) {
            // If we received an error response
            if (!response.ok) {
              // Serve the stale copy, or the error response if nothing is cached
              return cache.match(event.request).then(function(cached) {
                return cached || response;
              });
            } else {
              // Response was healthy so update cached version
              cache.put(event.request, response.clone());
              return response;
            }
          });
        })
      );
    });
  70. Timeouts

  71. Timeout race

    function timeout(delay) {
      return new Promise(function(resolve, reject) {
        setTimeout(function() {
          resolve(new Response('', {
            status: 408,
            statusText: 'Request timed out.'
          }));
        }, delay);
      });
    }

    self.addEventListener('fetch', function(event) {
      // Attempt to fetch with timeout
      event.respondWith(Promise.race([timeout(2000), fetch(event.request)]));
    });
  72. Circuit breakers https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing

  73. Circuit breakers: Circuit closed

  74. Circuit breakers: Circuit open, fail fast
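The closed and open states can be illustrated with a deliberately small breaker. This is a sketch under stated assumptions, not the circuit-breaker-js library used later in the deck: `createCircuitBreaker`, its option names and the consecutive-failure rule are hypothetical, and it has no half-open state (after the cool-down it simply closes again).

```javascript
// Minimal circuit breaker sketch. `now` is injectable so the behaviour can be
// demonstrated without waiting for real time to pass.
function createCircuitBreaker(options) {
  const failureThreshold = options.failureThreshold; // consecutive failures before opening
  const resetAfterMs = options.resetAfterMs;         // how long to fail fast before retrying
  const now = options.now || Date.now;
  let failures = 0;
  let openedAt = null;

  function isOpen() {
    if (openedAt === null) return false;
    if (now() - openedAt >= resetAfterMs) {
      // Cool-down elapsed: close the circuit and let the next call through.
      openedAt = null;
      failures = 0;
      return false;
    }
    return true;
  }

  // Run `action`; return its result, or `fallback()` if it throws or the circuit is open.
  function run(action, fallback) {
    if (isOpen()) return fallback();
    try {
      const result = action();
      failures = 0;
      return result;
    } catch (err) {
      failures += 1;
      if (failures >= failureThreshold) openedAt = now();
      return fallback();
    }
  }

  return { run: run, isOpen: isOpen };
}

let clock = 0;
let calls = 0;
const breaker = createCircuitBreaker({
  failureThreshold: 2,
  resetAfterMs: 1000,
  now: function () { return clock; }
});
function flaky() { calls += 1; throw new Error('upstream down'); }
function fallback() { return 'fallback'; }

breaker.run(flaky, fallback); // failure 1, circuit stays closed
breaker.run(flaky, fallback); // failure 2, circuit opens
breaker.run(flaky, fallback); // circuit open: fails fast, flaky is not called
console.log(calls); // 2
```

Failing fast is the point of the open state: the third call returns the fallback immediately instead of waiting on a dependency that is known to be down.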
  75. https://github.com/yammer/circuit-breaker-js

  76. Circuit breaker fetch

    importScripts('js/circuit-breaker.js');

    CircuitBreaker.prototype.fetch = function(request) {
      return new Promise(function(resolve, reject) {
        this.run(function(success, fail) {
          fetch(request).then(function(response) {
            if (response.ok) {
              success();
            } else {
              fail();
            }
            resolve(response);
          })
          .catch(function(err) {
            fail();
            reject(Response.error());
          });
        }, function() {
          // Fallback: called instead of the command while the circuit is open
          resolve(Response.error());
        });
      }.bind(this));
    };
  77. Fetch via circuit breaker

    var circuitBreakers = {};

    var options = {
      windowDuration: 1000,
      timeoutDuration: 3000,
      errorThreshold: 50,
      volumeThreshold: 2
    };

    self.addEventListener('fetch', function(event) {
      var url = event.request.url;
      // One circuit breaker per URL
      if (!circuitBreakers[url]) {
        circuitBreakers[url] = new CircuitBreaker(options);
      }
      event.respondWith(circuitBreakers[url].fetch(event.request));
    });
  78. Dead letter queue: Offline

  79. Dead letter queue: Online

  80. Dead letter queue

    var queue = {};

    function queueFailedRequest(request) {
      queue[Date.now()] = request;
    }

    self.addEventListener('fetch', function(event) {
      if (/track\.ft\.com/.test(event.request.url)) {
        event.respondWith(
          // Fetch a clone so the original request can be replayed later
          fetch(event.request.clone()).then(function(response) {
            if (response.status >= 500) {
              return Response.error();
            } else {
              return response;
            }
          }).catch(function() {
            queueFailedRequest(event.request);
            // respondWith() needs a Response, so report a network error
            return Response.error();
          })
        );
      }
    });
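  81. Dead letter queue

    function replayQueuedRequests() {
      // Keys are the timestamps at which each request was queued
      Object.keys(queue).forEach(function(timestamp) {
        fetch(queue[timestamp]).then(function(response) {
          // Server is still failing: leave the request queued
          if (response.status >= 500) {
            return;
          }
          delete queue[timestamp];
        }).catch(function() {
          // Still offline: drop the request once it is older than `expiration`
          // (a maximum age in ms, assumed to be defined elsewhere)
          var timeDelta = Date.now() - timestamp;
          if (timeDelta > expiration) {
            delete queue[timestamp];
          }
        });
      });
    }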
  82. Embracing failure: summary

    • Pre-connect sockets
    • Pre-fetch and cache critical resources
    • Bundle resources by their frequency of change
    • Choose appropriate protocols
    • Queue and batch requests
    • Use timeouts, bulkheads and circuit breakers
    • Treat the network as an enhancement!
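The "queue and batch requests" point can be sketched as a small batcher that collects items and hands them to a sender in groups. `createBatcher` and its parameters are hypothetical names for this example; a real version would probably also flush on a timer and when the page is hidden.

```javascript
// Collect items and pass them to `send` in batches of at most maxBatchSize.
function createBatcher(send, maxBatchSize) {
  let pending = [];

  // Send whatever is pending as one batch.
  function flush() {
    if (pending.length === 0) return;
    const batch = pending;
    pending = [];
    send(batch);
  }

  // Queue one item, flushing automatically once the batch is full.
  function add(item) {
    pending.push(item);
    if (pending.length >= maxBatchSize) flush();
  }

  return { add: add, flush: flush };
}

// Usage: four tracking events become two network calls instead of four.
const sent = [];
const batcher = createBatcher(function (batch) { sent.push(batch); }, 3);
['a', 'b', 'c', 'd'].forEach(function (item) { batcher.add(item); });
batcher.flush(); // send the remainder
console.log(JSON.stringify(sent)); // [["a","b","c"],["d"]]
```

Fewer, larger requests also mean the mobile radio wakes less often.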
  83. 1. Testing and monitoring failure 2. Designing for failure 3. Embracing failure 4. The future
  84. The future

  85. None
  86. Push notifications

  87. Offline: Network unavailable

  88. Background sync

    navigator.serviceWorker.ready.then(function(registration) {
      registration.periodicSync.register({
        tag: 'get-latest-news',         // default: ''
        minPeriod: 12 * 60 * 60 * 1000, // default: 0
        powerState: 'avoid-draining',   // default: 'auto'
        networkState: 'avoid-cellular'  // default: 'online'
      }).then(function(periodicsyncReg) {
        // success
      }, function() {
        // failure
      });
    });
  89. Progressive apps:

    • Responsive design
    • Secure (via TLS)
    • Network agnostic (via ServiceWorker)
    • Homescreen access
    • Push notifications
    • Background Sync
  90. 1. Testing and monitoring failure 2. Designing for failure 3. Embracing failure 4. The future
  91. Conclusion

  92. Conclusion: summary

    • Bake in the assumption that everything can and will fail
    • Guard against SPOFs at all costs
    • Always provide visual user feedback
    • Optimise for efficient networking
    • Treat the network as an enhancement
    • Embrace the future
  93. Thanks

    @patrickhamann
    speakerdeck.com/patrickhamann
    github.com/Financial-Times

    Do you want to help build this stuff? Join in. jobs@labs.ft.com