When dispatcher caching is not enough...

Distributing content to a worldwide audience is not a trivial task. The goal is usually well known: keep your users happy and deliver the content they need as fast as you can.

There are at least two ways to achieve that. You can build (and manage!) your own solution (AEM/dispatcher farms spread across the globe) or put a CDN in front of your application stack. The first option may sound tempting, but on second thought you quickly realize it's too much hassle and you would rather go for a CDN. Whichever solution you choose, the set of problems stays the same.

Back in the old days you could just cache (almost) everything, as your website was pretty much static, but today it's much more complicated. Your AEM stack is built from dynamic components that fetch data from 3rd party apps, there's a search engine under the hood and all crucial content is available to logged-in users only. To make things worse, your resources are updated multiple times a day. Is it even possible to leverage a CDN for that type of website?

Have you ever tried to cache customized content that is available to authenticated users? Or to authorize them at the edge? Or maybe you were crazy enough to put a CDN not only in front of content served from AEM publish, but also in front of your authoring? In my talk I'd like to show you how we integrated an AEM app that serves content to users all over the world with a heavily customizable content delivery network (Fastly).

Jakub Wądołowski

June 24, 2015

Transcript

  1. The future of digital marketing. London, Poland, Copenhagen. © 23/06/2015

    Page 1 When dispatcher caching is not enough… Jakub Wądołowski Senior Systems Engineer @ Cognifide twitter.com/jwadolowski
  2. Agenda

    • The What
        • What was the problem about?
    • The Why
        • Why did we decide to go for a Content Delivery Network (CDN)?
    • The How
        • How was it implemented?
  3. It all started in 2012…

    www.flickr.com/photos/nasahqphoto/16327416694
  4. To be perfectly honest, initially it was rather like that…

    www.flickr.com/photos/garryknight/5703519506
  5. The client

    • EU pharmaceutical company
    • 75 offices across the globe
    • Over 40 000 employees
    • Medical products available worldwide (180+ countries)

    www.flickr.com/photos/worak/2258271659
  6. Requirements

    • Country-specific brochureware websites for medical products
    • iPad app for sales representatives
    • Single point for content entry
    • Multiple integration points (SSO, user/device authentication, etc.)
    • CQ 5.5, upgrade to AEM 6.1 in progress
  7. Main components

    • Brochureware website
    • iPad app
    • AEM Authoring
  8. Logical Architecture

    • Single datacenter in London
    • REST-like API for iPad app
    • Integrations with local and remote services
  9. Initially it was just Spain, Argentina and Sweden
  10. 6 months later the number of countries had tripled
  11. To finally reach 21, and it is still not over
  12. “Our team in Argentina complains that the app feels slow. They can’t download presentations sometimes. Could you please investigate that?” Mr B.

    www.flickr.com/photos/r4vi/8640618489
  13. Problems

    • Latency, latency, latency…
    • Way too high round trip times (RTT)
    • Timeouts
    • Broken streams
    • Connection resets
    • Poor Internet connections in some areas
  14. When initial excitement was gone…

    • How are we going to sync the content (both ways)?
    • What about deployments?
    • Do we have enough licenses to set up the new stack in a proper way?
    • What’s the best way to implement content sharding?
    • How long will it take to implement all of these things?
  15. PoC conclusion

    www.flickr.com/photos/geishaboy500/2496995573
  16. The road to CDN

    • We can’t just cache more on dispatcher
    • This is a very well known problem
    • Let’s use the right tool to solve the problem the right way
    • Content Delivery Network (CDN) is the way to go!
  17. CDN definition

    • “(…) CDN is a large distributed system of servers deployed in multiple data centers across the Internet. The goal of a CDN is to serve content to end-users with high availability and high performance. CDNs serve a large fraction of the Internet content today (…).”, Wikipedia
  18. CDN, right?

    www.flickr.com/photos/pictures-of-money/16678590844
  19. That's not necessarily true nowadays…

    www.flickr.com/photos/halfrain/14410890555
  20. Why Fastly?

    • Pay-as-you-go model
    • Powered by Varnish
    • Highly customizable (ability to upload your own VCL)
    • 150 ms to purge, globally
    • ~5 sec to change a config through the web API
    • SSD-powered servers connected to Tier 1 networks
    • Real-time insight into what’s happening (graphs, logs, etc.)
    • Great support
  21. Uhh… ok, how should I start?

    www.flickr.com/photos/kleuske/8004416109
  22. The logs!

    www.flickr.com/photos/martinbamford/5638834940
  23. Logs and content structure

    • grep, awk, sed: all of these are your friends
    • Count your requests
    • Leverage the power of log monitoring tools (ELK, Splunk, etc.)
    • Plan your content structure carefully
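"Count your requests" can be illustrated with a few lines of Python as an alternative to grep/awk. This is only a minimal sketch: the sample log lines, their format (a common access-log layout) and the `count_requests` helper are assumptions made for illustration, not data or code from the talk.

```python
import re
from collections import Counter

# Hypothetical access-log lines; the real logs from the project are not shown in the deck.
SAMPLE_LOG = """\
10.0.0.1 - - [23/Jun/2015:10:00:00 +0000] "GET /bin/myapp/v1/a_string.json HTTP/1.1" 200 512
10.0.0.2 - - [23/Jun/2015:10:00:01 +0000] "GET /bin/myapp/v1/a_string.json HTTP/1.1" 200 512
10.0.0.3 - - [23/Jun/2015:10:00:02 +0000] "POST /bin/myapp/v2/login HTTP/1.1" 302 0
"""

# Extracts the method, path and status code from the quoted request part of a line.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def count_requests(log_text):
    """Count occurrences of each (method, path, status) triple in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            counts[(match["method"], match["path"], match["status"])] += 1
    return counts


counts = count_requests(SAMPLE_LOG)
for (method, path, status), hits in counts.most_common():
    print(hits, method, path, status)
```

Sorting by frequency like this is a quick way to see which request patterns dominate the traffic and are therefore worth caching first.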
  24. Look for patterns

    www.flickr.com/photos/wwarby/4915777722
  25. Request patterns

    • If it is a GET request and starts with /bin/myapp/v[1-2]/a_string.json then it is X
    • All requests to /content/something/*/_jcr_content.zip end with a 302 to /some/path/to/file.zip
  26. Assign these patterns to multiple buckets

    www.flickr.com/photos/ddebold/15991919514
  27. Content groups/buckets

    • Public content
    • Private content
    • Content available for authorized users only
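The step from request patterns to buckets can be sketched as a small rule table. The two path patterns come from the slides; which bucket each one belongs to, the `/etc/designs/` rule, the `[^/]+` reading of `*` and the `classify` helper are all hypothetical choices made for this example.

```python
import re

# (method, path pattern, bucket). Bucket assignments are illustrative assumptions.
RULES = [
    ("GET", re.compile(r"^/bin/myapp/v[1-2]/a_string\.json"), "private"),
    ("GET", re.compile(r"^/content/something/[^/]+/_jcr_content\.zip$"), "authorized-only"),
    ("GET", re.compile(r"^/etc/designs/"), "public"),  # hypothetical static-asset rule
]


def classify(method, path):
    """Return the bucket of the first matching rule, or 'uncached' if none match."""
    for rule_method, pattern, bucket in RULES:
        if method == rule_method and pattern.search(path):
            return bucket
    return "uncached"


print(classify("GET", "/bin/myapp/v2/a_string.json"))
print(classify("GET", "/content/something/es/_jcr_content.zip"))
print(classify("POST", "/bin/myapp/v1/a_string.json"))
```

Keeping the rules in an ordered list means the first match wins, which mirrors how a chain of conditions would typically be evaluated at the edge.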
  28. Varnish in 1 slide!

    • Reverse HTTP proxy
    • In-memory time-based cache
    • Blazing-fast
    • Big “state” machine
    • Varnish Configuration Language (VCL)
    • Full control of HTTP flow
  29. General caching rules

    • Cacheable methods: GET, HEAD
    • Cacheable response codes: 200, 203, 300, 301, 302, 404, 410
    • “Cache-Control: private” if not defined otherwise
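The rules on this slide can be written down as a single predicate. The sketch below is an interpretation rather than Fastly's or Varnish's actual logic: it assumes the last bullet means that a response marked `Cache-Control: private` is not cached unless the configuration explicitly says otherwise, which is modelled here with a hypothetical `allow_private` flag.

```python
CACHEABLE_METHODS = {"GET", "HEAD"}
CACHEABLE_STATUSES = {200, 203, 300, 301, 302, 404, 410}


def is_cacheable(method, status, cache_control="", allow_private=False):
    """Apply the slide's rules: method, status code, then the 'private' directive."""
    directives = {d.strip().lower() for d in cache_control.split(",") if d.strip()}
    if method.upper() not in CACHEABLE_METHODS:
        return False
    if status not in CACHEABLE_STATUSES:
        return False
    # Assumption: 'private' blocks caching unless explicitly overridden.
    if "private" in directives and not allow_private:
        return False
    return True


print(is_cacheable("GET", 200))
print(is_cacheable("POST", 200))
print(is_cacheable("GET", 200, "private, max-age=60"))
print(is_cacheable("GET", 200, "private, max-age=60", allow_private=True))
```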
  30. Let’s start with the iPad app

    www.flickr.com/photos/pestoverde/15048774061
  31. iPad content

    • 2 content groups
    • 8 request patterns
    • TTL varies from 10 minutes to 7 days
    • 35/65 dynamic/static content (frequently changing JSON files vs PDFs/PNGs)
    • All REST API responses are private
  32. Private content

    • Private content is cacheable
    • What makes an HTTP response private?
        • It is tied to a user session; in other words, the HTTP request carried a unique authorization cookie
  33. Is it really safe to cache that type of content?

    www.flickr.com/photos/hyku/368912557
  34. Private cache

    • Varnish cache is a key-value store
    • Default key: req.url + req.http.host
    • req.url + req.http.host + sessionId = private cache space, voila!
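The effect of adding the session ID to the cache key can be shown with a short Python model. This is a sketch of the idea only, not VCL and not how Varnish computes its hash internally; the `cache_key` helper, the separator and the use of SHA-256 are assumptions.

```python
import hashlib


def cache_key(url, host, session_id=None):
    """Build a lookup key from URL and host, optionally scoped to one session."""
    parts = [url, host]
    if session_id is not None:
        parts.append(session_id)
    # A separator that cannot appear in these values avoids ambiguous concatenations.
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


shared = cache_key("/bin/myapp/v1/a_string.json", "www.example.com")
alice = cache_key("/bin/myapp/v1/a_string.json", "www.example.com", "session-alice")
bob = cache_key("/bin/myapp/v1/a_string.json", "www.example.com", "session-bob")

# Same URL and host, but each session gets its own cache entry.
print(alice != bob, alice != shared)
```

Because the keys differ per session, one user's cached response can never be served to another user, which is what makes caching private content safe in this scheme.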
  35. Dynamic means uncacheable?

    www.flickr.com/photos/gsfc/7402445224
  36. Dynamic content

    • Caching usually involves a trade-off
    • Updates won’t be instantaneous
        • TTL has to expire, or
        • a purge request has to be triggered
    • CDN is the way to go if you accept this delay
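The trade-off above (stale content until the TTL expires or a purge arrives) can be modelled with a tiny in-memory cache. This is a teaching sketch under simplifying assumptions, not Varnish's implementation; the `TtlCache` class and its injectable clock are hypothetical.

```python
import time


class TtlCache:
    """Minimal time-based cache: entries live until their TTL expires or they are purged."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._entries = {}

    def put(self, key, value, ttl_seconds):
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the next request goes to the origin
            return None
        return value

    def purge(self, key):
        self._entries.pop(key, None)


fake_now = [0.0]
cache = TtlCache(clock=lambda: fake_now[0])
cache.put("/news.json", "old content", ttl_seconds=600)
print(cache.get("/news.json"))  # still served from cache, even if the origin changed
cache.purge("/news.json")       # an explicit purge removes it before the TTL is up
print(cache.get("/news.json"))
```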
  37. Content purging

    www.flickr.com/photos/librariesrock/13522859053
  38. Content purging

    • Fastly exposes a purge REST API
        • Purge URL
        • Purge Key
        • Purge All
    • Purge vs Soft Purge
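The three purge flavours can be sketched as plain HTTP requests. The endpoints and header names below follow Fastly's public API documentation as far as I know it (a `PURGE` request to the content URL, `POST /service/{id}/purge/{key}`, `POST /service/{id}/purge_all`, and the `Fastly-Soft-Purge` header); treat them as assumptions and check the current docs before relying on them. The helper names, service ID and token are hypothetical, and the code only builds the requests without sending anything.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.fastly.com"


def purge_url(url, soft=False):
    """Purge a single cached URL by sending PURGE to the URL itself."""
    headers = {"Fastly-Soft-Purge": "1"} if soft else {}
    return urllib.request.Request(url, method="PURGE", headers=headers)


def purge_key(service_id, key, api_token, soft=False):
    """Purge every object tagged with the given surrogate key."""
    headers = {"Fastly-Key": api_token}
    if soft:
        headers["Fastly-Soft-Purge"] = "1"  # mark as stale instead of evicting
    path = f"/service/{service_id}/purge/{urllib.parse.quote(key, safe='')}"
    return urllib.request.Request(API_BASE + path, method="POST", headers=headers)


def purge_all(service_id, api_token):
    """Purge everything cached for the service."""
    path = f"/service/{service_id}/purge_all"
    return urllib.request.Request(
        API_BASE + path, method="POST", headers={"Fastly-Key": api_token}
    )


request = purge_key("SERVICE_ID", "presentations", "API_TOKEN", soft=True)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

A soft purge marks objects as stale rather than removing them, so the edge can still serve them while fetching a fresh copy.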
  39. Results

    www.flickr.com/photos/89228431@N06/11322953266
  40. iPad app statistics

    • Hit ratio: 48.4%
    • Cache coverage: 65.3%
    • Requests: 83K
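To read the two percentages together, it helps to spell out what they measure. The sketch below assumes the usual definitions (hit ratio = hits / (hits + misses), cache coverage = cacheable requests / all requests); the deck does not define them, so this interpretation and the `served_from_cache_share` helper are assumptions.

```python
def served_from_cache_share(hit_ratio, cache_coverage):
    """Share of ALL requests answered from cache, given the assumed definitions."""
    return hit_ratio * cache_coverage


# Figures from the slide: hit ratio 48.4%, cache coverage 65.3%.
ipad_share = served_from_cache_share(0.484, 0.653)
print(f"{ipad_share:.1%} of all requests served from cache")
```

Under these assumptions roughly a third of all iPad app requests never reached AEM, even though only about two thirds of the traffic was eligible for caching in the first place.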
  41. What about the speed?

    www.flickr.com/photos/129341635@N02/16609174727
  42. Speed boost

    • Presentation download
        • Europe: up to 21% faster
        • South America: up to 50% faster
        • APAC: up to 83% faster
    • API responses
        • Europe: up to 60% faster
        • South America: up to 40% faster
        • APAC: up to 55% faster
  43. Issues?

    www.flickr.com/photos/giuseppemilo/15414290956
  44. Crimes against cacheability

    www.flickr.com/photos/alancleaver/4121423119
  45. Crimes against cacheability

    • Adding Set-Cookie to every response
    • Auth cookie is not revoked in the browser after logout
    • TBD
  46. “iPad app performance is much better now! But we still have some issues with authoring. It is really slow in some countries.” Mr B.

    www.flickr.com/photos/r4vi/8640618489
  47. CDN in front of authoring?

    • I was rather skeptical
        • Way too dynamic to be considered cacheable?
        • What kind of improvement might we get? 5-10%? Is it worth it?
    • Don’t know how, but it has been decided to roll things out :)
  48. CDN + AEM Author

    • 3 content groups
    • 36 request patterns
    • TTL up to 14 days
    • Mostly dynamic + static web GUI resources
    • A lot of assets common for every logged-in user
  49. Authorized only!

    www.flickr.com/photos/rudyjuanito/5170435542
  50. Authorize at the edge

    • CDN knows nothing about user session
    • The goal is to cache common content for successfully authorized users
    • Authorize them at the edge!
  51. Auth tokens

    www.flickr.com/photos/cfortier/426610972
  52. Auth tokens

    • 2nd auth cookie (token), readable by CDN
    • HMAC function
    • 2 auth cookies are tied together
    • Reference implementation: https://github.com/fastly/token-functions
    • Private key shared between AEM and CDN
    • CDN can evaluate user session without request to AEM
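The token idea can be sketched with Python's standard `hmac` module. This is not the format used by fastly/token-functions or by the project described in the talk: the token layout (`expiry_signature`), the secret, and the `issue_token` / `verify_token` helpers are assumptions chosen to show how a second cookie can be tied to the session cookie and checked without calling AEM.

```python
import hashlib
import hmac
import time

# Hypothetical secret; in the talk's setup a private key is shared between AEM and the CDN.
SHARED_SECRET = b"hypothetical-shared-secret"


def _sign(session_id, expires):
    message = f"{session_id}:{expires}".encode("utf-8")
    return hmac.new(SHARED_SECRET, message, hashlib.sha256).hexdigest()


def issue_token(session_id, ttl_seconds, now=None):
    """Origin side: create a token bound to the session cookie value and an expiry time."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    return f"{expires}_{_sign(session_id, expires)}"


def verify_token(session_id, token, now=None):
    """Edge side: validate the token using only the two cookies and the shared secret."""
    try:
        expires_text, signature = token.split("_", 1)
        expires = int(expires_text)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _sign(session_id, expires)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


token = issue_token("session-cookie-value", ttl_seconds=3600)
print(verify_token("session-cookie-value", token))
print(verify_token("someone-elses-session", token))
```

Because the signature covers the session cookie value, a token copied into another session fails verification, and the expiry bounds how long a leaked pair stays usable.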
  53. www.flickr.com/photos/spacexphotos/16169087563
  54. Author statistics

    • Hit ratio: 96.4%
    • Cache coverage: 45.1%
    • Requests: 97K
  55. Crimes against cacheability

    • Adding Set-Cookie to every response
    • Auth cookie is not revoked in the browser after logout
    • “Vary: Cookie” usage
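Two of these problems are visible in response headers and can be spotted automatically. The following is a minimal sketch of such a check; the `cacheability_issues` helper and the sample headers are hypothetical, and the logout problem is not covered because it cannot be detected from a single response.

```python
def cacheability_issues(headers):
    """Return the header-level 'crimes' found in a response's headers."""
    normalized = {name.lower(): value for name, value in headers.items()}
    issues = []
    if "set-cookie" in normalized:
        issues.append("Set-Cookie present")
    vary_fields = {field.strip().lower() for field in normalized.get("vary", "").split(",")}
    if "cookie" in vary_fields:
        issues.append("Vary: Cookie")
    return issues


sample_headers = {
    "Content-Type": "text/css",
    "Set-Cookie": "tracking=abc; Path=/",
    "Vary": "Accept-Encoding, Cookie",
}
print(cacheability_issues(sample_headers))
```

Running a check like this over responses for static assets is a cheap way to find components that quietly make otherwise shareable content uncacheable.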
  56. Summary

    www.flickr.com/photos/andrewhurley/6254409229
  57. Summary

    • Traffic growth is no longer an issue
    • Over 2 TB monthly reaches CDN servers
        • ~5.5 million HTTP requests per month
        • just ~570 GB was passed through to AEM
    • License, budget and time savings
    • More than satisfying results
    • Very small changes in the AEM app itself
    • Happy client :)
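The traffic figures above imply an approximate origin offload. The arithmetic below assumes that the ~570 GB passed to AEM is part of the 2 TB that reached the CDN and that 1 TB = 1000 GB; since the slide says "over 2 TB", the real figure would be at least this high. The `offload_ratio` helper is just for illustration.

```python
def offload_ratio(edge_gb, origin_gb):
    """Fraction of traffic handled at the edge without reaching the origin."""
    return 1 - origin_gb / edge_gb


ratio = offload_ratio(edge_gb=2000, origin_gb=570)
print(f"about {ratio:.0%} of monthly traffic never reached AEM")
```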
  58. [email protected] github.com/jwadolowski twitter.com/jwadolowski linkedin.com/in/kubawadolowski/en

    www.flickr.com/photos/jeffdjevdet/18027482924