Slide 1

Slide 1 text

Scaling Craft Sites for Large Launches
Matt Weinberg
2018/09/28
Vector Media Group

Slide 2

Slide 2 text

Intro

Slide 3

Slide 3 text

Goals for the talk
● Address “things to think about” when launching mission-critical Craft sites
● Explain various approaches to scaling, high uptime, and redundancy
● Provide a pre-launch checklist
● Help everyone launch faster Craft sites with more uptime

Slide 4

Slide 4 text

Who am I?
● Co-founder and President of Development at Vector Media Group
● 50 people, based in NYC
● Craft service partners
● Significant amount of Craft work, including very large sites

Slide 5

Slide 5 text

Who am I? VectorMediaGroup.com [email protected] @mrw on twitter

Slide 6

Slide 6 text

What should “scaling” mean to you?
1. Performant: quickly answer real-world traffic
2. Redundant + Horizontally scalable: handle traffic surges and survive partial service degradations
3. Visible: easily predict and monitor performance under various scenarios
4. Secure: follows best practices

Slide 7

Slide 7 text

Performance Quickly answer real-world traffic

Slide 8

Slide 8 text

Performance
● Many factors involved in a user’s total load speed. Asset loading, CSS structure, etc.
  ○ Andrew Welch’s “Making a Craft CMS Website That FLIES” presentation yesterday
● Next few slides concentrate only on TTFB - time to first byte.
  ○ Other factors are very important for users but a little different from the scaling questions here
● Reducing web server load makes everything else easier

Slide 9

Slide 9 text

TTFB (Time To First Byte)
● Time between initial request and getting any response back
● Craft/PHP compiles, parses, and renders the templates
● Lower TTFB means:
  ○ Faster site for users
  ○ Server freeing up resources more quickly
● Measure in Chrome or webpagetest.org
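For scripted checks alongside Chrome or webpagetest.org, TTFB can be approximated from code. This is a minimal sketch using only the Python standard library; the function name `measure_ttfb_ms` and its parameters are illustrative assumptions, not part of the talk. Note that this measurement includes connection setup (TCP and, if used, TLS), so it will read somewhat higher than the "waiting" time shown in browser dev tools.

```python
import http.client
import time


def measure_ttfb_ms(host, path="/", port=443, use_tls=True, timeout=10.0):
    """Return the approximate time to first byte for one GET request, in ms.

    The timer starts before the connection is opened and stops as soon as
    the status line and response headers have been received.
    """
    connection_class = (
        http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    )
    connection = connection_class(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        connection.request("GET", path)        # connects, then sends the request
        response = connection.getresponse()    # returns once headers have arrived
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        response.read()                        # drain the body before closing
        return elapsed_ms
    finally:
        connection.close()
```

Calling `measure_ttfb_ms("example.com")` a handful of times and looking at the median gives a more stable number than a single request.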

Slide 10

Slide 10 text

TTFB (Time To First Byte)
● Static assets (CSS, JS, images, etc.) will load much faster than Craft templates
● Template Target:
  ○ Good: <200ms
  ○ Better: <100ms
  ○ Best: <75ms
● Fixing low-hanging fruit is good, but don’t obsess about a 10ms difference
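The template targets on this slide can be turned into a small helper for reports or monitoring scripts. This is a minimal sketch; the function name and the "needs work" label for anything at or above 200ms are assumptions added for illustration.

```python
def classify_ttfb(ttfb_ms):
    """Map a template TTFB in milliseconds onto the slide's target tiers."""
    if ttfb_ms < 75:
        return "best"
    if ttfb_ms < 100:
        return "better"
    if ttfb_ms < 200:
        return "good"
    # Assumed label: the slide does not name anything slower than 200ms.
    return "needs work"
```

The tiers are deliberately coarse, in line with the advice not to obsess about a 10ms difference.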

Slide 11

Slide 11 text

Lowering TTFB in Craft
● Cache aggressively. Anything that can be cached, should be.
● Cache globally where possible.
● Use custom cache keys for geo and other differences
● Add config constants if needed
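The idea behind custom cache keys is that one cached fragment is stored per variation (for example, per country) instead of either caching nothing or serving the wrong variant. The sketch below only illustrates how such a key might be composed; it is written in Python rather than Twig, and the function name, the `country` parameter, and the key layout are all assumptions.

```python
from typing import Dict, Optional


def build_cache_key(
    fragment: str,
    country: Optional[str] = None,
    extra: Optional[Dict[str, str]] = None,
) -> str:
    """Compose a deterministic cache key for one variant of a fragment."""
    parts = [fragment]
    if country:
        # Normalise case so "us" and "US" share one cache entry.
        parts.append(f"country={country.upper()}")
    # Sort the extra dimensions so insertion order cannot split the cache.
    for name in sorted(extra or {}):
        parts.append(f"{name}={extra[name]}")
    return ":".join(parts)
```

Keeping the number of dimensions small matters: every extra dimension multiplies the number of cache entries and lowers the hit rate.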

Slide 12

Slide 12 text

Lowering TTFB in Craft

Slide 13

Slide 13 text

Lowering TTFB in Craft
● Make sure OPcache (PHP’s built-in bytecode cache) is turned on in php.ini.
● Offload as much as you can from the frontend servers
  ○ Use CDNs such as Cloudflare, CloudFront, Google Cloud CDN, Akamai, etc. to serve static content (JS/CSS/images)
  ○ Use hosted cloud services for DB/cache/queue. Trust the experts.
    ■ A managed host like Arcustech or a cloud DB service from Amazon (RDS/Aurora) or Google (Cloud SQL) is best for most sites without ops engineers.

Slide 14

Slide 14 text

Redundancy + Horizontal scalability Handle traffic surges and survive partial service degradations

Slide 15

Slide 15 text

Assume multiple tiers
[Diagram labels: Load Balancer; Craft (three servers); DB and Memory Cache (each shown twice)]

Slide 16

Slide 16 text

Assume multiple tiers
● From day 1: assume different server tiers, and Craft living on multiple front-end servers
● Load balancer means the failure of one frontend server doesn’t bring the site down
● Staging site (and local!) should implement this
● Multiple FE servers have a few implications for the build process
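To make the load balancer point concrete, here is a toy round-robin balancer that skips front-end servers marked as down. It is only a sketch of the idea; real load balancers (ALB, HAProxy, and so on) run their own health checks, and the class and server names here are made up for illustration.

```python
class RoundRobinBalancer:
    """Hand out front-end servers in rotation, skipping unhealthy ones."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next_index = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Try each server at most once per call before giving up.
        for _ in range(len(self._servers)):
            server = self._servers[self._next_index]
            self._next_index = (self._next_index + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy front-end servers available")
```

With three servers and one marked down, requests keep flowing to the remaining two; the site only fails when every front end is unhealthy.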

Slide 17

Slide 17 text

Assume multiple tiers
● Earlier: “Offload as much as you can from the frontend servers”
● Architecting this from the beginning will make it a lot easier later
● Everyone: use cloud or NAS object storage for assets
  ○ Recommend: Amazon S3, Google Cloud Storage
    ■ Name your bucket as a domain (e.g. assets.domain.com)
  ○ These are slow; front them with a CDN
● Using good managed hosting? Much of the rest will be handled

Slide 18

Slide 18 text

Assume multiple tiers
● Using unmanaged hosting?
  ○ Load Balancer
    ■ Recommend: Amazon ALB, Google CLB, HAProxy
  ○ DB
    ■ Recommend: Amazon RDS, Google Cloud SQL
    ■ Consider PostgreSQL
      ● DataGrip by JetBrains
    ■ Consider Aurora/Aurora Serverless
    ■ Consider MariaDB or Percona Server
  ○ Memory cache
    ■ Recommend: Redis

Slide 19

Slide 19 text

Assume multiple frontends Shared memory cache

Slide 20

Slide 20 text

Assume multiple frontends Shared session storage

Slide 21

Slide 21 text

Assume multiple frontends Shared external queue

Slide 22

Slide 22 text

Assume multiple frontends If you’re using the same in-memory cache for more than one of caching, sessions, and queuing, you can make the config less repetitive using Yii components.

Slide 23

Slide 23 text

Assume multiple frontends
● NAS/shared file storage is another option
● We’ve used this successfully with Craft before
● Amazon EFS, Google Filestore, or any NAS/NFS mount
● Carefully check performance. Can be extremely slow if not configured correctly.
  ○ Consider provisioned or premium throughput options

Slide 24

Slide 24 text

Reverse proxy caching
[Diagram labels: Reverse proxy cache; Load Balancer; Craft (three servers); DB and Memory Cache (each shown twice)]

Slide 25

Slide 25 text

Reverse proxy caching
● At some point you’ll want to start thinking about this
● Helps take as much load as possible off your application servers
● Cloudflare is very popular with Craft sites and has built-in caching
  ○ Using Arcustech? You’ll get a faster connection even on the free plan because of Railgun
● Varnish is very popular as well, but complex.
  ○ Fastly is a much easier-to-use hosted Varnish provider

Slide 26

Slide 26 text

Down the road...
● Integrate more dedicated services like Elasticsearch or Algolia
● Create an off-thread background caching and cache-breaking system
  ○ Including stale serving
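"Stale serving" here means that an expired cache entry is still returned to the visitor while a fresh copy is rebuilt outside the request. The following is a minimal in-process sketch of that pattern, not the system described in the talk; the class name, the `run_pending_refreshes` method (standing in for a background worker), and the injectable clock are all assumptions.

```python
import time


class StaleServingCache:
    """Serve expired entries immediately and refresh them out of band."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # key -> (value, stored_at)
        self._pending = {}   # key -> function that rebuilds the value

    def get(self, key, compute):
        entry = self._entries.get(key)
        if entry is None:
            # Cold cache: nothing stale to serve, so build the value inline.
            value = compute()
            self._entries[key] = (value, self._clock())
            return value
        value, stored_at = entry
        if self._clock() - stored_at > self._ttl and key not in self._pending:
            # Expired: queue a single refresh, but do not block this request.
            self._pending[key] = compute
        return value

    def run_pending_refreshes(self):
        """Rebuild queued entries; a real system would do this in a worker."""
        refreshed = 0
        for key, compute in list(self._pending.items()):
            self._entries[key] = (compute(), self._clock())
            del self._pending[key]
            refreshed += 1
        return refreshed
```

The trade-off is that some visitors briefly see slightly old content, in exchange for no visitor ever waiting on a full re-render once the cache is warm.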

Slide 27

Slide 27 text

Visibility Easily predict and monitor performance under various scenarios

Slide 28

Slide 28 text

Launch checklist
● Launch checklist with responsible people identified
● Each item should be assigned to 1 person, not a team
● Capture everything: Analytics tags, backups, SSL certificates, domain redirects, debugging turned off, load testing complete, etc...

Slide 29

Slide 29 text

Load testing
● Open Source: we like Siege and Bees With Machine Guns
● Commercial: we like Load Impact
  ○ Can subscribe for just a month
  ○ Chrome extension to record typical user paths and all requests

Slide 30

Slide 30 text

Load testing
● Load test before all launches
● More than just hitting a single URL a few times
● Simulate hitting all a page’s assets concurrently
● Test from different geo regions
● Increase traffic until it “breaks”
  ○ Knowing your site’s limit helps guide business decisions
  ○ First fix by optimizing low-hanging fruit (performance), then fix by adding hardware (horizontal scalability)
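The "increase traffic until it breaks" step can be expressed as a simple ramp loop. This sketch assumes a caller-supplied `run_step` function that drives whatever load tool is in use at a given number of concurrent users and returns the observed error rate; that function, the step sizes, and the 1% error threshold are all illustrative assumptions.

```python
def find_breaking_point(run_step, start=10, step=10, max_users=1000,
                        max_error_rate=0.01):
    """Ramp concurrent users until the error rate exceeds the threshold.

    Returns (last_ok, first_failing). last_ok is None if even the first
    step fails; first_failing is None if the site never broke.
    """
    last_ok = None
    users = start
    while users <= max_users:
        error_rate = run_step(users)
        if error_rate > max_error_rate:
            return last_ok, users
        last_ok = users
        users += step
    return last_ok, None
```

Re-running the same ramp after each optimization or hardware change shows whether the limit actually moved.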

Slide 31

Slide 31 text

Load testing Load Impact: site limit reached

Slide 32

Slide 32 text

Identifying bottlenecks
● The Craft debugging toolbar is a big help
● Slow query logging on your database is important
  ○ Tip: use the debugging profiler to see which template generates a query. Add unique limits to your entries queries (“craft.entries().limit(23)”) to help identify them in the debugger and DB logs
● New Relic is very helpful here
● Stackdriver/CloudWatch
● Create a Craft plugin as a performance dashboard
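The unique-limit tip works because the unusual number shows up verbatim in the SQL. As a rough sketch, a slow query log can then be filtered for that marker; the function name is an assumption, and the exact log format depends on your database, so this only matches on the `LIMIT` clause.

```python
import re


def find_tagged_queries(log_lines, limit_tag):
    """Return log lines whose SQL contains `LIMIT <limit_tag>` exactly."""
    # \b after the number keeps LIMIT 23 from also matching LIMIT 230.
    pattern = re.compile(rf"\bLIMIT\s+{int(limit_tag)}\b", re.IGNORECASE)
    return [line.strip() for line in log_lines if pattern.search(line)]
```

For example, passing the lines of a log file and `23` returns only the queries produced by the template that used `.limit(23)`.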

Slide 33

Slide 33 text

Load testing Load Impact: site limit reached

Slide 34

Slide 34 text

Security Follow best practices

Slide 35

Slide 35 text

Security
● Craft has great protections built-in
  ○ Twig escaping
  ○ One-click updates
  ○ Proper password hashing
  ○ Sane minimum server requirements
  ○ DB query parameterization
  ○ etc...

Slide 36

Slide 36 text

Security
● If you can, IP lock the control panel
  ○ Cloudflare calls this “Zone Lockdown”
  ○ Can also be done in .htaccess/nginx config, or a plugin
● If you can’t IP lock the control panel:
  ○ Consider setting ‘preventUserEnumeration’ to true
    ■ Possibly more confusing for some users but more secure
  ○ Change cpTrigger from “admin”, which is often scanned
● Consider tying Craft login to SSO - Google, Okta, etc.
  ○ Via plugin, Cloudflare Access, or similar
● Weak user passwords are your biggest threat
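Whatever layer enforces it (Cloudflare, the web server, or a plugin), IP locking comes down to checking the client address against a list of allowed networks. This is a minimal sketch of that check using Python's standard `ipaddress` module; the function name is an assumption and the addresses in the usage note are reserved documentation ranges, not real office IPs.

```python
import ipaddress


def is_cp_request_allowed(client_ip, allowed_networks):
    """Return True if client_ip falls inside any allowed CIDR range."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(network)
        for network in allowed_networks
    )
```

For instance, with `allowed_networks = ["203.0.113.0/24"]`, a request from `203.0.113.10` is allowed and one from `192.0.2.1` is not. Behind a load balancer or CDN, make sure the check uses the real client address rather than the proxy's.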

Slide 37

Slide 37 text

Thank You + Q&A
● Contact me any time with any questions
● [email protected]
● @mrw
● https://www.VectorMediaGroup.com