Slide 1

Slide 1 text

What we can learn from CDNs about Web Development, Deployment, and Performance
Tyler McMullen, CTO & Mad Scientist
WebPerf Meetup, London
March 1st

Slide 2

Slide 2 text

Lesson: People Use Their CDNs Wrong

Slide 3

Slide 3 text

CDNs offer a toolset
• The black box approach isn’t always good
• Configuration isn’t trivial
  – And a lot still depends on configuration
• Can’t depend on the CDN to solve all your problems
• Don’t exacerbate your problems!

Slide 4

Slide 4 text

http://bigqueri.es/t/sites-that-deliver-images-using-gzip-deflate-encoding/220

Slide 5

Slide 5 text

Gzipping Images
• Not a very good thing for performance
  – Extra bytes
  – Extra work for the browser
• But was this the Surrogate’s fault?
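To make the "extra bytes" point concrete, here is a minimal Python sketch. It assumes that seeded random bytes are a fair stand-in for an already-compressed image (JPEG and PNG payloads are similarly high-entropy) and that a repeated HTML fragment stands in for compressible text. Both payloads and the helper name `gzip_overhead` are illustrative and do not come from the talk.

```python
import gzip
import random


def gzip_overhead(payload: bytes) -> int:
    """Return the size change from gzipping: positive means gzip added bytes."""
    return len(gzip.compress(payload)) - len(payload)


# Assumption: high-entropy bytes approximate an already-compressed image.
rng = random.Random(0)
image_like = bytes(rng.getrandbits(8) for _ in range(50_000))

# Assumption: a repetitive HTML fragment approximates compressible text.
html_like = b"<li class='item'>hello world</li>\n" * 1_500

print("image-like payload size change:", gzip_overhead(image_like), "bytes")
print("html-like payload size change: ", gzip_overhead(html_like), "bytes")
```

The image-like payload comes out slightly larger after gzip, while the HTML-like payload shrinks substantially, so the sensible rule is to compress by content type rather than compress everything.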

Slide 6

Slide 6 text

More Examples
• Bad caching headers
  – max-age, s-maxage have a lot of power!
  – stale-if-error and stale-while-revalidate are rad!
• Bad TCP connection management at origin
• Not Gzipping (actual, compressible content) for origin fetches
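As an illustration of the caching headers named above, the following sketch assembles a Cache-Control value in Python. The directive names are standard HTTP (max-age and s-maxage from the HTTP caching specification, stale-while-revalidate and stale-if-error from RFC 5861). The helper `cache_control` and the chosen lifetimes are hypothetical, picked only to show how the pieces fit together.

```python
from typing import Optional


def cache_control(
    max_age: int,
    s_maxage: Optional[int] = None,
    stale_while_revalidate: Optional[int] = None,
    stale_if_error: Optional[int] = None,
) -> str:
    """Build a Cache-Control header value from lifetimes given in seconds."""
    directives = ["public", f"max-age={max_age}"]
    if s_maxage is not None:
        # s-maxage overrides max-age for shared caches such as a CDN.
        directives.append(f"s-maxage={s_maxage}")
    if stale_while_revalidate is not None:
        # Serve the stale copy while refreshing it in the background.
        directives.append(f"stale-while-revalidate={stale_while_revalidate}")
    if stale_if_error is not None:
        # Serve the stale copy if the origin returns an error.
        directives.append(f"stale-if-error={stale_if_error}")
    return ", ".join(directives)


# Illustrative values: browsers keep the page for 1 minute, the CDN for 1 hour.
header = cache_control(
    max_age=60,
    s_maxage=3600,
    stale_while_revalidate=30,
    stale_if_error=86400,
)
print("Cache-Control:", header)
```

Splitting max-age from s-maxage lets the edge hold content much longer than the browser does, which is only safe when the edge copy can be invalidated on demand.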

Slide 7

Slide 7 text

With Great Power…

Slide 8

Slide 8 text

Lesson: Dynamic Content Is Really Interesting!

Slide 9

Slide 9 text

What Is Dynamic Content?
• Stuff that’s not static!
• With web traffic, generally the base HTML
  – Big deal because it’s blocking
  – And sometimes the largest object; longer download
• Some AJAX
• More…

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Blocking

Slide 12

Slide 12 text

Classically, with dynamic content… Caching

Slide 13

Slide 13 text

Caching vs. Invalidation

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

We tried…

Slide 18

Slide 18 text

Dynamic Content Caching Problems
• Serving stale pages
  – Lack of good invalidation framework
• Visibility
• Logging

Slide 19

Slide 19 text

CDNs and Dynamic Content
• Generally, handling dynamic content has been a matter of transport
  – Middle mile optimizations
  – TCP tweaks
• Some edge micro caching, but not easy
• ESI

Slide 20

Slide 20 text

Actually…
• Dynamic content is more cacheable than we think
• Static for short periods of time
• Unpredictable invalidation
  – Standard HTTP caching rules aren’t good enough
• Event-driven Content
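One way to picture "static for short periods" combined with event-driven invalidation is a small in-memory cache that expires entries after a short TTL and can also be purged by tag the moment the underlying data changes. The sketch below is a toy model under those assumptions, not any CDN's actual API. The class `MicroCache`, its methods, and the tag names are all hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Set, Tuple


class MicroCache:
    """Toy edge cache: short TTL plus tag-based, event-driven purging."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}  # url -> (expires_at, body)
        self._tags: Dict[str, Set[str]] = {}              # tag -> urls carrying it

    def get(self, url: str) -> Optional[str]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # expired: behave like a miss
            return None
        return body

    def put(self, url: str, body: str, tags: Iterable[str] = ()) -> None:
        self._entries[url] = (self._clock() + self._ttl, body)
        for tag in tags:
            self._tags.setdefault(tag, set()).add(url)

    def purge(self, tag: str) -> int:
        """Drop every cached URL carrying this tag; return how many were removed."""
        removed = 0
        for url in self._tags.pop(tag, set()):
            if self._entries.pop(url, None) is not None:
                removed += 1
        return removed


# Usage: the origin emits a purge event when article 42 is edited.
now = [0.0]
cache = MicroCache(ttl_seconds=5, clock=lambda: now[0])
cache.put("/article/42", "<html>v1</html>", tags=["article-42"])
print(cache.get("/article/42"))   # served from cache
print(cache.purge("article-42"))  # event-driven invalidation
print(cache.get("/article/42"))   # miss: next request goes to origin
```

The TTL bounds how stale a page can get if a purge event is lost, while the purge path handles the unpredictable invalidations that HTTP expiry rules alone cannot express.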

Slide 21

Slide 21 text

So Many Benefits!
• Performance
  – Faster time to first byte
  – Faster start render
  – Happy users!
• Offload
  – Less work for our servers
  – Less bandwidth at origin

Slide 22

Slide 22 text

What Would Make It Better?
• Programmatic Invalidation
  – Granular
  – Instantaneous
• Control at the edge, and not just for web pages
  – Real-time log files
  – Imagine terminating beacons at the edge!

Slide 23

Slide 23 text

Lesson: Measurement is Still Hard

Slide 24

Slide 24 text

Client-side Measurement in CDNs
• Cache hit ratio
  – How do you test and measure?
• Long tail content?
• DNS and edge node selection
• TTFB out of datacenter
  – Memory hit vs disk hit vs mid-tier hit vs miss
• RUM and synthetic (Cedexis, Catchpoint, etc)

Slide 25

Slide 25 text

Let’s Test It!
• 3 Objects on the same CDN (anonymous)
  – Cedexis object
  – Small image from Alexa 5000 site
  – Long tail object: ~40 times every 3-4 hours
• Use Catchpoint last mile clients in US
  – Test every 15 minutes
  – ~11,500 total tests across all test nodes
• Focus measurement on:
  – Connect time (TCP)
  – Wait time (TTFB)

Slide 26

Slide 26 text

Cedexis Object

            Connect (median)  Wait (median)
Cedexis     14ms              19ms

Slide 27

Slide 27 text

Cedexis Object
Alexa 5000

            Connect (median)  Wait (median)
Cedexis     14ms              19ms
Alexa 5000  14ms              24ms

Slide 28

Slide 28 text

Cedexis Object
Alexa 5000

            Connect (median)  Wait (median)
Cedexis     14ms              19ms
Alexa 5000  14ms              24ms

26%

Slide 29

Slide 29 text

Cedexis Object
Long Tail
Alexa 5000

            Connect (median)  Wait (median)
Cedexis     14ms              19ms
Alexa 5000  14ms              24ms
Long Tail   15ms              29ms

Slide 30

Slide 30 text

Cedexis Object
Long Tail
Alexa 5000

            Connect (median)  Wait (median)
Cedexis     14ms              19ms
Alexa 5000  14ms              24ms
Long Tail   15ms              29ms

20%
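The 26% and 20% callouts are not labeled in the slides. A plausible reading, inferred only from the arithmetic, is that each is the relative increase in median wait time from one row to the next. The sketch below computes those increases from the medians in the table. The helper name `pct_increase` is illustrative, and because the published medians are rounded, the results may differ from the callouts by about a percentage point.

```python
def pct_increase(baseline_ms: float, observed_ms: float) -> float:
    """Relative increase of observed over baseline, in percent."""
    return (observed_ms - baseline_ms) / baseline_ms * 100


# Median wait (TTFB) values taken from the table above.
wait_ms = {"Cedexis": 19, "Alexa 5000": 24, "Long Tail": 29}

alexa_vs_cedexis = pct_increase(wait_ms["Cedexis"], wait_ms["Alexa 5000"])
long_tail_vs_alexa = pct_increase(wait_ms["Alexa 5000"], wait_ms["Long Tail"])

print(f"Alexa 5000 vs Cedexis:   {alexa_vs_cedexis:.1f}% slower")
print(f"Long Tail vs Alexa 5000: {long_tail_vs_alexa:.1f}% slower")
```

Connect times barely move across the three objects, so the difference sits almost entirely in the wait phase, which is where cache behavior shows up.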

Slide 31

Slide 31 text

      Cedexis Object
      Count   TCP   TTFB    Count   TCP   TTFB    Count   TCP   TTFB
Mem   11,074  14ms  19ms    481     14ms  19ms    6741    14ms  20ms
Disk  428     12ms  24ms    9626    15ms  28ms    4692    14ms  31ms
Miss  1       6ms   38ms    1355    16ms  51ms    28      13ms  45ms

Slide 32

Slide 32 text

      Cedexis Object        Alexa 5000
      Count   TCP   TTFB    Count   TCP   TTFB    Count   TCP   TTFB
Mem   11,074  14ms  19ms    6741    14ms  20ms    481     14ms  19ms
Disk  428     12ms  24ms    4692    14ms  31ms    9626    15ms  28ms
Miss  1       6ms   38ms    28      13ms  45ms    1355    16ms  51ms

Slide 33

Slide 33 text

      Cedexis Object        Alexa 5000            Long Tail
      Count   TCP   TTFB    Count   TCP   TTFB    Count   TCP   TTFB
Mem   11,074  14ms  19ms    6741    14ms  20ms    481     14ms  19ms
Disk  428     12ms  24ms    4692    14ms  31ms    9626    15ms  28ms
Miss  1       6ms   38ms    28      13ms  45ms    1355    16ms  51ms

Slide 34

Slide 34 text

      Cedexis Object        Alexa 5000            Long Tail
      Count   TCP   TTFB    Count   TCP   TTFB    Count   TCP   TTFB
Mem   11,074  14ms  19ms    6741    14ms  20ms    481     14ms  19ms
Disk  428     12ms  24ms    4692    14ms  31ms    9626    15ms  28ms
Miss  1       6ms   38ms    28      13ms  45ms    1355    16ms  51ms

      99.99%
      Mem: 96.27%
      Disk: 3.72%

Slide 35

Slide 35 text

      Cedexis Object        Alexa 5000            Long Tail
      Count   TCP   TTFB    Count   TCP   TTFB    Count   TCP   TTFB
Mem   11,074  14ms  19ms    6741    14ms  20ms    481     14ms  19ms
Disk  428     12ms  24ms    4692    14ms  31ms    9626    15ms  28ms
Miss  1       6ms   38ms    28      13ms  45ms    1355    16ms  51ms

      99.99%                99.76%
      Mem: 96.27%           Mem: 58.82%
      Disk: 3.72%           Disk: 40.94%

Slide 36

Slide 36 text

      Cedexis Object        Alexa 5000            Long Tail
      Count   TCP   TTFB    Count   TCP   TTFB    Count   TCP   TTFB
Mem   11,074  14ms  19ms    6741    14ms  20ms    481     14ms  19ms
Disk  428     12ms  24ms    4692    14ms  31ms    9626    15ms  28ms
Miss  1       6ms   38ms    28      13ms  45ms    1355    16ms  51ms

      99.99%                99.76%                88.17%
      Mem: 96.27%           Mem: 58.82%           Mem: 4.19%
      Disk: 3.72%           Disk: 40.94%          Disk: 83.98%
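The percentages under the table can be reproduced from the request counts. The top figure for each object is the overall cache hit rate (memory plus disk hits over all requests), and the Mem and Disk figures are each tier's share of all requests. The sketch below does that arithmetic in Python. The counts come from the slides; the object labels follow the reading that the object with the most misses is the long tail one, and the names `counts` and `breakdown` are illustrative.

```python
from typing import Dict

# Request counts per cache outcome, taken from the table above.
counts: Dict[str, Dict[str, int]] = {
    "Cedexis": {"mem": 11074, "disk": 428, "miss": 1},
    "Alexa 5000": {"mem": 6741, "disk": 4692, "miss": 28},
    "Long Tail": {"mem": 481, "disk": 9626, "miss": 1355},
}


def breakdown(outcomes: Dict[str, int]) -> Dict[str, float]:
    """Return hit rate and per-tier shares as percentages of all requests."""
    total = sum(outcomes.values())
    return {
        "hit_rate": (outcomes["mem"] + outcomes["disk"]) / total * 100,
        "mem": outcomes["mem"] / total * 100,
        "disk": outcomes["disk"] / total * 100,
    }


results = {name: breakdown(outcomes) for name, outcomes in counts.items()}
for name, shares in results.items():
    print(f"{name:<11} hit {shares['hit_rate']:.2f}%  "
          f"mem {shares['mem']:.2f}%  disk {shares['disk']:.2f}%")
```

The computed values match the slide figures to within rounding. All three objects have a hit rate that looks healthy on paper, yet the long tail object is served from memory only about 4% of the time, which is why hit rate alone hides the TTFB difference between memory and disk.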

Slide 37

Slide 37 text

Measurement!
• Not only do I care about:
  – Cache hit rate
  – Long tail
  – Measuring the right thing
• Fetching from disk could suck!
  – SSDs!
• Caching ≠ Caching

Slide 38

Slide 38 text

Lesson: CDNs Are Not Solved!

Slide 39

Slide 39 text

We Don’t Cache As Much As We Should!
• HTML and other dynamic content
• Worse cache hit rate than we think
  – Especially for long tail content
• Mobile Apps, APIs, etc

Slide 40

Slide 40 text

Lots of Room
• Making changes still sucks
• Can’t take some things for granted:
  – DNS
  – Routing
  – TCP
  – SCALE!
• Plus: lots of room to be creative at the edge!

Slide 41

Slide 41 text

Thank you!