Slide 1

Slide 1 text

Scaling a web stack
Lars Yencken / Data Scientist / 99designs
29 April, 2013

Slide 2

Slide 2 text

• 99designs
• Growing infrastructure
• Stability and robustness
• Performance
• Recap

Slide 3

Slide 3 text

99designs a.k.a. why you should listen

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Chart: Designs submitted, Jan-07 to Jan-13 (y-axis from 0 to 900,000)

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

$52,000,000 paid out to designers

Slide 17

Slide 17 text

$52,000,000 paid out to designers
30,000,000 visitors

Slide 18

Slide 18 text

$52,000,000 paid out to designers
30,000,000 visitors
1,100,000,000 pageviews

Slide 19

Slide 19 text

$52,000,000 paid out to designers
30,000,000 visitors
1,100,000,000 pageviews
35,000,000,000 HTTP requests

Slide 20

Slide 20 text

Growing infrastructure

Slide 21

Slide 21 text

App

Slide 22

Slide 22 text

App DB

Slide 23

Slide 23 text

DNS round robin
Diagram labels: DB, App, App
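As a rough illustration of the round-robin idea on this slide, the sketch below hands out app servers in a repeating fixed order. It only approximates what DNS round robin does (a real DNS server rotates the address records it returns, and clients pick from them). The addresses and the `round_robin` helper are hypothetical.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that yields servers in a repeating fixed order."""
    return cycle(servers)


# Hypothetical app server addresses, for illustration only.
picker = round_robin(["10.0.0.1", "10.0.0.2"])

# Four consecutive "requests" alternate between the two app servers.
first_four = [next(picker) for _ in range(4)]
```

Each server receives an equal share of requests, but nothing here checks whether a server is healthy, which is one reason a dedicated balancer appears later in the deck.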

Slide 24

Slide 24 text

reverse proxy layer
Diagram labels: DB, Cache, App, App

Slide 25

Slide 25 text

async task queue
Diagram labels: DB, Cache, Queue, Worker, App, App
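A minimal sketch of the async task queue pattern, assuming an in-process queue and a single worker thread stand in for the Queue and Worker boxes on the slide. The job (`send_email`), the request handler, and the status string are all hypothetical; a production setup would normally use a separate queue service and worker processes.

```python
import queue
import threading

tasks = queue.Queue()   # stands in for the Queue box
results = []            # records what the worker has completed


def send_email(address):
    """Hypothetical slow job that should not block a web request."""
    return f"sent to {address}"


def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = tasks.get()
        if job is None:
            tasks.task_done()
            break
        func, args = job
        results.append(func(*args))
        tasks.task_done()


def handle_request(address):
    """App side: enqueue the slow work and respond immediately."""
    tasks.put((send_email, (address,)))
    return "202 Accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

status = handle_request("designer@example.com")

tasks.put(None)   # tell the worker to stop
tasks.join()      # wait until every queued job has been processed
thread.join()
```

The request handler returns before the job has run, so the app's response time does not depend on how long the job takes.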

Slide 26

Slide 26 text

Diagram labels: DB, Cache, Queue, Worker, App, App
Annotations: optimized for latency; optimized for throughput

Slide 27

Slide 27 text

Cache Queue Memcache Worker App App DB

Slide 28

Slide 28 text

remove single points of failure
Diagram labels: Cache, App, App, Cache, App, App, App, App, Memcache, Queue, Worker, DB, DB*

Slide 29

Slide 29 text

add flexibility to the cache layer
Diagram labels: Cache, App, App, Cache, App, App, App, App, Memcache, Queue, Worker, Balancer, DB, DB*

Slide 30

Slide 30 text

Software as infrastructure

Slide 31

Slide 31 text

“make recipes, not servers”

Slide 32

Slide 32 text

• "Cloud" hosting on Amazon Web Services
• Instead of a few highly tuned servers, have many disposable servers
• Tradeoff that favours flexibility

Slide 33

Slide 33 text

Challenges

Slide 34

Slide 34 text

Stability and robustness

Slide 35

Slide 35 text

Costs of instability
• Lost customer business (direct & indirect)
• Support burden and costs
• Ops burden and costs

Slide 36

Slide 36 text

Redundant servers
Diagram labels: Cache, App, App, Cache, App, App, App, App, Balancer

Slide 37

Slide 37 text

Asynchronous tasks
Diagram labels: DB, App, App, App, App, App, App, DB, Queue, Worker, 3rd party services

Slide 38

Slide 38 text

Database replication
Diagram labels: App, App, App, App, App, App, DB, Hot spare, DB reader, DB reader
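One way an app might use the replicated setup on this slide is to send writes to the main DB and spread reads across the DB readers. The sketch below is an assumption about how such routing could look, not the talk's implementation. The `ReplicatedDB` class and the server names are hypothetical, and the hot spare is left out because it only takes over on failure.

```python
import itertools


class ReplicatedDB:
    """Route SELECTs to read replicas and everything else to the primary."""

    def __init__(self, primary, readers):
        self.primary = primary
        self._readers = itertools.cycle(readers)

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._readers)
        return self.primary


# Hypothetical server names matching the boxes on the slide.
db = ReplicatedDB("db", ["db-reader-1", "db-reader-2"])

read_a = db.route("SELECT * FROM contests")
read_b = db.route("select * from designs")
write = db.route("INSERT INTO designs VALUES (1)")
```

Replicas can lag behind the primary, so a read that must see a just-written row may still need to go to the primary.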

Slide 39

Slide 39 text

Still difficult...
• Testing failure tolerance between components: not trivial!
• Avoiding correlated failures

Slide 40

Slide 40 text

Correlated failures

Slide 41

Slide 41 text

Performance

Slide 42

Slide 42 text

Costs of slow sites
• Less traffic (experiments by Yahoo, Microsoft, Google; ranking)
• Worse user experience
• Higher hardware costs

Slide 43

Slide 43 text

Caching
Diagram labels: DB, Cache, App, App, Cache, App, App, App, App, DB, Memcache

Slide 44

Slide 44 text

Caching
Diagram labels: DB, Cache, App, App, Cache, App, App, App, App, DB, Memcache
Annotations: whole pages & images; whole files on disk; db queries & page fragments
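The "db queries" annotation can be illustrated with a look-aside cache: check the cache first, and only run the query on a miss. This is a sketch under stated assumptions. A plain dict stands in for Memcache, and `QueryCache`, `slow_query`, and the key format are hypothetical.

```python
class QueryCache:
    """Look-aside cache: serve hits from memory, fall through to the DB on a miss."""

    def __init__(self, run_query):
        self._run_query = run_query   # the expensive database call
        self._store = {}              # a dict standing in for Memcache
        self.misses = 0

    def get(self, key):
        if key in self._store:
            return self._store[key]   # cache hit: no database work
        self.misses += 1
        value = self._run_query(key)  # cache miss: query, then remember
        self._store[key] = value
        return value


def slow_query(key):
    """Hypothetical stand-in for an expensive database query."""
    return f"row for {key}"


cache = QueryCache(slow_query)
first = cache.get("contest:42")    # miss, runs the query
second = cache.get("contest:42")   # hit, served from the dict
```

A real deployment also needs expiry or invalidation so that cached results do not go stale, which this sketch omits.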

Slide 45

Slide 45 text

Chart: Response time from cache (s)

Slide 46

Slide 46 text

Chart: Response time from cache (s), cache miss vs. cache hit

Slide 47

Slide 47 text

3 orders of magnitude!
Chart: Response time from cache (s), cache miss vs. cache hit

Slide 48

Slide 48 text

Serving globally
Diagram labels: Cache, Cache, Balancer, Content distribution network, mysite.com, media.mysite.com

Slide 49

Slide 49 text

Bundling static media
99designs.com/static/css/core.css
99designs.com/static/css/contest.css
99designs.com/static/css/marketing.css
99designs.com/bundle/css/core,contest,marketing.css
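The URL scheme on this slide suggests that several stylesheet paths collapse into one bundle path, so the browser makes one request instead of three. A minimal sketch of that mapping follows. The `bundle_url` helper is hypothetical, and how the server assembles the bundled file is not covered by the slide.

```python
def bundle_url(paths, host="99designs.com"):
    """Build a single bundle URL from individual static CSS paths."""
    names = []
    for path in paths:
        filename = path.rsplit("/", 1)[-1]   # e.g. "core.css"
        if filename.endswith(".css"):
            filename = filename[: -len(".css")]
        names.append(filename)
    return f"{host}/bundle/css/{','.join(names)}.css"


bundled = bundle_url([
    "99designs.com/static/css/core.css",
    "99designs.com/static/css/contest.css",
    "99designs.com/static/css/marketing.css",
])
```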

Slide 50

Slide 50 text

Difficulties
• Norms for browsers and internet connections constantly change
• Some strategies conflict with each other
• Measure, measure, measure!

Slide 51

Slide 51 text

Recap

Slide 52

Slide 52 text

Thanks!
@larsyencken