Scaling a web stack

A talk on scaling a typical web stack, based on lessons learned at 99designs, given to an RMIT Systems Architecture class.

Lars Yencken

April 29, 2013

Transcript

  1. Scaling a web stack
    Lars Yencken / Data Scientist / 99designs
    29 April, 2013

  2. 99designs
    Growing infrastructure
    Stability and robustness
    Performance
    Recap

  3. 99designs
    a.k.a. why you should listen

  4. [Chart: designs submitted, Jan-07 to Jan-13; vertical axis 0 to 900,000]

  5. $52,000,000 paid out to designers

  6. $52,000,000 paid out to designers
    30,000,000 visitors

  7. $52,000,000 paid out to designers
    30,000,000 visitors
    1,100,000,000 pageviews

  8. $52,000,000 paid out to designers
    30,000,000 visitors
    1,100,000,000 pageviews
    35,000,000,000 HTTP requests

  9. Growing infrastructure

  10. [Diagram: DB, App (x2); label: DNS round robin]

  11. [Diagram: DB, Cache, App (x2); label: reverse proxy layer]

  12. [Diagram: DB, Cache, Queue, Worker, App (x2); label: async task queue]

  13. [Diagram: DB, Cache, Queue, Worker, App (x2); labels: optimized for latency, optimized for throughput]

  14. [Diagram: Cache, Queue, Memcache, Worker, App (x2), DB]

  15. [Diagram: Cache (x2), App (x6), Memcache, Queue, Worker, DB, DB*; label: remove single points of failure]

  16. [Diagram: Balancer, Cache (x2), App (x6), Memcache, Queue, Worker, DB, DB*; label: add flexibility to the cache layer]

  17. Software as infrastructure

  18. “make recipes, not servers”

  19. • "Cloud" hosting on Amazon Web Services
    • Instead of a few highly tuned servers, have many disposable servers
    • A tradeoff that favours flexibility

  20. Stability and robustness

  21. Costs of instability
    • Lost customer business (direct & indirect)
    • Support burden and costs
    • Ops burden and costs

  22. Redundant servers
    [Diagram: Balancer, Cache (x2), App (x6)]
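
One way to picture the balancer in front of redundant servers is a round-robin picker that skips any backend marked as down. This is a minimal sketch, not the setup used at 99designs; the class name, the backend names, and the `mark_down`/`mark_up` methods are all hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any marked unhealthy (illustrative only)."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        # In a real balancer this would be driven by a health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

With several app servers registered, losing one only removes it from the rotation; requests keep flowing to the rest.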

  23. Asynchronous tasks
    [Diagram: App (x6), DB (x2), Queue, Worker, 3rd party services]
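
The queue-and-worker pattern can be sketched with Python's standard library alone. The assumption here is that slow calls to third-party services are pushed onto a queue so the request handler returns immediately and a worker absorbs the delay or failure. `send_welcome_email`, `handle_request`, and the in-process `queue.Queue` are hypothetical stand-ins; a production setup would normally use a separate queue service.

```python
import queue
import threading

task_queue = queue.Queue()
results = []  # stands in for whatever side effects the tasks have


def send_welcome_email(address):
    # Hypothetical stand-in for a slow or flaky third-party call.
    results.append(f"sent to {address}")


def worker():
    """Pull tasks off the queue until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            task_queue.task_done()
            break
        func, args = task
        try:
            func(*args)
        except Exception as exc:
            # A failing task must not take the worker down with it.
            results.append(f"failed: {exc}")
        finally:
            task_queue.task_done()


def handle_request(address):
    """Web-facing code: enqueue the slow work and respond straight away."""
    task_queue.put((send_welcome_email, (address,)))
    return "202 Accepted"
```

The app tier stays tuned for latency while the worker is tuned for throughput, and an outage at the third party only delays queued tasks.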

  24. Database replication
    [Diagram: App (x6), DB, Hot spare, DB reader (x2)]
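
A common way to use read replicas is to send writes to the primary and spread reads across the readers. The sketch below assumes that split; the class name, the connection labels, and the naive `SELECT` check are illustrative only. The hot spare is left out of the routing because it is assumed to take no traffic until a failover.

```python
import itertools


class ReplicatedDatabase:
    """Route reads to replicas and everything else to the primary (illustrative only)."""

    def __init__(self, primary, readers):
        self.primary = primary
        # Rotate through readers; fall back to the primary if there are none.
        self._readers = itertools.cycle(readers) if readers else None

    def connection_for(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._readers is not None:
            return next(self._readers)
        return self.primary
```

Replicas can lag behind the primary, so a read that must see a write made moments earlier is usually safer against the primary.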

  25. Still difficult...
    • Testing failure tolerance between components: not trivial!
    • Avoiding correlated failures

  26. Correlated failures

  27. Costs of slow sites
    • Less traffic (experiments by Yahoo, Microsoft, Google; search ranking)
    • Worse user experience
    • Higher hardware costs

  28. Caching
    [Diagram: Cache (x2), App (x6), DB (x2), Memcache]

  29. Caching
    [Diagram: Cache (x2), App (x6), DB (x2), Memcache; labels: whole pages & images, whole files on disk, db queries & page fragments]
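
Caching database queries or page fragments typically follows a look-aside pattern: check the cache, and only on a miss run the expensive work and store the result. The sketch below uses an in-memory dictionary with expiry times as a stand-in for Memcache; `TTLCache`, `cached_query`, and the 60-second default are assumptions for illustration, not a real Memcache client API.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Memcache-style store (illustrative only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired entries behave like misses.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)


def cached_query(cache, key, run_query, ttl=60):
    """Return the cached value for key, running run_query only on a miss."""
    value = cache.get(key)
    if value is None:
        value = run_query()
        cache.set(key, value, ttl)
    return value
```

One limitation of this sketch is that a query whose result is `None` is never cached, so it is re-run on every call.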

  30. Response time from cache (s)

  31. [Chart: response time from cache (s); series: cache miss, cache hit]

  32. 3 orders of magnitude!
    [Chart: response time from cache (s); series: cache miss, cache hit]
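
A three-orders-of-magnitude gap means the average response time is dominated by misses even when most requests hit the cache. The helper below is a small worked calculation; the function name and the example timings suggested in the note (1 ms for a hit, 1 s for a miss) are hypothetical and are not measurements from the talk.

```python
def mean_response_time(hit_ratio, hit_time, miss_time):
    """Expected response time given the fraction of requests served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_time + (1.0 - hit_ratio) * miss_time
```

With those assumed timings, raising the hit ratio from 90% to 99% cuts the mean from roughly 0.1 s to roughly 0.011 s, which is why small gains in hit ratio matter so much.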

  33. Serving globally
    [Diagram: Balancer, Cache (x2), Content distribution network; hosts: mysite.com, media.mysite.com]

  34. Bundling static media
    99designs.com/static/css/core.css
    99designs.com/static/css/contest.css
    99designs.com/static/css/marketing.css
    99designs.com/bundle/css/core,contest,marketing.css
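
The bundle URL above suggests that the stylesheet names are joined with commas so one request replaces three. Here is a minimal sketch of that idea, assuming the server maps each name back to a source file and concatenates them. The function names and the `sources` dictionary (standing in for files under static/css/) are hypothetical, not the 99designs implementation.

```python
def bundle_url(names, kind="css", base="/bundle"):
    """Build a single URL that names every file in the bundle."""
    return f"{base}/{kind}/{','.join(names)}.{kind}"


def parse_bundle_path(path):
    """Split '/bundle/css/a,b.css' into its kind and list of names."""
    _, _, kind, filename = path.split("/", 3)
    stem, _, ext = filename.rpartition(".")
    if ext != kind:
        raise ValueError("extension does not match bundle kind")
    return kind, stem.split(",")


def build_bundle(names, sources):
    """Concatenate the named sources in the order requested."""
    return "\n".join(sources[name] for name in names)
```

For example, `bundle_url(["core", "contest", "marketing"])` gives the path portion of the bundled URL shown on the slide. Order matters, since later stylesheets can override earlier rules.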

  35. Difficulties
    • Norms for browsers and internet connections constantly change
    • Some strategies conflict with each other
    • Measure, measure, measure!

  36. Thanks!
    @larsyencken
