QuizUp: Zero to a Million Users in 8 Days

This is a talk I gave at the AWS Summit in Stockholm in 2014. It tells the story of QuizUp: the choices that were made and why, and how those choices affected the success of QuizUp (and our lives) in the days after launch.

By Steinn Eldjárn Sigurðarson

Transcript

  1. Reykjavík April 14th 2014

  2. QuizUp?
    • Social trivia game for iOS and Android
    • Launched on iOS November 7th 2013
    • Launched on Android March 6th 2014
    • Currently 16+ million users
    • One of the fastest growing apps/networks

  3. QuizUp Background
    • Started by making small, topical trivia apps, such as:
      • Eurovision QuizUp
      • Twilight QuizUp
      • Math QuizUp
      • NatGeo QuizUp
    • First: proof-of-concepts for investors
    • Then: satellites to pull users to the QuizUp network

  4. Engineering Team
    • Small server team (3-4 people, depending on perspective)
    • Backgrounds in telecom, finance, design, math and music
    • Me: f/oss devops guy, first in web-tech, then telecom (mobile)

  5. The story: iOS Launch
    • Expected 1M users in 2013
    • Got 1M users in 8 days
    • Capacity planning was hard
    • Executed all scaling strategies within a week
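
To put the gap between plan and reality in numbers, here is a small worked calculation. It assumes that "1M users in 2013" meant reaching one million by December 31st, counted from the November 7th iOS launch. It also compares average daily rates only, which real growth curves would not follow.

```python
from datetime import date

LAUNCH = date(2013, 11, 7)         # iOS launch date, from the slides
END_OF_YEAR = date(2013, 12, 31)   # assumed deadline behind "1M users in 2013"
TARGET_USERS = 1_000_000
ACTUAL_DAYS = 8                    # "Got 1M users in 8 days"

planned_days = (END_OF_YEAR - LAUNCH).days   # days available under the plan
planned_rate = TARGET_USERS / planned_days   # average signups/day, as planned
actual_rate = TARGET_USERS / ACTUAL_DAYS     # average signups/day, as happened
speedup = actual_rate / planned_rate

print(f"planned: {planned_rate:,.0f} signups/day over {planned_days} days")
print(f"actual:  {actual_rate:,.0f} signups/day over {ACTUAL_DAYS} days")
print(f"growth arrived about {speedup:.2f}x faster than planned")
```

Under these assumptions the plan allowed 54 days, so the average signup rate was roughly 6.75 times what had been planned for.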

  6. How was this possible?
    • With careful planning
    • Being in the cloud

  7. Why and how: use “the cloud”?
    • Mostly in IaaS fashion (AWS)
    • Prefer SaaS to in-house solutions
    • Allows a small team to accomplish a lot quickly
    • We also use Heroku for many internal apps
    • Intended to use more PaaS

  8. Why Amazon?
    • Team members had experience with it
    • Industry standard
    • Multiple locations
    • We developed in eu-west-1
    • Production is in us-east-1

  9. Why Amazon?
    • Great selection of “hardware”
    • Single point for many services:
      • EC2
      • S3
      • Route53
      • CloudFront
      • VPC
      • DynamoDB
      • … etc

  10. QuizUp Architecture
    • Inspired by:
      • 12factor.net
      • Netflix Engineering
    • Most moving parts are scalable
    • Stateless “immutable” app servers
    • Sharded player data
    • Scalable datastores
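
The slides do not say how player data was sharded. The sketch below shows one common approach, purely as an illustration: hash each player ID into a fixed number of logical shards, then map logical shards onto however many database masters exist. The shard count, master names and function names are all hypothetical.

```python
import hashlib

NUM_LOGICAL_SHARDS = 1024  # hypothetical; fixed so player -> shard never changes


def logical_shard(player_id: int) -> int:
    """Map a player ID to a stable logical shard number."""
    digest = hashlib.md5(str(player_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_LOGICAL_SHARDS


def build_shard_map(masters):
    """Spread the logical shards evenly across the given database masters."""
    return {s: masters[s % len(masters)] for s in range(NUM_LOGICAL_SHARDS)}


def master_for(player_id: int, shard_map) -> str:
    """Look up which master holds a given player's data."""
    return shard_map[logical_shard(player_id)]


# Usage: the same player resolves to a master under either layout.
one_master = build_shard_map(["db-1"])
eight_masters = build_shard_map([f"db-{n}" for n in range(1, 9)])
print(master_for(42, one_master), master_for(42, eight_masters))
```

Only the lookup is sketched here. Moving the data when the map changes is the hard part, and it is not covered.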

  11. QuizUp app server stack
    nginx
    uwsgi
    Flask+SQLAlchemy
    Postgres, Redis, ElasticSearch
    Elastic Load Balancer
    HTTP
    HTTPS
    clients

  12. QuizUp Architecture
    • Worse is (often) better
    • Optimizing is a luxury problem
    • Outsource to SaaS what we can:
      • Pusher, DataDog, Pingdom, PagerDuty, Travis, Sentry
      • … etc

  13. QuizUp Architecture
    • A lot inherited from “legacy”
    • Large monolithic quizup-server API implemented in Python
    • Decoupling now:
      • Separate services
      • Routing requests to different ELBs
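
As a rough illustration of routing requests to different ELBs, the sketch below picks a backend by URL path prefix and falls back to the monolith for everything else. The prefixes and hostnames are made up for the example. The slides do not say where this routing decision was made, and in practice it could just as well live in nginx configuration or DNS.

```python
# Hypothetical path prefixes and ELB hostnames, for illustration only.
ROUTES = [
    ("/chat/", "chat-elb.example.internal"),
    ("/search/", "search-elb.example.internal"),
]
# Anything not yet split out stays on the monolithic API.
DEFAULT_BACKEND = "quizup-server-elb.example.internal"


def backend_for(path: str) -> str:
    """Return the backend that should serve the given request path."""
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    return DEFAULT_BACKEND


# Usage: decoupled services get their own backend, the rest hits the monolith.
print(backend_for("/chat/rooms/7"))
print(backend_for("/players/42"))
```

Moving a service out of the monolith then means adding one entry to the table, which keeps each decoupling step small.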

  14. QuizUp Services

  15. How did we prepare?
    • Metrics, metrics, metrics
    • Code freeze — an entire *week* before launch!
    • Load testing (locust, 20x m1.small nodes)
    • 5 weeks of beta
    • Force-update and graceful maintenance features built into API and clients
    • Coordinate with infrastructure vendor:
      • Prewarm ELBs
      • Increased instance limits
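
The force-update and graceful maintenance features could take many forms, and the slides give no details. Below is a minimal sketch of the server-side half, assuming that clients send their version with each request and that the API can answer with an instruction instead of normal data. The version numbers, field names and action names are hypothetical.

```python
MIN_SUPPORTED_VERSION = (1, 2, 0)  # hypothetical oldest client still accepted


def parse_version(text: str):
    """Turn '1.10' or '1.10.3' into a comparable tuple such as (1, 10, 0)."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions with zeros
    return tuple(parts)


def gate_request(client_version, maintenance=False,
                 min_version=MIN_SUPPORTED_VERSION):
    """Return an instruction for the client, or None to handle the request."""
    if maintenance:
        # Clients can show a friendly "back soon" screen instead of an error.
        return {"action": "maintenance"}
    if parse_version(client_version) < min_version:
        # Clients older than the minimum are told to update before continuing.
        return {"action": "force_update",
                "min_version": ".".join(str(n) for n in min_version)}
    return None


# Usage
print(gate_request("1.1.9"))
print(gate_request("1.10.0"))
print(gate_request("2.0.0", maintenance=True))
```

Comparing versions as integer tuples rather than strings avoids treating "1.10.0" as older than "1.2.0".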

  16. LAUNCH!
    me :-)

  17. Growth?
    Sharding events

  18. Downtime?
    • Went from 1 hi1.4xlarge database master to 8, in 6 days.
    • Yes, there was downtime:
      • First db sharding ~2 days after launch (29m)
      • Second sharding 5 days after launch (90m)
      • Third sharding 6 days after launch (40m)
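
For a sense of scale, the downtime figures above can be added up and compared against the launch period. The 8-day window is an assumption taken from the talk title; the slides do not state an availability figure.

```python
# Downtime per sharding event, in minutes, as listed on the slide.
downtime_minutes = {
    "first sharding": 29,
    "second sharding": 90,
    "third sharding": 40,
}

WINDOW_DAYS = 8  # assumption: measure against the 8 days in the talk title

total_downtime = sum(downtime_minutes.values())
window_minutes = WINDOW_DAYS * 24 * 60
availability = 1 - total_downtime / window_minutes

print(f"total downtime: {total_downtime} minutes")
print(f"availability over {WINDOW_DAYS} days: {availability:.2%}")
```

That comes to 159 minutes of downtime, or about 98.62% availability over those 8 days. Going from 1 master to 8 in three events would be consistent with doubling the number of masters each time, though the slides do not say how the splits were done.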

  19. The bright side

  20. What lessons did we learn?
    • Monitoring and metrics PAY OFF
    • Tools to help deal with users
    • Invest in configuration management
    • Dynamic configuration with switches/throttles
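
As one possible reading of "switches/throttles", the sketch below keeps on/off switches and percentage throttles in a config dictionary that could be changed at runtime without a deploy. The feature names, the percentages and the idea of hashing user IDs into buckets are assumptions for illustration, not details from the talk.

```python
import hashlib

# Hypothetical runtime configuration; in practice this would be loaded from
# some shared store so every app server sees a change quickly.
CONFIG = {
    "switches": {"push_notifications": True, "friend_search": False},
    "throttles": {"new_signups": 25},  # percent of users let through
}


def is_enabled(name, config=CONFIG):
    """Switch: a feature is either on or off for everyone (default: off)."""
    return bool(config["switches"].get(name, False))


def is_allowed(name, user_id, config=CONFIG):
    """Throttle: let through a stable percentage of users (default: all)."""
    percent = config["throttles"].get(name, 100)
    digest = hashlib.sha256(f"{name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # 0..99, stable per user
    return bucket < percent


# Usage: shed load by flipping a switch or lowering a throttle percentage.
print(is_enabled("friend_search"))
print(sum(is_allowed("new_signups", uid) for uid in range(1000)))
```

Hashing the user ID keeps the decision stable for each user, so a throttled user is not let in on one request and rejected on the next.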
