QuizUp: Zero to a Million Users in 8 Days

This is a talk I gave at the AWS Summit in Stockholm in 2014. It tells the story of QuizUp: the choices that were made and why, and how those choices affected the success of QuizUp (and our lives) in the days after launch.

Transcript

  1. 2.

    QuizUp?
    • Social trivia game for iOS and Android
    • Launched November 7th 2013
    • .. March 6th 2014 on Android
    • Currently 16+ million users
    • One of the fastest growing apps/networks
  2. 3.

    QuizUp Background
    • .. started making small, topical trivia apps
    • such as:
      • Eurovision QuizUp
      • Twilight QuizUp
      • Math QuizUp
      • NatGeo QuizUp
    • First: proof-of-concepts for investors
    • Then: satellites to pull users to QuizUp network
  3. 4.

    Engineering Team
    • Small server team (3-4 people, depending on perspective)
    • Backgrounds in telecom, finance, design, math and music
    • Me: f/oss devops guy, first in web-tech, then telecom (mobile)
  4. 5.

    The story: iOS Launch
    • Expected 1M users in 2013
    • Got 1M users in 8 days
    • Capacity planning was hard
    • Executed all scaling strategies within a week
  5. 7.

    Why and how: use “the cloud”?
    • Mostly in IaaS fashion (AWS)
    • Prefer SaaS to in-house solutions
    • Allows a small team to accomplish a lot quickly
    • We also use Heroku for many internal apps
    • Intended to use more PaaS
  6. 8.

    Why Amazon?
    • Team members had experience with it
    • Industry standard
    • Multiple locations
    • We developed in eu-west-1
    • Production is in us-east-1
  7. 9.

    Why Amazon?
    • Great selection of “hardware”
    • Single point for many services:
      • EC2
      • S3
      • Route53
      • CloudFront
      • VPC
      • DynamoDB
      • … etc
  8. 10.

    QuizUp Architecture
    • Inspired by
      • 12factor.net
      • Netflix Engineering
    • Most moving parts are scalable
    • Stateless “immutable” app servers
    • Sharded player data
    • Scalable datastores
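The slide mentions sharded player data but does not say how players are mapped to shards. As a minimal sketch of one common approach (hash-based routing), assuming string player IDs and a fixed shard count, both of which are illustrative choices and not taken from the talk:

```python
import hashlib

# Assumption for illustration only: the talk does not state the routing scheme.
NUM_SHARDS = 8


def shard_for_player(player_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a player ID to a shard index in the range [0, num_shards)."""
    # Use a stable hash (not Python's built-in hash(), which is salted per
    # process) so every app server routes the same player to the same shard.
    digest = hashlib.md5(player_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for pid in ("player-1", "player-2", "player-3"):
        print(pid, "->", shard_for_player(pid))
```

Because the routing depends only on the player ID, stateless app servers need no shared lookup state to find a player's database. One known trade-off of plain modulo routing is that changing the shard count remaps most players, so resharding means moving data.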
  10. 13.

    QuizUp Architecture
    • Worse is (often) better
    • Optimizing is a luxury problem
    • Outsource to SaaS what we can
    • Pusher, DataDog, Pingdom, PagerDuty, Travis, Sentry
    • …etc
  11. 14.

    QuizUp Architecture
    • A lot inherited from “legacy”
    • Large monolithic quizup-server API implemented in Python
    • Decoupling now
      • Separate services
      • Routing requests to different ELBs
  12. 16.

    How did we prepare?
    • Metrics, metrics, metrics
    • Code freeze: an entire *week* before launch!
    • Load testing (locust, 20x m1.small nodes)
    • 5 weeks of beta
    • Force-update and graceful maintenance features built into API and clients
    • Coordinate with infrastructure vendor:
      • Prewarm ELBs
      • Increased instance limits
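The force-update and graceful-maintenance features are only named on the slide. A minimal sketch of how an API might gate requests on both, assuming dotted numeric client versions; the function names, the minimum version, the status codes and the response fields are all hypothetical and not taken from QuizUp's API:

```python
from typing import Dict, Tuple

# Hypothetical minimum client version; not a value from the talk.
MIN_SUPPORTED_VERSION = (1, 2, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse '1.2' or '1.2.3' into a tuple padded to three components."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def gate_request(
    client_version: str,
    maintenance: bool = False,
    min_version: Tuple[int, ...] = MIN_SUPPORTED_VERSION,
) -> Tuple[int, Dict[str, object]]:
    """Return (status_code, body) telling the client how to proceed."""
    if maintenance:
        # Graceful maintenance: the client can show a friendly screen and retry.
        return 503, {"error": "maintenance", "retry_after_seconds": 60}
    if parse_version(client_version) < min_version:
        # Force-update: the client should send the user to the app store.
        return 426, {"error": "update_required"}
    return 200, {}


if __name__ == "__main__":
    print(gate_request("1.1.9"))
    print(gate_request("1.2"))
    print(gate_request("2.0.0", maintenance=True))
```

The point of building this into both API and clients before launch is that the server can then retire old clients or announce a maintenance window without shipping a new app release.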
  14. 20.

    Downtime?
    • Went from 1 hi1.4xlarge database master to 8, in 6 days.
    • Yes, there was downtime:
      • First db sharding ~2 days after launch (29m)
      • Second sharding 5 days after launch (90m)
      • Third sharding 6 days after launch (40m)
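To put the three sharding outages in perspective, the arithmetic can be worked through as below. This assumes the three listed outages were the only downtime and that the window is the six days mentioned on the slide; neither assumption is confirmed by the talk.

```python
# Outage durations in minutes, as listed on the slide.
sharding_downtime_minutes = [29, 90, 40]

# Assumption: measure against the six days mentioned on the slide.
WINDOW_DAYS = 6

total_downtime = sum(sharding_downtime_minutes)
window_minutes = WINDOW_DAYS * 24 * 60
availability_percent = 100 * (1 - total_downtime / window_minutes)

print(f"Total sharding downtime: {total_downtime} minutes")
print(f"Availability over {WINDOW_DAYS} days: {availability_percent:.2f}%")
```

Under those assumptions the total is 159 minutes (2 hours 39 minutes), or roughly 98.16% availability over the six days.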
  15. 22.

    What lessons did we learn?
    • Monitoring and metrics PAY OFF
    • Tools to help deal with users
    • Invest in configuration management
    • Dynamic configuration with switches/throttles
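The slide does not describe how the switches and throttles were implemented. A minimal in-memory sketch of the idea, assuming a boolean switch per feature and a percentage throttle per feature; the class and method names are hypothetical, and a real deployment would read these values from a shared store so they can change without a deploy:

```python
import hashlib
from typing import Dict


class DynamicConfig:
    """In-memory sketch of runtime switches (on/off) and throttles (0-100%)."""

    def __init__(self) -> None:
        self._switches: Dict[str, bool] = {}
        self._throttles: Dict[str, int] = {}

    def set_switch(self, name: str, enabled: bool) -> None:
        self._switches[name] = enabled

    def set_throttle(self, name: str, percent: int) -> None:
        # Clamp to the valid range so a typo cannot produce odd behavior.
        self._throttles[name] = max(0, min(100, percent))

    def is_enabled(self, name: str, default: bool = True) -> bool:
        return self._switches.get(name, default)

    def allows(self, name: str, user_id: str) -> bool:
        """True if this user falls inside the feature's throttle percentage."""
        if not self.is_enabled(name):
            return False
        percent = self._throttles.get(name, 100)
        # Stable per-user bucket in [0, 100): the same user always gets the
        # same answer, and raising the percentage only ever adds users.
        key = f"{name}:{user_id}".encode("utf-8")
        bucket = int(hashlib.sha1(key).hexdigest(), 16) % 100
        return bucket < percent


if __name__ == "__main__":
    config = DynamicConfig()
    config.set_throttle("push_notifications", 25)
    allowed = sum(config.allows("push_notifications", f"user-{i}") for i in range(1000))
    print(f"{allowed} of 1000 users allowed at a 25% throttle")
    config.set_switch("push_notifications", False)
    print(config.allows("push_notifications", "user-1"))
```

A switch lets an expensive feature be turned off entirely under load, while a throttle lets it be dialed back gradually instead of all at once.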