Scaling to Get the Whole World Running

Steve Huff
October 10, 2017

Talk given at Mobile@Scale 2017, describing Runkeeper's strategies for scaling a mobile app and backend.

Transcript

  1. Scaling to Get the Whole World Running

    Steve Huff, Lead Site Reliability Engineer ([email protected], @hakamadare)
    Joe Bondi, Co-Founder and CTO ([email protected], @007i)
  2. Runkeeper APIs

    Started simple: “core” most-critical API functions
    Mobile <-> Server relationship
    • Activity tracking
    • User registration
  3. Runkeeper APIs

    Planned for extensibility and evolution
    Mobile <-> Server relationship
    • Activity tracking
    • User registration
    • Training plans
    • Social network / feed
    • Challenges
    • Subscriptions
    • ...
    • Goals
    • Routes
  4. Log, measure, monitor

    • Know your expected traffic patterns / client behavior
    • Monitor your actual traffic patterns / client behavior
    • Log metrics to surface and identify bottlenecks
    • Recognize badly-behaving mobile clients, and fix!
    • Avoid self DDoS’ing!
      ◦ API calls made during app launch or home screen
      ◦ API calls made in loops (N+1)
      ◦ API calls made from push notifications at large surge volume
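
To make the "API calls made in loops (N+1)" point concrete, here is a minimal sketch that counts round trips for a looped fetch versus a single batched fetch. `FakeApi`, `get_trip`, and `get_trips` are hypothetical stand-ins invented for illustration; they are not Runkeeper's API.

```python
class FakeApi:
    """Hypothetical stand-in for an HTTP client; it only counts round trips."""

    def __init__(self, trips):
        self._trips = trips
        self.calls = 0

    def get_trip(self, trip_id):
        # One round trip per trip.
        self.calls += 1
        return self._trips[trip_id]

    def get_trips(self, trip_ids):
        # One round trip for the whole batch.
        self.calls += 1
        return [self._trips[t] for t in trip_ids]


def load_feed_n_plus_1(api, trip_ids):
    # The pattern to avoid: one API call per item inside a loop.
    return [api.get_trip(t) for t in trip_ids]


def load_feed_batched(api, trip_ids):
    # Same result, a single call regardless of how many items are requested.
    return api.get_trips(trip_ids)
```

With N trips, the looped version costs N round trips per client while the batched version costs one, which is the difference that multiplies across every client launching the app at once.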
  5. How does your database grow?

    Best way to scale a database is to not use one
    • Trip points-data
    • Trip summary records
  6. Caching - be fast and efficient

    • Have a client-side + server-side strategy
    • Find queries that benefit from caching
      ◦ Measure hit / miss - know how it’s working
    Diagram labels: Local cache (etag / retrofit), CDN, Web App, CloudFront, Redis, Postgres
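
As a rough illustration of the "Local cache (etag)" and "Measure hit / miss" items, the sketch below models an HTTP conditional request: the client sends its stored ETag as `If-None-Match` and the server answers 304 when nothing changed. `FakeServer` and `EtagCache` are hypothetical names, and this is plain Python rather than the Retrofit setup the slide refers to.

```python
import hashlib


class FakeServer:
    """Hypothetical origin that supports If-None-Match style revalidation."""

    def __init__(self, body):
        self.body = body

    def get(self, if_none_match=None):
        etag = hashlib.sha256(self.body.encode()).hexdigest()
        if if_none_match == etag:
            return 304, etag, None  # Not Modified: no body is sent
        return 200, etag, self.body


class EtagCache:
    """Client-side cache that revalidates with the ETag and counts hits and misses."""

    def __init__(self, server):
        self.server = server
        self.etag = None
        self.body = None
        self.hits = 0
        self.misses = 0

    def fetch(self):
        status, etag, body = self.server.get(if_none_match=self.etag)
        if status == 304:
            self.hits += 1  # cached copy is still valid
        else:
            self.misses += 1  # first fetch, or content changed
            self.etag, self.body = etag, body
        return self.body

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Exporting `hits` and `misses` as metrics is one way to know whether a cache is paying for itself before adding another layer.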
  7. Queues - are a savior

    • Queues help manage write actions at large volumes
    • Queue anything that can be queued
    • Monitor queue length to know when there’s an issue
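
The "monitor queue length" advice can be sketched with Python's standard `queue` module. The threshold value and the `queue_health` and `drain` helpers are assumptions made up for this example; a production system would more likely report depth from its message broker.

```python
import queue

ALERT_THRESHOLD = 3  # hypothetical depth at which someone should be alerted


def queue_health(q, threshold=ALERT_THRESHOLD):
    # qsize() is approximate when multiple threads are involved,
    # which is good enough for a monitoring signal.
    depth = q.qsize()
    return {"depth": depth, "alert": depth > threshold}


def drain(q, handler, max_items):
    """Process up to max_items queued writes and return how many were handled."""
    processed = 0
    while processed < max_items:
        try:
            item = q.get_nowait()
        except queue.Empty:
            break
        handler(item)
        q.task_done()
        processed += 1
    return processed
```

A depth that keeps growing means workers are falling behind the write rate, which is usually visible here before users notice anything.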
  8. Migrating things to the “new way”

    Migrations - change the engines while in-flight
    1. Dual-write: Deploy new writes while old way stays up and running
    2. Backfill historical data
    3. Incremental deploy (via server- or client-side config)
    4. Cleanup (remove code and data)
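
The four migration steps can be sketched with two dictionaries standing in for the old and new data stores. `TripStore`, `backfill`, and the `read_from_new` flag are hypothetical names chosen for illustration, not the authors' implementation.

```python
class TripStore:
    """Hypothetical wrapper that dual-writes and reads from a configurable side."""

    def __init__(self, old, new, read_from_new=False):
        self.old = old
        self.new = new
        # Step 3: this flag would come from server- or client-side config
        # so it can be flipped incrementally.
        self.read_from_new = read_from_new

    def write(self, trip_id, trip):
        # Step 1: every write goes to both stores while the old path stays live.
        self.old[trip_id] = trip
        self.new[trip_id] = trip

    def read(self, trip_id):
        source = self.new if self.read_from_new else self.old
        return source[trip_id]


def backfill(old, new):
    """Step 2: copy historical records the new store has not seen yet."""
    copied = 0
    for trip_id, trip in old.items():
        if trip_id not in new:
            new[trip_id] = trip
            copied += 1
    return copied
```

Step 4, cleanup, then amounts to deleting the `old` write path and its data once every reader has been on the new store long enough to trust it.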
  9. Challenges specific to mobile apps

    • Backwards compatibility
      ◦ How long to support old versions of apps?
    • Time needed to get new version out to users' devices
    • Wireless connectivity failures
      ◦ Design for re-trying - though avoid self-DDoS
      ◦ Re-transmission of data already successfully received by server
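
One common way to handle both retry bullets is randomized exponential backoff on the client plus a client-generated key that lets the server discard duplicate uploads. The sketch below assumes that approach; `Server`, `upload_with_retry`, and the key format are hypothetical and not taken from the talk.

```python
import random
import time


class Server:
    """Hypothetical server that de-duplicates uploads by a client-supplied key."""

    def __init__(self):
        self.activities = {}

    def upload(self, idempotency_key, activity):
        # setdefault keeps the first copy, so a re-transmission changes nothing.
        self.activities.setdefault(idempotency_key, activity)
        return "ok"


def upload_with_retry(send, key, activity, attempts=5, base_s=1.0, cap_s=60.0,
                      sleep=time.sleep, rng=None):
    """Retry on connection errors with capped, randomized exponential backoff."""
    rng = rng or random.Random()
    for n in range(attempts):
        try:
            return send(key, activity)
        except ConnectionError:
            if n == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Random delay in [0, min(cap, base * 2^n)] so that clients
            # which failed together do not all retry together.
            sleep(rng.uniform(0, min(cap_s, base_s * 2 ** n)))
```

The random delay addresses the self-DDoS concern, and the key covers the case where the server stored the data but the response never reached the phone.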
  10. Thank you!

    1. Thanks for inviting us, and to all of you for attending!
    2. Questions?
    Steve Huff, Lead Site Reliability Engineer ([email protected], @hakamadare)
    Joe Bondi, Co-Founder and CTO ([email protected], @007i)