Rewriting Parse.com

Presented at RootConf 2015.
Video at https://www.youtube.com/watch?v=YXAwSHYdOqc

Abhishek Kona

May 14, 2015

Transcript

  1. 5/26/2015 Rewriting Parse.com
    http://127.0.0.1:3999/rootconf_2015.slide#1 1/30
    Rewriting Parse.com
    RootConf, Bangalore
    15 May 2015

  2. Who is this Guy?
    Abhishek Kona
    Software Engineer at Parse.com, Facebook
    Ex-Flipkart
    @sheki

  3. What is the talk about?
    What (scaling) problems did we have at Parse.com?
    How did we solve them?
    Did we learn anything?

  4. What is Parse.com?
    Developer platform to build mobile apps.
    Backend-as-a-Service: build an app, not a backend.
    Works for iOS, Android, JS, React, React-Native, Windows, PHP ...
    Acquired by Facebook in 2013.

  5. Parse - Circa 2013.
    ~60K apps.
    10 Engineers.
    Ruby on Rails App (like every company out of YCombinator)

  6. Parse - Right now.
    500K apps built on Parse.
    100% Year-On-Year traffic growth.
    Primarily a Go Stack.

  7. Parse.com - Issues (2013)
    Uptime ~90%
    A single popular app could take down Parse.com
    Unmanageable codebase

  8. Listing down our Problems - "Beast List"
    Created a checklist of all issues preventing us from reaching 99.9+% uptime.
    Came up with software / tools we could build.
    Some concrete issues:
    Unicorn, our Ruby HTTP server, was a resource hog.
    Long deploy times.

  9. We decided to Rewrite in Go.

  10. Why Rewrite?
    Could not understand the Ruby codebase.
    Estimated performance win - huge.
    The new codebase would be statically typed.

  11. Why Go?
    Statically typed programming language with good concurrency support.
    Outperforms Ruby - build and execution time.
    Our second choice was C#.

  12. Status of the Rewrite
    Took 3-4 Engineers 1.5 years to complete. It works.

  13. How did the rewrite help?
    Got rid of Unicorn.
    We can add capacity quickly; deploys got faster.
    Readable codebase (for now).

  14. What did we learn?
    No silver bullet.
    Mostly about managing the pain.

  15. Monoliths are all right.

  16. Monoliths
    Micro-services are all the rage, but it is quicker to build and test a single binary.
    We built Parse.com mostly as a monolith, inspired by Facebook.com
    Micro-services work if there are multiple teams managing different services.

  17. Proxies - you probably need them.
    Connections consume precious memory on the DB.
    Proxies help effectively manage connections across app servers.
    Side-effect - you can monitor your database perf from a central place.
    We wrote our own proxy for Mongo in Go: https://github.com/facebookgo/dvara
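
The connection-management point can be sketched with a small pool. This is not how dvara is implemented; it is a minimal illustration, assuming a hypothetical `Conn` type in place of a real database connection, of how a proxy can let many clients share a fixed number of backend connections and count queries in one central place.

```go
package main

import (
	"fmt"
	"sync"
)

// Conn stands in for one open database connection (hypothetical type).
type Conn struct{ ID int }

// Pool caps the number of backend connections at a fixed size and lends
// them out, so the database sees `size` connections no matter how many
// clients talk to the proxy.
type Pool struct {
	idle    chan *Conn // buffered channel holding idle connections
	mu      sync.Mutex
	queries int // central counter: a natural place to hang metrics
}

func NewPool(size int) *Pool {
	p := &Pool{idle: make(chan *Conn, size)}
	for i := 0; i < size; i++ {
		p.idle <- &Conn{ID: i} // a real proxy would dial the database here
	}
	return p
}

// Do borrows a connection, runs fn on it, and returns it to the pool.
// It blocks while every connection is busy.
func (p *Pool) Do(fn func(c *Conn)) {
	c := <-p.idle
	defer func() { p.idle <- c }()
	p.mu.Lock()
	p.queries++
	p.mu.Unlock()
	fn(c)
}

// Queries returns how many calls have gone through the pool.
func (p *Pool) Queries() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.queries
}

// simulate pushes `clients` concurrent callers through a pool of `size`
// connections and reports the distinct connections used and the query count.
func simulate(clients, size int) (distinct, queries int) {
	p := NewPool(size)
	var mu sync.Mutex
	seen := map[int]bool{}
	var wg sync.WaitGroup
	for i := 0; i < clients; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.Do(func(c *Conn) {
				mu.Lock()
				seen[c.ID] = true
				mu.Unlock()
			})
		}()
	}
	wg.Wait()
	return len(seen), p.Queries()
}

func main() {
	distinct, queries := simulate(100, 4)
	fmt.Printf("%d queries over %d backend connections\n", queries, distinct)
}
```

However many clients call `Do`, the backend never sees more than `size` connections, which is the memory saving the slide refers to.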

  18. Metrics
    When in doubt, measure everything.
    We started with Ganglia -> UI froze after 100 metrics.
    We use Facebook's Scuba and ODS.
    Find a metrics service; hopefully you don't have to write one.
    https://www.facebook.com/notes/facebook-engineering/under-the-hood-data-diving-with-scuba/10150599692628920

  19. Shadow Live Traffic
    Real bugs show up under live traffic.
    We had a mechanism to run live traffic on a test cluster.
    Tools to compare results from the test and prod clusters were invaluable.
    Shadowed traffic for months for some endpoints before we released to 100% of users.
    Our setup -> a custom Go HTTP proxy to send requests to test and prod clusters.
    Works great for Read APIs.
    Write APIs need a more complicated setup, with a database snapshot and a DB compare.
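
The read-API case can be sketched as a small Go handler. This is an illustration under stated assumptions, not Parse's proxy: the `shadow`, `fetch`, and `demo` names are hypothetical, only GET requests are handled, and two `httptest` servers stand in for the prod and test clusters.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch performs a GET and returns the status code and body.
func fetch(url string) (int, string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

// shadow answers every GET from prodURL and replays it against testURL.
// Mismatches are reported through onDiff; the client only ever sees prod.
func shadow(prodURL, testURL string, onDiff func(uri, prod, test string)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		uri := r.URL.RequestURI()
		prodCode, prodBody, err := fetch(prodURL + uri)
		if err != nil {
			http.Error(w, "prod unavailable", http.StatusBadGateway)
			return
		}
		w.WriteHeader(prodCode)
		io.WriteString(w, prodBody)

		// Replayed inline to keep the sketch short; a real proxy would do
		// this off the request path so the test cluster cannot add latency.
		testCode, testBody, err := fetch(testURL + uri)
		if err != nil || testCode != prodCode || testBody != prodBody {
			onDiff(uri, prodBody, testBody)
		}
	})
}

// demo wires two stand-in clusters behind the proxy and returns what the
// client saw plus the URI that was flagged as different.
func demo() (clientSaw, flagged string) {
	backend := func(version string) *httptest.Server {
		return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			io.WriteString(w, version+" "+r.URL.Path)
		}))
	}
	prod, test := backend("v1"), backend("v2")
	defer prod.Close()
	defer test.Close()

	diffs := make(chan string, 1)
	proxy := httptest.NewServer(shadow(prod.URL, test.URL, func(uri, p, t string) {
		diffs <- uri
	}))
	defer proxy.Close()

	_, body, _ := fetch(proxy.URL + "/classes/GameScore")
	return body, <-diffs
}

func main() {
	body, flagged := demo()
	fmt.Printf("client saw %q; mismatch on %s\n", body, flagged)
}
```

Writes are left out on purpose: replaying them would change state on the test cluster, which is why the slide pairs them with database snapshots and a DB compare.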

  20. Throttles
    First line of defense - the capability to block any backend or client.
    Our throttling was simple Memcache-based counters.
    Currently evolving into auto-throttling.
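
A counter-based throttle of this kind might look like the sketch below. It is an assumption-laden illustration: in-process maps stand in for Memcache, the fixed-window scheme is one common choice rather than a description of Parse's system, and the `Throttle`, `Allow`, and `Block` names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Throttle is a fixed-window counter per app ID. The maps stand in for
// Memcache keys; in Memcache the window reset would be a key expiry.
type Throttle struct {
	mu      sync.Mutex
	limit   int
	window  time.Duration
	count   map[string]int
	start   map[string]time.Time
	blocked map[string]bool
}

func NewThrottle(limit, windowSeconds int) *Throttle {
	return &Throttle{
		limit:   limit,
		window:  time.Duration(windowSeconds) * time.Second,
		count:   map[string]int{},
		start:   map[string]time.Time{},
		blocked: map[string]bool{},
	}
}

// Block is the manual kill switch: every request from appID is refused.
func (t *Throttle) Block(appID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.blocked[appID] = true
}

// Allow counts one request and reports whether it fits in the window.
func (t *Throttle) Allow(appID string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.blocked[appID] {
		return false
	}
	now := time.Now()
	if now.Sub(t.start[appID]) >= t.window {
		t.start[appID] = now // first request, or the old window expired
		t.count[appID] = 0
	}
	t.count[appID]++
	return t.count[appID] <= t.limit
}

func main() {
	th := NewThrottle(2, 60)
	for i := 1; i <= 3; i++ {
		fmt.Println("request", i, "allowed:", th.Allow("app1"))
	}
}
```

Keeping the counters per app is what stops a single popular app from consuming capacity that belongs to everyone else.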

  21. Gatekeeper / Decider
    Feature flags / production hooks to control roll out of new code to a fraction of users/traffic.
    Good way to get confidence.
    Our in-house Go system is called Decider -> based on Redis.
    Important to clean up old code after the roll out to avoid code smells.
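
A percentage rollout can be sketched in a few lines. This is not Decider's actual code: the `Decider` type below is a hypothetical stand-in, a plain map replaces Redis, and the flag name in `main` is invented for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Decider maps a feature name to the percentage (0-100) of users who
// should see it. The map stands in for a shared store such as Redis.
type Decider struct {
	rollout map[string]uint32
}

func NewDecider(rollout map[string]uint32) *Decider {
	return &Decider{rollout: rollout}
}

// Enabled hashes feature+user into a bucket in [0, 100). The same user
// always lands in the same bucket, so raising the percentage only ever
// adds users to the feature.
func (d *Decider) Enabled(feature, userID string) bool {
	pct, ok := d.rollout[feature]
	if !ok {
		return false // unknown flags are off
	}
	h := fnv.New32a()
	h.Write([]byte(feature + ":" + userID))
	return h.Sum32()%100 < pct
}

func main() {
	d := NewDecider(map[string]uint32{"go_push_endpoint": 10})
	on := 0
	for i := 0; i < 1000; i++ {
		if d.Enabled("go_push_endpoint", fmt.Sprintf("user-%d", i)) {
			on++
		}
	}
	fmt.Printf("enabled for %d of 1000 users\n", on)
}
```

Including the feature name in the hash means different flags pick different subsets of users, so the same early adopters are not hit by every experiment.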

  22. Deploys
    Our philosophy - every engineer should deploy when needed.
    We moved away from a fixed Monday release to releasing all the time.
    Deploy many small changes as often as possible.
    Deployctl - In-house tool written in Python to deploy Go (Zookeeper based).
    Deploy locking and canarying.

  23. Cockpit
    Admin HTTP service on every binary for debugging.
    Exposes Health Checks / Git version / build time / uptime.
    Can connect pprof over it (thanks, Go).
    Can activate verbose logging on a particular server - logs every request/response pair.
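
An admin service along these lines can be assembled from the standard library. The sketch below is an assumption, not Cockpit itself: the routes, the `gitVersion` and `buildTime` variables, and the `probe` helper are all hypothetical, while the pprof handlers are the ones shipped in `net/http/pprof`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
	"sync/atomic"
	"time"
)

// Hypothetical build metadata; typically stamped at build time with
// -ldflags "-X main.gitVersion=... -X main.buildTime=...".
var (
	gitVersion = "unknown"
	buildTime  = "unknown"
	startedAt  = time.Now()
	verbose    int32 // 1 when per-request logging is switched on
)

// newAdminMux builds the admin handler. Serve it on a separate,
// non-public port from the main API.
func newAdminMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	mux.HandleFunc("/info", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(map[string]string{
			"git_version": gitVersion,
			"build_time":  buildTime,
			"uptime":      time.Since(startedAt).String(),
		})
	})
	mux.HandleFunc("/verbose", func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Query().Get("on") == "1" {
			atomic.StoreInt32(&verbose, 1)
		} else {
			atomic.StoreInt32(&verbose, 0)
		}
		fmt.Fprintf(w, "verbose=%d", atomic.LoadInt32(&verbose))
	})
	// Standard pprof handlers, so `go tool pprof` can attach over HTTP.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	return mux
}

// probe calls the admin handler in-process and returns status and body.
func probe(path string) (int, string) {
	rec := httptest.NewRecorder()
	newAdminMux().ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	// A real binary would run something like:
	//   go http.ListenAndServe("127.0.0.1:6060", newAdminMux())
	code, body := probe("/health")
	fmt.Println(code, body)
}
```

The request-handling code would check the `verbose` flag before logging a request/response pair, so the switch can be flipped on one server without a redeploy.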

  24. Context
    We pass a context object throughout our codebase.
    Context can carry values such as ReqID and AppID.
    We use context to pass in a ReqID that is added to the query comment on Mongo;
    this helps us trace a slow query in the log back to the request.
    Support context objects when writing a new library.
    Go has a great context package: golang.org/x/net/context
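
The ReqID pattern can be sketched as follows. The example uses the standard library `context` package, which has been part of Go since 1.7 and grew out of golang.org/x/net/context; the helper names (`WithReqID`, `ReqID`, `findUser`, `handle`) and the comment format are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is unexported so other packages cannot collide with our keys.
type ctxKey string

const reqIDKey ctxKey = "reqID"

// WithReqID returns a copy of ctx that carries the request ID.
func WithReqID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, reqIDKey, id)
}

// ReqID returns the request ID, or "" if none was attached.
func ReqID(ctx context.Context) string {
	id, _ := ctx.Value(reqIDKey).(string)
	return id
}

// findUser is a stand-in for the data layer. It returns the comment it
// would attach to the database query so a slow-query log entry can be
// traced back to the originating request.
func findUser(ctx context.Context, name string) string {
	return fmt.Sprintf("reqid=%s op=findUser", ReqID(ctx))
}

// handle plays the role of the HTTP layer: it tags the context once and
// every call below it receives the same ID.
func handle(reqID string) string {
	ctx := WithReqID(context.Background(), reqID)
	return findUser(ctx, "alice")
}

func main() {
	fmt.Println(handle("req-42"))
}
```

Taking `ctx` as the first argument of every function in the call chain is what lets a library pick up the ID without knowing anything about the HTTP layer above it.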

  25. Own your Database
    Sooner or later the DB will be the bottleneck.
    Understand the internals of your Database from the start.
    Query planner, DB caches (row/block cache), indexing trade-offs, major locks.
    Start hacking on the DB codebase; you can add custom metrics - usually easier than it seems.
    The Parse.com and RocksDB teams at Facebook built a new storage engine for Mongo - Mongo-Rocks.
    http://blog.parse.com/announcements/mongodb-rocksdb-parse/

  26. About our Codebase
    Dependency Injection - only at boot time: https://github.com/facebookgo/inject
    Lots of small libraries: https://github.com/facebookgo/
    Try not to fork - we submit patches upstream.

  27. Tests
    Integration tests > unit tests.
    Our Go test suite takes less than 2 min to run.
    Parallel test runs are beautiful.
    We boot multiple mongo/memcache instances in memory in our test binary.
    http://github.com/facebookgo/mgotest
    https://github.com/facebookgo/mctest

  28. Closing Thoughts
    A rewrite is not the worst idea.
    Go is great.
    Use Parse.com for your next app.

  29. Thank you
