Scaling Riak to 25MM Ops/Day at Kiip

Armon Dadgar @armondadgar Mitchell Hashimoto @mitchellh

API Flow
Session Start, Moment, Reward, Session End
0..n times

The Numbers
x million unique devices per day
About 4 API calls per session
= ~25 million API calls per day

The Journey of Scale
A Story of MongoDB
* Let’s talk about our journey scaling, specifically with MongoDB.
* We started with MySQL, but switched to MongoDB before we had any real traffic.

1. Write Limit Hit by Analytics
* Analytics sent hundreds of atomic updates per second.
* Hit the limit imposed by the global write lock.
* Solution: Aggregate over 10 seconds and send small bursts of updates, resulting in a lower lock % on average.
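A minimal sketch of the aggregation idea, not Kiip's actual code. Increments are buffered in memory and flushed as one batch once the window has elapsed. `BufferedCounter`, `flush_fn`, and the key names are hypothetical; the 10-second window is the only detail taken from the slide.

```python
import threading
import time
from collections import Counter


class BufferedCounter:
    """Accumulates increments in memory and flushes them in one burst."""

    def __init__(self, flush_fn, interval_seconds=10.0, clock=time.monotonic):
        self._flush_fn = flush_fn      # hypothetical: writes a dict of deltas to the DB
        self._interval = interval_seconds
        self._clock = clock            # injectable so the window can be tested
        self._pending = Counter()
        self._last_flush = clock()
        self._lock = threading.Lock()

    def incr(self, key, amount=1):
        """Record an increment; flush if the aggregation window has elapsed."""
        with self._lock:
            self._pending[key] += amount
            due = self._clock() - self._last_flush >= self._interval
        if due:
            self.flush()

    def flush(self):
        """Send all pending deltas as one batch and start a new window."""
        with self._lock:
            batch, self._pending = dict(self._pending), Counter()
            self._last_flush = self._clock()
        if batch:
            self._flush_fn(batch)
```

With this shape, hundreds of increments to the same counter within a window collapse into a single write per key, which is where the lower average lock % would come from. A production version would also flush on a timer so that a quiet period does not leave data stranded in memory.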

2. Too Many Reads (1000s/s)
* We were reading too much and hit MongoDB’s max throughput.
* Solution: Cache everywhere (heavy caching).
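The slides do not say which cache was used, so the following is only a sketch of the general read-through pattern, with an in-process dict standing in for whatever cache sat in front of MongoDB. `ReadThroughCache`, `load_fn`, and the TTL value are assumptions made for illustration.

```python
import time


class ReadThroughCache:
    """Serves reads from memory and only calls the database on a miss."""

    def __init__(self, load_fn, ttl_seconds=60.0, clock=time.monotonic):
        self._load_fn = load_fn    # hypothetical: fetches one record from the DB
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, reloading it if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = self._load_fn(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example right after the underlying record changes."""
        self._entries.pop(key, None)
```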

3. Slow, Uncachable Queries
Example: “Has device X played game Y today/last week/this month/in all time?”
* Touches lots of data
* Requires lots of index space
* Not cachable
* MongoDB was just... slow.
Solution: Bloom filters!
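To illustrate why a Bloom filter fits this kind of membership question, here is a small self-contained sketch. It is not the implementation used at Kiip. The sizing parameters and the idea of one filter per time window keyed by a `device:game` string are assumptions. A Bloom filter can report false positives but never false negatives, so "no" answers are exact and "yes" answers are probabilistic.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, size_bits=1 << 20, num_hashes=7):
        self._size = size_bits
        self._num_hashes = num_hashes
        self._bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        return [(h1 + i * h2) % self._size for i in range(self._num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical usage: one filter per time window.
played_today = BloomFilter()
played_today.add("device-x:game-y")
```

A lookup touches a fixed number of bits regardless of how much history exists, which is what removes the index-space and data-scan costs listed above.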

4. Write Limit Hit, Again
Basic model updates were hitting MongoDB’s write throughput limit.
Solution: Two clusters (lol global write lock). Use two distinct MongoDB clusters for disjoint datasets to avoid the global write lock: one for analytics (heavy writes), one for everything else.

5. Index Size Hit Memory Limits
We didn’t scale vertically because we’re pretty operationally frugal and the data was growing very fast.
Solution: ETL (Extract/Transform/Load). Archive old data to S3, then reap it from the main DB.

6. ETL Overwhelmed
ETL of 24 hours of data took longer than 24 hours to extract, limited by MongoDB read throughput. We decided to let it break and continue reaping data.
Solution: Punted. Solved later by a custom continuous ETL solution separate from our main DB.

7. Central Bottleneck by Mongo
Noticed that _all_ API response times were directly correlated with the write load on MongoDB. Our only choice left was to look into a new DB solution.
Solution: Research new DBs!

Researching a new DB

RDBMS
In the cloud, without horizontal scalability, I/O would hit a limit REAL fast.
Didn’t want to deal with a custom sharding layer.

Cassandra
Our cofounders are from Digg. Enough said.

HBase
Saw a PyCodeConf talk about a system at Mozilla based on HBase. We talked to the speaker:
* Operational nightmare
* Took 1 year
* No JVM experience at Kiip
Not reasonable for us.

CouchDB
* No auto horizontal scaling; you have to do it at the app level.
* Features weren’t compelling (master/master syncing with phones, CouchApps, etc.).
* We didn’t know anyone who used it.

Riak
* Attracted to its solid academic foundation.
* Visited and talked with Basho developers.
* 100% confident in the Basho team before even using the product.
* Meetups showed real-world usage at scale + dev & ops happiness.

Data Migration

Identify Fast-Growing Data
• Data we needed to be horizontally scalable
• Session/Device data grew at an exponential rate.
• Move that data first, keep the rest in MongoDB (for now).

Identify Fast-Growing Data
(Chart: Session Growth)

Session Migration

Sessions First
• Obviously K/V
• Key: UUID, Value: JSON blob.
• Larger and faster growing than devices.

Data Access Patterns
• By UUID (key) for all API calls
• Fraud: By device ID and IP of session.
• 2i compatible ✓

Update ORM
• Added Riak backing store driver
• No application-level changes were necessary
• Riak Python client pains:
  * Protocol buffer interface buggy
  * No keep alive (fixed)
  * Poor error handling (partially fixed, needs work)

Migrate
• Write new data to Riak
• Read from Riak, fall back to MongoDB if missing
• After one week, remove the read-only MongoDB fallback
Didn’t migrate old data because ETL sent it to S3 anyway.
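The migration steps above can be sketched as a thin wrapper around two stores. This is an illustration under assumptions, not the ORM driver described in the talk: `MigratingStore` is a hypothetical name, and plain dicts stand in for the Riak and MongoDB clients.

```python
class MigratingStore:
    """Writes go to the new store only; reads fall back to the old store."""

    def __init__(self, new_store, old_store):
        self._new = new_store   # stand-in for Riak
        self._old = old_store   # stand-in for MongoDB, treated as read-only

    def put(self, key, value):
        # New data is never written to the old store.
        self._new[key] = value

    def get(self, key):
        # Prefer the new store; consult the old one only on a miss.
        value = self._new.get(key)
        if value is None:
            value = self._old.get(key)
        return value


# Hypothetical usage with dicts in place of real database clients.
mongo = {"session-1": {"device": "d1"}}
riak = {}
sessions = MigratingStore(riak, mongo)
sessions.put("session-2", {"device": "d2"})
```

Once the old store stops receiving hits (about a week for short-lived sessions, per the slide), the fallback branch can be deleted without copying any data across.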

Device Migration

Devices
• Huge
• Growing
• But... not obviously K/V.

Not Obviously K/V
• Canonical ID (UUID), assigned by us.
• Vendor ID (ADID, UDID, etc.), assigned by device vendor.
• Uniqueness constraint on each, so 2i not possible.

Uniqueness in Riak, Part 1
Device
  Key: Canonical ID
  Value: JSON Blob
Device_UUID
  Key: Vendor ID
  Value: Canonical ID
Simulate uniqueness using If-None-Match
Cross fingers and hope consistency isn’t too bad.

Part 1: Results
FAILURE

Part 1: Results
• Latency: At least 200ms, at most 2000ms
• Map/Reduce JS VMs quickly overwhelmed
• Hundreds of inconsistencies per hour

Uniqueness, Part 2
• Just don’t do it.
• Canonical ID = SHA1(Vendor ID)
• Backfill old data (30MM rows, days of backfill)
• Success, use Riak as a K/V store!
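A short sketch of why deriving the canonical ID from the vendor ID removes the need for a uniqueness check: the key becomes a pure function of the vendor ID, so two writers that see the same device always compute the same key. The function name and the hex encoding are assumptions; the slide only states Canonical ID = SHA1(Vendor ID).

```python
import hashlib


def canonical_id(vendor_id: str) -> str:
    """Derive a deterministic device key from the vendor-assigned ID."""
    # Hex encoding is an assumption; the slide does not specify a format.
    return hashlib.sha1(vendor_id.encode("utf-8")).hexdigest()


# The same vendor ID always maps to the same key, with no lookup table.
key_a = canonical_id("example-vendor-id")
key_b = canonical_id("example-vendor-id")
```

Because no second bucket maps vendor IDs to canonical IDs, there is nothing left to keep consistent, and a device lookup is a single K/V get.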

Riak In Production
Our experience over 3 months.

DISCLAIMER
Riak has been extremely solid. However, there are minor pain points that could be, and in some cases have been, addressed.

Scale Early
* Latencies explode under heavy I/O. Attempting to add a new node adds more I/O pressure for handoff.
* Add new nodes early.
* Hard to know when you’re just beginning. Watch your FSM latencies carefully.
Scaling at the red line is painful.

2i is slow, don’t use in real time
* Normal EC2 get: 5ms
* 2i EC2 get: 2000ms
Fine for occasional background queries, not okay for queries on live requests.

JS Map/Reduce is slow, easily overwhelmed.
Slow, which is to be expected, so don’t use it for live requests. JS VMs take a lot of RAM and are limited in quantity, so you can run out very quickly. Riak currently doesn’t handle this well, but they’re working on it.

LevelDB: More Levels, More Pain
Each additional level adds a disk seek, which is a killer in the cloud. We use it because we need 2i. On EC2 ephemeral storage, each additional disk seek adds about 10ms.

Riak Control
Unusable over a slow internet connection due to PJAX bullshit; it requires a low-latency connection. Really bad for Ops people on the road (MiFis, international, etc.). Otherwise great. Basho is aware of the problem.

Operational Issues, Part 1

Operational Issues, Part 2
• Cluster state under exceptional conditions doesn’t converge.
• Add/Remove the same node many times (usually due to automation craziness)
• EC2 partial node failures + LevelDB?

Killing MongoDB
So much fire.

Non K/V Data
• Not fast growing
• Rich querying needed
• Solution: PostgreSQL
• Highly recommended.

Geo
• We actually still use MongoDB, for now.
• Will move to PostGIS eventually.
• Not high pressure, low priority.

Closing Remarks
• Scaling is hard
• Nothing is a magic bullet
• Look for easy wins that matter.
• Rinse and repeat, converge to a scalable system.

Closing Remarks
For horizontally scalable key/value data, Riak is the right choice.

Thanks! Q/A?
