This talk goes over how we scaled one part of our technology stack at Kiip over the last 18 months, and how we ended up on Riak for this specific use case.
Scaling Riak to
25MM Ops/Day at Kiip
x million unique devices per day
About 4 API calls per session
= ~25 million API calls per day
The Journey of Scale
A Story of MongoDB
* Let’s talk about our journey scaling, specifically with MongoDB.
* We started with MySQL, but switched to MongoDB before we had any real traffic.
1. Write Limit Hit by Analytics
* Analytics sent hundreds of atomic updates per second.
* Hit the limit with the global write lock.
* Solution: Aggregate over 10 seconds and send small bursts of updates, resulting in a lower lock % on average (sketched below).
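The aggregation idea as a minimal sketch, not Kiip’s actual code (the collection name and the pymongo 3+ API are assumptions): increments accumulate in memory and flush to MongoDB in one burst every 10 seconds, so the global write lock is held far less often.

import threading
from collections import Counter
from pymongo import MongoClient, UpdateOne

stats = MongoClient().analytics.stats  # hypothetical collection
pending = Counter()
lock = threading.Lock()

def record(event):
    # Hot path: called hundreds of times per second, touches only memory.
    with lock:
        pending[event] += 1

def flush():
    # Every 10 seconds, ship the buffered counts as one small burst.
    with lock:
        batch = dict(pending)
        pending.clear()
    if batch:
        stats.bulk_write([
            UpdateOne({'_id': k}, {'$inc': {'count': v}}, upsert=True)
            for k, v in batch.items()
        ])
    threading.Timer(10.0, flush).start()

flush()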
2. Too Many Reads (1000s/s)
* We were reading too much, hit MongoDB’s max read throughput.
* Solution: Cache everywhere (a sketch follows).
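A minimal read-through cache sketch of the approach (the talk shows no code; memcached via pylibmc and the collection name are assumptions): hot documents are served from cache, so MongoDB only sees the misses.

import json
import pylibmc
from pymongo import MongoClient

mc = pylibmc.Client(['127.0.0.1'])  # assumed memcached node
games = MongoClient().app.games     # hypothetical collection

def get_game(game_id, ttl=60):
    # Serve hot reads from cache; fall through to MongoDB on a miss.
    key = 'game:%s' % game_id
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)
    doc = games.find_one({'_id': game_id})
    if doc is not None:
        mc.set(key, json.dumps(doc, default=str), time=ttl)
    return doc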
3. Slow, Uncacheable Queries
Example: “Has device X played game Y today/last week/this month/in all time?”
* Touches lots of data
* Requires lots of index space
* Not cacheable
* MongoDB just... slow.
Solution: Bloom filters!
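A toy Bloom filter sketch of the idea (sizes and the key scheme are made up): one filter per time window, membership answered from memory with a small false-positive rate instead of a slow, uncacheable query.

import hashlib

class BloomFilter(object):
    def __init__(self, size_bits=1 << 24, num_hashes=7):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-1 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(('%d:%s' % (i, item)).encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

this_week = BloomFilter()
this_week.add('device-123:game-456')
'device-123:game-456' in this_week  # True
'device-999:game-456' in this_week  # almost certainly False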
4. Write Limit Hit, Again
Basic model updates were hitting
MongoDB’s write throughput limit.
Solution: Use two distinct MongoDB clusters for disjoint datasets to avoid the global write lock.
One for analytics (heavy writes).
One for everything else.
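The split as a short sketch, with made-up hostnames (pymongo 3+ API assumed): the write-heavy analytics traffic and everything else stop sharing one write lock.

from pymongo import MongoClient

analytics = MongoClient('mongodb://analytics.internal').analytics
core = MongoClient('mongodb://core.internal').app

# Heavy analytics writes hit one cluster...
analytics.events.insert_one({'event': 'play', 'device': 'device-123'})
# ...while model reads and writes hit the other.
core.users.find_one({'_id': 'user-1'})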
5. Index Size Hit Memory Limits
We didn’t vertically scale because we’re pretty operationally frugal and the data was growing very quickly.
ETL = Extract/Transform/Load: archive data to S3, remove it from MongoDB.
Solution: ETL, reap old data.
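A rough sketch of one ETL/reap pass, assuming boto3 and made-up bucket/collection names: extract documents past the cutoff, archive them to S3 as JSON lines, then delete them from MongoDB so the indexes stay small.

import json
from datetime import datetime, timedelta

import boto3
from pymongo import MongoClient

sessions = MongoClient().app.sessions  # hypothetical collection
s3 = boto3.client('s3')
cutoff = datetime.utcnow() - timedelta(days=30)

batch = list(sessions.find({'created_at': {'$lt': cutoff}}).limit(10000))
if batch:
    body = '\n'.join(json.dumps(doc, default=str) for doc in batch)
    key = 'archive/sessions-%s.json' % cutoff.strftime('%Y%m%d')
    s3.put_object(Bucket='example-archive', Key=key, Body=body)
    sessions.delete_many({'_id': {'$in': [d['_id'] for d in batch]}})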
6. ETL Overwhelmed
ETL of 24 hours of data took longer
than 24 hours to extract, limited
by MongoDB read throughput.
We decided to let it break and continue reaping data. Solved later by a continuous ETL pipeline separate from our main DB.
Solution: Punted; solved by a custom solution.
7. Central Bottleneck: MongoDB
Noticed that _all_ API response times were directly correlated with MongoDB’s write load.
Our only choice left here was to
look into a new DB solution.
Solution: Research new DBs!
Researching a new DB
In the cloud, without
horizontal scalability, I/O
would hit a limit REAL fast.
Didn’t want to deal with a custom sharding layer.
Our cofounders are from Digg.
Saw a PyCodeConf talk about a system at Mozilla based on HBase. We talked to the speaker:
* Operational nightmare
* Took 1 year
* No JVM experience at Kiip
Not reasonable for us.
CouchDB:
* No auto horizontal scaling; you have to do it at the app level.
* Features weren’t compelling (master/master syncing with phones, CouchApps, etc.).
* We didn’t know anyone who used it.
Riak:
* Attracted by its solid academic foundation.
* Visited and talked with Basho developers.
* 100% confident in the Basho team before even using the product.
* Meetups showed real-world usage at scale, plus dev & ops happiness.
Identify Fast-Growing Data
• Data we needed to be horizontally scalable
• Session/Device data grew at an exponential rate.
• Move that data first, keep the rest in MongoDB.
Sessions
• Obviously K/V
• Key: UUID, Value: JSON blob.
• Larger and faster growing than devices.
Data Access Patterns
• By UUID (key) for all API calls
• Fraud: by device ID and IP of the session.
• 2i compatible (see the sketch below)
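A sketch of what those access patterns look like with the Riak Python client (method names vary by client version; all identifiers here are assumptions): sessions keyed by UUID, with 2i entries on device ID and IP for the fraud queries.

import riak

client = riak.RiakClient(protocol='pbc')
sessions = client.bucket('sessions')

# Primary access path: get/put by session UUID.
obj = sessions.new('session-uuid-1', data={'game': 'game-456'})
# Secondary access paths for fraud checks, via 2i.
obj.add_index('device_bin', 'device-123')
obj.add_index('ip_bin', '10.0.0.5')
obj.store()

# Background fraud query: every session seen from one IP.
keys = sessions.get_index('ip_bin', '10.0.0.5')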
• Added a Riak backing store driver
• No application-level changes were necessary
• Riak Python client pains:
  * Protocol buffer interface
  * No keep-alive (fixed)
  * Poor error handling (partially fixed, needs work)
• Write new data to Riak
• Read from Riak, fall back to MongoDB if the key is missing
• After one week, removed the read-only MongoDB data, since ETL had sent it to S3
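The cutover read path as a minimal sketch (names and wiring are assumptions): new writes land in Riak; reads try Riak first and fall back to MongoDB for anything not yet migrated.

import riak
from pymongo import MongoClient

riak_sessions = riak.RiakClient(protocol='pbc').bucket('sessions')
mongo_sessions = MongoClient().app.sessions  # legacy store

def save_session(session_id, data):
    # All new data goes to Riak only.
    riak_sessions.new(session_id, data=data).store()

def get_session(session_id):
    obj = riak_sessions.get(session_id)
    if obj.exists:
        return obj.data
    # Fallback for pre-migration data; dropped after a week.
    return mongo_sessions.find_one({'_id': session_id})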
• But... not obviously K/V.
Not Obviously K/V
• Canonical ID (UUID), assigned by us.
• Vendor ID (ADID, UDID, etc.), assigned by the vendor.
• Uniqueness constraint on each, so 2i is not enough.
Uniqueness in Riak, Part 1
Key: Canonical ID
Value: JSON Blob
Key: Vendor ID
Value: Canonical ID
Simulate uniqueness using If-None-Match
Cross fingers and hope consistency isn’t too bad.
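A sketch of the two-bucket scheme (whether your client version exposes If-None-Match this way is an assumption): the vendor-ID bucket acts as a unique index pointing at the canonical record.

import uuid
import riak

client = riak.RiakClient(protocol='pbc')
devices = client.bucket('devices')        # canonical ID -> JSON blob
vendor_ids = client.bucket('vendor_ids')  # vendor ID -> canonical ID

def register(vendor_id, attrs):
    canonical_id = str(uuid.uuid4())
    pointer = vendor_ids.new(vendor_id, data=canonical_id)
    try:
        # Ask Riak to reject the write if the key already exists.
        pointer.store(if_none_match=True)
    except riak.RiakError:
        # Lost the race: the vendor ID was already claimed.
        return vendor_ids.get(vendor_id).data
    devices.new(canonical_id, data=attrs).store()
    return canonical_id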
Part 1: Results
• Latency: At least 200ms, at most 2000ms
• Map/Reduce JS VMs quickly overwhelmed
• Hundreds of inconsistencies per hour
Uniqueness, Part 2
• Just don’t do it.
• Canonical ID = SHA1(Vendor ID), as sketched below
• Backfill old data (30MM rows, days of backfill)
• Success, use Riak as a K/V store!
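The derived-key idea in a few lines: since the canonical ID is a pure function of the vendor ID, no lookup or uniqueness check is needed at all.

import hashlib

def canonical_id(vendor_id):
    # Same vendor ID always maps to the same Riak key.
    return hashlib.sha1(vendor_id.encode('utf-8')).hexdigest()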
Riak In Production
Our experience over 3 months.
Riak has been extremely solid.
However, there are minor pain points that could be, and in some cases have been, addressed.
* Latencies explode under heavy I/O. Attempting to add a new node adds more I/O pressure for handoff.
* Add new nodes early.
* Hard to know when you’re just beginning. Watch your FSM times (see the sketch below).
Scaling at the red line is painful.
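One way to watch them (a sketch, not official tooling): Riak’s HTTP /stats endpoint exposes get/put FSM time percentiles, a useful early-warning signal for when to add nodes.

import json
from urllib.request import urlopen

stats = json.load(urlopen('http://localhost:8098/stats'))
# 95th-percentile FSM times, in microseconds.
print(stats['node_get_fsm_time_95'], stats['node_put_fsm_time_95'])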
2i is slow; don’t use it in real time.
* Normal EC2 get: 5ms
* 2i EC2 get: 2000ms
Fine for occasional background queries, not okay for queries on live requests.
JS Map/Reduce is slow; that’s to be expected, so don’t use it for live requests.
JS VMs take a lot of RAM and exist in limited quantity, so you can run out very quickly. Riak currently doesn’t handle this well, but they’re working on it.
LevelDB: More Levels, More Pain
Each additional level adds a disk seek, which is a killer in the cloud.
We use LevelDB because we need 2i.
On EC2 ephemeral disks, each additional seek adds noticeable latency.
Operational Issues, Part 1
Riak Control is unusable over a slow internet connection due to PJAX bullshit; it requires a low-latency connection.
Really bad for ops people on the road (MiFis, international, etc.).
Basho is aware of the problem.
Operational Issues, Part 2
• Cluster state under exceptional conditions
• Add/remove the same node many times (usually due to automation craziness)
• EC2 partial node failures + LevelDB?
So much fire.
Non K/V Data
• Not fast growing
• Rich querying needed
• Solution: PostgreSQL
• Highly recommended.
• For geo data, we actually still use MongoDB, for now.
• Will move to PostGIS eventually.
• Not high pressure, low priority.
• Scaling is hard
• Nothing is a magic bullet
• Look for easy wins that matter.
• Rinse and repeat, converge to a scalable architecture.
For horizontally scalable key/value data,
Riak is the right choice.