Slide 1

Slide 1 text

Migrating MongoDB to Cassandra Denver/Boulder Big Data Meetup June 18th, 2014

Slide 2

Slide 2 text

Keeps all your contacts in one place and keeps them automatically up to date.
Michael Rose
Senior Platform Engineer
[email protected]
Follow me on Twitter: @Xorlev

Slide 3

Slide 3 text

I work on the Enrichment team. We crawl the public web for contact data and resolve it into people, i.e., the “keeping it up to date” part of contact management. We offer this via an API.

Slide 4

Slide 4 text

Slide 5

Slide 5 text

MongoDB has a lot of good uses

Slide 6

Slide 6 text

Storing 3TB of rapidly growing data is not one of them

Slide 7

Slide 7 text

Especially when it’s really billions of key-value pairs

Slide 8

Slide 8 text

The Story
• 2011: Techstars, 6 people. We started with MongoDB, focused on building an MVP.
• MongoDB was the hot new tech
• The MVP was a success; we moved on to new products and didn’t worry about Mongo
• We kept building and growing

Slide 9

Slide 9 text

That’s not doomed to failure…
• Hit a performance inflection point; sharding Mongo was too painful, so we decided to scale vertically
[Graphic representation of excellent decision]

Slide 10

Slide 10 text

What’s wrong with MongoDB?
• MongoDB was slow due to high lock percentages
• Mongo has a per-database shared-exclusive lock, with preference given to writers
• Per-database as of 2.2! Whole-server before that
• Not per-collection
• We needed to buy time: early-stage startup
• We only had ~300GB of data at the time
• Enter hi1.4xlarge: 2TB of SSD

Slide 11

Slide 11 text

“2TB of SSD should be enough” — Me

Slide 12

Slide 12 text

2TB of SSD wasn’t enough

Slide 13

Slide 13 text

State of MongoDB
• SSDs were able to serve the data (~8ms @ 99.5th percentile)
• But we kept adding data (it happens, weird)
• When we had the bandwidth to handle it a year later, we were already approaching 2TB of data
• Dirty solution: a second MongoDB cluster, with “sharding” handled at the app layer

Slide 14

Slide 14 text

“Sharding”
• New & updated writes went to the new cluster
• Reads went to both; we chose the new cluster’s copy if available (sketch below)
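A minimal sketch of what that app-layer routing could look like with the 2.x-era mongo-java-driver; the class and its names are ours for illustration, not FullContact’s actual code:

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;

// Hypothetical router over the two MongoDB clusters: all writes go to the
// new cluster; reads check the new cluster first and fall back to the old.
public class AppLayerShardedStore {
    private final DBCollection oldCluster; // legacy cluster, effectively frozen
    private final DBCollection newCluster; // receives all new & updated writes

    public AppLayerShardedStore(DBCollection oldCluster, DBCollection newCluster) {
        this.oldCluster = oldCluster;
        this.newCluster = newCluster;
    }

    public void save(DBObject doc) {
        newCluster.save(doc); // new & updated writes only ever touch the new cluster
    }

    public DBObject find(Object id) {
        DBObject query = new BasicDBObject("_id", id);
        DBObject fromNew = newCluster.findOne(query);
        // Prefer the new cluster's copy; fall back to the old one if absent.
        return fromNew != null ? fromNew : oldCluster.findOne(query);
    }
}
```

Note the cost: a miss in the new cluster means two round trips, which is part of why this felt ugly.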

Slide 15

Slide 15 text

This was ugly, and we feel bad. But it worked.

Slide 16

Slide 16 text

We bought some time. What are our options?
• Cassandra
• Sharded MongoDB (new cluster)
• DynamoDB
• Sharded RDBMSes (MySQL, Postgres, Oracle)
• Other?

Slide 17

Slide 17 text

Weighing the options
• No experience with Cassandra, but we’d heard good things. Netflix’s usage was a big pro for us.
• We already knew MongoDB was bad for our write load.
• DynamoDB: complexity around values > 64KB and uncertain costs, but it probably would have been a solid choice too.
• RDBMSes: no relational benefit; with KV data we’d really just be delegating down to the underlying storage engine anyway. But stable.
• Other tech: too young, no experience

Slide 18

Slide 18 text

Cassandra was the best choice for us.
• Resilience
• Fault-tolerance
• Linear scalability
• Disk happiness

Slide 19

Slide 19 text

Cassandra Pros
• Operationally simple in AWS
• Very resilient. Easy multi-AZ deployments.
• All our clusters are deployed in 3 ASGs, 1 per AZ
• Machine fails? The ASG replaces it, and it auto-bootstraps courtesy of Priam
• Mostly transparent failure handling. Node failures aren’t emergencies, just a way of life on AWS.
• Linear storage scalability
• Need more storage? Double the ring.
• BigTable-like; we have experience with HBase, which is also BigTable-like. CQL3 even hides this from us.

Slide 20

Slide 20 text

Cassandra Pros
• Read scalability with replicas
• Add more machines & increase the replication factor if we need tighter latencies
• We don’t need perfect consistency. Enter eventual consistency.
• We write & read at LOCAL_QUORUM: 2 nodes (sketch below)
• Our data compresses well; on-disk compression helps us do more with less.
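A minimal sketch of how those defaults might be set with Astyanax (the client covered on slide 28); the RF=3 figure is our inference from “LOCAL_QUORUM = 2 nodes”, not something stated outright:

```java
import com.netflix.astyanax.AstyanaxConfiguration;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.model.ConsistencyLevel;

public final class QuorumDefaults {
    // Assumes RF=3 per datacenter (inferred from "LOCAL_QUORUM = 2 nodes"):
    // quorum is floor(3/2) + 1 = 2, and 2 reads + 2 writes > 3 replicas,
    // so every quorum read overlaps the most recent quorum write.
    public static AstyanaxConfiguration localQuorumDefaults() {
        return new AstyanaxConfigurationImpl()
            .setDefaultReadConsistencyLevel(ConsistencyLevel.CL_LOCAL_QUORUM)
            .setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_LOCAL_QUORUM);
    }
}
```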

Slide 21

Slide 21 text

Cassandra Cons
• Yet another database
• Little experience beyond experimentation
• Not ACID (but we didn’t need it)
• Write-optimized, not read-optimized
• We’re about 60/40 r/w, and it works decently
• Cassandra is still a young technology, though it has incredible backing from DataStax, Netflix, Facebook (again), and other organizations.

Slide 22

Slide 22 text

Con: We had no operational experience with Cassandra, and that’s scary
• We knew MongoDB & MySQL pretty well, warts and all, but not Cassandra
• Our second MongoDB cluster bought us time
• We moved a less critical, higher-throughput cache layer from HBase to Cassandra
• Learned about the weakness of the Cassandra Hadoop tools

Slide 23

Slide 23 text

Conversion Steps

Slide 24

Slide 24 text

Conversion Steps
1. Start writing to Cassandra and Mongo concurrently (dual-write sketch below)
2. Backfill the data with a MapReduce job
3. Verify the integrity of the Cassandra data, then move reads
4. Stop writing to MongoDB
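A minimal sketch of step 1’s dual-write, assuming the 2.x mongo-java-driver and Astyanax; the wrapper class and its names are illustrative, not our production code:

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.MutationBatch;
import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
import com.netflix.astyanax.model.ColumnFamily;

public class DualWriteStore {
    private final DBCollection mongo;             // existing source of truth
    private final Keyspace cassandra;
    private final ColumnFamily<String, String> cf;

    public DualWriteStore(DBCollection mongo, Keyspace cassandra,
                          ColumnFamily<String, String> cf) {
        this.mongo = mongo;
        this.cassandra = cassandra;
        this.cf = cf;
    }

    // Step 1: every write lands in both stores, so Cassandra stays current
    // while the MapReduce backfill (step 2) fills in historical rows.
    public void put(String key, String column, String value) throws ConnectionException {
        mongo.save(new BasicDBObject("_id", key).append(column, value));
        MutationBatch batch = cassandra.prepareMutationBatch();
        batch.withRow(cf, key).putColumn(column, value, null); // null = no TTL
        batch.execute();
    }
}
```

Mongo remains the source of truth until step 3’s verification passes, so a failed Cassandra write here is survivable.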

Slide 25

Slide 25 text

Conversion is painful
• It’s worth using a BSON file export (mongodump) & Mongo’s BSONInputFormat
• Cursoring over the dataset is incredibly slow
• Files on HDFS are Hadoop’s bread and butter (see the job sketch on the next slide)

Slide 26

Slide 26 text

Conversion is painful
• Ended up using an offline mongodump to BSON, indexed it with bson_splitter, and pushed it to S3
• A MapReduce job converted that to SequenceFiles (an efficient KV format), as sketched below
• Wrote rows interactively from the reducers using Netflix’s Astyanax client
• Had issues afterwards: disk read-ahead was set too high
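A sketch of that conversion job, assuming the mongo-hadoop connector’s BSONFileInputFormat (exact key/value types vary by connector version); the class and mapper names are illustrative:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.bson.BSON;
import org.bson.BSONObject;

import com.mongodb.hadoop.BSONFileInputFormat;

public class BsonToSequenceFile {
    // Re-encode each BSON document as an (_id, bytes) pair in a SequenceFile,
    // which MapReduce can split and re-read far faster than cursoring Mongo.
    public static class BsonKVMapper extends Mapper<Object, BSONObject, Text, BytesWritable> {
        @Override
        protected void map(Object key, BSONObject doc, Context ctx)
                throws IOException, InterruptedException {
            ctx.write(new Text(doc.get("_id").toString()),
                      new BytesWritable(BSON.encode(doc)));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "bson-to-sequencefile");
        job.setJarByClass(BsonToSequenceFile.class);
        job.setInputFormatClass(BSONFileInputFormat.class);
        job.setMapperClass(BsonKVMapper.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(BytesWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));      // dump location, e.g. on S3
        SequenceFileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```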

Slide 27

Slide 27 text

Our Setup Today
• 3 clusters: 3 nodes, 9 nodes, and 12 nodes
• A cluster per workload
• m1.xlarges are far cheaper than hi1.4xlarges
• 4x800GB disks in RAID0
• All data stored in dmcrypt
• Cassandra 1.2.16 (2.x soon, maybe)
• Priam runs alongside C* doing token management and daily backups to S3
• Approaching 12TB over 4B records between all three clusters
• We don’t miss SSDs that much

Slide 28

Slide 28 text

Client choice: Astyanax
• Astyanax is Netflix’s Cassandra client
• Uses Thrift tables or CQL3 over Thrift
• More feature-rich than the DataStax client
• Token-aware GETs == less latency (setup sketch below)
• The beta can use the DataStax client under the hood for the native protocol
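A minimal token-aware setup following Astyanax’s documented builder API; the cluster, keyspace, seed, and column family names below are placeholders:

```java
import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.model.ColumnList;
import com.netflix.astyanax.serializers.StringSerializer;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

public class TokenAwareClient {
    public static void main(String[] args) throws Exception {
        AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
            .forCluster("ContactCluster")                 // placeholder name
            .forKeyspace("contacts")                      // placeholder name
            .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                // TOKEN_AWARE routes each GET to a replica that owns the key,
                // skipping an extra coordinator hop: the latency win above.
                .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
            .withConnectionPoolConfiguration(
                new ConnectionPoolConfigurationImpl("pool")
                    .setPort(9160)
                    .setMaxConnsPerHost(10)
                    .setSeeds("127.0.0.1:9160"))          // placeholder seed
            .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
            .buildKeyspace(ThriftFamilyFactory.getInstance());
        context.start();
        Keyspace keyspace = context.getClient();

        ColumnFamily<String, String> cf = new ColumnFamily<>(
            "kv", StringSerializer.get(), StringSerializer.get());
        ColumnList<String> row =
            keyspace.prepareQuery(cf).getKey("some-key").execute().getResult();
        System.out.println(row.isEmpty() ? "miss" : "hit");
        context.shutdown();
    }
}
```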

Slide 29

Slide 29 text

Parting thoughts
• For us, Cassandra was a year-long decision that aligned well with our goals of resilience and performance.
• It wasn’t a leap for us; we knew the data model. Be sure it aligns with your goals before plunging in.
• There’s always going to be something that goes wrong. For us, it was disk tuning. Plan ahead: newer databases haven’t had time to mature. I’d call them “fiddly.”
• Nothing is ever easy. There is always friction. Don’t be seduced into tech you don’t need.
• Corollary: the definition of “need” is flexible.

Slide 30

Slide 30 text

Q & A • Thanks for listening everyone! Feel free to ask questions here, shoot me an email ([email protected]), or hit me up on Twitter @Xorlev ! • Obligatory: We’re hiring, check us out. AOL keyword “fullcontact”