Slide 1

Slide 1 text

StudyBlue, Inc.
StudyBlue and MongoDB: Implementation 101
October 18, 2011

Slide 2

Slide 2 text

Overview
• Who am I?
• Who is StudyBlue?
• Why MongoDB?
• How did we leverage MongoDB?
• What lessons did we learn?
• Q&A

Slide 3

Slide 3 text

Who am I?
• Sean Laurent
• [email protected]
• Director of Operations at StudyBlue, Inc.

Slide 4

Slide 4 text

studyblue.com

Slide 5

Slide 5 text

About StudyBlue
• Bottom-up attempt to improve student outcomes
• Online service for storing, studying, sharing and ultimately mastering course material
• Digital backpack for students
• Freemium business model

Slide 6

Slide 6 text

StudyBlue Usage
• Many simultaneous users
• Rapid growth
• Cyclical usage

Slide 7

Slide 7 text

The Challenge

Slide 8

Slide 8 text

Flashcard Scoring
• Track flashcard scoring
  • Every single card
  • Every single user
  • Forever
• Provide aggregate statistics
  • Flashcard deck
  • Folder
  • Overall
• Focus on content mastery
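
The roll-up from individual card scores to deck, folder and overall statistics might look roughly like the sketch below. It is an illustration in plain Python, not StudyBlue's Java code: the `ScoreEvent` record layout and the `aggregate_scores` name are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class ScoreEvent(NamedTuple):
    """One scored flashcard attempt (hypothetical record layout)."""
    user_id: int
    folder_id: int
    deck_id: int
    card_id: int
    correct: bool


def aggregate_scores(events: Iterable[ScoreEvent], user_id: int) -> Dict[str, object]:
    """Roll one user's card attempts up to deck, folder and overall totals.

    Every total is a (correct, attempts) pair.
    """
    per_deck = defaultdict(lambda: [0, 0])
    per_folder = defaultdict(lambda: [0, 0])
    overall = [0, 0]
    for ev in events:
        if ev.user_id != user_id:
            continue
        # The same attempt counts toward its deck, its folder and the overall total.
        for bucket in (per_deck[ev.deck_id], per_folder[ev.folder_id], overall):
            bucket[0] += int(ev.correct)
            bucket[1] += 1
    return {
        "deck": {k: tuple(v) for k, v in per_deck.items()},
        "folder": {k: tuple(v) for k, v in per_folder.items()},
        "overall": tuple(overall),
    }
```

Keeping every attempt "forever" means these totals grow with each study session, which is why the deck goes on to treat storage and scaling as the central problem.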

Slide 9

Slide 9 text

Scoring Results

Slide 10

Slide 10 text

The Problem
• Existing PostgreSQL database
  • Reasonably large number of cards
  • Large number of users
• User base increasing rapidly
• Shift in usage: increasing faster than users
  • Time on site
  • Decks per user
  • Average deck size
  • Study sessions per user

Slide 11

Slide 11 text

Additional Requirements
• Support sustained rapid growth
• Highly available
• Minimize maintenance costs
• Active community
• Done yesterday

Slide 12

Slide 12 text

Why Mongo?

Slide 13

Slide 13 text

Alternatives
• Amazon SimpleDB
  • Far too simple
• Cassandra
  • Difficult to add nodes and rebalance
  • Column families cannot be modified without restart
• CouchDB
  • Difficult to add nodes and rebalance
• Redis
  • No native support for sharding/partitioning
  • Master/slave only, no automatic failover

Slide 14

Slide 14 text

MongoDB for the Win
• Highly available
  • Replica sets
  • Automatic failover
• Shards
  • Works across replica sets
  • Easy to add additional shards
• Node addition
  • Read performance degradation when adding nodes
  • “hidden” flag
  • No downtime

Slide 15

Slide 15 text

More winning
• Atomic insert & replace
• Read balancing across slaves
• BSON/JSON document model
• It just works. Seriously.

Slide 16

Slide 16 text

Implementation

Slide 17

Slide 17 text

DevOps
• Amazon EC2
• Separate dev, test and production environments
• Operations testing
  • Replication
  • Failover
• Scripting & automation
  • Creation
  • Cloning

Slide 18

Slide 18 text

Development
• 100% Java
• Existing PostgreSQL database
  • System of record
  • Synchronization issues

Slide 19

Slide 19 text

SQL Integration & Synchronization
• PostgreSQL considered system of record
• Asynchronous, event driven
  • Web servers queue change events
  • Scoring server processes events
    • Query PostgreSQL
    • Update MongoDB
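
The queue-then-process flow above can be sketched as follows. This is a minimal, driver-free illustration in Python: both databases are replaced by plain dictionaries, and the event payload (a bare deck id), the `card_ids` field and the `process_events` name are assumptions, not details from the talk.

```python
import queue
from typing import Dict


def process_events(events: "queue.Queue[int]",
                   system_of_record: Dict[int, dict],
                   score_store: Dict[int, dict]) -> int:
    """Drain queued change events and refresh the score store.

    Each event carries only the id of the deck that changed. The worker
    re-reads the current row from the system of record (PostgreSQL in the
    talk, a dict here) and then writes the derived document to the score
    store (MongoDB in the talk, a dict here). Returns the number of events
    processed.
    """
    processed = 0
    while True:
        try:
            deck_id = events.get_nowait()
        except queue.Empty:
            return processed
        row = system_of_record.get(deck_id)
        if row is None:
            # The deck no longer exists upstream, so drop the derived document.
            score_store.pop(deck_id, None)
        else:
            score_store[deck_id] = {"_id": deck_id,
                                    "card_count": len(row["card_ids"])}
        processed += 1
```

Because the worker always re-queries the system of record instead of trusting the event payload, replaying or reordering events still leaves the derived store consistent with the source data.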

Slide 20

Slide 20 text

Architecture

Slide 21

Slide 21 text

MongoDB Schema
• Many shallow collections vs. one monolithic deep collection
• Leverage existing SQL knowledge
• Simplify SQL integration

Slide 22

Slide 22 text

Schema Design
• Two collections used together to map relationships
  • Folder containing Deck
  • Decks in a Folder
  • Decks containing a Card
  • Cards in a Deck
• Folders arranged in a tree structure
  • One row per folder that points to its parent
  • Multiple queries required to build the tree
• Postgres primary keys are used instead of object ids
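
To show why a parent-pointer layout needs multiple queries, here is a small Python sketch that rebuilds a folder tree one level at a time. The `find_children` function stands in for a single database query, and the `_id` / `parent_id` field names are assumptions for the example; the sketch also assumes the parent pointers contain no cycles.

```python
from typing import Dict, List, Tuple


def find_children(rows: List[dict], parent_ids: List[int]) -> List[dict]:
    """Stand-in for one database query: folders whose parent is in parent_ids."""
    wanted = set(parent_ids)
    return [r for r in rows if r["parent_id"] in wanted]


def build_tree(rows: List[dict], root_id: int) -> Tuple[dict, int]:
    """Build a nested {folder_id: subtree} dict, issuing one query per level.

    Returns (tree, number_of_queries).
    """
    tree: Dict[int, dict] = {root_id: {}}
    nodes = {root_id: tree[root_id]}  # folder id -> its subtree dict
    frontier = [root_id]
    queries = 0
    while frontier:
        children = find_children(rows, frontier)
        queries += 1
        frontier = []
        for child in children:
            subtree: dict = {}
            nodes[child["parent_id"]][child["_id"]] = subtree
            nodes[child["_id"]] = subtree
            frontier.append(child["_id"])
    return tree, queries
```

Fetching a whole level per query keeps the query count proportional to the tree's depth rather than to the number of folders, at the cost of one final query that returns nothing.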

Slide 23

Slide 23 text


Slide 24

Slide 24 text

Document Scores Example

Slide 25

Slide 25 text

Slave Reads
• SlaveOk set to true for most data retrieval
• Scoring calculations use the primary to ensure correctness
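
The routing decision described above, secondaries for ordinary reads and the primary for scoring, can be illustrated with a toy router. In practice the MongoDB driver makes this choice internally once slaveOk is set; the `ReadRouter` class and its `needs_latest` flag are hypothetical names used only to show the idea.

```python
import itertools
from typing import List


class ReadRouter:
    """Pick a replica-set member for a read (illustrative only)."""

    def __init__(self, primary: str, secondaries: List[str]) -> None:
        self.primary = primary
        # Round-robin over secondaries; fall back to the primary if none exist.
        self._cycle = itertools.cycle(secondaries or [primary])

    def node_for(self, needs_latest: bool) -> str:
        """Return the primary for consistency-sensitive reads, else a secondary."""
        if needs_latest:
            return self.primary
        return next(self._cycle)
```

Secondaries replicate asynchronously and may lag, so a calculation that reads a value and writes back a result derived from it is the kind of read that should stay on the primary.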

Slide 26

Slide 26 text

Data Migration
• One-time process
• Postgres to MongoDB
• Ruby scripts
• Separate server

Slide 27

Slide 27 text

Key Issues

Slide 28

Slide 28 text

Summary
• Amazon EC2/EBS
• Java API
• MapReduce
• Replication
• Partitioning / Shards
• Performance

Slide 29

Slide 29 text

Amazon EC2 & EBS
• Plan for failure
  • “When” not “if”
• EBS performance
  • Inconsistent
  • Limited by bandwidth
  • 60GB minimum
  • RAID-0

Slide 30

Slide 30 text

Java API
• Not perfect
  • Verbose
  • Type safety
• Failover requires retry
  • Up to 1 minute delay
  • Read-only requests
    • “slaveOk” works
  • Burden on developer
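
Since the retry burden falls on the developer, one common approach is a small wrapper that retries an operation with exponential backoff for roughly as long as a failover can take. The sketch below is in Python rather than Java and is not taken from the talk: `TransientError` is a hypothetical stand-in for whatever exception the driver raises during failover, and the 60-second budget simply mirrors the "up to 1 minute" figure above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for the driver error raised during a failover."""


def with_retry(op: Callable[[], T],
               max_wait: float = 60.0,
               first_delay: float = 0.5,
               sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry op with exponential backoff until it succeeds or max_wait is spent."""
    waited = 0.0
    delay = first_delay
    while True:
        try:
            return op()
        except TransientError:
            if waited >= max_wait:
                raise  # Budget exhausted: surface the last error to the caller.
            pause = min(delay, max_wait - waited)
            sleep(pause)
            waited += pause
            delay *= 2
```

Only operations that are safe to repeat should be wrapped this way; a write that may have been applied before the connection dropped needs to be idempotent, or it could be applied twice.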

Slide 31

Slide 31 text

MapReduce
• Perfect for aggregation
• Not used by StudyBlue
  • Not needed (yet)
  • Difficult with multiple collections
  • Reduce limited to masters
  • Keep scalability simple
• Under consideration

Slide 32

Slide 32 text

Replication
• Automated failover
• Read scaling
• Maintenance
• Easy setup & configuration
• “Seed” node(s) for clients

Slide 33

Slide 33 text

Partitioning in the Cloud
• Operations perspective
• Dynamic changes in machines
  • Config servers track machines
  • Each node in replica set knows other nodes
  • Avoids restarting applications when Mongo servers change
• Easy scaling
  • Local shard servers
• Config servers store redundant copies
  • Two-phase commit

Slide 34

Slide 34 text

Useful EC2 Instance Types
• Config servers
  • t1.micro or m1.small
• Mongo replica nodes
  • Depends on memory needs
  • m2.xlarge, m2.2xlarge, m2.4xlarge or cc1.4xlarge

Name         Memory   CU                      I/O
m2.xlarge    17.1 GB  6.5 (2 cores x 3.25)    medium
m2.2xlarge   34.2 GB  13 (4 cores x 3.25)     high
m2.4xlarge   68.4 GB  26 (8 cores x 3.25)     high
cc1.4xlarge  23 GB    33.5 (2 x Xeon X5570)   very high

Slide 35

Slide 35 text

Performance Issues
• Missing indexes
  • Performance terrible without indexes
  • Index on the fly
• Store array sizes in collection
• OR vs IN
• Redundant updates
  • Events not consolidated
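
Three of the issues above lend themselves to a short illustration: the `$or` and `$in` query shapes, keeping an array's size in a stored counter, and consolidating duplicate change events before processing. The sketch builds plain Python dictionaries in MongoDB's query and update syntax without calling a driver; the field names (`deck_id`, `card_ids`, `card_count`) and all function names are assumptions for the example.

```python
from typing import Iterable, List, Tuple


def or_query(deck_ids: List[int]) -> dict:
    """The shape the talk advises against: one $or clause per id."""
    return {"$or": [{"deck_id": d} for d in deck_ids]}


def in_query(deck_ids: List[int]) -> dict:
    """The recommended shape: a single $in clause over the same ids."""
    return {"deck_id": {"$in": list(deck_ids)}}


def add_card_update(card_id: int) -> dict:
    """Append a card and bump a stored counter in the same update,
    so readers never have to measure the array to learn its size."""
    return {"$push": {"card_ids": card_id}, "$inc": {"card_count": 1}}


def consolidate(events: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Collapse duplicate (user_id, deck_id) change events so each pair is
    processed once, preserving the order of first appearance."""
    return list(dict.fromkeys(events))
```

Both query shapes match the same documents; they differ only in how the server has to evaluate them, which is where the lesson "use IN instead of OR" comes from.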

Slide 36

Slide 36 text

Lessons Learned

Slide 37

Slide 37 text

Key Lessons
• Amazon great, but plan for failure
• Leverage test platforms
• Use replica sets & partitions early
• Indexes critical
• Use IN instead of OR
• Java API cumbersome, but solid
• Design schema carefully

Slide 38

Slide 38 text

Q & A

Slide 39

Slide 39 text

Contact us
• Web: http://www.studyblue.com
• Twitter: @StudyBlue
• Email: [email protected]