Slide 1

Slide 1 text

Built to scale – Cloud Computing and NoSQL databases Sridhar Nanjundeswaran, @snanjund MongoDB, Inc.

Slide 2

Slide 2 text

10gen is now MongoDB
• 280+ employees
• 500+ customers
• Over $81 million in funding
• Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney

Slide 3

Slide 3 text

Public Cloud Forecasts
[Chart: forecasts for 2011 to 2015 from Gartner, Ovum, Forrester and IDC, in billions of dollars. Data labels: $16.7, $34.6, $18.2, $52.5, $24.9, $94.5, $28.2, $72.8.]
What is included here?

Slide 4

Slide 4 text

Why should I consider it?
• Focus on your core
• Flexibility
• Agility
• Cost

Slide 5

Slide 5 text

Deployment Models
• Private
• Public
• Hybrid

Slide 6

Slide 6 text

6 The aaS’s aka Service Model

Slide 7

Slide 7 text

What is common?
• Shared
• Self-service
• Elastic scaling
• Usage-based pricing

Slide 8

Slide 8 text

NoSQL - History repeats itself??

Slide 9

Slide 9 text

9 What Do You Remember about 1969?

Slide 10

Slide 10 text

10 nothing? hold that thought

Slide 11

Slide 11 text

Back to the Future: NoSQL?
• IBM’s IMS (1969)
  – Developed as part of the Apollo Project
• IDS (Integrated Data Store), navigational database, 1973
• High performance but:
  – Forced developers to worry about both query design and schema design upfront
  – Made it hard to change anything mid-stream

Slide 12

Slide 12 text

Enter SQL
• Designed to overcome these deficiencies
  – Decoupled query design from schema design
  – Allowed developers to focus on schema design
  – Gave developers confidence that they could query the data as they wanted later
• 30 years of dominance later…

Slide 13

Slide 13 text

13 … the present …

Slide 14

Slide 14 text

14 RDBMS Is Like a Spreadsheet

Slide 15

Slide 15 text

15 With “Relations” Between Rows

Slide 16

Slide 16 text

16 Lots of relations. Lots of rows.

Slide 17

Slide 17 text

17 It Hides What You’re Really Doing

Slide 18

Slide 18 text

18 It Makes Development Hard Relational Database Object Relational Mapping Application Code XML Config DB Schema

Slide 19

Slide 19 text

19 And Makes Things Hard to Change New Table New Table New Column Name Pet Phone Email New Column 3 months later…

Slide 20

Slide 20 text

20 RDBMS Scale = Bigger Computers “Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.” IBM Press Release 28 Aug, 2012

Slide 21

Slide 21 text

This Was a Problem for Google
• 2010 search index size: 100,000,000 GB
• New data added per day: 100,000+ GB
• 250,000+ MBP’s == 4.1 miles
• Databases they could use: 0
Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html

Slide 22

Slide 22 text

22 And for Facebook 2010: 13,000,000 queries per second

Slide 23

Slide 23 text

And for Facebook
2010: 13,000,000 queries per second
TPC Top Results
• TPC #1 DB: 504,161 tps

Slide 24

Slide 24 text

And for Facebook
2010: 13,000,000 queries per second
TPC Top Results
• TPC #1 DB: 504,161 tps
• Top 10 combined: 1,370,368 tps
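The gap between these figures can be made concrete with a little arithmetic. The sketch below only divides the numbers quoted on the slide. It assumes, loosely, that one query per second and one TPC transaction per second are comparable units, which they are not strictly; treat the ratios as an order-of-magnitude illustration rather than a benchmark.

```python
# Figures exactly as quoted on the slide.
facebook_qps = 13_000_000      # Facebook, 2010, queries per second
tpc_top_tps = 504_161          # TPC #1 result, transactions per second
tpc_top10_tps = 1_370_368      # top 10 TPC results combined


def ratio(load, capacity):
    """How many times larger the load is than the quoted capacity."""
    return load / capacity


print(f"vs. TPC #1 DB:       {ratio(facebook_qps, tpc_top_tps):.1f}x")
print(f"vs. top 10 combined: {ratio(facebook_qps, tpc_top10_tps):.1f}x")
```

Under that assumption the quoted load is roughly 26 times the top single result and roughly 9.5 times the top ten combined.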

Slide 25

Slide 25 text

The world is changing
Variety of Data
• Unstructured data
• Semi-structured data
• Polymorphic data
Volume/Velocity of Data
• Petabytes of data
• Trillions of records
• Millions of queries per second
Agile Development
• Iterative
• Short development cycles
• New workloads
New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing

Slide 26

Slide 26 text

26 Shift in What We’re Computing

Slide 27

Slide 27 text

Living in the Post-transactional Future
• Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL)
• Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS)
• Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL)

Slide 28

Slide 28 text

Shift in How We Develop Applications
“Systems of Engagement are built by front-line developers using modern languages who are driven by time to market, the need for rapid deployment and iteration…. They value solutions that make it easy for them to deploy their application code with as little friction as possible.” (Forrester 2013)

Slide 29

Slide 29 text

29 Developers Are More Productive Application Code Relational Database Object Relational Mapping XML Config DB Schema

Slide 30

Slide 30 text

30 Developers Are More Productive Application Code Relational Database Object Relational Mapping XML Config DB Schema

Slide 31

Slide 31 text

31

Slide 32

Slide 32 text

… why are people using NoSQL – some examples …

Slide 33

Slide 33 text

Agility and Flexibility: RDBMS vs. MongoDB
{
  _id : ObjectId("4c4ba5e5e8aabf3"),
  employee_name : "Dunham, Justin",
  department : "Marketing",
  title : "Product Manager, Web",
  report_up : "Neray, Graham",
  pay_band : "C",
  benefits : [
    { type : "Health", plan : "PPO Plus" },
    { type : "Dental", plan : "Standard" }
  ]
}
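A rough illustration of why this shape is flexible: the sketch below holds the same employee record as a plain Python dict standing in for a stored document. No MongoDB driver is used, the `_id` is kept as an ordinary string, and the helper `plan_for` and the `office` field are hypothetical names introduced only for this example.

```python
# The employee record from the slide, as a plain dict (a stand-in for a document).
employee = {
    "_id": "4c4ba5e5e8aabf3",  # plain string here; the slide wraps it in ObjectId(...)
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def plan_for(doc, benefit_type):
    """Return the plan name for one benefit type, or None if it is absent."""
    for benefit in doc.get("benefits", []):
        if benefit["type"] == benefit_type:
            return benefit["plan"]
    return None


# Nested data is read directly, with no join against a separate benefits table.
print(plan_for(employee, "Dental"))

# A new attribute is added to this one record without altering any schema.
employee["office"] = "Palo Alto"  # hypothetical field, not on the slide
```

In a relational layout the benefits would typically live in their own table and the new attribute would need a column added for every row; here both changes stay local to the one record.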

Slide 34

Slide 34 text

Example
Serves targeted content to users using MongoDB-powered identity system
Problem
• 20M+ unique visitors per month
• Rigid relational schema unable to evolve with changing data types and new features
• Slow development cycles
Why MongoDB
• Easy-to-manage dynamic data model enables limitless growth, interactive content
• Support for ad hoc queries
• Highly extensible
Results
• Rapid rollout of new features
• Customized, social conversations throughout site
• Tracks user data to increase engagement, revenue

Slide 35

Slide 35 text

Scalability
Auto-Sharding
• Increase capacity as you go
• Commodity and cloud architectures
• Improved operational simplicity and cost visibility
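The idea of increasing capacity as you go can be sketched in a few lines. This is a deliberately simplified model, not MongoDB's actual balancer: it assumes a fixed number of chunks, maps each key to a chunk with a hash, and moves chunks to a newly added shard until the load is even. `NUM_CHUNKS`, `chunk_for`, `assign_chunks` and `rebalance` are hypothetical names chosen for the illustration.

```python
import hashlib
from collections import Counter

NUM_CHUNKS = 12  # hypothetical fixed chunk count, chosen for the example


def chunk_for(key):
    """Map a shard-key value to a chunk number using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_CHUNKS


def assign_chunks(shards):
    """Spread the chunks round-robin across the given shards."""
    return {chunk: shards[chunk % len(shards)] for chunk in range(NUM_CHUNKS)}


def rebalance(assignment, new_shard):
    """Move chunks from overloaded shards to new_shard until it holds its share."""
    assignment = dict(assignment)
    shards = set(assignment.values()) | {new_shard}
    target = NUM_CHUNKS // len(shards)
    load = Counter(assignment.values())
    moved = []
    for chunk in sorted(assignment):
        if load[new_shard] >= target:
            break
        owner = assignment[chunk]
        if load[owner] > target:
            assignment[chunk] = new_shard
            load[owner] -= 1
            load[new_shard] += 1
            moved.append(chunk)
    return assignment, moved


before = assign_chunks(["shard1", "shard2"])
after, moved = rebalance(before, "shard3")
print("chunks moved:", moved)
print("load per shard:", dict(Counter(after.values())))
```

Only the chunks that change owner have to be copied, so the existing shards keep serving the rest of the data while the new one is brought in.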

Slide 36

Slide 36 text

Case Study
Manages a wide range of content and services for its web properties using MongoDB
Problem
• Trouble dealing with a huge variety of content
• MySQL unable to keep up with performance and scalability requirements
• Problems compounded by integrating information from T-Mobile joint venture
Why MongoDB
• Move from 6 billion rows in RDBMS to simplicity of 1 document
• Automated failover and ability to add nodes without downtime
• “Blazingly fast” query performance: “blown away by [MongoDB’s] performance”
Results
• Significant performance gains despite big increase in volume and variety of data
• Greater agility, faster development iteration
• Saved £2m in licenses and hardware

Slide 37

Slide 37 text

Better Total Cost of Ownership (TCO)
Developer/Ops Savings
• Ease of Use
• Agile development
• Less maintenance
Hardware Savings
• Commodity servers/cloud
• Internal storage (no SAN)
• Scale out, not up
Software/Support Savings
• No upfront license
• Cost visibility for usage growth
DB Alternative

Slide 38

Slide 38 text

Case Study
Stores one of world’s largest record repositories and searchable catalogues in MongoDB
Problem
• One of world’s largest record repositories
• Move to SOA required new approach to data store
• RDBMS could not support centralized data mgt and federation of information services
Why MongoDB
• Fast, easy scalability
• Full query language
• Complex metadata storage
Results
• Delivers high scalability, fast performance, and easy maintenance, while keeping support costs low
• Will scale to 100s of TB by 2013, PB by 2020
• Searchable catalogue of varied data types
• Decreased SW and support costs

Slide 39

Slide 39 text

Performance
• Better Data Locality
• In-Memory Caching
• In-Place Updates

Slide 40

Slide 40 text

Case Study
Uses MongoDB to safeguard over 6 billion images served to millions of customers
Problem
• 6B images, 20TB of data
• Brittle code base on top of Oracle database – hard to scale, add features
• High SW and HW costs
Why MongoDB
• JSON-based data model
• Agile, high performance, scalable
• Alignment with Shutterfly’s services-based architecture
Results
• 5x cost reduction
• 9x performance improvement
• Faster time-to-market
• Dev cycles in weeks vs. tens of months

Slide 41

Slide 41 text

41 … the future…

Slide 42

Slide 42 text

42 NoSQL Adoption First NoSQL Project Multiple NoSQL Projects Multiple NoSQL Projects NoSQL Centre of Excellence NoSQL First Policy

Slide 43

Slide 43 text

43 NoSQL: The New Normal RDBMSs Meet Requirements Key/Value or Column Stores Meet Requirements Document Store Meets Requirements

Slide 44

Slide 44 text

Is Polyglot the new future?

Slide 45

Slide 45 text

General Purpose, High Performance
Database Popularity: Jobs, Searches, Mentions, Etc.
Source: DB-Engines, Aug 2013

Slide 46

Slide 46 text

Cloud + NoSQL – Marriage made in heaven?

Slide 47

Slide 47 text

47 Easy experimentation ? Replication Database Cluster

Slide 48

Slide 48 text

48 Shard 1 Easy Scaling Shard 2

Slide 49

Slide 49 text

49 Shard 1 Easy Scaling Shard 2 Capture

Slide 50

Slide 50 text

50 Shard 1 Easy Scaling Shard 2 Play

Slide 51

Slide 51 text

51 Shard 1 Easy Scaling Shard 2 Shard 3

Slide 52

Slide 52 text

52 Easy Recovery

Slide 53

Slide 53 text

53 Easy Recovery

Slide 54

Slide 54 text

54 Easy Recovery

Slide 55

Slide 55 text

55 Easy Recovery
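The recovery sequence on these slides, together with the "automated failover" mentioned in the earlier case study, can be pictured with a toy model. The sketch below is an assumption-laden simplification and not MongoDB's real election protocol: it promotes the healthy member with the most recently applied write, and only when a majority of members is still reachable. `Node`, `last_applied` and `elect_primary` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool
    last_applied: int  # hypothetical position in the replication log


def elect_primary(nodes):
    """Pick the healthy node with the newest data, if a majority is healthy."""
    healthy = [node for node in nodes if node.healthy]
    if len(healthy) * 2 <= len(nodes):
        return None  # no majority, so no primary can be chosen safely
    return max(healthy, key=lambda node: node.last_applied)


# The old primary "a" has failed; "b" has replicated more writes than "c".
replica_set = [
    Node("a", healthy=False, last_applied=105),
    Node("b", healthy=True, last_applied=104),
    Node("c", healthy=True, last_applied=101),
]

new_primary = elect_primary(replica_set)
print("new primary:", new_primary.name if new_primary else "none")
```

Requiring a majority is what keeps two isolated groups of members from each promoting their own primary at the same time.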

Slide 56

Slide 56 text

DBaaS
• Database as a Service
• Easier for devops
• Centers of excellence

Slide 57

Slide 57 text

Cloud and NoSQL advantages - recap
Cloud
• Focus on your core
• Flexibility
• Agility
• Cost
NoSQL
• Developer productivity due to focus
• Flexibility
• Agility
• Cost
• Performance
• Scalability

Slide 58

Slide 58 text

58 Cloud + NoSQL

Slide 59

Slide 59 text

59 All my problems solved?

Slide 60

Slide 60 text

What to watch for?
• Vendor lock-in?
• Capabilities
  – DR
  – Added services
• Cost
• May not be optimized for your workload
• Security
• Change control

Slide 61

Slide 61 text

What can I do?
• Consider multiple vendors
  – Companies that explicitly do that
• Capabilities
  – Use multiple cloud vendors
• Cost
  – Analyze and understand
  – Private Cloud + Public Cloud to expand

Slide 62

Slide 62 text

Sridhar Nanjundeswaran @snanjund