MongoDB World 2016: MongoDB and Google Cloud

MongoDB and Google Cloud Platform
Sandeep Parikh, Head of Solutions, Americas East
@crcsmnky

Agenda
• MongoDB Deployment Architectures
• The Finer Points of Configuration
• Deploying MongoDB on Google Cloud Platform
• Integrating with Google Cloud Platform

MongoDB Deployment Architectures

Replica Sets Across Zones

Replica Sets Across Regions

Sharded Cluster Across Regions

The Finer Points of Configuration

Machine Types
• Standard: balanced CPU and memory configurations; 3.75 GB of RAM per vCPU. Good for getting started.
• High Memory: higher memory per core; 6.5 GB of RAM per vCPU. Best for MongoDB workloads.
• High Compute: higher CPU relative to memory; 0.9 GB of RAM per vCPU. Skip it, you probably don’t need the compute.
• Shared Core: mostly available single CPU; 3.75 GB of RAM per vCPU. g1-small suitable for Arbiter, maybe.
• Custom: independently scale CPU and RAM; max 6.5 GB of RAM per vCPU. Configure one that fits your working set.
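The RAM-per-vCPU ratios above can be turned into a rough sizing estimate. This is a minimal sketch, assuming the goal is to keep the working set in RAM; the helper name `vcpus_for_working_set` and the 20% headroom factor are hypothetical and not from the talk, and real predefined machine types only come in fixed vCPU counts.

```python
import math

# GB of RAM per vCPU, as listed on the Machine Types slide.
RAM_PER_VCPU_GB = {
    "standard": 3.75,
    "high-memory": 6.5,
    "high-compute": 0.9,
}


def vcpus_for_working_set(working_set_gb, family, headroom=1.2):
    """Smallest vCPU count whose RAM covers the working set plus headroom.

    `headroom` is an illustrative assumption (extra room for the OS and
    connections), not a figure from the talk.
    """
    needed_gb = working_set_gb * headroom
    return math.ceil(needed_gb / RAM_PER_VCPU_GB[family])


if __name__ == "__main__":
    # Compare how many vCPUs each family needs for a 40 GB working set.
    for family in RAM_PER_VCPU_GB:
        print(family, vcpus_for_working_set(40, family))
```

The comparison shows why the high-memory family suits memory-bound MongoDB workloads: it reaches the same RAM with far fewer vCPUs.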

Disks

Type       Storage     Max read IOPS  Max write IOPS  Price          Capacity
Standard   Persistent  3,000          15,000          $0.04 per GB   Up to 64 TB
SSD        Persistent  15,000         15,000          $0.17 per GB   Up to 64 TB
Local SSD  Ephemeral   680,000        360,000         $0.218 per GB  375 GB only

Storage Bits
• IOPS scale with size
• 500 GB PD-SSD is the sweet spot
• Better off with fewer, larger volumes
• No separate data/journal/log
• Data is encrypted at rest, automatically, once it leaves the instance
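To illustrate "IOPS scale with size", here is a small sketch that estimates a persistent disk's IOPS from its size. The per-disk caps come from the Disks slide; the per-GB rates are assumptions chosen for illustration (they are consistent with a 500 GB PD-SSD reaching the cap), so check the current Compute Engine documentation before relying on them.

```python
# Caps are the per-disk maximums from the Disks slide.
# The per-GB rates are ASSUMPTIONS for illustration only.
DISK_TYPES = {
    "pd-standard": {"read_per_gb": 0.75, "write_per_gb": 1.5,
                    "max_read": 3000, "max_write": 15000},
    "pd-ssd": {"read_per_gb": 30.0, "write_per_gb": 30.0,
               "max_read": 15000, "max_write": 15000},
}


def estimated_iops(disk_type, size_gb):
    """Return (read_iops, write_iops): linear in size until the cap."""
    spec = DISK_TYPES[disk_type]
    read = min(size_gb * spec["read_per_gb"], spec["max_read"])
    write = min(size_gb * spec["write_per_gb"], spec["max_write"])
    return int(read), int(write)


if __name__ == "__main__":
    for size in (100, 500, 1000):
        print(size, estimated_iops("pd-ssd", size))
```

Under these assumed rates, a PD-SSD gains nothing in IOPS beyond 500 GB, which matches the "sweet spot" remark, and one large volume outperforms several small ones.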

Deploying MongoDB on Google Cloud Platform

Manually Deploying MongoDB

Google Cloud Launcher

MongoDB Cloud Manager
How do you automate this?

Cloud Deployment Manager
• Provision and configure your deployment
• Configuration as code
• Declarative approach to configuration
• Template-driven
• Supports YAML, Jinja, and Python
• Use schemas to constrain parameters
• References control order and dependencies

Bootstrapping MongoDB
• Cloud Manager
• Deployment Manager
• Template

Bootstrapping Cloud Manager
• Schema, Configuration & Template posted on GitHub: https://github.com/GoogleCloudPlatform/mongodb-cloud-manager
• Three Compute Engine instances, each with 500 GB PD-SSD
• MongoDB Cloud Manager automation agent pre-installed and configured

$ gcloud deployment-manager deployments create mongodb-cloud-manager \
    --config mongodb-cloud-manager.jinja \
    --properties mmsGroupId=MMSGROUPID,mmsApiKey=MMSAPIKEY

Schema
• Defines required properties for deployment: machineType, zone, mmsGroupId, mmsApiKey
• Use supplied defaults or override at runtime
• Constrain input by type and regex or filter
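The schema's behaviour (required properties, defaults that can be overridden, regex constraints) can be mimicked in a few lines of plain Python. This is only a sketch of the idea, not the repository's schema file: the default values and every regular expression below are hypothetical.

```python
import re

# Hypothetical rules mirroring the slide: two properties with defaults,
# two required, each constrained by a regular expression.
SCHEMA = {
    "machineType": {"default": "n1-highmem-2", "pattern": r"[a-z0-9-]+"},
    "zone": {"default": "us-central1-f", "pattern": r"[a-z]+-[a-z]+\d-[a-z]"},
    "mmsGroupId": {"pattern": r"[0-9a-f]{24}"},
    "mmsApiKey": {"pattern": r"[0-9a-f]{32}"},
}


def resolve_properties(supplied):
    """Apply defaults, then reject missing, unknown, or malformed values."""
    unknown = set(supplied) - set(SCHEMA)
    if unknown:
        raise ValueError("unknown properties: %s" % sorted(unknown))
    resolved = {}
    for name, rule in SCHEMA.items():
        if name in supplied:
            value = supplied[name]          # runtime override
        elif "default" in rule:
            value = rule["default"]         # supplied default
        else:
            raise ValueError("missing required property: %s" % name)
        if not isinstance(value, str) or not re.fullmatch(rule["pattern"], value):
            raise ValueError("invalid value for %s: %r" % (name, value))
        resolved[name] = value
    return resolved
```

Validating before provisioning means a mistyped zone or key fails immediately instead of partway through instance creation.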

Configuration
• Imports instance template
• Three resources of instance template
• Pass in properties from schema

Instance Template
• Two resources, disk and instance
• Inherits properties from parent template
• References ensure creation order
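The published template is written in Jinja; since Deployment Manager also accepts Python templates, the same two-resource shape might look roughly like the sketch below. It is not the repository's template: the `sourceImage` property and the `-data` disk suffix are hypothetical, and the agent installation step is omitted.

```python
COMPUTE_URL = "https://www.googleapis.com/compute/v1"


def GenerateConfig(context):
    """Deployment Manager entry point: return the resources to create."""
    name = context.env["name"]
    project = context.env["project"]
    props = context.properties
    zone_url = "%s/projects/%s/zones/%s" % (COMPUTE_URL, project, props["zone"])
    disk_name = name + "-data"  # hypothetical naming convention

    disk = {
        "name": disk_name,
        "type": "compute.v1.disk",
        "properties": {
            "zone": props["zone"],
            "sizeGb": 500,
            "type": zone_url + "/diskTypes/pd-ssd",
        },
    }
    instance = {
        "name": name,
        "type": "compute.v1.instance",
        "properties": {
            "zone": props["zone"],
            "machineType": zone_url + "/machineTypes/" + props["machineType"],
            "disks": [
                # Boot disk; sourceImage is a hypothetical property.
                {"boot": True, "autoDelete": True,
                 "initializeParams": {"sourceImage": props["sourceImage"]}},
                # The reference makes the instance wait for the data disk.
                {"source": "$(ref.%s.selfLink)" % disk_name,
                 "autoDelete": False},
            ],
            "networkInterfaces": [{
                "network": COMPUTE_URL + "/projects/" + project
                           + "/global/networks/default",
            }],
        },
    }
    return {"resources": [disk, instance]}
```

The `$(ref.NAME.selfLink)` expression is what establishes creation order: Deployment Manager resolves it only after the disk exists, so the instance is created second.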

Integrating with Google Cloud Platform

MongoDB in Google Cloud Ecosystem

Downstream Use Cases
• Backups
• Data Warehouse
• Analytics
• Applications
• ETL
• Machine Learning

Google Research in Data Technologies
Timeline, 2002 to 2013: GFS, MapReduce, BigTable, Colossus, Dremel, Flume, Megastore, Spanner, Millwheel, PubSub, F1

Google Research Publications referenced are available here: http://research.google.com/pubs/papers.html
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009: http://research.google.com/pubs/pub35290.html

Google Research in Data Technologies
Corresponding products, in the same order as the technologies on the previous slide: Cloud Storage, Dataproc, Bigtable, Cloud Storage, BigQuery, Dataflow, Datastore, Spanner, Dataflow, PubSub, F1

Backup Snapshots

> db.adminCommand({fsync:1, lock:true})
$ sudo sync
$ sudo fsfreeze -f /mnt/your-disk
$ gcloud compute disks snapshot DISK
$ sudo fsfreeze -u /mnt/your-disk
> db.fsyncUnlock()
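When scripting this sequence, the important property is that the filesystem is always thawed (`fsfreeze -u`) and the database always unlocked, even if the snapshot fails. A minimal sketch follows, assuming the caller supplies `lock` and `unlock` callables (for example thin wrappers around the fsync lock and unlock commands); the function name and its parameters are hypothetical.

```python
import subprocess


def snapshot_disk(lock, unlock, mount_point, disk, run=subprocess.check_call):
    """Lock, freeze, snapshot, then always thaw and unlock.

    `lock` / `unlock` are caller-supplied callables that issue the
    database lock and unlock; `run` executes one argv list and raises
    on failure. All three are injected so the flow can be dry-run.
    """
    lock()
    try:
        run(["sudo", "sync"])
        run(["sudo", "fsfreeze", "-f", mount_point])      # freeze
        try:
            run(["gcloud", "compute", "disks", "snapshot", disk])
        finally:
            run(["sudo", "fsfreeze", "-u", mount_point])  # thaw
    finally:
        unlock()


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    snapshot_disk(lambda: print("lock"), lambda: print("unlock"),
                  "/mnt/your-disk", "DISK",
                  run=lambda argv: print(" ".join(argv)))
```

Nesting the two `try`/`finally` blocks keeps cleanup in the reverse order of setup, so a failed snapshot never leaves writes blocked.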

Downstream Backups
• Cloud Storage: Standard (hot) & Nearline (warm)
• mongodump (BSON): pure backup/restore, Hadoop
• mongoexport (JSON, CSV): Cloud Dataflow, BigQuery

Analytics
Managed Hadoop and Spark with Cloud Dataproc
• Separation of storage and compute
• Spin up clusters of any size in ~90 seconds
• Preemptible VMs are 70% cheaper
• Per-minute billing
• Run multiple clusters segregated by job or function
• Run against backups or via Hadoop Connector or Spark Connector
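The effect of the 70% preemptible discount and per-minute billing on a short-lived cluster can be estimated with simple arithmetic. In this sketch the hourly price is a made-up placeholder, and the model ignores the Dataproc service fee, disks, and any minimum billing increment.

```python
def cluster_cost(hourly_price, minutes, standard_workers,
                 preemptible_workers, preemptible_discount=0.70):
    """Approximate VM cost of a cluster billed by the minute.

    `hourly_price` is a hypothetical per-VM price; the 0.70 discount is
    the "70% cheaper" figure from the slide.
    """
    hours = minutes / 60.0
    effective_workers = (standard_workers
                         + preemptible_workers * (1.0 - preemptible_discount))
    return hourly_price * hours * effective_workers


if __name__ == "__main__":
    # Ten workers for 30 minutes at a placeholder price of 0.10 per hour.
    print(cluster_cost(0.10, 30, standard_workers=10, preemptible_workers=0))
    print(cluster_cost(0.10, 30, standard_workers=2, preemptible_workers=8))
```

With these placeholder numbers, swapping eight of ten workers for preemptible VMs cuts the VM cost by more than half.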

Extract, Transform, Load
Batch and stream data processing with Cloud Dataflow
• Intuitive data-processing framework
• Fully managed, no-ops
• Autoscaling mid-job
• Dynamic rebalancing mid-job
• Pull data from multiple sources for ETL jobs

Data Warehousing
Petabyte-scale data warehousing with BigQuery
• Supports SQL and JSON fields
• Fast, and scales storage and compute independently
• No setup or administration
• Stream in up to 100,000 rows/sec using mongobq
• Import JSON or CSV from Cloud Storage
• Run Dataflow jobs to transform and insert into BigQuery
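One practical wrinkle when importing exported JSON: MongoDB's extended JSON wraps some values in objects whose keys start with `$` (such as `$oid` and `$date`), which are not valid BigQuery field names. The sketch below unwraps a few common wrappers before writing newline-delimited JSON; the function names are hypothetical and only three wrapper types are handled.

```python
import json

# Extended JSON wrappers this sketch knows how to unwrap.
WRAPPERS = ("$oid", "$date", "$numberLong")


def unwrap(value):
    """Recursively replace {"$oid": x}-style wrappers with plain values."""
    if isinstance(value, dict):
        if len(value) == 1:
            (key, inner), = value.items()
            if key == "$numberLong":
                return int(inner)
            if key in WRAPPERS:
                return unwrap(inner)
        return {k: unwrap(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap(v) for v in value]
    return value


def to_ndjson(lines):
    """Yield one cleaned JSON document per non-empty input line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.dumps(unwrap(json.loads(line)), sort_keys=True)


if __name__ == "__main__":
    sample = ['{"_id": {"$oid": "abc"}, "n": {"$numberLong": "5"}}']
    for row in to_ndjson(sample):
        print(row)
```

The cleaned lines can then be written to a file, copied to Cloud Storage, and loaded as newline-delimited JSON.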

Applications
Run apps via multiple platforms
• Compute Engine using standard instances
• Container Engine for Kubernetes-native apps
• App Engine Flex for Dockerized apps

Machine Learning
Machine learning at scale with Cloud ML
• Powerful image analysis
• Powerful speech recognition
• Fast, dynamic translation
• Trainable, scalable linear and logistic regression

Kubernetes
• MongoDB in Kubernetes is... non-trivial
• Possible today with shipping Kubernetes
• But some potential issues around Pod rescheduling and persistent volumes in 1.2
• Some good recipes out there to solve now
• PetSet: improved support for stateful services, coming in Kubernetes 1.3

Build What’s Next

Questions, Comments, Resources
• @crcsmnky
• https://cloud.google.com/solutions/deploy-mongodb
• https://github.com/GoogleCloudPlatform/mongodb-cloud-manager
