MongoDB World 2016: MongoDB and Google Cloud

Sandeep Parikh

June 28, 2016

Transcript

  1. MongoDB and Google Cloud Platform
    Sandeep Parikh
    Head of Solutions, Americas East
    @crcsmnky

  2. Agenda
    MongoDB Deployment Architectures
    The Finer Points of Configuration
    Deploying MongoDB on Google Cloud Platform
    Integrating with Google Cloud Platform

  3. MongoDB Deployment Architectures

  4. Replica Sets Across Zones

  5. Replica Sets Across Regions

  6. Sharded Cluster Across Regions

  7. The Finer Points of Configuration

  8. Machine Types
    Standard: Balanced CPU and memory configurations; 3.75 GB of RAM per vCPU
    High Memory: Higher memory per core; 6.5 GB of RAM per vCPU
    High Compute: Higher CPU relative to memory; 0.9 GB of RAM per vCPU
    Shared Core: Mostly available single CPU; 3.75 GB of RAM per vCPU
    Custom: Independently scale CPU and RAM; max 6.5 GB of RAM per vCPU

  9. Machine Types
    Standard: Balanced CPU and memory configurations; 3.75 GB of RAM per vCPU. Good for getting started
    High Memory: Higher memory per core; 6.5 GB of RAM per vCPU. Best for MongoDB workloads
    High Compute: Higher CPU relative to memory; 0.9 GB of RAM per vCPU. Skip it, you probably don’t need the compute
    Shared Core: Mostly available single CPU; 3.75 GB of RAM per vCPU. g1-small suitable for Arbiter, maybe
    Custom: Independently scale CPU and RAM; max 6.5 GB of RAM per vCPU. Configure one that fits your working set
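
The advice to configure a custom machine type that fits the working set can be sketched as a small sizing helper. This is a rough illustration, not an official sizing tool: the 6.5 GB per vCPU ceiling comes from the slide, while the 256 MB memory step, the "1 or an even number of vCPUs" rule, the `custom-VCPUS-MEMORYMB` name format, and the `headroom` factor are assumptions added here. The function name is hypothetical.

```python
import math

MAX_GB_PER_VCPU = 6.5  # ceiling stated on the slide
MB_STEP = 256          # assumption: custom memory is set in 256 MB steps


def custom_machine_type(working_set_gb, headroom=1.25):
    """Suggest a custom machine type whose RAM covers the working set.

    headroom is a hypothetical safety factor for the OS and connections.
    """
    # Round the memory target up to the next 256 MB step.
    target_mb = math.ceil(working_set_gb * headroom * 1024 / MB_STEP) * MB_STEP
    # Smallest vCPU count that keeps memory within 6.5 GB per vCPU.
    vcpus = math.ceil(target_mb / (MAX_GB_PER_VCPU * 1024))
    # Assumption: vCPU count must be 1 or an even number.
    if vcpus > 1 and vcpus % 2:
        vcpus += 1
    return "custom-{}-{}".format(vcpus, target_mb)


if __name__ == "__main__":
    print(custom_machine_type(20))
```

The resulting string could then be passed as the machine type when creating an instance, after checking it against current Compute Engine limits.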

  10. Disks
    Standard: Persistent storage; max 3,000 read IOPS, max 15,000 write IOPS; $0.04 per GB per month; up to 64 TB
    SSD: Persistent storage; max 15,000 read IOPS, max 15,000 write IOPS; $0.17 per GB per month; up to 64 TB
    Local SSD: Ephemeral storage; max 680,000 read IOPS, max 360,000 write IOPS; $0.218 per GB per month; 375 GB only

  11. Google Cloud Platform 11

  12. Storage Bits
    IOPS scale with size: 500 GB PD-SSD is the sweet spot
    Better off with fewer, larger volumes: no need for separate data, journal, and log volumes
    Data is encrypted at rest, automatically, once it leaves the instance
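
Why 500 GB is the sweet spot for PD-SSD can be shown with a little arithmetic. In the sketch below, the IOPS caps and prices are taken from the Disks slide, but the per-GB IOPS rates (30 per GB for PD-SSD, 0.75 read per GB for standard) are assumptions based on figures published around the time of the talk. They are not on the slides and may have changed since.

```python
# Caps and prices come from the Disks slide; per-GB rates are assumptions.
DISKS = {
    "pd-standard": {"read_per_gb": 0.75, "read_cap": 3000, "price": 0.04},
    "pd-ssd": {"read_per_gb": 30.0, "read_cap": 15000, "price": 0.17},
}


def read_iops(kind, size_gb):
    """Read IOPS grow linearly with size until the per-disk cap is reached."""
    disk = DISKS[kind]
    return min(size_gb * disk["read_per_gb"], disk["read_cap"])


def monthly_cost(kind, size_gb):
    """Monthly price in USD for a volume of the given size."""
    return size_gb * DISKS[kind]["price"]


if __name__ == "__main__":
    for size in (100, 250, 500, 1000):
        print(size, read_iops("pd-ssd", size), round(monthly_cost("pd-ssd", size), 2))
```

Under these assumptions a PD-SSD volume reaches the 15,000 IOPS cap at 500 GB, so a larger volume adds capacity and cost but no further IOPS.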

  13. Deploying MongoDB on Google Cloud Platform

  14. Manually Deploying MongoDB

  15. Google Cloud Launcher

  16. MongoDB Cloud Manager

  17. MongoDB Cloud Manager
    How do you automate this?

  18. Cloud Deployment Manager
    Provision, configure your deployment
    Configuration as code
    Declarative approach to configuration
    Template-driven
    Supports YAML, Jinja, and Python
    Use schemas to constrain parameters
    References control order and dependencies

  19. Bootstrapping MongoDB Cloud Manager
    Deployment Manager Template

  20. Bootstrapping Cloud Manager
    Schema, Configuration & Template
    Posted on GitHub: https://github.com/GoogleCloudPlatform/mongodb-cloud-manager
    Three Compute Engine instances, each with 500 GB PD-SSD
    MongoDB Cloud Manager automation agent pre-installed and configured
    $ gcloud deployment-manager deployments create mongodb-cloud-manager \
        --config mongodb-cloud-manager.jinja \
        --properties mmsGroupId=MMSGROUPID,mmsApiKey=MMSAPIKEY

  21. Schema
    Defines required properties for deployment: machineType, zone, mmsGroupId, mmsApiKey
    Use supplied defaults or override at runtime
    Constrain input by type and regex or filter
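
How a schema applies defaults and constrains input can be illustrated with a small stand-alone sketch. This is not Deployment Manager's own validator, and it is not the schema from the repository: the default values and the regular expressions below are made up for illustration, and `resolve_properties` is a hypothetical helper. Only the four property names come from the slide.

```python
import re

# Illustrative schema; defaults and patterns are assumptions, not the repo's.
SCHEMA = {
    "required": ["machineType", "zone", "mmsGroupId", "mmsApiKey"],
    "properties": {
        "machineType": {"type": "string", "default": "n1-highmem-2"},
        "zone": {"type": "string", "default": "us-central1-f",
                 "pattern": r"^[a-z]+-[a-z]+\d-[a-z]$"},
        "mmsGroupId": {"type": "string", "pattern": r"^[0-9a-f]{24}$"},
        "mmsApiKey": {"type": "string", "pattern": r"^[0-9a-f]{32}$"},
    },
}


def resolve_properties(supplied, schema=SCHEMA):
    """Apply defaults, then check type, pattern, and required properties."""
    resolved = {}
    for name, spec in schema["properties"].items():
        if name in supplied:
            value = supplied[name]       # runtime override
        elif "default" in spec:
            value = spec["default"]      # supplied default
        else:
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            raise ValueError("{} must be a string".format(name))
        pattern = spec.get("pattern")
        if pattern and not re.search(pattern, value):
            raise ValueError("{} does not match {}".format(name, pattern))
        resolved[name] = value
    missing = [n for n in schema["required"] if n not in resolved]
    if missing:
        raise ValueError("missing required properties: {}".format(missing))
    return resolved
```

Calling `resolve_properties` with only the two MMS values fills in the machine type and zone from the defaults; leaving out an MMS value raises an error before anything is provisioned.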

  22. Configuration
    Imports instance template
    Three resources of instance template
    Pass in properties from schema
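
The repository's configuration is written in Jinja. Since Deployment Manager also accepts Python templates, the same idea (three resources of one imported instance template, each receiving the schema properties) might look roughly like the sketch below. The file name `instance.py` and the resource naming scheme are hypothetical.

```python
def GenerateConfig(context):
    """Top-level template: three resources, each of the instance template type."""
    # Properties validated by the schema are passed straight through.
    passed = {
        key: context.properties[key]
        for key in ("machineType", "zone", "mmsGroupId", "mmsApiKey")
    }
    resources = []
    for index in range(1, 4):
        resources.append({
            # e.g. mongodb-cloud-manager-1, -2, -3
            "name": "{}-{}".format(context.env["deployment"], index),
            # Hypothetical file name; it must be imported by the configuration.
            "type": "instance.py",
            "properties": dict(passed),
        })
    return {"resources": resources}
```

Generating the three members in a loop keeps them identical, which is usually what a replica set wants.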

  23. Instance Template
    Two resources, disk and instance
    Inherits properties from parent template
    References ensure creation order
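
A Python version of such an instance template might look like the following sketch. It is an illustration under assumptions, not the Jinja template from the repository: the boot image, the default network, the `-data` disk name suffix, and the metadata keys are choices made here. The `$(ref.NAME.selfLink)` reference is what tells Deployment Manager to create the disk before the instance.

```python
COMPUTE = "https://www.googleapis.com/compute/v1"


def GenerateConfig(context):
    """Instance template: one PD-SSD data disk plus the instance that uses it."""
    name = context.env["name"]
    project = context.env["project"]
    props = context.properties  # inherited from the parent template
    zone = props["zone"]
    zone_url = "{}/projects/{}/zones/{}".format(COMPUTE, project, zone)
    disk_name = name + "-data"  # hypothetical naming scheme

    disk = {
        "name": disk_name,
        "type": "compute.v1.disk",
        "properties": {
            "zone": zone,
            "sizeGb": 500,
            "type": zone_url + "/diskTypes/pd-ssd",
        },
    }
    instance = {
        "name": name,
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": "{}/machineTypes/{}".format(zone_url, props["machineType"]),
            "disks": [
                # Boot disk; the image is an assumption, not from the slides.
                {"boot": True, "autoDelete": True, "initializeParams": {
                    "sourceImage": COMPUTE + "/projects/debian-cloud/global/images/family/debian-8"}},
                # The reference makes the instance wait for the data disk.
                {"source": "$(ref.{}.selfLink)".format(disk_name), "autoDelete": False},
            ],
            "networkInterfaces": [{
                "network": "{}/projects/{}/global/networks/default".format(COMPUTE, project),
            }],
            # Assumption: a startup script would read these to configure the agent.
            "metadata": {"items": [
                {"key": "mmsGroupId", "value": props["mmsGroupId"]},
                {"key": "mmsApiKey", "value": props["mmsApiKey"]},
            ]},
        },
    }
    return {"resources": [disk, instance]}
```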

  24. Integrating with Google Cloud Platform

  25. MongoDB in Google Cloud Ecosystem

  26. MongoDB in Google Cloud Ecosystem

  27. Downstream Use Cases
    Backups
    Data Warehouse
    Analytics
    Applications
    ETL
    Machine Learning

  28. Downstream Use Cases
    Backups
    Data Warehouse
    Analytics
    Applications
    ETL
    Machine Learning

  29. Google Research in Data Technologies
    Timeline axis: 2002, 2004, 2006, 2008, 2010, 2012, 2013
    Technologies on the timeline: GFS, MapReduce, BigTable, Colossus, Dremel, Flume, Megastore, Spanner, Millwheel, PubSub, F1
    Google Research Publications referenced are available here: http://research.google.com/pubs/papers.html
    The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009 http://research.google.com/pubs/pub35290.html

  30. Google Research in Data Technologies
    Timeline axis: 2002, 2004, 2006, 2008, 2010, 2012, 2013
    Products on the timeline: Cloud Storage, Dataproc, Bigtable, Cloud Storage, BigQuery, Dataflow, Datastore, Spanner, Dataflow, PubSub, F1
    Google Research Publications referenced are available here: http://research.google.com/pubs/papers.html
    The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009 http://research.google.com/pubs/pub35290.html

  31. Backup Snapshots
    > db.adminCommand({fsync:1, lock:true})
    $ sudo sync
    $ sudo fsfreeze -f /mnt/your-disk
    $ gcloud compute disks snapshot DISK
    $ sudo fsfreeze -u /mnt/your-disk
    > db.fsyncUnlock()
    Diagram: Disk, Snap, Snap
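
The snapshot sequence on this slide (lock writes, flush, freeze the filesystem, snapshot, then unfreeze with `fsfreeze -u` and unlock) can be scripted so that the unfreeze and unlock steps always run, even if the snapshot fails. The sketch below assumes a pymongo `MongoClient` is passed in as `client` and that the `fsyncUnlock` admin command is available on the server. The function name, the `run` hook, and the optional `zone` argument are illustrative additions, not part of the slides.

```python
import subprocess


def snapshot_with_fsync_lock(client, disk, mount_point, zone=None,
                             run=subprocess.check_call):
    """Take a consistent persistent-disk snapshot of a running mongod.

    client is expected to behave like pymongo.MongoClient.
    """
    snapshot_cmd = ["gcloud", "compute", "disks", "snapshot", disk]
    if zone:
        snapshot_cmd += ["--zone", zone]

    # Flush pending writes and block new ones.
    client.admin.command("fsync", lock=True)
    try:
        run(["sudo", "sync"])
        run(["sudo", "fsfreeze", "-f", mount_point])  # freeze
        try:
            run(snapshot_cmd)
        finally:
            run(["sudo", "fsfreeze", "-u", mount_point])  # unfreeze
    finally:
        # Always release the lock, even if a step above failed.
        client.admin.command("fsyncUnlock")
```

A call might look like `snapshot_with_fsync_lock(MongoClient(), "DISK", "/mnt/your-disk")`, run on the instance that hosts the data disk.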

  32. Downstream Backups
    Cloud Storage: Standard (hot) & Nearline (warm)
    mongodump: BSON, for pure backup/restore and Hadoop
    mongoexport: JSON or CSV, for Cloud Dataflow and BigQuery

  33. Analytics
    Managed Hadoop and Spark with Cloud Dataproc
    Separation of storage and compute
    Spin up clusters of any size in ~90 seconds
    Preemptible VMs are 70% cheaper
    Per-minute billing
    Run multiple clusters segregated by job or function
    Run against backups or via Hadoop Connector or Spark Connector
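
The effect of preemptible VMs and per-minute billing on a short-lived cluster can be estimated with simple arithmetic. In this sketch the 70% discount comes from the slide; the per-VM hourly price is a placeholder the caller supplies, the function name is hypothetical, and other charges (such as any service fee or storage) are ignored.

```python
def worker_cost(workers, preemptible, minutes, price_per_vm_hour, discount=0.70):
    """Rough cost of a cluster's workers for a job billed per minute."""
    if not 0 <= preemptible <= workers:
        raise ValueError("preemptible must be between 0 and workers")
    regular = workers - preemptible
    # Each preemptible VM costs (1 - discount) of a regular VM.
    effective_vms = regular + preemptible * (1.0 - discount)
    return effective_vms * price_per_vm_hour * minutes / 60.0
```

For example, with 10 workers of which 8 are preemptible, a one-hour job costs about as much as 4.4 regular VMs would.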

  34. Extract, Transform, Load
    Batch and Stream data processing with Cloud Dataflow
    Intuitive data-processing framework
    Fully-managed - No-Ops
    Autoscaling mid-job
    Dynamic rebalancing mid-job
    Pull data from multiple sources for ETL jobs

  35. Data Warehousing
    Petabyte-scale data warehousing with BigQuery
    Supports SQL and JSON fields
    Fast; storage and compute scale independently
    No setup or administration
    Stream in up to 100,000 rows/sec using mongobq
    Import JSON or CSV from Cloud Storage
    Run Dataflow jobs to transform and insert into BigQuery
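
One practical wrinkle when importing mongoexport JSON into BigQuery is that mongoexport wraps some values in extended JSON objects such as `{"$oid": ...}` or `{"$date": ...}`, and keys starting with `$` are not usable as BigQuery field names. The sketch below unwraps the most common wrappers so each line becomes plain newline-delimited JSON. It is a minimal illustration that handles only `$oid`, `$date`, and `$numberLong`; the function names are hypothetical.

```python
import json


def flatten_extended_json(value):
    """Replace common extended JSON wrappers with plain values."""
    if isinstance(value, dict):
        keys = set(value)
        if keys == {"$oid"}:
            return value["$oid"]          # ObjectId as a hex string
        if keys == {"$date"}:
            return value["$date"]         # left as exported
        if keys == {"$numberLong"}:
            return int(value["$numberLong"])
        return {k: flatten_extended_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [flatten_extended_json(v) for v in value]
    return value


def to_bigquery_rows(lines):
    """Yield one cleaned JSON document per non-empty input line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.dumps(flatten_extended_json(json.loads(line)), sort_keys=True)
```

The cleaned lines could be written to a file, copied to Cloud Storage, and loaded from there, or the same function could be applied inside a Dataflow transform.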

  36. Applications
    Run apps via multiple platforms
    Compute Engine using standard instances
    Container Engine for Kubernetes-native apps
    App Engine Flex for Dockerized apps

  37. Machine Learning
    Machine learning at scale with Cloud ML
    Powerful image analysis
    Powerful speech recognition
    Fast, dynamic translation
    Trainable, scalable linear and logistic regression

  38. Kubernetes
    ● MongoDB in Kubernetes is... non-trivial
    ● Possible today with shipping Kubernetes
    ● But some potential issues around Pod rescheduling and persistent volumes in 1.2
    ● Some good recipes out there to solve now
    ● PetSet: improved support for stateful services, coming in Kubernetes 1.3
    Diagram: a master node and several nodes

  39. Build What’s Next

  40. Questions, Comments, Resources
    @crcsmnky
    https://cloud.google.com/solutions/deploy-mongodb
    https://github.com/GoogleCloudPlatform/mongodb-cloud-manager
