Yugabyte DB - The Distributed SQL Database for Kubernetes

YugabyteDB is an open-source, distributed SQL database designed to support all PostgreSQL features with full wire compatibility and to address exactly these needs: massive scalability, high availability and geo-distribution of data. Its unique sharding and replication architecture makes it a perfect fit for Kubernetes-based orchestration.
In this session, we will look at the design goals and a hands-on demo of YugabyteDB on Kubernetes, and at how it can be deployed on multiple public clouds and across geographic regions. We will also discuss the best practices you should employ when deploying distributed SQL databases like YugabyteDB on Kubernetes, covering deployment orchestration, persistent storage, scaling, and monitoring.

Bio
Amey is a Principal Data Architect at YugaByte DB with a deep passion for Data Analytics and Cloud-Native technologies. In his current role, he collaborates with Fortune 500 firms to architect their business applications with scalable microservices and geo-distributed, fault-tolerant data backends. Prior to joining YugaByte, he spent 5 years at Pivotal as a Platform Data Architect, helping enterprise customers across multiple industry verticals extend their analytical capabilities using Pivotal/Open Source Big Data platforms. He is originally from Mumbai, India and has a Master's degree in Computer Science from the University of Pennsylvania (UPenn), Philadelphia.
Twitter: @ameybanarse
LinkedIn: linkedin.com/in/ameybanarse/

AMEY BANARSE

March 07, 2020

Transcript

  1. Yugabyte DB - Distributed SQL database on Kubernetes
    Amey Banarse
    Principal Data Architect
    Yugabyte Inc.
    @ameybanarse

  2. Introduction
    © 2019 All rights reserved.

    Amey Banarse
    Principal Data Architect, Yugabyte Inc.

    Pivotal ♦
    FINRA ♦
    NYSE
    University of Pennsylvania (UPenn)
    @ameybanarse
    about.me/amey

  3. Kubernetes is massively popular in Fortune 500s
    ● Walmart - Edge Computing
    KubeCon 2019 https://www.youtube.com/watch?v=sfPFrvDvdlk
    ● Target - Data @ Edge
    https://tech.target.com/2018/08/08/running-cassandra-in-kubernetes-across-1800-stores.html
    ● eBay - Platform Modernization
    https://www.ebayinc.com/stories/news/ebay-builds-own-servers-intends-to-open-source/

  4. Data on K8s Ecosystem is evolving rapidly

  5. Why Data services on K8s?
    Containerized data workloads running on Kubernetes offer several advantages over
    traditional VM / bare metal based data workloads including but not limited to:
    ● Better cluster resource utilization
    ● Portability between cloud and on-premises
    ● Frictionless multi-tenancy with versioning
    ● Simple and selective instant upgrades
    ● A robust automation framework can be embedded inside CRDs (Custom
    Resource Definitions), commonly referred to as a ‘K8s Operator’.

  6. A brief history of Yugabyte
    Yugabyte Confidential © 2020 All rights reserved.
    Builders of multiple popular DBs
    Part of Facebook’s Cloud-Native DB Evolution
    Yugabyte team dealt with this growth first hand
    Massive geo-distributed deployment given global users
    Worked with world class infra team to solve these issues
    Yugabyte founding team ran Facebook’s public-cloud scale DBaaS
    +1 Trillion ops/day
    +100 Petabytes data set sizes

  7. What is Distributed SQL?
    SQL & Transactions
    SQL
    Massive Scalability
    Geo Distribution
    Ultra Resilience
    A Revolutionary Database Architecture

  8. Open source, high performance, cloud native, distributed SQL database
    100% Apache 2.0
    Low Latency (Sub-ms)
    Kubernetes & Multi-Cloud

  9. Designing the Perfect Distributed SQL DB
    PostgreSQL more popular than MongoDB
    Aurora much more popular than Spanner
    bit.ly/distributed-sql-deconstructed
    Amazon Aurora: a highly available MySQL and PostgreSQL-compatible relational database service
    ● Not scalable but HA
    ● All RDBMS features
    ● PostgreSQL & MySQL
    Google Spanner: the first horizontally scalable, strongly consistent, relational database service
    ● Scalable and HA
    ● Missing RDBMS features
    ● New SQL syntax

  10. Design Follows a Layered Approach
    Distributed SQL APIs
    ● YSQL: Distributed Postgres API
    ● YCQL: SQL-Based Flexible Schema API
    Self-Healing, Fault-Tolerant
    Auto Sharding & Rebalancing
    ACID Transactions
    Global Data Distribution
    High Throughput, Low Latency
    Natively runs in Containers, self service

  11. Design Follows a Layered Approach
    o YSQL - Fully relational SQL API that is wire compatible with PostgreSQL
    o YCQL - Optimized Cassandra Query Language API
    o DocDB
      – High-performance distributed document store
      – Offers strong consistency and multi-row ACID transactions

  12. All Nodes are Identical
    [Diagram: three YugabyteDB nodes and a microservices platform; each node runs a YugabyteDB Query Layer on top of a DocDB Storage Layer]
    Can connect to ANY node
    Add/remove nodes anytime
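
Because every node exposes the same query layer, a client can be handed the full node list and allowed to try each host in turn. The sketch below builds a libpq-style connection string for the PostgreSQL-compatible YSQL API. It is illustrative only: the host names are hypothetical, the port (5433) and the `yugabyte` database and user are assumed to be YugabyteDB's unchanged defaults, and multi-host strings assume a driver built on libpq 10 or later.

```python
from typing import Sequence

YSQL_PORT = 5433  # assumption: YugabyteDB's default YSQL port, left unchanged


def ysql_dsn(hosts: Sequence[str], dbname: str = "yugabyte",
             user: str = "yugabyte", port: int = YSQL_PORT) -> str:
    """Build a libpq keyword/value connection string listing every node.

    libpq tries the comma-separated hosts in order, so any reachable
    node can serve the connection.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    return f"host={','.join(hosts)} port={port} dbname={dbname} user={user}"


# Hypothetical host names for a three-node cluster.
dsn = ysql_dsn(["yb-node-1", "yb-node-2", "yb-node-3"])
```

The resulting string could then be passed to a libpq-based driver, for example `psycopg2.connect(dsn)`.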

  13. Scaling out PostgreSQL with YugabyteDB

  14. Deployment Topologies
    1. Single Region, Multi-Zone (Availability Zones 1, 2, 3)
       Consistent Across Zones
       No WAN Latency But No Region-Level Failover/Repair
    2. Single Cloud, Multi-Region (Regions 1, 2, 3)
       Consistent Across Regions
       Cross-Region WAN Latency with Auto Region-Level Failover/Repair
    3. Multi-Cloud, Multi-Region (Clouds 1, 2, 3)
       Consistent Across Clouds
       Cross-Cloud WAN Latency with Auto Cloud-Level Failover/Repair

  15. Deploying Geo-Distributed DB on K8s
    ● Scalable & Highly Available data tier
    ● Business Continuity
    ● Geo-Partitioning & Data Compliance

  16. Yugabyte DB on K8S
    Yugabyte Confidential © 2019 All rights reserved.

  17. Yugabyte Platform
    Live Demo

  18. YugaByte DB Deployed as StatefulSets
    © 2018 All rights reserved.
    [Diagram: four Kubernetes nodes (node1 to node4) hosting two StatefulSets]
    yb-master StatefulSet: yugabytedb pods yb-master-0, yb-master-1, yb-master-2
    yb-tserver StatefulSet: yugabytedb pods yb-tserver-0, yb-tserver-1, yb-tserver-2, yb-tserver-3, each shown with tablet 1’
    Local/Remote Persistent Volume (x4)
    yb-masters Headless Service
    yb-tservers Headless Service
    App Clients
    Admin Clients
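
A headless service gives every StatefulSet pod a stable DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`, which is how the pods in this layout can keep finding each other across restarts. The following is a minimal sketch of deriving those names for the StatefulSets and services named on the slide. The namespace `yb-demo` is hypothetical, and the master RPC port 7100 is assumed to be the YugabyteDB default.

```python
from typing import List


def pod_dns_names(statefulset: str, service: str, replicas: int,
                  namespace: str = "default",
                  cluster_domain: str = "cluster.local") -> List[str]:
    """Return the stable DNS name of each pod behind a headless service."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


def master_addresses(replicas: int = 3, namespace: str = "default",
                     rpc_port: int = 7100) -> str:
    """Comma-separated host:port list for the yb-master pods.

    rpc_port=7100 is an assumption (the default master RPC port).
    """
    hosts = pod_dns_names("yb-master", "yb-masters", replicas, namespace)
    return ",".join(f"{host}:{rpc_port}" for host in hosts)


# Hypothetical namespace, four tservers as drawn on the slide.
tservers = pod_dns_names("yb-tserver", "yb-tservers", 4, namespace="yb-demo")
masters = master_addresses(namespace="yb-demo")
```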

  19. Ensuring High Performance
    LOCAL STORAGE (Since v1.10)
    ● Lower latency, Higher throughput
    ● Recommended for workloads that do their own replication
    ● Pre-provision outside of K8s
    ● Use SSDs for latency-sensitive apps
    REMOTE STORAGE (Most used)
    ● Higher latency, Lower throughput
    ● Recommended for workloads that do not perform any replication on their own
    ● Provision dynamically in K8s
    ● Use alongside local storage for cost-efficient tiering

  20. Configuring Data Resilience
    POD ANTI-AFFINITY
    ● Pods of the same type should not be scheduled on the same node
    ● Keeps impact of node failures to absolute minimum
    MULTI-ZONE/REGIONAL/MULTI-REGION POD SCHEDULING
    ● Multi-Zone - Tolerate zone failures for k8s worker nodes
    ● Regional – Tolerate zone failures for both k8s worker and master nodes
    ● Multi-Region* – Requires federation of k8s clusters or submariner.io
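
To make the anti-affinity rule concrete, here is a small sketch that builds the `affinity` stanza of a pod spec as a Python dictionary. The field names follow the Kubernetes pod spec, but the `app` label key and the choice of a hard (`required...`) rule rather than a soft (`preferred...`) one are assumptions made for illustration, not settings taken from the talk.

```python
import json


def pod_anti_affinity(app_label: str,
                      topology_key: str = "kubernetes.io/hostname") -> dict:
    """Build an affinity stanza that keeps pods with the same label apart.

    With the default topology key, no two matching pods share a node.
    Passing a zone-level key spreads them across zones instead.
    The "app" label key is a hypothetical naming convention.
    """
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        "matchExpressions": [
                            {"key": "app", "operator": "In",
                             "values": [app_label]}
                        ]
                    },
                    "topologyKey": topology_key,
                }
            ]
        }
    }


affinity = pod_anti_affinity("yb-tserver")
affinity_json = json.dumps(affinity, indent=2)  # ready to embed in a manifest
```

Swapping the topology key for `topology.kubernetes.io/zone` would express the multi-zone variant of the same rule on clusters that set that node label.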

  21. Automating Day 2 Operations
    BACKUP & RESTORE
    ● Backups and restores are a database level construct
    ● YugaByte DB can perform distributed snapshot and copy to a target for a backup
    ● Restore the backup into an existing cluster or a new cluster with a different number of tservers
    ROLLING UPGRADES
    ● Supports two upgradeStrategies: onDelete (default) and rollingUpgrade
    ● Pick rolling upgrade strategy for DBs that support zero downtime upgrades such as YugaByte DB
    ● New instance of the pod spawned with same network id and storage
    HANDLING FAILURES
    ● Pod failure handled by K8S automatically
    ● Node failure has to be handled manually by adding a new slave node to K8S cluster
    ● Local storage failure has to be handled manually by mounting new local volume to K8S
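
As a rough illustration of the rolling strategy, the sketch below walks a StatefulSet's pods one at a time, highest ordinal first, which is the order Kubernetes' built-in RollingUpdate strategy uses. The `upgrade_pod` and `is_ready` callbacks are hypothetical stand-ins for the real restart and health check; this is not the controller's or the operator's actual code.

```python
from typing import Callable, List


def rolling_upgrade(statefulset: str, replicas: int,
                    upgrade_pod: Callable[[str], None],
                    is_ready: Callable[[str], bool]) -> List[str]:
    """Upgrade pods one by one in reverse ordinal order.

    The rollout halts as soon as an upgraded pod fails its readiness
    check, so at most one pod is unavailable at any time.
    """
    upgraded: List[str] = []
    for ordinal in reversed(range(replicas)):
        pod = f"{statefulset}-{ordinal}"
        upgrade_pod(pod)  # the pod keeps its name and storage
        if not is_ready(pod):
            raise RuntimeError(f"{pod} did not become ready; halting rollout")
        upgraded.append(pod)
    return upgraded


# Dry run with stand-in callbacks that record the order of restarts.
restarted: List[str] = []
order = rolling_upgrade("yb-tserver", 4, restarted.append, lambda pod: True)
```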

  22. Extending StatefulSets with Operators
    https://github.com/yugabyte/yugabyte-operator
    ● Based on Custom Controllers that have direct access to lower level K8S API
    ● Excellent fit for stateful apps requiring human operational knowledge to correctly scale, reconfigure and upgrade while simultaneously ensuring high performance and data resilience
    ● Complementary to Helm for packaging
    Example scaling rule: watch CPU usage in the yb-tserver StatefulSet; if CPU > 80% for 1min and max_threshold not exceeded, scale yb-tserver by 1 pod
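
The scaling rule on this slide can be sketched as a pure decision function. This is only an interpretation: it reads `max_threshold` as an upper bound on the replica count and assumes the caller supplies the CPU readings collected over the last minute. The function and parameter names are invented for the example and are not part of the yugabyte-operator API.

```python
from typing import Sequence


def desired_tserver_replicas(current: int,
                             cpu_samples: Sequence[float],
                             max_replicas: int,
                             cpu_threshold: float = 80.0) -> int:
    """Return the replica count the rule would ask for.

    cpu_samples: CPU usage percentages observed during the last minute.
    Scale out by exactly one pod when every sample is above the
    threshold and the replica cap has not been reached.
    """
    sustained = len(cpu_samples) > 0 and all(
        sample > cpu_threshold for sample in cpu_samples
    )
    if sustained and current < max_replicas:
        return current + 1
    return current
```

Keeping the decision separate from the Kubernetes API calls makes the rule easy to unit test; a controller loop would call it on each reconcile and patch the StatefulSet only when the result differs from the current count.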

  23. YugabyteDB Quickstart
    docs.yugabyte.com/quick-start

  24. A platform built for a new way of thinking
    ➔ Event + Microservice first design
    ➔ Team autonomy with platform efficiency
    ➔ 100% Cloud Native operating model on k8s
    ➔ Turnkey multi-cloud
    ➔ Full Spring Data support

  25. Kubernetes-Native
    YugabyteDB from a user perspective

  26. Who Installs YugabyteDB?

  27. A Large US Healthcare provider
    Key Differentiator
    Cloud-agnostic, Kubernetes-native, high performance
    ● Flexibility to run internal DBaaS on AWS or on-prem
    ● Integration with PKS, Service Broker and Marketplace for internal PaaS
    ● Multi-master deployment required
    ✓ This is a technical win
    ✓ Starting out with a key OLTP application
    ✓ Interested in building out cloud-agnostic private DBaaS to power private PaaS
    Other Appealing Features
    Scalability & High Availability
    Operational efficiency (zero-downtime day 2 operations)
    Full PostgreSQL support (eventually replace Oracle)
    Kubernetes-Ready

  28. A Large US Telco provider
    Key Differentiator
    Kubernetes-native, multi-region database
    ● Enable users to build stateful edge applications
    ● Move data across multiple geographies
    ● Deployments become cloud-agnostic
    ✓ Building an edge cloud solution
    ✓ Looked at multiple DBs before picking YugabyteDB
    ✓ Repeatable “telco” and “edge” play
    Other Appealing Features
    Scalability & High Availability
    Yugabyte Platform (turnkey solution for service providers)
    Low Latency
    Kubernetes-Ready

  29. A classic Enterprise App scenario

  30. A Real-World Demo
    Yugastore – E-Commerce app on the cloud native stack
    Deployed on
    https://github.com/yugabyte/yugastore-java

  31. Yugastore - Kronos Marketplace

  32. Classic Enterprise Microservices Architecture
    [Diagram: a UI APP calls an API Gateway over REST; the gateway fronts the CART, PRODUCT and CHECKOUT microservices, which connect to the Yugabyte Cluster over SQL and CQL]

  33. Istio Traffic Management for Microservices
    [Diagram: UI APP, API Gateway, CART, PRODUCT and CHECKOUT microservices, and an Istio Edge Proxy]
    Istio Control Plane: Galley, Citadel, Pilot
    Istio Service Discovery
    Istio Edge Gateway
    Istio Route Configuration using Envoy Proxy

  34. Yugabyte Spark Integration
    ● Analytics and Aggregate Queries
    ● ML workloads - Recommendations and Rankings
    ● Uses pySpark, OSS Spark-Cassandra connectors
    [Diagram: Source Tables, Derived Tables, Enrichment / pre-aggregation, Batch Aggregates]

  35. Join the community
    yugabyte.com/slack
    We ♥ stars!
    github.com/yugabyte/yugabyte-db
    We’re Hiring!
    bit.ly/yugabyte-careers

  36. The default database for the cloud
    Thank You!