Slide 1

Slide 1 text

Yugabyte DB - Distributed SQL database on Kubernetes
Nikhil Chandrappa
Lead Software Engineer, Yugabyte Inc.
@nikhilmcn

Slide 2

Slide 2 text

Introduction
Nikhil Chandrappa
Lead Software Engineer, YugabyteDB ♦ Pivotal ♦ Syracuse University
@nikhilmcn
© 2019 All rights reserved.

Slide 3

Slide 3 text

Kubernetes is massively popular in Fortune 500s
● Walmart - Edge Computing, KubeCon 2019: https://www.youtube.com/watch?v=sfPFrvDvdlk
● Target - Data @ Edge: https://tech.target.com/2018/08/08/running-cassandra-in-kubernetes-across-1800-stores.html
● eBay - Platform Modernization: https://www.ebayinc.com/stories/news/ebay-builds-own-servers-intends-to-open-source/

Slide 4

Slide 4 text

Data on K8s Ecosystem is evolving rapidly

Slide 5

Slide 5 text

Why Data services on K8s?
Containerized data workloads running on Kubernetes offer several advantages over traditional VM / bare metal based data workloads, including but not limited to:
● Better cluster resource utilization
● Portability between cloud and on-premises
● Frictionless multi-tenancy with versioning
● Simple and selective instant upgrades
● A robust automation framework can be embedded inside CRDs (Custom Resource Definitions), commonly referred to as a 'K8s Operator'

Slide 6

Slide 6 text

A brief history of Yugabyte
Builders of multiple popular DBs
Part of Facebook's Cloud-Native DB Evolution
● Yugabyte team dealt with this growth first hand
● Massive geo-distributed deployment given global users
● Worked with world class infra team to solve these issues
Yugabyte founding team ran Facebook's public-cloud scale DBaaS
● +1 Trillion ops/day
● +100 Petabytes data set sizes

Slide 7

Slide 7 text

What is Distributed SQL?
A Revolutionary Database Architecture
● SQL & Transactions
● Massive Scalability
● Geo Distribution
● Ultra Resilience

Slide 8

Slide 8 text

Open source, high performance, cloud native, distributed SQL database
● 100% Apache 2.0
● Low Latency (Sub-ms)
● Kubernetes & Multi-Cloud

Slide 9

Slide 9 text

Designing the Perfect Distributed SQL DB
Aurora much more popular than Spanner
bit.ly/distributed-sql-deconstructed

Amazon Aurora: A highly available MySQL and PostgreSQL-compatible relational database service
● Not scalable but HA
● All RDBMS features
● PostgreSQL & MySQL

Google Spanner: The first horizontally scalable, strongly consistent, relational database service
● Scalable and HA
● Missing RDBMS features
● New SQL syntax

Slide 10

Slide 10 text

Design Follows a Layered Approach
o YSQL - Fully relational SQL API that is wire compatible with PostgreSQL
o YCQL - Optimized Cloud Query Language API
o DocDB
  – High-performance distributed Document store
  – Offers strong consistency and multi row ACID transactions
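Because YSQL is wire compatible with PostgreSQL, a standard PostgreSQL connection URI should be enough to reach it. The sketch below only builds such a URI. It assumes the YSQL default port 5433 and the default `yugabyte` user and database. The host name `yb-tservers` and the helper `ysql_uri` are hypothetical names used for illustration.

```python
from urllib.parse import quote

# Assumption: YSQL listens on 5433 by default (PostgreSQL itself uses 5432).
YSQL_DEFAULT_PORT = 5433


def ysql_uri(host, user="yugabyte", password="", database="yugabyte",
             port=YSQL_DEFAULT_PORT):
    """Build a standard PostgreSQL connection URI that points at a YSQL endpoint."""
    # Percent-encode credentials so characters such as '@' do not break the URI.
    credentials = quote(user, safe="")
    if password:
        credentials += ":" + quote(password, safe="")
    return f"postgresql://{credentials}@{host}:{port}/{quote(database, safe='')}"


# "yb-tservers" is a hypothetical host name (for example a Kubernetes service).
uri = ysql_uri("yb-tservers")
print(uri)
```

Any PostgreSQL-compatible driver or tool that accepts a connection URI could then be pointed at the resulting string.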

Slide 11

Slide 11 text

All Nodes are Identical
● Can connect to ANY node
● Add/remove nodes anytime
[Diagram: a microservices platform above three identical YugabyteDB Nodes, each with a YugabyteDB Query Layer on top of a DocDB Storage Layer]

Slide 12

Slide 12 text

Under the Hood - Yugabyte Cluster
● YB-Master Service: manages shard metadata & coordinates config changes
  – yb-master1, yb-master2, yb-master3, each holding the syscatalog
  – Admin clients use it for cluster administration
● YB-TServer Service: stores & serves app data in/from tablets (aka shards)
  – yb-tserver1: tablet1, tablet2, tablet3
  – yb-tserver2: tablet1, tablet2, tablet4
  – yb-tserver3: tablet1, tablet3, tablet4
  – yb-tserver4: tablet2, tablet3, tablet4
  – App clients connect through the Distributed SQL API; each node runs a Distributed Txn Mgr
● Raft group leader: serves writes & strong reads
● Raft group follower: serves timeline-consistent reads & is ready for leader election
● Scale to as many nodes as needed (node1, node2, node3, node4, …)
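The tablet layout on this slide (four tablets, four yb-tservers, three copies of each tablet) can be illustrated with a small placement sketch. This is a plain round-robin scheme under an assumed replication factor of 3, not YugabyteDB's actual placement or load-balancing algorithm. The function names are hypothetical.

```python
from collections import defaultdict


def place_replicas(tablets, tservers, replication_factor=3):
    """Assign each tablet to `replication_factor` consecutive servers (round-robin)."""
    if replication_factor > len(tservers):
        raise ValueError("need at least as many servers as replicas")
    placement = {}
    for i, tablet in enumerate(tablets):
        # Tablet i starts at server i and wraps around the server list.
        placement[tablet] = [
            tservers[(i + k) % len(tservers)] for k in range(replication_factor)
        ]
    return placement


def tablets_per_server(placement):
    """Invert the placement map: which tablets does each server host?"""
    hosted = defaultdict(list)
    for tablet, servers in placement.items():
        for server in servers:
            hosted[server].append(tablet)
    return dict(hosted)


tablets = ["tablet1", "tablet2", "tablet3", "tablet4"]
tservers = ["yb-tserver1", "yb-tserver2", "yb-tserver3", "yb-tserver4"]

placement = place_replicas(tablets, tservers)
hosted = tablets_per_server(placement)
for server in tservers:
    print(server, sorted(hosted[server]))
```

With these inputs every server ends up hosting three of the four tablets, which matches the shape of the diagram. In a Raft group, one of the three replicas would act as leader and the other two as followers.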

Slide 13

Slide 13 text

Scaling out PostgreSQL with YugabyteDB
[Diagram: Client Apps]

Slide 14

Slide 14 text

Deployment Topologies
1. Single Region, Multi-Zone (Availability Zone 1, Availability Zone 2, Availability Zone 3)
   ● Consistent Across Zones
   ● No WAN Latency But No Region-Level Failover/Repair
2. Single Cloud, Multi-Region (Region 1, Region 2, Region 3)
   ● Consistent Across Regions
   ● Cross-Region WAN Latency with Auto Region-Level Failover/Repair
3. Multi-Cloud, Multi-Region (Cloud 1, Cloud 2, Cloud 3)
   ● Consistent Across Clouds
   ● Cross-Cloud WAN Latency with Auto Cloud-Level Failover/Repair

Slide 15

Slide 15 text

Deploying Geo-Distributed DB on K8s
● Scalable & Highly Available data tier
● Business Continuity
● Geo-Partitioning & Data Compliance

Slide 16

Slide 16 text

Yugabyte DB on K8S

Slide 17

Slide 17 text

Yugabyte Platform Live Demo

Slide 18

Slide 18 text

YugabyteDB Deployed as StatefulSets
● yb-master StatefulSet: pods yb-master-0, yb-master-1, yb-master-2, behind the yb-masters Headless Service (Admin Clients)
● yb-tserver StatefulSet: pods yb-tserver-0, yb-tserver-1, yb-tserver-2, yb-tserver-3, …, each holding tablet replicas, behind the yb-tservers Headless Service (App Clients)
● Each pod is backed by a Local/Remote Persistent Volume
● Pods are spread across node1, node2, node3, node4
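A StatefulSet behind a headless service gives every pod a stable DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. The sketch below derives those names for the StatefulSets and services named on this slide. The `default` namespace, the `cluster.local` domain, and the master RPC port 7100 are assumptions. The helper names are hypothetical.

```python
def pod_dns_names(statefulset, service, replicas,
                  namespace="default", cluster_domain="cluster.local"):
    """Return the stable DNS name of each pod in a StatefulSet.

    Pods are named <statefulset>-<ordinal>, and the headless service gives
    each one a DNS record under <service>.<namespace>.svc.<cluster_domain>.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


def master_addresses(replicas=3, rpc_port=7100, **kwargs):
    """Comma-separated host:port list for the yb-master pods.

    Assumption: 7100 is the yb-master RPC port; adjust if a deployment differs.
    """
    hosts = pod_dns_names("yb-master", "yb-masters", replicas, **kwargs)
    return ",".join(f"{host}:{rpc_port}" for host in hosts)


print(master_addresses())
for name in pod_dns_names("yb-tserver", "yb-tservers", 4):
    print(name)
```

Because the names depend only on the ordinal, a pod that is rescheduled onto another node keeps the same address, which is what lets the other pods find it again.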

Slide 19

Slide 19 text

Ensuring High Performance

LOCAL STORAGE (Since v1.10)
● Lower latency, Higher throughput
● Recommended for workloads that do their own replication
● Pre-provision outside of K8s
● Use SSDs for latency-sensitive apps

REMOTE STORAGE (Most used)
● Higher latency, Lower throughput
● Recommended for workloads that do not perform any replication on their own
● Provision dynamically in K8s
● Use alongside local storage for cost-efficient tiering

Slide 20

Slide 20 text

Configuring Data Resilience

POD ANTI-AFFINITY
● Pods of the same type should not be scheduled on the same node
● Keeps impact of node failures to absolute minimum

MULTI-ZONE/REGIONAL/MULTI-REGION POD SCHEDULING
● Multi-Zone – Tolerate zone failures for k8s worker nodes
● Regional – Tolerate zone failures for both k8s worker and master nodes
● Multi-Region / Multi-Cluster – Requires network discovery between multiple clusters
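The anti-affinity rule above maps onto the `podAntiAffinity` section of a pod spec. The sketch below builds that section as a Python dictionary and prints it as JSON. The label key `app` and the value `yb-tserver` are assumptions about how the pods are labeled, and the helper name is hypothetical.

```python
import json


def pod_anti_affinity(app_label, topology_key="kubernetes.io/hostname"):
    """Build an affinity block that keeps pods with the same label apart.

    topology_key decides what "apart" means: "kubernetes.io/hostname" spreads
    pods across nodes, "topology.kubernetes.io/zone" spreads them across zones.
    """
    return {
        "podAntiAffinity": {
            # "required..." is a hard rule: the scheduler will not co-locate pods.
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        "matchExpressions": [
                            # Assumption: pods carry a label app=<app_label>.
                            {"key": "app", "operator": "In", "values": [app_label]}
                        ]
                    },
                    "topologyKey": topology_key,
                }
            ]
        }
    }


affinity = pod_anti_affinity("yb-tserver")
print(json.dumps(affinity, indent=2))
```

The resulting block would sit under `spec.template.spec.affinity` in the StatefulSet. A softer variant could use `preferredDuringSchedulingIgnoredDuringExecution` when there are fewer nodes than pods.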

Slide 21

Slide 21 text

Automating Day 2 Operations

BACKUP & RESTORE
● Backups and restores are a database level construct
● YugaByte DB can perform a distributed snapshot and copy it to a target for a backup
● Restore the backup into an existing cluster or a new cluster with a different number of TServers

ROLLING UPGRADES
● StatefulSets support two update strategies (updateStrategy): OnDelete (the default in the older apps/v1beta1 API; apps/v1 defaults to RollingUpdate) and RollingUpdate
● Pick the rolling update strategy for DBs that support zero downtime upgrades, such as YugaByte DB
● A new instance of the pod is spawned with the same network id and storage

HANDLING FAILURES
● Pod failure is handled by K8S automatically
● Node failure has to be handled manually by adding a new worker node to the K8S cluster
● Local storage failure has to be handled manually by mounting a new local volume to K8S
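With the rolling strategy, a StatefulSet replaces its pods one at a time, from the highest ordinal down to the lowest, and an optional partition value limits the update to pods at or above that ordinal. The sketch below only computes that order. It does not talk to a cluster, and the function name is hypothetical.

```python
def rolling_update_order(statefulset, replicas, partition=0):
    """Return pod names in the order a rolling update would replace them.

    Pods are updated from the largest ordinal to the smallest. Pods whose
    ordinal is below `partition` are left on the old version.
    """
    if partition < 0 or replicas < 0:
        raise ValueError("replicas and partition must be non-negative")
    return [
        f"{statefulset}-{ordinal}"
        for ordinal in range(replicas - 1, partition - 1, -1)
    ]


# Full rollout of four yb-tserver pods.
print(rolling_update_order("yb-tserver", 4))

# Canary-style rollout: only ordinals 2 and 3 are updated.
print(rolling_update_order("yb-tserver", 4, partition=2))
```

Since each replaced pod comes back with the same name and storage, the database sees a brief restart of one member at a time rather than a change in membership.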

Slide 22

Slide 22 text

Extending StatefulSets with Operators
https://github.com/yugabyte/yugabyte-operator
● Based on Custom Controllers that have direct access to lower level K8S API
● Excellent fit for stateful apps requiring human operational knowledge to correctly scale, reconfigure and upgrade while simultaneously ensuring high performance and data resilience
● Complementary to Helm for packaging
Example scaling rule:
● Watch: CPU usage in the yb-tserver StatefulSet
● Condition: CPU > 80% for 1min and max_threshold not exceeded
● Action: Scale yb-tserver by 1 pod
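The scaling rule on this slide can be written as a small decision function of the kind an operator's control loop might call. This is a sketch, not the yugabyte-operator's code. It assumes `max_threshold` is a cap on the number of yb-tserver pods and that the caller passes in the CPU readings collected over the last minute. All names are hypothetical.

```python
def next_replica_count(cpu_window, current_replicas, max_threshold,
                       cpu_limit=80.0):
    """Decide how many yb-tserver pods should run next.

    cpu_window: CPU usage percentages sampled over the last minute.
    max_threshold: assumed to be the maximum number of pods allowed.
    """
    # "CPU > 80% for 1min": every sample in the window must exceed the limit.
    sustained_load = bool(cpu_window) and all(c > cpu_limit for c in cpu_window)

    # "max_threshold not exceeded": never grow past the configured cap.
    if sustained_load and current_replicas < max_threshold:
        return current_replicas + 1  # scale yb-tserver by 1 pod
    return current_replicas


print(next_replica_count([85.0, 90.0, 88.0, 92.0], current_replicas=3, max_threshold=5))
print(next_replica_count([85.0, 70.0, 90.0], current_replicas=3, max_threshold=5))
```

Requiring every sample in the window to be above the limit keeps a single short spike from adding a pod. Scaling by one pod per decision keeps data rebalancing incremental.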

Slide 23

Slide 23 text

YugabyteDB Quickstart
docs.yugabyte.com/quick-start

Slide 24

Slide 24 text

A classic Enterprise App scenario

Slide 25

Slide 25 text

A Real-World Demo
Yugastore – E-Commerce app on the cloud native stack
Deployed on
https://github.com/yugabyte/yugastore-java

Slide 26

Slide 26 text

Yugastore - Kronos Marketplace

Slide 27

Slide 27 text

A platform built for a new way of thinking
➔ Event + Microservice first design
➔ Team autonomy with platform efficiency
➔ 100% Cloud Native operating model on k8s
➔ Turnkey multi-cloud
➔ Full Spring Data support

Slide 28

Slide 28 text

Classic Enterprise Microservices Architecture
[Diagram components: UI APP, REST, API Gateway, CART MICROSERVICE, PRODUCT MICROSERVICE, CHECKOUT MICROSERVICE, Yugabyte Cluster (accessed over YSQL, YSQL, YCQL)]

Slide 29

Slide 29 text

Istio Traffic Management for Microservices
[Diagram components: UI APP, Istio Edge Gateway, Istio Edge Proxy, API Gateway, CART MICROSERVICE, PRODUCT MICROSERVICE, CHECKOUT MICROSERVICE, Istio Control Plane (Galley, Citadel, Pilot), Istio Service Discovery, Istio Route Configuration using Envoy Proxy]

Slide 30

Slide 30 text

Join the community: yugabyte.com/slack
We ♥ stars! github.com/yugabyte/yugabyte-db
We're Hiring! bit.ly/yugabyte-careers

Slide 31

Slide 31 text

The default database for the cloud
Thank You!

Slide 32

Slide 32 text

Kubernetes-Native YugabyteDB from a user perspective

Slide 33

Slide 33 text

A Large US Healthcare provider
✓ This is a technical win
✓ Starting out with a key OLTP application
✓ Interested in building out cloud-agnostic private DBaaS to power private PaaS

Key Differentiator: Cloud-agnostic, Kubernetes-native, high performance
● Flexibility to run internal DBaaS on AWS or on-prem
● Integration with PKS, Service Broker and Marketplace for internal PaaS
● Multi-master deployment required

Other Appealing Features
● Scalability & High Availability
● Operational efficiency (zero-downtime day 2 operations)
● Full PostgreSQL support (eventually replace Oracle)
● Kubernetes-Ready

Slide 34

Slide 34 text

Designing the Perfect Distributed SQL DB
PostgreSQL more popular than MongoDB
Aurora much more popular than Spanner
bit.ly/distributed-sql-deconstructed

Amazon Aurora: A highly available MySQL and PostgreSQL-compatible relational database service
● Not scalable but HA
● All RDBMS features
● PostgreSQL & MySQL

Google Spanner: The first horizontally scalable, strongly consistent, relational database service
● Scalable and HA
● Missing RDBMS features
● New SQL syntax

Slide 35

Slide 35 text

Yugabyte Spark Integration
● Analytics and Aggregate Queries
● ML workloads - Recommendations and Rankings
● Uses pySpark, OSS Spark-Cassandra connectors
[Diagram labels: Source Tables, Enrichment / pre-aggregation, Derived Tables, Batch Aggregates]