Slide 1

Slide 1 text

Cloud Native Meetup Tokyo #11 KubeCon + CloudNativeCon Recap 2019.12.10 @CyberAgent © cyberblack28 KubeCon + CloudNativeCon 2019 San Diego Recap ~ Vitess ~

Slide 2

Slide 2 text

Profile Name : Yutaka Ichikawa Twitter : cyberblack28 Hatena Blog : https://cyberblack28.hatenablog.com/ SpeakerDeck : https://speakerdeck.com/cyberblack28 Job Educational Solution Architect Developer Advocate / Technical Evangelist Infrastructure Engineer Frontend Engineer Community & Certification Publications #deepcn #rancherjp CKA KCM100 CKAD 2018 2019

Slide 3

Slide 3 text

Let’s Start Cloud Native AP Communications Co., Ltd

Slide 4

Slide 4 text

Information http://bit.ly/kubecon2018na_recap

Slide 5

Slide 5 text

Agenda 1. Overview 2. What’s Vitess 3. Case Studies & Maintainer Track & Storage Sessions 4. Summary

Slide 6

Slide 6 text

Overview

Slide 7

Slide 7 text

Overview KubeCon + CloudNativeCon NA 2019 Vitess Sessions 1.Keynote Sessions Tuesday, November 19 • 9:20am - 9:45am Keynote: CNCF Project Updates - Bryan Liles, KubeCon + CloudNativeCon North America 2019 Co-Chair & Senior Staff Engineer, VMware http://bit.ly/kubecon2019na_vitess1 http://bit.ly/kubecon2019na_vitess_m1 Wednesday, November 20 • 9:27am - 9:32am Sponsored Keynote: Network, Please Evolve – Chapter 2 - Vijoy Pandey, Vice President/CTO Cloud, Cisco http://bit.ly/kubecon2019na_vitess1_2 http://bit.ly/kubecon2019na_vitess_m1_2

Slide 8

Slide 8 text

Overview KubeCon + CloudNativeCon NA 2019 Vitess Sessions 2.Case Studies Tuesday, November 19 • 11:50am - 12:25pm Scaling Resilient Systems: A Journey into Slack's Database Service - Rafael Chacon & Guido Iaquinti, Slack Thursday, November 21 • 2:25pm - 3:00pm Gone in 60 Minutes: Migrating 20 TB from AKS to GKE in an Hour with Vitess - Derek Perkins, Nozzle http://bit.ly/kubecon2019na_vitess2 http://bit.ly/kubecon2019na_vitess_m2 http://bit.ly/kubecon2019na_vitess_m3

Slide 9

Slide 9 text

Overview KubeCon + CloudNativeCon NA 2019 Vitess Sessions 3.Maintainer Track Sessions Tuesday, November 19 • 11:50am - 12:25pm How to Migrate a MySQL Database to Vitess - Sugu Sougoumarane & Morgan Tocker, PlanetScale Wednesday, November 20 • 2:25pm - 3:00pm Geo-partitioning with Vitess - Deepthi Sigireddi & Jitendra Vaidya, PlanetScale http://bit.ly/kubecon2019na_vitess3 http://bit.ly/kubecon2019na_vitess_m4 http://bit.ly/kubecon2019na_vitess4 http://bit.ly/kubecon2019na_vitess_m5

Slide 10

Slide 10 text

Overview KubeCon + CloudNativeCon NA 2019 Vitess Sessions 4.Storage Sessions Tuesday, November 19 • 3:20pm - 3:55pm Vitess: Stateless Storage in the Cloud - Sugu Sougoumarane, PlanetScale http://bit.ly/kubecon2019na_vitess5 http://bit.ly/kubecon2019na_vitess_m6

Slide 11

Slide 11 text

Overview Keynote Sessions Keynote: CNCF Project Updates - Bryan Liles, KubeCon + CloudNativeCon North America 2019 Co-Chair & Senior Staff Engineer, VMware

Slide 12

Slide 12 text

Overview November 5, 2019 Cloud Native Computing Foundation Announces Vitess Graduation Vitess is the eighth project to graduate, following Kubernetes, Prometheus, Envoy, CoreDNS, containerd, Fluentd, and Jaeger. The current version is Vitess 4.0.1. Announcement : http://bit.ly/vitess_graduation Vitess graduated one year and nine months after becoming a CNCF incubating project in February 2018. 1. Adoption “Mission-critical production workloads running in real companies” 2. Maintainer Diversity “Identify long-term contributions from multiple organizations, then drill down into project details and test how to do for your design strategy.” 3. Project Health “Determining the appropriateness of project health”

Slide 13

Slide 13 text

Overview 2019 San Diego 2018 Seattle

Slide 14

Slide 14 text

Overview GitHub Investigate (As of December 2019)

Slide 15

Slide 15 text

Overview “Slack adopted Vitess because its business needs were changing very rapidly and it needed a system flexible enough to accommodate those changes.” “Slack's current goal is about 35% migration to Vitess, and 100% next year.”

Slide 16

Slide 16 text

Overview “JD.com is China's largest online shopping site. During China's Black Friday sale it reached a huge scale of about 4,000 keyspaces, over 30,000 pods, and a peak of 35 million QPS.”

Slide 17

Slide 17 text

Overview KubeCon + CloudNativeCon China 2019 Vitess Sessions Tuesday, June 25 • 11:00 - 11:35 Two Years with Vitess: How JD.com Runs the World's Largest Vitess - Xuhaihua & Jin Ke Xie , JD.com http://bit.ly/kubecon2019china_vitess1 http://bit.ly/kubecon2019china_vitess_m1

Slide 18

Slide 18 text

Overview “A startup called Nozzle adopted Vitess. All of its applications run on Kubernetes, and it moved from AKS to GKE, realizing "No Vendor Lock-in" with Kubernetes and Vitess.”

Slide 19

Slide 19 text

Overview Keynote Sessions Sponsored Keynote: Network, Please Evolve – Chapter 2 - Vijoy Pandey, Vice President/CTO Cloud, Cisco

Slide 20

Slide 20 text

Overview Abstracting the L2 / L3 network part and realizing communication functions according to Kubernetes functions

Slide 21

Slide 21 text

Overview “Until now, technical complexity, organizational complexity, and process complexity have been created.” “You want these DB pods to be able to securely communicate for DB sharding & replication.”

Slide 22

Slide 22 text

Overview NSM-driven DB sharding & replication enables comprehensive, efficient communication, security and observability.

Slide 23

Slide 23 text

What’s Vitess

Slide 24

Slide 24 text

What’s Vitess

Slide 25

Slide 25 text

What’s Vitess A cloud-native database clustering system that achieves high availability and large-scale scalability by sharding MySQL

Slide 26

Slide 26 text

What’s Vitess Architecture

Slide 27

Slide 27 text

What’s Vitess Vitess on Kubernetes

Slide 28

Slide 28 text

What’s Vitess Technical Terms
• vtgate: A proxy server that routes queries from the application to vttablet and returns the results to the client
• tablet: A pair of mysqld and vttablet
• vttablet: A proxy server placed in front of MySQL (mysqld); it also performs query rewriting and deduplication and protects MySQL from harmful queries
• vtctld: An HTTP server that serves as the entry point for management operations on a Vitess cluster (GUI)
• vtctl: A command-line tool for managing a Vitess cluster (CLI)
• Topology: A metadata store that manages the configuration information of a Vitess cluster; etcd is supported on Kubernetes, and ZooKeeper is also supported

Slide 29

Slide 29 text

What’s Vitess
Sharding
• Store data divided across two or more databases
• Scale out and improve performance by adding shards
Sharding in Vitess
• Vertical sharding: store each table in a separate database
• Horizontal sharding: divide one table into multiple shards and store them in multiple databases
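As a rough illustration of the two sharding styles on this slide, the following Python sketch places whole tables in different databases (vertical) and spreads the rows of one table over several databases by user ID (horizontal). The table names, database names, and the modulo rule are hypothetical and are not how Vitess itself assigns rows.

```python
# Illustrative only: names and the modulo rule are assumptions, not Vitess internals.

# Vertical sharding: each table lives in its own database.
VERTICAL_LAYOUT = {
    "users": "db_users",
    "orders": "db_orders",
}

# Horizontal sharding: the rows of one table are spread over several databases.
HORIZONTAL_SHARDS = ["db_orders_0", "db_orders_1", "db_orders_2"]


def vertical_target(table: str) -> str:
    """Return the database that holds the whole table."""
    return VERTICAL_LAYOUT[table]


def horizontal_target(user_id: int) -> str:
    """Return the database that holds the rows for this user ID."""
    return HORIZONTAL_SHARDS[user_id % len(HORIZONTAL_SHARDS)]


print(vertical_target("orders"))
print(horizontal_target(7))
```

With vertical sharding the table name alone decides the target; with horizontal sharding a value inside the row decides it, which is why adding shards can spread the load of a single large table.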

Slide 30

Slide 30 text

What’s Vitess Table Sharding
• VSchema is the sharding definition and routing information.
• VTworker refers to the VSchema and executes the shard split processing.
• Queries are routed to the appropriate shard by referring to the VSchema.
• A keyspace is a logical database that combines multiple shards. The application sees it as a single database.
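The routing idea on this slide can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `VSCHEMA` dictionary is a simplified, hypothetical stand-in for a real VSchema, the `messages` table and `user_id` column are made up, and MD5 is used only as a placeholder hash. The shard names `-80` and `80-` follow the range-style naming used in the Vitess documentation.

```python
import hashlib

# Hypothetical, simplified stand-in for a VSchema: it records which column
# of each table decides the shard. The real VSchema format is richer.
VSCHEMA = {"sharded": True, "tables": {"messages": {"sharding_column": "user_id"}}}

# Shards of one keyspace, each covering a half-open range of the first
# byte of the keyspace ID: "-80" covers 0x00-0x7f, "80-" covers 0x80-0xff.
SHARDS = [("-80", 0x00, 0x80), ("80-", 0x80, 0x100)]


def keyspace_id(value: int) -> bytes:
    """Map a sharding-column value to 8 bytes (MD5 is an assumption here)."""
    return hashlib.md5(str(value).encode()).digest()[:8]


def route(table: str, row: dict) -> str:
    """Pick the shard for a row by looking up its sharding column."""
    column = VSCHEMA["tables"][table]["sharding_column"]
    first_byte = keyspace_id(row[column])[0]
    for name, low, high in SHARDS:
        if low <= first_byte < high:
            return name
    raise ValueError("no shard covers this keyspace ID")


print(route("messages", {"user_id": 42}))
```

The application only names the table and supplies the row; the mapping to a shard stays inside the routing layer, which is what lets the keyspace look like one database.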

Slide 31

Slide 31 text

What’s Vitess Reference Docs • Vitess is a database clustering system for horizontal scaling of MySQL https://vitess.io/ • Vitess Twitter https://twitter.com/vitessio • CrashAcademy 「CNDJP 勉強会 #8 Vitessのパフォーマンスと運用性を検証してみた」 https://crash.academy/ng/video/412/1736 • Vitess Slack https://vitess.slack.com/ • Vitess Github https://github.com/vitessio/vitess

Slide 32

Slide 32 text

Case Studies & Maintainer Track & Storage Sessions

Slide 33

Slide 33 text

Case Study Slack

Slide 34

Slide 34 text

Case Studies & Maintainer Track & Storage Sessions In this talk, Rafael and Guido will share an overview about how Slack designed, built, scaled and then iterated to improve its distributed database service based on top of Vitess, now a CNCF project. The Databases team at Slack scaled a Vitess cluster from 0 to spikes of 2.7 Million queries per second. This journey has taught us how to operate a database cluster with more than 2000 nodes, which we expect to grow to more than 3500 in the next 12 months.

Slide 35

Slide 35 text

Case Studies & Maintainer Track & Storage Sessions

Slide 36

Slide 36 text

Case Studies & Maintainer Track & Storage Sessions

Slide 37

Slide 37 text

Case Studies & Maintainer Track & Storage Sessions 1.Databases at Slack Current status Legacy Shards Vitess Shards In progress migration of our entire dataset to Vitess.

Slide 38

Slide 38 text

Case Studies & Maintainer Track & Storage Sessions Legacy Shards Vitess Shards Application level team-sharded active master-master MySQL setup. Master-replica MySQL setup fully managed by Vitess.

Slide 39

Slide 39 text

Case Studies & Maintainer Track & Storage Sessions Why are we migrating? • “Migrating to Vitess at (Slack) Scale” - Mike Demmer (https://www.percona.com/live/18/sessions/migrating-to-vitess-at-slack-scale) • “Designing and launching the next-generation database system at Slack: from whiteboard to production” - Guido Iaquinti (https://www.percona.com/live/18/sessions/designing-and-launching-the-next-generation-database-system-slack-from-whiteboard-to-production) • “Smooth scaling: Slack’s journey toward a new database” - Ameet Kotian (https://conferences.oreilly.com/velocity/vl-ny/public/schedule/detail/69885) For more details please see the presentations on the slide.

Slide 40

Slide 40 text

Case Studies & Maintainer Track & Storage Sessions Why are we migrating? tl;dr: shard size limits, inefficient resource distribution, operational overhead, single sharding model “While the number of Slack users keeps rising, the legacy setup cannot scale quickly and flexibly and cannot meet business needs.”

Slide 41

Slide 41 text

Case Studies & Maintainer Track & Storage Sessions • Scaling and sharding flexibility without changing SQL (much) • MySQL core maintains operator and developer know-how • Proven at scale at YouTube and more recently others • Active developer community and approachable code base Why Vitess?

Slide 42

Slide 42 text

Case Studies & Maintainer Track & Storage Sessions Stats • Queries per day: 53+ billion • Storage provisioned: 7.5+ PB • Served by legacy infrastructure: ~60% • Served by Vitess: ~40% • Target: 70% served by Vitess by EOY Aim to complete the transition to Vitess within 2020 !!
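To put the daily query count in perspective, the short Python calculation below converts it into an average rate and compares it with the 2.7 million QPS spike quoted in the session abstract. Both inputs come from the slides; treating "53+ billion" as exactly 53 billion is an assumption, and the two figures may not cover the same set of databases, so the ratio is only a rough indication.

```python
# Figures taken from the slides; the derived numbers are rough estimates.
QUERIES_PER_DAY = 53e9        # "53+ billion" queries per day, rounded down
PEAK_QPS = 2.7e6              # "spikes of 2.7 Million queries per second"
SECONDS_PER_DAY = 24 * 60 * 60

# Average rate if the daily total were spread evenly over the day.
average_qps = QUERIES_PER_DAY / SECONDS_PER_DAY

# How far the quoted spike sits above that average (scope may differ).
peak_to_average = PEAK_QPS / average_qps

print(f"average QPS ~ {average_qps:,.0f}")
print(f"peak / average ~ {peak_to_average:.1f}")
```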

Slide 43

Slide 43 text

Case Studies & Maintainer Track & Storage Sessions

Slide 44

Slide 44 text

Case Studies & Maintainer Track & Storage Sessions 2.Running databases in the cloud Immutable infrastructure Instance failure Durability through replication

Slide 45

Slide 45 text

Case Studies & Maintainer Track & Storage Sessions How we run Vitess EC2 Percona MySQL5.7 ASG for stateless components Ephemeral NVMe (no EBS)

Slide 46

Slide 46 text

Case Studies & Maintainer Track & Storage Sessions Not Kubernetes

Slide 47

Slide 47 text

Case Studies & Maintainer Track & Storage Sessions

Slide 48

Slide 48 text

Case Studies & Maintainer Track & Storage Sessions 3.Fault tolerance & isolation Slack cloud infrastructure • Amazon EC2 is hosted in multiple locations world-wide. • These locations are composed of Regions and Availability Zones (AZ’s). • Each Region is a separate geographic area. • AZ’s in a Region are connected through low-latency links.

Slide 49

Slide 49 text

Case Studies & Maintainer Track & Storage Sessions Vitess initial deployment • A single cell across multiple AZ’s (fundamental). • Global and local topology using the same Consul cluster (circumstantial). Topology : Vitess Key-Value Store Consul : Service Discovery

Slide 50

Slide 50 text

Case Studies & Maintainer Track & Storage Sessions

Slide 51

Slide 51 text

Case Studies & Maintainer Track & Storage Sessions Resilient systems • Minimize the blast radius. • Isolation is key. • Understand your dependencies.

Slide 52

Slide 52 text

Case Studies & Maintainer Track & Storage Sessions Current deployment • Isolated topologies (one dc for each AZ and one for the global topo). • Blast radius is mapped to physical infrastructure.
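The effect of mapping the blast radius to physical infrastructure can be sketched with a small Python model. The tablet names, AZ names, and cell names are hypothetical, and the model makes a deliberately pessimistic assumption: if an AZ fails, every tablet in the same cell is treated as affected because the cell shares one local topology.

```python
# Hypothetical tablet inventory: (tablet name, availability zone).
TABLETS = [
    ("tablet-1", "az-a"), ("tablet-2", "az-a"),
    ("tablet-3", "az-b"), ("tablet-4", "az-b"),
    ("tablet-5", "az-c"), ("tablet-6", "az-c"),
]


def affected_by_az_failure(cell_of_az: dict, failed_az: str) -> list:
    """Tablets that share a cell with the failed AZ (worst-case assumption)."""
    failed_cell = cell_of_az[failed_az]
    return [name for name, az in TABLETS if cell_of_az[az] == failed_cell]


# Initial deployment: a single cell spans every AZ.
single_cell = {"az-a": "cell-1", "az-b": "cell-1", "az-c": "cell-1"}
# Current deployment: one cell, with its own local topology, per AZ.
cell_per_az = {"az-a": "cell-a", "az-b": "cell-b", "az-c": "cell-c"}

print(affected_by_az_failure(single_cell, "az-a"))
print(affected_by_az_failure(cell_per_az, "az-a"))
```

Under this model the single-cell layout exposes every tablet to one AZ failure, while the per-AZ layout limits the impact to the tablets in the failed AZ.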

Slide 53

Slide 53 text

Case Studies & Maintainer Track & Storage Sessions We have benefited already • AZ failure during backup time. • Only a single cell was affected!

Slide 54

Slide 54 text

Case Studies & Maintainer Track & Storage Sessions Performance wins

Slide 55

Slide 55 text

Case Studies & Maintainer Track & Storage Sessions

Slide 56

Slide 56 text

Case Studies & Maintainer Track & Storage Sessions 4.Key Lessons Complex system failures • Complex systems are intrinsically dangerous systems. • Complex systems are heavily and successfully defended against failure. • Catastrophe is always just around the corner. • Complex systems contain changing mixtures of failures latent within them. How Complex Systems Fail – MIT (https://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf)

Slide 57

Slide 57 text

Case Studies & Maintainer Track & Storage Sessions Complex system failures Humility towards complexity. Reach out to other fields and learn from their experience.

Slide 58

Slide 58 text

Case Studies & Maintainer Track & Storage Sessions

Slide 59

Slide 59 text

Case Study Nozzle

Slide 60

Slide 60 text

Case Studies & Maintainer Track & Storage Sessions Gone in 60 Minutes: Migrating 20 TB from AKS to GKE in an Hour with Vitess - Derek Perkins, Nozzle • The holy grail of Cloud Native tech is to have zero vendor lock-in • Migrate a high-throughput production workload of 20 TB from Azure (AKS) to Google (GKE) in under an hour

Slide 61

Slide 61 text

Case Studies & Maintainer Track & Storage Sessions Zero vendor lock-in is preferable, but there are cases where lock-in is a good choice. What matters is making the right judgment.

Slide 62

Slide 62 text

Case Studies & Maintainer Track & Storage Sessions

Slide 63

Slide 63 text

Case Studies & Maintainer Track & Storage Sessions Provisioning / Autoscaling Database

Slide 64

Slide 64 text

Case Studies & Maintainer Track & Storage Sessions Queues

Slide 65

Slide 65 text

Case Studies & Maintainer Track & Storage Sessions Network Egress

Slide 66

Slide 66 text

Case Studies & Maintainer Track & Storage Sessions

Slide 67

Slide 67 text

Case Studies & Maintainer Track & Storage Sessions

Slide 68

Slide 68 text

Case Studies & Maintainer Track & Storage Sessions

Slide 69

Slide 69 text

Case Studies & Maintainer Track & Storage Sessions

Slide 70

Slide 70 text

Case Studies & Maintainer Track & Storage Sessions

Slide 71

Slide 71 text

Case Studies & Maintainer Track & Storage Sessions

Slide 72

Slide 72 text

Case Studies & Maintainer Track & Storage Sessions

Slide 73

Slide 73 text

Case Studies & Maintainer Track & Storage Sessions

Slide 74

Slide 74 text

Case Studies & Maintainer Track & Storage Sessions

Slide 75

Slide 75 text

Case Studies & Maintainer Track & Storage Sessions AKS GKE GCS Backup Restore cross-cluster networking for zero downtime

Slide 76

Slide 76 text

Case Studies & Maintainer Track & Storage Sessions

Slide 77

Slide 77 text

Case Studies & Maintainer Track & Storage Sessions AKS GCS Node Pool Internal App Deploy all internal applications GKE Deploy cert-manager external dns nginx ingress Set up node pools for dedicated Vitess tablets

Slide 78

Slide 78 text

Case Studies & Maintainer Track & Storage Sessions

Slide 79

Slide 79 text

Case Studies & Maintainer Track & Storage Sessions AKS GCS Internal App Scale down Node Pool Internal App GKE Shut down Backup Start

Slide 80

Slide 80 text

Case Studies & Maintainer Track & Storage Sessions

Slide 81

Slide 81 text

Case Studies & Maintainer Track & Storage Sessions

Slide 82

Slide 82 text

Case Studies & Maintainer Track & Storage Sessions

Slide 83

Slide 83 text

Case Studies & Maintainer Track & Storage Sessions "Google Cloud Platform drives our analytics and machine learning needs. With BigQuery and Cloud Machine Learning Engine on Google Kubernetes Engine, we have an insights platform that's customized for our performance, IT, and cost requirements." - Derek Perkins, Founder & CEO, Nozzle https://cloud.google.com/customers/nozzle/ Cloud Tasks GKE BigQuery

Slide 84

Slide 84 text

Maintainer Track

Slide 85

Slide 85 text

Case Studies & Maintainer Track & Storage Sessions How to Migrate a MySQL Database to Vitess - Sugu Sougoumarane & Morgan Tocker, PlanetScale • Vitess basics • A demo of live-migrating an existing MySQL installation into Vitess → No demo!

Slide 86

Slide 86 text

Case Studies & Maintainer Track & Storage Sessions Geo-partitioning with Vitess - Deepthi Sigireddi & Jitendra Vaidya, PlanetScale • Problems and solutions in GDPR • Vitess approach to geo-partitioning based on GDPR • Vitess custom sharding scheme demo

Slide 87

Slide 87 text

Case Studies & Maintainer Track & Storage Sessions GDPR (General Data Protection Regulation) Rules aimed at strengthening and unifying data protection for all individuals within the European Union. A custom sharding scheme is one of the ways Vitess responds to GDPR's request to “localize data storage in the users' country of residence”. Similar rules are likely to appear outside the EU.

Slide 88

Slide 88 text

Case Studies & Maintainer Track & Storage Sessions Custom Sharding Scheme Demo in Four Regions & Eight Countries
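The idea behind a geography-based custom sharding scheme can be sketched in Python as a lookup from the user's country to a regional shard. This is a minimal sketch, not the scheme shown in the demo: the country list, region names, and shard names are all hypothetical, and it uses three regions for brevity.

```python
# Hypothetical country-to-region table; the regions and shard names are
# made up for illustration and are not taken from the demo.
REGION_OF_COUNTRY = {
    "DE": "europe", "FR": "europe",
    "US": "americas", "BR": "americas",
    "JP": "asia", "IN": "asia",
}
SHARD_OF_REGION = {"europe": "shard-eu", "americas": "shard-am", "asia": "shard-as"}


def shard_for_user(country_code: str) -> str:
    """Return the regional shard that should store this user's rows."""
    region = REGION_OF_COUNTRY.get(country_code.upper())
    if region is None:
        # Refuse to guess: an unmapped country must be configured explicitly.
        raise KeyError(f"no region configured for country {country_code!r}")
    return SHARD_OF_REGION[region]


print(shard_for_user("de"))
print(shard_for_user("JP"))
```

Raising an error for an unmapped country, rather than falling back to a default shard, keeps data from silently landing in the wrong region.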

Slide 89

Slide 89 text

Storage Sessions

Slide 90

Slide 90 text

Case Studies & Maintainer Track & Storage Sessions Vitess: Stateless Storage in the Cloud - Sugu Sougoumarane, PlanetScale Design principles for making Vitess Cloud Native

Slide 91

Slide 91 text

Bonus

Slide 92

Slide 92 text

Bonus Vitess as a Service https://planetscale.com/

Slide 93

Slide 93 text

Summary

Slide 94

Slide 94 text

Summary • Vitess graduated with v4!! • Vitess adoption has increased over the past year • Case studies covered both Vitess without Kubernetes and Vitess on Kubernetes • Learned that GDPR needs to be taken into account

Slide 95

Slide 95 text

Thank you !!