
[SF DevOps meetup 06/06/19] TiDB Operator

This deck was delivered at the SF DevOps meetup on June 6, 2019, to introduce TiDB, TiKV and how to run the various components of the TiDB platform on Kubernetes using the operator pattern.


Kevin Xu

June 06, 2019


Transcript

  1. A little about PingCAP
     • Founded in April 2015 by 3 infrastructure engineers
     • Created and maintains TiDB and TiKV
     • Offices throughout North America and China
  2. Mobike + TiDB
     • 200 million users
     • 200 cities
     • 9 million smart bikes
     • ~35 TB in TiDB
  3. Technical Inspiration
     • TiDB is a NewSQL database that speaks the MySQL protocol
     • It is not based on the MySQL source code
     • It is an ACID / strongly consistent database
     • The inspiration is Google Spanner + F1
     • It separates SQL processing and storage into independently scalable components
     • The SQL processing layer is stateless
     • It is designed for both Transactional and Analytical Processing (HTAP)
  4. Use Cases
     1. Approaching the maximum size for MySQL on a single server; debating whether or not to shard.
     2. Already sharded MySQL, but having a hard time doing analytics on up-to-date data.
  5. [Architecture diagram: stateless TiDB SQL nodes (TiDB Core) on top of a TiKV Cluster, with Regions 1-4 replicated across TiKV Nodes 1-3 ("L" marking each Region's leader replica), coordinated by a PD Cluster]
  6. TiDB HTAP: Row + Column Storage [Architecture diagram: the row-based TiKV Cluster (Regions across TiKV Nodes 1-3) extended with a column-based TiFlash Extension Cluster (TiFlash Nodes 1-2) and a Spark Cluster of TiSpark Workers, all serving TiDB]
  7. Operator History
     • Operator pattern pioneered by CoreOS... now Red Hat... now IBM
     • Operators introduced in 2016, the Operator Framework in 2018
       ◦ First two: etcd operator, Prometheus operator
     • TiDB Operator (2017) predated the Operator Framework
  8. Why Do We (As in TiDB) Care?
     • Manage multiple clusters (multi-tenancy)
     • Safe scaling (up or down, in or out)
     • Use different types of network or local storage (different performance)
     • Automatic monitoring
     • Rolling updates
     • Automatic failover
     • *Multi-Cloud* (as long as it has k8s)
  9. Why Should YOU Care?
     • Manages stateful applications:
       ◦ databases, caches, monitoring systems, etc.
     • Encodes application domain knowledge
       ◦ Extension of your SRE team
     • Kubernetes-enabled Hybrid / Multi-Cloud
     • Growing popularity in the database community:
       ◦ https://thenewstack.io/databases-operators-bring-stateful-workloads-to-kubernetes/
  10. Resources -- CRD
     • Custom Resource Definition (CRD):
       ◦ An application-specific YAML file
       ◦ End user writes the domain operation logic in the CRD
       ◦ Simple to implement and deploy
     • (There is another way):
       ◦ API Aggregation:
         ▪ More control, more powerful but...
         ▪ Hard to deploy, not well supported by k8s engines
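A CRD itself is just a short manifest that registers the new resource type with the Kubernetes API server. A minimal sketch, assuming a hypothetical `tidbclusters.pingcap.com` group and a deliberately open schema (this is illustrative, not the exact TiDB Operator definition):

```yaml
# Hypothetical CRD registering a TidbCluster resource type.
# Group, version, and schema are illustrative assumptions.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: tidbclusters.pingcap.com   # must be <plural>.<group>
spec:
  group: pingcap.com
  scope: Namespaced
  names:
    kind: TidbCluster
    plural: tidbclusters
    singular: tidbcluster
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              # accept arbitrary spec fields for this sketch
              x-kubernetes-preserve-unknown-fields: true
```

Once applied, `kubectl get tidbclusters` works like any built-in resource, which is why CRDs are the "simple to implement and deploy" option compared to API Aggregation.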
  11. Cluster State -- StatefulSet
     StatefulSet...
     • Guarantees ordering and uniqueness of pods
       ◦ pd -> tikv -> tidb
     • Gives "sticky" identity -- network and storage
     • *No* interchangeable pods
       ◦ always maps the same volume to the same pod
     • Stable since Kubernetes 1.9
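The guarantees above show up directly in a StatefulSet manifest; names, image, and sizes below are placeholders, not TiDB Operator's actual manifests:

```yaml
# Minimal StatefulSet sketch for a TiKV-like component.
# Each pod gets a stable name (tikv-0, tikv-1, ...) and its own
# PersistentVolumeClaim, so the same volume always maps to the
# same pod across restarts and rescheduling.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tikv
spec:
  serviceName: tikv-peer        # headless Service providing sticky DNS identity
  replicas: 3
  selector:
    matchLabels:
      app: tikv
  template:
    metadata:
      labels:
        app: tikv
    spec:
      containers:
        - name: tikv
          image: pingcap/tikv:v3.0.0   # placeholder tag
          volumeMounts:
            - name: data
              mountPath: /var/lib/tikv
  volumeClaimTemplates:          # one PVC per pod, retained across restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
```

The `volumeClaimTemplates` section is what makes pods non-interchangeable: tikv-1 always reattaches to the data-tikv-1 claim, never to another pod's volume.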
  12. [Architecture diagram, repeated from slide 5: TiDB Core, TiKV Cluster with Region leaders, PD Cluster]
  13. How TiDB manages state -- Custom Controller [Diagram: the user provides a CRD with a Spec (component, image, replicas, ...); the Custom Controller drives the Cluster State to match and reports back a Status (image, replicas, state)]
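A sketch of that spec/status split: the user writes `spec` (desired state) and the custom controller writes back `status` (observed state) as it reconciles. Field names mirror the diagram, not the exact TiDB Operator schema:

```yaml
# Illustrative custom resource instance. The user authors spec;
# the controller continuously updates status to reflect reality.
apiVersion: pingcap.com/v1alpha1   # assumed group/version
kind: TidbCluster
metadata:
  name: demo
spec:                 # desired state (written by the user)
  component: tikv
  image: pingcap/tikv:v3.0.0
  replicas: 3
status:               # observed state (written by the controller)
  image: pingcap/tikv:v3.0.0
  replicas: 3
  state: Running
```

Reconciliation is simply the controller noticing any gap between the two halves (say, `spec.replicas: 5` vs `status.replicas: 3`) and acting to close it.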
  14. [Diagram: the TidbCluster CRD syncs to the TidbCluster Controller, which watches the API for change detection and reconciles the Spec (component, replicas, ...) into PD, TiKV, and TiDB StatefulSets on the TiDB nodes, each StatefulSet owning Pods with PVCs bound to PVs]
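One way to picture what that controller consumes: a single TidbCluster resource with one section per component, each reconciled into its own StatefulSet. The API version and field names below are illustrative approximations, not the exact TiDB Operator schema:

```yaml
# Illustrative TidbCluster resource. The controller reconciles each
# component section into its own StatefulSet, honoring the
# pd -> tikv -> tidb ordering from slide 11.
apiVersion: pingcap.com/v1alpha1   # assumed group/version
kind: TidbCluster
metadata:
  name: demo
spec:
  pd:
    replicas: 3
    storageClassName: local-storage   # placeholder storage class
  tikv:
    replicas: 3
    storageClassName: local-storage
  tidb:
    replicas: 2     # stateless SQL layer; no persistent volumes needed
```

Scaling, rolling updates, and failover then reduce to editing this one resource and letting the controller's watch/reconcile loop do the rest.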