Introducing TiDB and TiKV

Kevin Xu
October 30, 2018

This deck was delivered at the SF Kubernetes meetup at the Microsoft office to introduce TiDB, an open source NewSQL database, and its sister project TiKV, an open source distributed transactional key-value store and a CNCF member project.

Transcript

  1. Agenda • History and Community • Technical Walkthrough • Use Case with Mobike • Q&A • (Time Permitting) TiDB on Google Kubernetes Engine
  2. A little about me • General Manager of Global Strategy and Operations • Studied CS and Law at Stanford • Program in JavaScript and Python; (more recently) learning Rust
  3. A little about PingCAP • Founded in April 2015 by 3 infrastructure engineers • Offices throughout North America and China
  4. PingCAP.com Our Product: the TiDB Platform • TiDB Platform (Ti = Titanium) ◦ TiDB (SQL Layer) ◦ TiKV (Key-Value Storage) ◦ TiSpark (Spark plugin to TiKV) • Open source from Day 1 ◦ GA 1.0: October 2017 ◦ GA 2.0: April 2018
  5. Cloud-Native Architecture [Diagram labels: Application via MySQL Protocol; TiDB nodes; TiKV nodes; Spark Cluster (Spark Driver, Workers, Spark SQL); PD Cluster (PD nodes); DistSQL API; KV API; Metadata; TSO / Data Location]
  6. TiKV (in CNCF): The Storage Foundation [Diagram: a client and three Placement Driver nodes (PD 1 to PD 3) alongside four TiKV nodes (Store 1 to Store 4) that communicate over gRPC; each store holds replicas of some of Regions 1 to 5, and the replicas of one Region form a Raft group]
  7. TiDB: The (My)SQL Layer [Diagram labels: MySQL Client; ODBC/JDBC; any ORM that supports MySQL; TiDB (MySQL Network Protocol, SQL Parser, Cost-based Optimizer, Coprocessor API); TiKV (Node1 to Node4)]
  8. Join Support • Hash Join (fastest; if table <= 50 million rows) • Sort Merge Join (join on indexed column or ordered data source) • Index Lookup Join (join on indexed column; ideally after filter, result < 10,000 rows) • Chosen by the cost-based optimizer
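
To make the first strategy on this slide concrete, here is a minimal sketch of a hash join in Python. It is not TiDB's implementation; the function name `hash_join`, the row-as-dict representation, and the sample tables are assumptions made for illustration. The smaller input is loaded into an in-memory hash table and the larger input is streamed past it.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner equi-join: hash the (smaller) build side, then probe with the other side."""
    table = defaultdict(list)
    for row in build_rows:
        # Build phase: group build-side rows by their join key.
        table[row[build_key]].append(row)
    for row in probe_rows:
        # Probe phase: each probe row is matched with one hash lookup.
        for match in table.get(row[probe_key], []):
            yield {**match, **row}


# Hypothetical sample data, not taken from the deck.
users = [{"user_id": 1, "name": "Edward"}, {"user_id": 2, "name": "Tom"}]
rides = [
    {"ride_id": 10, "user_id": 2},
    {"ride_id": 11, "user_id": 1},
    {"ride_id": 12, "user_id": 3},  # no matching user, so it is dropped
]
joined = list(hash_join(users, rides, "user_id", "user_id"))
```

Because the build side must fit in memory, a row-count limit like the one on the slide is a natural condition for choosing this strategy.
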
  9. TiSpark: Complex OLAP [Diagram: a Spark Driver and Spark Executors, each with TiSpark, use gRPC to retrieve data location from the Placement Driver (PD) and to retrieve real data from the TiKV nodes of the distributed storage layer]
  10. Placement Driver • Provides a god's-eye view of the entire cluster • Stores metadata, balances workload, issues timestamps • Is itself a cluster, with embedded etcd [Diagram: three Placement Driver nodes connected by Raft]
  11. Transaction Model • Timestamp Oracle Service (from Google’s Percolator) • 2-Phase commit protocol (2PC) • Problem: Single point of failure • Solution: PD HA cluster [Diagram: three Placement Driver nodes connected by Raft]
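
The two ingredients on this slide, a timestamp oracle and two-phase commit, can be sketched in a single process as follows. This is a simplified illustration, not TiDB's or Percolator's actual code: the class and function names are hypothetical, and it leaves out primary locks, crash recovery, and the Raft replication that makes PD highly available.

```python
import itertools
import threading


class TimestampOracle:
    """Hands out strictly increasing timestamps (stand-in for the TSO service)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_ts(self):
        with self._lock:
            return next(self._counter)


class ConflictError(Exception):
    pass


class Store:
    """Toy multi-version store with per-key locks."""

    def __init__(self):
        self.versions = {}  # key -> [(commit_ts, value), ...] in commit order
        self.locks = {}     # key -> (start_ts, pending value)

    def read(self, key, ts):
        # Newest value committed at or before ts, or None.
        visible = [v for c, v in self.versions.get(key, []) if c <= ts]
        return visible[-1] if visible else None

    def prewrite(self, key, value, start_ts):
        # Phase 1: fail on a foreign lock or on a commit newer than start_ts.
        if key in self.locks and self.locks[key][0] != start_ts:
            raise ConflictError(f"{key!r} is locked")
        if any(c > start_ts for c, _ in self.versions.get(key, [])):
            raise ConflictError(f"{key!r} changed after ts {start_ts}")
        self.locks[key] = (start_ts, value)

    def commit(self, key, start_ts, commit_ts):
        # Phase 2: turn the lock into a committed version.
        lock_ts, value = self.locks.pop(key)
        assert lock_ts == start_ts
        self.versions.setdefault(key, []).append((commit_ts, value))

    def rollback(self, key, start_ts):
        if self.locks.get(key, (None,))[0] == start_ts:
            del self.locks[key]


def run_transaction(tso, store, writes):
    """Prewrite every key at start_ts, then commit them all at commit_ts."""
    start_ts = tso.next_ts()
    prewritten = []
    try:
        for key, value in writes.items():
            store.prewrite(key, value, start_ts)
            prewritten.append(key)
    except ConflictError:
        for key in prewritten:
            store.rollback(key, start_ts)
        raise
    commit_ts = tso.next_ts()
    for key in prewritten:
        store.commit(key, start_ts, commit_ts)
    return start_ts, commit_ts
```

Every transaction depends on `next_ts()` twice, which is why a single oracle would be a single point of failure and why the slide pairs it with an HA cluster.
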
  12. TiDB Operator • Operator pattern inspired by CoreOS... (now Red Hat... (now IBM)) • Bootstraps TiDB cluster and simplifies/automates: ◦ Deployment ◦ Scaling ◦ Scheduling ◦ Auto-Failover ◦ Upgrade • Open Sourced ◦ https://github.com/pingcap/tidb-operator
  13. TiDB Operator [Diagram labels: Kubernetes Core (API Server, Controller Manager, Scheduler); TiDB Controller Manager (TiDB Cluster Controller, PD Controller, TiKV Controller, TiDB Controller); TiDB Scheduler (Kube Scheduler, TiDB Scheduler); TiDB Cloud Manager (API Gateway, Control Plane, Cost Controller)]
  14. Cloud Native Tools • Prometheus ◦ (PingCAP maintains a Rust implementation: https://github.com/pingcap/rust-prometheus) • gRPC ◦ (PingCAP maintains a Rust implementation: https://github.com/pingcap/grpc-rs) • etcd
  15. Mobike + TiDB • 200 million users • 200 cities • 9 million smart bikes • ~30 TB / day
  16. Scenario #1: Locking/Unlocking • Locking and unlocking of smart bikes generate massive amounts of data • Smooth experience is the key to user retention • TiDB supports this system by alerting administrators within minutes when the success rate of locking/unlocking drops • Quickly find malfunctioning bikes
  17. Scenario #2: Real-Time Analysis • Synchronize TiDB with MySQL instances using Syncer (proprietary tool) • TiDB + TiSpark empower real-time analysis with horizontal scalability • No need for Hadoop + Hive
  18. Scenario #3: Mobike Store • An innovative loyalty program that must be on 24x7x365 • TiDB provides: ◦ High-concurrency for peak or promotional season ◦ Permanent storage ◦ Horizontal scalability • No interruption as business evolves
  19. Thank You! • 20% OFF KubeCon: KCNA18SPR • [email protected] • @kevinsxu; @pingcap • TiDB Cloud Early Access: https://www.pingcap.com/tidb-cloud/ • TiDB Academy Sign-up: www.pingcap.com/tidb-academy/
  20. CBO 101 • Network cost • Memory cost • CPU cost • In TiDB, the default memory factor is 5 and the CPU factor is 0.8. • For example, for the operator Sort(r), the cost would be: [formula not captured in the transcript] • TiDB maintains a histogram of the data
  21. Relational -> KV • “User” table (ID | Name | Email): 1 | Edward | [email protected]; 2 | Tom | [email protected]; ... • In TiKV, a sorted map over (-∞, +∞): user/1 → Edward,[email protected]; user/2 → Tom,[email protected]; ... • Some region...
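
The mapping on this slide can be sketched in a few lines of Python. The `table/id` key format follows the slide; the helper names, the `example.com` addresses, and the single split key are assumptions for illustration, and real TiKV keys are byte strings rather than text.

```python
import bisect

# Rows of the "User" table from the slide; the email addresses are hypothetical.
rows = [
    (1, "Edward", "edward@example.com"),
    (2, "Tom", "tom@example.com"),
]


def row_to_kv(table, row):
    """Turn one relational row into a (key, value) pair such as user/1 -> Edward,..."""
    row_id, *columns = row
    return f"{table}/{row_id}", ",".join(columns)


def build_sorted_map(table, table_rows):
    """Key-value pairs ordered by key, mimicking the sorted map."""
    return dict(sorted(row_to_kv(table, r) for r in table_rows))


def region_for(key, split_keys):
    """Index of the region whose key range contains `key`.

    Region 0 covers (-inf, split_keys[0]), region i covers
    [split_keys[i-1], split_keys[i]), and the last region runs to +inf.
    """
    return bisect.bisect_right(split_keys, key)


kv = build_sorted_map("user", rows)
split_keys = ["user/2"]  # hypothetical split point between two regions
```

With that split, `user/1` falls in region 0 and `user/2` in region 1. Text keys sort `user/10` before `user/2`, which is one reason an order-preserving binary encoding is needed in practice.
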
  22. Index Structure • Row: ◦ Key: tablePrefix_rowPrefix_tableID_rowID (IDs are assigned by TiDB, all int64) ◦ Value: [col1, col2, col3, col4] • Index: ◦ Key: tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID ◦ Value: [null] • Keys are ordered as byte arrays in TiKV, so SCAN is supported • Every key has a timestamp appended, issued by the Placement Driver
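
A rough sketch of why byte-ordered keys make SCAN possible, following the key layouts on this slide. The prefix bytes, the integer encoding, and the helper names are assumptions for illustration and are not TiDB's actual codec; the appended timestamp is not modeled, and the index example assumes an int64 column.

```python
import bisect
import struct

# Hypothetical prefix bytes; the real values are defined by TiDB.
TABLE_PREFIX = b"t"
ROW_PREFIX = b"_r"
INDEX_PREFIX = b"_i"


def encode_int64(value):
    """Big-endian with the sign bit flipped, so byte order equals numeric order."""
    return struct.pack(">Q", value + (1 << 63))


def row_key(table_id, row_id):
    # tablePrefix_rowPrefix_tableID_rowID
    return TABLE_PREFIX + ROW_PREFIX + encode_int64(table_id) + encode_int64(row_id)


def index_key(table_id, index_id, column_value, row_id):
    # tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID
    return (
        TABLE_PREFIX + INDEX_PREFIX
        + encode_int64(table_id) + encode_int64(index_id)
        + encode_int64(column_value) + encode_int64(row_id)
    )


def scan(sorted_keys, start, end):
    """Return the keys in [start, end) from a sorted key list."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]


keys = sorted(row_key(1, r) for r in [10, 2, -5, 300])
hits = scan(keys, row_key(1, 2), row_key(1, 300))  # rows with 2 <= rowID < 300
```

Because the encoding preserves numeric order, a range of row IDs or indexed values corresponds to one contiguous run of keys.
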
  23. Guaranteeing Correctness • Formal proof using TLA+ ◦ a formal specification and verification language to reason about and prove aspects of complex systems • Raft • TSO/Percolator • 2PC • See details: https://github.com/pingcap/tla-plus
  24. MySQL Compatibility - Summary • Compatibility with MySQL 5.7 ◦ Joins, subqueries, DML, DDL etc. • On the roadmap: ◦ Views, Window Functions • Missing: ◦ Stored Procedures, Triggers, Events, Fulltext • pingcap.com/docs/sql/mysql-compatibility/
  25. MySQL Compatibility - Nuanced • Some features work differently ◦ Auto Increment ◦ Optimistic Locking • TiDB works better with smaller transactions ◦ Batching updates, deletes, and inserts to 5,000 rows is recommended • pingcap.com/docs/sql/mysql-compatibility/
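
One way to follow the batching advice on the last slide is to split a large write into chunks of at most 5,000 rows and commit each chunk as its own transaction. The sketch below assumes a Python DB-API 2.0 connection; the helper names are hypothetical, and the placeholder style in the SQL string depends on the driver in use.

```python
from itertools import islice


def batches(rows, size=5000):
    """Yield lists of at most `size` rows from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(conn, sql, rows, size=5000):
    """Run `sql` for every row, committing once per batch; returns the batch count."""
    cursor = conn.cursor()
    count = 0
    for chunk in batches(rows, size):
        cursor.executemany(sql, chunk)
        conn.commit()  # keeps each transaction at or under `size` rows
        count += 1
    return count
```

For example, 12,001 rows would be written as three transactions of 5,000, 5,000, and 2,001 rows.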