The Citus Data Scale Out Story | PostgresOpen SV 2018 | Ozgun Erdogan, Marco Slot

PostgresOpen SV 2018 Keynote | Ozgun Erdogan & Marco Slot
The Citus Data Scale Out Story
Ozgun Erdogan & Marco Slot
PostgresOpen SV | Sep 06, 2018 | San Francisco

The story before the story…
Citus Data co-founders, left to right: Ozgun Erdogan, Sumedh Pathak, Umur Cubukcu
Photo credit: Willy Johnson, 2017

Our early days at Amazon, spent working on distributed systems

Each service needs to elastically scale
• Scaling compute is easy
• Scaling data is hard
• NoSQL?

Hidden costs of NoSQL
• TRANSACTIONS
• JOINS
• DATABASE CONSTRAINTS
Re-architect application (more code) to provide missing database functionality
What if a relational database could scale out?

What better way to scale a relational database than to extend the world’s best database?

Citus Scales Out PostgreSQL
[Diagram]
COORDINATOR NODE: table metadata; sends SELECT FROM table_1001, SELECT FROM table_1002, SELECT FROM table_1003, and SELECT FROM table_1004 to the workers
WORKER NODE 1: table_1001, table_1003
WORKER NODE 2: table_1002, table_1004
WORKER NODE 'N'
Each node is a PostgreSQL instance with Citus installed. Replication not shown in diagram.
SELECT create_distributed_table('table', 'device_id');
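The routing idea in the diagram above can be sketched in a few lines of Python. This is a simplified illustration, not Citus internals: it assumes hash partitioning on the distribution column, and the CRC32 hash, the modulo mapping, the worker names, and the helpers `shard_for` and `route` are all hypothetical.

```python
import zlib

# Hypothetical coordinator metadata: which worker holds which shard.
# Shard names mirror the diagram; worker names are made up for this sketch.
SHARD_PLACEMENTS = {
    "table_1001": "worker-1",
    "table_1002": "worker-2",
    "table_1003": "worker-1",
    "table_1004": "worker-2",
}
SHARD_NAMES = sorted(SHARD_PLACEMENTS)


def shard_for(device_id: str) -> str:
    """Pick a shard by hashing the distribution column value.

    Assumption: CRC32 plus modulo stands in for whatever hash and
    range mapping a real system would use.
    """
    digest = zlib.crc32(device_id.encode("utf-8"))
    return SHARD_NAMES[digest % len(SHARD_NAMES)]


def route(device_id: str) -> tuple[str, str]:
    """Return (worker, rewritten query) for a single-device lookup."""
    shard = shard_for(device_id)
    query = f"SELECT * FROM {shard} WHERE device_id = %s"
    return SHARD_PLACEMENTS[shard], query


worker, query = route("device-42")
print(worker, "<-", query)
```

The point of the sketch is that the application keeps talking about one logical table, while the coordinator's metadata decides which physical shard and worker a given `device_id` maps to.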

Reasons to love Citus: let’s count the ways

Easy to scale out a Citus database cluster

Query Parallelization
• Citus parallelizes queries across a cluster of machines.
• Citus comes with 3 distributed executors.
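As a rough picture of what parallelizing a query across machines means, here is a minimal scatter/gather sketch in Python. It is not one of the three Citus executors: the in-memory `SHARDS` data, the thread pool, and the function names are assumptions chosen only to show per-shard work running concurrently and being combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard contents; in a real cluster each shard lives on a worker.
SHARDS = {
    "table_1001": [3, 5, 8],
    "table_1002": [1, 9],
    "table_1003": [4, 4, 7, 2],
    "table_1004": [6],
}


def count_on_shard(shard_name: str) -> int:
    """Stand-in for running SELECT count(*) against one shard."""
    return len(SHARDS[shard_name])


def parallel_count() -> int:
    """Scatter the per-shard counts, then gather and sum the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partial_counts = list(pool.map(count_on_shard, SHARDS))
    return sum(partial_counts)


print(parallel_count())
```

Each shard produces a partial result independently, so adding workers adds capacity for the per-shard step; only the final combination happens in one place.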

Distributed transactions in Citus
• Distributed transactions could lead to distributed deadlocks.
• To avoid deadlocks, we introduced a distributed deadlock detector.
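One common way to detect a distributed deadlock is to merge each node's wait-for graph and look for a cycle. The sketch below illustrates that general technique; the slides do not describe how the Citus detector is implemented, so the graph format and the helpers `merge_graphs` and `find_cycle` are assumptions.

```python
def merge_graphs(*local_graphs):
    """Combine per-node wait-for graphs (txn -> set of txns it waits on)."""
    merged = {}
    for graph in local_graphs:
        for txn, blockers in graph.items():
            merged.setdefault(txn, set()).update(blockers)
    return merged


def find_cycle(wait_for):
    """Return a list of transactions forming a cycle, or None if there is none."""
    in_progress, done = set(), set()
    path = []

    def visit(txn):
        in_progress.add(txn)
        path.append(txn)
        for blocker in wait_for.get(txn, ()):
            if blocker in in_progress:
                # Back edge: everything from blocker onward is the cycle.
                return path[path.index(blocker):]
            if blocker not in done:
                found = visit(blocker)
                if found:
                    return found
        path.pop()
        in_progress.discard(txn)
        done.add(txn)
        return None

    for txn in list(wait_for):
        if txn not in done:
            found = visit(txn)
            if found:
                return found
    return None


# Hypothetical example: each node alone sees no cycle, the merged view does.
node_1 = {"T1": {"T2"}}  # on node 1, T1 waits for a lock held by T2
node_2 = {"T2": {"T1"}}  # on node 2, T2 waits for a lock held by T1
print(find_cycle(node_1), find_cycle(merge_graphs(node_1, node_2)))
```

The example shows why a per-node detector is not enough: neither node's local graph contains a cycle, but the merged graph does, and one of the transactions in the cycle would have to be cancelled to break it.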

One important gap remains

Citus Architecture
[Diagram: the coordinator node (table metadata) and worker nodes 1 through 'N', repeated from the "Citus Scales Out PostgreSQL" slide]

What if you could talk to any node?
1. High write throughput
2. Query planning times for large deployments
3. High Availability (HA) for each node

High Write Throughput
• The Citus coordinator routes queries to worker nodes.
• For workloads that require high write throughput (10B+ writes per day), the coordinator could become a bottleneck.
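To put the 10B writes per day figure in per-second terms, a quick calculation helps. It assumes the writes are spread evenly over the day; real traffic usually peaks well above the average.

```python
WRITES_PER_DAY = 10_000_000_000   # the 10B figure from the slide
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400

# Average rate if the load were perfectly uniform across the day.
average_writes_per_second = WRITES_PER_DAY / SECONDS_PER_DAY
print(f"{average_writes_per_second:,.0f} writes per second on average")
```

That works out to roughly 115,741 writes per second on average, all of which would pass through a single coordinator in the architecture shown earlier.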

Query Planning in Large Deployments
• The coordinator plans analytical queries for distributed execution.
• >650K shards
• ½-1 seconds
• concurrency

High Availability (HA) for each shard
• When the coordinator fails, your cluster can become unavailable for up to 1 minute. That’s a long time.
• When any node can coordinate, you get higher availability on a per-shard basis.

What if we had such a solution?
• What if better HA were possible?
• What if higher write throughput were possible?
• What would such a solution’s performance look like?

Performance for prominent service today
[Two charts for a prominent Postgres service, x-axis: # nodes (4, 8, 12). Left: Transaction (OLTP) speed, y-axis: transaction performance (TPS), 0 to 250,000. Right: Analytics (OLAP) speed, y-axis: queries per hour (QPH), 0 to 120.]

Higher Performance with Citus MX
[Two charts comparing Citus with a prominent Postgres service, x-axis: # nodes (4, 8, 12). Left: Transaction (OLTP) speed, y-axis: transaction performance (TPS), 0 to 250,000. Right: Analytics (OLAP) speed, y-axis: queries per hour (QPH), 0 to 120. Callouts: 5x faster, 70x faster.]

Introducing Citus MX

Thank You.
Ozgun Erdogan, CTO & Co-Founder
Marco Slot, Principal Engineer
[email protected] [email protected]
@citusdata
www.citusdata.com

What if super HA were possible?
What if high write throughput were possible?
