
PostgreSQL at any scale | Warsaw PostgreSQL User Group | Marco Slot

Citus Data
August 05, 2020

Citus is an open source extension for PostgreSQL that turns it into a distributed database. Citus can shard or replicate tables across a cluster of PostgreSQL servers and transparently routes or parallelizes queries across the cluster, allowing you to horizontally scale your database without losing any of the powerful PostgreSQL functionality. Citus is especially suitable for scaling Software-as-a-Service (SaaS) applications and applications that require real-time analytics on large data volumes. In this talk, I'll describe the internals of Citus and show you how you can use it to power data-intensive applications. I'll also give a live demo using the Hyperscale (Citus) service on Azure.

Marco Slot is a Principal Software Engineer on the Citus team at Microsoft and is the lead engineer on the Citus extension. He has been working on PostgreSQL extensions including Citus, pg_cron, and pg_auto_failover since 2014 when he joined Citus Data. Prior to Citus Data, Marco did a PhD in cooperative self-driving cars at Trinity College Dublin and helped build CloudFront and Route 53 at Amazon Web Services.

Transcript

  1. [Architecture diagram: APPLICATION → COORDINATOR NODE → WORKER NODES W1 W2 W3 … Wn]
    A Citus cluster consists of multiple PostgreSQL servers with the Citus extension.
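    Not from the deck, but a minimal sketch of how such a cluster is typically wired up from the coordinator, assuming Citus 9.x-era function names and hypothetical worker hostnames:

      -- On every node: load the extension (citus must be in shared_preload_libraries).
      CREATE EXTENSION citus;

      -- On the coordinator: register the worker nodes (hostnames are hypothetical).
      SELECT * FROM master_add_node('worker-1.example.com', 5432);
      SELECT * FROM master_add_node('worker-2.example.com', 5432);

      -- Check which workers the coordinator knows about.
      SELECT * FROM master_get_active_worker_nodes();
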
  2. How Citus distributes tables across the database cluster:
    CREATE TABLE campaigns (…);
    SELECT create_distributed_table('campaigns', 'company_id');
    [Diagram: the coordinator stores the metadata and issues CREATE TABLE campaigns_101 … CREATE TABLE campaigns_106 across worker nodes W1 W2 W3 … Wn]
  3. Distributed tables: create_distributed_table('companies', 'company_id') — see the sketch below.
    Data is hash-partitioned across worker nodes based on distribution column values.
    Limitations:
    • Unique constraints must include the distribution column
    • Foreign keys/join (performance) limitations (more on that later)
    • Missing features: triggers, table inheritance (but you can use them...)
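    A minimal sketch of the pattern with a hypothetical campaigns schema; note that the primary key includes the distribution column, as the first limitation above requires:

      -- Hypothetical schema; columns are illustrative.
      CREATE TABLE campaigns (
          company_id  text   NOT NULL,
          campaign_id bigint NOT NULL,
          name        text,
          spend       numeric,
          start_date  date,
          PRIMARY KEY (company_id, campaign_id)   -- must include company_id
      );

      -- Shard by company_id: all rows of a company end up in the same shard.
      SELECT create_distributed_table('campaigns', 'company_id');
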
  4. How Citus routes a single-row INSERT:
    INSERT INTO campaigns (company_id, …) VALUES ('Microsoft', …);
    The coordinator computes hashtext('Microsoft') = 76743714, looks up the matching hash range in its metadata ([0,134217727] → campaigns_102, [134217728,268435455] → campaigns_103, …), and sends INSERT INTO campaigns_102 VALUES … to the worker node that holds that shard.
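    A small sketch of how to inspect that routing yourself (assumes the distributed campaigns table from the earlier sketch; all of these run on the coordinator):

      -- The hash the coordinator computes for the distribution column value.
      SELECT hashtext('Microsoft');

      -- The shard that value is routed to.
      SELECT get_shard_id_for_distribution_column('campaigns', 'Microsoft');

      -- The hash ranges of all shards, from the Citus metadata.
      SELECT shardid, shardminvalue, shardmaxvalue
      FROM pg_dist_shard
      WHERE logicalrelid = 'campaigns'::regclass;
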
  5. How Citus performs bulk loading across the database cluster:
    COPY campaigns FROM STDIN WITH (format 'csv');
    Input rows (Microsoft,2938,13,3.7,…  Acme,386,13,12.8,…  …) are routed by the coordinator to the shards as COPY campaigns_101 / campaigns_102 / campaigns_103 FROM STDIN WITH (format 'binary') on worker nodes W1 W2 W3 … Wn.
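    A minimal client-side sketch of the same bulk load from psql (the file name is hypothetical); Citus intercepts COPY on the coordinator and streams the rows to the shards:

      \copy campaigns FROM 'campaigns.csv' WITH (format csv)
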
  6. Citus COPY performance
    [Chart: rows loaded using COPY per second (higher is better), multi-threaded (10 threads) COPY with a primary key — Citus (160 cores), Citus (64 cores), PostgreSQL (64 cores)]
  7. How Citus distributes queries across the database cluster:
    SELECT company_id, sum(spend) AS campaign_spend FROM campaigns GROUP BY company_id;
    The coordinator sends per-shard fragments (SELECT company_id, sum(spend) FROM campaigns_101 …, campaigns_102 …, campaigns_103 …) to the worker nodes W1 W2 W3 … Wn and runs them in parallel.
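    A quick way to see this for yourself (a sketch, assuming the campaigns table from earlier): EXPLAIN on the coordinator shows the distributed plan and the per-shard tasks the aggregate is split into:

      EXPLAIN (VERBOSE, COSTS OFF)
      SELECT company_id, sum(spend) AS campaign_spend
      FROM campaigns
      GROUP BY company_id;
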
  8. Citus parallel SELECT performance
    [Chart: query time in seconds (lower is better) for a simple analytical query on 100GB of data — Citus (160 cores), Citus (64 cores), PostgreSQL (64 cores)]
  9. How Citus handles transactions in a multi-node cluster:
    BEGIN;
    UPDATE campaigns SET start_date = '2020-03-17' WHERE company_id = 'Microsoft';
    …
    COMMIT;
    The coordinator runs a matching transaction on the worker that holds the shard: BEGIN; UPDATE campaigns_102 SET …; …; COMMIT;
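    A sketch of the distinction (hypothetical values): a transaction that only touches one company_id is routed to a single worker as on the slide, while writes that span shards on multiple workers are committed with two-phase commit:

      -- Single-tenant transaction: routed to one worker.
      BEGIN;
      UPDATE campaigns SET start_date = '2020-03-17' WHERE company_id = 'Microsoft';
      UPDATE campaigns SET spend = 0                 WHERE company_id = 'Microsoft';
      COMMIT;

      -- Multi-shard write: Citus coordinates it with two-phase commit.
      UPDATE campaigns SET start_date = start_date + 1;
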
  10. Citus HammerDB TPC-C performance
    [Chart: New Order Transactions Per Minute (NOPM), higher is better; HammerDB TPC-C with 1000 warehouses and 400 vusers — Citus (256 cores): 606k, Citus (160 cores): 434k, Citus (64 cores): 207k, PostgreSQL (64 cores): 254k]
  11. Distributed tables can be co-located: create_distributed_table('campaigns', 'company_id', colocate_with := 'companies') — see the sketch below.
    Distributed tables can be co-located with other distributed tables: the same hash range is always on the same worker node. Co-location enables:
    • full query push-down when filtering all tables by the distribution column
    • foreign keys between distributed tables on the distribution column
    • efficient joins between distributed tables on the distribution column
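    A sketch of the co-location pattern, reusing the hypothetical campaigns schema from above (here campaigns is distributed co-located with companies rather than on its own):

      CREATE TABLE companies (company_id text PRIMARY KEY, name text);
      SELECT create_distributed_table('companies', 'company_id');

      -- Place campaigns shards on the same workers as the matching companies shards.
      SELECT create_distributed_table('campaigns', 'company_id',
                                      colocate_with := 'companies');

      -- Allowed because the foreign key includes the distribution column.
      ALTER TABLE campaigns
        ADD CONSTRAINT campaigns_company_fkey
        FOREIGN KEY (company_id) REFERENCES companies (company_id);

      -- Filtering both tables by the distribution column pushes the whole
      -- query down to a single worker.
      SELECT c.name, sum(g.spend)
      FROM companies c
      JOIN campaigns g USING (company_id)
      WHERE company_id = 'Microsoft'
      GROUP BY c.name;
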
  12. Reference tables: create_reference_table('categories') — see the sketch below.
    Reference tables are replicated to all worker nodes. This enables:
    • foreign keys from distributed (or reference) tables on any column
    • efficient joins with distributed (or reference) tables on any column
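    A sketch with a hypothetical categories table; because reference tables are replicated to every worker, a distributed table can reference them and join with them on any column:

      CREATE TABLE categories (category_id int PRIMARY KEY, name text);
      SELECT create_reference_table('categories');

      -- Hypothetical extra column on campaigns, plus a foreign key to the
      -- reference table (not restricted to the distribution column).
      ALTER TABLE campaigns ADD COLUMN category_id int;
      ALTER TABLE campaigns
        ADD CONSTRAINT campaigns_category_fkey
        FOREIGN KEY (category_id) REFERENCES categories (category_id);

      -- Joins with the reference table stay local on each worker.
      SELECT cat.name, count(*)
      FROM campaigns c
      JOIN categories cat USING (category_id)
      GROUP BY cat.name;
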
  13. Citus helps ASB onboard customers 20x faster
    “After migrating to Citus, we can onboard Vonto customers 20X faster, in 2 minutes vs. the 40+ minutes it used to take. And with the launch of Hyperscale (Citus) on Azure Database for PostgreSQL, we are excited to see what we can build next on Azure.”
    100 GB+ data · Multi-tenant SaaS application · Milliseconds latency
  14. Benefits of Citus for real-time analytics (see the rollup sketch below):
    Operation                    Achievable performance
    Bulk loading                 1M rows/sec
    SELECT from raw data         10-100M rows/sec
    INSERT..SELECT into rollup   1-100M rows/sec
    SELECT on rollup table       1k-100k queries/sec
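    A sketch of the rollup pattern behind these numbers, with hypothetical raw-event and rollup tables (both distributed and co-located on company_id):

      CREATE TABLE events (
          company_id text        NOT NULL,
          event_time timestamptz NOT NULL,
          spend      numeric     NOT NULL
      );
      SELECT create_distributed_table('events', 'company_id');

      CREATE TABLE spend_rollup (
          company_id  text    NOT NULL,
          day         date    NOT NULL,
          total_spend numeric NOT NULL DEFAULT 0,
          PRIMARY KEY (company_id, day)
      );
      SELECT create_distributed_table('spend_rollup', 'company_id',
                                      colocate_with := 'events');

      -- Distributed INSERT .. SELECT: each worker aggregates its own shards
      -- in parallel and upserts into the co-located rollup shards.
      INSERT INTO spend_rollup (company_id, day, total_spend)
      SELECT company_id, event_time::date, sum(spend)
      FROM events
      GROUP BY company_id, event_time::date
      ON CONFLICT (company_id, day)
      DO UPDATE SET total_spend = EXCLUDED.total_spend;

      -- Dashboards then read from the small rollup table.
      SELECT day, total_spend
      FROM spend_rollup
      WHERE company_id = 'Microsoft'
      ORDER BY day;
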
  15. Microsoft Windows relies on Citus for mission-critical decisions
    “Ship/no-ship decisions for Microsoft Windows are made using Hyperscale (Citus), where our team runs on-the-fly analytics on billions of JSON events with sub-second responses. Distributed PostgreSQL is a game changer.”
    1 PB+ data · 6M+ queries per day; 75% of queries completing in < 0.2 secs