
Citus 10 Open Source & Columnar Storage for Postgres | contributing today | Claire Giordano & Nils Dijk

Citus 10 is out! It's a spectacular new release from our Citus open source team. Citus 10 gives you columnar storage for Postgres and Citus on a single node, plus we’ve open sourced the shard rebalancer. Come see a demo & learn how the Citus extension gives you Postgres at any scale, from a single node to a distributed cluster, and how easy it is to give Citus a try.

Citus Data

March 17, 2021

Transcript

  1. None
  2. aka.ms/citus

  3. What is Citus? Extension to Postgres (not a fork!)

    Simplicity & flexibility of using PostgreSQL, at scale: • Distributed tables • Reference tables • Add nodes • Rebalance • & more, as of Citus 10. Multi-purpose: • Scale transactional workloads • Scale analytical workloads • Mixed workloads too
  4. Why

  5. None
  6. planner, executor, transactions Background workers foreign data wrappers published in

    1986
  7. Why be an extension to Postgres (and not a fork?)

    Vast ecosystem
  8. Developers ❤ Postgres

  9. Why Citus, Reason #1: Postgres limited to single node

    Capacity / execution time issues: • Working set does not fit in memory • Reaching limits of network-attached storage (IOPS) / CPU • Analytical query takes too long • Data transformations are single-threaded (e.g. insert..select) • Autovacuum cannot keep up with transactional workload • …
  10. Why Citus, Reason #2: Because Postgres includes:

    • Joins • Functions • Constraints • Indexes: B-tree, GIN, BRIN, & GiST • Partial Indexes • Other extensions • PostGIS • Rich datatypes • JSONB • Window functions • CTEs • Atomic update / delete • Partitioning • Interactive transactions • Open source • …
  11. A Citus cluster consists of multiple Postgres nodes with the Citus extension.

    COORDINATOR NODE: CREATE EXTENSION citus; SELECT citus_add_node(…); SELECT citus_add_node(…); SELECT citus_add_node(…); WORKER NODES W1 W2 W3 … Wn: CREATE EXTENSION citus; CREATE EXTENSION citus; CREATE EXTENSION citus;
  12. How Citus distributes tables across the database cluster

    APPLICATION: CREATE TABLE campaigns (…); SELECT create_distributed_table('campaigns', 'company_id'); COORDINATOR NODE (METADATA); WORKER NODES W1 W2 W3 … Wn: CREATE TABLE campaigns_101 CREATE TABLE campaigns_102 CREATE TABLE campaigns_103 CREATE TABLE campaigns_104 CREATE TABLE campaigns_105 CREATE TABLE campaigns_106
  13. How Citus distributes queries across the database cluster

    APPLICATION: SELECT campaign_id, avg(spend) AS avg_campaign_spend FROM campaigns GROUP BY campaign_id; COORDINATOR NODE (METADATA); WORKER NODES W1 W2 W3 … Wn: SELECT company_id, sum(spend), count(spend) … FROM campaigns_101 … SELECT company_id, sum(spend), count(spend) … FROM campaigns_102 … SELECT company_id, sum(spend), count(spend) … FROM campaigns_103 …
  14. easy # run PostgreSQL with Citus on port 5500 docker

    run = citusdata/citus
  15. easy

    CREATE TABLE users(id bigserial primary key, name text); SELECT create_distributed_table('users', 'id'); SELECT count(*) FROM users;
  16. None
  17. None
  18. None
  19. aka.ms/citus10

  20. Citus single node: Citus Coordinator

    Distributed Citus cluster: Citus Coordinator, Citus Workers
  21. slack.citusdata.com

  22. Columnar Storage Row-based storage

  23. CREATE TABLE events( ts timestamptz, i int, n numeric, s

    text); CREATE TABLE events_columnar( ts timestamptz, i int, n numeric, s text) USING columnar;
  24. Citus Columnar && Range Partitioning in Postgres CREATE TABLE events(

    ts timestamptz, i int, n numeric, s text) PARTITION BY RANGE (ts); CREATE TABLE events_2021_jan PARTITION OF events FOR VALUES FROM ('2021-01-01') TO ('2021-02-01'); CREATE TABLE events_2021_feb PARTITION OF events FOR VALUES FROM ('2021-02-01') TO ('2021-03-01');
  25. events table

  26. Citus Columnar && Range Partitioning in Postgres SELECT alter_table_set_access_method( 'events_2021_jan',

    'columnar');
  27. events table …

  28. events table …

  29. events table

  30. None
  31. None
  32. In Citus 10, we open sourced Citus Shard Rebalancer

  33. Easy to rebalance shards after adding a new Citus node

  34. What if shards get out-of-balance on existing nodes?

  35. Rebalancing shards to optimize for performance, too

  36. None
  37. "Distributed PostgreSQL is a game changer."

    Min Wei, Principal Engineer at Microsoft. aka.ms/blog-petabyte-scale-analytics
  38. aka.ms/azure-portal-postgres Try Citus on Azure

  39. Citus Newsletter aka.ms/citus-newsletter

  40. Questions? nils.dijk@microsoft.com claire.giordano@microsoft.com

    • Citus repo on GitHub: aka.ms/citus • Citus Public Slack for open source Q&A: slack.citusdata.com • Citus Docs: docs.citusdata.com • Definitive Citus 10 blog post by Marco: aka.ms/citus10 • Download Citus open source: citusdata.com/download/
  41. If you need to scale Postgres, learn more about Citus 10

    As of Citus 10, Citus now includes columnar compression & Citus on a single node. We’ve open sourced the shard rebalancer too.
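
A note on slide 13: the per-shard queries there ask each worker for sum(spend) and count(spend) rather than avg(spend). The following is a minimal Python sketch of why that works, not Citus code. The shard contents, the function names `worker_partial` and `coordinator_merge`, and the in-memory merge are all assumptions made for illustration. It only shows that an average can be rebuilt exactly from per-shard sums and counts, even when one campaign's rows are spread over several shards.

```python
from collections import defaultdict

# Hypothetical (campaign_id, spend) rows, grouped by the shard that holds them.
# Campaign 1 deliberately spans two shards.
shards = {
    "campaigns_101": [(1, 10.0), (1, 20.0), (2, 30.0)],
    "campaigns_102": [(1, 60.0)],
    "campaigns_103": [(2, 50.0), (3, 5.0)],
}

def worker_partial(rows):
    """Per-shard step: return {campaign_id: [sum(spend), count(spend)]}."""
    partial = defaultdict(lambda: [0.0, 0])
    for campaign_id, spend in rows:
        partial[campaign_id][0] += spend
        partial[campaign_id][1] += 1
    return partial

def coordinator_merge(partials):
    """Merge step: add up partial sums and counts, then divide once."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for campaign_id, (spend_sum, spend_count) in partial.items():
            totals[campaign_id][0] += spend_sum
            totals[campaign_id][1] += spend_count
    return {cid: s / c for cid, (s, c) in sorted(totals.items())}

avg_campaign_spend = coordinator_merge(worker_partial(rows) for rows in shards.values())
print(avg_campaign_spend)  # {1: 30.0, 2: 40.0, 3: 5.0}
```

Averaging the per-shard averages instead would give 37.5 for campaign 1 (the mean of 15.0 and 60.0), which is why the sketch carries sums and counts to the merge step.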