Citus & Patroni: The Key to Scalable and Fault-Tolerant PostgreSQL

Citus is an open source extension to PostgreSQL that lets you scale out your database horizontally by sharding your data across many nodes, and Patroni is an open source tool for managing and automating PostgreSQL High Availability. Combined, the three open source projects (PostgreSQL, Citus, and Patroni) become a superhero: a scalable PostgreSQL cluster with self-healing capabilities.

In my presentation I will cover the implementation details of the Patroni & Citus integration, give a live demo of cluster deployment, and showcase maintenance on Citus worker nodes without interrupting client connections.

Alexander Kukushkin

May 10, 2023

Transcript

  1. 1

  2. Agenda for this Citus & Patroni talk @ Citus Con: An Event for Postgres 2023

    1. What (business) problems do we solve?
    2. High Availability with Patroni
    3. Horizontal scalability with Citus
    4. PostgreSQL + Citus + Patroni for the win
    5. Live demo!
    6. Conclusion
  3. As an application developer I want...

    • A database
    • A highly available database
    • A highly available database that scales
  4. As an application developer I want...

    • A database
    • A highly available database
    • A highly available database that scales
    • A highly available database that scales horizontally!
  5. As an application developer I want...

    • A database
    • A highly available database
    • A highly available database that scales
    • A highly available database that scales horizontally!
    • A highly available database that scales horizontally for free!
  6. Application Level Sharding

    [Diagram: the Application sends each query directly to one of four shards, chosen by ID % 4]
    • Shard 1 (ID%4 == 0): SELECT * FROM u WHERE id=4
    • Shard 2 (ID%4 == 1): SELECT * FROM u WHERE id=5
    • Shard 3 (ID%4 == 2): SELECT * FROM u WHERE id=6
    • Shard 4 (ID%4 == 3): SELECT * FROM u WHERE id=7
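The routing on this slide can be sketched in a few lines of Python. This is only an illustration of the modulo scheme shown in the diagram; the function names `shard_for` and `route` are hypothetical and not taken from the talk.

```python
SHARD_COUNT = 4  # four shards, as in the slide's diagram


def shard_for(row_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Return the zero-based shard index for a row: id modulo shard count."""
    return row_id % shard_count


def route(row_id: int) -> tuple[int, str]:
    """Pair the target shard index with the query the application sends there.

    The query text mirrors the slide; real code should use bound parameters
    rather than string formatting.
    """
    return shard_for(row_id), f"SELECT * FROM u WHERE id={row_id}"


if __name__ == "__main__":
    for row_id in (4, 5, 6, 7):
        print(route(row_id))
```

Every piece of application code that touches table `u` has to go through a function like `route`, which is why this approach has to live in the application itself.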
  7. Disadvantages of Application Level Sharding

    • Must be implemented in the application
    • Separate connection (pool) for every shard
    • Must be introduced from day 1
    • Depending on the implementation, adding new shards is either impossible or hard
    • If new shards are added, they are heavily unbalanced
    • Rebalancing (without downtime) is a challenge
    • Schema changes must be deployed on all shards
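To see why adding shards is hard, here is a rough sketch that assumes the simple `id % N` placement from the previous slide (other placement schemes behave differently). It counts how many rows would land on a different shard if the shard count grew from 4 to 5; the helper name `moved_fraction` is hypothetical.

```python
def moved_fraction(ids, old_count: int, new_count: int) -> float:
    """Fraction of ids whose shard index changes when the shard count changes."""
    ids = list(ids)
    moved = sum(1 for i in ids if i % old_count != i % new_count)
    return moved / len(ids)


if __name__ == "__main__":
    # Going from 4 to 5 shards under modulo placement.
    print(moved_fraction(range(1000), 4, 5))
```

Under this assumption, most existing rows would have to be physically moved to keep the modulo rule valid, which is the rebalancing challenge the slide refers to.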
  8. Sharding with Citus

    [Diagram: the Application connects to the Coordinator, which sits in front of Worker 1, Worker 2, Worker 3, and Worker 4]
  9. Sharding with Citus: Pros & Cons

    Pros
    • Transparent sharding
    • A single connection (pool)
    • Query from any node!
    • Schema changes via coordinator
    • Start small and scale out
    • Rebalancing without downtime!

    Cons
    • Increased query latency
  10. Citus HA with Patroni 3.0

    [Diagram: the Citus coordinator, Citus worker 1, and Citus worker 2 each consist of a primary and a standby]
  11. Enabling Citus support in Patroni 3.0

        scope: my-citus-cluster  # must be the same on all nodes
        citus:
          group: X  # 0 for coordinator and 1, 2, 3, etc. for workers
          database: citus  # must be the same on all nodes
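As a small illustration of how the fragment differs between nodes, the sketch below renders it for the coordinator and two worker groups. Only the keys `scope`, `citus.group`, and `citus.database` come from the slide; the helper `patroni_citus_fragment` is hypothetical, and a real patroni.yaml needs many more settings that are not shown here.

```python
def patroni_citus_fragment(
    group: int,
    scope: str = "my-citus-cluster",
    database: str = "citus",
) -> str:
    """Render the Citus-related part of patroni.yaml for one node.

    group 0 is the coordinator; 1, 2, 3, ... are worker groups.
    scope and database must be identical on every node.
    """
    if group < 0:
        raise ValueError("citus.group must be 0 (coordinator) or a positive worker number")
    return (
        f"scope: {scope}\n"
        "citus:\n"
        f"  group: {group}\n"
        f"  database: {database}\n"
    )


if __name__ == "__main__":
    for group in (0, 1, 2):  # coordinator, worker 1, worker 2
        print(patroni_citus_fragment(group))
```

The only value that changes from one Patroni cluster to the next is `group`; primary and standby nodes of the same group share the same fragment.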
  12. Adding new workers to the Citus cluster

    1. Adjust the citus.group parameter in patroni.yaml
    2. Start Patroni on the new node
    3. SELECT * FROM citus_rebalance_start(); -- rebalance shards
    4. SELECT * FROM citus_rebalance_status(); -- check rebalance progress
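Step 4 is typically repeated until the rebalance is done. The following is a minimal polling sketch, not part of the talk. It assumes the row returned by `citus_rebalance_status()` has a `state` column and that `finished`, `failed`, and `cancelled` are terminal values; check the Citus documentation for your version. The `fetch_state` callable is a hypothetical hook that you would implement with your own database driver.

```python
import time
from typing import Callable, Optional

# Assumed terminal values of the "state" column; verify against your Citus version.
TERMINAL_STATES = {"finished", "failed", "cancelled"}


def wait_for_rebalance(
    fetch_state: Callable[[], Optional[str]],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the rebalance reaches a terminal state and return that state.

    fetch_state should run "SELECT state FROM citus_rebalance_status()" and
    return the value, or None when no rebalance job is reported.
    """
    for _ in range(max_polls):
        state = fetch_state()
        if state is None or state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError("rebalance did not reach a terminal state in time")
```

Passing the query as a callable keeps the loop independent of any particular driver and makes it easy to exercise with a fake sequence of states.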
  13. Conclusion

    • Enterprise-grade sharding with Citus
    • Enterprise-grade HA with Patroni
    • Everything is open source!
    • Start small and scale
        • Vertically
        • Horizontally
  14. Thank you!

    • Citus open source repo on GitHub: aka.ms/citus
    • Patroni repo on GitHub: github.com/zalando/patroni
    • Alexander on Twitter: @cyberdemn