Slide 1

Slide 1 text

Deep-dive Azure Database for PostgreSQL Hyperscale (Citus) Adam Wołk 2022-09-22

Slide 2

Slide 2 text

©Microsoft Corporation Azure Contents 1 2 3 4 NoSQL / RDBMS PostgreSQL Citus Azure for PostgreSQL Hyperscale (Citus)

Slide 3

Slide 3 text

NoSQL / RDBMS

Slide 4

Slide 4 text

©Microsoft Corporation Azure NoSQL • Non-relational • Schemaless • Document storage • Horizontal scale-out

Slide 5

Slide 5 text

©Microsoft Corporation Azure RDBMS • Relational • ACID guarantees • Rich data-types and indexing • Vertical scaling

Slide 6

Slide 6 text

©Microsoft Corporation Azure NoSQL vs RDBMS Hard with NoSQL • Ad-hoc queries • App encoded schema Hard with RDBMS • Horizontal scaling Both worlds are converging into a hybrid model. PostgreSQL: PostgreSQL 9.4 Press Kit

Slide 7

Slide 7 text

PostgreSQL

Slide 8

Slide 8 text

Why use PostgreSQL? Open-Source Developed in the open, free for use including commercial projects. Rich data types Over 20 data types in core, including advanced JSON support. Performance Variety of available indexing methods, including specialized ones for full text search and geo- spatial data. Reliability ACID compliant database, with over 30 years of development of the core engine. Hot standby failovers, replication and PITR all present. Extensibility Over 1000 extensions adding vast feature- sets to the core engine. Geo-spatial queries, new types and indexing methods, integration with existing services. Standard Compliant 170 out of 179 mandatory features of SQL:2016

Slide 9

Slide 9 text

©Microsoft Corporation Azure PostgreSQL industry adoption "For Professional Developers PostgreSQL just barely took over the first place spot from MySQL. Professional Developers are more likely than those learning to code to use Redis, PostgreSQL, Microsoft SQL Server, and Elasticsearch. MongoDB is used by a similar percentage of both Professional Developers and those learning to code and it’s the second most popular database for those learning to code (behind MySQL). This makes sense since it supports a large number of languages and application development platforms.” StackOverflow 2022 Developer Survey Stack Overflow Developer Survey 2022

Slide 10

Slide 10 text

©Microsoft Corporation Azure Rich JSON support • JSONB (binary JSON storage) • Indexing JSONB columns • Ad-hoc queries against JSONB data • JSONPath support PostgreSQL: Documentation: 14: 8.14. JSON Types

Slide 11

Slide 11 text

©Microsoft Corporation Azure Query JSON Data PostgreSQL: Documentation: 14: 8.14. JSON Types

Slide 12

Slide 12 text

Citus

Slide 13

Slide 13 text

Why use Citus? Open-Source Developed in the open, free for use including commercial projects. Postgres at any Scale Scale out Postgres by distributing your data & queries across a cluster. And it’s simple to add nodes & rebalance shards when you need to grow. Parallelized Performance Speed up queries by 20x to 300x (or more) through parallelism, keeping more data in memory, higher I/O bandwidth, and columnar compression. Run anywhere Self-host or run Citus in the cloud as a built-in option on Azure Database for PostgreSQL. True PostgreSQL Get all the benefits of PostgreSQL with the added magic of distributed tables. Vendor neutral Plain PostgreSQL, you can always take your data and go elsewhere. No vendor lock in.

Slide 14

Slide 14 text

©Microsoft Corporation Azure What is sharding? tenant_id A 1 1 Text 2 1 Text 3 2 Text 4 3 Text tenant_id A 1 1 Text 2 1 Text tenant_id A 3 2 Text tenant_id A 4 3 Text

Slide 15

Slide 15 text

©Microsoft Corporation Azure Citus Cluster Architecture

Slide 16

Slide 16 text

Azure for PostgreSQL Hyperscale (Citus)

Slide 17

Slide 17 text

Why use Azure for PostgreSQL Hyperscale (Citus)? Open-Source Running the same version of Citus available to everyone with the latest PostgreSQL version available. Cloud Infrastructure Focus on your application and forget about your database. Expert Support Team of Citus and PostgreSQL experts ready and capable to help with the most dreadful database issues you may encounter. Reliability HA, backups, PITR recovery, geo read replicas – without the pain of setting them up. Monitoring Insights into the performance and inner workings of your cluster. Regulation Compliance Benefit by leveraging on Azure compliance with global regulations.

Slide 18

Slide 18 text

©Microsoft Corporation Azure Will it scale? – Telemetry for all of Windows 2 clusters (54 nodes) • 32 nodes • 22 nodes • 3,456 cores • 27 TB of memory. • 1.6 PB of Azure Premium SSD Managed Disks • P75 – 90ms • P95 - <1s Architecting petabyte-scale analytics by scaling out Postgres on Azure with the Citus extension (microsoft.com)

Slide 19

Slide 19 text

©Microsoft Corporation Azure Will it scale? – COVID dashboard for all of UK • 250,000 to 300,000 hits per minute at peak. • concurrent users at peak is 60,000 to 100,000 • 7.5 billion records • 5 million data points returned in under 10 seconds • Cache invalidation durations never exceed 2 minutes. UK COVID-19 dashboard built using Postgres and Citus for millions of users - Microsoft Tech Community

Slide 20

Slide 20 text

©Microsoft Corporation Azure Citus 11.1 Release Party

Slide 21

Slide 21 text

© Copyright Microsoft Corporation. All rights reserved. Thank you.