Building PostgreSQL apps at scale with Hyperscale (Citus) | Microsoft Build 2019 | Craig Kerstiens, Sunil Kamath

Craig Kerstiens Sunil Kamath Mathew Stokes

PostgreSQL is more popular than ever
• Loved / wanted: https://insights.stackoverflow.com/survey/2019?utm_source=so-owned&utm_medium=blog&utm_campaign=dev-survey-2019&utm_content=launch-blog
• DBMS of the Year: https://db-engines.com/en/blog_post/76

Why Postgres?
• Data types: JSONB, hstore, Arrays, PostGIS, Time series
• Indexes: B-tree index, GIN index, GiST index, SP-GiST index, BRIN index, KNN, Concurrent indexing
• Core features: MVCC, CTEs, Window functions, Transactional DDL, Listen/notify, Fast column addition
• Extensibility: Extensions, Foreign data wrappers
• Safety, Proven track record

Focus on your application, not your database
Enterprise-grade managed services for PostgreSQL

[Figure: responsibility stack compared across three deployment models, with each layer marked as managed by Microsoft or managed by customer.
Deployment models: On-premises PostgreSQL; IaaS (Azure VMs with PostgreSQL); PaaS (Azure Database for PostgreSQL).
Layers: Datacenter management, Hardware, Virtualization, O/S provision/patching, Database provision/patch/scaling, High availability/DR/backups, Data, Applications.
Listed with the PaaS column: Intelligent performance/security, Machine learning capability.]

Build or migrate your workloads with confidence
Deployment options: Single Server, Hyperscale (Citus) NEW
• Fully managed and secure
• Intelligent performance optimization
• Flexible and open
• High performance scale-out with Hyperscale (Citus)

Single Server
Community-based single node PostgreSQL with built-in High Availability
Example use cases
• Transactional and operational analytics workloads
• Apps requiring JSON, geospatial support or full-text search
• Greenfield apps built with modern frameworks

Hyperscale (Citus) NEW
Worry-free PostgreSQL in the cloud with an architecture that is built to scale out
Example use cases
• Scaling PostgreSQL multi-tenant, SaaS applications
• Real-time operational analytics
• Building high throughput transactional apps

Cache rules everything around me

Cache hit ratio

    SELECT
      sum(heap_blks_read) as heap_read,
      sum(heap_blks_hit) as heap_hit,
      sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
    FROM pg_statio_user_tables;

Cache hit ratio

          name      |         ratio
    ----------------+------------------------
     cache hit rate |                   0.99

Index hit ratio

    SELECT
      relname,
      100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used,
      n_live_tup rows_in_table
    FROM pg_stat_user_tables
    WHERE seq_scan + idx_scan > 0
    ORDER BY n_live_tup DESC;

Index hit ratio

           relname       | percent_of_times_index_used | rows_in_table
    ---------------------+-----------------------------+---------------
     events              |                           0 |        669917
     user_info           |                           3 |         46718
     rollouts            |                           0 |         34078
     favorites           |                           0 |          3059
     authorizations      |                           0 |             0
     delayed_jobs        |                          23 |             0

Table cache hit ratio target > 99%
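The aggregate ratio above can hide a single table that misses the 99% target. A minimal sketch of a per-table breakdown follows, using the same pg_statio_user_tables view. The rounding to four decimals and the ascending sort are illustrative choices, not part of the original deck.

```sql
-- Per-table heap cache hit ratio, worst tables first.
-- nullif() guards against division by zero for tables with no block reads or hits.
SELECT
  relname,
  heap_blks_hit,
  heap_blks_read,
  round(heap_blks_hit::numeric
        / nullif(heap_blks_hit + heap_blks_read, 0), 4) AS ratio
FROM pg_statio_user_tables
WHERE heap_blks_hit + heap_blks_read > 0
ORDER BY ratio ASC;
```

Tables near the top of this result with a ratio well below 0.99 are candidates for a closer look at working-set size versus available memory.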

https://github.com/savjani/postgres-assets/blob/master/SQL%20Notebooks/Postgres_database_health_check_notebook.ipynb

Intelligent performance
• Customized recommendations
• Performance troubleshooting
• Data visualization

azure_sys, azure_maintenance, postgres, UserDB

I HAVE NO IDEA

• Scale out horizontally
• Blazing performance
• Simplified infrastructure
• Stay current with PostgreSQL innovations

Under the covers data is sharded

APPLICATION

    SELECT count(*)
    FROM ads JOIN campaigns ON ads.company_id = campaigns.company_id
    WHERE ads.designer_name = 'Isaac'
    AND campaigns.company_id = 'Elly Co';

COORDINATOR NODE (METADATA)

    SELECT … FROM ads_1001, campaigns_2001 …

WORKER NODES

    W1 W2 W3 … Wn

• It's logical to place shards containing related rows of related tables together on the same nodes
• Join queries between related rows can reduce the amount of data sent over the network
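The co-location idea on this slide can be sketched with the Citus create_distributed_table function. The table definitions below are hypothetical (the slide shows only the query, not the schema). company_id is assumed to be a text column because the query compares it to 'Elly Co', and the sketch assumes a database with the Citus extension enabled.

```sql
-- Hypothetical schemas for illustration only; the deck does not show them.
CREATE TABLE campaigns (
    company_id  text,
    campaign_id bigint,
    name        text
);

CREATE TABLE ads (
    company_id    text,
    ad_id         bigint,
    campaign_id   bigint,
    designer_name text
);

-- Shard both tables on company_id so rows for one company land on the same worker.
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id', colocate_with => 'campaigns');

-- Inspect the shard metadata that the coordinator keeps.
SELECT logicalrelid, shardid, shardminvalue, shardmaxvalue
FROM pg_dist_shard
ORDER BY logicalrelid, shardid;
```

With both tables distributed on the same column, the join from the slide filters on a single company_id, so it can be answered from one pair of co-located shards instead of moving rows between workers.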

Create a table

    CREATE TABLE github_events
    (
        event_id bigint,
        event_type text,
        event_public boolean,
        repo_id bigint,
        payload jsonb,
        repo jsonb,
        user_id bigint,
        org jsonb,
        created_at timestamp
    );

    CREATE TABLE github_users
    (
        user_id bigint,
        url text,
        login text,
        avatar_url text,
        gravatar_id text,
        display_login text
    );
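A plausible next step after creating these tables is to distribute them across the worker nodes. The sketch below assumes user_id as the distribution column for both tables, since it is the only column they share. The transcript does not state which column the speakers chose, and the example again assumes a database with the Citus extension enabled.

```sql
-- Assumption: shard both tables on user_id so a user's events sit with the user row.
SELECT create_distributed_table('github_users', 'user_id');
SELECT create_distributed_table('github_events', 'user_id');

-- A join on the distribution column can be executed shard by shard on the workers.
SELECT u.login, count(*) AS event_count
FROM github_events e
JOIN github_users u ON e.user_id = u.user_id
GROUP BY u.login
ORDER BY event_count DESC
LIMIT 10;
```

Application queries stay plain SQL against the coordinator; only the two create_distributed_table calls are specific to Citus.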

Hyperscale (Citus) helps ASB onboard customers 20x faster

• Flexible and open
• High performance scale-out with Hyperscale (Citus)
• Fully managed and secure
• Intelligent performance optimization

More OSS DBs at Data Showcase Demo

9 TB

# \dt List of relations Schema | Name | Type
