Pro Yearly is on sale from $80 to $50! »

Building PostgreSQL apps at scale with Hyperscale (Citus) | Microsoft Build 2019 | Craig Kerstiens, Sunil Kamath

Building PostgreSQL apps at scale with Hyperscale (Citus) | Microsoft Build 2019 | Craig Kerstiens, Sunil Kamath

Postgres is a powerful open source database with different features and tooling. If you are new to Postgres, how do you ensure it’s performant and scale? We’ll start with the basics you need to know as an app developer, then dig into how to optimize apps and database to achieve optimal performance. We’ll look at things like cache hit ratio, index hit ratio and sharding with Citus as you scale your apps. This session will also include learnings from an architect of a leading banking institute who were able to successfully build a highly performing and scalable app. If you're coming from a SQL server we'll cover how things differ and the equivalent options of Postgres, including the rich set of Postgres extensions for deployment experience.

024d6a0dd14fb31c804969a57a06dfbe?s=128

Citus Data

May 06, 2019
Tweet

Transcript

  1. None
  2. Craig Kerstiens Sunil Kamath Mathew Stokes

  3. PostgreSQL is more popular than ever loved wanted https://insights.stackoverflow.com/survey/2019?utm_source=so-owned&utm_medium=blog&utm_campaign=dev-survey-2019&utm_content=launch-blog https://db-engines.com/en/blog_post/76

    DBMS of the Year
  4. Why Postgres? JSONB hstore Arrays GIN index PostGIS Concurrent indexing

    GiST index B-tree index MVCC Time series Safety Proven track record SP-GiST index KNN BRIN index Listen/notify CTEs Window functions Transactional DDL Foreign data wrappers Extensions Fast column addition
  5. Focus on your application, not your database Enterprise-grade managed services

    for PostgreSQL On-premises PostgreSQL IaaS Azure VMs with PostgreSQL PaaS Azure Database for PostgreSQL Datacenter management Hardware O/S provision /patching Database provision/ Patch/Scaling Virtualization Data Applications High availability /DR/Backups Datacenter management Hardware Virtualization O/S Database provision/ Patch/Scaling Data Applications High availability /DR/Backups Data Applications Datacenter management Hardware Virtualization O/S Database provision/ Patch/Scaling High availability/ DR/Backups Intelligent performance/security Managed by Microsoft Managed by customer Machine learning capability
  6. High performance scale-out with Hyperscale (Citus) Intelligent performance optimization Flexible

    and open Fully managed and secure Single Server Hyperscale (Citus) NEW Build or migrate your workloads with confidence
  7. Single Server Hyperscale (Citus) NEW Worry-free PostgreSQL in the cloud

    with an architecture that is built to scale out Example use cases • Scaling PostgreSQL multi-tenant, SaaS applications • Real-time operational analytics • Building high throughput transactional apps Community-based single node PostgreSQL with built-in High Availability Example use cases • Transactional and operational analytics workloads • Apps requiring JSON, geospatial support or full-text search • Greenfield apps built with modern frameworks
  8. None
  9. None
  10. None
  11. None
  12. None
  13. None
  14. None
  15. None
  16. None
  17. Cache rules everything around me

  18. Cache hit ratio SELECT sum(heap_blks_read) as heap_read, sum(heap_blks_hit) as heap_hit,

    sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio FROM pg_statio_user_tables;
  19. Cache hit ratio name | ratio ----------------+------------------------ cache hit rate

    | 0.99
  20. Index hit ratio SELECT relname, 100 * idx_scan / (seq_scan

    + idx_scan) percent_of_times_index_used, n_live_tup rows_in_table FROM pg_stat_user_tables WHERE seq_scan + idx_scan > 0 ORDER BY n_live_tup DESC;a
  21. Index hit ratio relname | percent_of_times_index_used | rows_in_table ---------------------+-----------------------------+--------------- events

    | 0 | 669917 user_info | 3 | 46718 rollouts | 0 | 34078 favorites | 0 | 3059 authorizations | 0 | 0 delayed_jobs | 23 | 0
  22. Table cache hit ratio target > 99%

  23. https://github.com/savjani/postgres- assets/blob/master/SQL%20Notebooks/Postgres_database_health_check_notebook.ipynb

  24. Customized recommendations Performance troubleshooting Data visualization Intelligent performance

  25. azure_sys azure_mai ntenance postgres UserDB

  26. azure_sys azure_mai ntenance postgres UserDB

  27. None
  28. I HAVE NO IDEA

  29. None
  30. None
  31. None
  32. None
  33. None
  34. None
  35. None
  36. None
  37. None
  38. 9 TB

  39. Stay current with PostgreSQL innovations Blazing performance Simplified infrastructure Scale

    out horizontally
  40. Under the covers data is sharded ž APPLICATION SELECT FROM

    WHERE AND count(*) ads JOIN campaigns ON ads.company_id = campaigns.company_id ads.designer_name = ‘Isaac’ campaigns.company_id = ‘Elly Co’ ; METADATA COORDINATOR NODE WORKER NODES W1 W2 W3 … Wn SELECT … FROM ads_1001, campaigns_2001 … It’s logical to place shards containing related rows of related tables together on the same nodes Join queries between related rows can reduce the amount of data sent over the network
  41. Create a table CREATE TABLE github_events ( event_id bigint, event_type

    text, event_public boolean, repo_id bigint, payload jsonb, repo jsonb, user_id bigint, org jsonb, created_at timestamp ); CREATE TABLE github_users ( user_id bigint, url text, login text, avatar_url text, gravatar_id text, display_login text );
  42. # \dt List of relations Schema | Name | Type

    | Owner --------+----------------------+-------+------- public | github_events_102011 | table | citus public | github_events_102015 | table | citus public | github_events_102019 | table | citus public | github_events_102023 | table | citus public | github_events_102027 | table | citus public | github_events_102031 | table | citus public | github_events_102035 | table | citus public | github_events_102039 | table | citus public | github_users_102043 | table | citus public | github_users_102047 | table | citus public | github_users_102051 | table | citus public | github_users_102055 | table | citus public | github_users_102059 | table | citus public | github_users_102063 | table | citus public | github_users_102067 | table | citus public | github_users_102071 | table | citus (16 rows) # \dt List of relations Schema | Name | Type | Owner --------+----------------------+-------+------- public | github_events_102009 | table | citus public | github_events_102013 | table | citus public | github_events_102017 | table | citus public | github_events_102021 | table | citus public | github_events_102025 | table | citus public | github_events_102029 | table | citus public | github_events_102033 | table | citus public | github_events_102037 | table | citus public | github_users_102041 | table | citus public | github_users_102045 | table | citus public | github_users_102049 | table | citus public | github_users_102053 | table | citus public | github_users_102057 | table | citus public | github_users_102061 | table | citus public | github_users_102065 | table | citus public | github_users_102069 | table | citus (16 rows) Worker Node 1 Worker Node 2
  43. None
  44. Hyperscale (Citus) helps ASB onboard customers 20x faster

  45. None
  46. None
  47. None
  48. Flexible and open High performance scale-out with Hyperscale (Citus) Fully

    managed and secure Intelligent performance optimization • • • • • •
  49. More OSS DBs at Data Showcase Demo