Deep Dive: Azure Database for PostgreSQL Hyperscale (Citus) | Azure Club Meeting | Adam Wolk

Despite the popularity of NoSQL solutions, relational databases remain the first choice for most companies, and in some cases they are irreplaceable. Today, open-source solutions offer functionality on par with their closed-source counterparts while developing at a very fast pace and avoiding vendor lock-in.

In this presentation, you will get to know PostgreSQL, an open-source database. You will also learn about the "Citus" extension, which turns Postgres into a distributed database that scales horizontally and can easily compete with NoSQL solutions while maintaining full relational database functionality. The mentioned extension is also open source and free for commercial use.

Citus is developed by Microsoft, which actively supports the PostgreSQL community. We will explain how we make Citus available in our cloud as Azure Database for PostgreSQL Hyperscale (Citus). You will learn how Citus's built-in support for multi-tenancy helps you avoid many problems common to SaaS applications.

Azure Database for PostgreSQL

September 22, 2022

Transcript

  1. Deep-dive
    Azure Database for PostgreSQL
    Hyperscale (Citus)
    Adam Wołk
    2022-09-22

  2. ©Microsoft Corporation
    Azure
    Contents
    1. NoSQL / RDBMS
    2. PostgreSQL
    3. Citus
    4. Azure for PostgreSQL Hyperscale (Citus)

  3. NoSQL / RDBMS

  4. NoSQL
    • Non-relational
    • Schemaless
    • Document storage
    • Horizontal scale-out

  5. RDBMS
    • Relational
    • ACID guarantees
    • Rich data-types and indexing
    • Vertical scaling

  6. NoSQL vs RDBMS
    Hard with NoSQL
    • Ad-hoc queries
    • App encoded schema
    Hard with RDBMS
    • Horizontal scaling
    Both worlds are converging into a hybrid model.
    PostgreSQL: PostgreSQL 9.4 Press Kit

  7. PostgreSQL

  8. Why use PostgreSQL?
    Open-Source
    Developed in the open, free for use including commercial projects.
    Rich data types
    Over 20 data types in core, including advanced JSON support.
    Performance
    Variety of available indexing methods, including specialized ones for full text search and geo-spatial data.
    Reliability
    ACID compliant database, with over 30 years of development of the core engine. Hot standby failovers, replication and PITR all present.
    Extensibility
    Over 1000 extensions adding vast feature-sets to the core engine. Geo-spatial queries, new types and indexing methods, integration with existing services.
    Standard Compliant
    170 out of 179 mandatory features of SQL:2016

  9. PostgreSQL industry adoption
    "For Professional Developers PostgreSQL just barely took over the first place spot from MySQL. Professional Developers are more likely than those learning to code to use Redis, PostgreSQL, Microsoft SQL Server, and Elasticsearch.
    MongoDB is used by a similar percentage of both Professional Developers and those learning to code and it’s the second most popular database for those learning to code (behind MySQL). This makes sense since it supports a large number of languages and application development platforms."
    Stack Overflow Developer Survey 2022

  10. Rich JSON support
    • JSONB (binary JSON storage)
    • Indexing JSONB columns
    • Ad-hoc queries against JSONB data
    • JSONPath support
    PostgreSQL: Documentation: 14: 8.14. JSON Types

  11. Query JSON Data
    PostgreSQL: Documentation: 14: 8.14. JSON Types

  12. Citus

  13. Why use Citus?
    Open-Source
    Developed in the open, free for use including commercial projects.
    Postgres at any Scale
    Scale out Postgres by distributing your data & queries across a cluster. And it’s simple to add nodes & rebalance shards when you need to grow.
    Parallelized Performance
    Speed up queries by 20x to 300x (or more) through parallelism, keeping more data in memory, higher I/O bandwidth, and columnar compression.
    Run anywhere
    Self-host or run Citus in the cloud as a built-in option on Azure Database for PostgreSQL.
    True PostgreSQL
    Get all the benefits of PostgreSQL with the added magic of distributed tables.
    Vendor neutral
    Plain PostgreSQL, you can always take your data and go elsewhere. No vendor lock in.

  14. What is sharding?
    Original table
         tenant_id  A
    1    1          Text
    2    1          Text
    3    2          Text
    4    3          Text
    Shard 1
         tenant_id  A
    1    1          Text
    2    1          Text
    Shard 2
         tenant_id  A
    3    2          Text
    Shard 3
         tenant_id  A
    4    3          Text
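
The grouping of rows by tenant_id shown on this slide can be sketched in Python. This is only an illustration under stated assumptions, not how Citus is implemented: the names `shard_for`, `distribute`, `SHARD_COUNT` and the `row_no` key (standing in for the unnamed first column) are hypothetical, and CRC32 modulo a shard count is a stand-in for the hash distribution Citus applies to the distribution column.

```python
import zlib
from collections import defaultdict

# Hypothetical shard count chosen for the example.
SHARD_COUNT = 4


def shard_for(tenant_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Map a tenant_id to a shard index with a stable hash (assumed: CRC32)."""
    key = str(tenant_id).encode("utf-8")
    return zlib.crc32(key) % shard_count


def distribute(rows, shard_count: int = SHARD_COUNT):
    """Group rows by the shard their tenant_id hashes to."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row["tenant_id"], shard_count)].append(row)
    return dict(shards)


# The four rows from the slide; "row_no" is a made-up name for the first column.
ROWS = [
    {"row_no": 1, "tenant_id": 1, "a": "Text"},
    {"row_no": 2, "tenant_id": 1, "a": "Text"},
    {"row_no": 3, "tenant_id": 2, "a": "Text"},
    {"row_no": 4, "tenant_id": 3, "a": "Text"},
]

if __name__ == "__main__":
    for shard, rows in sorted(distribute(ROWS).items()):
        print(f"shard {shard}: {[r['row_no'] for r in rows]}")
```

All rows of one tenant always land in the same shard, which is what keeps a tenant's queries on a single node. Unlike the slide, this sketch may place several tenants in one shard, because the hash is taken modulo a fixed shard count.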

  15. Citus Cluster Architecture

  16. Azure for PostgreSQL
    Hyperscale (Citus)

  17. Why use Azure for PostgreSQL Hyperscale (Citus)?
    Open-Source
    Running the same version of Citus available to everyone with the latest PostgreSQL version available.
    Cloud Infrastructure
    Focus on your application and forget about your database.
    Expert Support
    Team of Citus and PostgreSQL experts ready and able to help with the most dreadful database issues you may encounter.
    Reliability
    HA, backups, PITR, geo read replicas – without the pain of setting them up.
    Monitoring
    Insights into the performance and inner workings of your cluster.
    Regulation Compliance
    Benefit from Azure's compliance with global regulations.

  18. Will it scale? – Telemetry for all of Windows
    2 clusters (54 nodes)
    • 32 nodes
    • 22 nodes
    • 3,456 cores
    • 27 TB of memory
    • 1.6 PB of Azure Premium SSD Managed Disks
    • P75 – 90ms
    • P95 – <1s
    Architecting petabyte-scale analytics by scaling out Postgres on Azure with the Citus extension (microsoft.com)

  19. Will it scale? – COVID dashboard for all of UK
    • 250,000 to 300,000 hits per minute at peak
    • 60,000 to 100,000 concurrent users at peak
    • 7.5 billion records
    • 5 million data points returned in under 10 seconds
    • Cache invalidation durations never exceed 2 minutes
    UK COVID-19 dashboard built using Postgres and Citus for millions of users - Microsoft Tech Community

  20. Citus 11.1 Release Party

  21. © Copyright Microsoft Corporation. All rights reserved.
    Thank you.
