DEMO - High performance HTAP with Postgres & Hyperscale (Citus) | ACM SIGMOD/PODS 2020 | Marco Slot & Claire Giordano

In this demo, we run a large-scale HTAP workload on Azure Database for PostgreSQL with the built-in Hyperscale (Citus) deployment option. Hyperscale (Citus) uses the open source Citus extension to Postgres to turn a cluster of PostgreSQL servers into a single distributed database that can shard or replicate Postgres tables across the cluster. Citus can simultaneously scale transaction throughput by routing transactions to the right server, and scale analytical queries and data transformations by parallelizing them across all of the servers in the database cluster. In combination with all the powerful Postgres features such as its different index types and other PostgreSQL extensions, this makes Hyperscale (Citus) able to run high performance HTAP workloads at scale.
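
The shard-or-replicate choice described above can be sketched in SQL. This is a minimal illustration, not the schema used in the demo: the table and column names are hypothetical, and only `create_distributed_table` and `create_reference_table` are Citus functions.

```sql
-- Assumes the Citus extension is installed on the coordinator and workers.
CREATE EXTENSION IF NOT EXISTS citus;

-- Hypothetical, simplified order-processing schema (names are illustrative).
CREATE TABLE warehouse (
    w_id   int PRIMARY KEY,
    w_name text
);

CREATE TABLE orders (
    o_w_id    int NOT NULL,                       -- warehouse the order belongs to
    o_id      bigint NOT NULL,
    o_entry_d timestamptz NOT NULL DEFAULT now(),
    o_amount  numeric(12,2) NOT NULL,
    PRIMARY KEY (o_w_id, o_id)                    -- key includes the distribution column
);

CREATE TABLE item (
    i_id    int PRIMARY KEY,
    i_name  text,
    i_price numeric(8,2)
);

-- Shard the large tables by warehouse ID, so a transaction that touches
-- one warehouse can be routed to the worker holding that shard.
SELECT create_distributed_table('warehouse', 'w_id');
SELECT create_distributed_table('orders', 'o_w_id', colocate_with => 'warehouse');

-- Replicate the small lookup table to every node so joins against it stay local.
SELECT create_reference_table('item');
```

Distributing both large tables on the warehouse ID keeps rows for the same warehouse on the same worker, which is what lets single-warehouse transactions avoid cross-node coordination.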

We will show a side-by-side comparison of Hyperscale (Citus) and a single PostgreSQL server running a transactional workload generated by HammerDB, while simultaneously running analytical queries, and show how you get further speedups by pre-aggregating the data in parallel (using rollups) on the same Postgres database.
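
A rollup of the kind mentioned above might look roughly like the following. This is a hedged sketch under assumed names: `order_line`, `sales_per_day`, and their columns are hypothetical and are not taken from the demo.

```sql
-- Hypothetical raw table, sharded by warehouse ID.
CREATE TABLE order_line (
    ol_w_id       int NOT NULL,
    ol_o_id       bigint NOT NULL,
    ol_number     int NOT NULL,
    ol_delivery_d timestamptz NOT NULL,
    ol_amount     numeric(12,2) NOT NULL,
    PRIMARY KEY (ol_w_id, ol_o_id, ol_number)
);
SELECT create_distributed_table('order_line', 'ol_w_id');

-- Hypothetical rollup table, co-located with the raw table so that the
-- aggregation below can run on each worker without moving rows between nodes.
CREATE TABLE sales_per_day (
    w_id         int NOT NULL,
    day          date NOT NULL,
    line_count   bigint NOT NULL,
    total_amount numeric(14,2) NOT NULL,
    PRIMARY KEY (w_id, day)
);
SELECT create_distributed_table('sales_per_day', 'w_id', colocate_with => 'order_line');

-- Recompute today's aggregates and upsert them into the rollup table.
INSERT INTO sales_per_day (w_id, day, line_count, total_amount)
SELECT ol_w_id, ol_delivery_d::date, count(*), sum(ol_amount)
FROM order_line
WHERE ol_delivery_d >= date_trunc('day', now())
GROUP BY ol_w_id, ol_delivery_d::date
ON CONFLICT (w_id, day) DO UPDATE
SET line_count   = EXCLUDED.line_count,
    total_amount = EXCLUDED.total_amount;

-- Analytical queries then read the small rollup table instead of the raw rows.
SELECT day, sum(total_amount) AS revenue
FROM sales_per_day
GROUP BY day
ORDER BY day;
```

Because the `GROUP BY` includes the distribution column and both tables are co-located, the `INSERT ... SELECT` can be executed in parallel on the workers, and dashboard-style queries only scan the pre-aggregated rows.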

Transcript

  1. DEMO
    High performance HTAP with Postgres & Hyperscale (Citus)
    Marco Slot, Principal Engineer & Lead of Citus Open Source project,
    with intro by Claire Giordano

  2. Hybrid
    Transactional
    Analytical
    Processing
    @clairegiordano / @marcoslot

  3. Hyperscale (Citus)
    now available as part of Azure
    Database for PostgreSQL

  4. Hyperscale (Citus)
    now available as part of Azure
    Database for PostgreSQL

  5. Citus
    extension
    to Postgres

  6. aka.ms/citus

  7. What is Citus? /// github.com/citusdata/citus
    • Transforms Postgres into a distributed database

  8. What is Citus? /// github.com/citusdata/citus
    • Transforms Postgres into a distributed database
    • Distributes your data & queries

  9. What is Citus? /// github.com/citusdata/citus
    • Transforms Postgres into a distributed database
    • Distributes your data & queries
    • Parallelism

  10. What is Citus? /// github.com/citusdata/citus
    • Transforms Postgres into a distributed database
    • Distributes your data & queries
    • Parallelism
    • All the CPU, memory, & disk of the cluster

  11. Can you tell us a bit about what you
    will demo today?
    What’s the anatomy of the demo?

  12. Order Processing
    System for
    Warehouses

  13. What you will see in today’s HTAP database demo
    All running on Azure
    Side-by-side performance compare: Hyperscale (Citus) vs. single node
    Millisecond analytics queries with rollups
    Retail: Order processing system for warehouses (using HammerDB)

  14. What you will see in today’s HTAP database demo
    All running on Azure
    Side-by-side performance compare: Hyperscale (Citus) vs. single node
    Millisecond analytics queries with rollups
    Retail: Order processing system for warehouses (using HammerDB)

  15. A bit about HammerDB (it’s NOT a database)
    hammerdb.com

  16. What you will see in today’s HTAP database demo
    All running on Azure
    Side-by-side performance compare: Hyperscale (Citus) vs. single node
    Millisecond analytics queries with rollups
    Retail: Order processing system for warehouses (using HammerDB)

  17. What you will see in today’s HTAP database demo
    All running on Azure
    Side-by-side performance compare: Hyperscale (Citus) vs. single node
    Millisecond analytics queries with rollups
    Retail: Order processing system for warehouses (using HammerDB)

  18. What you will see in today’s HTAP database demo
    All running on Azure
    Side-by-side performance compare: Hyperscale (Citus) vs. single node
    Millisecond analytics queries with rollups
    Retail: Order processing system for warehouses (using HammerDB)

  19. Demo: HTAP Database with Hyperscale (Citus)
    Marco Slot
    @clairegiordano / @marcoslot / @azuredbpostgres / @citusdata

  20. Single Postgres server vs. Hyperscale (Citus) 10-node cluster
    Transactions: ~40-50K transactions/min vs. ~900K transactions/min (20x faster)
    Analytical query: 53 minutes vs. 10 sec (300x faster)
    Analytical query with rollups: 20 milliseconds (~150,000x faster)

  21. Hyperscale (Citus) 10-node database cluster
    Coordinator (metadata)
    Citus worker nodes: W1 through W10

  22. Power of HTAP with
    Hyperscale (Citus) on
    Azure Database for PostgreSQL

  23. Will all apps see the performance
    increase you just showed us?

  24. It’s important to
    find a good
    distribution column,
    something that is
    common to all large
    tables
    SELECT create_distributed_table(
    'table_name',
    'distribution_column');

  25. At the end of the demo, you called
    Citus an “almost anything” database.
    What did you mean?

  26. As an extensible, relational database, Postgres is
    capable of so many things on a single server…

  27. By transforming Postgres
    into a distributed
    database, Hyperscale
    (Citus) makes Postgres
    capable of
    almost anything

  28. How best to get started with
    Hyperscale (Citus)?

  29. Download
    Citus open
    source
    packages
    aka.ms/citus

  30. Multi-tenant
    (SaaS)
    tutorial
    aka.ms/hyperscale-citus-multi-tenant-tutorial

  31. Tutorial:
    Real-time
    analytics
    dashboard
    aka.ms/hyperscale-citus-real-time-tutorial

  32. Do you have a favorite blog
    post?

  33. Architecting petabyte-scale analytics by scaling out
    Postgres on Azure with the Citus extension
    aka.ms/blog-petabyte-scale-analytics

  34. Petabyte-scale service architecture used by Windows

  35. Min Wei, Principal Engineer at Microsoft
    "Distributed PostgreSQL
    is a game changer."
    source: https://aka.ms/blog-petabyte-scale-analytics

  36. Thank you!
    Marco Slot & Claire Giordano
    @marcoslot / @clairegiordano
    @citusdata / @AzureDBPostgres
    © Copyright Microsoft Corporation. All rights reserved.
