Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Multi-tenant SaaS apps made simple on Azure Cosmos DB for PostgreSQL | Citus Con 2023 | Adam Wolk & Alicja Kucjarcyzk

Multi-tenant SaaS apps made simple on Azure Cosmos DB for PostgreSQL | Citus Con 2023 | Adam Wolk & Alicja Kucjarcyzk

Cloud computing has given rise to a massive new market of SaaS applications built by ISV's and enterprises, however architecting these multi-tenant SaaS apps to optimize for underlying infrastructure cost—whilst still providing guarantees of isolation and performance—is hard.

In this session, Adam and Alicja will showcase a number of unique features in Azure Cosmos DB for PostgreSQL, a managed service powered by the open source Citus database extension. These Azure Cosmos DB for PostgreSQL features enable developers to build and scale multi-tenant SaaS apps easily with a variety of tenant management features built natively into the database.

More Decks by Azure Database for PostgreSQL

Other Decks in Technology

Transcript

  1. Multi-tenant SaaS apps made simple
    on Azure Cosmos DB for PostgreSQL
    Adam Wołk, Senior Program Manager
    Alicja Kucharczyk, Senior Program Manager
    @ Citus Con: An Event for Postgres 2023

    View Slide

  2. Software as a Service
    Squeezing margins out of your hardware

    View Slide

  3. Multi-tenant SaaS
    • Same software
    • Same infrastructure
    • Multiple businesses/customers
    Customer value:
    Ease of operation
    Operator margin:
    Efficient use of hardware
    DNS

    View Slide

  4. Sharding Models

    View Slide

  5. Row based vs Schema Based vs Database Sharding
    • 100k tenants per
    16vCPU node
    • Adjust schema
    • Modify application
    queries
    • 4-5k tenants per
    16vCPU node
    • No schema changes
    • No query changes
    • 100 tenants per
    16vCPU node
    • No schema changes
    • No query changes
    Best hardware utilization Easy lift & shift Highest operational cost

    View Slide

  6. Azure Cosmos DB for PostgreSQL
    Build
    Multi-tenant
    Scale-out ready

    View Slide

  7. Terminology
    Coordinator – Stores Metadata. Node which application connects to.
    Worker / Data nodes – Nodes which store data in form of shards.
    Sharding – Process of dividing data among nodes.
    Shards – A partition of the data containing a subset of rows.

    View Slide

  8. Data modelling
    How to efficiently shard your schema?

    View Slide

  9. Pick a distribution key
    • Good (high) cardinality
    • Filters reads to a single node
    • Fans out writes to multiple nodes
    Bad vs Good
    country vs unique_carrier
    dest or origin airport vs unique_carrier
    timestamp vs device_id

    View Slide

  10. Identify types of tables
    • Distributed table
    • Reference table
    • Local table

    View Slide

  11. Distributed Tables
    Definition:
    • Tables that are sharded.
    Classification:
    • Large tables (>10GB) – shard on same key (may require addition of shard key)
    • All tables are be co-located
    • Enables localized and fast joins on workers
    • Ex: transactions, events etc
    SELECT create_distributed_table(table_name, column_name);

    View Slide

  12. Reference Tables
    Definition:
    • Replicated to all the nodes (extra latency)
    Classification:
    • Small tables < 10GB
    • Efficient joins with distributed tables
    • Cannot have sharding dimension
    • Ex: countries, categories
    SELECT create_reference_table(table_name);

    View Slide

  13. Local tables
    • Plain Postgres tables on the coordinator node.
    • Admin Tables that don’t interact with main tables
    • Separate micro-service that doesn’t need sharding

    View Slide

  14. Modify your application
    • Add distribution column to PK/FK
    • Add distribution key to queries

    View Slide

  15. Monitoring
    What is the pulse of your system?

    View Slide

  16. Monitor – upcoming in Citus 11.3
    • Query counts per tenant
    • CPU utilization per tenant
    10 80 80
    50
    50 70
    30
    1k
    2K 100k
    1k
    99k
    0.5k
    0.5k

    View Slide

  17. Noisy neighbors

    View Slide

  18. Isolate tenant - commands
    SELECT isolate_tenant_to_new_shard('table_name', tenant_id);
    SELECT citus_move_shard_placement(shard_id, source_node_name,
    source_node_port, target_node_name, target_node_port);

    View Slide

  19. Scaling out

    View Slide

  20. Rebalance
    {
    "tasks": [
    {
    "LSN": {
    "lag": null,
    "source": "0/731ACC0",
    "target": null
    },
    "size": {
    "source": "14 MB",
    "target": "5440 kB"
    },
    "hosts": {
    "source": "localhost:9702",
    "target": "localhost:9700"
    },
    "phase": "Copying Data",
    "state": "running",
    "command": "SELECT pg_catalog.citus_move_shard_placement(102020,2,3,'block_writes')",
    "message": "",
    "retried": 0,
    "task_id": 10
    }
    ],
    "task_state_counts": {
    "done": 9,
    "blocked": 1,
    "running": 1
    }
    }
    select * from
    citus_rebalance_start();
    select jsonb_pretty(details) from
    citus_rebalance_status();

    View Slide

  21. Thank You
    Alicja Kucharczyk - @stiepantrofimo
    Adam Wołk - @mulander

    View Slide