Upgrade to Pro — share decks privately, control downloads, hide ads and more …

SaaS on Rails on PostgreSQL

SaaS on Rails on PostgreSQL

Presentation for Posette 2024, covering scaling up and out with Ruby on Rails and PostgreSQL, for multi tenant SaaS applications.

Some of the tech covered:

- Row Level Security
- Composite primary keys
- Active Record Horizontal Sharding
- Citus, row-based and schema-based sharding

Andrew Atkinson

June 11, 2024
Tweet

More Decks by Andrew Atkinson

Other Decks in Programming

Transcript

  1. Presentation Overview Define terms Rails, SaaS, multi-tenancy, data isolation, sharding

    Work with a multi-tenant SaaS B2B Rails and PostgreSQL app Options for scaling up with traditional PostgreSQL Recently added capabilities in Active Record and Citus Scale out options beyond PostgreSQL SaaS on Rails on Postgres — andyatkinson.com Slide 2 of 34
  2. What is Rails? Tech Created Summary Open Source Language Ruby

    1995 General purpose programming language C Ruby on Rails, aka "Rails" 2004 Web App Development Framework Ruby Active Record Framework within Ruby on Rails. ORM. Schema Management. Multi-DBs. Horizontal Sharding. SaaS on Rails on Postgres — andyatkinson.com Slide 3 of 34
  3. Scale Out Overview Single primary PostgreSQL capabilities Active Record Horizontal

    Sharding Citus distributed Postgres sharding Database Sharding Explained - Mahdi Yusuf SaaS on Rails on Postgres — andyatkinson.com Slide 4 of 34
  4. github.com/andyatkinson/rideshare Ruby on Rails app with PostgreSQL Code samples for

    this presentation: Pull Request #198 SaaS on Rails on Postgres — andyatkinson.com Slide 5 of 34
  5. SaaS Definitions SaaS is a software delivery model SaaS apps

    require architectural planning to process large amounts of data while offering good performance and availability Customers are businesses (B2B) or consumers (B2C) B2B SaaS might have dozens, hundreds, or thousands of customers B2C might have thousands or millions SaaS on Rails on Postgres — andyatkinson.com Slide 6 of 34
  6. SaaS Apps Common Needs Authentication Profiles Plans & Subscriptions API

    Integrations Accounts & Teams SaaS on Rails on Postgres — andyatkinson.com Slide 7 of 34
  7. SaaS on Rails Bullet Train: https://bullettrain.co (Open Source) JumpStart Rails:

    https://jumpstartrails.com Business Class: https://businessclasskit.com SaaS on Rails on Postgres — andyatkinson.com Slide 8 of 34
  8. Adding Multi-Tenancy For our multi-tenancy, our tenants will be "Companies"

    Table data related to multiple companies will be stored in the same table We'll identify company data with the company_id foreign key column SaaS on Rails on Postgres — andyatkinson.com Slide 9 of 34
  9. Without multi-tenancy Provision a new database, or new instance per

    tenant Greater isolation More infra costs More maintenance burden SaaS on Rails on Postgres — andyatkinson.com Slide 10 of 34
  10. Designing multi-tenancy for Rideshare CREATE TABLE companies ( id INT

    GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name text UNIQUE NOT NULL ); CREATE TABLE trips ( id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, company_id INT NOT NULL REFERENCES companies(id) ); SaaS on Rails on Postgres — andyatkinson.com Slide 11 of 34
  11. Gain efficiencies with multi-tenancy Multi-tenant data in fewer tables, schemas,

    databases, and instances Gain flexibility for moving or copying tenant data Accommodate external legal requirements for tenant data Besides easier identification, we can add some security SaaS on Rails on Postgres — andyatkinson.com Slide 12 of 34
  12. PostgreSQL Row Level Security Create ROLE objects per company Within

    a multi-tenant table, use the company_id column to target rows Limit access to company_id rows to matching ROLE objects using a POLICY CREATE USER company_one; CREATE POLICY select_policy ON trips FOR SELECT USING (company_id = current_company_id()); SaaS on Rails on Postgres — andyatkinson.com Slide 13 of 34
  13. What are some database challenges we'll run into? We can't

    allocate our instance resources like CPU cores, memory, IO operations to tenants We have a shared set of database connections We have a single buffer cache ( shared_buffers ) Observability for tenant operations may be more challenging We may need to significantly over-provision SaaS on Rails on Postgres — andyatkinson.com Slide 14 of 34
  14. Optional: Composite Primary Keys We currently have traditional (surrogate) primary

    keys defined A composite primary key for trips would include company_id , and allow us to enforce unique data for the tenant CPK support in Active Record 7.1 (Released 2023) SaaS on Rails on Postgres — andyatkinson.com Slide 15 of 34
  15. Migrating to CPKs -- Single column primary key CREATE TABLE

    trips ( id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, company_id INT NOT NULL ); -- CPK CREATE TABLE trips ( id BIGINT GENERATED ALWAYS AS IDENTITY NOT NULL, company_id INT NOT NULL, CONSTRAINT trips_pkey_cpk PRIMARY KEY (id, company_id) ); SaaS on Rails on Postgres — andyatkinson.com Slide 16 of 34
  16. PostgreSQL Table Partitioning in Multi-Tenant SaaS With the LIST type,

    create a partition per tenant ( company_id ) When adding a new tenant, create a corresponding new partition All queries must include the company_id partition key Caveats: UNIQUE constraints, increased planning time Big benefit: If companies leave platform, efficient detachment and archival of their data is possible* *Using DETACH CONCURRENTLY available since 14 (2021) SaaS on Rails on Postgres — andyatkinson.com Slide 17 of 34
  17. PostgreSQL SCHEMA Per Tenant For new companies, create new SCHEMA

    objects as a grouping method for their tables Limit access to schemas to their corresponding tenant SaaS on Rails on Postgres — andyatkinson.com Slide 18 of 34
  18. Multi-tenant data isolation Ruby gems acts_as_tenant gem helps with row-level

    multi-tenant data Apartment gem offers schema-based data isolation SaaS on Rails on Postgres — andyatkinson.com Slide 19 of 34
  19. Data Isolation Database Schema Row Table Partitioning Per-tenant Schema Apartment

    gem acts_as_tenant gem Row Level Security SaaS on Rails on Postgres — andyatkinson.com Slide 20 of 34
  20. Scaling beyond a single PostgreSQL instance Using secondary physical or

    logical replication instances, we can distribute reads, or even perform Active Active (BDR, avoids PK conflicts). However, there's no native "multi-primary" or "sharding" capability in PostgreSQL. WIP PostgreSQL Sharding, FDW This means row-level or schema-level sharding isn't an option SaaS on Rails on Postgres — andyatkinson.com Slide 21 of 34
  21. Multi-Tenancy and Sharding With Active Record Active Record gained the

    Horizontal Sharding in 6.1 (2020) Using Active Record Horizontal Sharding, we can assign tenants to "shards" ( company_one ) Rails app identifies tenant, routes requests to shards The shard is a same-schema per-tenant database, exclusively with tenant data Shards (databases) can run on same instance, or on separate server instances for greater distribution and scalability SaaS on Rails on Postgres — andyatkinson.com Slide 22 of 34
  22. class ShardRecord < ApplicationRecord self.abstract_class = true connects_to shards: {

    company_one: { writing: :primary_company_one, reading: :primary_company_one_replica } } end Rails.application.configure do config.active_record.shard_selector = { lock: true } config.active_record.shard_resolver = ->(request) { Company.find_by!(subdomain: request.subdomain).shard } end SaaS on Rails on Postgres — andyatkinson.com Slide 23 of 34
  23. Citus and Sharding Citus is a distributed Postgres deployment for

    horizontal scaling One coordinator node, and 1 or more worker nodes Shards can be placed on workers, and "re-sharding" supported Open source, available as a (super) extension, hosted offerings (Microsoft Azure Cosmos DB for PostgreSQL), Crunchy Bridge (AWS, Azure, GCP) SaaS on Rails on Postgres — andyatkinson.com Slide 24 of 34
  24. Citus Distributed Tables Citus 12.1 supports IDENTITY columns Row-level sharding,

    company_id , similar shards can be on same nodes with colocation CREATE TABLE companies ( id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name text UNIQUE NOT NULL ); CREATE TABLE trips ( id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, company_id INT NOT NULL ); SaaS on Rails on Postgres — andyatkinson.com Slide 25 of 34
  25. Citus Reference and Distributed Tables Preparing Reference Tables (replicated to

    all workers) Changing PKs and FKs The activerecord-multi-tenant Ruby gem adds helpers -- Prereq: Drop FKs referring to reference tables SELECT create_reference_table('users'); SELECT create_reference_table('locations'); -- Prereq: Drop single-column PKs SELECT create_distributed_table('trip_requests', 'company_id'); SELECT create_distributed_table('trips', 'company_id'); SaaS on Rails on Postgres — andyatkinson.com Slide 26 of 34
  26. Supply distribution column so query goes to correct shard Trip.where(id:

    1).select(:id) # slow SELECT id FROM rideshare.trips ... Trip.where(id: 1, company_id: 1).select(:id) # fast SELECT id FROM rideshare.trips_102114 ... SaaS on Rails on Postgres — andyatkinson.com Slide 27 of 34
  27. Citus Schema-based Sharding From Citus 12 (2023) onward, schema-based sharding

    is available Distribution column is not needed Create schema per tenant, schemas become shards Tables in same schema co-located on same node, so JOIN operations and FKs work SET citus.enable_schema_based_sharding TO ON; CREATE SCHEMA company_one; CREATE SCHEMA company_two; SET search_path TO company_one; SaaS on Rails on Postgres — andyatkinson.com Slide 28 of 34
  28. Sharding App-level Outside app Schema Row Active Record Horizontal Sharding

    Citus row-based Citus schema-based SaaS on Rails on Postgres — andyatkinson.com Slide 30 of 34
  29. Wrapping Up and Advice Single primary PostgreSQL w/ replication can

    take you far! Retain PK, FK, constraints, JOIN operations Collect usage-metrics for shared resources like CPU, Memory, connections, and IO Identify, measure performance, and plan for future states for high growth tables "Costs" are infrastructure costs and engineering time SaaS on Rails on Postgres — andyatkinson.com Slide 31 of 34
  30. Additional Considerations Consider designing for multi-tenancy even just to make

    tenant data easier to identify Check your PostgreSQL extension support Consider how app-level vs. Postgres-level solutions match your team skill sets and expected maintenance plans Your app-defined schema will evolve. With distributed databases, incremental changes are more complex. SaaS on Rails on Postgres — andyatkinson.com Slide 32 of 34
  31. Resources Database Sharding Explained - Mahdi Yusuf Active Record Horizontal

    Sharding Freshworks: Horizontal sharding in a multi-tenant app with Rails 6.1 - Ritikesh G Scale Out Multi-Tenant Apps based on Ruby on Rails - Lukas Fittl Ruby on Rails Scale Out Citus Documentation Designing your SaaS Database for Scale with Postgres - Ozgun Erdogan Schema-based sharding comes to PostgreSQL with Citus - Onur Tirtir Adding Postgres 16 support to Citus 12.1, plus schema-based sharding improvements - Naisila Puka Active Active in Postgres 16 - Brian Pace SaaS on Rails on Postgres — andyatkinson.com Slide 33 of 34