Slide 1

Slide 1 text

Scaling Multi-Tenant Applications Using the Django ORM & Postgres
Louise Grandjonc - PyCaribbean 2019
@louisemeta

Slide 2

Slide 2 text

About me
- Software Engineer at Citus Data
- Postgres enthusiast
- @louisemeta and @citusdata on Twitter
- www.louisemeta.com
- [email protected]

Slide 3

Slide 3 text

Today’s agenda
1. What do we mean by “multi-tenancy”?
2. Three ways to scale a multi-tenant app
3. Shared tables in your Django apps
4. Postgres, Citus and shared tables

Slide 4

Slide 4 text

What do we mean by “multi-tenancy”?

Slide 5

Slide 5 text

What do we mean by “multi-tenancy”?
- Multiple customers (tenants)
- Each with their own data
- SaaS
- Examples: Shopify, Salesforce

Slide 6

Slide 6 text

What do we mean by “multi-tenancy”?
A very realistic example
Owner examples:
- Hogwarts
- Ministry of Magic
- Harry Potter
- Post office

Slide 7

Slide 7 text

What do we mean by “multi-tenancy”?
The problem of scaling multi-tenant apps

Slide 8

Slide 8 text

3 ways to scale a multi-tenant app

Slide 9

Slide 9 text

Solution 1: One database per tenant

Slide 10

Slide 10 text

One database per tenant
- Organized collection of interrelated data
- Don’t share resources:
  - Username and password
  - Connections
  - Memory

Slide 11

Slide 11 text

One database per tenant

Slide 12

Slide 12 text

One database per tenant
1. Changing the settings

DATABASES = {
    'tenant-{id1}': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'hogwarts',
        'USER': 'louise',
        'PASSWORD': 'abc',
        'HOST': '…',
        'PORT': '5432'
    },
    'tenant-{id2}': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'ministry',
        'USER': 'louise',
        'PASSWORD': 'abc',
        'HOST': '…',
        'PORT': '5432'
    },
    …
}

Warnings!
- You need to have each tenant in the settings.
- When you have a new customer, you need to create a database and change the settings.
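The settings above repeat the same block for every tenant. A minimal sketch of building that dictionary from a list of tenants follows. The TENANT_DATABASES mapping, the build_databases helper, the alias names and the "localhost" host are all hypothetical and not from the talk. A new tenant would still need its database created and the settings reloaded.

```python
# Hypothetical mapping of database alias -> database name (not from the talk).
TENANT_DATABASES = {
    "tenant-1": "hogwarts",
    "tenant-2": "ministry",
}


def build_databases(tenants, user, password, host, port="5432"):
    """Return a Django-style DATABASES dict with one entry per tenant."""
    return {
        alias: {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": db_name,
            "USER": user,
            "PASSWORD": password,
            "HOST": host,
            "PORT": port,
        }
        for alias, db_name in tenants.items()
    }


# In settings.py the result would be assigned to DATABASES.
# The credentials and host here are placeholders.
DATABASES = build_databases(TENANT_DATABASES, "louise", "abc", "localhost")
```

Django also expects a 'default' alias in DATABASES. It is left out here, as on the slide.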

Slide 13

Slide 13 text

One database per tenant
2. Handling migrations

python manage.py migrate --database=tenant_id1

Run this for each tenant whenever you have a new migration.
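Running the command by hand for every tenant gets tedious, so it can be scripted. A minimal sketch, assuming a hypothetical migrate_commands helper and example alias names: it only builds and prints the commands, and each one could then be passed to subprocess.run(cmd, check=True) from the project directory.

```python
import sys


def migrate_commands(aliases, python=sys.executable):
    """Build one `manage.py migrate --database=<alias>` command per tenant alias."""
    return [
        [python, "manage.py", "migrate", f"--database={alias}"]
        for alias in aliases
    ]


# Example aliases; in a real project these would be the keys of DATABASES.
for cmd in migrate_commands(["tenant_id1", "tenant_id2"], python="python"):
    print(" ".join(cmd))
```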

Slide 14

Slide 14 text

One database per tenant
Changes needed to handle it with Django ORM
3. Creating your own database router

class ExampleDatabaseRouter(object):
    """
    Determines on which tenant database to read/write
    """

    def db_for_read(self, model, **hints):
        """Returns the name of the right database depending on the query"""
        return 'tenant_idx'

    def db_for_write(self, model, **hints):
        """Returns the name of the right database depending on the query"""
        return 'tenant_idx'

    def allow_relation(self, obj1, obj2, **hints):
        """Determine if relationship is allowed between two objects.
        The two objects have to be on the same database ;)"""
        pass
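The router above returns a fixed placeholder alias. One possible way to pick the alias per request is to keep it in thread-local storage. This is a sketch under that assumption: set_current_db, get_current_db and TenantDatabaseRouter are hypothetical names, not part of Django or of the talk.

```python
import threading

# Hypothetical per-thread storage for the active database alias.
_local = threading.local()


def set_current_db(alias):
    """Remember which tenant database the current thread should use."""
    _local.alias = alias


def get_current_db():
    """Return the active alias, or None when no tenant has been selected."""
    return getattr(_local, "alias", None)


class TenantDatabaseRouter:
    """Routes reads and writes to the alias selected for the current thread."""

    def db_for_read(self, model, **hints):
        # Returning None tells Django this router has no opinion.
        return get_current_db()

    def db_for_write(self, model, **hints):
        return get_current_db()

    def allow_relation(self, obj1, obj2, **hints):
        # Django records the database of a model instance in obj._state.db.
        return obj1._state.db == obj2._state.db
```

A router like this is registered through Django's DATABASE_ROUTERS setting. set_current_db would typically be called early in the request, for example in a middleware.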

Slide 15

Slide 15 text

One database per tenant

PROS
- Start quickly
- Isolate customer (tenant) data
- Compliance is a bit easier
- If one customer is queried a lot, the performance impact on other tenants is low

CONS
- Time for DBA/developer to manage
- Hard to handle with ORMs
- Maintain consistency (ex: create index across all databases)
- Longer running migrations
- Performance degrades as # customers (tenants) goes up

Slide 16

Slide 16 text

Solution 2: One schema per tenant

Slide 17

Slide 17 text

One schema per tenant
- Logical namespaces to hold a set of tables
- Share resources:
  - Username and password
  - Connections
  - Memory
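In Postgres, a schema is created with CREATE SCHEMA and chosen for a connection with SET search_path. The sketch below generates both statements for a tenant. The schema_statements helper and the "tenant_" naming scheme are hypothetical and not from the talk.

```python
import re

# Accept only simple lowercase identifiers so the schema name is safe to embed.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def schema_statements(tenant_slug):
    """Return the SQL to create a tenant's schema and switch a connection to it."""
    if not _IDENTIFIER.match(tenant_slug):
        raise ValueError(f"unsafe tenant slug: {tenant_slug!r}")
    schema = f"tenant_{tenant_slug}"
    return [
        f'CREATE SCHEMA IF NOT EXISTS "{schema}"',
        f'SET search_path TO "{schema}", public',
    ]


for statement in schema_statements("hogwarts"):
    print(statement)
```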

Slide 18

Slide 18 text

One schema per tenant

Slide 19

Slide 19 text

One schema per tenant

PROS
- Better resource utilization vs. one database per tenant
- Start quickly
- Logical isolation

CONS
- Hard to manage (ex: add column across all schemas)
- Longer running migrations
- Performance degrades as # customers (tenants) goes up

Slide 20

Slide 20 text

Solution 3: Shared tables architecture

Slide 21

Slide 21 text

Shared tables architecture

Slide 22

Slide 22 text

Shared tables architecture

Slide 23

Slide 23 text

Shared tables architecture

PROS
- Easy maintenance
- Faster running migrations
- Best resource utilization
- Faster performance
- Scales to 1k-100k tenants

CONS
- Application code to guarantee isolation
- Make sure ORM calls are always scoped to a single tenant

Slide 24

Slide 24 text

3 ways to scale multi-tenant apps

Slide 25

Slide 25 text

Shared tables in your Django apps

Slide 26

Slide 26 text

Main problems to solve
- Make sure ORM calls are always scoped to a single tenant
- Include the tenant column in joins

Slide 27

Slide 27 text

django-multitenant
Automatically scopes all ORM calls to a single tenant

Slide 28

Slide 28 text

django-multitenant

Without django-multitenant:
Owl.objects.filter(name='Hedwige')
<=> SELECT * FROM app_owl WHERE name='Hedwige'

With django-multitenant:
Owl.objects.filter(name='Hedwige')
<=> SELECT * FROM app_owl WHERE name='Hedwige' AND owner_id = <tenant_id>

Slide 29

Slide 29 text

django-multitenant

Without django-multitenant:
Letter.objects.filter(id=1).select_related('deliverer')
<=> SELECT * FROM app_letter
    INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id)
    WHERE app_letter.id=1

With django-multitenant:
Letter.objects.filter(id=1).select_related('deliverer')
<=> SELECT * FROM app_letter
    INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id AND app_owl.owner_id=app_letter.owner_id)
    WHERE app_letter.id=1 AND app_owl.owner_id = <tenant_id>

Slide 30

Slide 30 text

django-multitenant 3 steps
1. Change models to use TenantModelMixin and TenantManagerMixin
2. Change ForeignKey to TenantForeignKey
3. Define tenant scoping: set_current_tenant(t)

Slide 31

Slide 31 text

django-multitenant 3 steps
Models before using django-multitenant

from django.db import models

class Owner(models.Model):
    type = models.CharField(max_length=10)  # add choice
    name = models.CharField(max_length=255)

# on_delete is required since Django 2.0; CASCADE is an assumption here.
class Owl(models.Model):
    name = models.CharField(max_length=255)
    owner = models.ForeignKey(Owner, on_delete=models.CASCADE)
    feather_color = models.CharField(max_length=255)
    favorite_food = models.CharField(max_length=255)

class Letters(models.Model):
    content = models.TextField()
    deliverer = models.ForeignKey(Owl, on_delete=models.CASCADE)

Slide 32

Slide 32 text

django-multitenant 3 steps
Models with django-multitenant

from django.db import models
from django_multitenant.fields import TenantForeignKey
from django_multitenant.mixins import TenantManagerMixin, TenantModelMixin

class TenantManager(TenantManagerMixin, models.Manager):
    pass

class Owner(TenantModelMixin, models.Model):
    type = models.CharField(max_length=10)  # add choice
    name = models.CharField(max_length=255)
    tenant_id = 'id'
    objects = TenantManager()

# on_delete is required since Django 2.0; CASCADE is an assumption here.
class Owl(TenantModelMixin, models.Model):
    name = models.CharField(max_length=255)
    owner = TenantForeignKey(Owner, on_delete=models.CASCADE)
    feather_color = models.CharField(max_length=255)
    favorite_food = models.CharField(max_length=255)
    tenant_id = 'owner_id'
    objects = TenantManager()

class Letters(TenantModelMixin, models.Model):
    content = models.TextField()
    deliverer = TenantForeignKey(Owl, on_delete=models.CASCADE)
    owner = TenantForeignKey(Owner, on_delete=models.CASCADE)
    tenant_id = 'owner_id'
    objects = TenantManager()

Slide 33

Slide 33 text

django-multitenant 3 steps
set_current_tenant(t)
- Specifies which tenant the APIs should be scoped to
- Set at authentication logic via middleware
- Set explicitly at top of function (ex. view, external tasks/jobs)
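For the last case (tasks and jobs that run outside a request), one option is to wrap the work in a context manager that sets the tenant and restores the previous one afterwards. This is a sketch only. The set_current_tenant and get_current_tenant functions below are simplified stand-ins so the example runs without Django, and the library's own implementation may differ. tenant_scope and send_reminders are hypothetical names.

```python
import threading
from contextlib import contextmanager

_local = threading.local()


def set_current_tenant(tenant):
    """Stand-in for the library function: remember the tenant for this thread."""
    _local.tenant = tenant


def get_current_tenant():
    """Stand-in: return the tenant set for this thread, or None."""
    return getattr(_local, "tenant", None)


@contextmanager
def tenant_scope(tenant):
    """Scope the enclosed block to `tenant`, then restore the previous tenant."""
    previous = get_current_tenant()
    set_current_tenant(tenant)
    try:
        yield tenant
    finally:
        set_current_tenant(previous)


def send_reminders(tenant):
    """Hypothetical background job: everything inside runs for one tenant."""
    with tenant_scope(tenant):
        # ORM queries issued here would be scoped to `tenant`.
        return get_current_tenant()
```

Restoring the previous value in the finally block keeps a worker thread from carrying one tenant's scope into the next job.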

Slide 34

Slide 34 text

django-multitenant 3 steps
set_current_tenant(t) in Middleware

from django_multitenant.utils import set_current_tenant

class TenantMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        # One-time configuration and initialization.

    def __call__(self, request):
        # Assuming your app has a function to get the tenant associated with a user
        current_tenant = get_tenant_for_user(request.user)
        set_current_tenant(current_tenant)
        response = self.get_response(request)
        return response

Slide 35

Slide 35 text

django-multitenant
Benefits of django-multitenant
- Drop-in implementation of shared tables architecture
- Guarantees isolation
- Ready to scale with distributed Postgres (Citus)

Slide 36

Slide 36 text

Postgres, Citus and shared tables

Slide 37

Slide 37 text

Why Postgres
- Open source
- Constraints
- Rich SQL support
- Extensions
  - PostGIS / Geospatial
  - HLL
  - TopN
  - Citus
- Foreign data wrappers
- Fun indexes (GIN, GiST, BRIN…)
- CTEs
- Window functions
- Full text search
- Datatypes
  - JSONB

Slide 38

Slide 38 text

Why Citus
- Citus is an open source extension for PostgreSQL
- Implements a distributed architecture for Postgres
- Allows you to scale out CPU, memory, etc.
- Compatible with modern Postgres (up to 11)

Slide 39

Slide 39 text

Distributed Postgres with Citus

Slide 40

Slide 40 text

Distributed Postgres with Citus
Foreign key colocation

Slide 41

Slide 41 text

Distributed Postgres with Citus
Foreign key colocation
- Full SQL support for queries on a single set of co-located shards
- Multi-statement transaction support for modifications on a single set of co-located shards
- Foreign keys
- …
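Co-location comes from distributing every table on the same tenant column. In Citus this is done with the create_distributed_table function. The sketch below generates those statements for the tables of the example app. The app_owner table name is an assumption (the slides only show app_owl and app_letter), and distribute_statements is a hypothetical helper.

```python
# (table, distribution column) pairs; app_owner is an assumed table name.
TABLES = [
    ("app_owner", "id"),
    ("app_owl", "owner_id"),
    ("app_letter", "owner_id"),
]


def distribute_statements(tables):
    """Return one create_distributed_table call per (table, column) pair."""
    return [
        f"SELECT create_distributed_table('{table}', '{column}');"
        for table, column in tables
    ]


for statement in distribute_statements(TABLES):
    print(statement)
```

Because all three tables are distributed on the owner's id, rows for one tenant land on the same set of co-located shards. Citus expects primary keys and unique constraints on distributed tables to include the distribution column, so the Django migrations may need adjusting.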

Slide 42

Slide 42 text

Distributed Postgres with Citus
Scope your queries!

Slide 43

Slide 43 text

Why Citus

Slide 44

Slide 44 text

Scale out Django!
github.com/citusdata/django-multitenant
[email protected]
citusdata.com/newsletter
@louisemeta @citusdata