

Citus Data
February 17, 2019

Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbean 2019 | Louise Grandjonc

There are a number of data architectures you could use when building a multi-tenant app, such as one database per customer or one schema per customer. These two options scale to an extent, say while you have tens of tenants. However, as you start scaling to hundreds and thousands of tenants, you run into challenges with both performance and tenant maintenance. You can solve this problem by adding the notion of tenancy directly into the logic of your SaaS application. The challenge is how to implement and automate this in the Django ORM. We will talk about how to make a Django app tenant-aware and, at a broader level, explain how to scale out applications that are built on top of the Django ORM and follow a multi-tenant data model. We take PostgreSQL as our database of choice, but the logic and implementation can be extended to other relational databases as well.


Transcript

  1. About me
     - Software Engineer at Citus Data
     - Postgres enthusiast
     - @louisemeta and @citusdata on twitter
     - www.louisemeta.com
     - [email protected]
  2. Today’s agenda
     1. What do we mean by “multi-tenancy”?
     2. Three ways to scale a multi-tenant app
     3. Shared tables in your Django apps
     4. Postgres, Citus and shared tables
  3. What do we mean by “multi-tenancy”
     - Multiple customers (tenants)
     - Each with their own data
     - SaaS
     - Example: Shopify, Salesforce
  4. What do we mean by “multi-tenancy”: a very realistic example
     Owner examples:
     - Hogwarts
     - Ministry of Magic
     - Harry Potter
     - Post office
  5. One database per tenant
     - Organized collection of interrelated data
     - Don’t share resources:
       - Username and password
       - Connections
       - Memory
  6. One database per tenant: 1. Changing the settings
     DATABASES = {
         'tenant-{id1}': {
             'ENGINE': 'django.db.backends.postgresql',
             'NAME': 'hogwarts',
             'USER': 'louise',
             'PASSWORD': 'abc',
             'HOST': '…',
             'PORT': '5432'
         },
         'tenant-{id2}': {
             'ENGINE': 'django.db.backends.postgresql',
             'NAME': 'ministry',
             'USER': 'louise',
             'PASSWORD': 'abc',
             'HOST': '…',
             'PORT': '5432'
         },
         …
     }
     Warnings!
     - You need to have each tenant in the settings.
     - When you have a new customer, you need to create a database and change the settings.
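The settings on this slide repeat the same block for every tenant, so one way to reduce the manual editing is to generate the dictionary from a tenant list. The following is a minimal sketch, not something shown in the talk: `build_databases`, the `TENANTS` mapping, the aliases and the `localhost` host are all hypothetical names chosen for illustration.

```python
def build_databases(tenants, user, password, host, port="5432"):
    """Return a Django-style DATABASES dict with one entry per tenant.

    `tenants` maps a settings alias (e.g. "tenant-1") to a database name.
    """
    return {
        alias: {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": db_name,
            "USER": user,
            "PASSWORD": password,
            "HOST": host,
            "PORT": port,
        }
        for alias, db_name in tenants.items()
    }


# Hypothetical tenant list; in practice it might be loaded from a config file.
TENANTS = {"tenant-1": "hogwarts", "tenant-2": "ministry"}

# The host is a placeholder assumption, not a value from the slides.
DATABASES = build_databases(TENANTS, user="louise", password="abc", host="localhost")
```

Django also expects a `default` entry in `DATABASES`; it can be left as an empty dictionary when every query is routed explicitly. Creating the new database itself still has to happen outside the settings file.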
  7. One database per tenant: 2. Handling migrations
     python manage.py migrate --database=tenant_id1
     Run this for each tenant whenever you have a new migration.
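Running the migration once per tenant can be scripted. The sketch below is an assumption about how one might do it, not the speaker's tooling: `migrate_commands` and `migrate_all` are hypothetical helpers, and the aliases must match the keys of `DATABASES`. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys


def migrate_commands(aliases):
    """Build one `manage.py migrate` command per database alias."""
    return [
        [sys.executable, "manage.py", "migrate", f"--database={alias}"]
        for alias in aliases
    ]


def migrate_all(aliases, dry_run=True):
    """Run the migrations one tenant at a time, stopping at the first failure."""
    for command in migrate_commands(aliases):
        if dry_run:
            # Only show what would be executed.
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if a migration fails.
            subprocess.run(command, check=True)


# Hypothetical aliases, used here only to show the generated commands.
migrate_all(["tenant_id1", "tenant_id2"])
```

Because the tenants are migrated sequentially, the total time grows with the number of tenants, which is the "longer running migrations" drawback listed for this architecture.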
  8. One database per tenant: changes needed to handle it with Django ORM
     3. Creating your own database router
     class ExampleDatabaseRouter(object):
         """
         Determines on which tenant database to read/write
         """

         def db_for_read(self, model, **hints):
             """Returns the name of the right database depending on the query"""
             return 'tenant_idx'

         def db_for_write(self, model, **hints):
             """Returns the name of the right database depending on the query"""
             return 'tenant_idx'

         def allow_relation(self, obj1, obj2, **hints):
             """Determine if relationship is allowed between two objects.
             The two objects have to be on the same database ;)"""
             pass
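The router on this slide returns a fixed alias. One possible way to pick the alias per request is to keep it in thread-local state, as in the sketch below. This is an illustration under assumptions: `set_tenant_alias`, `get_tenant_alias` and `TenantDatabaseRouter` are hypothetical names, and the `SimpleNamespace` objects only stand in for model instances so the example runs without Django.

```python
import threading
from types import SimpleNamespace

_local = threading.local()


def set_tenant_alias(alias):
    """Remember which database alias the current thread should use."""
    _local.alias = alias


def get_tenant_alias():
    """Return the alias set for this thread, or None if nothing was set."""
    return getattr(_local, "alias", None)


class TenantDatabaseRouter:
    """Routes reads and writes to the current tenant's database."""

    def db_for_read(self, model, **hints):
        # Returning None tells Django this router has no suggestion.
        return get_tenant_alias()

    def db_for_write(self, model, **hints):
        return get_tenant_alias()

    def allow_relation(self, obj1, obj2, **hints):
        # Django records the alias an instance was loaded from in _state.db.
        return obj1._state.db == obj2._state.db


# Stand-ins for model instances, just to exercise allow_relation.
owl = SimpleNamespace(_state=SimpleNamespace(db="tenant-1"))
letter = SimpleNamespace(_state=SimpleNamespace(db="tenant-2"))

router = TenantDatabaseRouter()
set_tenant_alias("tenant-1")
print(router.db_for_read(model=None))      # tenant-1
print(router.allow_relation(owl, letter))  # False
```

In a real project the router class would be listed in the `DATABASE_ROUTERS` setting, and the alias would typically be set by middleware at the start of each request.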
  9. One database per tenant
     PROS
     - Start quickly
     - Isolate customer (tenant) data
     - Compliance is a bit easier
     - If one customer is queried a lot, performance degradation will be low
     CONS
     - Time for DBA/developer to manage
     - Hard to handle with ORMs
     - Maintain consistency (ex: create index across all databases)
     - Longer running migrations
     - Performance degrades as # customers (tenants) goes up
  10. One schema per tenant
      - Logical namespaces to hold a set of tables
      - Share resources:
        - Username and password
        - Connections
        - Memory
  11. One schema per tenant
      PROS
      - Better resource utilization vs. one database per tenant
      - Start quickly
      - Logical isolation
      CONS
      - Hard to manage (ex: add column across all schemas)
      - Longer running migrations
      - Performance degrades as # customers (tenants) goes up
  12. Shared tables architecture
      PROS
      - Easy maintenance
      - Faster running migrations
      - Best resource utilization
      - Faster performance
      - Scales to 1k-100k tenants
      CONS
      - Application code to guarantee isolation
      - Make sure ORM calls are always scoped to a single tenant
  13. Main problems to solve
      - Make sure ORM calls are always scoped to a single tenant
      - Include the tenant column in joins
  14. django-multitenant
      Without django-multitenant:
      Letter.objects.filter(id=1).select_related('deliverer')
      <=>
      SELECT * from app_letter
      INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id)
      WHERE app_letter.id=1
      With django-multitenant:
      Letter.objects.filter(id=1).select_related('deliverer')
      <=>
      SELECT * from app_letter
      INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id
                             AND app_owl.owner_id=app_letter.owner_id)
      WHERE app_letter.id=1 AND app_owl.owner_id = <tenant_id>
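To see why the tenant column matters in the join, the two queries on this slide can be run against a small in-memory database. This is a sketch only: it uses SQLite from the Python standard library rather than Postgres, the rows are invented, and it assumes ids are unique per tenant rather than globally, so two tenants can both have a letter and an owl with id 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_owl (id INTEGER, owner_id INTEGER, name TEXT);
    CREATE TABLE app_letter (
        id INTEGER, owner_id INTEGER, deliverer_id INTEGER, content TEXT
    );
    -- Two tenants (owner_id 1 and 2), each with an owl and a letter of id 1.
    INSERT INTO app_owl VALUES (1, 1, 'Hedwig'), (1, 2, 'Errol');
    INSERT INTO app_letter VALUES (1, 1, 1, 'Dear Harry'), (1, 2, 1, 'Dear Ron');
""")

# Join on the id only, as in the first query of the slide.
UNSCOPED = """
    SELECT app_letter.content, app_owl.name
    FROM app_letter
    INNER JOIN app_owl ON (app_owl.id = app_letter.deliverer_id)
    WHERE app_letter.id = 1
"""

# Join on id and tenant column, filtered to one tenant, as in the second query.
SCOPED = """
    SELECT app_letter.content, app_owl.name
    FROM app_letter
    INNER JOIN app_owl
        ON (app_owl.id = app_letter.deliverer_id
            AND app_owl.owner_id = app_letter.owner_id)
    WHERE app_letter.id = 1 AND app_owl.owner_id = ?
"""

unscoped_rows = conn.execute(UNSCOPED).fetchall()
scoped_rows = conn.execute(SCOPED, (1,)).fetchall()

print(len(unscoped_rows))  # 4: every letter with id 1 matches every owl with id 1
print(scoped_rows)         # [('Dear Harry', 'Hedwig')]
```

The unscoped query mixes rows from both tenants, while the scoped one returns only the row that belongs to tenant 1.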
  15. django-multitenant 3 steps
      1. Change models to use TenantModelMixin and TenantManagerMixin
      2. Change ForeignKey to TenantForeignKey
      3. Define tenant scoping: set_current_tenant(t)
  16. django-multitenant 3 steps: models before using django-multitenant
      from django.db import models

      class Owner(models.Model):
          type = models.CharField(max_length=10)  # add choice
          name = models.CharField(max_length=255)

      class Owl(models.Model):
          name = models.CharField(max_length=255)
          owner = models.ForeignKey(Owner)
          feather_color = models.CharField(max_length=255)
          favorite_food = models.CharField(max_length=255)

      class Letters(models.Model):
          content = models.TextField()
          deliverer = models.ForeignKey(Owl)
  17. django-multitenant 3 steps: models with django-multitenant
      from django.db import models
      from django_multitenant.fields import TenantForeignKey
      from django_multitenant.mixins import TenantManagerMixin, TenantModelMixin

      class TenantManager(TenantManagerMixin, models.Manager):
          pass

      class Owner(TenantModelMixin, models.Model):
          type = models.CharField(max_length=10)  # add choice
          name = models.CharField(max_length=255)
          tenant_id = 'id'
          objects = TenantManager()

      class Owl(TenantModelMixin, models.Model):
          name = models.CharField(max_length=255)
          owner = TenantForeignKey(Owner)
          feather_color = models.CharField(max_length=255)
          favorite_food = models.CharField(max_length=255)
          tenant_id = 'owner_id'
          objects = TenantManager()

      class Letters(TenantModelMixin, models.Model):
          content = models.TextField()
          deliverer = models.ForeignKey(Owl)
          owner = TenantForeignKey(Owner)
          tenant_id = 'owner_id'
          objects = TenantManager()
  18. django-multitenant 3 steps: set_current_tenant(t)
      - Specifies which tenant the APIs should be scoped to
      - Set in the authentication logic via middleware
      - Set explicitly at the top of a function (ex: view, external tasks/jobs)
  19. django-multitenant 3 steps: set_current_tenant(t) in middleware
      from django_multitenant.utils import set_current_tenant

      class TenantMiddleware:
          def __init__(self, get_response):
              self.get_response = get_response
              # One-time configuration and initialization.

          def __call__(self, request):
              # Assuming your app has a function to get the tenant associated with a user
              current_tenant = get_tenant_for_user(request.user)
              set_current_tenant(current_tenant)
              response = self.get_response(request)
              return response
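Outside a request, for example in a background job, the tenant has to be set explicitly at the top of the function. A context manager is one possible way to make that scoping harder to forget. The sketch below is an assumption, not part of django-multitenant: `set_current_tenant` and `get_current_tenant` here are local stand-ins that only mimic the idea of a current tenant, and `tenant_scope` and `send_daily_digest` are hypothetical names.

```python
import contextvars
from contextlib import contextmanager

# Stand-in for the library's tenant state; not the django-multitenant code.
_current_tenant = contextvars.ContextVar("current_tenant", default=None)


def set_current_tenant(tenant):
    """Record the tenant that subsequent calls should be scoped to."""
    _current_tenant.set(tenant)


def get_current_tenant():
    """Return the tenant currently in scope, or None."""
    return _current_tenant.get()


@contextmanager
def tenant_scope(tenant):
    """Scope the enclosed block to `tenant`, then restore the previous one."""
    previous = get_current_tenant()
    set_current_tenant(tenant)
    try:
        yield tenant
    finally:
        # Restore even if the block raised, so no tenant leaks into later work.
        set_current_tenant(previous)


def send_daily_digest(tenant):
    """Hypothetical background job: everything inside runs for one tenant."""
    with tenant_scope(tenant):
        return f"digest for {get_current_tenant()}"


print(send_daily_digest("hogwarts"))  # digest for hogwarts
print(get_current_tenant())           # None
```

Restoring the previous value in `finally` means a failed job cannot leave its tenant set for whatever runs next in the same worker.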
  20. django-multitenant: benefits of django-multitenant
      - Drop-in implementation of shared tables architecture
      - Guarantees isolation
      - Ready to scale with distributed Postgres (Citus)
  21. Why Postgres
      - Open source
      - Constraints
      - Rich SQL support
      - Extensions
      - PostGIS / Geospatial
      - HLL
      - TopN
      - Citus
      - Foreign data wrappers
      - Fun indexes (GIN, GiST, BRIN…)
      - CTEs
      - Window functions
      - Full text search
      - Datatypes
      - JSONB
  22. Why Citus
      - Citus is an open source extension for PostgreSQL
      - Implements a distributed architecture for Postgres
      - Allows you to scale out CPU, memory, etc.
      - Compatible with modern Postgres (up to 11)
  23. Distributed Postgres with Citus: foreign key colocation
      - Full SQL support for queries on a single set of co-located shards
      - Multi-statement transaction support for modifications on a single set of co-located shards
      - Foreign keys
      - …
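The idea behind co-location can be sketched in a few lines: if every table is distributed on the same tenant column, all rows of one tenant hash to the same shard, so joins and foreign keys between them stay on one node. The code below is a toy model under assumptions, not how Citus assigns shards internally: `shard_for`, `SHARD_COUNT` and the CRC32 hash are stand-ins chosen for illustration.

```python
import zlib

SHARD_COUNT = 4  # hypothetical; chosen only to keep the example small


def shard_for(tenant_id, shard_count=SHARD_COUNT):
    """Map a tenant id to a shard index with a stable hash."""
    return zlib.crc32(str(tenant_id).encode()) % shard_count


# Invented rows; owner_id plays the role of the distribution column.
owls = [{"id": 1, "owner_id": 1}, {"id": 2, "owner_id": 2}]
letters = [
    {"id": 10, "owner_id": 1, "deliverer_id": 1},
    {"id": 11, "owner_id": 2, "deliverer_id": 2},
]

for letter in letters:
    owl = next(o for o in owls if o["id"] == letter["deliverer_id"])
    # Same owner_id means same shard, so the letter and its owl live together.
    assert shard_for(letter["owner_id"]) == shard_for(owl["owner_id"])
```

This is also why the tenant column has to be part of every join condition: it is what lets the query be routed to a single set of co-located shards.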