Multitenant Applications: How and Why

Applications often need multitenancy at some level. The most common scenario is keeping data isolated among clients. One way to achieve this is to have multiple database instances and connect to each according to the user accessing the system. Another approach is to have a single database and model relationships so that each client's data can be queried separately. The third common approach also uses a single database instance, but with multiple separate schemas. I'll go over each of these approaches. For each, you will learn about the architecture, understand how to build it using Django, see examples of how to make queries, and learn what tools can help on the job. By the end, you will understand the key differences and be able to choose the approach that best suits your next application.

Filipe Ximenes

August 04, 2017

Transcript

  1. Who am I? • Filipe Ximenes • Recife / Brazil • Aussie for 1 year (2008 - 2009)
  2. "... refers to a software architecture in which a single instance of software runs on a server and serves multiple tenants." - Wikipedia
  3. What do we want to achieve? • Reduce infrastructure costs by sharing hardware resources • Simplify software maintenance by keeping a single code base • Simplify infrastructure maintenance by having fewer nodes
  4. Routing - ibm.spinnertracking.com

         def tenant_middleware(get_response):
             def middleware(request):
                 host = request.get_host().split(':')[0]
                 subdomain = host.split('.')[0]
                 try:
                     customer = Customer.objects.get(name=subdomain)
                 except Customer.DoesNotExist:
                     customer = None
                 request.customer = customer
                 response = get_response(request)
                 return response
             return middleware
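
The host-parsing step in the middleware above can be pulled out into a small helper that is easy to test without Django. This is a minimal sketch, not code from the talk: the name `subdomain_from_host` and the `base_labels` parameter are hypothetical, and it assumes tenants always live exactly one label below a fixed base domain such as `spinnertracking.com`.

```python
def subdomain_from_host(host, base_labels=2):
    """Return the leftmost label of `host` as the tenant subdomain, or None.

    `base_labels` is the number of labels in the bare domain
    (2 for "spinnertracking.com"); this is an assumption of the sketch.
    """
    hostname = host.split(':')[0].lower()  # drop an optional port
    labels = hostname.split('.')
    if len(labels) <= base_labels:
        return None  # bare domain, so there is no tenant part
    return labels[0]
```

Unlike the one-liner on the slide, this returns `None` for the bare domain instead of treating its first label as a customer name.
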
  5. Drawbacks • Guaranteeing isolation is hard • Might add complexity to the codebase • 3rd-party library integration
  6. Routing

         DATABASES = {
             'default': {
                 'ENGINE': ...,
                 'NAME': ...,
             },
             'ibm': {
                 'ENGINE': ...,
                 'NAME': ...,
             }
         }
  7. The threadlocal middleware approach

         def multidb_middleware(get_response):
             def middleware(request):
                 subdomain = get_subdomain(request)
                 customer = get_customer(subdomain)
                 request.customer = customer

                 @thread_local(using_db=customer.name)
                 def execute_request(request):
                     return get_response(request)

                 response = execute_request(request)
                 return response
             return middleware
  8. The router

         class TenantRouter(object):
             def db_for_read(self, model, **hints):
                 return get_thread_local('using_db', 'default')

             def db_for_write(self, model, **hints):
                 return get_thread_local('using_db', 'default')

             # …

         # settings.py
         DATABASE_ROUTERS = ['multitenancy.routers.TenantRouter']
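
The slides use `thread_local` and `get_thread_local` without showing how they are defined. One way they could be written on top of the standard library's `threading.local` is sketched below. This is an assumption about their behaviour, not the speaker's implementation: the decorator sets the given values for the duration of the wrapped call and restores the previous state afterwards.

```python
import threading
from functools import wraps

# One store per thread; attributes set here are invisible to other threads.
_state = threading.local()


def get_thread_local(name, default=None):
    """Read a value stored for the current thread, or `default`."""
    return getattr(_state, name, default)


def thread_local(**values):
    """Decorator: set `values` on the thread-local store while the wrapped
    function runs, then restore whatever was there before."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            missing = object()  # sentinel for "attribute was not set"
            previous = {k: getattr(_state, k, missing) for k in values}
            for key, value in values.items():
                setattr(_state, key, value)
            try:
                return func(*args, **kwargs)
            finally:
                for key, old in previous.items():
                    if old is missing:
                        delattr(_state, key)
                    else:
                        setattr(_state, key, old)
        return wrapper
    return decorator
```

Restoring state in `finally` matters because server threads are reused across requests: without it, one tenant's `using_db` could leak into the next request handled by the same thread. The sketch relies on each request being handled entirely within one thread.
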
  9. What are schemas in the first place?

         SELECT id, name
         FROM user
         WHERE user.name LIKE 'F%';
  10. What are schemas in the first place?

          CREATE SCHEMA ibm;

          SELECT id, name
          FROM ibm.user
          WHERE ibm.user.name LIKE 'F%';
  11. Routing - middleware

          # ...
          connection.set_schema_to_public()
          hostname = self.hostname_from_request(request)
          TenantModel = get_tenant_model()
          try:
              tenant = self.get_tenant(TenantModel, hostname, request)
              assert isinstance(tenant, TenantModel)
          except TenantModel.DoesNotExist:
              # ...
          request.tenant = tenant
          connection.set_tenant(request.tenant)
          # ...
  12. Routing - settings

          MIDDLEWARE_CLASSES = [
              'tenant_schemas.middleware.TenantMiddleware',
              # …
          ]

          DATABASES = {
              'default': {
                  'ENGINE': 'tenant_schemas.postgresql_backend',
                  'NAME': 'mydb',
              }
          }
  13. Routing - db backend

          # ...
          try:
              cursor_for_search_path.execute(
                  'SET search_path = {0}'.format(','.join(search_paths)))
          except (django.db.utils.DatabaseError, psycopg2.InternalError):
              self.search_path_set = False
          else:
              self.search_path_set = True
          if name:
              cursor_for_search_path.close()
          # ...
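
The backend snippet above builds the `SET search_path` statement with plain string formatting, so the schema names have to be trustworthy. The sketch below shows that one step in isolation with a validation check added. `search_path_sql` is a hypothetical helper written for illustration, not part of the backend shown on the slide, and its name pattern is deliberately stricter than what PostgreSQL accepts.

```python
import re

# Conservative pattern for unquoted identifiers; an assumption of this
# sketch, stricter than PostgreSQL's own identifier rules.
_SCHEMA_NAME = re.compile(r'[a-z_][a-z0-9_]*')


def search_path_sql(tenant_schema, include_public=True):
    """Build a `SET search_path` statement for one tenant schema."""
    if not _SCHEMA_NAME.fullmatch(tenant_schema):
        raise ValueError('invalid schema name: %r' % tenant_schema)
    search_paths = [tenant_schema]
    if include_public:
        # Shared tables living in `public` stay visible after the tenant's.
        search_paths.append('public')
    return 'SET search_path = {0}'.format(','.join(search_paths))
```

With the tenant schema listed first, an unqualified table name resolves to the tenant's table before falling back to `public`, which is why ordinary queries need no changes.
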
  14. Querying across schemas

          SELECT id, duration FROM ibm.spinner_spin WHERE duration > 120
          UNION
          SELECT id, duration FROM vinta.spinner_spin WHERE duration > 120;
  15. Querying across schemas

          SELECT uuid, duration FROM ibm.spinner_spin WHERE duration > 120
          UNION
          SELECT uuid, duration FROM vinta.spinner_spin WHERE duration > 120;
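
Writing one `SELECT` per schema by hand gets tedious as tenants are added, so the cross-schema query can be generated. The sketch below is one possible way to do that and deliberately differs from the slides: it uses `UNION ALL` plus a literal `tenant` column instead of `UNION`, so rows that happen to share the same values in two schemas are not collapsed into one. `union_across_schemas` is a hypothetical helper, and the `where` text is assumed to come from trusted code.

```python
import re

# Conservative identifier check; an assumption of this sketch.
_SCHEMA_NAME = re.compile(r'[a-z_][a-z0-9_]*')


def union_across_schemas(schemas, table, columns, where):
    """Build one UNION ALL query that reads `table` from every schema.

    Adds the schema name as a literal `tenant` column so rows stay
    distinguishable. `where` is trusted SQL text in this sketch.
    """
    if not schemas:
        raise ValueError('at least one schema is required')
    parts = []
    for schema in schemas:
        if not _SCHEMA_NAME.fullmatch(schema):
            raise ValueError('invalid schema name: %r' % schema)
        parts.append(
            "SELECT '{s}' AS tenant, {cols} FROM {s}.{t} WHERE {w}".format(
                s=schema, cols=', '.join(columns), t=table, w=where))
    return '\nUNION ALL\n'.join(parts) + ';'
```

For example, `union_across_schemas(['ibm', 'vinta'], 'spinner_spin', ['id', 'duration'], 'duration > 120')` produces the two-schema query from the slides with the extra `tenant` column in front.
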
  16. Upsides • Querying looks the same as in a standard application • New schemas are created automatically • Knows how to handle migrations • Simpler infrastructure
  17. Drawbacks • Be careful with too many schemas (maybe no more than hundreds of clients?) • Tests need some setup and might get slower • Harder to query across schemas