Slide 1

Slide 1 text

Django MySQL Optimization Query investigation Using django logger LOGGING = { 'version': 1, 'handlers': { 'debug_console': { 'level': 'DEBUG', 'class': 'logging.StreamHandler', } }, 'loggers': { 'django.db.backends': { 'level': 'DEBUG', 'handlers': ['debug_console'], } } } Using breakpoint in pymysql source Set breakpoint at pymysql/connections.py:_execute_command You can watch all command including row level command. eg) SET AUTOCOMMIT, COMMIT You cannot watch "SET AUTOCOMMIT", "COMMIT" using django.db.backends logger Using Newrelic APM Databases Using Monyog Query analyzer QUERY ANALYSIS MODE with Sniffer Using performance schema settings.py

Slide 2

Slide 2 text

SELECT digest_text, count_star / update_total * 100 as percentage_of_all FROM performance_schema.events_statements_summary_by_digest, ( SELECT sum(count_star) update_total FROM performance_schema.events_statements_summary_by_digest WHERE digest_text LIKE 'UPDATE%' ) update_totals WHERE digest_text LIKE 'UPDATE%' ORDER BY percentage_of_all DESC LIMIT 10 SELECT * FROM performance_schema.events_statements_history_long WHERE THREAD_ID IN (SELECT THREAD_ID FROM events_statements_history_long WHERE SQL_TEXT = 'COMMIT') ORDER BY THREAD_ID, EVENT_ID ; Optimization examples Removing unnecessary autocommit Before DATABASES = { 'default': { 'ENGINE': 'django.db.backends.mysql', 'ATOMIC_REQUESTS': True, ... }, } Top 10 time consuming queries Finding where "COMMIT" comes from settings.py

Slide 3

Slide 3 text

def ping_general(request): return HttpResponse('OK', content_type='application/json') SET AUTOCOMMIT = 0 SET AUTOCOMMIT = 1 SET AUTOCOMMIT = 0 COMMIT SET AUTOCOMMIT = 1 Investigating problem First "SET AUTOCOMMIT = 0" autocommit function is called with value and compared with get_autocommit result which is retrieved from mysql View function Query output

Slide 4

Slide 4 text

Parameter of autocommit function comes from self.autocommit_mode self.autocommit_mode is assigned by autocommit variable autocommit variable is False by default

Slide 5

Slide 5 text

Connection is initialized without autocommit parameter Connection class is pymysql's class. Thus, first "SET AUTOCOMMIT = 0" is pymysql class's internal behavior

Slide 6

Slide 6 text

First "SET AUTOCOMMIT = 1"

Slide 7

Slide 7 text

After pymysql connection is created, django calls set_autocommit with AUTOCOMMIT flag which is True by django's default Second "SET AUTOCOMMIT = 0"

Slide 8

Slide 8 text

Called from Atomic classes __enter__ function. You can find Atomic class decorates view function when ATOMIC_REQUESTS is True

Slide 9

Slide 9 text

After DATABASES = { 'default': { 'ENGINE': 'django.db.backends.mysql', 'ATOMIC_REQUESTS': True, ... }, # pymysql autocommit default False connection set autocommit = false # None connection # django level connection set autocommit = true # pymysql connection response packet autocommit # autocommit autocommit # connection django set autocommit = true 'OPTIONS': { 'autocommit': None, } } settings.py

Slide 10

Slide 10 text

SET AUTOCOMMIT = 0 COMMIT SET AUTOCOMMIT = 1 First "SET AUTOCOMMIT = 0" is removed because we added autocommit OPTIONS First "SET AUTOCOMMIT = 1" is removed even django calls self.set_autocommit(self.settings_dict["AUTOCOMMIT"]) because pymysql automatically detects current autocommit mode and if it is same, it doesn't send a query Removing transaction with view function def ping_general(request): return HttpResponse('OK', content_type='application/json') SET AUTOCOMMIT = 0 COMMIT SET AUTOCOMMIT = 1 @transaction.non_atomic_requests def ping_general(request): return HttpResponse('OK', content_type='application/json') No queries Removing transaction with view class Query output Before Query output After Query output

Slide 11

Slide 11 text

class ActivityView(views.APIView): pass @method_decorator(transaction.non_atomic_requests, name='dispatch') class ActivityView(views.APIView): pass Removing feature check queries @transaction.non_atomic_requests def ping_general(request): user = User.objects.cache().get(pk=44) logger.info(user.id) return HttpResponse('OK', content_type='application/json') SELECT @@SQL_AUTO_IS_NULL SELECT VERSION() SELECT xxx FROM `users` WHERE xxx... I expect no queries are executed at all after model is cached but it still executes two unnecessary query SELECT @@SQL_AUTO_IS_NULL SELECT VERSION() Investigating "SELECT @@SQL_AUTO_IS_NULL" Before After View function Query output with first call Query output with second call

Slide 12

Slide 12 text

Breakpoint with _execute_command Django is trying to get some feature flag from mysql

Slide 13

Slide 13 text

Investigating "SELECT VERSION()" Breakpoint with _execute_command

Slide 14

Slide 14 text

Django is trying to get mysql version Django needs mysql version to check supports_microsecond_precision

Slide 15

Slide 15 text

Depending on supports_microsecond_precision, django decides data_types Removing "SELECT @@SQL_AUTO_IS_NULL" & "SELECT VERSION()"

Slide 16

Slide 16 text

# auto_is_null false False DatabaseFeatures.is_sql_auto_is_null_enabled = False # MySQL 5.6.4 supports_microsecond_precision true # 5.6.4 True DatabaseFeatures.supports_microsecond_precision = True Removing unnecessary transaction with model save @transaction.non_atomic_requests def ping_general(request): RawDataRequests().save() return HttpResponse('OK', content_type='application/json') SET AUTOCOMMIT = 0 INSERT INTO raw_data_requests ... COMMIT SET AUTOCOMMIT = 1 Breakpoint at "SET AUTUCOMMIT = 0" project_root/project_name/_ _.py init View function Query output

Slide 17

Slide 17 text

There is atomic block inside django model's save function When you look at what happens inside the block save parent model handling pre_save signal update or insert and so on... It's impossible to optimize at the moment outside of django FAQ Why use transaction by default? Data integrity issues are usually found long after the problem actually happens Performance issues can be detected in early stage if it is critical Broken data is hard to fix Performance issues can be easily fixed