Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Django MySQL Optimization

Django MySQL Optimization

By Zune

Buzzvil

June 27, 2018
Tweet

More Decks by Buzzvil

Other Decks in Programming

Transcript

  1. Django MySQL Optimization Query investigation Using django logger LOGGING =

    { 'version': 1, 'handlers': { 'debug_console': { 'level': 'DEBUG', 'class': 'logging.StreamHandler', } }, 'loggers': { 'django.db.backends': { 'level': 'DEBUG', 'handlers': ['debug_console'], } } } Using breakpoint in pymysql source Set breakpoint at pymysql/connections.py:_execute_command You can watch all command including row level command. eg) SET AUTOCOMMIT, COMMIT You cannot watch "SET AUTOCOMMIT", "COMMIT" using django.db.backends logger Using Newrelic APM Databases Using Monyog Query analyzer QUERY ANALYSIS MODE with Sniffer Using performance schema settings.py
  2. SELECT digest_text, count_star / update_total * 100 as percentage_of_all FROM

    performance_schema.events_statements_summary_by_digest, ( SELECT sum(count_star) update_total FROM performance_schema.events_statements_summary_by_digest WHERE digest_text LIKE 'UPDATE%' ) update_totals WHERE digest_text LIKE 'UPDATE%' ORDER BY percentage_of_all DESC LIMIT 10 SELECT * FROM performance_schema.events_statements_history_long WHERE THREAD_ID IN (SELECT THREAD_ID FROM events_statements_history_long WHERE SQL_TEXT = 'COMMIT') ORDER BY THREAD_ID, EVENT_ID ; Optimization examples Removing unnecessary autocommit Before DATABASES = { 'default': { 'ENGINE': 'django.db.backends.mysql', 'ATOMIC_REQUESTS': True, ... }, } Top 10 time consuming queries Finding where "COMMIT" comes from settings.py
  3. def ping_general(request): return HttpResponse('OK', content_type='application/json') SET AUTOCOMMIT = 0 SET

    AUTOCOMMIT = 1 SET AUTOCOMMIT = 0 COMMIT SET AUTOCOMMIT = 1 Investigating problem First "SET AUTOCOMMIT = 0" autocommit function is called with value and compared with get_autocommit result which is retrieved from mysql View function Query output
  4. Parameter of autocommit function comes from self.autocommit_mode self.autocommit_mode is assigned

    by autocommit variable autocommit variable is False by default
  5. Connection is initialized without autocommit parameter Connection class is pymysql's

    class. Thus, first "SET AUTOCOMMIT = 0" is pymysql class's internal behavior
  6. After pymysql connection is created, django calls set_autocommit with AUTOCOMMIT

    flag which is True by django's default Second "SET AUTOCOMMIT = 0"
  7. Called from Atomic classes __enter__ function. You can find Atomic

    class decorates view function when ATOMIC_REQUESTS is True
  8. After DATABASES = { 'default': { 'ENGINE': 'django.db.backends.mysql', 'ATOMIC_REQUESTS': True,

    ... }, # pymysql autocommit default False connection set autocommit = false # None connection # django level connection set autocommit = true # pymysql connection response packet autocommit # autocommit autocommit # connection django set autocommit = true 'OPTIONS': { 'autocommit': None, } } settings.py
  9. SET AUTOCOMMIT = 0 COMMIT SET AUTOCOMMIT = 1 First

    "SET AUTOCOMMIT = 0" is removed because we added autocommit OPTIONS First "SET AUTOCOMMIT = 1" is removed even django calls self.set_autocommit(self.settings_dict["AUTOCOMMIT"]) because pymysql automatically detects current autocommit mode and if it is same, it doesn't send a query Removing transaction with view function def ping_general(request): return HttpResponse('OK', content_type='application/json') SET AUTOCOMMIT = 0 COMMIT SET AUTOCOMMIT = 1 @transaction.non_atomic_requests def ping_general(request): return HttpResponse('OK', content_type='application/json') No queries Removing transaction with view class Query output Before Query output After Query output
  10. class ActivityView(views.APIView): pass @method_decorator(transaction.non_atomic_requests, name='dispatch') class ActivityView(views.APIView): pass Removing feature

    check queries @transaction.non_atomic_requests def ping_general(request): user = User.objects.cache().get(pk=44) logger.info(user.id) return HttpResponse('OK', content_type='application/json') SELECT @@SQL_AUTO_IS_NULL SELECT VERSION() SELECT xxx FROM `users` WHERE xxx... I expect no queries are executed at all after model is cached but it still executes two unnecessary query SELECT @@SQL_AUTO_IS_NULL SELECT VERSION() Investigating "SELECT @@SQL_AUTO_IS_NULL" Before After View function Query output with first call Query output with second call
  11. Django is trying to get mysql version Django needs mysql

    version to check supports_microsecond_precision
  12. # auto_is_null false False DatabaseFeatures.is_sql_auto_is_null_enabled = False # MySQL 5.6.4

    supports_microsecond_precision true # 5.6.4 True DatabaseFeatures.supports_microsecond_precision = True Removing unnecessary transaction with model save @transaction.non_atomic_requests def ping_general(request): RawDataRequests().save() return HttpResponse('OK', content_type='application/json') SET AUTOCOMMIT = 0 INSERT INTO raw_data_requests ... COMMIT SET AUTOCOMMIT = 1 Breakpoint at "SET AUTUCOMMIT = 0" project_root/project_name/_ _.py init View function Query output
  13. There is atomic block inside django model's save function When

    you look at what happens inside the block save parent model handling pre_save signal update or insert and so on... It's impossible to optimize at the moment outside of django FAQ Why use transaction by default? Data integrity issues are usually found long after the problem actually happens Performance issues can be detected in early stage if it is critical Broken data is hard to fix Performance issues can be easily fixed