Monitoring and Introspecting Django

A 15-minute presentation given at PyCon 2014 as part of the Advanced Django Patterns workshop (http://lanyrd.com/2014/pycon/scxqhp/), discussing techniques for monitoring and debugging large-scale Django applications in production.

Simon Willison

April 11, 2014

Transcript

  1. Monitoring and introspecting Django
     Simon Willison, @simonw
     PyCon 2014

  2. None
  3. The interesting bugs only happen in production.

  4. None
  5. What went wrong?
     What’s going to go wrong?

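  6. from sys import exc_info
     from django.views.debug import technical_500_response

     class UserBasedExceptionMiddleware:
         def process_exception(self, request, exception):
             # Superusers get the full debug traceback, even in production.
             if request.user.is_superuser:
                 return technical_500_response(request, *exc_info())

     djangosnippets.org/snippets/935/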
  7. None
  8. None
  9. StatsD + Graphite

  10. StatsD
      • Timers, counters, gauges
      • Local daemon, speaking UDP
      • Aggregates stats and sends to Graphite
      Graphite
      • Stores time-series data
      • Renders graphs on-demand
  11. StatsD
      • Timers, counters, gauges
      • Local daemon, speaking UDP
      • Aggregates stats and sends to Graphite
      Graphite
      • Stores time-series data
      • Renders graphs on-demand
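As a rough illustration of the "local daemon, speaking UDP" point, here is a minimal sketch of a StatsD client using only the standard library. The class and function names are hypothetical, and the host and port assume a StatsD daemon on its conventional default of 127.0.0.1:8125. It is not the client used in the talk.

```python
import socket


def format_metric(name, value, metric_type):
    """Build one StatsD datagram, e.g. 'views.home.hits:1|c'."""
    return "{}:{}|{}".format(name, value, metric_type)


class StatsClient:
    """Fire-and-forget client: sending over UDP never waits for a reply."""

    def __init__(self, host="127.0.0.1", port=8125):
        # Assumed address of a locally running StatsD daemon.
        self.addr = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _send(self, payload):
        try:
            self.sock.sendto(payload.encode("ascii"), self.addr)
        except OSError:
            pass  # metrics should never break the application

    def incr(self, name, count=1):
        self._send(format_metric(name, count, "c"))  # counter

    def timing(self, name, millis):
        self._send(format_metric(name, millis, "ms"))  # timer

    def gauge(self, name, value):
        self._send(format_metric(name, value, "g"))  # gauge
```

A view might call `StatsClient().incr("views.home.hits")` on each request; the daemon aggregates these and forwards the totals to Graphite.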
  12. None
  13. None
  14. GRAPH ALL THE THINGS

  15. Intercept everything (monkey-patch if you have to)

  16. • response/exception middleware
      • render(request, template, context)
      • DatabaseWrapper cursor.execute()
      • outgoing HTTP traffic
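One way to intercept a call such as `cursor.execute()` is to monkey-patch it with a timing wrapper. The sketch below is a generic illustration: `instrument`, the `record` callback and `FakeCursor` are all hypothetical names, and a stand-in class is patched rather than Django's real cursor so the example runs on its own.

```python
import functools
import time


def instrument(owner, method_name, record):
    """Replace owner.method_name with a wrapper that reports its duration.

    `record(name, millis)` is a hypothetical callback; in practice it might
    forward to a StatsD timer.
    """
    original = getattr(owner, method_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            record("{}.{}".format(owner.__name__, method_name), elapsed_ms)

    setattr(owner, method_name, wrapper)
    return original  # keep a handle so the patch can be undone


class FakeCursor:
    """Stand-in for a database cursor, used only to demonstrate the patch."""

    def execute(self, sql, params=None):
        return "ran: " + sql


timings = []
instrument(FakeCursor, "execute", lambda name, ms: timings.append((name, ms)))
result = FakeCursor().execute("SELECT 1")
```

After the patch, every `execute()` call appends a `("FakeCursor.execute", milliseconds)` pair to `timings`, while the caller still receives the original return value.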
  17. Logs should be aggregated and searchable

  18. Splunk

  19. logstash + kibana

  20. Correlation IDs

  21. [Diagram] Application | Service A | Service B | request

  22. [Diagram] Application | Service A | Service B | 9110dbba-6dd9-4d1c-8828-11be81ac0561 | request

  23. [Diagram] Application | Service A | Service B | 9110dbba-6dd9-4d1c-8828-11be81ac0561 | request

  24. [Diagram] Application | Service A | Service B | 9110dbba-6dd9-4d1c-8828-11be81ac0561 | SERVICE_A do_action ... 9110dbba-6dd9... | request

  25. [Diagram] Application | Service A | Service B | 9110dbba-6dd9-4d1c-8828-11be81ac0561 | SERVICE_A do_action ... 9110dbba-6dd9... | request

  26. [Diagram] Application | Service A | Service B | 9110dbba-6dd9-4d1c-8828-11be81ac0561 | SERVICE_A do_action ... 9110dbba-6dd9... | SERVICE_B do_action ... 9110dbba-6dd9... | request

  27. [Diagram] Application | Service A | Service B | 9110dbba-6dd9-4d1c-8828-11be81ac0561 | SERVICE_A do_action ... 9110dbba-6dd9... | SERVICE_B do_action ... 9110dbba-6dd9... | request | GET /bar/ ... 9110dbba-6dd9...

  28. [Diagram] Application | Service A | Service B | SERVICE_A do_action ... 9110dbba-6dd9... | SERVICE_B do_action ... 9110dbba-6dd9... | response | GET /bar/ ... 9110dbba-6dd9...
  29. <meta name="correlation_id" content="9110dbba-6dd9-4d1c-8828-11be81ac0561" />
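The correlation ID flow in the preceding slides could be sketched roughly as follows: reuse the ID supplied by the caller or mint a new UUID, stamp it on every log line, and pass it on to downstream services. All names here are hypothetical, and the `X-Correlation-ID` header name is an assumption, since the slides do not say how the ID is transmitted.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID for whichever request is currently being handled.
_correlation_id = contextvars.ContextVar("correlation_id", default="-")


def start_request(incoming_id=None):
    """Reuse the caller's ID if one was passed along, else mint a new one."""
    cid = incoming_id or str(uuid.uuid4())
    _correlation_id.set(cid)
    return cid


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = _correlation_id.get()
        return True


def outgoing_headers():
    """Headers to add to calls made to downstream services.

    The header name is an assumption; any name the services agree on works.
    """
    return {"X-Correlation-ID": _correlation_id.get()}


# Demo wiring: log to an in-memory stream so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(message)s %(correlation_id)s"))
handler.addFilter(CorrelationIdFilter())
log = logging.getLogger("SERVICE_A")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

cid = start_request("9110dbba-6dd9-4d1c-8828-11be81ac0561")
log.info("do_action")
```

In a Django application, `start_request` would typically be called from request middleware, and searching the aggregated logs for one ID then shows every service's log lines for that request.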

  30. None
  31. None
  32. Instrument your SQL queries

  33. /* /2014/pycon/ */ SELECT "events_userevent"...
      /* manage.py send_subscriptions 1000 */ SELECT...
      /* 9110dbba-6dd9-4d1c-8828-11be81ac0561 */ SELECT...
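The comment-prefixed queries above could be produced by a small helper like the one below. This is only a sketch: `annotate_sql` is a hypothetical name, and wiring it into Django's cursor is left out. The context string (a URL path, a management command, or a correlation ID) is assumed to be supplied by the caller.

```python
def annotate_sql(sql, context):
    """Prefix a query with a comment saying where it came from."""
    # Removing every '*' guarantees the context cannot contain '/*' or '*/',
    # so it can never close the comment early and inject SQL.
    safe = context.replace("*", "").strip()
    return "/* {} */ {}".format(safe, sql)
```

For example, `annotate_sql("SELECT 1", "/2014/pycon/")` yields `/* /2014/pycon/ */ SELECT 1`, so a slow query in the database log points straight back to the page that issued it.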
  34. No-one ever said… “I wish I had less information to help debug this problem”
  35. • The most interesting bugs happen in production
      • Use statsd/graphite to understand what’s going on in your stack
      • Logs should be detailed, aggregated and searchable