Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Query Fingerprinting

Query Fingerprinting

From a talk I gave at Ruby Ireland

Video here:
https://www.youtube.com/watch?v=OLLSDwMRLKc

ciaranlee

April 15, 2015
Tweet

More Decks by ciaranlee

Other Decks in Technology

Transcript

  1. Some of the things we did • percona toolkit •

    MySQL performance schema • Percona cloud tools1 (now supports RDS) • slow query log analysis • 2x Percona consulting engagements • Lots of conversations with AWS 1 Percona Cloud Tools
  2. In hindsight • We ran out of read capacity •

    It was not obvious • Found out when we started to fix it
  3. ActiveSupport Instrumentation API2 • Developers can create hooks in their

    code • Hooks can be subscribed to • Bunch of hooks in Rails, e.g. • processing of an action of a controller • rendering a partial 2 Check out the rails docs on Active Support Instrumentation
  4. Subscribing to ActiveRecord events ActiveSupport::Notifications.subscribe 'sql.active_record' do |*args| event =

    ActiveSupport::Notifications::Event.new(*args) puts "payload: #{event.payload}" puts "duration: #{event.duration}" end
  5. When ActiveRecord queries the database > App.find(6) payload: { :sql=>"SELECT

    * FROM `apps` WHERE `apps`.`id` = 1", :name=>"App Load", :connection_id=>70363132127620, :statement_name=>nil, :binds=>[] } duration: 1.156 >
  6. Query Fingerprinting idea • In the past we had a

    number of outages caused by suboptimal queries • New queries are unavoidable when shipping fast • Alerting of new queries would be cool
  7. Decided to give it a shot • We wanted to

    know • when new queries were introduced • which specific queries were slow • the relative differences in volume of the queries running on each table • which parts of our app were running certain queries
  8. Query Fingerprinting - generating a unique identifier /* raw */

    SELECT * FROM `apps` WHERE `apps`.`id` = 1; /* obfuscate */ SELECT * FROM `apps` WHERE `apps`.`id` = ? /* md5 */ '4fd58fda431298d217fa8e77b52859e8'
  9. class SqlMetricsProcessor attr_accessor :known_md5_signatures def initialize @known_md5_signatures = Rails.cache.read(:known_sql_fingerprint_md5_signatures) ||

    {} end def record_metrics(event) options = extract_options_from_event(event) send_metrics_to_statsd(options) SqlFingerprintWorker.perform_async(options) if should_record_fingerprint?(event, options) @known_md5_signatures[options[:md5_signature]] = true end def should_record_fingerprint?(event, options) @known_md5_signatures[options[:md5_signature]].nil? end end
  10. It's useful! • We know when new queries ship •

    We have timing and volume metrics for every query type
  11. Back to January • A small number of SELECTs dominated

    query volume (~20 queries over 25M QPD) • Most of those queries were for records that changed rarely • Mostly run by high volume API endpoints, which were sensitive to latency increases
  12. IdentityCache3 class App < ActiveRecord::Base include IdentityCache end @app =

    App.fetch(id) 3 Shopify's IdentityCache is awesome
  13. Our database liked caching :) • It became clear that

    dropping query volume improved DB health • Impact of database performance problems was reduced • The performance problems became less common
  14. Next steps • We're going to open source it •

    Will require a little effort because • tied to our production environment • requires sidekiq/memcache/rails model/views