API Optimization Tale: Monitor, Fix and Deploy (on Friday). GopherCon Europe 2023

mrzasa

June 28, 2023

Transcript

  1. API OPTIMISATION TALE: MONITOR, FIX AND DEPLOY (ON FRIDAY)

    MACIEK RZĄSA · CHATTERMILL · @MJRZASA
    Photo by Miguel Á. Padriñán from Pixabay
  2. EXTRACTION

    FROM

        class Product < ApplicationRecord
          has_many :billing_records
        end

        class BillingRecord < ApplicationRecord
          belongs_to :product
        end

    TO

        class Product < ApplicationRecord
          def billing_records
            @billing_records ||= ::Billing::QueryService
              .billing_records_for_products(self)
          end
        end

        class BillingRecord
          def product
            @product ||= Product.find(product_id)
          end
        end
  3. MONITOR: wait for it

    FIX (OPTIMISE): wait for it
    DEPLOY: CI checks; easy & reliable rollback; safe env with a fallback; feature flags
  4. EXTRACTION: MONITORING

    standard errors (Rollbar/Sentry)
    performance (NewRelic)
    custom request instrumentation (Kibana): method name, arguments, stacktrace, response time (elapsed), error

        {
          "payload": {
            "method": "records_for_product",
            "arguments": "[[\"gid://platform/Product/12345\"]]",
            "stacktrace": "[ \"app/models/product.rb:123\", \"app/services/sell_product.rb:43\" ]",
            "elapsed": 1.128494586795568,
            "error": null
          }
        }
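A minimal sketch of how a custom request instrumentation wrapper could produce a payload with the fields listed on this slide. `RequestInstrumentation`, its `instrument` method, the `logger:` parameter and the `app/` frame filter are assumptions made for illustration; the talk does not show the real implementation.

```ruby
require "json"

# Hypothetical helper, not the code from the talk.
module RequestInstrumentation
  # Runs the block, then logs method name, arguments, stacktrace,
  # elapsed time and error as one JSON line. Errors are re-raised.
  def self.instrument(method_name, arguments, logger: $stdout)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    error = nil
    begin
      yield
    rescue StandardError => e
      error = e
      raise
    ensure
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      payload = {
        method: method_name,
        arguments: arguments.to_json,
        # Assumption: application frames are the ones starting with "app/".
        stacktrace: caller.select { |frame| frame.start_with?("app/") }.to_json,
        elapsed: elapsed,
        error: error && "#{error.class}: #{error.message}"
      }
      logger.puts({ payload: payload }.to_json)
    end
  end
end

# Usage: wrap any remote call; the block's value is returned unchanged.
RequestInstrumentation.instrument("records_for_product", ["gid://platform/Product/12345"]) do
  :response_from_billing
end
```

Logging in `ensure` means failed requests are recorded too, so slow calls and erroring calls end up in the same index.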
  5. MONITOR: standard stack (Rollbar/NewRelic); custom request instrumentation

    FIX (OPTIMISE): wait for it
    DEPLOY: CI checks; easy & reliable rollback; safe env with a fallback; feature flags
  6. FLOOD OF REQUESTS

    PROBLEM: SINGLE VIEW/JOB INITIATES MANY BILLING REQUESTS
    HOW MANY? THOUSANDS!
  7. FLOOD OF REQUESTS

    INITIAL

        def perform(*)
          products = Product.eligible
          products.find_each do |product|
            # one billing request per call
            DoBusinessLogic.call(product)
          end
        end

        class DoBusinessLogic
          def self.call(product)
            product.billing_records.each {}
          end
        end

        class Product < ApplicationRecord
          def billing_records
            @billing_records ||= ::Billing::QueryService
              .billing_records_for_products(self)
          end
        end

    OPTIMISED

        def perform(*)
          products = Product.eligible
          products.find_in_batches do |batch|
            # one billing request per batch
            cache_billing_records(batch).each do |p|
              # no billing requests
              DoBusinessLogic.call(p)
            end
          end
        end

        def cache_billing_records(products)
          indexed_records = ::Billing::QueryService
            .billing_records_for_products(*products)
            .group_by(&:product_gid)
          products.each do |product|
            product.cache_billing_records!(
              indexed_records[product.gid].to_a
            )
          end
        end
  8. FLOOD OF REQUESTS? PRELOAD, CACHE, (HASH-)JOIN

    Pictured: Ulf Michael Widenius, MySQL. Image source: wikipedia.org
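The preload-and-join idea can be tried outside Rails. The sketch below is plain Ruby under stated assumptions: `Product` and `BillingRecord` are bare structs, and `FakeBillingApi` is a made-up stub that counts requests. None of these stand for the real services from the talk.

```ruby
# Plain-Ruby stand-ins; all names here are illustrative.
Product = Struct.new(:gid, :billing_records)
BillingRecord = Struct.new(:gid, :product_gid)

# Stub of the remote billing API that counts the requests it receives.
class FakeBillingApi
  attr_reader :request_count

  def initialize(records)
    @records = records
    @request_count = 0
  end

  def records_for_products(product_gids)
    @request_count += 1
    @records.select { |record| product_gids.include?(record.product_gid) }
  end
end

# One request per batch, then an in-memory hash join on product_gid.
def preload_billing_records(products, api, batch_size: 2)
  products.each_slice(batch_size) do |batch|
    by_product = api.records_for_products(batch.map(&:gid)).group_by(&:product_gid)
    batch.each { |product| product.billing_records = by_product.fetch(product.gid, []) }
  end
  products
end

records = [BillingRecord.new("r1", "p1"), BillingRecord.new("r2", "p1"), BillingRecord.new("r3", "p3")]
api = FakeBillingApi.new(records)
products = %w[p1 p2 p3 p4 p5].map { |gid| Product.new(gid, nil) }
preload_billing_records(products, api)
puts api.request_count # 5 products in batches of 2: 3 requests instead of 5
```

Products without any records get an empty array rather than `nil`, so callers can iterate without a guard.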
  9. FREQUENTLY NEEDED DATA

    PROBLEM: SINGLE FIELD WAS FREQUENTLY USED (~1K HITS PER DAY)
  10. FREQUENTLY NEEDED DATA

    PLAN
    add field to Kafka
    build a read model
    backfill the data
    start using the read model
    remove billing query

    SOLUTION
    find that date in local DB
    verify if it's really the same date
    use it and remove billing query

        # 1k billing hits per day
        ::Billing::QueryService
          .first_successful_record_created_at(client)
          &.in_time_zone&.to_date

        # one local DB query
        client
          .products.successful
          .minimum(:start_date)
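The "verify if it's really the same date" step could be done with a one-off comparison before the billing query is removed. This is a sketch under assumptions: `date_mismatches` and its `remote:` / `local:` callables are hypothetical names, and the lambdas stand in for the billing call and the local DB query shown above.

```ruby
require "date"

# Hypothetical check: returns one entry per client whose remote and
# local dates differ. `remote` and `local` are callables taking a client id.
def date_mismatches(client_ids, remote:, local:)
  client_ids.each_with_object([]) do |id, mismatches|
    remote_date = remote.call(id)
    local_date = local.call(id)
    next if remote_date == local_date

    mismatches << { client_id: id, remote: remote_date, local: local_date }
  end
end

# Stand-in data; in the real system these would be the two queries above.
remote_data = { 1 => Date.new(2020, 8, 21), 2 => Date.new(2020, 9, 1), 3 => nil }
local_data  = { 1 => Date.new(2020, 8, 21), 2 => Date.new(2020, 9, 2), 3 => nil }

mismatches = date_mismatches(
  [1, 2, 3],
  remote: ->(id) { remote_data[id] },
  local: ->(id) { local_data[id] }
)
mismatches.each { |m| puts m.inspect }
```

An empty result over a representative set of clients is the signal that the local value can replace the remote one.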
  11. DATA NEEDED FREQUENTLY? USE THE DOMAIN, LUKE!

    Image source: starwars.fandom.com
  12. DATA FLOOD

    PROBLEM: GENERIC QUERIES FETCHING ALL THE DATA THAT MIGHT BE NEEDED
  13. DATA FLOOD

        # REST response
        {
          "gid": "gid://...",
          "clientGid": "gid://...",
          "productGid": "gid://...",
          "availability": true,
          "pending": false,
          "frequency": "weekly",
          "startDate": "2020-08-21",
          "endDate": "2020-10-28"
          # ...
          # 36 fields total
          # loading 3-4 associations
        }

        def billing_records_for_products(*products)
          fetch_billing_records(
            filter: {products: products}
          ).select(&:accessible?)
        end

        query($filter: RecordFilter!) {
          cycles(filter: $filter) {
            nodes {
              gid
              productGid
              pending
              frequency
            }
          }
        }

        def billing_records_for_products(*products)
          fetch_billing_records(
            filter: {
              products: products,
              accessible: true
            }
          )
        end
  14. WHAT COULD POSSIBLY GO WRONG?

    ROOT CAUSE

        # REST client
        get('/records', **params.slice(:product_gids))

        # DB query in billing
        def billing_records(product_gids: nil, gids: nil, client_gid: nil)
          scope = ::BillingRecord
          scope = scope.where(product_gid: product_gids) if product_gids
          scope = scope.where(gid: gids) if gids
          scope = scope.where(client_gid: client_gid) if client_gid
          scope.all
        end

    FIX

        def billing_records(product_gids: nil, gids: nil, client_gid: nil)
          return [] if [product_gids, gids, client_gid].all?(&:blank?)
          # ...
        end
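One plausible reading of this slide: the client forwards only `:product_gids`, so a call filtered by anything else reaches billing with no filters at all, and the unguarded query returns every record. The sketch below reproduces that with an in-memory array. `RECORDS` and both method names are made up for the demonstration, and `nil?`/`empty?` stand in for ActiveSupport's `blank?`.

```ruby
RECORDS = [
  { gid: "r1", product_gid: "p1", client_gid: "c1" },
  { gid: "r2", product_gid: "p2", client_gid: "c1" },
  { gid: "r3", product_gid: "p3", client_gid: "c2" }
].freeze

# Same shape as the original query: every filter is optional.
def billing_records_unguarded(product_gids: nil, gids: nil, client_gid: nil)
  scope = RECORDS
  scope = scope.select { |r| product_gids.include?(r[:product_gid]) } if product_gids
  scope = scope.select { |r| gids.include?(r[:gid]) } if gids
  scope = scope.select { |r| r[:client_gid] == client_gid } if client_gid
  scope
end

# Same idea as the fix: no filters means no records.
def billing_records_guarded(product_gids: nil, gids: nil, client_gid: nil)
  filters = [product_gids, gids, client_gid]
  return [] if filters.all? { |f| f.nil? || f.empty? }

  billing_records_unguarded(product_gids: product_gids, gids: gids, client_gid: client_gid)
end

# The caller wanted one client's records, but only :product_gids is forwarded.
params = { client_gid: "c1" }
forwarded = params.slice(:product_gids) # empty hash: the filter is lost

leaked = billing_records_unguarded(**forwarded) # every record comes back
safe = billing_records_guarded(**forwarded)     # nothing comes back
puts "unguarded: #{leaked.size} records, guarded: #{safe.size} records"
```

The guard turns a silent full-table read into an empty result, which is cheap and much easier to notice.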
  15. DATA FLOOD?

    QUERY CUSTOMISATION & UNDERFETCHING
    FILTERING ON THE SERVER SIDE
  16. 429 TOO MANY REQUESTS

    PROBLEM: SPIKE OF REQUESTS EVERY SUNDAY EVENING
  17. 429 TOO MANY REQUESTS

    PROBLEM

        # scheduling at talent's 5 PM on Sunday
        eligible_products.each do |product|
          WeeklyReminder.schedule(
            product, day: :sunday, time: '17:00'
          )
        end

    WEEK 1. SOLUTION: PRELOADING

        eligible_products.find_in_batches do |batch|
          with_billing_records_preloaded(batch) do
            batch.each do |product|
              WeeklyReminder.schedule(
                product, day: :sunday, time: '17:00'
              )
            end
          end
        end

    WEEK 2. PROPER SOLUTION: JITTER

        # class WeeklyReminder
        def scheduling_time(*)
          super + (SecureRandom.rand * 120 - 60).seconds
        end

    WEEK 3. FINAL PROPER SOLUTION

        # class AnotherWeeklyReminder
        def scheduling_time(*)
          super + (SecureRandom.rand * 120 - 60).seconds
        end

    WEEK 4. REALLY FINAL PROPER SOLUTION: RATE LIMITING

        Sidekiq::Limiter.window(
          'weekly-reminder',
          RATE_LIMIT_COUNT,
          RATE_LIMIT_INTERVAL,
          wait_timeout: 2
        )
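To make the two load-spreading ideas concrete without Sidekiq, here is a plain-Ruby sketch. `WindowLimiter` is a hypothetical fixed-window counter written for this example; it is not how `Sidekiq::Limiter.window` is implemented, and unlike the slide's limiter it raises immediately instead of waiting. `jitter_seconds` mirrors the slide's formula, with an injectable random source as an added assumption.

```ruby
# Hypothetical fixed-window limiter: at most `count` calls per `interval` seconds.
class WindowLimiter
  class OverLimit < StandardError; end

  def initialize(count, interval, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @count = count
    @interval = interval
    @clock = clock # injectable so the window can be tested without sleeping
    @window_start = nil
    @used = 0
  end

  def within_limit
    now = @clock.call
    if @window_start.nil? || now - @window_start >= @interval
      # A new window starts: reset the counter.
      @window_start = now
      @used = 0
    end
    raise OverLimit, "limit of #{@count} per #{@interval}s reached" if @used >= @count

    @used += 1
    yield
  end
end

# Same formula as the slide: a uniform offset in [-60, 60) seconds.
def jitter_seconds(random: Random.new)
  random.rand * 120 - 60
end

limiter = WindowLimiter.new(2, 60)
2.times { limiter.within_limit { :sent } }
begin
  limiter.within_limit { :sent }
rescue WindowLimiter::OverLimit => e
  puts "throttled: #{e.message}"
end
```

Jitter flattens the spike on the scheduling side, while the limiter caps it on the sending side; the slide sequence suggests both were needed.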
  18. 429 TOO MANY REQUESTS?

    I DON'T ALWAYS TEST ON PRODUCTION
    BUT WHEN I DO, I RUN TESTS ON FRIDAY
  19. MONITOR: standard stack (Rollbar/NewRelic); custom request instrumentation

    FIX (OPTIMISE): preloading to avoid N+1; server-side filtering; using local data; underfetching; spreading the load
    DEPLOY: CI checks; easy & reliable rollback; safe env with a fallback; feature flags
    NIHIL NOVI SUB SOLE
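The deploy items "safe env with a fallback" and "feature flags" are not shown in code in the deck. The sketch below is one way they might combine, under assumptions: `FeatureFlags` is a made-up in-memory store, and `with_fallback`, `optimised:`, `legacy:` and `on_error:` are hypothetical names. A real application would use its own flag library and error reporter.

```ruby
# Hypothetical in-memory flag store.
class FeatureFlags
  def initialize(enabled = [])
    @enabled = enabled.map(&:to_sym)
  end

  def enabled?(name)
    @enabled.include?(name.to_sym)
  end
end

# Runs the optimised path only when the flag is on, and falls back to the
# legacy path if the optimised one raises.
def with_fallback(flag, flags:, optimised:, legacy:, on_error: ->(_error) {})
  return legacy.call unless flags.enabled?(flag)

  begin
    optimised.call
  rescue StandardError => e
    on_error.call(e) # e.g. report to the error tracker
    legacy.call
  end
end

flags = FeatureFlags.new([:preloaded_billing_records])
result = with_fallback(
  :preloaded_billing_records,
  flags: flags,
  optimised: -> { raise "new code path is broken" },
  legacy: -> { :legacy_result },
  on_error: ->(e) { warn "falling back: #{e.message}" }
)
puts result
```

With both in place, turning the flag off is the rollback, and a failure in the new path degrades to the old behaviour instead of an outage.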
  20. FAIL OFTEN SO YOU CAN SUCCEED SOONER

    Tom Kelley
    Photo: snikologiannis/Flickr; http://ow.ly/CHwhd