Slide 1

Slide 1 text

API OPTIMIZATION TALE: MONITOR, FIX AND DEPLOY (ON FRIDAY)
MACIEK RZĄSA, TOPTAL
@MJRZASA
Photo by Tim Mossholder on Unsplash

Slide 2

Slide 2 text

FRIDAY 16:03

Slide 3

Slide 3 text

Photo by Max Baskakov on Unsplash

Slide 4

Slide 4 text

Photo by Fabius Leibrock on Unsplash

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

TOPTAL PLATFORM

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

EXTRACTION

Slide 9

Slide 9 text

EXTRACTION

Slide 10

Slide 10 text

EXTRACTION

FROM

    class Product < ApplicationRecord
      has_many :billing_records
    end

    class BillingRecord < ApplicationRecord
      belongs_to :product
    end

TO

    class Product < ApplicationRecord
      def billing_records
        @billing_records ||= ::Billing::QueryService
          .billing_records_for_products(self)
      end
    end

    class BillingRecord
      def product
        @product ||= Product.find(product_id)
      end
    end

Slide 11

Slide 11 text

MONITOR
- wait for it

FIX: OPTIMIZE
- wait for it

DEPLOY
- CI checks
- easy & reliable rollback
- safe env with a fallback
- feature flags

Slide 12

Slide 12 text

EXTRACTION: FIRST ATTEMPT

Slide 13

Slide 13 text

EXTRACTION: FIRST ATTEMPT

Slide 14

Slide 14 text

EXTRACTION: GRAPHQL

Slide 15

Slide 15 text

EXTRACTION: MONITORING

- standard errors (Rollbar/Sentry)
- performance (NewRelic)
- custom request instrumentation (Kibana)
  - method name
  - arguments
  - stacktrace
  - response time (elapsed)
  - error

    {
      "payload": {
        "method": "records_for_product",
        "arguments": "[[\"gid://platform/Product/12345\"]]",
        "stacktrace": "[ \"app/models/product.rb:123\", \"app/services/sell_product.rb:43\" ]",
        "elapsed": 1.128494586795568,
        "error": null
      }
    }
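
A minimal sketch of how such custom request instrumentation could be wired up, assuming a plain Ruby wrapper around each billing call. The module name `RequestInstrumentation`, the `instrument` method and the `logger:` parameter are hypothetical and not taken from the talk; only the payload fields (method, arguments, stacktrace, elapsed, error) come from the slide.

```ruby
require "json"

# Hypothetical helper: wraps one remote call and emits a single structured
# log entry with the fields listed on the slide.
module RequestInstrumentation
  def self.instrument(method_name, arguments, logger: $stdout)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    error = nil
    yield
  rescue StandardError => e
    error = e
    raise # the caller still sees the original exception
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    payload = {
      method: method_name,
      # arguments and stacktrace are stored as JSON-encoded strings,
      # mirroring the log entry shown on the slide
      arguments: arguments.to_json,
      stacktrace: caller.first(5).to_json,
      elapsed: elapsed,
      error: error&.message
    }
    logger.puts({ payload: payload }.to_json)
  end
end

RequestInstrumentation.instrument(
  "records_for_product", [["gid://platform/Product/12345"]]
) do
  sleep 0.01 # stands in for the real request to billing
end
```

Logging the calling stack trace next to the elapsed time is what makes it possible to see which view or job produced a slow or repeated request.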

Slide 16

Slide 16 text

MONITOR
- standard stack (Rollbar/NewRelic)
- custom request instrumentation

FIX: OPTIMIZE
- wait for it

DEPLOY
- CI checks
- easy & reliable rollback
- safe env with a fallback
- feature flags

Slide 17

Slide 17 text

OPTIMIZE

Slide 18

Slide 18 text

FLOOD OF REQUESTS
PROBLEM: SINGLE VIEW/JOB INITIATES MANY BILLING REQUESTS
HOW MANY? THOUSANDS!

Slide 19

Slide 19 text

FLOOD OF REQUESTS

INITIAL

    def perform(*)
      products = Product.eligible
      products.find_each do |product|
        # one billing request per call
        DoBusinessLogic.call(product)
      end
    end

    class DoBusinessLogic
      def call(product)
        product.billing_records.each {}
      end
    end

    class Product < ApplicationRecord
      def billing_records
        @billing_records ||= ::Billing::QueryService
          .billing_records_for_products(self)
      end
    end

OPTIMIZED

    def perform(*)
      products = Product.eligible
      products.find_in_batches do |batch|
        # one billing request per batch
        cache_billing_records(batch).each do |p|
          # no billing requests
          DoBusinessLogic.call(p)
        end
      end
    end

    def cache_billing_records(products)
      indexed_records = ::Billing::QueryService
        .billing_records_for_products(*products)
        .group_by(&:product_gid)
      products.each do |product|
        product.cache_billing_records!(
          indexed_records[product.gid].to_a
        )
      end
    end
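
The effect of preloading per batch can be checked without Rails. The sketch below is an assumption-laden stand-in: `FakeBilling`, its request counter and the plain `Product` class are hypothetical, and `each_slice` takes the place of `find_in_batches`. Only the pattern itself (one request per batch, `group_by` on the product gid, then filling each product's cache) follows the slide.

```ruby
# Hypothetical in-memory stand-in for the billing service; it counts requests.
class FakeBilling
  Record = Struct.new(:gid, :product_gid)
  attr_reader :requests

  def initialize(records)
    @records = records
    @requests = 0
  end

  def billing_records_for_products(*product_gids)
    @requests += 1
    @records.select { |r| product_gids.include?(r.product_gid) }
  end
end

class Product
  attr_reader :gid

  def initialize(gid, billing)
    @gid = gid
    @billing = billing
  end

  # Falls back to one request per product when nothing was preloaded.
  def billing_records
    @billing_records ||= @billing.billing_records_for_products(gid)
  end

  def cache_billing_records!(records)
    @billing_records = records
  end
end

# One request for the whole batch, then a hash lookup per product.
def preload_billing_records(products, billing)
  indexed = billing.billing_records_for_products(*products.map(&:gid))
                   .group_by(&:product_gid)
  # to_a turns a missing key (nil) into [], so the memoized value is set
  products.each { |p| p.cache_billing_records!(indexed[p.gid].to_a) }
end

billing = FakeBilling.new([
  FakeBilling::Record.new("r1", "p1"),
  FakeBilling::Record.new("r2", "p1"),
  FakeBilling::Record.new("r3", "p2")
])
products = %w[p1 p2 p3].map { |gid| Product.new(gid, billing) }

products.each_slice(2) do |batch|
  preload_billing_records(batch, billing)
  batch.each { |p| p.billing_records.size } # no further requests
end

puts "billing requests: #{billing.requests}"
```

Three products in batches of two result in two billing requests instead of three; with thousands of products the count drops from one per product to one per batch.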

Slide 20

Slide 20 text

FLOOD OF REQUESTS? PRELOAD, CACHE, (HASH-)JOIN
Image: Ulf Michael Widenius, MySQL. Image source: wikipedia.org

Slide 21

Slide 21 text

FREQUENTLY NEEDED DATA
PROBLEM: SINGLE FIELD WAS FREQUENTLY USED (~1K HITS PER DAY)

Slide 22

Slide 22 text

FREQUENTLY NEEDED DATA

PLAN
- add field to kafka
- build a read model
- backfill the data
- start using the read model
- remove billing query

SOLUTION
- find that date in local DB
- verify if it's really the same date
- use it and remove billing query

    # 1k billing hits per day
    ::Billing::QueryService
      .first_successful_record_created_at(client)
      &.in_time_zone&.to_date

    # one local DB query
    client
      .products.successful
      .minimum(:start_date)

Slide 23

Slide 23 text

DATA NEEDED FREQUENTLY? USE THE DOMAIN, LUKE!
Image source: starwars.fandom.com

Slide 24

Slide 24 text

DATA FLOOD
PROBLEM: GENERIC QUERIES FETCHING ALL THE DATA THAT MIGHT BE NEEDED

Slide 25

Slide 25 text

DATA FLOOD

    # REST response
    {
      "gid": "gid://...",
      "clientGid": "gid://...",
      "productGid": "gid://...",
      "availability": true,
      "pending": false,
      "frequency": "weekly",
      "startDate": "2020-08-21",
      "endDate": "2020-10-28",
      # ...
      # 36 fields total
      # loading 3-4 associations
    }

    def billing_records_for_products(*products)
      fetch_billing_records(
        filter: {products: products}
      ).select(&:accessible?)
    end

    query($filter: RecordFilter!) {
      cycles(filter: $filter) {
        nodes {
          gid
          productGid
          pending
          frequency
        }
      }
    }

    def billing_records_for_products(*products)
      fetch_billing_records(
        filter: {
          products: products,
          accessible: true
        }
      )
    end
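
To make the two changes concrete, here is a rough sketch of the request a client could send once both the field selection and the `accessible` filter move into the GraphQL query. The helper names `billing_records_query` and `request_body` are hypothetical; the `query` plus `variables` body shape is the common GraphQL-over-HTTP convention, and the query text and filter keys are the ones on the slide.

```ruby
require "json"

# Only the fields the caller actually uses (4 instead of 36).
FIELDS = %w[gid productGid pending frequency].freeze

# Hypothetical helper: builds the query text with an explicit field list.
def billing_records_query(fields: FIELDS)
  <<~GRAPHQL
    query($filter: RecordFilter!) {
      cycles(filter: $filter) {
        nodes { #{fields.join(" ")} }
      }
    }
  GRAPHQL
end

# Hypothetical helper: the filter is sent to the server as a variable, so
# inaccessible records are never serialized or transferred.
def request_body(product_gids)
  JSON.generate(
    query: billing_records_query,
    variables: { filter: { products: product_gids, accessible: true } }
  )
end

puts request_body(["gid://platform/Product/12345"])
```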

Slide 26

Slide 26 text

Photo by Erik-Jan Leusink on Unsplash

Slide 27

Slide 27 text

WHAT COULD POSSIBLY GO WRONG?

Slide 28

Slide 28 text

WHAT COULD POSSIBLY GO WRONG?

ROOT CAUSE

    # REST client
    get('/records', **params.slice(:product_gids))

    # DB query in billing
    def billing_records(product_gids: nil, gids: nil, client_gid: nil)
      scope = ::BillingRecord
      scope = scope.where(product_gid: product_gids) if product_gids
      scope = scope.where(gid: gids) if gids
      scope = scope.where(client_gid: client_gid) if client_gid
      scope.all
    end

FIX

    def billing_records(product_gids: nil, gids: nil, client_gid: nil)
      return [] if [product_gids, gids, client_gid].all?(&:blank?)
      # ...
    end
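
The failure mode can be reproduced in plain Ruby. In this sketch, an in-memory array stands in for the `BillingRecord` table and `select` stands in for `where`; `BillingRow`, `ROWS`, `blank_filter?` and the two method names are hypothetical. It assumes the bug is triggered when the caller filters by something other than `product_gids`, so that `params.slice(:product_gids)` forwards an empty hash.

```ruby
BillingRow = Struct.new(:gid, :product_gid, :client_gid)

ROWS = [
  BillingRow.new("r1", "p1", "c1"),
  BillingRow.new("r2", "p1", "c1"),
  BillingRow.new("r3", "p2", "c2")
].freeze

# Plain-Ruby replacement for ActiveSupport's blank? (nil or empty).
def blank_filter?(value)
  value.nil? || value.empty?
end

# Mirrors the original query: every filter is optional, so no filters
# means "return everything".
def unguarded_billing_records(product_gids: nil, gids: nil, client_gid: nil)
  scope = ROWS
  scope = scope.select { |r| product_gids.include?(r.product_gid) } if product_gids
  scope = scope.select { |r| gids.include?(r.gid) } if gids
  scope = scope.select { |r| r.client_gid == client_gid } if client_gid
  scope
end

# Mirrors the fix: refuse to run a query without any filter.
def guarded_billing_records(product_gids: nil, gids: nil, client_gid: nil)
  return [] if [product_gids, gids, client_gid].all? { |f| blank_filter?(f) }

  unguarded_billing_records(
    product_gids: product_gids, gids: gids, client_gid: client_gid
  )
end

params = { gids: ["r1"] }
forwarded = params.slice(:product_gids) # the gids filter is silently dropped

puts unguarded_billing_records(**forwarded).size # every row in the table
puts guarded_billing_records(**forwarded).size   # nothing
```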

Slide 29

Slide 29 text

DATA FLOOD?
QUERY CUSTOMIZATION & UNDERFETCHING
FILTERING ON THE SERVER SIDE

Slide 30

Slide 30 text

TIP? ALWAYS TEST MANUALLY. ALWAYS.

Slide 31

Slide 31 text

429 TOO MANY REQUESTS
PROBLEM: SPIKE OF REQUESTS EVERY SUNDAY EVENING

Slide 32

Slide 32 text

429 TOO MANY REQUESTS

PROBLEM

    # scheduling at talent's 5 PM on Sunday
    eligible_products.each do |product|
      WeeklyReminder.schedule(
        product, day: :sunday, time: '17:00'
      )
    end

WEEK 1. SOLUTION: PRELOADING

    eligible_products.find_in_batches do |batch|
      with_billing_records_preloaded(batch) do
        batch.each do |product|
          WeeklyReminder.schedule(
            product, day: :sunday, time: '17:00'
          )
        end
      end
    end

WEEK 2. PROPER SOLUTION: JITTER

    # class WeeklyReminder
    def scheduling_time(*)
      super + (SecureRandom.rand * 120 - 60).seconds
    end

WEEK 3. FINAL PROPER SOLUTION

    # class AnotherWeeklyReminder
    def scheduling_time(*)
      super + (SecureRandom.rand * 120 - 60).seconds
    end

WEEK 4. REALLY FINAL PROPER SOLUTION: RATE LIMITING

    Sidekiq::Limiter.window(
      'weekly-reminder',
      RATE_LIMIT_COUNT,
      RATE_LIMIT_INTERVAL,
      wait_timeout: 2
    )
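
A small sketch of what the jitter step does to the load, in plain Ruby without ActiveSupport or Sidekiq. The helper `with_jitter` and its `spread:` parameter are hypothetical; the offset of up to 60 seconds in either direction matches the `* 120 - 60` expression on the slide, and `SecureRandom.random_number` is used here as a stand-in for the slide's `SecureRandom.rand`.

```ruby
require "securerandom"

# Hypothetical helper: shifts base_time by a uniformly random offset
# between -spread and +spread seconds.
def with_jitter(base_time, spread: 60)
  base_time + (SecureRandom.random_number * 2 * spread - spread)
end

# An arbitrary example time; every reminder would otherwise fire at once.
base = Time.utc(2020, 8, 23, 17, 0, 0)
scheduled = Array.new(1_000) { with_jitter(base) }

# Count how many jobs land in the busiest single second.
busiest = scheduled.group_by(&:to_i).values.map(&:size).max
puts "busiest second: #{busiest} of #{scheduled.size} jobs"
```

Jitter only spreads one job class over a two-minute window. The week 3 and week 4 steps on the slide suggest that further job classes needed the same treatment, and that a rate limiter was eventually added to cap the total.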

Slide 33

Slide 33 text

429 TOO MANY REQUESTS?
I DON'T ALWAYS TEST ON PRODUCTION
BUT WHEN I DO, I RUN TESTS ON FRIDAY

Slide 34

Slide 34 text

NIHIL NOVI SUB SOLE

MONITOR
- standard stack (Rollbar/NewRelic)
- custom request instrumentation

FIX: OPTIMIZE
- preloading to avoid N+1
- server-side filtering
- using local data
- underfetching
- spreading the load

DEPLOY
- CI checks
- easy & reliable rollback
- safe env with a fallback
- feature flags

Slide 35

Slide 35 text

NIHIL NOVI SUB SOLE

MONITOR
- standard stack (Rollbar/NewRelic)
- custom request instrumentation

FIX: OPTIMIZE
- preloading to avoid N+1: every ORM
- server-side filtering: find_all{} vs where()
- using local data: The Best Request Is No Request
- underfetching: SELECT * vs SELECT a, b
- spreading the load: highscalability.com post about YouTube, 2012

DEPLOY
- CI checks
- easy & reliable rollback
- safe env with a fallback
- feature flags

Slide 36

Slide 36 text

FAIL OFTEN SO YOU CAN SUCCEED SOONER
Tom Kelley
Photo: snikologiannis/Flickr; http://ow.ly/CHwhd

Slide 37

Slide 37 text

MACIEK RZĄSA
Q&A
@mjrzasa