Handling 225k requests per second to RubyGems.org

Delivered at RubyConf Taiwan 2023

Samuel E. Giddins

December 16, 2023

Transcript

  1. Handling 225k requests per second to RubyGems.org @ RubyConf TW 2023
  2. Your intrepid presenter: @segiddins
      → Samuel Giddins
      → RubyGems, Bundler, RubyGems.org maintainer
      → 10+ year bug contributor
  3. Your intrepid presenter: Security Engineer in Residence @ Ruby Central. Sponsored by AWS
  4. Some Q&A: Show of hands if you've ever run gem install
  5. Some Q&A: Show of hands if you've ever put source "https://rubygems.org" in a Gemfile
  6. Guess What: You've helped contribute to the traffic that RubyGems.org serves
  7. RubyGems statistics: As of October 2023, our major numbers are
      → 181,745 users
      → 192,825 gems
      → 1,555,458 versions of gems
      → 147,326,326,048 total gem downloads
      → average 20,000 requests/second
      → average 2 billion requests/weekday
      → maximum 225,000 requests/second
      → 7.5 TB/hour, 185 TB/day
      → 4.6 PB/month, 54 PB/year
  8. How much does this cost???
  9. How much does this cost? AWS: $20,000 / month
  10. How much does this cost? Fastly: $500,000 / month
  11. Who runs all this?
  12. Who runs all this? Ruby Central & Volunteers
  13. Who runs all this? Until 3 weeks ago, nobody worked full time on RubyGems.org
  14. Who runs all this?
      → We have a 24/7 on call rotation
      → We only get paged a few times a year for really minor outages
  15. How? So how in the world could we handle that kind of traffic?
  16. How? No, it isn't little elves typing out HTTP/1.1 responses by hand
  17. How? A secret weapon
  18. Our Secret: We don't serve most of that traffic ourselves
  19. Our Secret: WE CHEAT
  20. We Cheat: Fair enough!
  21. We Cheat: Good artists copy, great artists steal, and smart engineers do no work
  22. We Cheat: The fastest work you can do is no work at all
  23. Someone else doing the work is almost as good as no one doing it
  24. Rails is amazing, we love having RubyGems.org be a Rails app written in Ruby...
  25. Rails is amazing, we love having RubyGems.org be a Rails app written in Ruby... There are limits to what a full-featured web app can serve
  26. We rely heavily on letting other, simpler & more optimized systems serve the vast, vast majority of our traffic
  27. Request Lifecycle
      → Client
      → Fastly edge POP
      → Fastly shield POP
      → Backend
          → S3
          → Rails app
      → AWS ALB
      → nginx
      → puma
  28. Request Lifecycle: Each layer is optimized for something different
      → Fastly edge POP
          → cache, close to client
      → Fastly shield POP
          → cache, all requests to the backend flow through a single POP
      → S3
          → static content store
      → nginx
          → buffering, rules for rate-limiting
      → puma
          → serves the Rails app
  29. Story Time: aka the weekend Sam got paged
  30. Story Time: aka the weekend Sam got paged, aka the time we hit 225k rps
  31. (Back)Story Time: Dependency API deprecation caused a 25x increase in traffic, from ~10k rps to ~225k rps
  32. (Back)Story Time: Why did we do that? Better uptime and stability by removing the dependency API
  33. Story Time: How does that lead to a 25x increase in traffic?
  34. Story Time
      → New Bundler & RubyGems versions mostly stayed the same
      → Old (really, really old) Bundler versions fell back to the "full index"
          → Which involves downloading https://rubygems.org/specs.4.8.gz
          → And also https://index.rubygems.org/quick/Marshal.4.8/rails-7.0.8.gemspec.rz
  35. Story Time: Lots of people started updating...
  36. Story Time: Misbehaving clients. 10k rps from one IP
  37. Story Time: Misbehaving clients. We blocked requests from that User Agent / IP combination
  38. Story Time: Misbehaving clients. We politely asked that they contact us so we could help them stop DoSing us
  39. Story Time: Misbehaving clients. Started to upgrade (yay!)
  40. Story Time
      → More traffic to https://index.rubygems.org/versions
      → More traffic to https://index.rubygems.org/info/rails
      → ... which were being served by complex queries in the Rails app
          → Only on very rare cache misses
  41. Lesson: Cache misses aren't rare at this scale
  42. Lesson: 0.1% times a very very very large number is still a large number
  43. Solution: Don't serve those requests!
  44. Solution: Spend 12 hours on a Saturday writing code
  45. Solution
      → Enqueue a background job on gem push
      → Pre-calculate responses
      → Upload responses to S3
      → Have Fastly serve those requests from S3
  46. Solution
      → Have Fastly serve those requests from S3
      → Deploy
      → Sleep 14 hours
      → Profit?
  47. Lessons
      → Don't serve traffic
      → If you have to serve traffic, let someone else do it
      → Please don't DDoS yourself
      → Find appropriate places to do different kinds of work
      → Do all the "normal" optimization work
  48. Lessons
      → Use Rails caching
      → Add indices to your database tables
      → Set cache-control headers
      → Scale up your DB & web server instances
      → Add rate limiting
  49. Lessons: Be very grateful for sponsors who provide those services for free
  50. Lessons: Use true but shocking & misleading talk titles
  51. Lessons: Ask very politely for people to not install gems when I have brunch plans
  52. Thank You: Samuel Giddins, Security Engineer in Residence @ Ruby Central. Sponsored by AWS
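
The fix outlined in the Solution slides (enqueue a background job on gem push, pre-calculate responses, upload them to S3, have Fastly serve from S3) could look roughly like the sketch below. It is an illustration under stated assumptions, not RubyGems.org's actual code: `ObjectStore`, `JobQueue`, and `Registry` are hypothetical names, the in-memory store stands in for an S3 bucket, and the response bodies are simplified placeholders rather than the real index format.

```ruby
# Hypothetical sketch: none of these class or method names come from RubyGems.org.

# Stand-in for an S3 bucket: maps object keys to response bodies.
class ObjectStore
  def initialize
    @objects = {}
  end

  def put(key, body)
    @objects[key] = body
  end

  def get(key)
    @objects[key]
  end
end

# Stand-in for a background job queue (a real app would use a job framework).
class JobQueue
  def initialize
    @jobs = []
  end

  def enqueue(&job)
    @jobs << job
  end

  # Run every queued job, as a worker process would.
  def drain
    @jobs.shift.call until @jobs.empty?
  end
end

class Registry
  def initialize(store:, queue:)
    @store = store
    @queue = queue
    @versions = Hash.new { |hash, name| hash[name] = [] }
  end

  # Called on gem push: record the version, then defer the expensive work.
  def push(name, version)
    @versions[name] << version
    @queue.enqueue { publish_index(name) }
  end

  private

  # Pre-calculate the responses once, at write time, and upload them.
  # The body formats here are placeholders, not the real index format.
  def publish_index(name)
    @store.put("info/#{name}", @versions[name].join("\n") + "\n")
    listing = @versions.sort.map { |gem, vs| "#{gem} #{vs.join(',')}" }
    @store.put("versions", listing.join("\n") + "\n")
  end
end

store = ObjectStore.new
queue = JobQueue.new
registry = Registry.new(store: store, queue: queue)

registry.push("rails", "7.0.8")
registry.push("rails", "7.1.0")
registry.push("rack", "3.0.8")
queue.drain

# Reads are now plain key lookups that a CDN could serve without touching the app.
puts store.get("info/rails")
puts store.get("versions")
```

The design point is that the expensive work moves from read time to write time: pushes are rare compared with reads, so computing each response once per push means a cache miss at the edge costs only a static object fetch instead of a database query in the Rails app.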