Scaling Rails: The Journey to 200M Notifications

In "Scaling Rails: The Journey to 200M Notifications", Gustavo Araújo shares his experience scaling a Ruby on Rails application to send millions of notifications, covering the challenges, trade-offs, and technical decisions involved in turning an MVP into a robust, scalable system.

Gustavo Araújo

April 09, 2025

Transcript

  3. Gustavo Araujo (gustavoaraujo.dev, garaujodev)
    Software Engineer @ CloudWalk
    🎯 Focused on Ruby on Rails and Elixir
    💜 Passionate about code quality, observability, and performance
    Sharing knowledge through talks and technical content
  5. Application Challenges
    ➡ High-volume notifications (1B last year)
    ➡ Several database writes
    ➡ Multi-channel (e.g., WhatsApp, SMS, Email, Push)
    ➡ Unpredictable workloads
  9. Ruby Web Servers

    | Server | ⚙ Concurrency Model | ⚡ Performance | ✅ Best Use Case | ❌ Limitations |
    | --- | --- | --- | --- | --- |
    | WEBrick | Single-threaded, single-process | Very low | Development only | Not production-ready; no longer the Rails default (Puma since Rails 5) and no longer bundled with Ruby 3.0+ |
    | Passenger | Multi-threaded + multi-process (hybrid) | Medium to high | Easy-to-deploy production apps | Advanced features require a paid license |
    | Unicorn | Multi-process (no threads) | High for CPU-bound apps | Apps requiring process-level isolation | No concurrency per worker; not ideal for I/O-bound apps |
    | Puma | Multi-threaded + multi-process (clustered) | Highly configurable; scales with threads and workers | Modern Rails apps needing concurrent I/O handling | Requires tuning (threads/workers + DB pool alignment) |
  11. Puma

    | Mode | ✅ Best for | ❌ Limitation |
    | --- | --- | --- |
    | Clustered mode | Multi-core CPUs, CPU-bound applications | Higher memory usage, since each worker is a separate process |
    | Single mode | I/O-bound apps, lower memory usage | Single process; may not fully utilize multi-core CPUs |
  15. How Many Workers?

    | Scenario | Recommendation |
    | --- | --- |
    | 🖥 App runs on VM/bare metal | Use 1 worker per available CPU core |
    | 📦 App runs in Docker/Kubernetes | Respect the container CPU limit (cpu_limit = 2 → set workers = 2) |
    | 🧠 App has high memory usage | Use fewer workers, increase threads |
    | 🧪 Unclear limits or mixed load | Start with 2–4 workers and benchmark |
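
The worker guidance above can be turned into a small helper. This is only a sketch under stated assumptions: `recommended_workers` and all of its parameter names are hypothetical, the container CPU limit is passed in by the caller rather than detected, and the per-worker memory figure is something you would have to measure for your own app.

```ruby
require "etc"

# Hypothetical helper: suggest a Puma worker count from the slide's guidance.
#   cpu_limit        - container CPU limit if known (e.g. 2), nil on VM/bare metal
#   cores            - CPU cores visible to the process (used when cpu_limit is nil)
#   memory_limit_mb  - optional memory budget for the whole app
#   worker_memory_mb - optional measured memory per worker (an assumption you supply)
def recommended_workers(cpu_limit: nil, cores: Etc.nprocessors,
                        memory_limit_mb: nil, worker_memory_mb: nil)
  # One worker per core, or per CPU of the container limit.
  by_cpu = (cpu_limit || cores).floor

  # With high memory usage, fewer workers fit; cap by the memory budget if given.
  by_memory =
    if memory_limit_mb && worker_memory_mb
      (memory_limit_mb / worker_memory_mb).floor
    else
      by_cpu
    end

  # Never go below a single worker.
  [[by_cpu, by_memory].min, 1].max
end
```

In a `config/puma.rb`, the result could be passed to Puma's `workers` setting. Treat the number as a starting point and benchmark, as the slide suggests for unclear or mixed loads.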
  18. How Many Threads?

    | Factor | Guideline |
    | --- | --- |
    | 🔄 I/O-bound app | Use more threads (e.g. 16–32) to handle concurrency |
    | 🧮 CPU-bound app | Use fewer threads (e.g. 4–8) to avoid contention |
    | 🧵 Low-traffic env (dev/staging) | Use something like threads 1, 4 |
  20. Platform as a Service (PaaS)
    Examples: Heroku, Fly.io
    ✅ Easy to get started; built-in autoscaling
    ❌ Costly to scale; limited control over infrastructure for advanced tuning
  22. Infrastructure as a Service (IaaS)
    Examples: AWS, GCP, Azure
    ✅ Full control (instances, auto scaling, load balancers)
    ❌ High availability requires manual setup; fixed resource provisioning
  24. Container Orchestration
    Example: Kubernetes (can run on AWS, GCP, Azure, or on-premise)
    ✅ Portability and fine-grained control over resources (CPU, memory, limits per pod)
    ❌ Steep learning curve: complex concepts (pods, services, volumes)
  26. Kubernetes Scaling Strategies
    ↔ HPA (Horizontal Pod Autoscaler) → reacts to CPU/memory usage
    ↕ VPA (Vertical Pod Autoscaler) → adjusts resource requests for individual pods
    🧱 Cluster Autoscaler → adds/removes nodes to accommodate workload
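
To make "reacts to CPU/memory usage" more concrete: the Kubernetes documentation describes the core HPA calculation as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The Ruby sketch below reproduces only that formula. The method and parameter names are hypothetical, and it deliberately ignores details of the real controller such as the tolerance band, stabilization windows, and pod readiness.

```ruby
# Simplified model of the Horizontal Pod Autoscaler's replica calculation.
# Not the real controller: tolerance and stabilization are left out on purpose.
def desired_replicas(current_replicas:, current_metric:, target_metric:,
                     min_replicas: 1, max_replicas: 10)
  # Scale the replica count by the ratio of observed to target metric, rounding up.
  raw = (current_replicas * current_metric.to_f / target_metric).ceil

  # Keep the result inside the configured bounds.
  raw.clamp(min_replicas, max_replicas)
end
```

For example, 4 pods averaging 90% CPU against a 60% target would yield 6 pods under this model, while pods running well below target would be scaled in, down to `min_replicas`.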
  29. Common Pitfalls

    | 🧩 Issue | 🛠 Cause | ⚠ Impact |
    | --- | --- | --- |
    | DB pool mismatch | THREADS > pool in database.yml | Connection timeouts and ActiveRecord errors |
    | Excessive thread count | Threads set too high without real concurrency need | Increased memory usage, thread contention, and degraded app performance |
    | Resource overcommit | Too many workers × threads for available CPU/memory | Application instability due to memory exhaustion (OOM) or CPU throttling |
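
The pitfalls above lend themselves to a quick pre-deploy sanity check. The following is a minimal sketch, not something from the talk: `puma_config_warnings` and its parameters are hypothetical, and the per-worker memory value is an estimate you would supply from your own measurements. It relies on the fact that each Puma worker is a separate process with its own ActiveRecord connection pool, so the pool is compared against the threads per worker rather than workers × threads.

```ruby
# Hypothetical sanity check for a Puma + ActiveRecord setup.
# Returns a list of human-readable warnings; an empty list means no issue was found.
def puma_config_warnings(workers:, max_threads:, db_pool:,
                         cpu_limit:, memory_limit_mb:, worker_memory_mb:)
  warnings = []

  # DB pool mismatch: each worker process needs one connection per thread.
  if max_threads > db_pool
    warnings << "db pool (#{db_pool}) is smaller than max threads (#{max_threads})"
  end

  # Resource overcommit (CPU): more worker processes than available CPUs.
  if workers > cpu_limit
    warnings << "workers (#{workers}) exceed CPU limit (#{cpu_limit})"
  end

  # Resource overcommit (memory): estimated footprint above the memory budget.
  estimated_mb = workers * worker_memory_mb
  if estimated_mb > memory_limit_mb
    warnings << "estimated memory (#{estimated_mb} MB) exceeds limit (#{memory_limit_mb} MB)"
  end

  warnings
end
```

A check like this could run in CI or at boot. It does not detect an excessive thread count on its own, since that depends on how much real concurrency the app needs and is better found by benchmarking.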
  31. Database Insights
    ➡ Read replicas
    ➡ Add and maintain proper indexes
    ➡ Optimize slow queries
    ➡ Use partitioning
    ➡ Consider cache strategies
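
As one illustration of "consider cache strategies", here is a minimal read-through cache with a time-to-live, written in plain Ruby so it runs on its own. The slides do not say which strategy was used, so this is purely an assumed example; the `TtlCache` class and its injectable `clock` are hypothetical. In a Rails app the same pattern is usually expressed with `Rails.cache.fetch(key, expires_in: ...)`.

```ruby
# Minimal in-memory read-through cache with expiry (illustrative only).
class TtlCache
  def initialize(ttl_seconds:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock      # injectable so expiry can be tested without sleeping
    @store = {}
    @mutex = Mutex.new  # Puma runs multiple threads, so guard the hash
  end

  # Returns the cached value for key, or runs the block (e.g. a database
  # query), stores its result, and returns it. The lock is held while the
  # block runs, which keeps the sketch simple but serializes cache misses.
  def fetch(key)
    @mutex.synchronize do
      now = @clock.call
      entry = @store[key]
      return entry[:value] if entry && entry[:expires_at] > now

      value = yield
      @store[key] = { value: value, expires_at: now + @ttl }
      value
    end
  end
end
```

A call such as `cache.fetch("user:42:preferences") { load_preferences_from_db }` (where the key and loader are made up for illustration) would query the database at most once per TTL window for that key.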