Scaling Rails: The Journey to 200M Notifications


Gustavo Araujo
Software Engineer @ CloudWalk
🎯 Focused on Ruby on Rails and Elixir
💜 Passionate about code quality, observability, and performance
Sharing knowledge through talks and technical content
gustavoaraujo.dev, garaujodev

Application Challenges
➡ High-Volume Notifications (1B last year)
➡ Several Database Writes
➡ Multi-channel (e.g., WhatsApp, SMS, Email, Push)
➡ Unpredictable workloads

The challenges we faced, and the key lessons learned

CONCEPT OF SCALE

HOW DID WE START?

CHOOSE YOUR WEBSERVER

Ruby Webservers

WEBrick
⚙ Concurrency Model: Single-threaded, single-process
⚡ Performance: Very low
✅ Best Use Case: Development only
❌ Limitations: Not production-ready; no longer the Rails default (Puma since Rails 5) and no longer bundled with Ruby since 3.0

Passenger
⚙ Concurrency Model: Multi-threaded + multi-process (hybrid)
⚡ Performance: Medium to high
✅ Best Use Case: Easy-to-deploy production apps
❌ Limitations: Advanced features require a paid license

Unicorn
⚙ Concurrency Model: Multi-process (no threads)
⚡ Performance: High for CPU-bound apps
✅ Best Use Case: Apps requiring process-level isolation
❌ Limitations: No concurrency within a worker; not ideal for I/O-bound apps

Puma
⚙ Concurrency Model: Multi-threaded + multi-process (clustered)
⚡ Performance: Highly configurable; scales with threads and workers
✅ Best Use Case: Modern Rails apps needing concurrent I/O handling
❌ Limitations: Requires tuning (threads/workers + DB pool alignment)

Puma

Clustered Mode
✅ Best for multi-core CPUs and CPU-bound applications
❌ Higher memory usage, since each worker is a separate process

Single Mode
✅ Best for I/O-bound apps; lower memory usage
❌ Single process; may not fully utilize multi-core CPUs

How Many Workers?

Scenario → Recommendation
🖥 App runs on VM/bare metal → Use 1 worker per available CPU core
📦 App runs in Docker/Kubernetes → Respect the container CPU limit (e.g., cpu_limit = 2 → set workers = 2)
🧠 App has high memory usage → Use fewer workers, increase threads
🧪 Unclear limits or mixed load → Start with 2–4 workers and benchmark
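The worker guidelines above can be sketched as a small calculation. This is an illustrative helper, not part of Puma: the name `recommended_workers` and its parameters are assumptions, and the per-worker memory figure has to come from measuring your own app.

```ruby
require "etc"

# Hypothetical helper: suggests a Puma worker count from the CPU and
# memory budget. Not part of Puma; treat the result as a starting point
# to benchmark, as the slide recommends.
def recommended_workers(cpu_limit: Etc.nprocessors, memory_mb: nil, worker_memory_mb: nil)
  # One worker per available core (or per CPU in the container limit),
  # never fewer than one.
  by_cpu = [cpu_limit.floor, 1].max
  return by_cpu if memory_mb.nil? || worker_memory_mb.nil?

  # Each worker is a separate process, so memory can cap the count
  # before CPU does.
  by_memory = [(memory_mb / worker_memory_mb).floor, 1].max
  [by_cpu, by_memory].min
end

# Container limited to 2 CPUs:
puts recommended_workers(cpu_limit: 2)
# 4 CPUs, but only room for two 400 MB workers in 1 GB:
puts recommended_workers(cpu_limit: 4, memory_mb: 1024, worker_memory_mb: 400)
```

On a VM or bare metal the default `Etc.nprocessors` applies; inside a container it is safer to pass the CPU limit explicitly, since the host's core count may be larger than what the container is allowed to use.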

How Many Threads?

Factor → Guideline
🔄 I/O-bound app → Use more threads (e.g. 16–32) to handle concurrency
🧮 CPU-bound app → Use fewer threads (e.g. 4–8) to avoid contention
🧵 Low traffic env (dev/staging) → Use something like threads 1, 4
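Put together, the worker and thread settings usually live in `config/puma.rb`. The fragment below is a sketch of how that might look, not the configuration used in the talk. The environment variable names follow the common Rails convention, and the fallback values are placeholders to benchmark.

```ruby
# config/puma.rb (illustrative sketch; the defaults are assumptions)

# Clustered mode: one worker process per CPU available to the app.
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))

# Minimum and maximum threads per worker. Raise the maximum for
# I/O-bound apps, keep it low for CPU-bound ones.
max_threads = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
min_threads = Integer(ENV.fetch("RAILS_MIN_THREADS", max_threads))
threads min_threads, max_threads

# Load the app before forking so workers can share memory (copy-on-write).
preload_app!

port ENV.fetch("PORT", 3000)
```

Whatever maximum thread count is chosen here should not exceed the `pool` value in `database.yml`, otherwise threads can end up waiting for a database connection.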

DEPLOYMENT STRATEGIES

Platform as a Service (PaaS)
Example: Heroku, fly.io
✅ Easy to get started; built-in autoscaling
❌ Costly to scale; limited control over infrastructure for advanced tuning

Infrastructure as a Service (IaaS)
Example: AWS, GCP, Azure
✅ Full control (Instances, Auto Scaling, Load Balancers)
❌ High availability requires manual setup; fixed resource provisioning

Container Orchestration
Example: Kubernetes (can run on AWS, GCP, Azure, or On-Premise)
✅ Portability and fine-grained control over resources (CPU, memory, limits per pod)
❌ Steep learning curve: complex concepts (pods, services, volumes)

Kubernetes Scaling Strategies
↔ HPA (Horizontal Pod Autoscaler) → reacts to CPU/memory usage
↕ VPA (Vertical Pod Autoscaler) → adjusts resource requests for individual pods
🧱 Cluster Autoscaler → adds/removes nodes to accommodate workload

Horizontal Pod Autoscaling (HPA)
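The Kubernetes documentation describes the HPA's core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The Ruby sketch below only reproduces that arithmetic to build intuition. The function name and the min/max parameters are illustrative, and the real controller also deals with stabilization windows, missing metrics, and multiple metrics, which are left out here.

```ruby
# Illustrative re-implementation of the HPA replica formula.
# tolerance mirrors the documented default of 10%: small deviations
# from the target do not trigger a scaling change.
def hpa_desired_replicas(current_replicas:, current_metric:, target_metric:,
                         min_replicas: 1, max_replicas: 10, tolerance: 0.1)
  ratio = current_metric / target_metric.to_f
  return current_replicas if (ratio - 1.0).abs <= tolerance

  desired = (current_replicas * current_metric / target_metric.to_f).ceil
  desired.clamp(min_replicas, max_replicas)
end

# 4 pods averaging 90% CPU against a 60% target:
puts hpa_desired_replicas(current_replicas: 4, current_metric: 90, target_metric: 60)
```

Because the formula reacts to CPU or memory averages, it only scales after the pods are already busy, which is one reason queue-driven workloads often look at event-based signals instead.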

Kubernetes Event-driven Autoscaling (KEDA)
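KEDA scales workloads on event sources such as queue depth rather than CPU or memory, and it can scale a workload down to zero. The sketch below is a simplified model of that idea in Ruby, not KEDA's API: the function name, parameters, and thresholds are all assumptions chosen for illustration.

```ruby
# Simplified model of queue-driven scaling: one replica per
# `target_per_replica` pending jobs, bounded by min and max.
# A min_replicas of 0 models scale-to-zero when the queue is empty.
def replicas_for_queue(queue_length:, target_per_replica:, min_replicas: 0, max_replicas: 20)
  return min_replicas if queue_length <= 0

  # Integer ceiling division: a partially filled "slot" still needs a replica.
  needed = (queue_length + target_per_replica - 1) / target_per_replica
  needed.clamp([min_replicas, 1].max, max_replicas)
end

# 2,500 pending notifications, each replica expected to absorb 500:
puts replicas_for_queue(queue_length: 2_500, target_per_replica: 500)
```

For a notification system with unpredictable workloads, sizing workers from the backlog lets capacity follow the queue directly instead of waiting for CPU usage to rise.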

WHAT WE LEARNED

Common Pitfalls

🧩 DB Pool Mismatch
🛠 Cause: THREADS > pool in database.yml
⚠ Impact: Connection timeouts and ActiveRecord errors

🧩 Excessive Thread Count
🛠 Cause: Threads set too high without real concurrency need
⚠ Impact: Increased memory usage, thread contention, and degraded app performance

🧩 Resource Overcommit
🛠 Cause: Too many workers × threads for available CPU/memory
⚠ Impact: Application instability due to memory exhaustion (OOM) or CPU throttling
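The pool mismatch and overcommit pitfalls come down to simple arithmetic that can be checked before a deploy. The helper below is a hypothetical sketch: the function name and parameters are assumptions, and it relies on the fact that each Puma worker is a separate process with its own ActiveRecord connection pool.

```ruby
# Hypothetical pre-deploy check. Returns a list of problems (empty if none).
#   max_threads        - Puma max threads per worker
#   pool               - `pool` value in database.yml (per process)
#   workers            - Puma workers per instance
#   instances          - number of app instances/pods
#   db_max_connections - connection limit on the database side
def pool_alignment_problems(max_threads:, pool:, workers:, instances:, db_max_connections:)
  problems = []

  if max_threads > pool
    problems << "max_threads (#{max_threads}) exceeds pool (#{pool}): " \
                "threads may time out waiting for a connection"
  end

  # Every worker process can open up to `pool` connections.
  total = instances * workers * pool
  if total > db_max_connections
    problems << "up to #{total} connections possible, " \
                "but the database allows #{db_max_connections}"
  end

  problems
end

puts pool_alignment_problems(max_threads: 16, pool: 5, workers: 4,
                             instances: 20, db_max_connections: 100)
```

The second check matters most when autoscaling is enabled, because the instance count used in the multiplication should be the maximum the autoscaler is allowed to reach.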

Database Insights
➡ Read Replicas
➡ Add and maintain proper indexes
➡ Optimize slow queries
➡ Use partitioning
➡ Consider cache strategies

Pro Tips

Use background jobs
Offload heavy work to async queues

Fail fast, retry smart
Use retries with backoff (Sidekiq, Shoryuken, etc.) to avoid overload loops

Use observability tools
Datadog, Sentry, New Relic, AppSignal, Skylight…
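To make "retry smart" concrete, here is a minimal plain-Ruby sketch of retries with exponential backoff and a little jitter. It is not the Sidekiq or Shoryuken API (those libraries ship their own retry mechanisms); the names `backoff_delay` and `with_retries`, and all default values, are assumptions for illustration.

```ruby
# Delay before the next attempt: base * 2^(attempt - 1), capped.
def backoff_delay(attempt, base: 1.0, cap: 60.0)
  [base * (2**(attempt - 1)), cap].min
end

# Runs the block, retrying on StandardError with growing delays.
# `sleeper` is injectable so the helper can be tested without waiting.
def with_retries(max_attempts: 5, base: 1.0, cap: 60.0, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    # Fail fast once the attempt budget is spent.
    raise if attempt >= max_attempts

    delay = backoff_delay(attempt, base: base, cap: cap)
    # Up to 10% jitter so many failing jobs do not retry in lockstep.
    sleeper.call(delay + rand * delay * 0.1)
    retry
  end
end

# Example: a hypothetical provider call that succeeds on the third try.
result = with_retries(max_attempts: 4, sleeper: ->(_s) {}) do |attempt|
  raise "provider unavailable" if attempt < 3
  :delivered
end
puts result
```

Spacing retries out this way gives a struggling downstream provider time to recover, instead of having failed jobs immediately re-enqueue and add to the overload.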

Thank you!
gustavoaraujo.dev, garaujodev
We are hiring! cloudwalk.io/jobs

