Building Adaptive Systems
Chris Keathley
May 28, 2020
Programming
Transcript
Chris Keathley / @ChrisKeathley
Building Adaptive Systems
[Diagram: two servers, one sending requests to the other]
"I have a request"
"No Problem!"
"Thanks!"
"I have a request"
"I’m a little busy"
"I have more requests!"
"I don’t feel so good"
[One server disappears]
"Welp"
All services have objectives
A resilient service should be able to withstand a 10x traffic spike and continue to meet those objectives
Let’s Talk About… Queues, Overload Mitigation, Adaptive Concurrency
What causes overload?
[Diagram: server and queue]
Arrival Rate > Processing Time (requests arrive faster than the server can process them)
Little’s Law
Elements in the queue = Arrival Rate * Processing Time
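As a rough illustration of the formula above, here is a minimal Elixir sketch. The module and function names are hypothetical and not from the talk; the only assumption is that the arrival rate is given in requests per second and the processing time in milliseconds, so the product is divided by 1000.

```elixir
defmodule LittlesLaw do
  # Hypothetical helper for illustration only.
  # Average requests in the system = arrival rate * processing time.
  # arrival_rate_rps: requests per second
  # processing_time_ms: milliseconds spent on each request
  def in_flight(arrival_rate_rps, processing_time_ms) do
    arrival_rate_rps * processing_time_ms / 1000
  end
end

IO.inspect(LittlesLaw.in_flight(10, 100))   # prints 1.0
IO.inspect(LittlesLaw.in_flight(10, 3000))  # prints 30.0
```

At a fixed arrival rate, the number of requests in the system grows in direct proportion to how long each one takes.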
1 request = 10 rps * 100 ms
2 requests = 10 rps * 200 ms
BEAM Processes, CPU Pressure
3 requests = 10 rps * 300 ms
30 requests = 10 rps * 3000 ms
∞ requests = 10 rps * ∞ ms
This is bad
Let’s Talk About… Queues, Overload Mitigation, Adaptive Concurrency
Overload
Arrival Rate > Processing Time
We need to get these under control
Load Shedding
[Diagram: server, queue, server]
Drop requests
Stop sending
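One way to picture "drop requests" is a queue with a hard upper bound. The sketch below is an assumption-laden illustration, not code from the talk: the `BoundedQueue` module, its `max` field, and the `{:error, :overloaded}` return value are all hypothetical names chosen for the example. It uses Erlang's standard `:queue` module for FIFO storage.

```elixir
defmodule BoundedQueue do
  # Hypothetical sketch: a FIFO queue that sheds load once `max` items are waiting.
  defstruct items: :queue.new(), size: 0, max: 100

  def new(max), do: %__MODULE__{max: max}

  # At capacity: reject the request instead of letting the queue grow without bound.
  def push(%__MODULE__{size: size, max: max}, _item) when size >= max do
    {:error, :overloaded}
  end

  # Below capacity: enqueue and track the new size.
  def push(%__MODULE__{items: items, size: size} = q, item) do
    {:ok, %{q | items: :queue.in(item, items), size: size + 1}}
  end

  def pop(%__MODULE__{size: 0} = q), do: {:empty, q}

  # Dequeue the oldest item and free a slot.
  def pop(%__MODULE__{items: items, size: size} = q) do
    {{:value, item}, rest} = :queue.out(items)
    {{:ok, item}, %{q | items: rest, size: size - 1}}
  end
end

q = BoundedQueue.new(2)
{:ok, q} = BoundedQueue.push(q, :a)
{:ok, q} = BoundedQueue.push(q, :b)
IO.inspect(BoundedQueue.push(q, :c))  # prints {:error, :overloaded}
```

The caller sees the rejection immediately, which gives it the chance to stop sending rather than pile more work onto a queue that cannot drain.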
Autoscaling
[Diagram: two servers sharing one DB]
Requests start queueing
[A third server is added]
Now it’s worse
Autoscaling needs to be in response to load shedding
Circuit Breakers
[Diagram: two servers]
Shut off traffic
"I’m not quite dead yet"
Circuit Breakers are your last line of defense
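To make "shut off traffic" concrete, here is a deliberately minimal circuit breaker sketch. Everything in it is an assumption for illustration: the `Breaker` module, the threshold of consecutive failures, and the manual `reset/1` are hypothetical and omit the timers and half-open probing that production breakers typically include.

```elixir
defmodule Breaker do
  # Hypothetical minimal circuit breaker: opens after `threshold` consecutive failures.
  defstruct state: :closed, failures: 0, threshold: 3

  def new(threshold), do: %__MODULE__{threshold: threshold}

  # Traffic is allowed only while the breaker is closed.
  def allow?(%__MODULE__{state: state}), do: state == :closed

  # A success while closed resets the failure count.
  def record(%__MODULE__{state: :closed} = b, :ok), do: %{b | failures: 0}

  # A failure while closed increments the count and may trip the breaker.
  def record(%__MODULE__{state: :closed, failures: f, threshold: t} = b, :error) do
    if f + 1 >= t do
      %{b | state: :open, failures: f + 1}
    else
      %{b | failures: f + 1}
    end
  end

  # Once open, results are ignored until the breaker is reset.
  def record(%__MODULE__{state: :open} = b, _result), do: b

  # Close the breaker again, for example after the downstream service recovers.
  def reset(%__MODULE__{} = b), do: %{b | state: :closed, failures: 0}
end

b = Breaker.new(2) |> Breaker.record(:error) |> Breaker.record(:error)
IO.inspect(Breaker.allow?(b))  # prints false
```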
Let’s Talk About… Queues, Overload Mitigation, Adaptive Concurrency
We want to allow as many requests as we can actually handle
Adaptive Limits
[Plot: Concurrency vs. Time; labels: Actual limit, Dynamic Discovery]
Load Shedding
[Diagram: two servers]
Are we at the limit?
Am I still healthy?
Update Limits
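The "are we at the limit?" check can be sketched as a counter of in-flight requests compared against a concurrency limit. The `Limiter` module below is hypothetical and purely illustrative; it keeps the state in a plain struct rather than a process, and it does not show how the limit itself gets updated.

```elixir
defmodule Limiter do
  # Hypothetical sketch: track in-flight requests against a concurrency limit.
  defstruct in_flight: 0, limit: 10

  def new(limit), do: %__MODULE__{limit: limit}

  # "Are we at the limit?" If so, shed the request.
  def acquire(%__MODULE__{in_flight: n, limit: l}) when n >= l, do: {:error, :dropped}

  # Otherwise admit the request and count it as in flight.
  def acquire(%__MODULE__{in_flight: n} = s), do: {:ok, %{s | in_flight: n + 1}}

  # Called when a request finishes, freeing a slot (never drops below zero).
  def release(%__MODULE__{in_flight: n} = s), do: %{s | in_flight: max(n - 1, 0)}
end

s = Limiter.new(1)
{:ok, s} = Limiter.acquire(s)
IO.inspect(Limiter.acquire(s))  # prints {:error, :dropped}
```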
Adaptive Limits
[Plot: Concurrency vs. Time; label: Increased latency]
Signals for Adjusting Limits
Latency
Successful vs. Failed requests
Additive Increase Multiplicative Decrease
Success state: limit + 1
Backoff state: limit * 0.95
[Plot: Concurrency vs. Time]
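The two update rules above can be sketched as a pair of function clauses. This is a hypothetical helper, not Regulator's implementation; rounding the decreased limit down to an integer and never letting it fall below 1 are assumptions added so the example stays well defined.

```elixir
defmodule AIMD do
  # Hypothetical helper illustrating additive increase, multiplicative decrease.
  @backoff 0.95
  @min_limit 1

  # Success state: raise the limit by one.
  def update(limit, :ok), do: limit + 1

  # Backoff state: scale the limit by 0.95, round down, keep at least 1 (assumption).
  def update(limit, :backoff), do: max(@min_limit, floor(limit * @backoff))
end

# Two successes followed by one backoff, starting from a limit of 10.
final = Enum.reduce([:ok, :ok, :backoff], 10, fn result, limit -> AIMD.update(limit, result) end)
IO.inspect(final)  # prints 11
```

The limit climbs slowly while requests succeed and falls faster once a backoff signal appears, which produces the sawtooth shape that probes for the actual limit.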
Prior Art/Alternatives
https://github.com/ferd/pobox/
https://github.com/fishcakez/sbroker/
https://github.com/heroku/canal_lock
https://github.com/jlouis/safetyvalve
https://github.com/jlouis/fuse
Regulator
https://github.com/keathley/regulator

Regulator.install(:service, [
  limit: {Regulator.Limit.AIMD, [timeout: 500]}
])

Regulator.ask(:service, fn ->
  {:ok, Finch.request(:get, "https://keathley.io")}
end)
Conclusion
Queues are everywhere
Those queues need to be bounded to avoid overload
If your system is dynamic, your solution will also need to be dynamic
Go and build awesome stuff
Thanks
Chris Keathley / @ChrisKeathley