Optimizing Go: From 3k req/s/core to 480k req/s/core
Ashish
November 20, 2014
Transcript
Optimization 3k req/s/core to 480k req/s/core
DDoS
Layers
User Agents
537 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FREE; .NET CLR 1.1.4322)
272 Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 4.0) Opera 7.0 [en]
269 Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.0) Opera 7.02 Bork-edition [en]
264 Opera/8.00 (Windows NT 5.1; U; en)
264 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; KKman2.0)
261 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; .NET CLR 1.1.4322)
258 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
255 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
253 Opera/7.60 (Windows NT 5.2; U) [en] (IBM EVV/3.0/EAK01AG9/LE)
251 Opera/7.54 (Windows NT 5.1; U) [pl]
251 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; WOW64; SV1; .NET CLR 2.0.50727)
251 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461)
248 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; Win64; AMD64)
247 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
243 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Referer
172 http://pvppw.ru/
166 text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5
164 az-us
162 zh, en-us; q=0.8, en; q=0.6
161 http://zhyk.ru/
160 en-en,en;q=0.8,en-us;q=0.5,en;q=0.3
157 http://www.niagarastar.ru/
152 az-ua
150 http://kremlin.ru/
150 application/xml, image/png, text/html
149 text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
149 http://premier.gov.ru/
148 text/x-dvi; q=.8; mxb=100000; mxt=5.0, text/x-c
147 text/html, */*
147 en-us,en;q=0.5
Architecture
Manageable λ
• Reduce the set of clients whose state you coordinate, say to the top k
Algorithms
• Space Saving algorithm (https://icmi.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf)
• An implementation in Go (https://github.com/cloudflare/golibs)
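To make the idea concrete, here is a minimal sketch of Space Saving as it is commonly described: keep at most k counters, and when an unmonitored key arrives while the table is full, evict the smallest counter and let the new key inherit its count plus one. This is not the cloudflare/golibs code; the names SpaceSaving, Observe and Estimate are hypothetical, and the linear scan for the minimum is a simplification chosen for brevity.

```go
package main

import "fmt"

// counter tracks one monitored key. err is the largest amount by which
// count may overestimate, inherited from the entry this key evicted.
type counter struct {
	count, err uint64
}

// SpaceSaving keeps at most k counters (hypothetical type; k must be >= 1).
type SpaceSaving struct {
	k        int
	counters map[string]*counter
}

func NewSpaceSaving(k int) *SpaceSaving {
	return &SpaceSaving{k: k, counters: make(map[string]*counter, k)}
}

// Observe records one occurrence of key.
func (s *SpaceSaving) Observe(key string) {
	if c, ok := s.counters[key]; ok {
		c.count++
		return
	}
	if len(s.counters) < s.k {
		s.counters[key] = &counter{count: 1}
		return
	}
	// Table is full: find and evict the key with the smallest count.
	// A linear scan keeps this sketch short; it is O(k) per eviction.
	var lowestKey string
	var lowest *counter
	for name, c := range s.counters {
		if lowest == nil || c.count < lowest.count {
			lowestKey, lowest = name, c
		}
	}
	delete(s.counters, lowestKey)
	s.counters[key] = &counter{count: lowest.count + 1, err: lowest.count}
}

// Estimate returns the estimated count and its error bound for key,
// and false if the key is not currently monitored.
func (s *SpaceSaving) Estimate(key string) (uint64, uint64, bool) {
	c, ok := s.counters[key]
	if !ok {
		return 0, 0, false
	}
	return c.count, c.err, true
}

func main() {
	ss := NewSpaceSaving(2)
	for _, key := range []string{"a", "a", "b", "a", "c"} {
		ss.Observe(key)
	}
	count, maxErr, ok := ss.Estimate("c")
	fmt.Println(count, maxErr, ok) // "c" evicted "b" and inherited its count
}
```

Note that the reported count for "c" is an overestimate: it was seen once, but the counter says 2 with an error bound of 1.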
Perfect!
Benchmark Test
• func BenchmarkFoo(b *testing.B)
• b.N
• http://dave.cheney.net/2013/06/30/how-to-write-benchmarks-in-go
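A benchmark with the shape named on this slide might look like the sketch below. The function under test, countRequests, is a hypothetical stand-in, not code from the talk. Normally the benchmark would live in a _test.go file and run under go test -bench=.; testing.Benchmark is used here only so the example runs as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// countRequests is a hypothetical function under test: it counts how
// many times each client address appears.
func countRequests(ips []string) map[string]int {
	counts := make(map[string]int, len(ips))
	for _, ip := range ips {
		counts[ip]++
	}
	return counts
}

// BenchmarkCountRequests runs the function b.N times. The testing
// package picks b.N so that the timing is statistically meaningful.
func BenchmarkCountRequests(b *testing.B) {
	ips := []string{"192.0.2.1", "192.0.2.2", "192.0.2.1"}
	b.ResetTimer() // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		countRequests(ips)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of `go test`.
	res := testing.Benchmark(BenchmarkCountRequests)
	fmt.Printf("%d iterations, %d ns/op\n", res.N, res.NsPerOp())
}
```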
Slow! (3k req/s/core)
Benchmark CPU Profile
• go test -bench=. -cpuprofile=cpu.out
Virtual CPU
Real CPU Profile
• Copying memory is expensive
• Copy pointers instead
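One way to picture this point: when large records are reordered, a slice of values moves every byte of each record on every swap, while a slice of pointers moves only one machine word. The entry type and its 256-byte state field below are assumptions made for illustration, not the struct from the talk.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical record. The state array stands in for whatever
// per-client data makes the real struct expensive to copy.
type entry struct {
	count uint64
	state [256]byte
}

// pointersTo returns a slice of pointers into entries, so that later
// reordering does not copy the records themselves.
func pointersTo(entries []entry) []*entry {
	ptrs := make([]*entry, len(entries))
	for i := range entries {
		ptrs[i] = &entries[i]
	}
	return ptrs
}

// sortByCount orders the pointers by ascending count. Each swap moves
// one pointer rather than a 264-byte struct.
func sortByCount(ptrs []*entry) {
	sort.Slice(ptrs, func(i, j int) bool { return ptrs[i].count < ptrs[j].count })
}

func main() {
	entries := []entry{{count: 3}, {count: 1}, {count: 2}}
	ptrs := pointersTo(entries)
	sortByCount(ptrs)
	fmt.Println(ptrs[0].count, ptrs[1].count, ptrs[2].count) // prints "1 2 3"
}
```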
Keeping Elements Sorted
• Array worst case: O(n)
• Priority queue: O(log n)
O(n), O(log n)
First Pass Optimization
• Reduce the size of memory needed to be copied
• Reduce the number of times copying is needed
75k req/s/core
Wrong Output!
Correctness
• The processing rate was approaching a workable level
• But the rate estimation was grossly inaccurate
The Distribution
[Chart: Requests (axis 0 to 10) for items labeled A through U]
Lesson
• Read the paper properly
• Streaming algorithms tend to output estimates
• Know in what scenarios they fail
Naïve Approach
• Use a map for counting
• There is no second bullet
Find Top k
• Quicksort: O(n log n)
• Quickselect: O(n)
O(n log n), O(n)
Reduce Data Set
• Prune ultra-low values from the map
• Use sort from the standard library
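Taken together, the map-based count and the pruning step could look roughly like the sketch below. The topK function, the kv type and the minCount threshold are hypothetical names for illustration. This version filters low counts while copying into a slice rather than deleting from the map, which may differ from what the talk's code did.

```go
package main

import (
	"fmt"
	"sort"
)

// kv pairs a key with its exact count.
type kv struct {
	key   string
	count int
}

// topK drops entries whose count is below minCount, sorts the rest by
// descending count with the standard library, and returns at most k.
// minCount is an assumed tuning knob, not a value from the talk.
func topK(counts map[string]int, k, minCount int) []kv {
	kept := make([]kv, 0, len(counts))
	for key, c := range counts {
		if c >= minCount {
			kept = append(kept, kv{key, c})
		}
	}
	sort.Slice(kept, func(i, j int) bool {
		if kept[i].count != kept[j].count {
			return kept[i].count > kept[j].count
		}
		return kept[i].key < kept[j].key // tie-break for repeatable output
	})
	if len(kept) > k {
		kept = kept[:k]
	}
	return kept
}

func main() {
	// The naive approach: an exact count per key in a map.
	counts := map[string]int{}
	for _, client := range []string{"a", "b", "a", "c", "a", "b"} {
		counts[client]++
	}
	fmt.Println(topK(counts, 2, 2)) // prints "[{a 3} {b 2}]"
}
```

Pruning first means the sort only touches keys that could plausibly be in the top k, which is where the saving comes from when most keys have very low counts.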
120k req/s/core
Test In Production
More Like
Test In Production
• Run in dark mode (no side effects)
• Needs to be faster to keep up with attacks during peak load
Hello perf top
Garbage
• String manipulation produces garbage
• Slices cannot be used as map keys; use arrays instead
480k req/s/core
Takeaway
• Start simple
• Benchmark
• Profile
• Verify correctness
• Back-of-the-envelope calculations are helpful
Future Work
• This was put into production
• Later, we switched to using lock-free algorithms where possible to reduce the load on CPU
We Are Hiring
“Join CloudFlare.” –Abraham Lincoln
“Special thanks to Albert Strasheim and Stephan Lachowsky.” –Me
The End
• Ashish Gandhi
• @ashishgandhi_
• [email protected]
Easy questions?