Probabilistic Data Structures
Sergey Arkhipov
November 11, 2017
My talk at rannts #18 (11.11.2017)
Transcript
Probabilistic Data Structures. Sergey Arkhipov, 2017
curl http://site.com
curl -x myproxy.ru:3128 http://site.com
curl -x proxy.crawlera.com:8010 http://site.com
[Diagram: a stream of events (evt)]
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
[Diagram: events arrive at three collectors, which aggregate them into counters]
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "ok": 234, "banned": 12, "errors": 3 }
Consumer 1 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 234, "banned": 12, "errors": 3 }
Consumer 2 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 250, "banned": 3, "errors": 0 }
Consumer 3 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 0, "banned": 124, "errors": 84 }
INSERT INTO stats (date, user, hostname, ok, ban, error)
VALUES (:date, :user, :hostname, :ok, :ban, :error)
ON DUPLICATE KEY UPDATE
    ok = ok + VALUES(ok),
    ban = ban + VALUES(ban),
    error = error + VALUES(error);
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "", "response_time": 2861 }
(20 + 10) + 11 = (20 + 11) + 10
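The identity above is what makes the per-status counters safe to aggregate in any order. A minimal sketch of that property, reusing the three consumer payloads from the earlier slide; the `merge` helper is a hypothetical name introduced only for this illustration:

```python
from collections import Counter


def merge(*partials):
    """Sum partial counters key by key (hypothetical helper)."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # update() adds counts instead of replacing them
    return total


# Partial aggregates as reported by the three consumers on the slide.
consumer_1 = Counter(ok=234, banned=12, errors=3)
consumer_2 = Counter(ok=250, banned=3, errors=0)
consumer_3 = Counter(ok=0, banned=124, errors=84)

forward = merge(consumer_1, consumer_2, consumer_3)
shuffled = merge(consumer_3, consumer_1, consumer_2)

print(dict(forward))
print(forward == shuffled)  # addition is commutative and associative
```

Because the merge is a plain sum, the database upsert can apply the partial results in whatever order the consumers deliver them.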
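$F(x) = P\{\sigma < x\}$
$\begin{cases} P(x \leqslant x_\alpha) \geqslant \alpha \\ P(x \geqslant x_\alpha) \geqslant 1 - \alpha \end{cases}$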
$\Omega(N^{1/p})$
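Unlike the counters, quantiles of `response_time` do not merge by simple arithmetic, and the bound above presumably refers to the memory an exact answer needs when the data is read in p passes. The sketch below illustrates only the first point. The `quantile` helper (a nearest-rank implementation of the definition above) and the sample values are assumptions made for this example:

```python
import math


def quantile(values, alpha):
    """Nearest-rank quantile: the smallest x with P(X <= x) >= alpha."""
    ordered = sorted(values)
    rank = max(math.ceil(alpha * len(ordered)), 1)
    return ordered[rank - 1]


# Hypothetical response times seen by two consumers.
consumer_a = [1, 2, 100]
consumer_b = [3, 4, 5, 6, 7]

median_a = quantile(consumer_a, 0.5)
median_b = quantile(consumer_b, 0.5)
averaged = (median_a + median_b) / 2
global_median = quantile(consumer_a + consumer_b, 0.5)

print(median_a, median_b, averaged, global_median)
```

The average of the per-consumer medians differs from the median of the combined data, so each consumer would have to ship either its raw values or some mergeable summary of them.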
[Diagram: three collectors and three pworkers]
var memCount = 75604275;
var memPerSec = 1.38176367782;

// ge() and addEvent() are helpers defined elsewhere on the page (not shown on the slide).
function updateCount() {
  // Random delay in milliseconds until the next increment.
  next = -(1000 / memPerSec) * Math.log(Math.random());
  memCountString = '' + memCount;
  len = memCountString.length;
  // Split the number into groups of three digits separated by thin spaces.
  memCountString = memCountString.substr(0, len - 6) +
    '<span style="font-size: 8px"> </span>' + memCountString.substr(len - 6, 3) +
    '<span style="font-size: 8px"> </span>' + memCountString.substr(len - 3, 3);
  ge('memCount').innerHTML = memCountString;
  memCount = memCount + 1;
  setTimeout(updateCount, next);
}

addEvent(window, 'load', updateCount);
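The delay expression in the snippet above, `-(1000 / memPerSec) * Math.log(Math.random())`, is inverse-transform sampling of an exponential distribution, so the counter appears to simulate arrivals at an average rate of `memPerSec` per second rather than fetch a real count. A rough Python sketch of the same sampling; the function name, seed and sample size are assumptions for illustration:

```python
import math
import random


def next_delay_ms(rate_per_sec, rng):
    """Sample an exponentially distributed delay in milliseconds."""
    u = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    return -(1000.0 / rate_per_sec) * math.log(u)


rate = 1.38176367782  # memPerSec from the slide
rng = random.Random(42)  # fixed seed only to make the run repeatable
delays = [next_delay_ms(rate, rng) for _ in range(100_000)]
mean_delay = sum(delays) / len(delays)

# The sample mean should be close to 1000 / rate milliseconds.
print(round(mean_delay, 1), round(1000.0 / rate, 1))
```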
3500 3671 3400 3502 3463 3371 3607 6012 6168 6211 6017

3500 3507
3671 3667
3400 3410
3502 3502
3463 3466
3371 3330
3607 3599
6012 6009
6168 6152
6211 6215
6017 6016
Count-Min Sketch
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0

Count-Min Sketch
0 0 1 0 0
0 0 0 1 0
0 0 0 1 0
0 1 0 0 0
0 0 0 0 1

Count-Min Sketch
32 11 1 18 200
126 184 78 1 0
91 59 30 24 8
82 76 34 48 72
11 200 129 136 14
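The grids above can be read as a table with one hash function per row: adding an item increments one cell in every row, and the estimate is the minimum over those cells. A minimal sketch under that reading, assuming a 5 x 5 table and salted BLAKE2b as the per-row hash; neither the hash choice nor the class name comes from the slides:

```python
import hashlib


class CountMinSketch:
    def __init__(self, depth=5, width=5):
        self.depth = depth  # number of rows, one hash function per row
        self.width = width  # number of counters in each row
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Salting with the row number gives a different hash for each row.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum is the best guess.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


sketch = CountMinSketch()
sketch.add("rannts.ru")
for row in sketch.table:
    print(row)  # exactly one cell per row is set to 1
print(sketch.estimate("rannts.ru"))
```

The estimate never undercounts; a table this small overcounts quickly, so real deployments would use far wider rows.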
MinHash
$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$
$k = \left[ \frac{1}{\varepsilon^2} \right]$
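One common way to estimate the Jaccard index above is to keep, for each of k hash functions, the minimum hash value over the set, and then count how many of the k minima two sets share. The sketch below assumes salted BLAKE2b as the hash family and made-up item names; it requires non-empty sets:

```python
import hashlib


def _hash(seed, item):
    # One hash function per seed (an assumption of this sketch).
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def signature(items, k):
    """k minima, one per hash function; items must be non-empty."""
    return [min(_hash(seed, item) for item in items) for seed in range(k)]


def estimate_jaccard(sig_a, sig_b):
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    return len(a & b) / len(a | b)


set_a = {f"item-{i}" for i in range(0, 100)}
set_b = {f"item-{i}" for i in range(50, 150)}

k = 400  # roughly 1 / eps^2 for eps = 0.05
estimate = estimate_jaccard(signature(set_a, k), signature(set_b, k))
print(exact_jaccard(set_a, set_b), estimate)
```

Each signature holds only k integers regardless of set size, which is what makes comparing very large sets cheap.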
HyperLogLog
$010010000110010101101100011011000110111100100001_b$
$2^6 = 64$
$1001_b = 9$
$100001_b = 33$
$\sigma = \frac{1.04}{\sqrt{2^k}}$
$E = \frac{\alpha(k) \, 4^k}{\sum_j 2^{-M_j}}$
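The formulas above can be turned into a short estimator: k bits of each hash pick one of 2^k registers, each register M_j keeps the largest rank seen, and E combines the registers. This is a sketch under assumptions not taken from the slides: a 64-bit BLAKE2b hash, the low bits as the register index, rank measured from the least significant bit, the commonly quoted α constants, and the usual small-range correction.

```python
import hashlib
import math


def _alpha(m):
    # Commonly quoted bias-correction constants for m registers.
    if m == 16:
        return 0.673
    if m == 32:
        return 0.697
    if m == 64:
        return 0.709
    return 0.7213 / (1 + 1.079 / m)


class HyperLogLog:
    def __init__(self, k=6):
        if k < 4:
            raise ValueError("k must be at least 4")
        self.k = k
        self.m = 1 << k  # 2^k registers, 64 for k = 6
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        index = h & (self.m - 1)  # low k bits select the register
        rest = h >> self.k  # remaining 64 - k bits
        if rest:
            rank = (rest & -rest).bit_length()  # 1-based position of lowest set bit
        else:
            rank = 64 - self.k + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self):
        raw = _alpha(self.m) * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw


hll = HyperLogLog(k=6)
for i in range(10_000):
    hll.add(f"user-{i}")
print(round(hll.estimate()), 1.04 / math.sqrt(2 ** 6))  # estimate, expected relative error
```

With k = 6 the expected relative error is about 13%; each extra index bit doubles the memory and shrinks the error by a factor of √2.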
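t-digest
$X = x_1, x_2, \ldots, x_n$
$X = \{s_1, s_2, \ldots, s_m\}$
$s_i = \{x_{\mathrm{left}(i)}, \ldots, x_{\mathrm{right}(i)}\}$
$k(q, \delta) \stackrel{\text{def}}{=} \delta \left( \frac{\sin^{-1}(2q - 1)}{\pi} + \frac{1}{2} \right)$
$K(i) \stackrel{\text{def}}{=} k\left( \frac{\mathrm{right}(i)}{n}, \delta \right) - k\left( \frac{\mathrm{left}(i) - 1}{n}, \delta \right)$
$K(i) \leqslant 1$
$K(i) + K(i+1) > 1$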
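The scale function above limits how many sorted points a cluster may hold, and the limit depends on where the cluster sits. The sketch below evaluates k and K directly and finds the largest cluster that satisfies K(i) ⩽ 1 from a given starting position; `max_cluster_size` and the values n = 10000, δ = 100 are assumptions chosen for illustration:

```python
import math


def k(q, delta):
    """Scale function: maps a quantile q in [0, 1] to the range [0, delta]."""
    return delta * (math.asin(2 * q - 1) / math.pi + 0.5)


def cluster_k_size(left, right, n, delta):
    """K(i) for a cluster covering sorted points left..right (1-based, inclusive)."""
    return k(right / n, delta) - k((left - 1) / n, delta)


def max_cluster_size(left, n, delta):
    """Largest number of points starting at `left` that keeps K(i) <= 1."""
    right = left
    while right < n and cluster_k_size(left, right + 1, n, delta) <= 1:
        right += 1
    return right - left + 1


n, delta = 10_000, 100
tail = max_cluster_size(1, n, delta)
middle = max_cluster_size(n // 2, n, delta)
print(tail, middle)
```

Clusters near the tails are forced to stay small while clusters around the median may grow large, which is why extreme quantiles keep their accuracy.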
Q/A