Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probablistic Data Structures
Search
Sergey Arkhipov
November 11, 2017
Programming
0
220
Probablistic Data Structures
My talk on rannts #18 (11.11.2017)
Sergey Arkhipov
November 11, 2017
Tweet
Share
More Decks by Sergey Arkhipov
See All by Sergey Arkhipov
Fingerprinting
9seconds
0
140
Concurrency Models
9seconds
0
170
Own Mustache
9seconds
0
290
Daemonize
9seconds
0
260
Stuff That Works
9seconds
0
260
Evidence
9seconds
0
71
Redneck Monads
9seconds
1
83
Latency
9seconds
0
96
Oh Blindfold Russia!
9seconds
0
240
Other Decks in Programming
See All in Programming
17年周年のWebアプリケーションにTanStack Queryを導入する / Implementing TanStack Query in a 17th Anniversary Web Application
saitolume
0
250
Jakarta EE meets AI
ivargrimstad
0
260
なまけものオバケたち -PHP 8.4 に入った新機能の紹介-
tanakahisateru
1
120
PHPで学ぶプログラミングの教訓 / Lessons in Programming Learned through PHP
nrslib
3
290
KMP와 kotlinx.rpc로 서버와 클라이언트 동기화
kwakeuijin
0
150
PHPUnitしか使ってこなかった 一般PHPerがPestに乗り換えた実録
mashirou1234
0
210
「Chatwork」Android版アプリを 支える単体テストの現在
okuzawats
0
180
Итераторы в Go 1.23: зачем они нужны, как использовать, и насколько они быстрые?
lamodatech
0
820
rails statsで大解剖 🔍 “B/43流” のRailsの育て方を歴史とともに振り返ります
shoheimitani
2
940
競技プログラミングへのお誘い@阪大BOOSTセミナー
kotamanegi
0
360
暇に任せてProxmoxコンソール 作ってみました
karugamo
2
720
Kaigi on Railsに初参加したら、その日にLT登壇が決定した件について
tama50505
0
100
Featured
See All Featured
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
48
2.2k
Code Review Best Practice
trishagee
65
17k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
28
9.1k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.3k
Raft: Consensus for Rubyists
vanstee
137
6.7k
How GitHub (no longer) Works
holman
311
140k
Typedesign – Prime Four
hannesfritz
40
2.4k
Designing for humans not robots
tammielis
250
25k
4 Signs Your Business is Dying
shpigford
181
21k
How to Ace a Technical Interview
jacobian
276
23k
Being A Developer After 40
akosma
87
590k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
111
49k
Transcript
Вероятностные структуры данных Сергей Архипов, 2017
None
None
curl http://site.com
curl -x myproxy.ru:3128 http://site.com
curl -x proxy.crawlera.com:8010 http://site.com
None
evt evt evt evt evt
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
None
collector collector collector
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" } { "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" } { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 234, "banned": 12, "errors": 3, }
Consumer 1 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 234, "banned":
12, "errors": 3, } Consumer 2 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 250, "banned": 3, "errors": 0, } Consumer 3 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 0, "banned": 124, "errors": 84, }
INSERT INTO stats ( date, user, hostname, ok, ban, error
) VALUES ( :date, :user, :hostname, :ok, :ban, :error ) ON DUPLICATE KEY UPDATE ok = ok + VALUES(ok), ban = ban + VALUES(ban), error = error + VALUES(error);
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "", "response_time":
2861, }
(20 + 10) + 11 = (20 + 11) +
10
F(x)=P{σ<x} { P(x⩽x α )⩾α P(x⩾x α )⩾1−α
Ω(N 1 p )
collector collector collector pworker pworker pworker
None
None
var memCount = 75604275; var memPerSec = 1.38176367782; function updateCount()
{ next = -(1000 / memPerSec) * Math.log(Math.random()); memCountString = ''+memCount; len = memCountString.length; memCountString = memCountString.substr(0, len - 6) + ’ < span style = ”font - size: 8 px” > < /span>’+memCountString.substr(len-6,3)+‘ < span style = ”font - size: 8 px” > < /span>’+memCountString.substr(len-3,3); ge(‘memCount’).innerHTML = memCountString; memCount = memCount + 1; setTimeout(updateCount, next); } addEvent(window, ‘load’, updateCount);
3500 3671 3400 3502 3463 3371 3607 6012 6168 6211
6017
3500 3507 3671 3667 3400 3410 3502 3502 3463 3466
3371 3330 3607 3599 6012 6009 6168 6152 6211 6215 6017 6016
Count-Min Sketch 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Count-Min Sketch 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Count-Min Sketch 0 0 1 0 0 0 0 0
1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1
Count-Min Sketch 32 11 1 18 200 126 184 78
1 0 91 59 30 24 8 82 76 34 48 72 11 200 129 136 14
Count-Min Sketch 32 11 1 18 200 126 184 78
1 0 91 59 30 24 8 82 76 34 48 72 11 200 129 136 14
MinHash J (A , B)= |A∩B| |A∪B| k=[ 1 ε2
]
HyperLogLog 010010000110010101101100011011000110111100100001 b 26 = 64 1001 b = 9
100001 b = 33 σ= 1.04 √2k E= α(k)4k ∑ j 2−M j
t-digest
t-digest
t-digest
t-digest X=x 1 , x 2 ,…, x n X={s
1 ,s 2 ,…,s m } s i ={x l e f t(i) ,…, x r i ght(i) }
t-digest k(q,δ)≝δ (sin−1 (2q−1) π + 1 2 ) K(i)≝k(
r i ght(i) n ,δ)−k( le f t(i)−1 n ,δ) K (i)⩽1 K(i)+K (i+1)>1
t-digest
t-digest
collector collector collector pworker pworker pworker
None
Q/A