Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probablistic Data Structures
Search
Sergey Arkhipov
November 11, 2017
Programming
0
240
Probablistic Data Structures
My talk on rannts #18 (11.11.2017)
Sergey Arkhipov
November 11, 2017
Tweet
Share
More Decks by Sergey Arkhipov
See All by Sergey Arkhipov
Fingerprinting
9seconds
0
150
Concurrency Models
9seconds
0
200
Own Mustache
9seconds
0
310
Daemonize
9seconds
0
290
Stuff That Works
9seconds
0
310
Evidence
9seconds
0
85
Redneck Monads
9seconds
1
90
Latency
9seconds
0
110
Oh Blindfold Russia!
9seconds
0
280
Other Decks in Programming
See All in Programming
AIエージェントによるテストフレームワーク Arbigent
takahirom
0
360
コードに語らせよう――自己ドキュメント化が内包する楽しさについて / Let the Code Speak
nrslib
6
1.4k
The Evolution of Enterprise Java with Jakarta EE 11 and Beyond
ivargrimstad
1
530
複数アプリケーションを育てていくための共通化戦略
irof
10
3.7k
統一感のある Go コードを生成 AI の力で手にいれる
otakakot
0
2.9k
Rails産でないDBを Railsに引っ越すHACK - Omotesando.rb #110
lnit
1
160
RubyKaigiで得られる10の価値 〜Ruby話を聞くことだけが RubyKaigiじゃない〜
tomohiko9090
0
140
技術懸念に立ち向かい 法改正を穏便に乗り切った話
pop_cashew
0
1.3k
Spring gRPC で始める gRPC 入門 / Introduction to gRPC with Spring gRPC
mackey0225
2
480
Parallel::Pipesの紹介
skaji
2
900
ktr0731/go-mcpでMCPサーバー作ってみた
takak2166
0
150
20250528 AWS Startupイベント登壇資料:AIコーディングの取り組み
procrustes5
0
160
Featured
See All Featured
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
31
1.2k
Unsuck your backbone
ammeep
671
58k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
35
2.3k
Scaling GitHub
holman
459
140k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
48
5.4k
Java REST API Framework Comparison - PWX 2021
mraible
31
8.6k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
52
2.8k
Facilitating Awesome Meetings
lara
54
6.4k
The Cult of Friendly URLs
andyhume
79
6.4k
jQuery: Nuts, Bolts and Bling
dougneiner
63
7.8k
Speed Design
sergeychernyshev
30
990
Transcript
Вероятностные структуры данных Сергей Архипов, 2017
None
None
curl http://site.com
curl -x myproxy.ru:3128 http://site.com
curl -x proxy.crawlera.com:8010 http://site.com
None
evt evt evt evt evt
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
None
collector collector collector
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" }
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" } { "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "" } { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 234, "banned": 12, "errors": 3, }
Consumer 1 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 234, "banned":
12, "errors": 3, } Consumer 2 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 250, "banned": 3, "errors": 0, } Consumer 3 { "user": "sarkhipov", "hostname": "rannts.ru", "ok": 0, "banned": 124, "errors": 84, }
INSERT INTO stats ( date, user, hostname, ok, ban, error
) VALUES ( :date, :user, :hostname, :ok, :ban, :error ) ON DUPLICATE KEY UPDATE ok = ok + VALUES(ok), ban = ban + VALUES(ban), error = error + VALUES(error);
{ "user": "sarkhipov", "hostname": "rannts.ru", "status": "ok", "status_description": "", "response_time":
2861, }
(20 + 10) + 11 = (20 + 11) +
10
F(x)=P{σ<x} { P(x⩽x α )⩾α P(x⩾x α )⩾1−α
Ω(N 1 p )
collector collector collector pworker pworker pworker
None
None
var memCount = 75604275; var memPerSec = 1.38176367782; function updateCount()
{ next = -(1000 / memPerSec) * Math.log(Math.random()); memCountString = ''+memCount; len = memCountString.length; memCountString = memCountString.substr(0, len - 6) + ’ < span style = ”font - size: 8 px” > < /span>’+memCountString.substr(len-6,3)+‘ < span style = ”font - size: 8 px” > < /span>’+memCountString.substr(len-3,3); ge(‘memCount’).innerHTML = memCountString; memCount = memCount + 1; setTimeout(updateCount, next); } addEvent(window, ‘load’, updateCount);
3500 3671 3400 3502 3463 3371 3607 6012 6168 6211
6017
3500 3507 3671 3667 3400 3410 3502 3502 3463 3466
3371 3330 3607 3599 6012 6009 6168 6152 6211 6215 6017 6016
Count-Min Sketch 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Count-Min Sketch 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Count-Min Sketch 0 0 1 0 0 0 0 0
1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1
Count-Min Sketch 32 11 1 18 200 126 184 78
1 0 91 59 30 24 8 82 76 34 48 72 11 200 129 136 14
Count-Min Sketch 32 11 1 18 200 126 184 78
1 0 91 59 30 24 8 82 76 34 48 72 11 200 129 136 14
MinHash J (A , B)= |A∩B| |A∪B| k=[ 1 ε2
]
HyperLogLog 010010000110010101101100011011000110111100100001 b 26 = 64 1001 b = 9
100001 b = 33 σ= 1.04 √2k E= α(k)4k ∑ j 2−M j
t-digest
t-digest
t-digest
t-digest X=x 1 , x 2 ,…, x n X={s
1 ,s 2 ,…,s m } s i ={x l e f t(i) ,…, x r i ght(i) }
t-digest k(q,δ)≝δ (sin−1 (2q−1) π + 1 2 ) K(i)≝k(
r i ght(i) n ,δ)−k( le f t(i)−1 n ,δ) K (i)⩽1 K(i)+K (i+1)>1
t-digest
t-digest
collector collector collector pworker pworker pworker
None
Q/A