Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
550
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
900
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
950
AWS Lambda - Pycon India 2016
cnu
0
440
ZeroMQ - PyCon India 2013
cnu
2
1.5k
Other Decks in Programming
See All in Programming
アジャイルを支えるテストアーキテクチャ設計/Test Architecting for Agile
goyoki
9
3k
Macとオーディオ再生 2024/11/02
yusukeito
0
300
役立つログに取り組もう
irof
27
9.2k
推し活としてのrails new/oshikatsu_ha_iizo
sakahukamaki
3
1.9k
【Kaigi on Rails 2024】YOUTRUST スポンサーLT
krpk1900
1
290
Kaigi on Rails 2024 - Rails APIモードのためのシンプルで効果的なCSRF対策 / kaigionrails-2024-csrf
corocn
5
3.7k
Hotwire or React? ~アフタートーク・本編に含めなかった話~ / Hotwire or React? after talk
harunatsujita
1
110
CSC509 Lecture 09
javiergs
PRO
0
130
シェーダーで魅せるMapLibreの動的ラスタータイル
satoshi7190
1
420
Importmapを使ったJavaScriptの 読み込みとブラウザアドオンの影響
swamp09
4
1.3k
Tuning GraphQL on Rails
pyama86
2
1.2k
讓數據說話:用 Python、Prometheus 和 Grafana 講故事
eddie
0
380
Featured
See All Featured
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
250
21k
Docker and Python
trallard
40
3.1k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
92
16k
Six Lessons from altMBA
skipperchong
26
3.5k
VelocityConf: Rendering Performance Case Studies
addyosmani
325
24k
Making Projects Easy
brettharned
115
5.9k
Automating Front-end Workflow
addyosmani
1366
200k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
328
21k
A better future with KSS
kneath
238
17k
The Straight Up "How To Draw Better" Workshop
denniskardys
232
140k
Speed Design
sergeychernyshev
24
580
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
28
9.1k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu