Probabilistic Data Structures
cnu
September 03, 2017
Programming
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda
• Log Analysis
• Redis V4
• Probabilistic Data Structures
Log Analysis
Challenges
• 100s of millions of events processed every day
• Peak of ~10 million events in an hour
• Needs real-time processing
• Memory/storage requirements
Cost Accuracy Scale
Sample Event Data
{
  "ip": "123.123.123.123",
  "client_id": 232,
  "user_id": "35827",
  "email": "[email protected]",
  "product_id": "ABC-12345",
  "image_id": 3,
  "action": "pageview",
  "datetime": "2017-06-29T12:42:53Z"
}
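As a rough illustration of how such an event might be bucketed for the hourly counting discussed later in the deck, the sketch below parses the sample event and derives a per-client, per-hour key. The key layout (`visitors:<client_id>:<hour>`) and the helper name `hourly_key` are assumptions made for this example; the deck does not specify a naming scheme. The email field is left out because its value is redacted in the source.

```python
import json
from datetime import datetime

# Sample event from the slide, minus the redacted email field.
raw_event = """
{
  "ip": "123.123.123.123",
  "client_id": 232,
  "user_id": "35827",
  "product_id": "ABC-12345",
  "image_id": 3,
  "action": "pageview",
  "datetime": "2017-06-29T12:42:53Z"
}
"""


def hourly_key(event, prefix="visitors"):
    """Build a hypothetical per-client, per-hour bucket name for an event."""
    ts = datetime.strptime(event["datetime"], "%Y-%m-%dT%H:%M:%SZ")
    return f"{prefix}:{event['client_id']}:{ts.strftime('%Y-%m-%dT%H')}"


event = json.loads(raw_event)
key = hourly_key(event)
print(key, event["user_id"])
```

Every event in the same hour for the same client maps to the same key, so a per-hour counter only ever needs to see the `user_id`.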
Redis Version 4
• Module system
• Better replication
• Cache eviction improvements
• Non-blocking DEL and FLUSH* commands
• Mixed RDB-AOF persistence format
• MEMORY DOCTOR
Modules (icon: mikicon, Noun Project)
Loading Modules
• ./redis-server --loadmodule /path/to/module.so
• redis.conf: loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
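From application code, both steps (loading a module at runtime and calling a command it registers) come down to sending a raw command. The sketch below assumes a client object exposing `execute_command(*args)`, which is the generic entry point redis-py provides. To keep the example runnable without a server, it uses a stand-in `RecordingClient` that only records what would be sent. The command name `EXAMPLE.HELLO` is hypothetical and does not belong to any module named in this deck.

```python
class RecordingClient:
    """Stand-in for a Redis client: records commands instead of sending them."""

    def __init__(self):
        self.sent = []

    def execute_command(self, *args):
        self.sent.append(args)
        return "OK"


def load_module(client, path):
    """Issue MODULE LOAD <path>, the runtime equivalent of --loadmodule."""
    return client.execute_command("MODULE", "LOAD", path)


def call_custom(client, command, *args):
    """Call a command registered by a loaded module (name supplied by the caller)."""
    return client.execute_command(command, *args)


client = RecordingClient()
load_module(client, "/path/to/module.so")
call_custom(client, "EXAMPLE.HELLO", "world")  # hypothetical command name
print(client.sent)
```

With a real connection, the same two helpers should work unchanged as long as the client offers `execute_command`.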
Probabilistic Data Structures
There are three kinds of people in the world.
1. Those who can count.
2. Those who can’t count.
There are three kinds of data structures in the world.
1. Those who can count.
2. Those who can’t count.
3. Those who count approximately.
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog: count the cardinality of a set (http://antirez.com/news/75)
Count unique visitors per hour
Merge Hourly into Daily
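In Redis itself, HyperLogLog is built in and driven by the PFADD, PFCOUNT and PFMERGE commands. To show why hourly counters can be merged into a daily one without double counting, here is a small pure-Python HyperLogLog sketch. It is an illustration under simplifying assumptions (SHA-1 as the hash, 4096 registers, the classic estimator with a linear-counting correction) and is not how Redis implements it. The visitor IDs are synthetic.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog with 2**p registers (not Redis's implementation)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    @staticmethod
    def _hash64(item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        x = self._hash64(item)
        idx = x >> (64 - self.p)                 # top p bits choose a register
        rest = x & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)         # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return round(m * math.log(m / zeros))  # linear counting for small sets
        return round(raw)

    def merge(self, other):
        """Fold another sketch into this one by taking register-wise maxima."""
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


# Two hours of synthetic traffic with 10,000 visitors seen in both hours.
hour_a, hour_b = HyperLogLog(), HyperLogLog()
for i in range(0, 30000):
    hour_a.add(f"user-{i}")
for i in range(20000, 50000):
    hour_b.add(f"user-{i}")

daily = HyperLogLog()
daily.merge(hour_a)
daily.merge(hour_b)
print(hour_a.count(), hour_b.count(), daily.count())
```

Because merging takes the maximum of each register, a visitor seen in both hours contributes once to the daily estimate, which should land near 50,000 rather than 60,000.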
TopK: get the top k elements in a set (https://github.com/RedisLabsModules/topk)
Top k IP Addresses
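One common way to track the top k items of a stream in bounded memory is the Space-Saving algorithm. The sketch below uses it to find the busiest IP addresses in a synthetic stream. This is a swapped-in technique chosen for illustration; the deck does not say which algorithm the topk module uses, and the class name and IP addresses here are invented for the example.

```python
class SpaceSaving:
    """Space-Saving heavy-hitters sketch holding at most `capacity` counters."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def add(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
        else:
            # Evict the smallest counter and let the newcomer inherit its count + 1.
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1

    def top(self, n):
        ranked = sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


# Synthetic stream: two heavy IPs mixed with 200 one-off IPs (1,000 events total).
stream = []
for i in range(100):
    stream += ["10.0.0.1"] * 5 + ["10.0.0.2"] * 3
    stream += [f"192.168.{i}.1", f"192.168.{i}.2"]

sketch = SpaceSaving(capacity=10)
for ip in stream:
    sketch.add(ip)

top2 = sketch.top(2)
print(top2)
```

Counts reported this way can overestimate the true frequency, but never underestimate it for an item that is still being tracked, and memory stays fixed at `capacity` counters regardless of how many distinct IPs appear.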
CountMinSketch: count the frequency of items (https://github.com/RedisLabsModules/countminsketch)
User Pageview counter
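To make the pageview counter idea concrete, here is a minimal Count-Min Sketch in pure Python. It is a sketch under assumptions (SHA-1 based row hashes, a 5 x 2000 table) and does not reflect the module's internals or command names. The user ID `35827` comes from the sample event; the other IDs and the view counts are synthetic.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min Sketch: `depth` rows of `width` counters."""

    def __init__(self, width=2000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One hash per row, derived by salting the item with the row number.
        for row in range(self.depth):
            digest = hashlib.sha1(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def incr(self, item, amount=1):
        for row, col in self._cells(item):
            self.table[row][col] += amount

    def query(self, item):
        # Collisions only ever inflate a cell, so the minimum is the best estimate.
        return min(self.table[row][col] for row, col in self._cells(item))


cms = CountMinSketch()
total = 0
for _ in range(42):                 # one user views 42 pages
    cms.incr("35827")
    total += 1
for i in range(1000):               # 1,000 other users view one page each
    cms.incr(f"other-{i}")
    total += 1

estimate = cms.query("35827")
print(estimate, total)
```

The estimate can be higher than the true count when hashes collide, but never lower, which is usually the acceptable direction of error for a pageview counter.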
Bloom Filters: test membership in a set (https://github.com/RedisLabsModules/rebloom)
Bloom Filters
• False positives: possible
• False negatives: not possible
User Session checking
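For session checking, the useful property of a Bloom filter is that a "no" answer is always correct, while a "yes" is only probably correct. The pure-Python sketch below illustrates that behaviour. It assumes the standard sizing formulas and double hashing over a SHA-256 digest, and it is not the rebloom module's implementation. The session IDs are synthetic.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter sized for `capacity` items at `error_rate`."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.size = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1   # odd step for double hashing
        return [(h1 + i * h2) % self.size for i in range(self.hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


sessions = BloomFilter(capacity=1000, error_rate=0.01)
for i in range(1000):
    sessions.add(f"session-{i}")

known = sum(f"session-{i}" in sessions for i in range(1000))
false_positives = sum(f"unseen-{i}" in sessions for i in range(1000))
print(known, false_positives)
```

Every added session is reported as present, and a small fraction of unseen IDs may be reported as present too, so a positive answer is typically followed by a check against the authoritative store.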
~3 Data Structures
• HyperLogLog
• TopK
• CountMinSketch
• BloomFilter
Thank you
Follow me @cnu