Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
620
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
960
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
1k
AWS Lambda - Pycon India 2016
cnu
0
500
ZeroMQ - PyCon India 2013
cnu
2
1.5k
Other Decks in Programming
See All in Programming
Cursor AI Agentと伴走する アプリケーションの高速リプレイス
daisuketakeda
1
120
セキュリティマネジャー廃止とクラウドネイティブ型サンドボックス活用
kazumura
1
180
A comprehensive view of refactoring
marabesi
0
660
関数型まつり2025登壇資料「関数プログラミングと再帰」
taisontsukada
2
830
イベントストーミングから始めるドメイン駆動設計
jgeem
4
850
ドメインモデリングにおける抽象の役割、tagless-finalによるDSL構築、そして型安全な最適化
knih
11
1.9k
社内での開発コミュニティ活動とモジュラーモノリス標準化事例のご紹介/xPalette and Introduction of Modular monolith standardization
m4maruyama
1
120
KotlinConf 2025 現地参加の土産話
n_takehata
0
100
業務自動化をJavaとSeleniumとAWS Lambdaで実現した方法
greenflagproject
1
120
複数アプリケーションを育てていくための共通化戦略
irof
10
3.9k
型付きアクターモデルがもたらす分散シミュレーションの未来
piyo7
0
790
Beyond Portability: Live Migration for Evolving WebAssembly Workloads
chikuwait
0
380
Featured
See All Featured
How to train your dragon (web standard)
notwaldorf
92
6.1k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
32
2.3k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
29
9.5k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
281
13k
Automating Front-end Workflow
addyosmani
1370
200k
Testing 201, or: Great Expectations
jmmastey
42
7.5k
How STYLIGHT went responsive
nonsquared
100
5.6k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Stop Working from a Prison Cell
hatefulcrawdad
269
20k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu