Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
690
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
1k
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
1.1k
AWS Lambda - Pycon India 2016
cnu
0
570
ZeroMQ - PyCon India 2013
cnu
2
1.6k
Other Decks in Programming
See All in Programming
ポーリング処理廃止によるイベント駆動アーキテクチャへの移行
seitarof
3
1.1k
What Spring Developers Should Know About Jakarta EE
ivargrimstad
0
680
GoのDB アクセスにおける 「型安全」と「柔軟性」の両立 - Bob という選択肢
tak848
0
250
Migration to Signals, Signal Forms, Resource API, and NgRx Signal Store @Angular Days 03/2026 Munich
manfredsteyer
PRO
0
110
AI 開発合宿を通して得た学び
niftycorp
PRO
0
160
野球解説AI Agentを開発してみた - 2026/02/27 LayerX社内LT会資料
shinyorke
PRO
0
350
コーディングルールの鮮度を保ちたい / keep-fresh-go-internal-conventions
handlename
0
220
RAGでハマりがちな"Excelの罠"を、データの構造化で突破する
harumiweb
9
2.9k
[PHPerKaigi 2026]PHPerKaigi2025の企画CodeGolfが最高すぎて社内で内製して半年運営して得た内製と運営の知見
ikezoemakoto
0
220
S3ストレージクラスの「見える」「ある」「使える」は全部違う ─ 体験から見た、仕様の深淵を覗く
ya_ma23
0
810
Go Conference mini in Sendai 2026 : Goに新機能を提案し実装されるまでのフロー徹底解説
yamatoya
0
620
Redox OS でのネームスペース管理と chroot の実現
isanethen
0
340
Featured
See All Featured
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.8k
Stewardship and Sustainability of Urban and Community Forests
pwiseman
0
150
Writing Fast Ruby
sferik
630
63k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.7k
Learning to Love Humans: Emotional Interface Design
aarron
275
41k
BBQ
matthewcrist
89
10k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.3k
New Earth Scene 8
popppiees
1
1.8k
SEO for Brand Visibility & Recognition
aleyda
0
4.4k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
WCS-LA-2024
lcolladotor
0
480
GraphQLとの向き合い方2022年版
quramy
50
14k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu