Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
cnu
September 03, 2017
Programming
750
0
Share
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
1.1k
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
1.2k
AWS Lambda - Pycon India 2016
cnu
0
610
ZeroMQ - PyCon India 2013
cnu
2
1.6k
Other Decks in Programming
See All in Programming
Old Dog, New Tricks: The Java 25 Reinvention - JNation
bazlur_rahman
0
140
軽量Java基盤の設計 DIコンテナに頼らない、長期保守と1秒起動の実現 JJUG CCC 2026 Spring
macha64
0
450
生成AI時代にこそ効くGo | Why Go Works in the Age of Generative AI
mom0tomo
8
3.1k
ビジネスモデルから紐解く、AI+型駆動開発
hirokiomote
2
5.2k
Why Laravel apps break—Mastering the fundamentals to keep them maintainable
kentaroutakeda
1
340
AIとRubyの静的型付け
ukin0k0
0
530
IBM Bobを活用したレガシーアプリの最新化
oniak3ibm
PRO
1
170
CLIであることを活かしたGitHub Copilot CLI活用術 / GitHub Copilot CLI Pro Tips & Tricks
nao_mk2
1
1.2k
不変条件と整合性境界—ビジネスが決める設計判断と実現パターン / Invariants and Consistency Boundaries
nrslib
13
3.4k
DynamoDBには集計系のクエリがないけどなんとかしたい
musan
1
130
肥大化するレガシーコードに立ち向かうためのインターフェース分離と依存の逆転 / JJUG CCC 2026 Spring
hirokunimaeta
0
490
Copilot CLI の継戦能力を高める コンテキスト管理
nozomutu
1
1.2k
Featured
See All Featured
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
320
Skip the Path - Find Your Career Trail
mkilby
1
140
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
16k
Product Roadmaps are Hard
iamctodd
PRO
55
12k
How to train your dragon (web standard)
notwaldorf
97
6.7k
SEO Brein meetup: CTRL+C is not how to scale international SEO
lindahogenes
1
2.7k
Utilizing Notion as your number one productivity tool
mfonobong
4
310
A Guide to Academic Writing Using Generative AI - A Workshop
ks91
PRO
1
320
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
220
Building Adaptive Systems
keathley
44
3k
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
310
Learning to Love Humans: Emotional Interface Design
aarron
275
41k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu