Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
640
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
970
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
1k
AWS Lambda - Pycon India 2016
cnu
0
510
ZeroMQ - PyCon India 2013
cnu
2
1.5k
Other Decks in Programming
See All in Programming
Jakarta EE Meets AI
ivargrimstad
0
590
なぜ今、Terraformの本を書いたのか? - 著者陣に聞く!『Terraformではじめる実践IaC』登壇資料
fufuhu
4
410
副作用と戦う PHP リファクタリング ─ ドメインイベントでビジネスロジックを解きほぐす
kajitack
3
520
AIに安心して任せるためにTypeScriptで一意な型を作ろう
arfes0e2b3c
0
330
Comparing decimals in Swift Testing
417_72ki
0
160
Vibe coding コードレビュー
kinopeee
0
410
GitHub Copilotの全体像と活用のヒント AI駆動開発の最初の一歩
74th
7
1.8k
React は次の10年を生き残れるか:3つのトレンドから考える
oukayuka
41
16k
自作OSでDOOMを動かしてみた
zakki0925224
0
260
変化を楽しむエンジニアリング ~ いままでとこれから ~
murajun1978
0
670
あなたとJIT, 今すぐアセンブ ル
sisshiki1969
0
140
Streamlitで実現できるようになったこと、実現してくれたこと
ayumu_yamaguchi
2
270
Featured
See All Featured
Imperfection Machines: The Place of Print at Facebook
scottboms
267
13k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.4k
Become a Pro
speakerdeck
PRO
29
5.5k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Speed Design
sergeychernyshev
32
1.1k
Balancing Empowerment & Direction
lara
1
530
Designing for Performance
lara
610
69k
Rails Girls Zürich Keynote
gr2m
95
14k
Building a Modern Day E-commerce SEO Strategy
aleyda
43
7.4k
GraphQLとの向き合い方2022年版
quramy
49
14k
The Language of Interfaces
destraynor
158
25k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu