Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
580
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
930
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
980
AWS Lambda - Pycon India 2016
cnu
0
470
ZeroMQ - PyCon India 2013
cnu
2
1.5k
Other Decks in Programming
See All in Programming
CloudNativePGを布教したい
nnaka2992
0
120
楽しく向き合う例外対応
okutsu
0
750
Learning Kotlin with detekt
inouehi
1
210
Datadog DBMでなにができる? JDDUG Meetup#7
nealle
0
160
変化の激しい時代における、こだわりのないエンジニアの強さ
satoshi256kbyte
1
500
dbt Pythonモデルで実現するSnowflake活用術
trsnium
0
270
Rubyと自由とAIと
yotii23
6
1.9k
From the Wild into the Clouds - Laravel Meetup Talk
neverything
0
190
Datadog Workflow Automation で圧倒的価値提供
showwin
1
320
Serverless Rust: Your Low-Risk Entry Point to Rust in Production (and the benefits are huge)
lmammino
1
170
SwiftUI移行のためのインプレッショントラッキング基盤の構築
kokihirokawa
0
180
Rails 1.0 のコードで学ぶ find_by* と method_missing の仕組み / Learn how find_by_* and method_missing work in Rails 1.0 code
maimux2x
1
270
Featured
See All Featured
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
6
580
The Pragmatic Product Professional
lauravandoore
32
6.4k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
30
4.6k
How GitHub (no longer) Works
holman
314
140k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
59k
Practical Orchestrator
shlominoach
186
10k
Unsuck your backbone
ammeep
669
57k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
233
17k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
7
660
Building Applications with DynamoDB
mza
93
6.2k
We Have a Design System, Now What?
morganepeng
51
7.4k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
193
16k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu