Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
630
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
970
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
1k
AWS Lambda - Pycon India 2016
cnu
0
500
ZeroMQ - PyCon India 2013
cnu
2
1.5k
Other Decks in Programming
See All in Programming
顧客の画像データをテラバイト単位で配信する 画像サーバを WebP にした際に起こった課題と その対応策 ~継続的な取り組みを添えて~
takutakahashi
4
1.3k
React は次の10年を生き残れるか:3つのトレンドから考える
oukayuka
7
2.4k
Model Pollution
hschwentner
1
160
Claude Code + Container Use と Cursor で作る ローカル並列開発環境のススメ / ccc local dev
kaelaela
12
7k
PipeCDのプラグイン化で目指すところ
warashi
1
300
テスターからテストエンジニアへ ~新米テストエンジニアが歩んだ9ヶ月振り返り~
non0113
2
220
Agentic Coding: The Future of Software Development with Agents
mitsuhiko
0
130
明示と暗黙 ー PHPとGoの インターフェイスの違いを知る
shimabox
2
620
High-Level Programming Languages in AI Era -Human Thought and Mind-
hayat01sh1da
PRO
0
880
Android 16KBページサイズ対応をはじめからていねいに
mine2424
0
440
Hack Claude Code with Claude Code
choplin
7
2.5k
効率的な開発手段として VRTを活用する
ishkawa
0
160
Featured
See All Featured
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.3k
Making Projects Easy
brettharned
116
6.3k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
26k
Typedesign – Prime Four
hannesfritz
42
2.7k
Become a Pro
speakerdeck
PRO
29
5.4k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
2.9k
Automating Front-end Workflow
addyosmani
1370
200k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
10
970
GraphQLとの向き合い方2022年版
quramy
49
14k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.7k
YesSQL, Process and Tooling at Scale
rocio
173
14k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu