$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
660
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
1k
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
1.1k
AWS Lambda - Pycon India 2016
cnu
0
530
ZeroMQ - PyCon India 2013
cnu
2
1.6k
Other Decks in Programming
See All in Programming
connect-python: convenient protobuf RPC for Python
anuraaga
0
390
Microservices Platforms: When Team Topologies Meets Microservices Patterns
cer
PRO
1
1k
複数人でのCLI/Infrastructure as Codeの暮らしを良くする
shmokmt
5
2.2k
S3 VectorsとStrands Agentsを利用したAgentic RAGシステムの構築
tosuri13
6
300
手軽に積ん読を増やすには?/読みたい本と付き合うには?
o0h
PRO
1
170
Why Kotlin? 電子カルテを Kotlin で開発する理由 / Why Kotlin? at Henry
agatan
2
6.9k
Cell-Based Architecture
larchanjo
0
110
ソフトウェア設計の課題・原則・実践技法
masuda220
PRO
26
22k
リリース時」テストから「デイリー実行」へ!開発マネージャが取り組んだ、レガシー自動テストのモダン化戦略
goataka
0
120
まだ間に合う!Claude Code元年をふりかえる
nogu66
5
680
関数実行の裏側では何が起きているのか?
minop1205
1
680
AIコーディングエージェント(Gemini)
kondai24
0
200
Featured
See All Featured
Six Lessons from altMBA
skipperchong
29
4.1k
Producing Creativity
orderedlist
PRO
348
40k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.5k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
12
970
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.3k
Code Reviewing Like a Champion
maltzj
527
40k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.8k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
36
6.2k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
54k
Large-scale JavaScript Application Architecture
addyosmani
515
110k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.5k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu