Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Probabilistic Data Structures
Search
cnu
September 03, 2017
Programming
0
500
Probabilistic Data Structures
Learn how to use Probabilistic Data Structures and modules in Redis v4 to analyse logs.
cnu
September 03, 2017
Tweet
Share
More Decks by cnu
See All by cnu
Redisconf 2018: Probabilistic Data Structures
cnu
1
800
The Rocky Road from Monolithic to Microservices Architecture
cnu
0
870
AWS Lambda - Pycon India 2016
cnu
0
400
ZeroMQ - PyCon India 2013
cnu
2
1.4k
Other Decks in Programming
See All in Programming
Tailwind CSSを本気でカスタマイズする方法
fsubal
13
5.2k
PHPはいつから死んでいるかの調査
chiroruxx
1
400
大規模Reactアプリのリアーキテクチャ~8万行のTanStack Query移行の軌跡~
kj455
4
950
Milestoner
bkuhlmann
1
410
Git Lint
bkuhlmann
4
750
AWS Application Composerで始める、 サーバーレスなデータ基盤構築 / 20240406-jawsug-hokuriku-shinkansen
kasacchiful
1
260
検証も兼ねて個人開発でHonoとかと向き合った話
hanetsuki
0
830
大規模UIKitベースアプリへのTCAの段階的導入/gradual-adoption-of-tca-in-a-large-scale-uikit-based-app
takehilo
1
100
Random\Randomizer クラスで日常のあれこれを解決しよう! / Random\Randomizer class solves familiar trouble
cocoeyes02
0
220
Changed Rules: Architectures with Lightweight Stores
manfredsteyer
PRO
0
240
Blue/Greenデプロイの導入による 運用フローの改善
kudoas
1
370
Behind VS Code Extensions for JavaScript / TypeScript Linnting and Formatting
unvalley
5
900
Featured
See All Featured
Rails Girls Zürich Keynote
gr2m
91
13k
4 Signs Your Business is Dying
shpigford
175
21k
Product Roadmaps are Hard
iamctodd
44
9.7k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
30
6k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
125
32k
Building Better People: How to give real-time feedback that sticks.
wjessup
355
18k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
21
1.6k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
352
28k
Code Reviewing Like a Champion
maltzj
514
39k
Adopting Sorbet at Scale
ufuk
68
8.6k
Art, The Web, and Tiny UX
lynnandtonic
289
19k
Building a Modern Day E-commerce SEO Strategy
aleyda
17
6.4k
Transcript
Probabilistic Data Structures in Redis
Srinivasan Rangarajan Head of Engineering
Srinivasan Rangarajan @cnu https://cnu.name
Agenda • Log Analysis • Redis V4 • Probabilistic Data
Structures
Log Analysis
Challenges • 100s of Millions of events processed every day
• Peak of ~10 Million events in an hour • Needs Realtime processing • Memory/Storage Requirements
Cost Accuracy Scale
None
Sample Event Data { "ip": "123.123.123.123", "client_id": 232, "user_id": "35827",
"email": "
[email protected]
", "product_id": "ABC-12345", "image_id": 3, "action": "pageview", "datetime": "2017-06-29T12:42:53Z", }
None
Redis Version 4 • Module system • Better Replication •
Cache eviction Improvements • Non-Blocking DEL and FLUSH* commands • Mixed RDB-AOF persistence format • MEMORY DOCTOR
Modules mikicon NounProject
Loading Modules • ./redis-server --loadmodule /path/to/module.so • redis.conf loadmodule /path/to/module.so
• MODULE LOAD /path/to/module.so
Execute a custom command
Probabilistic Data Structures
There are three kinds of people in the world. 1.
Those who can count. 2. Those who can’t count.
There are three kinds of people in the world. data
structures 1. Those who can count. 2. Those who can’t count. 3. Those who count approximately.
None
Advantage: Huge Memory Savings
3 Data Structures
HyperLogLog Count the Cardinality of a Set http://antirez.com/news/75
Count Unique Visitor / hour
Merge Hourly into Daily
TopK Get Top k Elements in a set https://github.com/RedisLabsModules/topk
Top k IP Addresses
None
CountMinSketch Count the frequency of items https://github.com/RedisLabsModules/countminsketch
User Pageview counter
Bloom Filters Test membership in a set https://github.com/RedisLabsModules/rebloom
Bloom Filters False Positives False Negatives
User Session checking
~3 Data Structures • HyperLogLog • TopK • CountMinSketch •
BloomFilter
Thank you
Follow me @cnu