Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Faceting analyzed fields with some sprinkles of...
Search
Boaz Leskes
June 04, 2013
Technology
0
50
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
Tweet
Share
More Decks by Boaz Leskes
See All by Boaz Leskes
Every Shard Deserves a Home - Shard Allocation in Elasticsearch
bleskes
0
310
Life of a Document in Elasticsearch
bleskes
3
3.2k
Resiliency in Elasticsearch & Lucene
bleskes
0
500
Resiliency in Elasticsearch & Lucene
bleskes
0
220
Designing Concurrent Distributed Sequence Numbers for Elasticsearch
bleskes
2
700
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
1
360
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
6
650
The ELK Stack: For Real-Time Enlightenment
bleskes
1
1.7k
Staying Ahead of Users And Time - two use cases of scaling data with Elasticsearch
bleskes
1
310
Other Decks in Technology
See All in Technology
あなたの人生も変わるかも?AWS認定2つで始まったウソみたいな話
iwamot
3
840
GoogleのAIエージェント論 Authors: Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic
customercloud
PRO
0
130
2025年に挑戦したいこと
molmolken
0
150
商品レコメンドでのexplicit negative feedbackの活用
alpicola
1
340
三菱電機で社内コミュニティを立ち上げた話
kurebayashi
1
350
カップ麺の待ち時間(3分)でわかるPartyRockアップデート
ryutakondo
0
130
デジタルアイデンティティ技術 認可・ID連携・認証 応用 / 20250114-OIDF-J-EduWG-TechSWG
oidfj
2
650
The future we create with our own MVV
matsukurou
0
2k
Unsafe.BitCast のすゝめ。
nenonaninu
0
200
PaaSの歴史と、 アプリケーションプラットフォームのこれから
jacopen
7
1.4k
ゼロからわかる!!AWSの構成図を書いてみようワークショップ 問題&解答解説 #デッカイギ #羽田デッカイギおつ
_mossann_t
0
1.5k
Oracle Base Database Service:サービス概要のご紹介
oracle4engineer
PRO
1
16k
Featured
See All Featured
Adopting Sorbet at Scale
ufuk
74
9.2k
Building Adaptive Systems
keathley
38
2.4k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
6
500
Being A Developer After 40
akosma
89
590k
The Pragmatic Product Professional
lauravandoore
32
6.4k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
280
13k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
33
2.7k
The Cult of Friendly URLs
andyhume
78
6.1k
Git: the NoSQL Database
bkeepers
PRO
427
64k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
3
180
Documentation Writing (for coders)
carmenintech
67
4.5k
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures
trending topic analysis and other interesting insights Boaz Leskes Elasticsearch @bleskes work done for Buzzcapture
Trending?
© Buzzcapture
© Buzzcapture
reference reference topic © Buzzcapture
topic reference ≠
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
None
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
P(w|T) = kDt |w 2 Dt k kDt k
brown dog fox quick 2 5 10 12 5 6
12 13 2 5 6 10 12 13 brown dog fox quick
In our index. • Terms = 12GB • “Arrows” =
41GB
{ tweet: { type: "string", analyzer: "whitespace" fielddata: { filter:
{ regex: "^#.*", frequency: { min: 10 } } } } } Drop terms which occur too little
Drop docs with too many terms
reference reference topic © Buzzcapture
iculture 10,122 floor 8,998 cover 6,874 toy 4,402 ground 3,841
4.0 7,878 4.1 4,292 rtacties 4,078 jelly 2,905 bean 2,857