$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Faceting analyzed fields with some sprinkles of...
Search
Boaz Leskes
June 04, 2013
Technology
0
68
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
Tweet
Share
More Decks by Boaz Leskes
See All by Boaz Leskes
Every Shard Deserves a Home - Shard Allocation in Elasticsearch
bleskes
0
330
Life of a Document in Elasticsearch
bleskes
3
3.3k
Resiliency in Elasticsearch & Lucene
bleskes
0
520
Resiliency in Elasticsearch & Lucene
bleskes
0
240
Designing Concurrent Distributed Sequence Numbers for Elasticsearch
bleskes
2
720
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
1
370
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
6
690
The ELK Stack: For Real-Time Enlightenment
bleskes
1
1.7k
Staying Ahead of Users And Time - two use cases of scaling data with Elasticsearch
bleskes
1
330
Other Decks in Technology
See All in Technology
オープンソースKeycloakのMCP認可サーバの仕様の対応状況 / 20251219 OpenID BizDay #18 LT Keycloak
oidfj
0
160
20251219 OpenIDファウンデーション・ジャパン紹介 / OpenID Foundation Japan Intro
oidfj
0
490
【開発を止めるな】機能追加と並行して進めるアーキテクチャ改善/Keep Shipping: Architecture Improvements Without Pausing Dev
bitkey
PRO
1
120
AgentCoreとStrandsで社内d払いナレッジボットを作った話
motojimayu
1
870
通勤手当申請チェックエージェント開発のリアル
whisaiyo
3
440
アラフォーおじさん、はじめてre:Inventに行く / A 40-Something Guy’s First re:Invent Adventure
kaminashi
0
130
JEDAI認定プログラム JEDAI Order 2026 エントリーのご案内 / JEDAI Order 2026 Entry
databricksjapan
0
180
AI との良い付き合い方を僕らは誰も知らない
asei
0
240
AWSに革命を起こすかもしれない新サービス・アップデートについてのお話
yama3133
0
500
ActiveJobUpdates
igaiga
1
310
まだ間に合う! Agentic AI on AWSの現在地をやさしく一挙おさらい
minorun365
17
2.6k
2025-12-18_AI駆動開発推進プロジェクト運営について / AIDD-Promotion project management
yayoi_dd
0
150
Featured
See All Featured
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
16k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
SEO in 2025: How to Prepare for the Future of Search
ipullrank
3
3.3k
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
32
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
7.9k
The Curse of the Amulet
leimatthew05
0
4.7k
Pawsitive SEO: Lessons from My Dog (and Many Mistakes) on Thriving as a Consultant in the Age of AI
davidcarrasco
0
37
Bioeconomy Workshop: Dr. Julius Ecuru, Opportunities for a Bioeconomy in West Africa
akademiya2063
PRO
0
31
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.6k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Self-Hosted WebAssembly Runtime for Runtime-Neutral Checkpoint/Restore in Edge–Cloud Continuum
chikuwait
0
210
Designing Powerful Visuals for Engaging Learning
tmiket
0
190
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures
trending topic analysis and other interesting insights Boaz Leskes Elasticsearch @bleskes work done for Buzzcapture
Trending?
© Buzzcapture
© Buzzcapture
reference reference topic © Buzzcapture
topic reference ≠
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
None
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
P(w|T) = kDt |w 2 Dt k kDt k
brown dog fox quick 2 5 10 12 5 6
12 13 2 5 6 10 12 13 brown dog fox quick
In our index. • Terms = 12GB • “Arrows” =
41GB
{ tweet: { type: "string", analyzer: "whitespace" fielddata: { filter:
{ regex: "^#.*", frequency: { min: 10 } } } } } Drop terms which occur too little
Drop docs with too many terms
reference reference topic © Buzzcapture
iculture 10,122 floor 8,998 cover 6,874 toy 4,402 ground 3,841
4.0 7,878 4.1 4,292 rtacties 4,078 jelly 2,905 bean 2,857