Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin Buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Figure: panels labeled "reference", "reference", and "topic". © Buzzcapture]
topic ≠ reference
$$P(w \mid T) = \frac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$$
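The slide's P(w|T) reads as the fraction of documents in a set that contain the term w. A minimal Python sketch of that estimate follows. The sample documents, the helper name `term_probabilities`, and the final comparison step (ranking terms by the difference between topic and reference probabilities) are assumptions for illustration, not taken from the talk.

```python
from collections import Counter

def term_probabilities(docs):
    """Return P(w|T): the fraction of documents in `docs` that contain term w."""
    if not docs:
        return {}
    doc_freq = Counter()
    for terms in docs:
        doc_freq.update(set(terms))  # count each term once per document
    return {w: n / len(docs) for w, n in doc_freq.items()}

# Hypothetical whitespace-tokenized documents (not from the talk).
topic_docs = [
    ["quick", "brown", "fox"],
    ["quick", "fox"],
    ["brown", "dog"],
    ["quick", "dog"],
]
reference_docs = [
    ["brown", "dog"],
    ["brown", "fox"],
    ["dog"],
    ["quick", "dog"],
]

p_topic = term_probabilities(topic_docs)
p_reference = term_probabilities(reference_docs)

# Assumed comparison: rank terms by how much more likely they are
# in the topic set than in the reference set.
lift = {w: p - p_reference.get(w, 0.0) for w, p in p_topic.items()}
ranked = sorted(lift, key=lift.get, reverse=True)
print(ranked)
```

Terms that appear far more often in the topic set than in the reference set rise to the top of `ranked`. Other comparison measures (a ratio, for example) would fit the same structure.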
[Figure: the terms brown, dog, fox, quick linked by arrows to document IDs 2, 5, 6, 10, 12, 13.]
In our index:
• Terms = 12GB
• “Arrows” = 41GB
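The "arrows" appear to be the links between documents and the terms they contain, which faceting needs in the document-to-terms direction. The Python sketch below illustrates that idea only; it is not how Elasticsearch stores field data. The posting lists are a guess that reuses the terms and document IDs visible on the slide, and `uninvert` and `facet_counts` are hypothetical helper names.

```python
from collections import Counter, defaultdict

# Assumed postings: term -> sorted document IDs (assignment guessed from the slide).
inverted_index = {
    "brown": [2, 5],
    "dog": [10, 12],
    "fox": [5, 6],
    "quick": [12, 13],
}

def uninvert(index):
    """Build the doc -> terms view (the "arrows") from a term -> docs index."""
    doc_terms = defaultdict(list)
    for term, doc_ids in index.items():
        for doc_id in doc_ids:
            doc_terms[doc_id].append(term)
    return dict(doc_terms)

def facet_counts(doc_terms, matching_docs):
    """Count, per term, how many of the matching documents contain it."""
    counts = Counter()
    for doc_id in matching_docs:
        counts.update(doc_terms.get(doc_id, []))
    return counts

doc_terms = uninvert(inverted_index)
counts = facet_counts(doc_terms, matching_docs=[5, 12, 13])
print(counts.most_common())
```

In this sketch there is one arrow per (document, term) pair, so documents with many terms add many arrows. That is consistent with the arrows taking more space than the terms themselves.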
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms which occur too rarely
Drop docs with too many terms
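To make the effect of these filters concrete, here is a small Python sketch of the same three ideas: keep only terms matching a pattern, drop rare terms, and skip documents with too many terms. It illustrates the concept only and is not Elasticsearch's implementation. The function name, the `max_terms_per_doc` parameter, the order of the steps, and the sample tweets are all assumptions.

```python
import re
from collections import Counter

def filter_field_data(doc_terms, pattern, min_doc_freq, max_terms_per_doc):
    """Keep terms that match `pattern` and occur in at least `min_doc_freq`
    documents, after skipping documents with more than `max_terms_per_doc` terms."""
    regex = re.compile(pattern)
    # Drop documents with too many distinct terms (hypothetical threshold).
    kept_docs = {
        doc_id: set(terms)
        for doc_id, terms in doc_terms.items()
        if len(set(terms)) <= max_terms_per_doc
    }
    # Document frequency of each term over the remaining documents.
    doc_freq = Counter(term for terms in kept_docs.values() for term in terms)
    allowed = {
        term for term, n in doc_freq.items()
        if n >= min_doc_freq and regex.match(term)
    }
    return {doc_id: sorted(terms & allowed) for doc_id, terms in kept_docs.items()}

# Hypothetical whitespace-tokenized tweets.
tweets = {
    1: ["#es", "rocks"],
    2: ["#es", "#lucene"],
    3: ["#es", "#a", "#b", "#c", "#d"],
    4: ["#lucene", "search"],
}
filtered = filter_field_data(tweets, r"^#.*", min_doc_freq=2, max_terms_per_doc=3)
print(filtered)
```

With these inputs only the hashtags that occur in at least two of the kept tweets survive, so far fewer (document, term) pairs need to be held in memory.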
[Figure: panels labeled "reference", "reference", and "topic". © Buzzcapture]
iculture   10,122
floor       8,998
cover       6,874
toy         4,402
ground      3,841

4.0         7,878
4.1         4,292
rtacties    4,078
jelly       2,905
bean        2,857