Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin Buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Figure: panels labeled "reference", "reference", "topic". © Buzzcapture]
topic ≠ reference
topic vs. reference:
$P(w \mid T) = \dfrac{\lVert \{ D_t \mid w \in D_t \} \rVert}{\lVert D_t \rVert}$
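The estimate P(w|T) is the fraction of a topic's documents that contain the term w. The following is a minimal Python sketch of that estimate, assuming each document is already a list of tokens. The function names, the sample data, and the ratio-based comparison against the reference set (`min_ratio`, `floor`) are illustrative assumptions; the slides only state that topic and reference differ, not how the two are scored against each other.

```python
from collections import Counter


def term_probabilities(docs):
    """Estimate P(w|T) as the fraction of documents in `docs` containing w."""
    if not docs:
        return {}
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term at most once per document
    total = len(docs)
    return {term: count / total for term, count in doc_freq.items()}


def trending(topic_docs, reference_docs, min_ratio=2.0, floor=1e-3):
    """Rank terms whose topic probability exceeds the reference probability.

    Assumption: a simple probability ratio is used as the score; `floor`
    avoids division by zero for terms absent from the reference set.
    """
    p_topic = term_probabilities(topic_docs)
    p_ref = term_probabilities(reference_docs)
    scored = [
        (term, p / max(p_ref.get(term, 0.0), floor))
        for term, p in p_topic.items()
    ]
    return sorted(
        (pair for pair in scored if pair[1] >= min_ratio),
        key=lambda pair: -pair[1],
    )


# Made-up sample data, not taken from the talk.
topic = [["#jelly", "bean"], ["jelly", "bean", "toy"], ["toy", "floor"]]
reference = [["floor", "cover"], ["floor", "ground"], ["toy", "ground"], ["cover"]]

for term, ratio in trending(topic, reference):
    print(f"{term}\t{ratio:.2f}")
```

Counting each term once per document keeps the estimate a document fraction rather than a raw term frequency, which matches the set-size form of the formula.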
[Figure: the terms "brown", "dog", "fox", "quick" shown alongside lists of document ids, in both directions (terms to documents and documents to terms)]
In our index:
• Terms = 12GB
• “Arrows” = 41GB
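To make the two structures concrete, here is a small Python sketch of an inverted index (term to document ids) and its "uninverted" counterpart (document id to terms), which is the direction faceting needs. It assumes the "arrows" on the slide are the individual document-to-term links; that reading, the function names, and the sample documents are assumptions for illustration, not the slide's actual data or Elasticsearch's internal layout.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for term in tokens:
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def uninvert(index):
    """Map each document id back to the sorted list of its terms."""
    per_doc = defaultdict(list)
    for term in sorted(index):
        for doc_id in index[term]:
            per_doc[doc_id].append(term)
    return dict(per_doc)


def count_links(per_doc):
    """Count document-to-term links (the assumed meaning of "arrows")."""
    return sum(len(terms) for terms in per_doc.values())


# Made-up documents for illustration.
docs = {
    1: ["quick", "brown", "fox"],
    2: ["brown", "dog"],
    3: ["quick", "dog"],
}

index = build_inverted_index(docs)
per_doc = uninvert(index)
print(index)
print(per_doc)
print(count_links(per_doc))
```

The number of links grows with every distinct term in every document, while the term dictionary grows only with the number of distinct terms. That is one plausible reason the links would dominate memory in a tweet index.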
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
• Drop terms that occur too rarely
• Drop docs with too many terms
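The effect of these filters can be sketched in plain Python. This is not how Elasticsearch implements field data filtering; it only mimics the outcome on a small in-memory collection. The function name and the `max_terms_per_doc` parameter are hypothetical, and the thresholds in the usage lines are chosen for the toy data rather than taken from the slide.

```python
import re
from collections import Counter


def filter_field_data(docs, pattern=r"^#.*", min_freq=10, max_terms_per_doc=None):
    """Return {doc_id: kept terms} after applying the three filters.

    1. Drop documents with more than `max_terms_per_doc` distinct terms.
    2. Keep only terms matching `pattern` (hashtags by default).
    3. Drop terms that appear in fewer than `min_freq` remaining documents.
    """
    regex = re.compile(pattern)
    kept_docs = {
        doc_id: set(tokens)
        for doc_id, tokens in docs.items()
        if max_terms_per_doc is None or len(set(tokens)) <= max_terms_per_doc
    }
    doc_freq = Counter()
    for terms in kept_docs.values():
        doc_freq.update(t for t in terms if regex.match(t))
    allowed = {t for t, count in doc_freq.items() if count >= min_freq}
    return {doc_id: sorted(terms & allowed) for doc_id, terms in kept_docs.items()}


# Made-up tweets for illustration.
tweets = {
    1: ["#a", "x"],
    2: ["#a", "#b"],
    3: ["#a", "#b", "#c", "#d"],
}

print(filter_field_data(tweets, min_freq=2, max_terms_per_doc=3))
```

Filtering before loading shrinks both the term dictionary and the per-document links, since rare terms and very long documents never enter memory.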
iculture  10,122
floor      8,998
cover      6,874
toy        4,402
ground     3,841

4.0        7,878
4.1        4,292
rtacties   4,078
jelly      2,905
bean       2,857