Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Faceting analyzed fields with some sprinkles of...
Search
Boaz Leskes
June 04, 2013
Technology
0
72
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
Tweet
Share
More Decks by Boaz Leskes
See All by Boaz Leskes
Every Shard Deserves a Home - Shard Allocation in Elasticsearch
bleskes
0
330
Life of a Document in Elasticsearch
bleskes
3
3.3k
Resiliency in Elasticsearch & Lucene
bleskes
0
540
Resiliency in Elasticsearch & Lucene
bleskes
0
250
Designing Concurrent Distributed Sequence Numbers for Elasticsearch
bleskes
2
730
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
1
380
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
6
690
The ELK Stack: For Real-Time Enlightenment
bleskes
1
1.7k
Staying Ahead of Users And Time - two use cases of scaling data with Elasticsearch
bleskes
1
340
Other Decks in Technology
See All in Technology
ガバメントクラウドにおけるAWSの長期継続割引について
takeda_h
2
5.4k
スピンアウト講座01_GitHub管理
overflowinc
0
820
開発チームとQAエンジニアの新しい協業モデル -年末調整開発チームで実践する【QAリード施策】-
kaomi_wombat
0
200
Phase04_ターミナル基礎
overflowinc
0
1.4k
Kiro Powers 入門
k_adachi_01
0
130
夢の無限スパゲッティ製造機 #phperkaigi
o0h
PRO
0
330
モジュラモノリス導入から4年間の総括:アーキテクチャと組織の相互作用について / Architecture and Organizational Interaction
nazonohito51
3
1.4k
Zero Data Loss Autonomous Recovery Service サービス概要
oracle4engineer
PRO
3
13k
GCASアップデート(202601-202603)
techniczna
0
250
2026年もソフトウェアサプライチェーンのリスクに立ち向かうために / Product Security Square #3
flatt_security
1
740
1GB RAMのラズピッピで何ができるのか試してみよう / 20260319-rpijam-1gb-rpi-whats-possible
akkiesoft
0
750
俺の/私の最強アーキテクチャ決定戦開催 ― チームで新しいアーキテクチャに適合していくために / 20260322 Naoki Takahashi
shift_evolve
PRO
1
380
Featured
See All Featured
Thoughts on Productivity
jonyablonski
75
5.1k
AI Search: Implications for SEO and How to Move Forward - #ShenzhenSEOConference
aleyda
1
1.2k
Jess Joyce - The Pitfalls of Following Frameworks
techseoconnect
PRO
1
110
Making the Leap to Tech Lead
cromwellryan
135
9.8k
Leading Effective Engineering Teams in the AI Era
addyosmani
9
1.8k
The Anti-SEO Checklist Checklist. Pubcon Cyber Week
ryanjones
0
96
Deep Space Network (abreviated)
tonyrice
0
94
Balancing Empowerment & Direction
lara
5
960
Hiding What from Whom? A Critical Review of the History of Programming languages for Music
tomoyanonymous
2
570
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3.4k
We Are The Robots
honzajavorek
0
200
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
16k
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures
trending topic analysis and other interesting insights Boaz Leskes Elasticsearch @bleskes work done for Buzzcapture
Trending?
© Buzzcapture
© Buzzcapture
reference reference topic © Buzzcapture
topic reference ≠
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
None
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
P(w|T) = kDt |w 2 Dt k kDt k
brown dog fox quick 2 5 10 12 5 6
12 13 2 5 6 10 12 13 brown dog fox quick
In our index. • Terms = 12GB • “Arrows” =
41GB
{ tweet: { type: "string", analyzer: "whitespace" fielddata: { filter:
{ regex: "^#.*", frequency: { min: 10 } } } } } Drop terms which occur too little
Drop docs with too many terms
reference reference topic © Buzzcapture
iculture 10,122 floor 8,998 cover 6,874 toy 4,402 ground 3,841
4.0 7,878 4.1 4,292 rtacties 4,078 jelly 2,905 bean 2,857