Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Faceting analyzed fields with some sprinkles of...
Search
Boaz Leskes
June 04, 2013
Technology
0
70
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Boaz Leskes
June 04, 2013
Tweet
Share
More Decks by Boaz Leskes
See All by Boaz Leskes
Every Shard Deserves a Home - Shard Allocation in Elasticsearch
bleskes
0
330
Life of a Document in Elasticsearch
bleskes
3
3.3k
Resiliency in Elasticsearch & Lucene
bleskes
0
530
Resiliency in Elasticsearch & Lucene
bleskes
0
240
Designing Concurrent Distributed Sequence Numbers for Elasticsearch
bleskes
2
730
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
1
370
Not all Nodes are Created Equal - Scaling Elasticsearch
bleskes
6
690
The ELK Stack: For Real-Time Enlightenment
bleskes
1
1.7k
Staying Ahead of Users And Time - two use cases of scaling data with Elasticsearch
bleskes
1
330
Other Decks in Technology
See All in Technology
外部キー制約の知っておいて欲しいこと - RDBMSを正しく使うために必要なこと / FOREIGN KEY Night
soudai
PRO
12
5.6k
StrandsとNeptuneを使ってナレッジグラフを構築する
yakumo
1
130
Context Engineeringの取り組み
nutslove
0
380
SREチームをどう作り、どう育てるか ― Findy横断SREのマネジメント
rvirus0817
0
350
Tebiki Engineering Team Deck
tebiki
0
24k
Oracle AI Database移行・アップグレード勉強会 - RAT活用編
oracle4engineer
PRO
0
110
AIエージェントに必要なのはデータではなく文脈だった/ai-agent-context-graph-mybest
jonnojun
1
250
ランサムウェア対策としてのpnpm導入のススメ
ishikawa_satoru
0
230
制約が導く迷わない設計 〜 信頼性と運用性を両立するマイナンバー管理システムの実践 〜
bwkw
3
1.1k
顧客の言葉を、そのまま信じない勇気
yamatai1212
1
370
こんなところでも(地味に)活躍するImage Modeさんを知ってるかい?- Image Mode for OpenShift -
tsukaman
1
170
広告の効果検証を題材にした因果推論の精度検証について
zozotech
PRO
0
210
Featured
See All Featured
Documentation Writing (for coders)
carmenintech
77
5.3k
Producing Creativity
orderedlist
PRO
348
40k
Ecommerce SEO: The Keys for Success Now & Beyond - #SERPConf2024
aleyda
1
1.8k
Google's AI Overviews - The New Search
badams
0
910
More Than Pixels: Becoming A User Experience Designer
marktimemedia
3
330
Why Our Code Smells
bkeepers
PRO
340
58k
Darren the Foodie - Storyboard
khoart
PRO
2
2.4k
The Illustrated Children's Guide to Kubernetes
chrisshort
51
51k
Art, The Web, and Tiny UX
lynnandtonic
304
21k
Thoughts on Productivity
jonyablonski
74
5k
Data-driven link building: lessons from a $708K investment (BrightonSEO talk)
szymonslowik
1
920
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.7k
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures
trending topic analysis and other interesting insights Boaz Leskes Elasticsearch @bleskes work done for Buzzcapture
Trending?
© Buzzcapture
© Buzzcapture
reference reference topic © Buzzcapture
topic reference ≠
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
None
topic reference P(w|T) = kDt |w 2 Dt k kDt
k
P(w|T) = kDt |w 2 Dt k kDt k
brown dog fox quick 2 5 10 12 5 6
12 13 2 5 6 10 12 13 brown dog fox quick
In our index. • Terms = 12GB • “Arrows” =
41GB
{ tweet: { type: "string", analyzer: "whitespace" fielddata: { filter:
{ regex: "^#.*", frequency: { min: 10 } } } } } Drop terms which occur too little
Drop docs with too many terms
reference reference topic © Buzzcapture
iculture 10,122 floor 8,998 cover 6,874 toy 4,402 ground 3,841
4.0 7,878 4.1 4,292 rtacties 4,078 jelly 2,905 bean 2,857