Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes (work done for Buzzcapture)
Trending?
© Buzzcapture
[Figure labels: reference, reference, topic. © Buzzcapture]
topic ≠ reference

$$P(w \mid T) = \frac{\lVert \{ D_t \mid w \in D_t \} \rVert}{\lVert D_t \rVert}$$
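The formula above reads as the fraction of documents in a set that contain the word w. A minimal Python sketch of that estimate follows, computed once for a topic set and once for a reference set. The sample documents, the function name `term_probability`, and the use of a plain difference to rank words are assumptions made for illustration; the slides only state the formula and that topic and reference differ.

```python
from typing import Dict, List, Set


def term_probability(docs: List[Set[str]]) -> Dict[str, float]:
    """Estimate P(w|T) as (number of docs containing w) / (number of docs)."""
    counts: Dict[str, int] = {}
    for doc in docs:
        for word in doc:  # each doc is a set, so a word counts once per doc
            counts[word] = counts.get(word, 0) + 1
    total = len(docs)
    return {word: count / total for word, count in counts.items()}


# Hypothetical documents, each reduced to its set of distinct terms.
topic_docs = [{"quick", "fox"}, {"fox", "dog"}, {"fox"}, {"brown", "dog"}]
reference_docs = [{"quick", "dog"}, {"dog"}, {"brown", "fox"}, {"brown", "dog"}]

p_topic = term_probability(topic_docs)
p_reference = term_probability(reference_docs)

# Assumed scoring: how much more likely a word is in the topic set
# than in the reference set.
score = {w: p_topic[w] - p_reference.get(w, 0.0) for w in p_topic}
ranked = sorted(score, key=score.get, reverse=True)
print(ranked[0], score[ranked[0]])
```

Other comparisons (a ratio, or a statistical test) would fit the same structure; only the `score` line would change.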
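[Figure: the terms brown, dog, fox, quick with their lists of document IDs (2 5 10 12 5 6 12 13), and the documents 2, 5, 6, 10, 12, 13 pointing back to the terms brown, dog, fox, quick.]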
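The slide shows two views of the same data: terms with their document IDs, and documents with their terms. The following Python sketch illustrates how the second view can be derived from the first and then used to count terms over a set of matching documents. The assignment of document IDs to terms is hypothetical (the slide text does not preserve it), and `uninvert` and `facet_counts` are illustrative names, not Elasticsearch or Lucene APIs.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List

# Inverted index: term -> sorted list of document IDs.
# The split of IDs across terms is an assumption for illustration.
inverted: Dict[str, List[int]] = {
    "brown": [2, 5],
    "dog": [10, 12],
    "fox": [5, 6],
    "quick": [12, 13],
}


def uninvert(index: Dict[str, List[int]]) -> Dict[int, List[str]]:
    """Build the document -> terms view (one 'arrow' per doc/term pair)."""
    doc_terms: Dict[int, List[str]] = defaultdict(list)
    for term in sorted(index):
        for doc_id in index[term]:
            doc_terms[doc_id].append(term)
    return dict(sorted(doc_terms.items()))


def facet_counts(doc_ids: Iterable[int], doc_terms: Dict[int, List[str]]) -> Counter:
    """Count how many of the given documents contain each term."""
    counts: Counter = Counter()
    for doc_id in doc_ids:
        counts.update(doc_terms.get(doc_id, []))
    return counts


doc_to_terms = uninvert(inverted)
arrows = sum(len(terms) for terms in doc_to_terms.values())
print(doc_to_terms)
print("arrows:", arrows)
print(facet_counts([5, 6], doc_to_terms))
```

The number of document-to-term pairs grows with the number of terms per document, which is why that structure can end up larger than the term dictionary itself.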
In our index:
• Terms = 12GB
• “Arrows” = 41GB
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms that occur too rarely
Drop docs with too many terms
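To make the effect of the filter settings concrete, here is a small Python simulation of the idea: keep only terms that match the regex and that appear in at least a minimum number of documents. It is a sketch of the concept, not of how Elasticsearch implements field data filtering. The function name `filter_terms`, the sample tweets, and the reading of `min` as an absolute document count are assumptions.

```python
import re
from collections import Counter
from typing import Dict, List


def filter_terms(docs: List[str], pattern: str, min_freq: int) -> Dict[str, int]:
    """Return term -> document frequency for terms that pass both filters."""
    regex = re.compile(pattern)
    doc_freq: Counter = Counter()
    for doc in docs:
        # Whitespace tokenization, mirroring the "whitespace" analyzer;
        # set() ensures a term counts once per document.
        doc_freq.update(set(doc.split()))
    return {
        term: freq
        for term, freq in doc_freq.items()
        if regex.fullmatch(term) and freq >= min_freq
    }


# Hypothetical tweets. The slide uses min: 10; a lower threshold is used
# here so the tiny sample produces a result.
tweets = [
    "#berlin is great",
    "#berlin #buzzwords talk",
    "great talk #search",
]
kept = filter_terms(tweets, r"^#.*", min_freq=2)
print(kept)
```

With these inputs only `#berlin` survives: `great` and `talk` are frequent enough but are not hashtags, while `#buzzwords` and `#search` are hashtags that occur in a single tweet.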
[Figure labels: reference, reference, topic. © Buzzcapture]
iculture 10,122
floor 8,998
cover 6,874
toy 4,402
ground 3,841

4.0 7,878
4.1 4,292
rtacties 4,078
jelly 2,905
bean 2,857