Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Figure with labels: reference, reference, topic. © Buzzcapture]
topic ≠ reference
topic, reference:
$$P(w \mid T) = \frac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$$
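A minimal sketch of the estimate above, reading the formula as "the fraction of the topic's documents that contain the term w". The function name `term_probabilities`, the whitespace tokenization, and the sample documents are assumptions made for illustration; they are not taken from the talk.

```python
from collections import Counter


def term_probabilities(docs):
    """Estimate P(w|T) as the fraction of documents in `docs` that contain w."""
    if not docs:
        return {}
    doc_freq = Counter()
    for doc in docs:
        # Whitespace tokens, counted at most once per document.
        doc_freq.update(set(doc.split()))
    return {term: count / len(docs) for term, count in doc_freq.items()}


# Hypothetical topic and reference document sets.
topic_docs = ["quick brown fox", "quick dog", "brown dog", "quick fox"]
reference_docs = ["brown dog", "brown dog", "lazy dog", "quick dog"]

p_topic = term_probabilities(topic_docs)
p_reference = term_probabilities(reference_docs)
```

With these sample sets, "quick" appears in a larger share of the topic documents than of the reference documents, which is the kind of difference the topic versus reference comparison looks for.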
[Diagram: terms (brown, dog, fox, quick) linked to document IDs (2, 5, 6, 10, 12, 13)]
In our index:
• Terms = 12GB
• “Arrows” = 41GB
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms which occur too rarely
Drop docs with too many terms
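The effect of the term filter can be sketched in plain Python, assuming `min: 10` means "appears in at least 10 documents" and that the regex must match from the start of the term. This illustrates the idea only and is not how Elasticsearch implements field data filtering; `filter_terms` and the sample documents are hypothetical.

```python
import re
from collections import Counter


def filter_terms(docs, pattern=r"^#.*", min_docs=10):
    """Return the terms that match `pattern` and occur in at least `min_docs` documents."""
    doc_freq = Counter()
    for doc in docs:
        # Whitespace analysis: split on whitespace, count each term once per document.
        doc_freq.update(set(doc.split()))
    regex = re.compile(pattern)
    return {term for term, count in doc_freq.items()
            if count >= min_docs and regex.match(term)}


# Hypothetical tweets: "#buzz" occurs in 10 documents, "#rare" in only one.
docs = ["#buzz is on"] * 10 + ["#rare tag", "buzz buzz"]
kept = filter_terms(docs)
```

Here only `#buzz` survives: `is` and `on` are frequent enough but are not hashtags, and `#rare` is a hashtag but occurs too rarely.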
[Figure with labels: reference, reference, topic. © Buzzcapture]
iculture   10,122
floor       8,998
cover       6,874
toy         4,402
ground      3,841

4.0         7,878
4.1         4,292
rtacties    4,078
jelly       2,905
bean        2,857