Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin Buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
[Image slides © Buzzcapture; the last one is labeled: reference, reference, topic]
topic ≠ reference
$$P(w \mid T) = \frac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$$
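A minimal sketch of the probability on the slide above, read as the fraction of topic documents that contain the word w. The function name `term_probability`, the toy documents, and applying the same formula to a reference set are assumptions made for illustration; the slides only show the formula for the topic.

```python
from typing import Iterable, Set


def term_probability(term: str, docs: Iterable[Set[str]]) -> float:
    """Fraction of documents in `docs` that contain `term`."""
    docs = list(docs)
    if not docs:
        return 0.0
    containing = sum(1 for doc in docs if term in doc)
    return containing / len(docs)


# Hypothetical toy data: each document is the set of its analyzed terms.
topic_docs = [
    {"quick", "brown", "fox"},
    {"brown", "dog"},
    {"quick", "dog"},
    {"fox"},
]
reference_docs = [
    {"brown", "dog"},
    {"dog"},
    {"cat"},
    {"dog", "fox"},
    {"bird"},
]

# "fox" appears in 2 of 4 topic documents and 1 of 5 reference documents.
p_topic = term_probability("fox", topic_docs)
p_reference = term_probability("fox", reference_docs)
```

A term whose probability in the topic set is much higher than in the reference set would be a candidate trending term under this reading.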
[Diagram: the terms brown, dog, fox, quick linked to document IDs 2, 5, 6, 10, 12, 13]
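The diagram can be read as two views of the same data: terms pointing to documents, and documents pointing back to terms. The sketch below builds both, assuming that the "Arrows" mentioned on the next slide are the document-to-term links. The document contents here are hypothetical and are not the assignments shown on the slide.

```python
from collections import defaultdict

# Hypothetical documents: doc ID -> analyzed terms.
docs = {
    2: ["quick", "brown"],
    5: ["brown", "dog"],
    6: ["dog", "fox"],
}

# Inverted index: term -> sorted list of doc IDs containing it.
inverted = defaultdict(list)
for doc_id in sorted(docs):
    for term in set(docs[doc_id]):
        inverted[term].append(doc_id)

# Uninverted view used for faceting: each distinct term gets an ordinal,
# and every document stores the ordinals of its terms (the "arrows").
terms = sorted(inverted)
ordinal = {term: i for i, term in enumerate(terms)}
doc_to_ordinals = {
    doc_id: sorted(ordinal[t] for t in set(doc_terms))
    for doc_id, doc_terms in docs.items()
}

# One arrow per (document, term) pair.
arrow_count = sum(len(ords) for ords in doc_to_ordinals.values())
```

The term list grows with the vocabulary, while the arrows grow with the number of (document, term) pairs, which is why the second structure can be the larger of the two.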
In our index:
• Terms = 12GB
• “Arrows” = 41GB
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms which occur too rarely
Drop docs with too many terms
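The two pruning ideas above can be sketched in plain Python. This is an illustration of the idea only, not Elasticsearch's implementation. The function `filter_field_data`, its parameters, and the order of the steps are assumptions, as is the reading that terms matching the regex are the ones kept.

```python
import re
from collections import Counter


def filter_field_data(docs, pattern=r"^#.*", min_doc_freq=2, max_terms_per_doc=5):
    """Return doc ID -> set of terms that survive the filters."""
    keep = re.compile(pattern)

    # Drop docs with too many distinct terms.
    kept_docs = {
        doc_id: set(terms)
        for doc_id, terms in docs.items()
        if len(set(terms)) <= max_terms_per_doc
    }

    # Count in how many of the remaining docs each term occurs.
    doc_freq = Counter(t for terms in kept_docs.values() for t in terms)

    # Keep terms that match the pattern and occur often enough.
    allowed = {
        t for t, n in doc_freq.items()
        if n >= min_doc_freq and keep.match(t)
    }
    return {doc_id: terms & allowed for doc_id, terms in kept_docs.items()}


# Hypothetical tweets, already split on whitespace.
tweets = {
    1: ["#berlin", "#buzzwords", "talk"],
    2: ["#berlin", "search"],
    3: ["#a", "#b", "#c", "#d", "#e", "#f"],  # too many terms, dropped
    4: ["#buzzwords", "#rare"],
}
filtered = filter_field_data(tweets)
```

Both filters shrink what has to be held in memory: rare terms reduce the term list, and very long documents account for a disproportionate share of the document-to-term links.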
[Image slide © Buzzcapture, labeled: reference, reference, topic]
iculture   10,122
floor       8,998
cover       6,874
toy         4,402
ground      3,841

4.0         7,878
4.1         4,292
rtacties    4,078
jelly       2,905
bean        2,857