Boaz Leskes
June 04, 2013
Faceting analyzed fields with some sprinkles of probability theory
Talk given at Berlin Buzzwords 2013
Transcript
Faceting analyzed fields with some sprinkles of probability theory conjures trending topic analysis and other interesting insights
Boaz Leskes, Elasticsearch, @bleskes
Work done for Buzzcapture
Trending?
© Buzzcapture
reference reference topic © Buzzcapture
topic ≠ reference
topic reference
$P(w \mid T) = \dfrac{\lVert D_t \mid w \in D_t \rVert}{\lVert D_t \rVert}$
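Reading the formula above as "the fraction of topic documents that contain the word w", a minimal Python sketch of the computation could look like the following. The function name `doc_probability`, the sample documents, and the whitespace tokenization are assumptions made for illustration; they are not taken from the talk.

```python
from collections import Counter


def doc_probability(docs):
    """Return P(w|T) for every word w seen in `docs`.

    P(w|T) is taken here as: number of documents containing w,
    divided by the total number of documents in the set.
    """
    if not docs:
        return {}
    doc_freq = Counter()
    for doc in docs:
        # set() so that a word repeated inside one document counts once
        doc_freq.update(set(doc.split()))
    total = len(docs)
    return {word: count / total for word, count in doc_freq.items()}


# Hypothetical topic document set
topic_docs = ["quick brown fox", "brown dog", "quick quick dog", "brown"]
p_topic = doc_probability(topic_docs)
print(p_topic["brown"])  # 3 of the 4 documents contain "brown"
```

Running the same function over a reference document set gives the second distribution to compare against; how the talk scores the difference between the two is not stated in the transcript.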
[Figure: the terms brown, dog, fox, quick linked by arrows to the document IDs 2, 5, 6, 10, 12, 13]
In our index:
• Terms = 12GB
• “Arrows” = 41GB
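To make the terms-versus-arrows distinction concrete, here is a small Python sketch of an inverted index. It assumes that an "arrow" means one link between a term and a document that contains it; that reading, the function name `build_inverted_index`, and the sample documents are all assumptions, since the slide's actual term-to-document mapping is not recoverable from the transcript.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical documents keyed by document ID
docs = {
    2: "brown fox",
    5: "brown dog",
    6: "dog",
    10: "brown quick",
    12: "dog fox quick",
    13: "quick",
}
index = build_inverted_index(docs)

# One "arrow" per (term, document) pair
arrows = sum(len(ids) for ids in index.values())
print(len(index), "terms,", arrows, "arrows")
```

Even in this toy case there are more term-document links than distinct terms, which is the same shape as the 12GB versus 41GB figures above.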
{
  tweet: {
    type: "string",
    analyzer: "whitespace",
    fielddata: {
      filter: {
        regex: "^#.*",
        frequency: { min: 10 }
      }
    }
  }
}
Drop terms that occur too rarely
Drop docs with too many terms
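The effect of the two filtering ideas above can be sketched in plain Python. This is not how Elasticsearch implements field data filtering; it is an illustration that assumes `min` is an absolute document count and introduces a hypothetical `max_terms` cut-off for long documents. The names `filter_terms` and `drop_long_docs` are made up for this example.

```python
import re
from collections import Counter


def drop_long_docs(docs, max_terms):
    """Keep only documents with at most `max_terms` whitespace-separated terms."""
    return [text for text in docs if len(text.split()) <= max_terms]


def filter_terms(docs, pattern=r"^#.*", min_docs=10):
    """Return terms that match `pattern` and occur in at least `min_docs` documents."""
    regex = re.compile(pattern)
    doc_freq = Counter()
    for text in docs:
        # set comprehension: count each matching term once per document
        doc_freq.update({t for t in text.split() if regex.match(t)})
    return {term for term, n in doc_freq.items() if n >= min_docs}


# Hypothetical tweets
tweets = ["#bbuzz talk", "#bbuzz #es", "hello #es", "#rare #one #two #three"]
short = drop_long_docs(tweets, max_terms=3)
kept = filter_terms(short, min_docs=2)
print(sorted(kept))
```

Filtering before loading is what keeps the number of terms and term-document links held in memory manageable.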
reference reference topic © Buzzcapture
iculture   10,122
floor       8,998
cover       6,874
toy         4,402
ground      3,841

4.0         7,878
4.1         4,292
rtacties    4,078
jelly       2,905
bean        2,857