Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Ensemble Topic Modelling
Search
Leland McInnes
July 12, 2019
Research
1
490
Ensemble Topic Modelling
A short lightning talk on ensemble topic modelling with pLSA using the enstop package.
Leland McInnes
July 12, 2019
Tweet
Share
More Decks by Leland McInnes
See All by Leland McInnes
PyNNDescent: Fast Approximate Nearest Neighbors with Numba
lmcinnes
0
1k
Word and Document Embeddings
lmcinnes
0
160
Topological Data Analysis
lmcinnes
1
350
Learning Topology: topological methods for unsupervised learning
lmcinnes
2
3.6k
A Guide to Dimension Reduction
lmcinnes
3
1.4k
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
lmcinnes
2
2.7k
Other Decks in Research
See All in Research
20年前に50代だった人たちの今
hysmrk
0
170
LLMアプリケーションの透明性について
fufufukakaka
0
200
2026年3月1日(日)福島「除染土」の公共利用をかんがえる
atsukomasano2026
0
470
第66回コンピュータビジョン勉強会@関東 Epona: Autoregressive Diffusion World Model for Autonomous Driving
kentosasaki
0
500
2026年1月の生成AI領域の重要リリース&トピック解説
kajikent
0
860
YOLO26_ Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection
satai
3
170
「なんとなく」の顧客理解から脱却する ──顧客の解像度を武器にするインサイトマネジメント
tajima_kaho
10
7.1k
ForestCast: Forecasting Deforestation Risk at Scale with Deep Learning
satai
3
560
COFFEE-Japan PROJECT Impact Report(海ノ向こうコーヒー)
ontheslope
0
1.1k
世界モデルにおける分布外データ対応の方法論
koukyo1994
7
2k
業界横断 副業コンプライアンス調査 三者(副業者・本業先・発注者)におけるトラブル認知ギャップの構造分析
fkske
0
1.2k
SREはサイバネティクスの夢をみるか? / Do SREs Dream of Cybernetics?
yuukit
3
440
Featured
See All Featured
職位にかかわらず全員がリーダーシップを発揮するチーム作り / Building a team where everyone can demonstrate leadership regardless of position
madoxten
62
52k
Why Our Code Smells
bkeepers
PRO
340
58k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
17k
Exploring the relationship between traditional SERPs and Gen AI search
raygrieselhuber
PRO
2
3.7k
What's in a price? How to price your products and services
michaelherold
247
13k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.6k
Evolving SEO for Evolving Search Engines
ryanjones
0
170
How to Ace a Technical Interview
jacobian
281
24k
Exploring anti-patterns in Rails
aemeredith
2
290
Highjacked: Video Game Concept Design
rkendrick25
PRO
1
320
It's Worth the Effort
3n
188
29k
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
150
Transcript
Ensemble Topic Modelling Leland McInnes
Model a corpus of documents in terms of underlying “topics”
Topic Modelling as Matrix Factorization
None
None
None
None
LDA and pLSA are probabilistic matrix factorization methods
(Ensembles of) pLSA
Performance?
None
Quality?
None
Instability?
These are hard optimization problems
Topics vary from one run to another
What are the stable topics? Inspired by https://github.com/RaRe-Technologies/gensim/pull/2282
None
Each cluster represents a stable topic
None
• Greater stability • Determines number of topics automatically •
Embarrassingly parallel computation
Implementation
sklearn API
None
https://github.com/lmcinnes/enstop
pip install enstop