Ensemble Topic Modelling
Leland McInnes
July 12, 2019
A short lightning talk on ensemble topic modelling with pLSA using the enstop package.
Transcript
Ensemble Topic Modelling Leland McInnes
Model a corpus of documents in terms of underlying “topics”
Topic Modelling as Matrix Factorization
LDA and pLSA are probabilistic matrix factorization methods
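As a reminder of what "probabilistic matrix factorization" means here, the standard textbook form of pLSA can be written as follows. This formulation is not taken from the slides; the symbols (d for a document, w for a word, z for a topic, k for the number of topics, X, U, V for the matrices) are notation assumed for illustration.

```latex
% Each document's word distribution is a mixture of k topic distributions.
P(w \mid d) = \sum_{z=1}^{k} P(z \mid d)\, P(w \mid z)

% In matrix form: the row-normalised document-word matrix X (n x m)
% is approximated by a product of two row-stochastic matrices,
% U (n x k, document-topic weights) and V (k x m, topic-word weights).
X \approx U V, \qquad U_{dz} = P(z \mid d), \qquad V_{zw} = P(w \mid z)
```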
(Ensembles of) pLSA
Performance?
Quality?
Instability?
These are hard optimization problems
Topics vary from one run to another
What are the stable topics?
Inspired by https://github.com/RaRe-Technologies/gensim/pull/2282
Each cluster represents a stable topic
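The idea of pooling topics from many runs and keeping the clusters that recur can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how the topics are clustered, so the sketch swaps in a simple greedy grouping with a Hellinger-distance threshold. The names `hellinger` and `stable_topics`, the parameters `threshold` and `min_runs`, and the toy data are all hypothetical and are not the enstop implementation.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    return math.sqrt(
        0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    )


def stable_topics(runs, threshold=0.3, min_runs=2):
    """Pool topics from several runs and keep the clusters that recur.

    runs: one list of topic-word distributions per run.
    threshold, min_runs: illustrative knobs, assumed for this sketch.
    """
    clusters = []  # each cluster is a list of (run_index, topic) pairs
    for run_index, topics in enumerate(runs):
        for topic in topics:
            for cluster in clusters:
                # Compare against the cluster's first member (its representative).
                if hellinger(topic, cluster[0][1]) <= threshold:
                    cluster.append((run_index, topic))
                    break
            else:
                # No existing cluster is close enough: start a new one.
                clusters.append([(run_index, topic)])

    stable = []
    for cluster in clusters:
        # A cluster is "stable" if it was found in at least min_runs distinct runs.
        if len({run_index for run_index, _ in cluster}) >= min_runs:
            members = [topic for _, topic in cluster]
            # Average the members word by word to get one representative topic.
            stable.append([sum(col) / len(members) for col in zip(*members)])
    return stable


# Toy data: three runs over a four-word vocabulary, two topics per run.
runs = [
    [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]],
    [[0.0, 0.1, 0.3, 0.6], [0.6, 0.3, 0.1, 0.0]],
    [[0.7, 0.3, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]],
]
topics = stable_topics(runs)
print(len(topics))
```

In this toy input the uniform topic appears in only one run, so it is discarded, and the number of topics returned is the number of recurring clusters rather than a value chosen in advance.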
• Greater stability
• Determines number of topics automatically
• Embarrassingly parallel computation
Implementation
sklearn API
https://github.com/lmcinnes/enstop
pip install enstop