Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Ensemble Topic Modelling
Search
Leland McInnes
July 12, 2019
Research
1
450
Ensemble Topic Modelling
A short lightning talk on ensemble topic modelling with pLSA using the enstop package.
Leland McInnes
July 12, 2019
Tweet
Share
More Decks by Leland McInnes
See All by Leland McInnes
PyNNDescent: Fast Approximate Nearest Neighbors with Numba
lmcinnes
0
1k
Word and Document Embeddings
lmcinnes
0
150
Topological Data Analysis
lmcinnes
1
330
Learning Topology: topological methods for unsupervised learning
lmcinnes
2
3.6k
A Guide to Dimension Reduction
lmcinnes
3
1.4k
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
lmcinnes
2
2.6k
Other Decks in Research
See All in Research
AIグラフィックデザインの進化:断片から統合(One Piece)へ / From Fragment to One Piece: A Survey on AI-Driven Graphic Design
shunk031
0
570
[Devfest Incheon 2025] 모두를 위한 친절한 언어모델(LLM) 학습 가이드
beomi
2
1k
Nullspace MPC
mizuhoaoki
1
490
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation
satai
3
410
LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection
satai
3
150
国際論文を出そう!ICRA / IROS / RA-L への論文投稿の心構えとノウハウ / RSJ2025 Luncheon Seminar
koide3
10
6.3k
Remote sensing × Multi-modal meta survey
satai
4
640
CoRL2025速報
rpc
2
3.6k
思いつきが武器になる:研究というゲームを始めよう / Ideas Are Your Equipments : Let the Game of Research Begin!
ks91
PRO
0
110
音声感情認識技術の進展と展望
nagase
0
400
第二言語習得研究における 明示的・暗示的知識の再検討:この分類は何に役に立つか,何に役に立たないか
tam07pb915
0
400
その推薦システムの評価指標、ユーザーの感覚とズレてるかも
kuri8ive
1
270
Featured
See All Featured
The Language of Interfaces
destraynor
162
25k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
Done Done
chrislema
186
16k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
54k
GraphQLとの向き合い方2022年版
quramy
50
14k
Git: the NoSQL Database
bkeepers
PRO
432
66k
The Art of Programming - Codeland 2020
erikaheidi
56
14k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.6k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.8k
The Pragmatic Product Professional
lauravandoore
37
7.1k
Writing Fast Ruby
sferik
630
62k
Transcript
Ensemble Topic Modelling Leland McInnes
Model a corpus of documents in terms of underlying “topics”
Topic Modelling as Matrix Factorization
None
None
None
None
LDA and pLSA are probabilistic matrix factorization methods
(Ensembles of) pLSA
Performance?
None
Quality?
None
Instability?
These are hard optimization problems
Topics vary from one run to another
What are the stable topics? Inspired by https://github.com/RaRe-Technologies/gensim/pull/2282
None
Each cluster represents a stable topic
None
• Greater stability • Determines number of topics automatically •
Embarrassingly parallel computation
Implementation
sklearn API
None
https://github.com/lmcinnes/enstop
pip install enstop