Ensemble Topic Modelling
Leland McInnes
July 12, 2019
Research
A short lightning talk on ensemble topic modelling with pLSA using the enstop package.
Transcript
Ensemble Topic Modelling Leland McInnes
Model a corpus of documents in terms of underlying “topics”
Topic Modelling as Matrix Factorization
LDA and pLSA are probabilistic matrix factorization methods
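As a rough illustration of the factorization view, here is a minimal pLSA sketch fitted by EM in plain NumPy. It is not the enstop implementation; the function name `plsa`, its parameters, and the toy count matrix are assumptions made for this example. The document-word matrix is approximated by the product of a document-topic matrix P(z|d) and a topic-word matrix P(w|z).

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM (illustrative sketch, not the enstop code).

    counts: (n_docs, n_words) array of word counts.
    Returns P(z|d) with shape (n_docs, n_topics)
    and P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation; every row is a probability distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        resp = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both factors from count-weighted responsibilities.
        weighted = counts[:, None, :] * resp
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z


# Toy corpus: 4 documents over a 4-word vocabulary (made-up counts).
counts = np.array([
    [4, 3, 0, 0],
    [3, 5, 1, 0],
    [0, 0, 4, 4],
    [0, 1, 3, 5],
], dtype=float)

doc_topics, topic_words = plsa(counts, n_topics=2)
approx = doc_topics @ topic_words  # model estimate of P(w|d), one row per document
```

Each row of `approx` is a word distribution for one document, so the fit can be compared against the row-normalized counts. Changing `seed` changes the starting point and can change which solution EM reaches.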
(Ensembles of) pLSA
Performance?
Quality?
Instability?
These are hard optimization problems
Topics vary from one run to another
What are the stable topics?
Inspired by https://github.com/RaRe-Technologies/gensim/pull/2282
Each cluster represents a stable topic
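The idea of clustering topics across runs can be sketched as follows. This is a simplified stand-in, not the method enstop uses: the greedy threshold clustering, the Hellinger distance choice, and the names `stable_topics`, `threshold` and `min_runs` are all assumptions for illustration, and the "runs" are simulated by adding noise to known topics rather than by fitting a real model.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 means identical)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def stable_topics(runs, threshold=0.3, min_runs=None):
    """Pool topics from several runs and keep the groups that recur.

    runs: list of (n_topics, n_words) arrays whose rows are P(w|z).
    Returns an array with one averaged topic per stable cluster.
    """
    if min_runs is None:
        min_runs = len(runs) // 2 + 1  # must appear in a majority of runs

    clusters = []  # each cluster is a list of (run_index, topic_vector)
    for run_index, topics in enumerate(runs):
        for topic in topics:
            for cluster in clusters:
                # The first member acts as the cluster's representative.
                if hellinger(topic, cluster[0][1]) < threshold:
                    cluster.append((run_index, topic))
                    break
            else:
                clusters.append([(run_index, topic)])

    stable = []
    for cluster in clusters:
        distinct_runs = {run_index for run_index, _ in cluster}
        if len(distinct_runs) >= min_runs:
            mean = np.mean([topic for _, topic in cluster], axis=0)
            stable.append(mean / mean.sum())
    return np.array(stable)


# Simulated runs: two topics recur in every run, a third shows up in only two runs.
rng = np.random.default_rng(0)
core = np.array([
    [0.45, 0.45, 0.025, 0.025, 0.025, 0.025],
    [0.025, 0.025, 0.025, 0.025, 0.45, 0.45],
])
occasional = np.array([[0.025, 0.025, 0.45, 0.45, 0.025, 0.025]])

runs = []
for run_index in range(4):
    topics = np.vstack([core, occasional]) if run_index % 2 == 0 else core.copy()
    topics = topics + rng.uniform(0.0, 0.02, size=topics.shape)  # run-to-run jitter
    topics /= topics.sum(axis=1, keepdims=True)
    runs.append(topics[rng.permutation(len(topics))])  # topic order is arbitrary

result = stable_topics(runs)
```

In this sketch the number of stable topics is not set in advance; it falls out of how many clusters pass the `min_runs` filter. Each run is independent of the others, so the runs can be computed in parallel before the pooling step.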
• Greater stability
• Determines number of topics automatically
• Embarrassingly parallel computation
Implementation
sklearn API
https://github.com/lmcinnes/enstop
pip install enstop