Ensemble Topic Modelling
Leland McInnes
July 12, 2019
A short lightning talk on ensemble topic modelling with pLSA using the enstop package.
Transcript
Ensemble Topic Modelling Leland McInnes
Model a corpus of documents in terms of underlying “topics”
Topic Modelling as Matrix Factorization
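As a sketch of the matrix factorization view: a document-by-word count matrix is approximated by the product of a document-by-topic matrix and a topic-by-word matrix. The symbols X, W, H, n, m and k are notation assumed here for illustration; they do not appear on the slides.

```latex
X \approx W H,
\qquad X \in \mathbb{R}_{\ge 0}^{n \times m},
\quad W \in \mathbb{R}_{\ge 0}^{n \times k},
\quad H \in \mathbb{R}_{\ge 0}^{k \times m}
```

Here n is the number of documents, m the vocabulary size, and k the number of topics, with k much smaller than n and m.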
LDA and pLSA are probabilistic matrix factorization methods
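To make "probabilistic matrix factorization" concrete, here is a minimal sketch of pLSA fitted by EM, using only NumPy. It models P(word | doc) as the sum over topics of P(topic | doc) · P(word | topic), so both factor matrices have rows that are probability distributions. The function name `plsa`, its parameters, and the toy corpus are assumptions made for illustration; this is not the enstop implementation.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit pLSA by EM on a dense (n_docs, n_words) count matrix.

    Returns (doc_topic, topic_word): P(topic | doc) and P(word | topic),
    each with rows that sum to 1.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random positive initialisation, normalised row-wise.
    doc_topic = rng.random((n_docs, n_topics))
    doc_topic /= doc_topic.sum(axis=1, keepdims=True)
    topic_word = rng.random((n_topics, n_words))
    topic_word /= topic_word.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(topic | doc, word), shape (n_docs, n_topics, n_words).
        joint = doc_topic[:, :, None] * topic_word[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both factors from responsibility-weighted counts.
        weighted = counts[:, None, :] * resp
        topic_word = weighted.sum(axis=0)
        topic_word /= topic_word.sum(axis=1, keepdims=True) + eps
        doc_topic = weighted.sum(axis=2)
        doc_topic /= doc_topic.sum(axis=1, keepdims=True) + eps

    return doc_topic, topic_word


# Toy corpus (made up): two documents use words 0-1, two use words 2-3.
counts = np.array(
    [[4, 3, 0, 0],
     [5, 2, 0, 0],
     [0, 0, 3, 4],
     [0, 0, 2, 5]],
    dtype=float,
)
doc_topic, topic_word = plsa(counts, n_topics=2)

# The product approximates the row-normalised count matrix P(word | doc).
reconstruction = doc_topic @ topic_word
```

The dense three-dimensional responsibility array keeps the sketch short but scales poorly; a practical implementation would work on the sparse nonzero entries only.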
(Ensembles of) pLSA
Performance?
Quality?
Instability?
These are hard optimization problems
Topics vary from one run to another
What are the stable topics? Inspired by https://github.com/RaRe-Technologies/gensim/pull/2282
Each cluster represents a stable topic
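The idea of running the model several times and clustering the resulting topics can be sketched as follows. This is an illustration under stated assumptions, not the method used by enstop: the slides do not say which clustering algorithm or distance is used, so a simple greedy clustering under Hellinger distance stands in for it. All names (`fit_plsa_topics`, `stable_topics`, `radius`, `min_support`) and the toy data are hypothetical.

```python
import numpy as np


def fit_plsa_topics(counts, n_topics, seed, n_iter=200):
    """One pLSA run by EM; returns (n_topics, n_words) rows of P(word | topic)."""
    rng = np.random.default_rng(seed)
    doc_topic = rng.dirichlet(np.ones(n_topics), size=counts.shape[0])
    topic_word = rng.dirichlet(np.ones(counts.shape[1]), size=n_topics)
    for _ in range(n_iter):
        joint = doc_topic[:, :, None] * topic_word[None, :, :]
        weighted = counts[:, None, :] * joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        topic_word = weighted.sum(axis=0)
        topic_word /= topic_word.sum(axis=1, keepdims=True) + 1e-12
        doc_topic = weighted.sum(axis=2)
        doc_topic /= doc_topic.sum(axis=1, keepdims=True) + 1e-12
    return topic_word


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def stable_topics(counts, n_topics, n_runs=8, radius=0.3, min_support=0.5):
    """Cluster topics from several independent runs; keep well-supported clusters.

    A cluster is kept when it contains at least min_support * n_runs topics.
    Returns an array of shape (n_stable_topics, n_words), possibly with zero rows.
    """
    # Each run depends only on its seed, so the runs could execute in parallel.
    all_topics = np.vstack(
        [fit_plsa_topics(counts, n_topics, seed) for seed in range(n_runs)]
    )

    clusters = []  # each entry is a list of topic vectors
    for topic in all_topics:
        for members in clusters:
            if hellinger(topic, np.mean(members, axis=0)) < radius:
                members.append(topic)
                break
        else:
            clusters.append([topic])

    kept = [np.mean(m, axis=0) for m in clusters if len(m) >= min_support * n_runs]
    return np.array(kept).reshape(-1, counts.shape[1])


# Toy corpus (made up): two word blocks, deliberately fitted with three topics.
counts = np.array(
    [[6, 3, 1, 0, 0, 0],
     [6, 3, 1, 0, 0, 0],
     [0, 0, 0, 2, 5, 3],
     [0, 0, 0, 2, 5, 3]],
    dtype=float,
)
stable = stable_topics(counts, n_topics=3)
```

The number of rows in `stable` is decided by the clustering rather than fixed in advance, which is the sense in which an ensemble can choose the number of topics. The `radius` and `min_support` thresholds are arbitrary choices for this toy example.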
• Greater stability
• Determines number of topics automatically
• Embarrassingly parallel computation
Implementation
sklearn API
https://github.com/lmcinnes/enstop
pip install enstop
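A minimal usage sketch, assuming the package exposes an `EnsembleTopics` class with an `n_components` parameter and a scikit-learn style `fit` method, as described in the project README at the time of writing. Check these names against the installed version. The helper `fit_ensemble_topics` is a hypothetical wrapper, not part of enstop.

```python
def fit_ensemble_topics(documents, n_topics=20):
    """Vectorize raw text and fit an ensemble topic model.

    Assumes enstop provides EnsembleTopics(n_components=...) with a
    scikit-learn style fit(); verify against the installed version.
    Returns the fitted model and the vectorizer used to build the counts.
    """
    # Imported here so that defining this helper does not require the packages.
    from sklearn.feature_extraction.text import CountVectorizer
    from enstop import EnsembleTopics

    vectorizer = CountVectorizer()
    counts = vectorizer.fit_transform(documents)  # sparse document-by-word counts
    model = EnsembleTopics(n_components=n_topics).fit(counts)
    return model, vectorizer
```

Following scikit-learn conventions, the fitted topics would then be expected in `model.components_`, one row per topic over the vocabulary given by `vectorizer.get_feature_names_out()`.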