Slide 1

Slide 1 text

Topic Models
Sorami Hisamoto
AS^2 LT, April 18, 2015

Slide 2

Slide 2 text

What is topic modeling?

๏ Modeling the latent “topics” of each data item.
๏ Originally a method for text, but not limited to text.

Figure from [Blei+ 2003]: data (e.g. a document) and topics (e.g. word distributions).

Slide 4

Slide 4 text

Quantifying Diversity Based on Topic Models

Slide 15

Slide 15 text

History of topic models

๏ Matrix decompositions: LSI, SVD, …
๏ 1999: pLSI
๏ 2003: LDA
  Same method independently found in population genetics [Pritchard+ 200]
๏ 2003-: Extensions of LDA
๏ 2007-: Scalable algorithms

Slide 17

Slide 17 text

latent Dirichlet allocation (LDA) [Blei+ 2003]

๏ “Document” is a set of “Words”.
๏ “Document” consists of multiple “Topics”.
๏ “Topic” is a distribution over a vocabulary (all possible words).
๏ “Words” are generated by “Topics”.

Two-stage generation process for each document:
1. Randomly choose a distribution over topics.
2. For each word in the document:
  a) Randomly choose a topic from the distribution over topics chosen in step 1.
  b) Randomly choose a word from the corresponding topic.
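
The two-stage generation process above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not code from the talk: the toy vocabulary, the two hand-written topics, the function names, and the symmetric Dirichlet parameter `alpha` are all assumptions made for the example.

```python
import random

def sample_dirichlet(rng, alphas):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [x / total for x in draws]

def generate_document(rng, topics, vocab, alpha, n_words):
    """Generate one document with the two-stage process of LDA."""
    # Step 1: randomly choose a distribution over topics for this document.
    theta = sample_dirichlet(rng, [alpha] * len(topics))
    words, assignments = [], []
    for _ in range(n_words):
        # Step 2a: choose a topic from the distribution chosen in step 1.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        # Step 2b: choose a word from the corresponding topic.
        w = rng.choices(vocab, weights=topics[k])[0]
        assignments.append(k)
        words.append(w)
    return theta, assignments, words

# Hypothetical toy vocabulary and topics (illustrative values only).
vocab = ["gene", "dna", "cell", "data", "model", "network"]
topics = [
    [0.40, 0.35, 0.20, 0.02, 0.02, 0.01],  # a "biology"-like topic
    [0.02, 0.02, 0.01, 0.35, 0.30, 0.30],  # a "computing"-like topic
]
rng = random.Random(0)
theta, assignments, words = generate_document(
    rng, topics, vocab, alpha=0.5, n_words=10)
print(theta)
print(list(zip(assignments, words)))
```

Each topic is a row of probabilities over the vocabulary, so a document whose `theta` favors the first topic tends to contain mostly "biology"-like words.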

Slide 25

Slide 25 text

Figure from [Blei 2011], with annotations:

๏ Topic: distribution over vocabulary
๏ Step 1: Choose a distribution over topics
๏ Step 2a: Choose a topic from distribution
๏ Step 2b: Choose a word from topic

Slide 31

Slide 31 text

Graphical model representation

Figures from [Blei 2011]. Node labels in the diagram: topic, topic proportion, topic assignment, observed word.

Joint probability of hidden and observed variables
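
The equation on this slide did not survive extraction. As a hedged reconstruction, the joint probability of LDA is commonly written as below. The symbols are assumptions following the usual notation: $\beta_k$ for the topics, $\theta_d$ for the topic proportion of document $d$, $z_{d,n}$ for the topic assignment of the $n$-th word, $w_{d,n}$ for the observed word, with $K$ topics, $D$ documents, and $N$ words per document.

```latex
p(\beta_{1:K}, \theta_{1:D}, z_{1:D}, w_{1:D})
  = \prod_{k=1}^{K} p(\beta_k)
    \prod_{d=1}^{D} p(\theta_d)
    \left( \prod_{n=1}^{N} p(z_{d,n} \mid \theta_d)\,
           p(w_{d,n} \mid \beta_{1:K}, z_{d,n}) \right)
```

Each factor corresponds to one node of the graphical model: topic, topic proportion, topic assignment, and observed word.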

Slide 40

Slide 40 text

Geometric interpretation

Figure from [Blei+ 2003], with annotations:

๏ Topic: a point in the word simplex
๏ Step 1: Choose a distribution over topics
๏ Step 2a: Choose a topic from distribution
๏ Step 2b: Choose a word from topic
๏ LDA: finding the optimal sub-simplex (spanned by the topics) to represent documents.
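
To make the geometric picture concrete, here is a small sketch under assumed toy numbers (a 3-word vocabulary and 2 topics, neither taken from the talk). A document's word distribution is a convex combination of the topics, so it lies in the sub-simplex they span, which in this case is a line segment inside the triangle of all word distributions.

```python
# Hypothetical example: two topics, each a point in the 3-word simplex.
topics = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

def document_word_distribution(theta, topics):
    """Mix the topics with weights theta: a point in the topic sub-simplex."""
    vocab_size = len(topics[0])
    return [sum(t * topic[v] for t, topic in zip(theta, topics))
            for v in range(vocab_size)]

theta = [0.25, 0.75]  # topic proportions of one document (step 1)
p_words = document_word_distribution(theta, topics)
print(p_words)       # a valid distribution over the 3 words
print(sum(p_words))  # sums to 1 up to floating-point error
```

Varying `theta` from `[1, 0]` to `[0, 1]` moves the document along the segment between the two topics; it can never leave that segment.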

Slide 41

Slide 41 text

“Reverse” the generation process

๏ We are interested in the posterior distribution:
  the latent topic structure, given the observed documents.
๏ But computing it exactly is difficult … → approximate:
  1. Sampling-based methods (e.g. Gibbs sampling)
  2. Variational methods (e.g. variational Bayes)
  etc…
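
As one example of the sampling-based route, the sketch below implements collapsed Gibbs sampling for LDA with the standard library only. It is a minimal illustration, not the talk's code and not how the tools listed later implement it: the function name, the hyperparameters `alpha` and `eta`, the iteration count, and the toy corpus of word ids are all assumptions.

```python
import random

def collapsed_gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, eta=0.01,
                        n_iters=100, seed=0):
    """Approximate the LDA posterior for docs given as lists of word ids."""
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    n_k = [0] * n_topics                                # total tokens per topic
    # Start from a random topic assignment for every token.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [(n_dk[d][j] + alpha) * (n_kw[j][w] + eta)
                           / (n_k[j] + vocab_size * eta)
                           for j in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    # Point estimates of topic proportions and topics from the final counts.
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in n_dk[d]]
             for d, doc in enumerate(docs)]
    beta = [[(c + eta) / (n_k[k] + vocab_size * eta) for c in n_kw[k]]
            for k in range(n_topics)]
    return theta, beta

# Hypothetical toy corpus: each document is a list of word ids in [0, 6).
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 4, 3]]
theta, beta = collapsed_gibbs_lda(docs, n_topics=2, vocab_size=6)
print(theta)
```

The sampler never draws the topic proportions or topics directly; it only resamples topic assignments and recovers `theta` and `beta` from the counts at the end, which is what "collapsed" refers to.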

Slide 45

Slide 45 text

Extensions of LDA

๏ Hierarchical Dirichlet Process [Teh+ 2005]
๏ Correlated Topic Models [Blei+ 2006]
๏ Supervised Topic Models [Blei+ 2007]
๏ Topic Models with Power-law using Pitman-Yor process [Sato+ 2010]
๏ Time-series:
  ๏ Dynamic Topic Models [Blei+ 2006]
  ๏ Continuous Time Dynamic Topic Models [Wang+ 2008]
  ๏ Online Multiscale Dynamic Topic Models [Iwata+ 2010]
๏ Various learning methods
๏ Various scaling algorithms
๏ Various applications
๏ …

Slide 46

Slide 46 text

Applications

๏ Text analysis
  Papers, blogs, classical texts …
๏ Video analysis
๏ Audio analysis
๏ Bioinformatics
๏ Network analysis
๏ …

Slide 47

Slide 47 text

Tools

๏ Gensim
  Python-based, Radim Řehůřek
๏ Mallet
  Java-based, UMass
๏ Stanford Topic Modeling Toolbox
  Java-based, Stanford

Slide 48

Slide 48 text

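References (1): books

๏ “Statistical Latent Semantic Analysis with Topic Models” (in Japanese), Issei Sato, 2015
๏ “Topic Models” (Machine Learning Professional Series, in Japanese), Tomoharu Iwata, 2015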

Slide 49

Slide 49 text

References (2): papers, videos, and articles

๏ [Blei&Lafferty 2009] Topic Models
๏ [Blei 2011] Introduction to Probabilistic Topic Models
๏ [Blei 2012] Review Articles: Probabilistic Topic Models
  Communications of the ACM
๏ [Blei 2012] Probabilistic Topic Models
  Machine Learning Summer School
๏ Topic Models by David Blei (video)
๏ What is a good explanation of Latent Dirichlet Allocation? - Quora
๏ The LDA Buffet is Now Open by Matthew L. Jockers
๏ [Sato 2012] My Bookmarks: Latent Topic Model (in Japanese)
๏ [Mochihashi & Ishiguro 2013] Probabilistic Topic Models (in Japanese)
  The Institute of Statistical Mathematics, FY2012 public lecture
๏ Links to the Papers Related to Topic Models by Tomonori Masada