
Topic Models - AS^2 LT

At BrainPad, Inc.

Sorami Hisamoto

April 17, 2015
Transcript

  1. What is topic modeling?
     • Modeling the latent “topics” of the data.
     • Originally a method for text, but not limited to text.
     [Figure from [Blei+ 2003]: data (e.g. a document) and topics (e.g. word distributions)]
  2. History of topic models
     • Matrix decompositions: LSI, SVD, …
     • 1999: pLSI
     • 2003: LDA (the same method was independently found in population genetics [Pritchard+ 2000])
     • 2003-: Extensions of LDA
     • 2007-: Scalable algorithms
  3. latent Dirichlet allocation (LDA) [Blei+ 2003]
     • A “document” is a set of “words”.
     • A “document” consists of multiple “topics”.
     • A “topic” is a distribution over the vocabulary (all possible words).
     • “Words” are generated by “topics”.
     Two-stage generation process for each document:
     1. Randomly choose a distribution over topics.
     2. For each word in the document:
        a) Randomly choose a topic from the distribution over topics chosen in step #1.
        b) Randomly choose a word from the corresponding topic.
     (A minimal code sketch of this generative process follows below.)
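The following is a minimal sketch of the two-stage generative process described above, written in Python with NumPy. The vocabulary, number of topics, and hyperparameter values are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["gene", "dna", "cell", "data", "model", "topic"]  # toy vocabulary (assumed)
V = len(vocab)      # vocabulary size
K = 2               # number of topics (assumed)
alpha = 0.5         # Dirichlet prior over per-document topic proportions (assumed)
eta = 0.1           # Dirichlet prior over per-topic word distributions (assumed)

# Each topic is a distribution over the vocabulary.
beta = rng.dirichlet([eta] * V, size=K)          # shape (K, V)

def generate_document(n_words=10):
    # Step 1: randomly choose a distribution over topics for this document.
    theta = rng.dirichlet([alpha] * K)           # shape (K,)
    words = []
    for _ in range(n_words):
        # Step 2a: choose a topic from the document's topic distribution.
        z = rng.choice(K, p=theta)
        # Step 2b: choose a word from that topic's word distribution.
        w = rng.choice(V, p=beta[z])
        words.append(vocab[w])
    return theta, words

theta, doc = generate_document()
print(theta)            # the document's topic proportions
print(" ".join(doc))    # the generated words
```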
  8. [Figure from [Blei 2011]] A topic is a distribution over the vocabulary.
     Step 1: Choose a distribution over topics.
     Step 2a: Choose a topic from that distribution.
     Step 2b: Choose a word from the chosen topic.
  10. Graphical model representation [Figures from [Blei 2011]]
      • Nodes: observed words, topic assignments, topic proportions, topics.
      • The model defines the joint probability of hidden and observed variables (written out below).
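The joint probability mentioned on this slide is not spelled out in the transcript; the following block restates the standard LDA factorization from [Blei 2011], with topics β, per-document topic proportions θ, topic assignments z, and observed words w (D documents, K topics, N_d words in document d).

```latex
\[
p(\beta_{1:K}, \theta_{1:D}, z_{1:D}, w_{1:D})
  \;=\; \prod_{k=1}^{K} p(\beta_k)\;
        \prod_{d=1}^{D} \Bigl( p(\theta_d)
        \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\,
                          p(w_{d,n} \mid \beta_{1:K}, z_{d,n}) \Bigr)
\]
```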
  14. Geometric interpretation [Figure from [Blei+ 2003]]
      • A topic is a point in the word simplex.
      • Step 1: Choose a distribution over topics.
      • Step 2a: Choose a topic from that distribution.
      • Step 2b: Choose a word from the topic.
      • LDA: finding the optimal sub-simplex to represent documents.
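As a brief gloss on the sub-simplex picture (an addition, not from the slides): marginalizing out the topic assignment, each document's word distribution is a convex combination of the K topic distributions, so every document lies in the sub-simplex spanned by the topics inside the full word simplex.

```latex
\[
p(w \mid d) \;=\; \sum_{k=1}^{K} \theta_{d,k}\, \beta_{k,w},
\qquad \theta_d \in \Delta^{K-1},\quad \beta_k \in \Delta^{V-1}
\]
```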
  19. “Reverse” the generation process
      • We are interested in the posterior distribution: the latent topic structure, given the observed documents.
      • But computing it exactly is difficult, so we approximate:
        1. Sampling-based methods (e.g. Gibbs sampling)
        2. Variational methods (e.g. variational Bayes)
        • etc.
      (A toy Gibbs sampling sketch follows below.)
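To make the sampling-based bullet concrete, here is a minimal collapsed Gibbs sampler for LDA. It is a toy sketch under assumed hyperparameters and a tiny hand-made corpus, not the inference procedure of any particular tool.

```python
import numpy as np

# Toy corpus: each document is a list of word ids (vocabulary of size V). Illustrative only.
docs = [[0, 1, 1, 2], [2, 3, 3, 4], [0, 1, 4, 4]]
V, K = 5, 2                 # vocabulary size, number of topics (assumed)
alpha, eta = 0.5, 0.1       # symmetric Dirichlet hyperparameters (assumed)
rng = np.random.default_rng(0)

# Count tables and random initial topic assignments.
ndk = np.zeros((len(docs), K))      # document-topic counts
nkw = np.zeros((K, V))              # topic-word counts
nk = np.zeros(K)                    # topic totals
z = []                              # z[d][i]: topic of word i in document d
for d, doc in enumerate(docs):
    zd = []
    for w in doc:
        t = rng.integers(K)
        zd.append(t)
        ndk[d, t] += 1; nkw[t, w] += 1; nk[t] += 1
    z.append(zd)

for sweep in range(200):            # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            # Remove this word's current assignment from the counts.
            ndk[d, t] -= 1; nkw[t, w] -= 1; nk[t] -= 1
            # Collapsed conditional p(z = k | all other assignments, words).
            p = (ndk[d] + alpha) * (nkw[:, w] + eta) / (nk + V * eta)
            t = rng.choice(K, p=p / p.sum())
            z[d][i] = t
            ndk[d, t] += 1; nkw[t, w] += 1; nk[t] += 1

# Point estimates of topic-word and document-topic distributions from the final sample.
beta_hat = (nkw + eta) / (nkw.sum(axis=1, keepdims=True) + V * eta)
theta_hat = (ndk + alpha) / (ndk.sum(axis=1, keepdims=True) + K * alpha)
print(np.round(beta_hat, 2))
print(np.round(theta_hat, 2))
```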
  23. Extensions of LDA
      • Hierarchical Dirichlet Process [Teh+ 2005]
      • Correlated Topic Models [Blei+ 2006]
      • Supervised Topic Models [Blei+ 2007]
      • Topic Models with Power-law using Pitman-Yor process [Sato+ 2010]
      • Time-series:
        • Dynamic Topic Models [Blei+ 2006]
        • Continuous Time Dynamic Topic Models [Wang+ 2008]
        • Online Multiscale Dynamic Topic Models [Iwata+ 2010]
      • Various learning methods
      • Various scaling algorithms
      • Various applications
      • …
  24. Applications
      • Text analysis: papers, blogs, classical texts, …
      • Video analysis
      • Audio analysis
      • Bioinformatics
      • Network analysis
      • …
  25. Tools
      • Gensim: Python-based, by Radim Řehůřek
      • Mallet: Java-based, UMass
      • Stanford Topic Modeling Toolbox: Java-based, Stanford
      (A minimal Gensim usage sketch follows below.)
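Since Gensim is listed above, here is a minimal usage sketch. The toy texts and parameter values (num_topics, passes) are illustrative assumptions; the Dictionary/doc2bow/LdaModel calls are Gensim's actual API.

```python
from gensim import corpora, models

# Toy, pre-tokenized corpus (illustrative only).
texts = [
    ["gene", "dna", "cell", "protein"],
    ["model", "topic", "word", "document"],
    ["dna", "cell", "gene", "expression"],
]

dictionary = corpora.Dictionary(texts)                 # word <-> id mapping
corpus = [dictionary.doc2bow(text) for text in texts]  # bag-of-words vectors

# Train LDA (Gensim uses online variational Bayes by default).
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2, passes=10)

for topic_id, words in lda.print_topics(num_topics=2, num_words=4):
    print(topic_id, words)

# Topic proportions of the first document.
print(lda.get_document_topics(corpus[0]))
```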
  26. References: papers, videos, and articles
      • [Blei & Lafferty 2009] Topic Models
        http://www.cs.princeton.edu/~blei/papers/BleiLafferty2009.pdf
      • [Blei 2011] Introduction to Probabilistic Topic Models
        https://www.cs.princeton.edu/~blei/papers/Blei2011.pdf
      • [Blei 2012] Review Articles: Probabilistic Topic Models, Communications of the ACM
        http://www.cs.princeton.edu/~blei/papers/Blei2012.pdf
      • [Blei 2012] Probabilistic Topic Models, Machine Learning Summer School
        http://www.cs.princeton.edu/~blei/blei-mlss-2012.pdf
      • Topic Models by David Blei (video)
        https://www.youtube.com/watch?v=DDq3OVp9dNA
      • What is a good explanation of Latent Dirichlet Allocation? - Quora
        http://www.quora.com/What-is-a-good-explanation-of-Latent-Dirichlet-Allocation
      • The LDA Buffet is Now Open, by Matthew L. Jockers
        http://www.matthewjockers.net/2011/09/29/
      • [Sato 2012] My Bookmark: Latent Topic Model (JSAI journal)
        http://www.ai-gakkai.or.jp/my-bookmark_vol27-no3/
      • [Mochihashi & Ishiguro 2013] Probabilistic Topic Models, Institute of Statistical Mathematics open lecture (FY2012)
        http://www.ism.ac.jp/~daichi/lectures/ISM-2012-TopicModels-daichi.pdf
      • Links to the Papers Related to Topic Models, by Tomonori Masada
        http://tmasada.wikispaces.com/Links+to+the+Papers+Related+to+Topic+Models