Interactive LDA

Quentin Pleplé

May 03, 2013

Transcript

  1. Latent Dirichlet Allocation. LDA discovers topics in a collection of documents: each topic k is a distribution ϕk over words (example words: obama, budweiser, baseball, speech, pizza). LDA tags each document d with topics: θd is a distribution over topics (example topics: politics, sports, news, economy, war).
  2. Model based on a generative process.
    for topic k = 1, ..., K do
      Draw a word distribution ϕk ∼ Dir(β)
    for document d = 1, ..., D do
      Draw a topic distribution θd ∼ Dir(α)
      for position i = 1, ..., N in document d do
        Draw a topic k ∼ Discrete(θd)
        Draw a word wdi ∼ Discrete(ϕk)
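
    A minimal Python sketch of this generative process (using numpy; the corpus sizes K, D, N, V and the hyperparameter values are illustrative, not from the slides):

      import numpy as np

      K, D, N, V = 5, 100, 50, 1000      # topics, documents, words per document, vocabulary size
      alpha, beta = 0.1, 0.1             # Dirichlet hyperparameters (illustrative)
      rng = np.random.default_rng(0)

      # Draw a word distribution phi_k ~ Dir(beta) for each topic k
      phi = rng.dirichlet(np.full(V, beta), size=K)

      docs = []
      for d in range(D):
          # Draw a topic distribution theta_d ~ Dir(alpha) for document d
          theta_d = rng.dirichlet(np.full(K, alpha))
          words = []
          for i in range(N):
              k = rng.choice(K, p=theta_d)     # draw a topic k ~ Discrete(theta_d)
              w = rng.choice(V, p=phi[k])      # draw a word w_di ~ Discrete(phi_k)
              words.append(w)
          docs.append(words)
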
  3. Inference. Recover the model parameters: the topics ϕk and the tagging θd of the documents.
    But because LDA is completely unsupervised and the function it optimizes ≠ human judgment, we get junk topics. Hence: live supervision.
  4. Interactive LDA. Repeat: run LDA until convergence, ask the user for feedback, move the current state.
    Repeatedly use live user feedback to kick LDA out of its local optimum.
  5. Previous work: [Andrzejewski et al., 2009], [Hu et al., 2013].
    Gibbs sampling: encode the user feedback (must-link, cannot-link) into a complex prior, then keep running the Gibbs sampler.
  6. Variational EM for LDA. The posterior p(ϕk, θd | corpus) is intractable, so approximate it using variational EM.
    Instead of having ϕk ∼ Dir(β), break the dependencies: ϕk ∼ Dir(λk), where λkw ≈ count of how many times word w has been drawn from topic k.
  7. Example. Corpus = “Pixel Cafe.” “The Pixel Cafe is a weekly seminar.”
          pixel  cafe  weekly  seminar
    λ1      2      2      0       0
    λ2      0      0      1       1
    In reality (with smoothing 0.01):
    λ1    1.81   1.91   0.11    0.21
    λ2    0.21   0.11   0.91    0.81
  8. Variational EM for LDA.
    Initialize the topics λ randomly.
    Repeat until convergence:
      E step: tag documents with topics, keeping the topics λ fixed.
      M step: update the topics λ according to the assignments from the E step.
    We can modify the topics λ between each epoch.
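
    A minimal sketch of this epoch loop in Python (numpy; e_step and m_step stand in for the standard variational updates and are placeholders, not implementations given on the slides):

      import numpy as np

      def run_em(corpus, K, V, e_step, m_step, n_epochs=100, seed=0):
          rng = np.random.default_rng(seed)
          lam = rng.gamma(1.0, 1.0, size=(K, V))   # initialize the topics lambda randomly
          for epoch in range(n_epochs):
              assignments = e_step(corpus, lam)    # E step: tag documents, lambda kept fixed
              lam = m_step(corpus, assignments)    # M step: update lambda from the assignments
              # lambda may be edited (e.g. by a user) here, between epochs
          return lam
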
  9. Remove words from topics. Given a word w and a topic k (illustrated on the slide by a toy λ matrix with row k and the column of word w marked).
    Procedure: λkw ← 0
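
    A minimal numpy sketch of this edit (the matrix values are toy numbers, not from the slide):

      import numpy as np

      def remove_word(lam, k, w):
          # Remove word w from topic k by zeroing its pseudo-count
          lam = lam.copy()
          lam[k, w] = 0.0
          return lam

      lam = np.array([[8., 3., 7.],
                      [5., 5., 2.]])
      lam = remove_word(lam, k=0, w=2)   # topic 0 can no longer generate word 2
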
  10. Delete topics. Given a topic k (illustrated by a toy λ matrix with row k marked).
    Procedure: remove row k from λ; number of topics K ← K − 1.
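
    A numpy sketch of this edit (toy values):

      import numpy as np

      def delete_topic(lam, k):
          # Remove row k from lambda; the number of topics K drops by one
          return np.delete(lam, k, axis=0)

      lam = np.array([[8., 3., 5.],
                      [7., 5., 2.]])
      lam = delete_topic(lam, k=1)       # K goes from 2 to 1
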
  11. Merge topics. Given two topics k1 and k2 (rows k1 and k2 of λ).
    Procedure: λk1 ← λk1 + λk2, then delete topic k2.
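
    A numpy sketch of this edit (toy values):

      import numpy as np

      def merge_topics(lam, k1, k2):
          # Add the pseudo-counts of topic k2 into topic k1, then delete row k2
          lam = lam.copy()
          lam[k1] += lam[k2]
          return np.delete(lam, k2, axis=0)

      lam = np.array([[8., 3., 5.],
                      [7., 5., 2.]])
      lam = merge_topics(lam, k1=0, k2=1)   # merged topic row is [15, 8, 7]
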
  12. Split topics. Given two sets of words W1 and W2 and a topic k1 (illustrated by a toy λ matrix with row k1, the new row k2, and the move fraction ρw marked).
    Procedure: add a row k2 to λ, then for each word w:
      λk2,w = ρw λk1,w
      λk1,w = (1 − ρw) λk1,w
    with ρw = p(w|W2) / Σi p(w|Wi).
    Number of topics K ← K + 1.
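
    A numpy sketch of this edit; how the probabilities p(w|Wi) are estimated from the word sets is not specified here, so the move fractions ρ are taken as a given vector:

      import numpy as np

      def split_topic(lam, k1, rho):
          # rho[w] = rho_w = p(w|W2) / sum_i p(w|Wi): fraction of word w's
          # pseudo-count moved from topic k1 into the new topic k2
          lam = lam.copy()
          new_row = rho * lam[k1]            # lambda_{k2,w} = rho_w * lambda_{k1,w}
          lam[k1] = (1.0 - rho) * lam[k1]    # lambda_{k1,w} = (1 - rho_w) * lambda_{k1,w}
          return np.vstack([lam, new_row])   # number of topics K <- K + 1

      lam = np.array([[1., 3., 5., 7., 5.]])
      rho = np.array([0.9, 0.8, 0.1, 0.0, 0.5])   # toy move fractions
      lam = split_topic(lam, k1=0, rho=rho)       # lam now has 2 rows
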
  13. Can we do this?
    Is the algorithm still correct? Yes: no assumption is made about λ between epochs.
    Is the model still optimal? Yes: after each user update, LDA converges again.
    Variational EM: initialize λ randomly; repeat: E step (tag docs), M step (update λ).
    Interactive LDA: repeat: λ = EMepochs(λ); user updates λ.
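
    A sketch of the interactive loop in Python (run_em_epochs and get_user_edits are placeholder names for the converged EM run and the user interaction, not from the slides):

      def interactive_lda(corpus, lam, run_em_epochs, get_user_edits, n_rounds=5):
          for _ in range(n_rounds):
              lam = run_em_epochs(corpus, lam)    # run LDA (variational EM) until convergence
              for edit in get_user_edits(lam):    # e.g. remove_word, delete_topic,
                  lam = edit(lam)                 # merge_topics, split_topic from above
          return lam
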