



Javier Honduvilla Coto

November 24, 2016


  1. Word2vec: Word Representation in Vector Space. Javier Honduvilla Coto

  2. What’s word2vec? • Vector representation of words • Uses neural networks (more on the training later) • Unsupervised • Published in 2013 by Google researchers and engineers • A companion C implementation was published with the paper
  3. Why? Image and video representations are pretty rich, usually built from humongous vectors of high dimensionality. Meanwhile, words are usually mapped to arbitrary IDs, such as the word itself, which say nothing about how similar two words are.
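The contrast above can be sketched in a few lines: an "arbitrary ID" encoding (one-hot) makes every pair of distinct words equally distant, while the dense vectors word2vec learns place related words close together. The vocabulary and vector values here are invented purely for illustration.

```python
vocab = ["stockholm", "sweden", "oslo", "norway"]

# "Arbitrary ID" representation: a one-hot vector per word.
# Every pair of distinct words is equally far apart, so this
# encoding carries no notion of similarity.
def one_hot(word):
    vec = [0.0] * len(vocab)
    vec[vocab.index(word)] = 1.0
    return vec

# Dense representation (the kind word2vec learns): low-dimensional
# real-valued vectors where related words end up near each other.
# These 2-D values are made up; real embeddings have hundreds of dims.
dense = {
    "stockholm": [0.9, 0.1],
    "sweden":    [0.8, 0.2],
    "oslo":      [0.1, 0.9],
    "norway":    [0.2, 0.8],
}

print(one_hot("sweden"))  # [0.0, 1.0, 0.0, 0.0]
```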
  4. Previous work • Count-based methods: the probability of a word occurring together with some neighbouring words • Predictive models: guess a word using nearby words’ vectors
  5. Cool things about this model • Continuous Bag of Words (CBOW): predict a word from its surrounding context words (good for smaller datasets) • Skip-Gram: predict the surrounding context words from an input word (good for bigger datasets) • Pretty good performance (100 billion words/day on a single box) • 33 billion words: 72% accuracy
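To make the CBOW/Skip-Gram distinction concrete, here is a minimal sketch of how a sentence gets sliced into training examples under each scheme. The sentence, window size, and function name are illustrative, not from the original talk.

```python
def training_pairs(tokens, window=2):
    """Yield (context, center) pairs for CBOW and (center, context)
    pairs for skip-gram, using a symmetric context window."""
    cbow, skipgram = [], []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        # CBOW: predict the center word from its whole context.
        cbow.append((tuple(context), center))
        # Skip-gram: predict each context word from the center word.
        skipgram.extend((center, c) for c in context)
    return cbow, skipgram

cbow, sg = training_pairs(["the", "king", "rules", "sweden"], window=1)
print(cbow[1])  # (('the', 'rules'), 'king')
```

Real training feeds these pairs to a shallow neural network; this sketch only shows the example-generation step.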
  6. (image-only slide)
  7. Example (distance to Sweden)
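A "distance to Sweden" style demo ranks the vocabulary by cosine similarity to a query word's vector. The tiny 2-D vectors below are hand-made stand-ins for learned embeddings, chosen so that Nordic countries cluster together.

```python
import math

# Invented toy embeddings; real word2vec vectors have hundreds of dims.
vectors = {
    "sweden":  [0.9, 0.1],
    "norway":  [0.8, 0.2],
    "denmark": [0.7, 0.3],
    "banana":  [0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def most_similar(word, topn=2):
    """Rank all other words by cosine similarity to `word`."""
    query = vectors[word]
    scores = [(w, cosine(query, v)) for w, v in vectors.items() if w != word]
    return sorted(scores, key=lambda p: p[1], reverse=True)[:topn]

print(most_similar("sweden"))
```

With these vectors, "norway" and "denmark" come out closest to "sweden", mirroring the kind of neighbour list the slide's demo shows.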

  8. (image-only slide)
  9. Vector operations!!
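The "vector operations" result usually refers to the famous analogy demo: vec(king) - vec(man) + vec(woman) lands near vec(queen). The tiny hand-made vectors below merely mimic that geometry so the arithmetic can be run end to end.

```python
import math

# Invented 2-D embeddings arranged so the analogy works by construction.
vectors = {
    "king":  [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man":   [0.1, 0.9],
    "woman": [0.1, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c),
    excluding the three query words themselves."""
    target = [x - y + z
              for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

print(analogy("king", "man", "woman"))  # queen
```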

  10. (image-only slide)
  11. (image-only slide)
  12. Appendix • Original paper: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf • Original implementation: https://code.google.com/archive/p/word2vec • Interesting JVM implementation: https://deeplearning4j.org/word2vec