Word2vec
Word Representations in Vector Space
Javier Honduvilla Coto
Slide 2
Slide 2 text
What’s word2vec?
● Vector representation of words (see the sketch after this list)
● Uses neural networks (more on the training later)
● Unsupervised
● Published in 2013 by Google researchers and engineers
● A companion C implementation was published with the paper
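As a minimal sketch of what this looks like in practice (not part of the talk; the gensim library and its 4.x parameter names are assumed, and the toy corpus and hyperparameters are invented):

# Minimal word2vec sketch with gensim (assumed installed: pip install gensim).
# The corpus and parameter values are invented for illustration.
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

# Training is unsupervised: only raw tokenized sentences are needed.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1)

# Every word in the vocabulary is now a dense vector.
print(model.wv["cat"])  # a 50-dimensional numpy array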
Slide 3
Slide 3 text
Why?
Image and video data already have rich representations: humongous,
high-dimensional vectors. Meanwhile, words are usually mapped to
arbitrary IDs, such as the word string itself.
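To make the contrast concrete (an illustration not present in the slides; all numbers are invented), an arbitrary ID says nothing about similarity, while a learned dense vector does:

# Hypothetical illustration: arbitrary IDs vs. dense vectors (numbers invented).
import numpy as np

vocab_ids = {"hotel": 17, "motel": 4212}  # 17 vs. 4212 encodes no relationship

# Learned embeddings place related words near each other.
hotel = np.array([0.8, 0.1, 0.5])
motel = np.array([0.7, 0.2, 0.5])
cosine = hotel @ motel / (np.linalg.norm(hotel) * np.linalg.norm(motel))
print(cosine)  # close to 1.0: the vectors encode that the words are related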
Slide 4
Slide 4 text
Previous work
● Counting-based methods: probability of a word occurring alongside some
neighbouring words (sketched after this list)
● Predictive models: guess a word using nearby words’ vectors
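As a rough sketch of the counting-based idea (not from the slides; the tiny corpus and window size are invented), a co-occurrence table simply tallies how often each word appears near each other word:

# Toy counting-based sketch: co-occurrence counts within a +/-1 word window.
# Corpus and window size are invented for illustration.
from collections import Counter

corpus = ["the cat sat", "the dog sat"]
counts = Counter()
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        for j in range(max(0, i - 1), min(len(tokens), i + 2)):
            if j != i:
                counts[(word, tokens[j])] += 1

print(counts[("the", "cat")])  # how often "cat" appears next to "the"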
Slide 5
Slide 5 text
Cool things about this model
● Continuous Bag of Words (CBOW): predict a word from its surrounding
context words (works well on small datasets)
● Skip-Gram: predict the surrounding context words from an input word
(works well on big datasets); both are shown in the sketch after this list
● Pretty good performance (100 billion words/day on a single machine)
● 33 billion words: 72% accuracy
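For reference (an illustration, not the talk's code), gensim exposes both architectures through its sg parameter; the corpus and hyperparameters below are invented:

# Sketch: choosing between the two architectures in gensim.
from gensim.models import Word2Vec

corpus = [["the", "cat", "sat", "on", "the", "mat"]]

cbow = Word2Vec(corpus, sg=0, vector_size=50, window=2, min_count=1)       # CBOW
skipgram = Word2Vec(corpus, sg=1, vector_size=50, window=2, min_count=1)   # Skip-Gram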
Slide 6
Slide 6 text
No content
Slide 7
Slide 7 text
Example (distance to Sweden)
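The slide itself is an image; a hedged sketch of how such a nearest-neighbour list could be produced with gensim, assuming the pretrained Google News vectors distributed alongside the original implementation:

# Sketch: nearest neighbours of "Sweden" by cosine similarity.
# Assumes the published pretrained vectors have been downloaded locally.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)
for word, similarity in wv.most_similar("Sweden", topn=10):
    print(word, similarity)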
Slide 8
Slide 8 text
No content
Slide 9
Slide 9 text
Vector operations!!
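The canonical illustration from the paper, king - man + woman ≈ queen, can be reproduced as vector arithmetic; wv is assumed to hold trained vectors, as in the previous sketch:

# Sketch: the analogy "king - man + woman ~= queen" as vector arithmetic.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# Expected on good vectors: [('queen', <similarity>)]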
Slide 10
Slide 10 text
No content
Slide 11
Slide 11 text
No content
Slide 12
Slide 12 text
Appendix
● Original paper:
http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
● Original implementation:
https://code.google.com/archive/p/word2vec
● Interesting JVM implementation:
https://deeplearning4j.org/word2vec