Slide 1

Word2vec
Word Representations in Vector Space
Javier Honduvilla Coto

Slide 2

What’s word2vec?
● Vector representation of words
● Uses neural networks (more on the training later)
● Unsupervised
● Published in 2013 by Google researchers and engineers
● A companion C implementation was published with the paper
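As a minimal sketch of what training looks like in practice, here is a toy run using the gensim library (an assumption of this sketch; the talk's companion implementation is the original C tool, and the corpus below is made up):

```python
# Minimal word2vec training sketch with gensim (not the original C tool).
# The corpus is a toy stand-in; real training needs far more text.
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["stockholm", "is", "the", "capital", "of", "sweden"],
]

# vector_size: dimensionality of the learned word vectors
# window: how many neighbouring words count as "context"
# min_count=1 keeps every word, since the toy corpus is tiny
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1)

print(model.wv["king"])  # a 50-dimensional dense vector for "king"
```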

Slide 3

Why?
Image and video representations are pretty rich: they are typically huge, high-dimensional vectors. Words, meanwhile, are usually mapped to arbitrary IDs (such as the word itself), which say nothing about how words relate to one another.
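To make the contrast concrete, here is a hypothetical sketch (all values invented for illustration): arbitrary IDs, or their one-hot equivalents, make every pair of words equally dissimilar, while dense vectors give graded similarity:

```python
import numpy as np

# Arbitrary IDs: "cat" and "dog" are as (un)related as "cat" and "carburetor".
vocab = {"cat": 0, "dog": 1, "carburetor": 2}

# One-hot encoding of an ID: the dot product of any two distinct words is 0.
one_hot = np.eye(len(vocab))
print(one_hot[vocab["cat"]] @ one_hot[vocab["dog"]])  # 0.0, always

# Dense vectors (invented numbers): similarity is graded.
dense = {
    "cat": np.array([0.8, 0.1]),
    "dog": np.array([0.7, 0.2]),
    "carburetor": np.array([-0.5, 0.9]),
}

def cos(a, b):
    # Cosine similarity between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(dense["cat"], dense["dog"]))         # high: related words
print(cos(dense["cat"], dense["carburetor"]))  # low: unrelated words
```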

Slide 4

Previous work
● Count-based methods: statistics of how often a word co-occurs with its neighbouring words
● Predictive models: predict a word from the vectors of nearby words
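As a rough illustration of the count-based side (a toy sketch, not any specific published method), co-occurrence counting looks like this:

```python
from collections import Counter

corpus = ["the", "king", "rules", "the", "kingdom"]
window = 1  # how far around each word we look

# Count how often each word occurs next to each neighbour.
cooc = Counter()
for i, word in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            cooc[(word, corpus[j])] += 1

print(cooc[("the", "king")])  # 1: "the" appears next to "king" once
```

Predictive models go the other way: instead of tallying counts, they learn vectors that are good at guessing a word from its neighbours. Word2vec is in this second family.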

Slide 5

Cool things of this model
● Continuous Bag of Words (CBOW): predict a word from its surrounding context words (works well on smaller datasets)
● Skip-Gram: predict the surrounding context words from an input word (works well on bigger datasets)
● Pretty good performance: around 100 billion words/day on a single box
● Trained on 33 billion words: 72% accuracy
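In gensim (again an assumption of this sketch; the original C tool exposes the same choice via a command-line flag), a single parameter switches between the two architectures:

```python
from gensim.models import Word2Vec

sentences = [["the", "king", "rules", "the", "kingdom"]]  # toy corpus

# sg=0 (the default): CBOW, predicts a word from its context.
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

# sg=1: skip-gram, predicts context words from an input word.
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
```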

Slide 7

Example (words closest to “Sweden” by cosine distance)
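The original deck showed this example as an image. A hedged sketch of reproducing it with a pretrained model loaded in gensim (the file path is hypothetical, and the model must contain the word "Sweden"):

```python
from gensim.models import KeyedVectors

# Hypothetical pretrained vectors in the original C tool's binary format.
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Cosine similarity ranks the neighbours; the classic word2vec result
# puts the other Scandinavian countries nearest to Sweden.
for word, similarity in wv.most_similar("Sweden", topn=5):
    print(word, similarity)
```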

Slide 9

Vector operations!!
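The canonical demonstration of these vector operations, which the following image slides presumably illustrated, is the analogy vector("king") - vector("man") + vector("woman") ≈ vector("queen"). Continuing the hedged gensim sketch from the previous slide:

```python
from gensim.models import KeyedVectors

# The same hypothetical pretrained vectors as in the previous sketch.
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# king - man + woman: positive words are added, negative words subtracted.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# classically the top result is "queen"
```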

Slide 12

Appendix
● Original paper: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
● Original implementation: https://code.google.com/archive/p/word2vec
● Interesting JVM implementation: https://deeplearning4j.org/word2vec