Contextualized word embeddings

BERT
● Trained with masked LM and next sentence prediction objectives
● BooksCorpus (0.8 billion words) + English Wikipedia (2.5 billion words)

ELMo
● Embeddings formed as a weighted sum of the outputs of its three layers
● One Billion Word Benchmark corpus (0.8 billion words)

Flair
● One Billion Word Benchmark corpus (0.8 billion words)
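The ELMo-style combination of layer outputs can be sketched in a few lines. Below is a minimal numpy illustration, with random tensors standing in for real layer activations; the function name, the softmax-normalized layer weights, and the `gamma` scale follow the ELMo formulation but are written here as assumptions, not as the library's actual API.

```python
import numpy as np

def elmo_style_embedding(layer_outputs, scalar_logits, gamma=1.0):
    """Combine per-layer representations into one embedding per token.

    layer_outputs: array of shape (num_layers, seq_len, dim)
    scalar_logits: unnormalized per-layer weights, shape (num_layers,)
    gamma: task-specific scale factor
    """
    # softmax-normalize the per-layer weights
    w = np.exp(scalar_logits - scalar_logits.max())
    w /= w.sum()
    # weighted sum over the layer axis, then scale
    return gamma * np.tensordot(w, layer_outputs, axes=1)

# toy example: 3 layers, 5 tokens, 4-dimensional representations
rng = np.random.default_rng(0)
layers = rng.normal(size=(3, 5, 4))
# equal logits give equal weights, i.e. a plain average of the layers
emb = elmo_style_embedding(layers, np.zeros(3))
print(emb.shape)  # (5, 4)
```

With zero logits the softmax weights are uniform, so the result is simply the mean of the three layer outputs; a downstream task would instead learn `scalar_logits` and `gamma` jointly with its own parameters.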