Pre-training + fine-tuning has become the standard in NLP research
› 2018 – 2020: ELMo, GPT, BERT, XLNet, T5, GPT-3
› Pre-training a language model + fine-tuning on the task => SOTA
› Competition on pretrained language models
› A high-quality language model can improve many downstream tasks: sentiment analysis, question answering, search ranking, …
› Increasing adoption in production (Google, Microsoft, etc.)
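To make the pre-train + fine-tune recipe concrete, here is a minimal sketch of fine-tuning a pretrained encoder on a sentiment-classification task with the Hugging Face transformers/datasets stack. The checkpoint name, dataset, and hyperparameters are illustrative assumptions, not the models or settings discussed in this talk.

```python
# Minimal sketch: fine-tune a pretrained language model on sentiment classification.
# Assumptions: transformers + datasets installed; checkpoint and dataset below are
# placeholders, not the LINE models described in the talk.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-multilingual-cased"   # placeholder pretrained LM
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Any text-classification dataset with "text"/"label" columns works here.
dataset = load_dataset("imdb")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
)

args = TrainingArguments(
    output_dir="finetuned-sentiment",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,
)
trainer.train()   # task-specific fine-tuning on top of the pretrained weights
```

The pretrained weights stay the same across tasks; only this final, comparatively cheap fine-tuning step changes per downstream application.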
Text Analysis, LINE AiCall, and more coming
› Active R&D: early study since 2018
› Production-grade, high-quality models: a large-scale model and a lightweight model
Data cleaning pipeline
Input: raw text
Extractor: document filter
• Cuts samples out of the data resource.
• Preprocessing: removes unnecessary tags (HTML etc.).
Sentence splitter / Preprocessor
• Extracts sentences from each sample.
• Preprocessing: removes/replaces special symbols/characters.
Sentence filter
• Determines whether a cleaned sentence can be used as a sample under various conditions (grammatical filter etc.).
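A minimal sketch of this kind of pipeline, assuming HTML-tagged raw documents and simple rule-based filters. The regexes, length thresholds, and function names are illustrative stand-ins, not the production filters behind the slide.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")                    # assumption: HTML-like tags to strip
CONTROL_RE = re.compile(r"[\u0000-\u001f\u007f]")  # control characters to remove

def extract_document(raw: str) -> str:
    """Extractor / document filter: cut a sample out of the raw resource
    and drop unnecessary markup."""
    return TAG_RE.sub(" ", raw)

def split_sentences(document: str) -> list[str]:
    """Sentence splitter / preprocessor: split on Japanese and Latin sentence
    enders, then normalize whitespace and special characters."""
    sentences = re.split(r"(?<=[。．！？!?\.])\s*", document)
    return [CONTROL_RE.sub("", s).strip() for s in sentences if s.strip()]

def keep_sentence(sentence: str) -> bool:
    """Sentence filter: keep only sentences that pass simple conditions
    (stand-ins for the grammatical and quality filters on the slide)."""
    return 10 <= len(sentence) <= 500 and not sentence.startswith("http")

def clean(raw: str) -> list[str]:
    """Raw text in, filtered training sentences out."""
    return [s for s in split_sentences(extract_document(raw)) if keep_sentence(s)]
```

Each stage mirrors a slide item: the extractor drops markup, the splitter yields normalized sentences, and the filter decides which sentences are clean enough to keep as training samples.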
› Distributed on NSML (on-prem GPU cluster) to scale up batch size and reduce training time
› Dynamic instance generation to reduce storage
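One common way to realize dynamic instance generation is to build masked-LM training instances on the fly inside the dataset object rather than pre-computing and storing them per epoch. The sketch below assumes that interpretation and uses PyTorch with BERT-style 80/10/10 masking; it is not necessarily how the LINE pipeline implements it.

```python
import random
import torch
from torch.utils.data import Dataset

class DynamicMLMDataset(Dataset):
    """Stores only tokenized sentences; masked-LM instances are generated
    per __getitem__ call, so no pre-built instance files need to be stored."""

    def __init__(self, token_id_lists, mask_token_id, vocab_size, mask_prob=0.15):
        self.examples = token_id_lists      # list of lists of token ids
        self.mask_token_id = mask_token_id
        self.vocab_size = vocab_size
        self.mask_prob = mask_prob

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        tokens = list(self.examples[idx])
        labels = [-100] * len(tokens)       # -100 = position ignored by the loss
        for i in range(len(tokens)):
            if random.random() < self.mask_prob:
                labels[i] = tokens[i]
                r = random.random()
                if r < 0.8:                  # 80%: replace with [MASK]
                    tokens[i] = self.mask_token_id
                elif r < 0.9:                # 10%: replace with a random token
                    tokens[i] = random.randrange(self.vocab_size)
                # remaining 10%: keep the original token
        return {
            "input_ids": torch.tensor(tokens),
            "labels": torch.tensor(labels),
        }
```

Because the masks are re-drawn every time an example is fetched, each epoch sees fresh instances from the same stored corpus, which is what removes the need to materialize instance files on disk.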
Evaluation benchmarks: diverse Japanese NLP tasks
› Tasks and metrics: sentiment analysis (Acc, F1), textual entailment (Acc, F1), named entity recognition (Acc, F1), reading comprehension (Acc, F1), question answering (Acc, EM, F1)
› Models compared: LINE model, Tohoku Univ, NICT-BPE, NICT-Word
› Reported scores, LINE model: 57.49, 57.27, 89.31, 89.46, 72.66, 70.81, 97.90, 71.99; other models: 83.75, 78.47, 79.49 (cell-by-cell alignment not recoverable from the slide text)
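For reference, the question-answering columns report EM and F1; a minimal sketch of the standard SQuAD-style definitions follows. It assumes whitespace-tokenized answers (for Japanese, character- or morpheme-level tokens would be typical) and is not necessarily the exact scoring script used for these benchmarks.

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """EM: 1.0 if the normalized prediction equals the reference exactly."""
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between predicted and reference answer spans."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```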