Tomohide Shibata
• Senior researcher at Yahoo! JAPAN Research
• Specialty: natural language processing
• Engaged in research on Japanese language analysis using deep learning, and in applying the latest natural language processing technology to our services
• Hobbies: shogi and go
Foundation models are trained with large-scale data and can be used for various tasks
- BERT [Devlin+ 18], GPT-3 [Brown+ 20], CLIP [Radford+ 21], ..
- Foundation models have made great progress in Natural Language Processing (NLP) and Computer Vision
"On the Opportunities and Risks of Foundation Models" https://arxiv.org/abs/2108.07258
BERT: Bidirectional Encoder Representations from Transformers
- Performs much better on a variety of NLP tasks
- Consists of two steps: pre-training and fine-tuning (a minimal fine-tuning sketch follows)
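To make the two-step recipe concrete, here is a minimal fine-tuning sketch with Hugging Face transformers (the backbone AutoFM builds on, per the roadmap later in this talk). The public model name, the two-query toy dataset, and the hyperparameters are illustrative assumptions, not the actual in-house setup.

```python
# Minimal sketch: fine-tune a pre-trained BERT for query categorization.
# Model name and data are illustrative, not the in-house configuration.
import datasets
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical labeled data: (query, category-id) pairs.
train = datasets.Dataset.from_dict({
    "text": ["amazon", "youtube"],
    "label": [0, 1],  # 0 = online shopping, 1 = video
})

tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese")
model = AutoModelForSequenceClassification.from_pretrained(
    "cl-tohoku/bert-base-japanese", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=32)

train = train.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           learning_rate=5e-5),
    train_dataset=train,
)
trainer.train()  # fine-tuning: only the classification head is new
```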
Two problems:
1. Training foundation models is not so easy
- Difficult for engineers who are not familiar with NLP (difficult even for NLP engineers)
2. Developing models separately in individual departments would be wasteful
→ To solve these problems, we are developing a platform called AutoFM
With AutoFM, users can perform fine-tuning just by submitting a job to our in-house model training system
https://techblog.yahoo.co.jp/entry/2021083130180585/

$ acloud laketahoe jobs submit training <job id> --config <config file>

The config file specifies hyperparameters such as the number of epochs and the learning rate, which are tuned to maximize accuracy on the validation set.
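The selection logic can be pictured as below; this is an illustrative sketch only, since the actual AutoFM internals are not public. The search space and the evaluate() helper are hypothetical.

```python
# Illustrative hyperparameter selection: try combinations of epoch count and
# learning rate and keep the setting that maximizes validation accuracy.
# evaluate() is a hypothetical stand-in for "fine-tune, then score on the
# validation set"; here it returns a dummy number so the sketch runs.
import random
from itertools import product

def evaluate(num_epochs: int, learning_rate: float) -> float:
    # Hypothetical: fine-tune with these hyperparameters and return
    # validation-set accuracy. Replaced by a dummy score here.
    return random.random()

best_acc, best_cfg = -1.0, None
for num_epochs, lr in product([2, 3, 4], [2e-5, 3e-5, 5e-5]):
    acc = evaluate(num_epochs, lr)
    if acc > best_acc:
        best_acc = acc
        best_cfg = {"num_epochs": num_epochs, "learning_rate": lr}
print("best config:", best_cfg, "validation accuracy:", best_acc)
```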
Categorizing web search queries is important for understanding user needs and issues
- Queries have to be categorized into person's name, product name, etc.
- Web search queries are long-tail → even low-frequency queries have to be categorized with high accuracy
- New words and jargon are created day by day
- BERT performs much better than conventional machine learning methods
Pre-training: web search logs (50M queries) → pre-trained model
- The model solves cloze tasks over queries (e.g., 「ヤフー ログイン」 "Yahoo login", ...) and learns their general meaning → no human labeling; takes about 15 days
Downstream task: query categorization → fine-tuned model
- Human labeling of about 30,000 queries, e.g., amazon → ネットショッピング (online shopping); youtube → 動画 (video); ソフトバンク → スマートデバイス, 企業・組織 (smart devices, companies/organizations); マリトッツォ → グルメ, レシピ・料理 (gourmet, recipes/cooking)
- Fine-tuning takes about 10 minutes; the resulting models are shared across departments
The search queries in this presentation are obtained within the scope of our privacy policy and processed in such a way that individuals cannot be identified.
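The cloze task corresponds to masked language modeling: tokens are hidden at random and the model learns to restore them from context, which is why no human labeling is needed. Below is a minimal sketch with Hugging Face transformers; the tokenizer choice, the file name queries.txt, and the model size are assumptions for illustration.

```python
# Sketch of cloze-style (masked language model) pre-training on search
# queries: random tokens are masked and the model learns to fill them in.
import datasets
from transformers import (AutoTokenizer, BertConfig, BertForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese")
model = BertForMaskedLM(BertConfig(vocab_size=tokenizer.vocab_size))

# Hypothetical corpus: one search query per line.
queries = datasets.load_dataset("text", data_files="queries.txt")["train"]
queries = queries.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=32),
    batched=True, remove_columns=["text"])

# The collator masks 15% of tokens on the fly; the labels are the
# original tokens, so the corpus itself is the supervision signal.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pretrained", num_train_epochs=1),
    train_dataset=queries,
    data_collator=collator,
)
trainer.train()
```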
Example: the new word マリトッツォ (maritozzo)
- The labeled data contains どら焼き → グルメ, レシピ・料理 (dorayaki → gourmet, recipes/cooking) but not マリトッツォ; only the core labeled data is given
- Queries around マリトッツォ: マリトッツォとは, マリトッツォ コンビニ, マリトッツォ レシピ, ゴディバ マリトッツォ, マリトッツォ カロリー, ... (https://ja.wikipedia.org/wiki/マリトッツォ)
- Queries around どら焼き: どら焼き レシピ, うさぎや どら焼き, どら焼き 有名, どら焼き お取り寄せ, コンビニ どら焼き, ... (https://ja.wikipedia.org/wiki/どら焼き)
- The two words appear in similar query contexts: マリトッツォ ≒ どら焼き
→ マリトッツォ can be categorized into グルメ and レシピ・料理: new words can be learned from web search logs (see the sketch below)
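The intuition can be checked by comparing the two words' embeddings: a model pre-trained on search logs should place マリトッツォ close to どら焼き because they occur in similar query contexts. The sketch below uses a public Japanese BERT and mean pooling as stand-ins; the in-house model and its pooling strategy are not public.

```python
# Illustrative similarity check between a new word and a known word.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese")
model = AutoModel.from_pretrained("cl-tohoku/bert-base-japanese")

def embed(text: str) -> torch.Tensor:
    # Mean-pool the last hidden states into one vector per text.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

sim = torch.cosine_similarity(embed("マリトッツォ"), embed("どら焼き"), dim=0)
print(f"similarity: {sim.item():.3f}")  # high value → similar contexts
```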
- Add models other than BERT (encoder-decoder models, sentence-vector learning, etc.)
- Backbone: Huggingface transformers
- Provide a Web interface → non-engineers can use AutoFM
- We have developed AutoFM for the training and inference of foundation models on our in-house AI platform
- We will continue to extend our system according to requests from several projects