
All We Need Is Prompting on a Pre-trained Japanese Large Language Model


LINE DEVDAY 2021

November 10, 2021



Transcript

  1. Toshinori Sato (@overlast) • Senior Software Engineer / Manager •

    Natural Language Processing • Information Retrieval • LINE CLOVA • Japanese NLU system • HyperCLOVA • Japanese Corpus / Evaluation • OSS: Main Contributor of NEologd project • mecab-ipadic-NEologd
  2. LINE NLP team and contributors Toshinori Sato Takashi Uemura Wataru

    Sakata Akifumi Nakamachi Kenta Shinzato Takuto Asakura Tatsuya Uchiyama Masahiko Higashiyama Tung Nguyen Shengzhe Li Koga Kobayashi Takato Yamazaki Seiichi Inoue Yoshifumi Kondo Jumon Nozaki et al.
  3. Attention, please ! The target audience is mainly engineers who

    are interested in natural language processing (NLP), and one part is aimed at NLP professionals. Omitted from this session is detailed information on the following: - Building language models - Tuning methods for language models. See the paper below for more information: https://arxiv.org/abs/2109.04650 (* What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers, Boseop Kim et al., EMNLP 2021). * 40-minute session. Please enjoy listening to it over a cup of coffee or something !
  4.–8. E.g. A spoken dialogue system applying HyperCLOVA. (Diagram, built up step by step across slides 4–8:) the user's voice ("Hello.") goes through Speech To Text into the Client App, which sends a query text to the Dialog App; the Dialog App queries HyperCLOVA (a large-scale language model, alongside a knowledge base and search) and receives a result text; the response text is returned to the Client App and rendered as sounds via Text To Speech ("Long time, no see.").
  9. Agenda - What’s HyperCLOVA - Inside of HyperCLOVA - Application

    development by Prompting - Evaluation of HyperCLOVA’s JP LMs - Application to Dialogue Systems - The future of LINE and NLP
  12. What is a Language Model (LM) ? - HyperCLOVA includes an

    unsupervised autoregressive language model - An autoregressive language model … - is capable of calculating probability distributions over word sequences - provides maximum likelihood estimation of parameters for a sample - can generate future text based on the text up to the present - In short, a model that gives the probability of a certain sequence of words - E.g. P(It’s sunny today) > P(Sunny is today) (a minimal scoring sketch follows below)
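To make "the probability of a sequence" concrete, here is a minimal scoring sketch. HyperCLOVA's models are not publicly downloadable, so this assumes "gpt2" from Hugging Face transformers purely as a stand-in causal LM:

```python
# Minimal sketch: scoring word sequences with an autoregressive LM.
# "gpt2" is only a stand-in; HyperCLOVA itself is not publicly available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_log_prob(text: str) -> float:
    """Sum of log P(token_i | tokens_<i) over the whole sequence."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean cross-entropy
        # over the predicted tokens; undo the mean to get a total log-prob.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# The well-formed sentence should receive the higher (less negative) score.
print(sequence_log_prob("It's sunny today") > sequence_log_prob("Sunny is today"))
```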
  13. Our policy for corpus data collection - We do not use

    any data from our conversation services: no messages on LINE and no posts on OpenChat. We maintain this corpus with the utmost consideration for the rights held by our various customers. - Add versatility to this corpus - Make a subset of this corpus available for use outside of LINE
  14. LINE LM Corpus (for HyperCLOVA’s LMs) - NO DATA from LINE or

    LINE OpenChat is used to build our LMs - Developed based on a corpus built for training the BERT models since 2019 - Used data crawled for LINE Search - Eliminated data that can be easily extracted as "non-public personal information" - Covered sites that are important for learning Japanese expressions - Purchased and used external content after resolving rights issues !
  15. Current status of LINE LM Corpus (for the 82B JP Model)

    Samples: 10B / Tokens: 500B / Bytes: 1.8T
  16. Modeling status of HyperCLOVA - JP Model: 1.3B → 6.7B → 13B →

    39B - Multi-lingual Model: 13B → 39B - Large JP / Multi-lingual model: 82B - Hyper-scale JP Model: 204B〜 (in 2022, work in progress)
  17. Agenda - What’s HyperCLOVA - Inside of HyperCLOVA - Application

    development by Prompting - Evaluation of HyperCLOVA’s JP LMs - Application to Dialogue Systems - The future of LINE and NLP
  18. Methods for applying LMs to a target task - Method

    for HyperCLOVA etc. (prompting; a minimal prompt-construction sketch follows below) - Few-shot: give “a description of a task” and “some demonstration cases” - One-shot: give “a description of a task” and “one demonstration case” - Zero-shot: give “a description of a task” only - Pros: may solve a task from brief instructions or short examples - Cons: may not reach the performance of a SOTA model achieved by fine-tuning - Method for BERT etc. - Fine-tuning: supervised learning on a dataset of a target task, starting from a general-purpose pre-trained model - Pros: excellent performance in benchmarks - Cons: needs training for each target task / possible loss of generalization ability
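As an illustration, a minimal sketch of assembling zero-/one-/few-shot prompts as plain strings. The "Q:/A:" labels and separators are assumptions for the example, not HyperCLOVA's actual format:

```python
# Minimal sketch: zero-/one-/few-shot prompts are just strings that
# concatenate a task description, demonstration cases, and the query.
def build_prompt(task_description, shots, query):
    parts = [task_description]
    for question, answer in shots:      # an empty list yields a zero-shot prompt
        parts.append(f"Q: {question}\nA: {answer}")
    parts.append(f"Q: {query}\nA:")     # the model completes this last line
    return "\n\n".join(parts)

few_shot = build_prompt(
    "Answer the question in one word.",
    [("What is the capital of Japan?", "Tokyo"),      # shot 1
     ("What is the capital of France?", "Paris")],    # shot 2
    "What is the capital of Korea?",
)
print(few_shot)
```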
  19.–31. Task Outlines and Few Shots for individual tasks, in the Playground (HyperCLOVA Studio). (Diagram, built up step by step across slides 19–31.) The TextField holds the Prompt, which consists of: a Task outline (a title describing the task, plus other information), Samples (one or more Shots), and a Query. The model's Output follows the prompt; in some cases the output is given as a suffix after inference. Slide 31 adds the iterative case: a previous output can be appended to the prompt so that the next output is generated as its continuation (a sketch of this loop follows below).
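A minimal sketch of that iterative pattern. `generate` is a hypothetical stand-in for an LM completion call, not the actual HyperCLOVA SDK:

```python
# Minimal sketch: iterative prompting, where each output is appended to
# the prompt before the next inference.
def generate(prompt: str) -> str:
    # Hypothetical stand-in for an LM completion API call.
    return " <generated continuation>"

def iterate(prompt: str, rounds: int) -> str:
    for _ in range(rounds):
        output = generate(prompt)
        prompt += output   # the previous output becomes part of the next query
    return prompt

print(iterate("Task outline + shots + query:", rounds=3))
```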
  32. Response to individual tasks with a task outline and few-shot examples: support for writing haiku with 2 shots.

    (The slide body is in Japanese.) The task outline says that a haiku is generated from a commentary. Two shots follow, each an IN (a commentary on the scene, season word, and mood of a famous Bashō haiku) paired with an OUT (the haiku itself): the "old pond and frog" haiku, and the "stillness and cicada" haiku composed at Risshakuji temple in Yamagata. The query IN then describes an engineer who, despite various worries, hopes to open up the future of AI technology by giving shape to it in a grand system-development discussion on a hot end-of-month afternoon; the model's OUT is roughly "my soul, buried in the hot sand of the moon". ← With a little editing, it becomes a haiku.
  33. Document generation using the Playground, e.g. product summary to description - Product packages always contain a product summary, but in

    many cases an advertising description is written out by a salesperson - In this demo, the parameters of HyperCLOVA Studio (Playground) are adjusted to generate an attractive description - From a food product title and a summary, HyperCLOVA Studio generates an attractive description to advertise it
  34. (Japanese demo, one-shot.) The prompt's Base is the task outline "Generate a description that makes you want to buy the product from its

    summary" together with one demonstration product; the Prefix is the query product's name and summary, and the Output is the generated description. The two products shown are a crab-flavored rice cracker ("すっぱ かにせん", a 60 g snack with its ingredient list, described with a subtly sour crab flavor, a texture that crumbles in the mouth, and a suggestion to enjoy it with drinks) and a bitter high-cacao chocolate (whose description praises the overwhelming bitterness, sharp aroma, and faint sweetness of its cacao polyphenols, and even suggests the large bag for a year-end-party punishment game). One-shot => increased the Temperature (randomness) and lowered the Repetition penalty (for control of repetition) to make the text contain the appeal (a parameter sketch follows below).
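A minimal sketch of those two generation knobs, using Hugging Face transformers' `generate` as a stand-in for HyperCLOVA Studio's parameters; the model and values are illustrative assumptions:

```python
# Minimal sketch: the Temperature / Repetition-penalty knobs from the demo,
# via Hugging Face transformers as a stand-in for HyperCLOVA Studio.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Product summary: crab-flavored rice crackers, 60g.\nDescription:"
ids = tokenizer(prompt, return_tensors="pt").input_ids

out = model.generate(
    ids,
    do_sample=True,          # sampling must be on for temperature to matter
    temperature=0.9,         # raised: more varied, livelier wording
    repetition_penalty=1.05, # closer to 1.0 penalizes repetition less,
                             # i.e. a "lowered" penalty as in the demo
    max_new_tokens=60,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```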
  35. Agenda - What’s HyperCLOVA - Inside of HyperCLOVA - Application

    development by Prompting - Evaluation of HyperCLOVA’s JP LMs - Application to Dialogue Systems - The future of LINE and NLP
  36. Subjective evaluation of a dialogue system using HyperCLOVA with the 6.7B/13B/39B

    JP models on 4 tasks: 1. Understanding of basic vocabulary 2. Tracking different multiple topics 3. Reacting to user sentiment on a topic 4. Free chatting. Annotations were made for all task/model combinations, with subjective evaluation by the same 5 annotators. Each session is N round-trip conversational pairs; the user receives a list of N topics for evaluation, and each session consumes one vocabulary item from the list. Conducted in the Playground.
  37. Common evaluation criteria for all tasks ! (Exception: the free-chat task did not evaluate the achievement of

    the goal.) Natural response - Q: Was it a natural reaction? Are there any breakdowns or inconsistencies with the history of the conversation? Following a topic - Q: Did it stay on topic? Did it lose track of the topic (in this case, of what it was being asked about)? Was it able to switch topics (in this case, to pull back to the previous question)? Providing a topic or asking a question - Q: Did it provide a topic? Was it able to get the speaker to talk during the answer (most likely not)? Achievement of goals - Q: Did it achieve the objective?
  38. 1. Understanding of basic vocabulary. (The slide lists Japanese test words under "elementary vocabulary" and "secondary

    vocabulary", including: elementary school, junior high school, hammer, pencil, adult, tulip, teacher, sunflower, lion, desk, giraffe, chair, train, shoes, car, sandals, sweater, apple, skirt, mandarin orange, cabbage, saury, cucumber, tuna, sparrow, harmonica, parakeet, piano, dragonfly, ant; source: a vocabulary list in the NINJAL repository, https://repository.ninjal.ac.jp/ ….) To do: ask HyperCLOVA questions in the Level 1 and Level 2 form for each vocabulary item.
  39. 1. Understanding of basic vocabulary - Does the system accurately answer

    word meanings (Level 1) and emotions (Level 2)?
                                              6.7B   13B    39B
    Natural response                          0.55   0.66   0.98
    Following a topic                         0.63   0.84   0.98
    Providing a topic or asking a question    0.00   0.01   0.00
    Achievement of goals                      0.55   0.68   0.84
  40. 2. Tracking different multiple topics. (Topic A / Topic B pairs, in Japanese:)

    COVID-19 / inbound tourism; Ichiro / Shohei Ohtani; state-of-emergency declaration / COVID-19 vaccine; AR (augmented reality) / autonomous-driving technology; YouTuber / VTuber; Leonardo da Vinci / Claude Monet; Heisei / Reiwa; the Internet / 5G; deflationary economy / super-aging society; overseas travel / domestic travel; electric vehicles / the Linear Chuo Shinkansen. To do: start a conversation about topic A and switch to topic B within 10 round trips.
  41. 2. Tracking different multiple topics - Evaluation: can the system move from topic A to topic

    B during a conversation?
                                              6.7B   13B    39B
    Natural response                          0.66   0.53   0.91
    Following a topic                         0.71   0.61   0.95
    Providing a topic or asking a question    0.04   0.01   0.02
    Achievement of goals                      0.66   0.55   0.91
  42. 3. Reacting to user sentiment on a topic. (Topic / sentiment A / sentiment B triples, in Japanese, e.g.:)

    COVID-19: "let's hang in there" / "I'm anxious"; inbound tourism: "it will come back" / "it won't come back"; the COVID-19 vaccine: "let's wait" / "when will it come?"; YouTuber: "I want to do it" / "I don't want to do it"; Shohei Ohtani: "I want him to play well" / "I want him to strike out"; AR (augmented reality): "it's interesting" / "I'm bored of it"; the super-aging society: "it will be fine" / "I'm worried"; overseas travel: "I want to go" / "I don't want to go"; electric vehicles and the Linear Chuo Shinkansen: "I want to ride it" / "I don't want to ride it". To do: have a 15-round-trip conversation about the topic, speaking with sentiment A at first, then with sentiment B.
  43. 3. Reacting to user sentiment on a topic - Evaluation: was the system able to agree with the user

    when he or she was feeling sentiment A about the topic?
                                              6.7B   13B    39B
    Natural response                          0.69   0.45   0.90
    Following a topic                         0.74   0.52   0.95
    Providing a topic or asking a question    0.04   0.02   0.03
    Achievement of goals                      0.68   0.46   0.90
  44. 3. Reacting to user sentiment on a topic - Evaluation: when a user had sentiment B feelings about a

    topic, could the system disagree?
                                              6.7B   13B    39B
    Natural response                          0.61   0.40   0.87
    Following a topic                         0.67   0.45   0.93
    Providing a topic or asking a question    0.09   0.02   0.03
    Achievement of goals                      0.46   0.36   0.50
  45. 4. Free chatting - Evaluation: facilitate a free dialogue with the system

                                              6.7B   13B    39B
    Natural response                          0.65   0.40   0.92
    Following a topic                         0.76   0.40   0.94
    Providing a topic or asking a question    0.12   0.04   0.09
    Achievement of goals                      -      -      -
  46. Summary: subjective evaluation of the 39B JP Model

                                              1. Basic     2. Multiple  3. Positive  3. Negative  4. Free
                                              vocabulary   topics       sentiment    sentiment    chatting
    Natural response                          0.978        0.908        0.908        0.872        0.925
    Following a topic                         0.984        0.952        0.951        0.930        0.935
    Providing a topic or asking a question    0.003        0.023        0.033        0.035        0.086
    Achievement of goals                      0.835        0.907        0.899        0.505        -
  48. Difficulties with the Japanese language - Multiple scripts, difficult to learn: Japanese speakers

    use Hiragana, Katakana, Kanji, Romaji, etc. to write a single document - Large amount of essential vocabulary: daily conversation requires over 8,000 words, and speakers need to know many homonyms, honorifics, and dialects - Omission of words: Japanese speakers may omit the subject and object of a sentence, and the omitted words may not be uniquely inferable
  49. Conducting joint research using HyperCLOVA - Providing HyperCLOVA’s APIs to universities, research institutes, and

    companies - Collaborating to dramatically improve system performance and to detect and eliminate bias in language models, together with: - Osaka University Graduate School - Tokyo Metropolitan University - Waseda University - We hope to collaborate with more research institutions and companies in the future
  50. Difficulties in text generation - Potential risks of generated text: the

    following technologies need to be developed - Improving the content bias of a corpus and its notation - Ensuring the truthfulness and security of an output text - Implementation of AI Ethics: various ethical considerations need to be taken into account for input and output texts - Toxicity - Sexual - Offensive - Profane - Intimidating - Attack on identity - Automation of intrinsic evaluation: we need metrics that can be applied to dynamic text-generation results - Accuracy of topical content - Consistency of generated text - Determination of achievement of objectives
  51. Automatic evaluation with the 39B JP Model for a QA task

    TASK: RCQA* possible only - Removed unanswerable questions from the dataset of the normal RCQA task. For each inference, a few-shot prompt was created by randomly extracting contexts from the RCQA possible-only dev set; each shot consists of a context, a question text, and an answer. If the correct answer was contained in, and easily extracted from, the inference result, we judged it correct (a sketch of this judgment follows below). * Reading-comprehension dataset with answerability annotations: http://www.cl.ecei.tohoku.ac.jp/rcqa/
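A minimal sketch of that evaluation loop. `generate` is a hypothetical stand-in for the HyperCLOVA API call, and the prompt field labels are assumptions:

```python
# Minimal sketch: few-shot QA evaluation with an "answer match" criterion.
import random

def generate(prompt: str) -> str:
    return "..."  # hypothetical stand-in for the HyperCLOVA API call

def build_prompt(shots, context, question):
    # Field labels are illustrative assumptions, not the actual format.
    lines = [f"Context: {c}\nQuestion: {q}\nAnswer: {a}" for c, q, a in shots]
    lines.append(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(lines)

def evaluate(dev_set, test_set, n_shots=2):
    correct = 0
    for context, question, answer in test_set:
        shots = random.sample(dev_set, n_shots)  # fresh shots per inference
        output = generate(build_prompt(shots, context, question))
        correct += answer in output  # answer match: gold string in output
    return correct / len(test_set)
```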
  52. Result of automatic evaluation with the 39B JP Model for the RCQA

    possible only task
    model / few-shot        shots  temperature  top_p  answer match
    6.7B / contextual       0      0.5          0.8    -
                            4      0.1          0.9    66.52
    13B / contextual        0      0.5          0.8    -
                            4      0.4          0.1    70.28
    39B / contextual        0      0.4          0.5    80.51
                            1      0.4          0.5    89.18
                            2      0.4          0.5    89.31
                            3      0.4          0.5    89.09
                            4      0.4          0.5    89.83
    39B / non-contextual    0      0.4          0.5    69.50
                            1      0.4          0.5    76.97
                            2      0.4          0.5    79.08
                            3      0.4          0.5    79.38
                            4      0.4          0.5    80.51
  53. HyperCLOVA’s LM vs BERT-large - TASK: RCQA possible only (removed unanswerable

    questions from the normal RCQA task)
                     test acc  test F1  memo
    HyperCLOVA       85.03     89.95    JP 39B, 2-shots, temperature=0.4, top_p=0.5
    BERT-jp-large    86.68     90.49    using a subset of the LINE LM corpus
    - It is possible that BERT can achieve higher results with fine-tuning on specific tasks
    - HyperCLOVA can achieve the same level of performance with prompting and a rough parameter search
  54. Agenda - What’s HyperCLOVA - Inside of HyperCLOVA - Application

    development by Prompting - Evaluation of HyperCLOVA’s JP LMs - Application to Dialogue Systems - The future of LINE and NLP
  55. HyperCLOVA allows for generic role-playing (starting with NO character set). Challenge: features other than smooth

    conversation and topic tracking. The conversation is smooth and the meaning of what is said is understood, but: - the truth of what it says should be verified before it responds - some responses are ambiguous (e.g. the temperature of the hot water during washing) - data bias has an effect (e.g. the persona was unsettled but became female) - the consistency of its persona is a bit suspect
  56. Is HyperCLOVA really necessary for NLP? - YES !! -

    If you're on a budget … - The history of NLP is strongly linked to the development of AI-related technologies: rule only → traditional ML only → small LM only → DNN → large-scale general-purpose LMs - LINE wants to move in the direction of building our own models and having customers use them
  57. Agenda - What’s HyperCLOVA - Inside of HyperCLOVA - Application

    development by Prompting - Evaluation of HyperCLOVA’s JP LMs - Application to Dialogue Systems - The future of LINE and NLP
  58. It has been 10 years since LINE was released to the public. FAQ from

    NLPers: Isn’t there a challenge left to tackle?
  59. FAQ from NLPers: Isn’t there a challenge left to tackle?

    Of course there is !! It’s not over yet !!
  60. Various issues related to HyperCLOVA • Building models and using

    those models to make inferences - the biggest challenge of all • Fine-tuning and other parameter-efficient transfer learning methods, as well as compact models • Responding to new topics/events that have arisen since a model was built • Implementing AI Ethics • Filtering according to the application and specifying the reason • Building a Web corpus • Removing duplicate data (see the sketch after this list) • Realizing accountability for each entry used • Responding to deletion requests on a URL/ID basis • Detecting and anonymizing personal information
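For the corpus-side items, a minimal sketch of exact-duplicate removal by hashing normalized text; the normalization rules are illustrative assumptions, and real pipelines typically add near-duplicate detection (e.g. MinHash) on top:

```python
# Minimal sketch: drop exact duplicates from a Web corpus by hashing
# normalized text. A first pass only; near-duplicate detection would follow.
import hashlib
import re

def normalize(text: str) -> str:
    # Illustrative normalization: collapse whitespace and lowercase.
    return re.sub(r"\s+", " ", text).strip().lower()

def dedupe(docs):
    seen = set()
    for doc in docs:
        digest = hashlib.sha1(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield doc

print(list(dedupe(["Hello  world", "hello world", "Another document"])))
# -> ["Hello  world", "Another document"]
```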
  61. LINE’s NLP journey is still in its early stages - Let’s

    take on the challenge together at LINE - LINE’s various services need essential improvements using NLP technology ! • Large-scale general-purpose LMs • “High Road” NLP • Information retrieval • String processing • Data creation • Evaluation tasks, etc.
  62. HyperCLOVA hands-on to be held during 2021 - Hands-on with HyperCLOVA

    Studio and its APIs for engineers - Please wait for information from LINE Developers (@LINE_DEV) - A Python SDK for using the HyperCLOVA API will be provided
  63. LINE’s LMs for OSS start in FY2021 - Of course, for

    models “other than” HyperCLOVA - Performance target: LINE’s LMs for OSS > other OSS LMs - Trained using a subset of the corpus for HyperCLOVA (the LINE LM Corpus) ! - We would like to update them a few times a year, if possible !!
  64. Summary - Updated the current status of HyperCLOVA in LINE

    - Reported on large-scale general-purpose LMs and prompting, using several topics as examples - There are cases where surprisingly high quality can be achieved - There are issues that cannot be solved ad hoc - At LINE, we can work on all layers of NLP R&D, not only for HyperCLOVA - Please stay tuned for the next NLP updates from LINE