
nlp-survey


A Survey of Natural Language Processing after BERT


KARAKURI Inc.

April 09, 2021

Transcript

  1. A Survey of Natural Language Processing after BERT — Shiro Takagi

  2. Goals of this survey
     1. I want to know the trends in pre-trained language models since BERT!
        • the latest landscape
        • which improvements look genuinely meaningful
     2. I want to know what NLP tasks have been proposed recently!
     → A broad but shallow tour of recent pre-trained language models and NLP tasks
  3. Today's agenda
     1. Trends in general-purpose pre-trained language models
     2. Trends in task-specific models
     3. Analyses of Transformers / rethinking evaluation
     4. Summary

  4. Part 1: Trends in general-purpose pre-trained language models

  5. Introduction

  6. Overview of pre-trained language models — https://github.com/thunlp/PLMpapers

  7. Overview of pre-trained language models [Qiu+ 2020 Pre-trained Models for Natural Language Processing: A Survey]
  8. Pre-training & Fine-tuning — figure contrasting the pre-training stage with the fine-tuning stage

  9. Self-attention [Cui+ EMNLP 2019] — figure: each token's Query is compared against all Keys, a Softmax over the scores weights the Values; the projections are W_Q, W_K, W_V
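To make the figure concrete: a minimal single-head scaled dot-product self-attention in NumPy. This is a generic sketch; the shapes and the names W_q, W_k, W_v follow the standard Transformer conventions, not anything specific to the cited paper.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model); each W_*: (d_model, d_k)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v            # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # how well each query matches each key
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)             # row-wise softmax over the keys
    return w @ V                                   # each token: weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                       # 5 tokens, d_model = 16
W_q, W_k, W_v = (rng.normal(size=(16, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)      # -> (5, 8)
```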
  10. GLUE [Wang+ ICLR 2019]
      • A benchmark for measuring general language-understanding ability
      • Grammaticality judgment, sentiment analysis, paraphrase detection, sentence similarity, duplicate-question detection, entailment, question answering, paraphrase judgment, and more

  11. SuperGLUE [Wang+ NeurIPS 2019] • A harder GLUE

  12. Autoencoding models

  13. BERT [Devlin+ NAACL 2019]
      • A Transformer-based pre-trained language model that uses bidirectional context
      • Two pre-training objectives: Masked Language Model and Next Sentence Prediction
  14. Masked Language Model — figure: BERT is fed 「武士道はその表徴たる桜花と同じく、日本の土地に固有の花である」 with 武士道 and 日本 masked out, and must predict the masked tokens from the [CLS]-prefixed input
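To make the objective concrete, here is a small sketch of BERT-style input corruption. The 15% masking rate and the 80/10/10 mask/random/keep split follow the BERT paper; the helper name and the -100 "ignore" label are illustrative conventions.

```python
import numpy as np

def mask_for_mlm(token_ids, mask_id, vocab_size, rng, p=0.15):
    ids = np.array(token_ids)
    labels = np.full_like(ids, -100)               # -100 = position not predicted
    chosen = rng.random(len(ids)) < p              # pick ~15% of positions
    labels[chosen] = ids[chosen]                   # the model must recover these
    roll = rng.random(len(ids))
    ids[chosen & (roll < 0.8)] = mask_id           # 80%: replace with [MASK]
    rand = chosen & (roll >= 0.8) & (roll < 0.9)   # 10%: replace with a random token
    ids[rand] = rng.integers(0, vocab_size, rand.sum())
    return ids, labels                             # remaining 10%: left unchanged

rng = np.random.default_rng(0)
ids, labels = mask_for_mlm(list(range(100, 120)), mask_id=103,
                           vocab_size=30522, rng=rng)
```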

  15. Next Sentence Prediction — figure: given the pair [CLS] 合理性はあくまであんたの世界でのルール [SEP] その縄じゃオレはしばれねえよ [SEP], BERT predicts whether the second sentence actually follows the first (YES / NO) [https://www.geeksforgeeks.org/understanding-bert-nlp/]
  16. MT-DNN [Liu+ ACL 2019]
      • Adds multi-task learning during fine-tuning, improving accuracy
      • Also enables more efficient domain adaptation than BERT

  17. SpanBERT [Joshi+ TACL 2019] • A masked language model that masks contiguous spans of tokens

  18. RoBERTa [Liu+ 2019]
      • A hyperparameter search over BERT
      • Trained with larger batch sizes, more data, and more steps
      • Drops next sentence prediction
      • SOTA on GLUE, SQuAD, and RACE
  19. DeBERTa [He+ ICLR 2021]
      • Adds absolute token-position information just before the softmax
      • Proposes disentangled attention, which embeds a word's content and its position as two separate vectors
      • Surpasses the human baseline on SuperGLUE
  20. Autoregressive models

  21. Autoregressive Language Model [http://peterbloem.nl/blog/transformers] [Yang+ NeurIPS 2019]

  22. GPT family [Radford+ 2018, Radford+ 2019, Brown+ 2020]
      • Autoregressive pre-trained language models
      • Known for astonishing generation quality from GPT-2 onward
      • Pioneered scaling laws for parameter and data counts
  23. XLNet [Yang+ NeurIPS 2019]
      • A pre-trained language model that exploits the advantages of both the autoregressive and autoencoding approaches
      • Trains by predicting tokens under permutations of the input sequence (figure contrasts the autoregressive and autoencoding factorizations)
  24. Seq2Seq models

  25. MASS [Song+ ICML 2019]
      • A pre-training method for encoder-decoder models
      • A masked language model whose output is a span of several tokens rather than one
  26. BART [Lewis+ ACL 2020]
      • A language model that uses bidirectional context while still being able to generate text
      • Documents can be corrupted with many kinds of noise
      • Strong performance on tasks such as summarization
  27. T5 [Raffel+ JMLR 2020]
      • Casts NLP tasks as text-to-text mappings
      • A pre-training approach that handles diverse NLP tasks uniformly
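Concretely, T5 turns every task into a string-to-string pair handled by one seq2seq model; the prefixes below are paraphrased from the examples in the T5 paper.

```python
# One model, one format: task prefix + input text -> output text.
examples = [
    ("translate English to German: That is good.", "Das ist gut."),
    ("cola sentence: The course is jumping well.", "not acceptable"),
    ("stsb sentence1: The rhino grazed. sentence2: A rhino is grazing.", "3.8"),
    ("summarize: <article text here>", "<summary here>"),
]
for src, tgt in examples:
    print(f"{src!r} -> {tgt!r}")
```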

  28. Prefix Language Model • Allows bidirectional context only within the prefix

  29. UniLM [Dong+ NeurIPS 2019]
      • Jointly trains unidirectional, bidirectional, and seq2seq language models
      • Achieves this with different attention masks that control which context each token may use (see the sketch below)
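A minimal sketch of how such masks look (function names and the 0/1 convention are mine): the same Transformer becomes a bidirectional encoder, a causal LM, or a prefix-LM/seq2seq model depending only on which positions each token may attend to.

```python
import numpy as np

def bidirectional_mask(n):
    return np.ones((n, n), dtype=int)              # every token sees every token

def causal_mask(n):
    return np.tril(np.ones((n, n), dtype=int))     # token i sees tokens 0..i only

def prefix_lm_mask(n, prefix_len):
    m = causal_mask(n)
    m[:prefix_len, :prefix_len] = 1                # full attention inside the prefix
    return m

print(prefix_lm_mask(5, prefix_len=2))
# Rows 0-1 (the prefix) attend bidirectionally within the prefix; rows 2-4
# attend causally, though each can still see the whole prefix.
```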
  30. On pre-training

  31. Masking [Rogers+ 2020 A Primer in BERTology: What We Know About How BERT Works]
  32. Next Sentence Prediction [Rogers+ 2020 A Primer in BERTology: What We Know About How BERT Works] [Shi+ ACL 2020 Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains]
  33. Pre-training Objectives [Liu+ 2020 A Survey on Contextual Embeddings]

  34. On fine-tuning

  35. Fine-tuning
      • Going deeper matters [Rogers+ 2020 A Primer in BERTology: What We Know About How BERT Works]
      • Two-stage pre-training [Pang+ 2019, Garg+ 2020, Arase & Tsuji 2019, Pruksachatkun+ 2020, Glavas & Vulic 2020]
      • Adversarial training [Zhu+ 2019, Jiang+ 2019]
      • Data augmentation [Lee+ 2019]
  36. Making models smaller

  37. Compressed Transformers (1/2) [Qiu+ 2020 Pre-trained Models for Natural Language Processing: A Survey]
  38. Compressed Transformers (2/2) [Rogers+ 2020 A Primer in BERTology: What We Know About How BERT Works]
  39. ALBERT [Lan+ ICLR 2020]
      • A lighter BERT via factorized embeddings and cross-layer parameter sharing: the V×H embedding table becomes a V×E table plus an E×H projection (figure: factorized embedding)
      • Proposes sentence-order prediction, a next-sentence-prediction variant with improved negative examples
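A back-of-the-envelope check of the factorization (the sizes below are typical BERT-base-like values, chosen purely for illustration):

```python
V, H, E = 30000, 768, 128    # vocab size, hidden size, reduced embedding size
naive = V * H                # one V x H table: 23,040,000 parameters
factorized = V * E + E * H   # V x E table + E x H projection: 3,938,304 parameters
print(naive / factorized)    # ~5.9x fewer embedding parameters
```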
  40. DistilBERT [Sanh+ 2019]
      • A distilled version of BERT
      • 40% smaller and 60% faster, with only a 3% drop in performance
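The heart of distillation is training the student against the teacher's softened output distribution. A minimal sketch of that loss term (DistilBERT's full objective also includes the usual MLM loss and a cosine-embedding loss, omitted here):

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T                                       # temperature softens the distribution
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    p_teacher = softmax(teacher_logits, T)          # soft targets from the teacher
    log_p_student = np.log(softmax(student_logits, T))
    return -(p_teacher * log_p_student).sum(axis=-1).mean() * T * T

s = np.array([[1.0, 0.5, -0.2]])
t = np.array([[2.0, 0.1, -1.0]])
print(distillation_loss(s, t))
```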

  41. TinyBERT [Jiao+ EMNLP 2020]
      • Likewise distills BERT: 1/7 the model size, 9× faster

  42. Q-BERT [Shen+ AAAI 2020]
      • Quantizes BERT
      • Uses the mean and variance of the Hessian eigenvalues to decide where precision can safely be reduced
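For reference, the generic uniform quantization step that such methods build on (a sketch only; Q-BERT's Hessian-guided mixed-precision bit allocation and group-wise quantization are not reproduced here):

```python
import numpy as np

def quantize(w, bits=8):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax                 # map the weight range onto an int grid
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale                                # store small ints plus one fp scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, s = quantize(w)
print(np.abs(w - dequantize(q, s)).max())          # small reconstruction error
```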

  43. Making computation more efficient

  44. Efficient Transformers [Tay+ 2020 Efficient Transformers: A Survey]

  45. Sparse Transformer [Child+ 2019] • Proposes attention restricted to local relations

  46. Longformer [Beltagy+ 2020]
      • Combines local attention within a fixed window with task-motivated global attention (sketch below)
      • Attention cost drops to linear in sequence length, making long documents tractable
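The local part is just a band-diagonal attention mask; a tiny sketch (the window size and helper name are illustrative):

```python
import numpy as np

def local_attention_mask(n, window):
    i = np.arange(n)
    # token i may attend to j only when |i - j| <= window // 2
    return (np.abs(i[:, None] - i[None, :]) <= window // 2).astype(int)

mask = local_attention_mask(8, window=4)
print(mask.sum(axis=-1))   # per-token neighborhood stays O(window), not O(n)
```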

  47. Big Bird [Zaheer+ NeurIPS 2020] • Combines random, local, and global attention

  48. Performer [Choromanski+ ICLR 2021]
      • Proposes linear-time attention with a theoretical guarantee that it stochastically and accurately approximates the original attention
      • Requires no sparsity assumptions and applies beyond softmax
  49. Reformer [Kitaev+ ICLR 2020]
      • Proposes attention that assigns nearby vectors to the same hash bucket (sketch below)
      • Reduces the O(N^2) attention computation to O(N log N), handling long documents
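A sketch of the angular-LSH bucketing idea (the bucket count and names are illustrative): random projections send nearby vectors to the same bucket with high probability, and attention is then computed only within buckets.

```python
import numpy as np

def lsh_buckets(x, n_buckets, rng):
    """x: (n, d). Project on random directions; argmax over [h, -h] is the bucket."""
    r = rng.normal(size=(x.shape[1], n_buckets // 2))
    h = x @ r
    return np.argmax(np.concatenate([h, -h], axis=-1), axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))
x[1] = x[0] + 0.01 * rng.normal(size=16)     # make x[1] a near-duplicate of x[0]
print(lsh_buckets(x, n_buckets=4, rng=rng))  # x[0] and x[1] share a bucket w.h.p.
```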
  50. Long Range Arena [Tay+ ICLR 2021]
      • A benchmark for language processing over long documents
      • A yardstick for comparing efficient Transformers
  51. ELECTRA [Clark+ ICLR 2020]
      • Proposes pre-training with an adversarially flavored objective (replaced-token detection) instead of a masked language model (sketch below)
      • Matches XLNet and RoBERTa with at most 1/4 of the compute
      • Outperforms GPT after 4 days of training on a single GPU
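A data-side sketch of replaced-token detection (in ELECTRA a small generator MLM proposes the replacements; plain random corruption stands in here to keep the sketch short):

```python
import numpy as np

def corrupt_for_rtd(token_ids, vocab_size, rng, p=0.15):
    ids = np.array(token_ids)
    replaced = rng.random(len(ids)) < p            # positions to corrupt
    ids[replaced] = rng.integers(0, vocab_size, replaced.sum())
    labels = replaced.astype(int)                  # 1 = replaced, 0 = original
    return ids, labels                             # the discriminator predicts labels

rng = np.random.default_rng(0)
ids, labels = corrupt_for_rtd(list(range(101, 121)), 30522, rng)
```

Note that, unlike MLM, every position yields a training signal (original vs. replaced), which is part of ELECTRA's sample-efficiency argument.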
  52. Making models larger

  53. Large Models [State of AI Report 2020 (https://www.stateof.ai/)]
      • Megatron-LM (8B parameters) [Shoeybi+ 2019]
      • Turing-NLG (17B parameters) [Microsoft 2020]
      • GPT-3 (175B parameters) [Brown+ 2020]
      [https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/]
  54. Using external knowledge

  55. THU-ERNIE [Zhang+ ACL 2019]
      • A pre-trained language model that incorporates a knowledge graph
      • Predicts knowledge-graph entities from BERT's embeddings

  56. KnowBERT [Peters+ EMNLP-IJCNLP 2019] • Contextualizes BERT's embeddings with entity embeddings

  57. K-BERT [Liu+ AAAI 2020] • Queries a knowledge graph first, then feeds the result through BERT

  58. REALM [Guu+ 2020] • Augments pre-training with information retrieval

  59. Part 2: Trends in models specialized for particular tasks

  60. Question answering

  61. SQuAD [Rajpurkar+ EMNLP 2016]
      • A dataset for question answering
      • The answer appears explicitly in the passage

  62. SQuAD 2.0 [Rajpurkar+ ACL 2018]
      • Extends SQuAD with questions that cannot be answered from the paragraph alone
      • Models must also judge which questions are unanswerable

  63. DROP [Dua+ NAACL 2019] • Questions that cannot be answered without combining information from multiple places in the paragraph

  64. QuAC [Choi+ EMNLP 2018]
      • A conversational question-answering dataset over Wikipedia articles
      • Requires contextual understanding, e.g. questions depend on the dialogue history
  65. CoQA [Reddy+ TACL 2019] • A conversational question-answering dataset

  66. HotpotQA [Yang+ EMNLP 2018] • A question-answering dataset requiring comprehension across multiple paragraphs

  67. Natural Questions [Kwiatkowski+ TACL 2019] • An open-domain QA dataset built from real Google search queries

  68. RACE [Lai+ EMNLP 2017]
      • A dataset from English exams in China
      • A benchmark for long-document reading comprehension

  69. Text generation

  70. GEM [Gehrmann+ 2021] • A benchmark for language generation tasks

  71. BLEURT [Sellam+ 2020] • Evaluation with a BERT pre-trained on noised Wikipedia and fine-tuned on human judgments

  72. Summarization

  73. ProphetNet [Qi+ EMNLP 2020] • Predicts up to N tokens ahead

  74. HIBERT [Zhang+ ACL 2019]
      • Extractive summarization with BERT
      • Uses document-level and sentence-level models to classify whether each sentence belongs in the summary

  75. DiscoBERT [Xu+ ACL 2020]
      • Extracts parts of sentences rather than whole sentences
      • Represents the discourse structure explicitly as a graph

  76. BART [Lewis+ ACL 2020] (repeated from earlier)
      • A language model that uses bidirectional context while still being able to generate text
      • Documents can be corrupted with many kinds of noise
      • Strong performance on tasks such as summarization
  77. BERTSum [Liu+ EMNLP 2019]
      • Extractive and abstractive summarization with BERT
      • Proposes two-stage fine-tuning for abstractive summarization

  78. PEGASUS [Zhang+ ICML 2020]
      • A pre-training method for abstractive summarization
      • Pre-trains by generating masked important sentences and the remaining text

  79. QAGS [Wang+ ACL 2020] • Evaluates summary quality by generating questions from the summary, answering them with both the source and the summary, and measuring their agreement

  80. Summarization by feedback [Stiennon+ NeurIPS 2020] • Reinforcement learning with human feedback as the reward

  81. Named entity recognition

  82. Named Entity Recognition [Li+ 2020 A Survey on Deep Learning for Named Entity Recognition]
  83. LUKE [Yamada+ EMNLP 2020]
      • Masked language modeling over both words and named entities
      • Proposes entity-aware attention that distinguishes token types (word vs. entity)
  84. BERT and named entities [Balasubramanian+ RepL4NLP 2020] • BERT is brittle to named-entity substitutions

  85. Document classification

  86. TopicBERT [Chaudhary+ 2020] • Combines topic modeling with BERT for more efficient document classification

  87. Part 3: Analyses of Transformers / rethinking evaluation

  88. Analyses of Transformer models

  89. The role of heads in multi-head attention
      • The patterns different heads learn are limited, but heads' impact on performance varies [Kovaleva+ EMNLP 2019]
      • Many heads do not affect performance, and head importance is determined early in training [Michel+ NeurIPS 2019]
      • Multi-head attention matters more for enc-dec attention than for self-attention [Michel+ NeurIPS 2019]
      • Heads in the same layer show similar patterns [Clark+ BlackBoxNLP 2019]
      • Some heads attend to linguistic phenomena such as syntax and coreference [Clark+ BlackBoxNLP 2019]
      • The lottery ticket hypothesis holds [Chen+ NeurIPS 2020]
  90. How representations differ across layers
      • Shallow layers learn generic representations; deep layers learn task-specific ones [Aken+ CIKM 2019], [Peters+ RepL4NLP 2019], [Hao+ EMNLP 2019]
      • Shallow layers capture token-level and local-context information, which fades in later layers [Lin+ BlackBoxNLP 2019], [Voita+ EMNLP 2019], [Ethayarajh+ EMNLP 2019], [Brunner+ ICLR 2020]
      • Deep layers capture longer-range dependencies and more semantic representations [Raganato+ 2018], [Vig+ BlackBoxNLP 2019], [Jawahar+ ACL 2019]
  91. BERT and multilinguality
      • Training on a single language can yield representations that generalize to multiple languages [Artetxe+ 2019]
      • Multilingual BERT does / does not learn language-universal representations [Libovicky+ 2019], [Singh+ ICLR 2019]
      • Parse trees are embedded in multilingual BERT's representations too [Chi+ ACL 2020]
      • Lexical overlap between languages is not what matters [Wang+ ICLR 2020]
  92. Recovering linguistic structure
      • Linguistic information is represented in separate semantic and syntactic subspaces [Coenen+ NeurIPS 2019]
      • Parse trees are embedded in both ELMo and BERT [Hewitt+ NAACL-HLT 2019]
      • BERT acquires syntactic representations with hierarchical structure [Goldberg 2019]
      • Contextual models learn good syntactic representations, but for semantics they differ little from non-contextual methods [Tenney+ ICLR 2019]
      • BERT acquires much grammatical knowledge, but with large variance [Warstadt+ EMNLP 2019]
      • (Japanese) BERT exploits word-order information [Kuribayashi+ ACL 2020]
  93. Weaknesses of Transformer models
      • Not robust to adversarial attacks [Jin+ AAAI 2020]
      • BERT exploits spurious correlations [Niven+ ACL 2019]
      • Fine-tuning on an intermediate task first can have adverse effects [Wang+ ACL 2020]
      • Weak at negation [Ettinger+ ACL 2019], [Kassner+ ACL 2020]
      • Lacks common-sense knowledge [Lin+ ACL 2020]
  94. exBERT [Hoover+ ACL 2020] • A tool for visualizing the representations a trained BERT has learned

  95. Rethinking evaluation

  96. SWAG [Zellers+ EMNLP 2018]
      • A benchmark for common-sense inference
      • Proposes Adversarial Filtering to reduce annotation biases
  97. HAMLET [Nie+ ACL 2020] • Proposes a process, from data collection through improvements informed by training results, for building language models that are not misled by spurious correlations

  98. CheckList [Ribeiro+ ACL 2020 (Best Paper)] • Model evaluation via black-box testing

  99. On problems with automatic evaluation metrics [Mathur+ ACL 2020] • Points out that when one of several machine-translation systems under comparison is an outlier, evaluating with automatic metrics becomes difficult

  100. Part 4: Summary

  101. References and further reading

  102. NLP-progress • Collects benchmarks and SOTA results for each NLP task [https://github.com/sebastianruder/NLP-progress]

  103. A Primer in BERTology [Rogers+ 2020]
       • A survey of empirical studies of what is going on inside BERT
       • Also summarizes BERT's current weaknesses [https://arxiv.org/pdf/2002.12327.pdf]
  104. References
       • [PLMpapers](https://github.com/thunlp/PLMpapers)
       • [Highlights of ACL 2020](https://medium.com/analytics-vidhya/highlights-of-acl-2020-4ef9f27a4f0c)
       • [BERT-related Papers](https://github.com/tomohideshibata/BERT-related-papers)
       • [ML and NLP Research Highlights of 2020](https://ruder.io/research-highlights-2020/)
       • [Tracing the history of document summarization (+ having BERT summarize documents)](https://qiita.com/siida36/items/4c0dbaa07c456a9fadd0)
       • [Trends in pre-trained language models](https://speakerdeck.com/kyoun/survey-of-pretrained-language-models)
       • [[NLP] A roundup of the BERT variants born in 2020](https://kai760.medium.com/nlp-2020%E5%B9%B4%E3%81%AB%E7%94%9F%E3%81%BE%E3%82%8C%E3%81%9Fbert%E3%81%AE%E6%B4%BE%E7%94%9F%E5%BD%A2%E3%81%BE%E3%81%A8%E3%82%81-36f2f455919d)
       • [The GPT-3 shock](https://deeplearning.hatenablog.com/entry/gpt3)
       • [Rogers+ 2020 A Primer in BERTology: What we know about how BERT works](https://arxiv.org/pdf/2002.12327.pdf)
       • [Tay+ 2020 Efficient Transformers: A Survey](https://arxiv.org/pdf/2009.06732.pdf)
       • [Qiu+ 2020 Pre-trained Models for Natural Language Processing: A Survey](https://arxiv.org/pdf/2003.08271.pdf)
       • [Liu+ 2020 A Survey on Contextual Embeddings](https://arxiv.org/pdf/2003.07278.pdf)
       • [Xia+ EMNLP 2020 Which *BERT? A Survey Organizing Contextualized Encoders](https://arxiv.org/pdf/2010.00854.pdf)
       • [Li+ IEEE Transactions on Knowledge and Data Engineering A Survey on Deep Learning for Named Entity Recognition](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9039685)
  105. Bonus

  106. GPT-3 [Brown+ 2020] [https://deeplearning.hatenablog.com/entry/gpt3]

  107. GPT-3 [Brown+ 2020] [https://twitter.com/sharifshameem/status/1283322990625607681?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1283322990625607681%7Ctwgr%5E%7Ctwcon%5Es1_&ref_url=https%3A%2F%2Fdeeplearning.hatenablog.com%2Fentry%2Fgpt3]

  108. GPT-3 [Brown+ 2020] [https://twitter.com/sh_reya/status/1284746918959239168?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1284746918959239168%7Ctwgr%5E%7Ctwcon%5Es1_&ref_url=https%3A%2F%2Fdeeplearning.hatenablog.com%2Fentry%2Fgpt3]

  109. DALL•E [OpenAI 2021]
       • Said to be a model that generates synthetic images following natural-language instructions
       • What looks most impressive is that it appears to have a grip on the compositionality of language [https://openai.com/blog/dall-e/]

  110. DALL•E [OpenAI 2021] [https://openai.com/blog/dall-e/]

  111. DALL•E [OpenAI 2021] [https://openai.com/blog/dall-e/]

  112.