Slide 8
Introduction: The Semantic Textual Similarity (STS) Task
• STS evaluation commonly uses STS12-16 [4-8], the STS Benchmark [9], and SICK-R [10]
 • All of these datasets consist of sentence pairs labeled with real-valued semantic similarity scores (see the scoring sketch below)
 • Score ranges: 0-5 for STS12-16 and the STS Benchmark, 1-5 for SICK-R
 • STS12-16 provides test sets only; the STS Benchmark has train / dev / test sets
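A minimal sketch of how such a dataset is typically scored: embed both sentences, take the cosine similarity, and correlate the predictions with the gold labels. The embed function below is a hypothetical stand-in for any sentence encoder, not the actual SimCSE evaluation code.

    # Minimal sketch of scoring an STS dataset with a sentence encoder.
    # `embed` is a hypothetical stand-in that maps a string to a vector.
    import numpy as np
    from scipy.stats import spearmanr

    def sts_spearman(pairs, gold_scores, embed):
        """pairs: list of (sent1, sent2); gold_scores: floats in 0-5 (1-5 for SICK-R)."""
        preds = []
        for s1, s2 in pairs:
            v1, v2 = embed(s1), embed(s2)
            preds.append(float(np.dot(v1, v2) /
                               (np.linalg.norm(v1) * np.linalg.norm(v2))))
        # Spearman rank correlation between cosine similarities and gold labels
        return spearmanr(preds, gold_scores).correlation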
• The STS Benchmark dev set is sometimes used for hyperparameter tuning
 • SimCSE uses it not only to tune hyperparameters such as the learning rate, but also to select the checkpoint used for evaluation: the model is evaluated every 250 training steps, and the best checkpoint is kept (a sketch of this loop follows)
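A sketch of this checkpoint-selection loop, assuming hypothetical helpers train_one_step, evaluate_stsb_dev, and save_checkpoint; the real SimCSE training code is organized differently.

    # Sketch of SimCSE-style checkpoint selection: evaluate on the STS-B dev
    # set every 250 training steps and keep the best checkpoint. All helper
    # functions here are hypothetical placeholders, not the SimCSE codebase.
    def train_with_selection(model, batches, num_steps,
                             train_one_step, evaluate_stsb_dev, save_checkpoint,
                             eval_every=250):
        best_score, best_path = float("-inf"), None
        for step in range(1, num_steps + 1):
            train_one_step(model, next(batches))
            if step % eval_every == 0:
                score = evaluate_stsb_dev(model)  # e.g., dev-set Spearman correlation
                if score > best_score:
                    best_score, best_path = score, save_checkpoint(model, step)
        return best_path  # checkpoint used for the final STS test evaluation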
• The STS evaluation protocol sometimes differs between papers, so caution is needed
 • e.g., simple vs. weighted averages of Spearman / Pearson (rank) correlations (contrasted in the sketch below)
 • Appendix B of the SimCSE paper describes these differences and is worth reading
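To make the aggregation difference concrete, here is a sketch contrasting a simple mean with a size-weighted mean of per-subset correlations; the correlation values and subset sizes are made up for illustration, not real dataset statistics.

    # Two aggregation schemes seen in the STS literature: a simple mean vs. a
    # size-weighted mean of per-subset correlations (e.g., across the subsets
    # of one STS12-16 year). Values below are illustrative only.
    def simple_mean(corrs):
        return sum(corrs) / len(corrs)

    def weighted_mean(corrs, sizes):
        return sum(c * n for c, n in zip(corrs, sizes)) / sum(sizes)

    subset_corrs = [0.70, 0.65, 0.80]   # Spearman per subset (made up)
    subset_sizes = [750, 1500, 375]     # pairs per subset (made up)
    print(simple_mean(subset_corrs))                  # 0.7167 (unweighted)
    print(weighted_mean(subset_corrs, subset_sizes))  # 0.6857 (weighted)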
[4] Agirre+: SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity, *SEM ’12
[5] Agirre+: *SEM 2013 shared task: Semantic Textual Similarity, *SEM ’13
[6] Agirre+: SemEval-2014 Task 10: Multilingual Semantic Textual Similarity, SemEval ’14
[7] Agirre+: SemEval-2015 Task 2: Semantic Textual Similarity, English, Spanish and Pilot on Interpretability, SemEval ’15
[8] Agirre+: SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation, SemEval ’16
[9] Cer+: SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation, SemEval ’17
[10] Marelli+: A SICK cure for the evaluation of compositional distributional semantic models, LREC ’14