Slide 50
Slide 50 text
Recent Research
Experiments
Environment
• Dataset
• SST, MRPC, QQP, MNLI, QNLI. RTE
• Experiment
• 2о ݽ؛(3-layer, 6-layer) ߂ 3о ߑߨ(FT, KD, PKD)۽ ೯
• #1:FT(Fine-Tuning): ೞਤ k-layer۽ initialize റ Fine-tuning
• #2:KD(Knowledge Distillation): ੌ߈ੋ KD(output݅ ਊ)
• #3:PKD(Patient Knowledge Distillation): Intermediate layer + output ਊ
֤ޙҗ ࠺Ү೮ਸ ٸ, ਵ۽ ઑӘঀ ծ ࣻ