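Introduction to Machine Learning for Language Processing (言語処理のための機械学習入門), Section 5.4: Conditional Random Fields

Book: 言語処理のための機械学習入門 (Natural Language Processing Series)
Section 5.4: Conditional Random Fields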
http://amzn.to/2f79qd9

yellow-black

April 16, 2020

Transcript

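  1. Contents

    1. Formulating sequence labeling
    2. Relation to hidden Markov models (HMM)
       - Generative and discriminative models
    3. HMM
       - Problems with HMMs
    4. Conditional random fields (CRF)
       - Formulation
       - Prediction (inference)
       - Learning (parameter estimation)
         - Computing $\sum_{y \in \mathcal{Y}} \phi(x, y) P(y \mid x)$
         - Computing $P(y_{t-1}, y_t \mid x)$
         - Forward-backward algorithm
         - Computing $Z_{x,w}$

    高野 俊作, April 23, 2020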
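  3. Formulating sequence labeling

    Let $x$ be the given input sequence and $y$ the output sequence (the label sequence). Sequence labeling can then be viewed as the problem of statistically estimating, from training data $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_{|D|}, y_{|D|})\}$, a mapping (function) $x \to y$.

    Training data is never complete, and predictions on unseen data always behave probabilistically, so it is reasonable to work with the conditional distribution $P(y \mid x)$. Prediction is then formulated as finding the output sequence

    $$\hat{y} = \arg\max_{y \in \mathcal{Y}} P(y \mid x)$$

    Here $\mathcal{Y}$ is the set of candidate label sequences. If $Y$ is the set of labels that an element $y_t$ of the output sequence can take at position $t$, then $\mathcal{Y} = Y^T$, where $T$ is the number of elements in the sequence.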
  5. Relation to hidden Markov models (HMM)

    CRFs and HMMs are both methods used for sequence labeling. The main difference between them is that an HMM is a generative model, whereas a CRF is a discriminative model.
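  7. Relation to HMM / Generative and discriminative models: Generative models

    A generative model directly models the mechanism that generates the data, as the joint distribution $P(x, y)$. To use it for classification, compute $P(x) = \sum_{y} P(x, y)$ and then obtain $P(y \mid x)$ from

    $$P(y \mid x) = \frac{P(x, y)}{P(x)}$$

    • Drawback: when used for classification the model is indirect, so accuracy may be lower.
    • Advantage: because the distribution over $x$ and $y$ is available, the model can also be applied to generating synthetic data, outlier detection, and similar tasks.
    • Examples: HMM, GAN, VAE, LDA, etc.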
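
The conversion from a joint distribution to a conditional one can be sketched in a few lines. The joint table below is made up purely for illustration (the words, labels, and probabilities are assumptions, not taken from the slides); the two functions mirror $P(x) = \sum_y P(x, y)$ and $P(y \mid x) = P(x, y) / P(x)$.

```python
# Hypothetical joint distribution P(x, y) over two words and two labels.
joint = {
    ("run", "NOUN"): 0.1,
    ("run", "VERB"): 0.3,
    ("dog", "NOUN"): 0.5,
    ("dog", "VERB"): 0.1,
}


def marginal_x(x):
    """P(x) = sum over y of P(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional(y, x):
    """P(y | x) = P(x, y) / P(x)."""
    return joint[(x, y)] / marginal_x(x)


# Roughly 0.75 for this table: 0.3 / (0.1 + 0.3).
p_verb_given_run = conditional("VERB", "run")
```

Note that the table has to assign mass to every (x, y) pair even though classification only ever uses the ratios within one value of x.
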
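  8. Relation to HMM / Generative and discriminative models: Discriminative models

    A discriminative model directly models $P(y \mid x)$, the quantity used for classification.

    • Advantage: because the target is modeled directly, accuracy tends to be comparatively high.
    • It often requires fewer assumptions than a generative model.
    • Examples: CRF, logistic regression, softmax + DNN, etc.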
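  10. HMM

    An HMM is a generative model; that is, it models $P(x, y)$. Inference (prediction) takes the form

    $$\hat{y} = \arg\max_{y \in \mathcal{Y}} P(y \mid x) = \arg\max_{y \in \mathcal{Y}} \frac{P(x, y)}{P(x)}$$

    Since $P(x)$ does not depend on $y$, this becomes

    $$\hat{y} = \arg\max_{y \in \mathcal{Y}} P(y \mid x) = \arg\max_{y \in \mathcal{Y}} P(x, y)$$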
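  12. HMM: Problems with HMMs

    Two problems with HMMs can be identified.

    • Features cannot be designed flexibly. For example, an HMM cannot take the order of the words in the sequence into account. Likewise, it cannot exploit cues such as "a word ending in -ing is a gerund or a present-progressive verb."
    • (A general problem of generative models used for classification.) For prediction it would be enough to estimate $P(y \mid x)$ accurately, yet the model learns $P(x, y)$. In other words, it tries to estimate $P(y \mid x) \cdot P(x)$ and so spends extra effort learning $P(x)$. The parameter space therefore becomes huge, and unless the training data is both plentiful and homogeneous, learning $P(x, y)$ may not go well.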
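  15. Conditional random fields (CRF): Formulation

    A CRF is a kind of log-linear model [1]. Given data $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_{|D|}, y_{|D|})\}$, the probability of a class (here, a label sequence) is modeled as

    $$P(y \mid x) = \frac{1}{Z_{x,w}} \exp\bigl(w \cdot \phi(x, y)\bigr)$$

    Here $Z_{x,w}$ is the normalization constant, $w$ is the weight vector, and $\phi(x, y)$ is the feature function, which builds a feature vector from the data and the labels. (This is what allows features to be designed more flexibly than in an HMM.) Because

    $$Z_{x,w} = \sum_{y \in \mathcal{Y}} \exp\bigl(w \cdot \phi(x, y)\bigr)$$

    $P(y \mid x)$ is in fact a softmax function.

    [1] A log-linear model is a model whose logarithm is expressed as a linear combination in its parameters.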
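
As a minimal sketch of the formula above, the following code evaluates $P(y \mid x)$ by enumerating every label sequence. The label set `LABELS`, the two features inside `phi`, and the weights `w` are hypothetical choices made only for illustration; brute-force enumeration is used so that the softmax structure is visible, not because it is practical.

```python
import itertools
import math

LABELS = ["N", "V"]  # hypothetical label set Y


def phi(x, y):
    """Toy global feature vector (an assumption for illustration):
    [number of '-ing' words labeled V, number of positions labeled N]."""
    ing_v = sum(1.0 for word, label in zip(x, y) if word.endswith("ing") and label == "V")
    n_count = sum(1.0 for label in y if label == "N")
    return [ing_v, n_count]


def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))


def crf_probability(x, y, w):
    """P(y | x) = exp(w . phi(x, y)) / Z_{x,w}, with Z summed over all of Y^T."""
    candidates = itertools.product(LABELS, repeat=len(x))
    z = sum(math.exp(dot(w, phi(x, cand))) for cand in candidates)
    return math.exp(dot(w, phi(x, y))) / z


x = ["dogs", "running"]
w = [2.0, 0.5]
probs = {y: crf_probability(x, y, w) for y in itertools.product(LABELS, repeat=len(x))}
best = max(probs, key=probs.get)  # the highest-scoring label sequence
```

The probabilities in `probs` sum to one because every candidate shares the same $Z_{x,w}$, which is exactly the softmax normalization.
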
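  17. CRF / Prediction (inference): Prediction

    At prediction time,

    $$\hat{y} = \arg\max_{y \in \mathcal{Y}} P(y \mid x) = \arg\max_{y \in \mathcal{Y}} \frac{1}{Z_{x,w}} \exp\bigl(w \cdot \phi(x, y)\bigr) = \arg\max_{y \in \mathcal{Y}} \bigl(w \cdot \phi(x, y)\bigr)$$

    In general this is hard to solve quickly. Since $\mathcal{Y} = Y^T$, a sequence of 20 words with 10 part-of-speech tags already has $10^{20}$ candidates. CRFs therefore place a restriction on the feature function, described next.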
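
The arithmetic behind the $10^{20}$ figure is easy to check. For comparison, the sketch also counts the $(y_{t-1}, y_t)$ pairs that a dynamic-programming search over adjacent labels has to score, which is $T \cdot |Y|^2$; the function names are hypothetical.

```python
def num_label_sequences(num_labels: int, length: int) -> int:
    """Size of the candidate set Y^T: every position can take any label."""
    return num_labels ** length


def num_local_scores(num_labels: int, length: int) -> int:
    """Number of (y_{t-1}, y_t) pairs a dynamic program scores: T * |Y|^2."""
    return length * num_labels ** 2


exhaustive = num_label_sequences(10, 20)  # 10 tags, 20 words
dynamic = num_local_scores(10, 20)
print(exhaustive, dynamic)  # prints 100000000000000000000 2000
```
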
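  18. CRF / Prediction (inference): The locality restriction

    Under this restriction, information about the input sequence $x$ may always be used, but information about the output sequence $y$ is limited to adjacent labels. Features must be designed so that they satisfy this restriction:

    $$\phi_k(x, y) = \sum_{t} \phi_k(x, y_{t-1}, y_t)$$

    Here $k$ indexes the $k$-th element of $\phi$. The dimensionality of $\phi$ is a design choice: if it is set to $K$, then $K$ features can be used (and the dimensionality of $w$ changes accordingly).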
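  19. CRF / Prediction (inference): Viterbi algorithm

    Therefore,

    $$\hat{y} = \arg\max_{y \in \mathcal{Y}} \bigl(w \cdot \phi(x, y)\bigr) = \arg\max_{y \in \mathcal{Y}} \Bigl(w \cdot \sum_{t} \phi(x, y_{t-1}, y_t)\Bigr) = \arg\max_{y \in \mathcal{Y}} \Bigl(\sum_{t} w \cdot \phi(x, y_{t-1}, y_t)\Bigr)$$

    Once the objective has been reduced to this form, it can be solved quickly with the Viterbi algorithm.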
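
A minimal sketch of the Viterbi search over a sum of local scores is given below. The function `local_score` stands in for $w \cdot \phi(x, y_{t-1}, y_t)$; its features and weights are invented for illustration, positions are 0-based, and `BOS` plays the role of $y_0$. A brute-force search is included only to check the result on a tiny input.

```python
import itertools

LABELS = ["N", "V"]  # hypothetical label set Y
BOS = "BOS"


def local_score(x, t, prev, cur):
    """Hypothetical w . phi(x, y_{t-1}, y_t) at 0-based position t."""
    score = 0.0
    if cur == "V" and x[t].endswith("ing"):
        score += 2.0
    if cur == "V" and x[t] == "like":
        score += 1.5
    if prev == "N" and cur == "V":
        score += 1.0
    if prev == BOS and cur == "N":
        score += 0.5
    return score


def viterbi(x):
    # best[t][y]: best total score of any label prefix that ends in y at position t
    best = [{y: local_score(x, 0, BOS, y) for y in LABELS}]
    back = []
    for t in range(1, len(x)):
        scores, pointers = {}, {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: best[t - 1][p] + local_score(x, t, p, cur))
            pointers[cur] = prev
            scores[cur] = best[t - 1][prev] + local_score(x, t, prev, cur)
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))


def brute_force(x):
    def total(y):
        return sum(local_score(x, t, BOS if t == 0 else y[t - 1], y[t]) for t in range(len(x)))
    return list(max(itertools.product(LABELS, repeat=len(x)), key=total))


x = ["dogs", "like", "running"]
best_path = viterbi(x)  # ["N", "V", "V"] for these toy scores
```

The dynamic program only ever compares pairs of adjacent labels, which is why the locality restriction on the features is what makes it applicable.
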
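  21. CRF / Learning (parameter estimation): Objective function

    Because a CRF models $P(y \mid x)$, it is optimized directly on the training data $D$. As the objective function we take the conditional likelihood

    $$\prod_{(x_i, y_i) \in D} P(y_i \mid x_i; w)$$

    To find the parameters that maximize it, we solve

    $$\hat{w} = \arg\max_{w} \prod_{(x_i, y_i) \in D} P(y_i \mid x_i; w)$$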
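  22. CRF / Learning (parameter estimation): Log-likelihood

    $$
    \begin{aligned}
    L(w) &= \sum_{(x_i, y_i) \in D} \log P(y_i \mid x_i; w) \\
         &= \sum_{(x_i, y_i) \in D} \log \frac{1}{Z_{x_i,w}} \exp\bigl(w \cdot \phi(x_i, y_i)\bigr) \\
         &= \sum_{(x_i, y_i) \in D} \bigl(w \cdot \phi(x_i, y_i) - \log Z_{x_i,w}\bigr)
    \end{aligned}
    $$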
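  23. CRF / Learning (parameter estimation): Partial derivative with respect to the parameters

    $$
    \begin{aligned}
    \frac{\partial L(w)}{\partial w}
    &= \sum_{(x_i, y_i) \in D} \left( \frac{\partial}{\partial w}\, w \cdot \phi(x_i, y_i) - \frac{\partial}{\partial w} \log Z_{x_i,w} \right) \\
    &= \sum_{(x_i, y_i) \in D} \left( \phi(x_i, y_i) - \frac{1}{Z_{x_i,w}} \frac{\partial}{\partial w} Z_{x_i,w} \right) \\
    &= \sum_{(x_i, y_i) \in D} \left( \phi(x_i, y_i) - \frac{1}{Z_{x_i,w}} \sum_{y \in \mathcal{Y}} \frac{\partial}{\partial w} \exp\bigl(w \cdot \phi(x_i, y)\bigr) \right) \\
    &= \sum_{(x_i, y_i) \in D} \left( \phi(x_i, y_i) - \frac{1}{Z_{x_i,w}} \sum_{y \in \mathcal{Y}} \phi(x_i, y) \exp\bigl(w \cdot \phi(x_i, y)\bigr) \right) \\
    &= \sum_{(x_i, y_i) \in D} \left( \phi(x_i, y_i) - \sum_{y \in \mathcal{Y}} \phi(x_i, y) \frac{\exp\bigl(w \cdot \phi(x_i, y)\bigr)}{Z_{x_i,w}} \right) \\
    &= \sum_{(x_i, y_i) \in D} \left( \phi(x_i, y_i) - \sum_{y \in \mathcal{Y}} \phi(x_i, y) P(y \mid x_i) \right)
    \end{aligned}
    $$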
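
The final line says the gradient is the observed feature vector minus its expectation under the model. A rough numerical check of that identity, under toy assumptions, is sketched here: the features in `phi`, the data, and the weights are all made up, and the expectation is computed by brute-force enumeration rather than by any efficient method.

```python
import itertools
import math

LABELS = ["N", "V"]  # hypothetical label set Y


def phi(x, y):
    """Toy global feature vector (an assumption for illustration)."""
    ing_v = sum(1.0 for word, label in zip(x, y) if word.endswith("ing") and label == "V")
    n_count = sum(1.0 for label in y if label == "N")
    return [ing_v, n_count]


def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))


def log_likelihood(data, w):
    """L(w) = sum_i (w . phi(x_i, y_i) - log Z_{x_i, w})."""
    total = 0.0
    for x, y in data:
        z = sum(math.exp(dot(w, phi(x, c))) for c in itertools.product(LABELS, repeat=len(x)))
        total += dot(w, phi(x, y)) - math.log(z)
    return total


def gradient(data, w):
    """Observed features minus expected features under P(y | x_i)."""
    grad = [0.0] * len(w)
    for x, y in data:
        candidates = list(itertools.product(LABELS, repeat=len(x)))
        unnorm = [math.exp(dot(w, phi(x, c))) for c in candidates]
        z = sum(unnorm)
        for k, f in enumerate(phi(x, y)):
            grad[k] += f
        for c, u in zip(candidates, unnorm):
            for k, f in enumerate(phi(x, c)):
                grad[k] -= (u / z) * f
    return grad


data = [(["dogs", "running"], ("N", "V")), (["running", "dogs"], ("V", "N"))]
w = [0.2, -0.1]
analytic = gradient(data, w)

# Central finite differences of L(w) as an independent check.
eps = 1e-6
numeric = []
for k in range(len(w)):
    w_plus, w_minus = list(w), list(w)
    w_plus[k] += eps
    w_minus[k] -= eps
    numeric.append((log_likelihood(data, w_plus) - log_likelihood(data, w_minus)) / (2 * eps))
```
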
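  24. CRF / Learning (parameter estimation): Steepest gradient method

    Unlike the HMM case, the expression above is hard to solve analytically, so the steepest gradient method is used. ($L(w)$ is concave, so the optimum is guaranteed: any local maximum is the global maximum. This follows from the log-exp form of $\log Z_{x_i,w}$.)

    $$w_{\text{new}} = w_{\text{old}} + \epsilon \cdot \frac{\partial L(w_{\text{old}})}{\partial w_{\text{old}}}$$

    In doing so, $\sum_{y \in \mathcal{Y}} \phi(x_i, y) P(y \mid x_i)$ requires summing over every possible label sequence ($\mathcal{Y} = Y^T$). The following slides therefore consider how to compute it quickly.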
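
The update rule can be sketched as a short loop. Everything concrete here is an assumption made for illustration: the toy features, the two training pairs, the step size of 0.1, and the fixed iteration count. The expectation is again enumerated by brute force, which is only feasible for tiny inputs.

```python
import itertools
import math

LABELS = ["N", "V"]  # hypothetical label set Y


def phi(x, y):
    """Toy global feature vector (an assumption for illustration)."""
    ing_v = sum(1.0 for word, label in zip(x, y) if word.endswith("ing") and label == "V")
    n_count = sum(1.0 for label in y if label == "N")
    return [ing_v, n_count]


def value_and_gradient(data, w):
    """Return L(w) and dL/dw, enumerating all label sequences by brute force."""
    value, grad = 0.0, [0.0] * len(w)
    for x, y in data:
        feats = [phi(x, c) for c in itertools.product(LABELS, repeat=len(x))]
        scores = [sum(wk * fk for wk, fk in zip(w, f)) for f in feats]
        z = sum(math.exp(s) for s in scores)
        gold = phi(x, y)
        value += sum(wk * fk for wk, fk in zip(w, gold)) - math.log(z)
        for k in range(len(w)):
            expected = sum(math.exp(s) / z * f[k] for s, f in zip(scores, feats))
            grad[k] += gold[k] - expected
    return value, grad


def gradient_ascent(data, w, step=0.1, iterations=50):
    history = []
    for _ in range(iterations):
        value, grad = value_and_gradient(data, w)
        history.append(value)
        # w_new = w_old + epsilon * dL/dw
        w = [wk + step * gk for wk, gk in zip(w, grad)]
    return w, history


data = [(["dogs", "running"], ("N", "V")), (["running", "dogs"], ("V", "N"))]
w_hat, history = gradient_ascent(data, [0.0, 0.0])
```

With a small enough step, the recorded log-likelihood values in `history` should not decrease from one iteration to the next, which is what concavity leads one to expect.
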
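  26. CRF / Learning (parameter estimation): Computing $\sum_{y \in \mathcal{Y}} \phi(x, y) P(y \mid x)$

    $$
    \begin{aligned}
    \sum_{y \in \mathcal{Y}} \phi(x, y) P(y \mid x)
    &= \sum_{y \in \mathcal{Y}} \Bigl( P(y \mid x) \sum_{t} \phi(x, y_{t-1}, y_t) \Bigr) \quad \text{(by the locality restriction)} \\
    &= \sum_{y_1 y_2 \cdots y_T \in Y \times Y \times \cdots \times Y} P(y_1, y_2, \ldots, y_T \mid x) \sum_{t} \phi(x, y_{t-1}, y_t) \\
    &= \sum_{y_1 \in Y} \sum_{y_2 \in Y} \cdots \sum_{y_T \in Y} P(y_1, y_2, \ldots, y_T \mid x) \sum_{t} \phi(x, y_{t-1}, y_t)
    \end{aligned}
    $$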
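  27. CRF / Learning (parameter estimation): Computing $\sum_{y \in \mathcal{Y}} \phi(x, y) P(y \mid x)$ (continued)

    $$
    \begin{aligned}
    &= \sum_{y_1 \in Y} \sum_{y_2 \in Y} \cdots \sum_{y_T \in Y} P(y_1, y_2, \ldots, y_T \mid x)\, \phi(x, y_0, y_1) \\
    &\quad + \sum_{y_1 \in Y} \sum_{y_2 \in Y} \cdots \sum_{y_T \in Y} P(y_1, y_2, \ldots, y_T \mid x)\, \phi(x, y_1, y_2) \\
    &\quad + \cdots \\
    &\quad + \sum_{y_1 \in Y} \sum_{y_2 \in Y} \cdots \sum_{y_T \in Y} P(y_1, y_2, \ldots, y_T \mid x)\, \phi(x, y_{T-1}, y_T) \\
    &= \sum_{y_1 \in Y} \phi(x, y_0, y_1) \sum_{y_2 \in Y} \cdots \sum_{y_T \in Y} P(y_1, y_2, \ldots, y_T \mid x) \\
    &\quad + \sum_{y_1 \in Y} \sum_{y_2 \in Y} \phi(x, y_1, y_2) \sum_{y_3 \in Y} \cdots \sum_{y_T \in Y} P(y_1, y_2, \ldots, y_T \mid x) \\
    &\quad + \cdots \\
    &\quad + \sum_{y_1 \in Y} \sum_{y_2 \in Y} \cdots \sum_{y_{T-1} \in Y} \sum_{y_T \in Y} \phi(x, y_{T-1}, y_T)\, P(y_1, y_2, \ldots, y_T \mid x)
    \end{aligned}
    $$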
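  28. CRF / Learning (parameter estimation): Computing $\sum_{y \in \mathcal{Y}} \phi(x, y) P(y \mid x)$ (continued)

    $$
    \begin{aligned}
    &= \sum_{y_1 \in Y} \phi(x, y_0, y_1) P(y_1 \mid x) + \sum_{y_1 \in Y} \sum_{y_2 \in Y} \phi(x, y_1, y_2) P(y_1, y_2 \mid x) + \cdots \\
    &\quad + \sum_{y_{T-1} \in Y} \sum_{y_T \in Y} \phi(x, y_{T-1}, y_T) P(y_{T-1}, y_T \mid x) \\
    &= \sum_{t} \Bigl( \sum_{y_{t-1} \in Y} \sum_{y_t \in Y} \phi(x, y_{t-1}, y_t) P(y_{t-1}, y_t \mid x) \Bigr)
    \end{aligned}
    $$

    So it is enough to be able to compute $P(y_{t-1}, y_t \mid x)$, and the following slides work this out. (For $y_0$ and $y_{T+1}$, the label sets are $Y = \{\mathrm{BOS}\}$ and $Y = \{\mathrm{EOS}\}$, respectively.)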
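  30. CRF / Learning (parameter estimation): Notation

    Let $T$ be the length of the input sequence ($T$ may differ from sequence to sequence). Every label sequence $y$ is assumed to have $y_0 = \mathrm{BOS}$ and $y_{T+1} = \mathrm{EOS}$.

    To simplify the notation, we introduce the shorthand

    $$\sum_{y_{0:T-2}} = \sum_{y_0 y_1 y_2 \cdots y_{T-2} \in Y \times Y \times Y \times \cdots \times Y}$$

    and define $\psi_t(y_{t-1}, y_t) = \exp\bigl(w \cdot \phi(x, y_{t-1}, y_t)\bigr)$. It follows that

    $$P(y \mid x) = \frac{\prod_{t} \psi_t(y_{t-1}, y_t)}{Z_{x,w}}$$

    In this computation $w$ and $x$ are treated as fixed, so they are omitted from the arguments of $\psi_t$.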
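  31. CRF / Learning (parameter estimation): Computing $P(y_{t-1}, y_t \mid x)$

    From the above,

    $$
    \begin{aligned}
    P(y_{t-1}, y_t \mid x)
    &= \sum_{y_{0:t-2}} \sum_{y_{t+1:T+1}} P(y \mid x) \\
    &= \sum_{y_{0:t-2}} \sum_{y_{t+1:T+1}} \frac{1}{Z_{x,w}} \prod_{t'} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \frac{1}{Z_{x,w}} \sum_{y_{0:t-2}} \sum_{y_{t+1:T+1}} \prod_{t'} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \frac{\psi_t(y_{t-1}, y_t)}{Z_{x,w}} \sum_{y_{0:t-2}} \sum_{y_{t+1:T+1}} \prod_{t' \neq t} \psi_{t'}(y_{t'-1}, y_{t'})
    \end{aligned}
    $$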
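  32. CRF / Learning (parameter estimation): Computing $P(y_{t-1}, y_t \mid x)$ (continued)

    $$
    \begin{aligned}
    &\sum_{y_{0:t-2}} \sum_{y_{t+1:T+1}} \prod_{t' \neq t} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \sum_{y_{0:t-2}} \sum_{y_{t+1:T+1}} \bigl( \psi_1(y_0, y_1) \times \psi_2(y_1, y_2) \times \cdots \times \psi_{t-1}(y_{t-2}, y_{t-1}) \times \psi_{t+1}(y_t, y_{t+1}) \times \cdots \times \psi_{T+1}(y_T, y_{T+1}) \bigr) \\
    &= \sum_{y_{0:t-2}} \psi_1(y_0, y_1) \times \psi_2(y_1, y_2) \times \cdots \times \psi_{t-1}(y_{t-2}, y_{t-1}) \sum_{y_{t+1:T+1}} \psi_{t+1}(y_t, y_{t+1}) \times \cdots \times \psi_{T+1}(y_T, y_{T+1}) \\
    &= \sum_{y_{0:t-2}} \psi_1(y_0, y_1) \times \psi_2(y_1, y_2) \times \cdots \times \psi_{t-1}(y_{t-2}, y_{t-1}) \sum_{y_{t+1:T+1}} \prod_{t'=t+1}^{T+1} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \sum_{y_{0:t-2}} \prod_{t'=1}^{t-1} \psi_{t'}(y_{t'-1}, y_{t'}) \sum_{y_{t+1:T+1}} \prod_{t'=t+1}^{T+1} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \Bigl( \sum_{y_{0:t-2}} \prod_{t'=1}^{t-1} \psi_{t'}(y_{t'-1}, y_{t'}) \Bigr) \Bigl( \sum_{y_{t+1:T+1}} \prod_{t'=t+1}^{T+1} \psi_{t'}(y_{t'-1}, y_{t'}) \Bigr)
    \end{aligned}
    $$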
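  33. CRF / Learning (parameter estimation): Defining $\alpha$ and $\beta$

    Therefore, defining

    $$\alpha(y_t, t) = \sum_{y_{0:t-1}} \prod_{t'=1}^{t} \psi_{t'}(y_{t'-1}, y_{t'}), \qquad \beta(y_t, t) = \sum_{y_{t+1:T+1}} \prod_{t'=t+1}^{T+1} \psi_{t'}(y_{t'-1}, y_{t'})$$

    (with $\alpha(y_0, 0) = 1$ and $\beta(y_{T+1}, T+1) = 1$), we can write

    $$P(y_{t-1}, y_t \mid x) = \frac{1}{Z_{x,w}}\, \psi_t(y_{t-1}, y_t)\, \alpha(y_{t-1}, t-1)\, \beta(y_t, t)$$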
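  34. CRF / Learning (parameter estimation): Rewriting $\alpha(y_t, t)$

    $$
    \begin{aligned}
    \alpha(y_t, t)
    &= \sum_{y_{0:t-1}} \prod_{t'=1}^{t} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \sum_{y_{0:t-1}} \psi_1(y_0, y_1) \times \psi_2(y_1, y_2) \times \cdots \times \psi_t(y_{t-1}, y_t) \\
    &= \sum_{y_{t-1}} \sum_{y_{0:t-2}} \psi_1(y_0, y_1) \times \psi_2(y_1, y_2) \times \cdots \times \psi_t(y_{t-1}, y_t) \\
    &= \sum_{y_{t-1}} \psi_t(y_{t-1}, y_t) \sum_{y_{0:t-2}} \psi_1(y_0, y_1) \times \psi_2(y_1, y_2) \times \cdots \times \psi_{t-2}(y_{t-3}, y_{t-2}) \times \psi_{t-1}(y_{t-2}, y_{t-1}) \\
    &= \sum_{y_{t-1}} \psi_t(y_{t-1}, y_t) \sum_{y_{0:t-2}} \prod_{t'=1}^{t-1} \psi_{t'}(y_{t'-1}, y_{t'}) \\
    &= \sum_{y_{t-1}} \psi_t(y_{t-1}, y_t)\, \alpha(y_{t-1}, t-1)
    \end{aligned}
    $$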
  35. CRF / Learning (parameter estimation): Rewriting $\beta(y_t, t)$

    In the same way,

    $$\beta(y_t, t) = \sum_{y_{t+1}} \psi_{t+1}(y_t, y_{t+1})\, \beta(y_{t+1}, t+1)$$
  37. CRF / Learning (parameter estimation): Forward-backward algorithm (forward pass)

    function Forward(x)
        α(BOS, 0) = 1
        for t = 1, ..., T do
            for y ∈ {y1, ..., ym} do
                if t = 1 then
                    α(y, t) = α(BOS, t − 1) ∗ ψ_t(BOS, y)
                else
                    α(y, t) = 0
                    for y′ ∈ {y1, ..., ym} do
                        α(y, t) = α(y, t) + α(y′, t − 1) ∗ ψ_t(y′, y)
  38. CRF / Learning (parameter estimation): Forward-backward algorithm (backward pass)

    function Backward(x)
        β(EOS, T + 1) = 1
        for t = T, ..., 1 do
            for y ∈ {y1, ..., ym} do
                if t = T then
                    β(y, t) = β(EOS, T + 1) ∗ ψ_{t+1}(y, EOS)
                else
                    β(y, t) = 0
                    for y′ ∈ {y1, ..., ym} do
                        β(y, t) = β(y, t) + β(y′, t + 1) ∗ ψ_{t+1}(y, y′)
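
One way the two passes might look in Python is sketched below, together with the pairwise marginal $P(y_{t-1}, y_t \mid x) = \psi_t(y_{t-1}, y_t)\,\alpha(y_{t-1}, t-1)\,\beta(y_t, t) / Z_{x,w}$. The potential `psi` is a hypothetical stand-in for $\exp(w \cdot \phi(x, y_{t-1}, y_t))$ with invented weights, its arguments are ordered (previous label, current label), and $t$ is 1-based with $t = T + 1$ denoting the EOS step. Because the EOS step is included here, $Z_{x,w}$ is taken as $\sum_{y_T} \alpha(y_T, T)\,\beta(y_T, T)$.

```python
import math

LABELS = ["N", "V"]  # hypothetical label set Y
BOS, EOS = "BOS", "EOS"


def psi(x, t, prev, cur):
    """Hypothetical potential psi_t(y_{t-1}, y_t); t is 1-based, t = T + 1 is the EOS step."""
    if cur == EOS:
        return math.exp(0.3 if prev == "V" else 0.0)
    score = 0.0
    if cur == "V" and x[t - 1].endswith("ing"):
        score += 2.0
    if prev == "N" and cur == "V":
        score += 1.0
    return math.exp(score)


def forward(x):
    alpha = {(BOS, 0): 1.0}
    for t in range(1, len(x) + 1):
        prev_labels = [BOS] if t == 1 else LABELS
        for y in LABELS:
            alpha[(y, t)] = sum(alpha[(p, t - 1)] * psi(x, t, p, y) for p in prev_labels)
    return alpha


def backward(x):
    T = len(x)
    beta = {(EOS, T + 1): 1.0}
    for t in range(T, 0, -1):
        next_labels = [EOS] if t == T else LABELS
        for y in LABELS:
            beta[(y, t)] = sum(psi(x, t + 1, y, n) * beta[(n, t + 1)] for n in next_labels)
    return beta


def pairwise_marginal(x, t, alpha, beta, z):
    """P(y_{t-1}, y_t | x) for 2 <= t <= T, keyed by (y_{t-1}, y_t)."""
    return {
        (p, c): psi(x, t, p, c) * alpha[(p, t - 1)] * beta[(c, t)] / z
        for p in LABELS
        for c in LABELS
    }


x = ["dogs", "like", "running"]
alpha, beta = forward(x), backward(x)
T = len(x)
z = sum(alpha[(y, T)] * beta[(y, T)] for y in LABELS)
marginal = pairwise_marginal(x, 2, alpha, beta, z)
```

Each pass touches every pair of adjacent labels once per position, so the cost grows with $T \cdot |Y|^2$ rather than $|Y|^T$.
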
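  40. CRF / Learning (parameter estimation): Computing $Z_{x,w}$

    $$
    \begin{aligned}
    Z_{x,w}
    &= \sum_{y \in \mathcal{Y}} \exp\bigl(w \cdot \phi(x, y)\bigr) \\
    &= \sum_{y \in \mathcal{Y}} \exp\Bigl(w \cdot \sum_{t} \phi(x, y_{t-1}, y_t)\Bigr) \\
    &= \sum_{y \in \mathcal{Y}} \prod_{t} \exp\bigl(w \cdot \phi(x, y_{t-1}, y_t)\bigr) \\
    &= \sum_{y \in \mathcal{Y}} \prod_{t} \psi_t(y_{t-1}, y_t) \\
    &= \sum_{y_T} \sum_{y_{0:T-1}} \prod_{t} \psi_t(y_{t-1}, y_t) \\
    &= \sum_{y_T} \alpha(y_T, T)
    \end{aligned}
    $$

    Here the product runs over $t = 1, \ldots, T$. If the EOS transition $\psi_{T+1}(y_T, \mathrm{EOS})$ is also included, as in the definition of $\beta$, the same argument gives $Z_{x,w} = \sum_{y_T} \alpha(y_T, T)\, \psi_{T+1}(y_T, \mathrm{EOS}) = \alpha(\mathrm{EOS}, T+1)$.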
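
As a sanity check on $Z_{x,w} = \sum_{y_T} \alpha(y_T, T)$, the sketch below compares the forward recursion against a brute-force sum over all label sequences. It follows the product over $t = 1, \ldots, T$ (no EOS step); the potential `psi` and its weights are hypothetical, and the input is assumed to be non-empty.

```python
import itertools
import math

LABELS = ["N", "V"]  # hypothetical label set Y
BOS = "BOS"


def psi(x, t, prev, cur):
    """Hypothetical potential psi_t(y_{t-1}, y_t) for 1-based t in 1..T."""
    score = 0.0
    if cur == "V" and x[t - 1].endswith("ing"):
        score += 2.0
    if prev == "N" and cur == "V":
        score += 1.0
    if prev == BOS and cur == "N":
        score += 0.5
    return math.exp(score)


def partition_brute_force(x):
    """Sum of prod_t psi_t(y_{t-1}, y_t) over every label sequence in Y^T."""
    total = 0.0
    for y in itertools.product(LABELS, repeat=len(x)):
        product, prev = 1.0, BOS
        for t, cur in enumerate(y, start=1):
            product *= psi(x, t, prev, cur)
            prev = cur
        total += product
    return total


def partition_forward(x):
    """Z = sum over y_T of alpha(y_T, T), keeping only the latest alpha column."""
    alpha = {y: psi(x, 1, BOS, y) for y in LABELS}
    for t in range(2, len(x) + 1):
        alpha = {y: sum(alpha[p] * psi(x, t, p, y) for p in LABELS) for y in LABELS}
    return sum(alpha.values())


x = ["dogs", "like", "running", "fast"]
z_slow = partition_brute_force(x)
z_fast = partition_forward(x)
```
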