
PRML (Regression)

gucchi
August 16, 2019

Transcript

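  1. Contents

    1. Introduction
      1-1. A simple regression example (PRML 1.1)
      1-2. Probability theory and probability distributions
      1-3. Maximum likelihood estimation and Bayesian estimation
    2. Linear regression models
      2-1. Linear basis function models
      2-2. Maximum likelihood estimation for linear basis function models
      2-3. Bayesian linear regression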
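  2. 1. Introduction

    In machine learning, and in supervised learning in particular, we first prepare a set of inputs $\{x_1, x_2, \dots, x_N\}$ and the set of corresponding target vectors $\{t_1, t_2, \dots, t_N\}$ (the training data, also called teacher data).

    Using the training data, we build a function $y(x)$ that predicts the target vector from an input (learning).

    Once learning is finished, we predict the target vector of an unseen input $x$ with $y(x)$.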
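  3. 1-1. A simple regression example

    We start with an intuitive explanation of a simple regression problem (corresponding to PRML 1.1).

    As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \dots, x_N)^T$ and the $N$ corresponding target variables $\mathbf{t} = (t_1, t_2, \dots, t_N)^T$. (Because this is regression, each output $t_n$ takes a continuous value.)

    Here each training output $t_n$ is generated by adding random noise $\epsilon$, drawn from a Gaussian distribution (explained later), to $\sin(2\pi x_n)$ (see PRML Appendix A):

    $$t_n = \sin(2\pi x_n) + \epsilon \tag{1.1}$$

    The goal of regression is to use the training data $(\mathbf{x}, \mathbf{t})$ to predict the output $\hat{t}$ for a new input $\hat{x}$ that is not in the training data.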
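    The data-generating rule (1.1) can be sketched in a few lines of NumPy. This is only an illustration: the evenly spaced inputs on $[0, 1]$, the noise standard deviation 0.3, and the function name `make_sine_data` are assumptions, not values stated on the slide.

```python
import numpy as np

def make_sine_data(n, noise_std=0.3, seed=0):
    """Return n inputs on [0, 1] and noisy targets t_n = sin(2*pi*x_n) + eps."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)                         # assumed input layout
    eps = rng.normal(loc=0.0, scale=noise_std, size=n)   # Gaussian noise, assumed std
    t = np.sin(2.0 * np.pi * x) + eps                    # eq. (1.1)
    return x, t

x, t = make_sine_data(10)   # N = 10, as in the example on the next slide
```

    Fixing the seed keeps the sketch reproducible; setting `noise_std=0.0` returns points exactly on the sine curve.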
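  4. 1-1. A simple regression example

    The figure on this slide shows an example with $N = 10$ training points (the blue circles). The green curve is $\sin(2\pi x)$.

    The blue circles do not lie on the green curve because of the Gaussian noise in (1.1).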
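  5. 1-1. A simple regression example

    We now use the training data to predict the output for an unseen input.

    For the moment, in this section, we use the training data to build a function $y(x)$ that takes an input variable $x$ and returns a prediction of the corresponding target variable $t$.

    Concretely, we consider the following polynomial $y(x, \mathbf{w})$:

    $$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{1.2}$$

    The vector $\mathbf{w} = (w_0, w_1, \dots, w_M)^T$ is called the weight parameter.

    Our goal is to use the training data $(\mathbf{x}, \mathbf{t})$ to tune $\mathbf{w}$ so that $y(x, \mathbf{w})$ becomes a good predictor.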
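  6. 1-1. A simple regression example

    Next we consider, intuitively, how $\mathbf{w}$ should be tuned so that $y(x, \mathbf{w})$ is a suitable predictor.

    A good $\mathbf{w}$ should make $y(x_n, \mathbf{w})$ close to the target $t_n$ for every training point.

    We therefore look for the $\mathbf{w}$ (written $\mathbf{w}^\star$) that minimizes the following error function $E(\mathbf{w})$:

    $$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 \tag{1.3}$$

    The function $y(x, \mathbf{w}^\star)$ is then regarded as a suitable predictor.

    The concrete procedure for minimizing $E(\mathbf{w})$ is discussed later (chapter 2 of these slides).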
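    As a rough sketch of what minimizing (1.3) looks like in practice, the polynomial weights can be obtained with an ordinary least-squares solver. The slides derive the solution analytically in chapter 2; using `np.linalg.lstsq` here, and the helper names `fit_polynomial` and `sum_of_squares_error`, are choices made for this illustration only.

```python
import numpy as np

def fit_polynomial(x, t, M):
    """Return w* minimizing E(w) = 0.5 * sum_n (y(x_n, w) - t_n)^2 for degree M."""
    X = np.vander(x, M + 1, increasing=True)      # columns x^0, x^1, ..., x^M
    w, *_ = np.linalg.lstsq(X, t, rcond=None)     # least-squares minimizer
    return w

def sum_of_squares_error(x, t, w):
    """Evaluate E(w) from eq. (1.3)."""
    y = np.vander(x, len(w), increasing=True) @ w
    return 0.5 * np.sum((y - t) ** 2)

# Noise-free quadratic (assumed toy data), so the minimizer is known in advance.
x = np.linspace(0.0, 1.0, 10)
t = 1.0 - 2.0 * x + 3.0 * x**2
w_star = fit_polynomial(x, t, M=2)
```

    On this toy data `w_star` should match the generating coefficients (1, -2, 3) up to rounding, and `sum_of_squares_error(x, t, w_star)` should be numerically zero.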
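  7. 1-1. A simple regression example

    The figure on this slide shows the fits for polynomial orders $M = 0, 1, 3, 9$ (green: $\sin(2\pi x)$, red: $y(x, \mathbf{w}^\star)$).

    Among these, $M = 3$ appears to fit $\sin(2\pi x)$ best.

    For $M = 9$ we get $E(\mathbf{w}^\star) = 0$, yet the curve does not fit $\sin(2\pi x)$ (over-fitting).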
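    The over-fitting remark can be checked numerically: with $N = 10$ points, a degree-9 polynomial has ten free weights and can pass through every point, so the training error drops to zero regardless of noise. The sketch below assumes evenly spaced inputs and a noise standard deviation of 0.3, neither of which is stated on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)                                       # assumed inputs
t = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.3, size=x.size)    # eq. (1.1)

def training_error(M):
    """E(w*) from eq. (1.3) for the best-fitting polynomial of order M."""
    X = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(X, t, rcond=None)
    return 0.5 * np.sum((X @ w - t) ** 2)

errors = {M: training_error(M) for M in (0, 1, 3, 9)}
```

    The training error can only shrink as $M$ grows because each model contains the previous one. A training error of zero therefore says nothing about how well the curve tracks $\sin(2\pi x)$ between the data points.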
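  8. 1-2. Probability theory and probability distributions

    Uncertainty is a key concept in pattern recognition, and probability theory is introduced to evaluate it quantitatively.

    Consider random variables $X$ and $Y$ that take the values $X = x_i$ $(i = 1, 2, \dots, M)$ and $Y = y_j$ $(j = 1, 2, \dots, L)$. The probability that $X = x_i$ and $Y = y_j$ (the joint probability) is written $p(X = x_i, Y = y_j)$.

    The probability $p(X = x_i)$ can be written in terms of $p(X = x_i, Y = y_j)$ as follows (sum rule):

    $$p(X = x_i) = \sum_{j=1}^{L} p(X = x_i, Y = y_j) \tag{1.4}$$

    Writing $p(Y = y_j \mid X = x_i)$ for the probability that $Y = y_j$ given $X = x_i$ (the conditional probability), the following relation holds (product rule):

    $$p(X = x_i, Y = y_j) = p(Y = y_j \mid X = x_i)\, p(X = x_i) \tag{1.5}$$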
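  9. 1-2. Probability theory and probability distributions

    Combining the product rule with the symmetry of the joint probability, $p(X, Y) = p(Y, X)$, yields Bayes' theorem:

    $$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)} \tag{1.6}$$

    Here $p(Y)$ is called the prior probability (the probability before $X$ is given) and $p(Y \mid X)$ the posterior probability (the probability after $X$ is given).

    Bayes' theorem says that multiplying the prior $p(Y)$ by the likelihood $p(X \mid Y)$ gives the posterior $p(Y \mid X)$. ($p(X)$ is the normalization constant that guarantees $p(Y \mid X)$ is normalized with respect to $Y$.)

    Furthermore, $X$ and $Y$ are said to be independent when the joint distribution $p(X, Y)$ factorizes into the product of the marginals:

    $$p(X, Y) = p(X)\, p(Y) \tag{1.7}$$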
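    The sum rule, the product rule, and Bayes' theorem can be verified on a small joint table. The numbers below are made up for illustration, and the helper names are hypothetical.

```python
# Joint table p(X = x_i, Y = y_j); rows index i, columns index j (made-up values).
joint = [[0.10, 0.20, 0.10],
         [0.25, 0.05, 0.30]]

p_x = [sum(row) for row in joint]                         # sum rule (1.4)
p_y = [sum(row[j] for row in joint) for j in range(3)]    # sum rule applied to Y

def p_y_given_x(j, i):
    """Conditional p(Y = y_j | X = x_i), i.e. the product rule (1.5) rearranged."""
    return joint[i][j] / p_x[i]

def p_x_given_y(i, j):
    """Conditional p(X = x_i | Y = y_j)."""
    return joint[i][j] / p_y[j]

# Bayes' theorem (1.6): p(Y|X) = p(X|Y) p(Y) / p(X)
i, j = 1, 2
bayes = p_x_given_y(i, j) * p_y[j] / p_x[i]
```

    The value `bayes` agrees with `p_y_given_x(j, i)` computed directly from the table, which is exactly the statement of (1.6).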
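  10. 1-2. Probability theory and probability distributions

    So far we have considered discrete random variables. We now turn to distributions of continuous random variables.

    If the probability that a random variable $x$ falls in the interval $(x, x + \delta x)$ is given by $p(x)\,\delta x$ as $\delta x \to 0$, then $p(x)$ is called the probability density.

    The probability that $x$ lies in the interval $(a, b)$ is then

    $$p(x \in (a, b)) = \int_a^b p(x)\, dx \tag{1.8}$$

    Because probabilities are non-negative and normalized, $p(x)$ satisfies

    $$p(x) \ge 0 \tag{1.9}$$

    $$\int_{-\infty}^{\infty} p(x)\, dx = 1 \tag{1.10}$$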
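  11. 1-2. Probability theory and probability distributions

    An important operation in probability theory is the weighted average.

    For a continuous random variable $x$, the average of a function $f(x)$ under the distribution $p(x)$ is

    $$\mathbb{E}[f] = \int p(x) f(x)\, dx \tag{1.11}$$

    As a notational convention, a subscript indicates which variable is summed (or integrated) over. For example, the following quantity is summed (or integrated) over $x$:

    $$\mathbb{E}_x[f(x, y)] \tag{1.12}$$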
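  12. 1-2. Probability theory and probability distributions

    The variance of a function $f(x)$ under the distribution $p(x)$ is given below. (It measures how much $f(x)$ varies around its mean $\mathbb{E}[f(x)]$.)

    $$\operatorname{var}[f] = \mathbb{E}\left[ (f(x) - \mathbb{E}[f(x)])^2 \right] \tag{1.13}$$

    In particular, for $f(x) = x$ we have

    $$\operatorname{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 \tag{1.14}$$

    The covariance between two random variables $x$ and $y$ (which expresses how they depend on each other) is defined as

    $$\operatorname{cov}[x, y] = \mathbb{E}_{x,y}\left[ \{x - \mathbb{E}[x]\}\{y - \mathbb{E}[y]\} \right] = \mathbb{E}_{x,y}[xy] - \mathbb{E}[x]\,\mathbb{E}[y] \tag{1.15}$$

    When $x$ and $y$ are independent, $\operatorname{cov}[x, y] = 0$.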
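    The identities (1.14) and (1.15) can be checked on small discrete distributions, where the expectations are finite sums rather than integrals. The values and probabilities below are made up, and `expect` is a hypothetical helper.

```python
xs, px = [0.0, 1.0, 2.0], [0.2, 0.5, 0.3]   # made-up values and probabilities for x
ys, py = [-1.0, 1.0], [0.4, 0.6]            # made-up values and probabilities for y

def expect(f, values, probs):
    """Discrete analogue of eq. (1.11): sum_x p(x) f(x)."""
    return sum(p * f(v) for v, p in zip(values, probs))

mean_x = expect(lambda x: x, xs, px)
var_def = expect(lambda x: (x - mean_x) ** 2, xs, px)      # (1.13) with f(x) = x
var_alt = expect(lambda x: x * x, xs, px) - mean_x ** 2    # (1.14)

# Build an independent joint p(x, y) = p(x) p(y); the covariance (1.15) should vanish.
mean_y = expect(lambda y: y, ys, py)
e_xy = sum(p * q * x * y for x, p in zip(xs, px) for y, q in zip(ys, py))
cov_xy = e_xy - mean_x * mean_y
```

    Both forms of the variance agree, and the covariance of the independent pair is zero up to rounding.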
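  13. 1-2. Probability theory and probability distributions

    An important property of the Gaussian distribution is that the mean and the variance of $x$ are given by $\mu$ and $\sigma^2$ respectively:

    $$\mathbb{E}[x] = \int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^2)\, x\, dx = \mu \tag{1.17}$$

    $$\operatorname{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 = \sigma^2 \tag{1.18}$$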
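  14. 1-2. Probability theory and probability distributions

    Next we introduce the multivariate Gaussian distribution over a $D$-dimensional vector $\mathbf{x}$:

    $$\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\boldsymbol{\Sigma}|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\} \tag{1.19}$$

    Here $\boldsymbol{\mu}$ is the $D$-dimensional mean vector and $\boldsymbol{\Sigma}$ is the $D \times D$ covariance matrix.

    In this case too, the mean and covariance satisfy

    $$\mathbb{E}[\mathbf{x}] = \int \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})\, \mathbf{x}\, d\mathbf{x} = \boldsymbol{\mu} \tag{1.20}$$

    $$\operatorname{cov}[\mathbf{x}] = \mathbb{E}\left[ (\mathbf{x} - \mathbb{E}[\mathbf{x}])(\mathbf{x} - \mathbb{E}[\mathbf{x}])^T \right] = \boldsymbol{\Sigma} \tag{1.21}$$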
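  15. 1-2. Probability theory and probability distributions

    We now state Gaussian identities that are used repeatedly in what follows.

    Suppose the marginal $p(\mathbf{x})$ and the conditional $p(\mathbf{y} \mid \mathbf{x})$ are given by

    $$p(\mathbf{x}) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1}) \tag{1.22}$$

    $$p(\mathbf{y} \mid \mathbf{x}) = \mathcal{N}(\mathbf{y} \mid \mathbf{A}\mathbf{x} + \mathbf{b}, \mathbf{L}^{-1}) \tag{1.23}$$

    Here $\boldsymbol{\mu}$, $\mathbf{A}$, $\mathbf{b}$ are parameters governing the means, and $\boldsymbol{\Lambda}$, $\mathbf{L}$ are precision matrices.

    Then the marginal $p(\mathbf{y})$ and the conditional $p(\mathbf{x} \mid \mathbf{y})$ are

    $$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{A}\boldsymbol{\mu} + \mathbf{b}, \mathbf{L}^{-1} + \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^T) \tag{1.24}$$

    $$p(\mathbf{x} \mid \mathbf{y}) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\Sigma}\{\mathbf{A}^T \mathbf{L} (\mathbf{y} - \mathbf{b}) + \boldsymbol{\Lambda}\boldsymbol{\mu}\}, \boldsymbol{\Sigma}) \tag{1.25}$$

    where $\boldsymbol{\Sigma}$ is defined by

    $$\boldsymbol{\Sigma} = (\boldsymbol{\Lambda} + \mathbf{A}^T \mathbf{L} \mathbf{A})^{-1} \tag{1.26}$$

    (See PRML 2.3.3 for the detailed derivation.)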
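    To make (1.24) through (1.26) concrete, here is a minimal sketch that evaluates them for a one-dimensional toy case. The function name `linear_gaussian` and all numerical values are assumptions chosen so the result can be checked by hand.

```python
import numpy as np

def linear_gaussian(mu, Lam, A, b, L, y):
    """Return (mean, cov) of p(y) from (1.24) and of p(x|y) from (1.25)-(1.26)."""
    Lam_inv = np.linalg.inv(Lam)
    y_mean = A @ mu + b
    y_cov = np.linalg.inv(L) + A @ Lam_inv @ A.T
    Sigma = np.linalg.inv(Lam + A.T @ L @ A)            # eq. (1.26)
    x_mean = Sigma @ (A.T @ L @ (y - b) + Lam @ mu)     # mean in eq. (1.25)
    return (y_mean, y_cov), (x_mean, Sigma)

# Assumed toy case: x ~ N(0, 1), y | x ~ N(2x + 1, 0.25), observed y = 3.
mu, Lam = np.array([0.0]), np.array([[1.0]])
A, b, L = np.array([[2.0]]), np.array([1.0]), np.array([[4.0]])
(y_mean, y_cov), (x_mean, x_cov) = linear_gaussian(mu, Lam, A, b, L, np.array([3.0]))
```

    In this case $p(y)$ has mean 1 and variance $0.25 + 4 = 4.25$, and the posterior over $x$ has variance $1/17$ and mean $16/17$, which matches the familiar scalar formula $\operatorname{cov}[x, y] / \operatorname{var}[y] \cdot (y - \mathbb{E}[y])$.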
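  16. 1-2. Probability theory and probability distributions

    Suppose also that the joint distribution $p(\mathbf{x}_a, \mathbf{x}_b)$ is given by

    $$p(\mathbf{x}_a, \mathbf{x}_b) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) \tag{1.27}$$

    where $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)^T$.

    The marginal $p(\mathbf{x}_a)$ is then known to be the following Gaussian (see PRML 2.3.2 for the detailed derivation):

    $$p(\mathbf{x}_a) = \int p(\mathbf{x}_a, \mathbf{x}_b)\, d\mathbf{x}_b = \mathcal{N}(\mathbf{x}_a \mid \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_{aa}) \tag{1.28}$$

    Here $\boldsymbol{\mu}_a$ and $\boldsymbol{\Sigma}_{aa}$ are defined through the partition

    $$\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{aa} & \boldsymbol{\Sigma}_{ab} \\ \boldsymbol{\Sigma}_{ba} & \boldsymbol{\Sigma}_{bb} \end{pmatrix} \tag{1.29}$$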
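  17. 1-3. Maximum likelihood estimation and Bayesian estimation

    We explain Bayesian estimation using polynomial curve fitting as the example.

    In the Bayesian interpretation of probability, before observing any data we first encode our assumptions about the parameter $\mathbf{w}$ in the form of a prior probability $p(\mathbf{w})$.

    We then use the actual inputs $\mathbf{x} = (x_1, x_2, \dots, x_N)^T$ and targets $\mathbf{t} = (t_1, t_2, \dots, t_N)^T$ to evaluate the likelihood function $p(\mathbf{t} \mid \mathbf{x}, \mathbf{w})$.

    Bayes' theorem then gives the posterior probability $p(\mathbf{w} \mid \mathbf{t}, \mathbf{x})$:

    $$p(\mathbf{w} \mid \mathbf{t}, \mathbf{x}) = \frac{p(\mathbf{t} \mid \mathbf{x}, \mathbf{w})\, p(\mathbf{w})}{p(\mathbf{t} \mid \mathbf{x})} \tag{1.30}$$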
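  18. 1-3. Maximum likelihood estimation and Bayesian estimation

    In Bayesian estimation, given the training data $\mathbf{x}, \mathbf{t}$ and an unseen input $x$, the probability of the prediction $t$, $p(t \mid x, \mathbf{t}, \mathbf{x})$, is obtained as

    $$p(t \mid x, \mathbf{t}, \mathbf{x}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{t}, \mathbf{x})\, d\mathbf{w} \tag{1.31}$$

    (The derivation of this predictive distribution is summarized in the Qiita article below. Please take a look, and please give it a like.)

    https://qiita.com/gucchi0403/items/bfffd2586272a4c05a73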
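  19. 1-3. Maximum likelihood estimation and Bayesian estimation

    The role of the likelihood function $p(D \mid \mathbf{w})$ differs between the frequentist and the Bayesian interpretations of probability.

    In the frequentist interpretation, $\mathbf{w}$ is treated as a fixed parameter, and the $\mathbf{w}$ that maximizes the likelihood $p(D \mid \mathbf{w})$ is taken as the estimator. ($\mathbf{w}$ is determined as a single value.)

    In the Bayesian interpretation, the likelihood is used to update the prior distribution into the posterior distribution using the observed data $D$. (The posterior $p(\mathbf{w} \mid D)$ is a probability distribution over $\mathbf{w}$, so $\mathbf{w}$ carries uncertainty.)

    A concrete example of the latter use of the likelihood is shown later.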
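  20. 2-1. Linear basis function models

    The simple regression model introduced at the beginning took the output $y(x, \mathbf{w})$ to be a polynomial in the input variable $x$:

    $$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{2.1}$$

    where $\mathbf{w} = (w_0, w_1, \dots, w_M)^T$ is the parameter vector.

    In this chapter we generalize: the input becomes a vector $\mathbf{x}$, and $y(\mathbf{x}, \mathbf{w})$ is expanded in nonlinear basis functions $\phi_j(\mathbf{x})$ $(j = 1, \dots, M - 1)$:

    $$y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{j=1}^{M-1} w_j \phi_j(\mathbf{x}) \tag{2.2}$$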
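  21. 2-1. Linear basis function models

    To shorten the expression, set $\phi_0(\mathbf{x}) = 1$ and define $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \dots, \phi_{M-1}(\mathbf{x}))^T$. Then (2.2) can be written as

    $$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{2.3}$$

    One example of a basis function $\phi_j(x)$ is the Gaussian basis function:

    $$\phi_j(x) = \exp\left\{ -\frac{(x - \mu_j)^2}{2 s^2} \right\} \tag{2.4}$$

    This basis function is centred at $x = \mu_j$, and its spread is governed by $s^2$.

    From here on the discussion uses a general basis function $\phi_j(\mathbf{x})$.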
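    As an illustration of (2.3) and (2.4), the sketch below builds the feature vector $\boldsymbol{\phi}(x)$ for scalar inputs with a constant $\phi_0 = 1$ followed by Gaussian basis functions. The centres, the width `s`, and the name `gaussian_features` are assumptions for this example.

```python
import numpy as np

def gaussian_features(x, centers, s):
    """Map scalar inputs (shape (N,)) to rows phi(x_n)^T, with phi_0 = 1 prepended."""
    x = np.asarray(x, dtype=float)[:, None]
    phi = np.exp(-((x - centers[None, :]) ** 2) / (2.0 * s**2))   # eq. (2.4)
    return np.hstack([np.ones((x.shape[0], 1)), phi])

centers = np.linspace(0.0, 1.0, 4)    # assumed mu_1..mu_4, so M = 5 with the bias
Phi = gaussian_features([0.0, 0.5, 1.0], centers, s=0.2)
w = np.ones(5)                        # arbitrary weights for the demonstration
y = Phi @ w                           # y(x_n, w) = w^T phi(x_n), eq. (2.3)
```

    Stacking the rows $\boldsymbol{\phi}(x_n)^T$ this way gives exactly the design matrix used in the maximum likelihood derivation later in the deck.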
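  22. 2-2. Maximum likelihood estimation for linear basis function models

    In the regression problem of the first chapter, we fitted a polynomial to the data points by minimizing the sum-of-squares error.

    Here we assume that the target variable $t$ is the sum of a deterministic function $y(\mathbf{x}, \mathbf{w})$ and a noise term $\epsilon$ drawn from a Gaussian $\mathcal{N}(\epsilon \mid 0, \beta^{-1})$ with zero mean and precision $\beta > 0$:

    $$t = y(\mathbf{x}, \mathbf{w}) + \epsilon \tag{2.5}$$

    Since $\epsilon = t - y(\mathbf{x}, \mathbf{w})$, the target variable $t$ is also Gaussian distributed:

    $$p(t \mid \mathbf{x}, \mathbf{w}, \beta) = \mathcal{N}(t - y(\mathbf{x}, \mathbf{w}) \mid 0, \beta^{-1}) = \mathcal{N}(t \mid y(\mathbf{x}, \mathbf{w}), \beta^{-1}) \tag{2.6}$$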
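  23. 2-2. Maximum likelihood estimation for linear basis function models

    We prepare a set of inputs $X = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and the set of corresponding targets $\{t_1, t_2, \dots, t_N\}$, and stack the targets into a column vector $\mathbf{t} = (t_1, t_2, \dots, t_N)^T$.

    If the observations $\{t_1, t_2, \dots, t_N\}$ are drawn independently from the distribution (2.6), the likelihood function is the product of the distributions of the individual data points:

    $$p(\mathbf{t} \mid X, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{2.7}$$

    where the Gaussian distribution $\mathcal{N}(x \mid \mu, \sigma^2)$ is

    $$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\} \tag{2.8}$$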
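  24. 2-2. Maximum likelihood estimation for linear basis function models

    Instead of finding the parameters that maximize the likelihood (2.7), we find the parameters that maximize its logarithm.

    First,

    $$\begin{aligned}
    \ln \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1})
    &= \ln\left[ \frac{\beta^{1/2}}{(2\pi)^{1/2}} \exp\left\{ -\frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right\} \right] \\
    &= \frac{1}{2}\ln\beta - \frac{1}{2}\ln(2\pi) - \frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2
    \end{aligned} \tag{2.9}$$

    so $\ln p(\mathbf{t} \mid X, \mathbf{w}, \beta)$ becomes

    $$\begin{aligned}
    \ln p(\mathbf{t} \mid X, \mathbf{w}, \beta)
    &= \sum_{n=1}^{N} \ln \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \\
    &= \sum_{n=1}^{N} \left[ \frac{1}{2}\ln\beta - \frac{1}{2}\ln(2\pi) - \frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right] \\
    &= \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \frac{\beta}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2
    \end{aligned} \tag{2.10}$$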
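    The closed form (2.10) can be checked against a direct sum of log densities from (2.8). The targets, model outputs, and precision below are made-up numbers, and the function names are hypothetical.

```python
import math

def gaussian_pdf(x, mu, var):
    """Univariate Gaussian density, eq. (2.8)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(t, y, beta):
    """Closed form (2.10): N/2 ln(beta) - N/2 ln(2 pi) - beta/2 * sum (t_n - y_n)^2."""
    n = len(t)
    sq = sum((tn - yn) ** 2 for tn, yn in zip(t, y))
    return 0.5 * n * math.log(beta) - 0.5 * n * math.log(2.0 * math.pi) - 0.5 * beta * sq

t = [0.1, 0.9, -0.2]    # made-up targets
y = [0.0, 1.0, 0.0]     # made-up model outputs y(x_n, w)
beta = 4.0              # made-up precision, so the variance is 1 / beta
direct = sum(math.log(gaussian_pdf(tn, yn, 1.0 / beta)) for tn, yn in zip(t, y))
```

    `direct` and `log_likelihood(t, y, beta)` agree up to rounding, which confirms that taking the log turns the product in (2.7) into the sum in (2.10).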
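  25. 2-2. Maximum likelihood estimation for linear basis function models

    Defining the sum-of-squares error $E_D(\mathbf{w})$ as

    $$E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{2.11}$$

    the log likelihood becomes

    $$\ln p(\mathbf{t} \mid X, \mathbf{w}, \beta) = \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \beta E_D(\mathbf{w}) \tag{2.12}$$

    To find the maximum likelihood solutions $\mathbf{w}_{ML}$ and $\beta_{ML}$, we take the gradient of the log likelihood $\ln p(\mathbf{t} \mid X, \mathbf{w}, \beta)$.

    The gradient with respect to $\mathbf{w}$ depends on $\beta$ only through an overall positive factor, so the $\mathbf{w}$ at which it vanishes does not depend on $\beta$. We can therefore find $\mathbf{w}_{ML}$ first and then determine $\beta_{ML}$ from $\ln p(\mathbf{t} \mid X, \mathbf{w}_{ML}, \beta)$.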
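  26. 2-2. Maximum likelihood estimation for linear basis function models

    Consider maximizing the log likelihood (2.12) with respect to $\mathbf{w}$. The first and second terms on the right-hand side of (2.12) do not depend on $\mathbf{w}$, so this is equivalent to maximizing the third term, $-\beta E_D(\mathbf{w})$.

    Since $\beta > 0$, maximizing the log likelihood (2.12) with respect to $\mathbf{w}$ is equivalent to minimizing the sum-of-squares error $E_D(\mathbf{w})$ in (2.11) with respect to $\mathbf{w}$.

    In section 1-1 we minimized the sum-of-squares error (1.3) heuristically. Probability theory shows that this minimization is the result of maximum likelihood estimation under the assumption of a Gaussian likelihood.

    We now compute the gradient of the log likelihood with respect to $\mathbf{w}$.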
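  27. 2-2. Maximum likelihood estimation for linear basis function models

    Using $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$, the gradient of the log likelihood with respect to $\mathbf{w}$ is

    $$\begin{aligned}
    \frac{\partial}{\partial \mathbf{w}} \ln p(\mathbf{t} \mid X, \mathbf{w}, \beta)
    &= -\beta \frac{\partial}{\partial \mathbf{w}} E_D(\mathbf{w}) \\
    &= -\frac{\beta}{2} \sum_{n=1}^{N} \frac{\partial}{\partial \mathbf{w}} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 \\
    &= \beta \sum_{n=1}^{N} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))\, \boldsymbol{\phi}(\mathbf{x}_n) \\
    &= \beta \left\{ \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \mathbf{w} \right\}
    \end{aligned} \tag{2.13}$$

    so the maximum likelihood solution $\mathbf{w}_{ML}$ satisfies

    $$\sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \mathbf{w}_{ML} = 0 \tag{2.14}$$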
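  28. 2-2. Maximum likelihood estimation for linear basis function models

    We define the design matrix $\boldsymbol{\Phi}$ as

    $$\boldsymbol{\Phi} = \begin{pmatrix}
    \phi_0(\mathbf{x}_1) & \phi_1(\mathbf{x}_1) & \cdots & \phi_{M-1}(\mathbf{x}_1) \\
    \phi_0(\mathbf{x}_2) & \phi_1(\mathbf{x}_2) & \cdots & \phi_{M-1}(\mathbf{x}_2) \\
    \vdots & \vdots & \ddots & \vdots \\
    \phi_0(\mathbf{x}_N) & \phi_1(\mathbf{x}_N) & \cdots & \phi_{M-1}(\mathbf{x}_N)
    \end{pmatrix}
    = \begin{pmatrix}
    \boldsymbol{\phi}(\mathbf{x}_1)^T \\ \boldsymbol{\phi}(\mathbf{x}_2)^T \\ \vdots \\ \boldsymbol{\phi}(\mathbf{x}_N)^T
    \end{pmatrix} \tag{2.15}$$

    The following identities then hold:

    $$\boldsymbol{\Phi}^T \boldsymbol{\Phi} = \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \tag{2.16}$$

    $$\boldsymbol{\Phi}^T \mathbf{t} = \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) \tag{2.17}$$

    With these, (2.14) becomes

    $$\boldsymbol{\Phi}^T \mathbf{t} - \boldsymbol{\Phi}^T \boldsymbol{\Phi}\, \mathbf{w}_{ML} = 0 \tag{2.18}$$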
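  29. 2-2. Maximum likelihood estimation for linear basis function models

    Hence the maximum likelihood solution $\mathbf{w}_{ML}$ is

    $$\mathbf{w}_{ML} = (\boldsymbol{\Phi}^T \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \mathbf{t} \tag{2.19}$$

    Next, substituting $\mathbf{w}_{ML}$ and differentiating $\ln p(\mathbf{t} \mid X, \mathbf{w}_{ML}, \beta)$ with respect to $\beta$ gives

    $$\frac{\partial}{\partial \beta} \ln p(\mathbf{t} \mid X, \mathbf{w}_{ML}, \beta) = \frac{N}{2} \frac{1}{\beta} - E_D(\mathbf{w}_{ML}) \tag{2.20}$$

    Setting this derivative to zero, the inverse of the maximum likelihood solution $\beta_{ML}$ is

    $$\frac{1}{\beta_{ML}} = \frac{2}{N} E_D(\mathbf{w}_{ML}) = \frac{1}{N} \sum_{n=1}^{N} (t_n - \mathbf{w}_{ML}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 \tag{2.21}$$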
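    A minimal NumPy sketch of (2.19) and (2.21) follows. It solves the normal equations (2.18) with `np.linalg.solve` instead of forming the inverse explicitly, which is a numerical convenience and not something the slides prescribe. The straight-line data, the noise level, and the name `fit_ml` are assumptions for the illustration.

```python
import numpy as np

def fit_ml(Phi, t):
    """Return (w_ML, beta_ML) following eqs. (2.19) and (2.21)."""
    # Solve Phi^T Phi w = Phi^T t, i.e. eq. (2.18), without an explicit inverse.
    w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)
    residual = t - Phi @ w_ml
    beta_ml = 1.0 / np.mean(residual**2)       # eq. (2.21)
    return w_ml, beta_ml

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)                       # assumed inputs
Phi = np.column_stack([np.ones_like(x), x])                # phi(x) = (1, x)^T
t = -0.3 + 0.5 * x + rng.normal(scale=0.2, size=x.size)    # assumed noisy line
w_ml, beta_ml = fit_ml(Phi, t)
```

    At the solution the residual is orthogonal to every column of $\boldsymbol{\Phi}$, which is just (2.18) rearranged, and $1/\beta_{ML}$ is the mean squared residual.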
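  30. 2-2. Maximum likelihood estimation for linear basis function models

    It follows that, given a new input vector $\mathbf{x}$, the predictive distribution of the target variable $t$, $p(t \mid \mathbf{x}, \mathbf{w}_{ML}, \beta_{ML})$, is

    $$p(t \mid \mathbf{x}, \mathbf{w}_{ML}, \beta_{ML}) = \mathcal{N}(t \mid y(\mathbf{x}, \mathbf{w}_{ML}), \beta_{ML}^{-1}) \tag{2.22}$$

    where $\mathbf{w}_{ML}$ and $\beta_{ML}$ are given by (2.19) and (2.21).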
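  31. 2-3. Bayesian linear regression

    Next we treat the linear regression model in a Bayesian way.

    We assume the following prior distribution with mean $\mathbf{m}_0$ and covariance $\mathbf{S}_0$:

    $$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{m}_0, \mathbf{S}_0) \tag{2.23}$$

    The likelihood function is

    $$p(\mathbf{t} \mid X, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{2.24}$$

    so by Bayes' theorem the posterior distribution $p(\mathbf{w} \mid \mathbf{t})$ is

    $$\begin{aligned}
    p(\mathbf{w} \mid \mathbf{t}) &\propto p(\mathbf{t} \mid X, \mathbf{w}, \beta)\, p(\mathbf{w}) \\
    &\propto \exp\left( -\frac{\beta}{2} \sum_{n=1}^{N} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 \right) \times \exp\left( -\frac{1}{2} (\mathbf{w} - \mathbf{m}_0)^T \mathbf{S}_0^{-1} (\mathbf{w} - \mathbf{m}_0) \right)
    \end{aligned} \tag{2.25}$$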
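  32. 2-3. Bayesian linear regression

    From (2.25), the exponent is quadratic in $\mathbf{w}$, so $p(\mathbf{w} \mid \mathbf{t})$ is a Gaussian distribution.

    Concretely, $p(\mathbf{w} \mid \mathbf{t})$ is given by (see PRML Exercise 3.7)

    $$p(\mathbf{w} \mid \mathbf{t}) = \mathcal{N}(\mathbf{w} \mid \mathbf{m}_N, \mathbf{S}_N) \tag{2.26}$$

    where $\mathbf{m}_N$ and $\mathbf{S}_N$ are

    $$\mathbf{m}_N = \mathbf{S}_N (\mathbf{S}_0^{-1} \mathbf{m}_0 + \beta \boldsymbol{\Phi}^T \mathbf{t}) \tag{2.27}$$

    $$\mathbf{S}_N^{-1} = \mathbf{S}_0^{-1} + \beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} \tag{2.28}$$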
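    The posterior parameters (2.27) and (2.28) translate directly into a few lines of NumPy. The sketch below uses a single made-up observation with $\boldsymbol{\phi}(x) = (1, x)^T$; the function name `posterior` and all numbers are assumptions.

```python
import numpy as np

def posterior(Phi, t, beta, m0, S0):
    """Return (m_N, S_N) following eqs. (2.27) and (2.28)."""
    S0_inv = np.linalg.inv(S0)
    SN = np.linalg.inv(S0_inv + beta * Phi.T @ Phi)     # eq. (2.28)
    mN = SN @ (S0_inv @ m0 + beta * Phi.T @ t)          # eq. (2.27)
    return mN, SN

# One assumed observation: x = 0.5, t = 0.2, with a zero-mean isotropic prior.
Phi = np.array([[1.0, 0.5]])
t = np.array([0.2])
mN, SN = posterior(Phi, t, beta=25.0, m0=np.zeros(2), S0=np.eye(2) / 2.0)
```

    Explicit inverses are fine for a 2 x 2 example like this; for larger $M$ a linear solve or a Cholesky factorization would usually be preferred.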
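  33. 2-3. Bayesian linear regression

    We now examine the relationship between the maximum likelihood solution $\mathbf{w}_{ML}$ in (2.19), the mode $\mathbf{w}_{MAP}$ of the posterior $p(\mathbf{w} \mid \mathbf{t})$ (the mode is the $\mathbf{w}$ that maximizes $p(\mathbf{w} \mid \mathbf{t})$), and the posterior mean $\mathbf{m}_N$.

    First, the mode of a Gaussian equals its mean (see PRML Exercise 1.9), so $\mathbf{w}_{MAP} = \mathbf{m}_N$.

    Furthermore, for an infinitely broad prior $\mathbf{S}_0 = \alpha^{-1}\mathbf{I}$ $(\alpha \to 0)$ we have

    $$\mathbf{S}_N^{-1} = \mathbf{S}_0^{-1} + \beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} \to \beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} \tag{2.29}$$

    and therefore

    $$\mathbf{m}_N = \mathbf{S}_N (\mathbf{S}_0^{-1} \mathbf{m}_0 + \beta \boldsymbol{\Phi}^T \mathbf{t}) \to (\boldsymbol{\Phi}^T \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \mathbf{t} \tag{2.30}$$

    so in this limit $\mathbf{w}_{MAP} = \mathbf{m}_N = \mathbf{w}_{ML}$.

    In other words, with a prior that carries no information (an infinitely broad prior), the parameter that maximizes the posterior coincides with the parameter that maximizes the likelihood.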
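    The limit (2.30) can be observed numerically by shrinking $\alpha$ and watching the posterior mean approach $\mathbf{w}_{ML}$. The sketch assumes $\mathbf{m}_0 = 0$, a straight-line model, and made-up data; none of these choices come from the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=30)                        # assumed inputs
Phi = np.column_stack([np.ones_like(x), x])                # phi(x) = (1, x)^T
t = -0.3 + 0.5 * x + rng.normal(scale=0.2, size=x.size)    # assumed noisy line
beta = 25.0

w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)             # eq. (2.19)

def posterior_mean(alpha):
    """m_N for m_0 = 0 and S_0 = alpha^{-1} I, so S_0^{-1} = alpha I in (2.27)-(2.28)."""
    SN_inv = alpha * np.eye(2) + beta * Phi.T @ Phi
    return np.linalg.solve(SN_inv, beta * Phi.T @ t)

gaps = [np.linalg.norm(posterior_mean(a) - w_ml) for a in (1.0, 1e-3, 1e-6)]
```

    The entries of `gaps` shrink as $\alpha$ decreases, reflecting that a broader prior pulls the posterior mean less strongly toward zero.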
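  34. 2-3. Bayesian linear regression

    The previous chapter explained that, in the Bayesian treatment, the likelihood function is what updates the prior into the posterior. We now look at that update through an example.

    As the setup, the mean of the likelihood is $y(x, \mathbf{w}) = w_0 + w_1 x$.

    For the training data, each input $x_n$ is drawn from the uniform distribution on $[-1, 1]$, and the corresponding target $t_n$ is generated with Gaussian noise $\epsilon$ of mean 0 and standard deviation 0.2:

    $$t_n = f(x_n, a_0 = -0.3, a_1 = 0.5) + \epsilon \tag{2.31}$$

    where

    $$f(x, a_0, a_1) = a_0 + a_1 x \tag{2.32}$$

    The goal here is thus to recover $a_0 = -0.3$, $a_1 = 0.5$ as the parameters $w_0, w_1$ from the training data.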
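  35. 2-3. Bayesian linear regression

    The precision of the likelihood is assumed known, $\beta = (1/0.2)^2 = 25$. The prior is the following isotropic Gaussian, with the parameter $\alpha$ set to $\alpha = 2.0$:

    $$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}) \tag{2.33}$$

    With this setup we look at how the posterior is updated as training data accumulate.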
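  36. 2-3. Bayesian linear regression

    The first graphs show the stage before any training data are observed.

    The left graph is the prior $p(\mathbf{w})$. The right graph draws the function $y(x, \mathbf{w}) = w_0 + w_1 x$ six times, each with parameters $w_0, w_1$ sampled at random from that prior.

    Since there are no training data yet, the six functions are naturally scattered, and none is close to $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$.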
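  37. 2-3. Bayesian linear regression

    The next graphs show the situation after one training point has been observed.

    The left graph plots the likelihood $p(t \mid x, \mathbf{w})$ of this data point as a function of $\mathbf{w}$. The white cross marks the point $a_0 = -0.3$, $a_1 = 0.5$.

    The middle graph is the posterior, that is, the prior $p(\mathbf{w})$ multiplied by the likelihood $p(t \mid x, \mathbf{w})$ and normalized. The right graph draws $y(x, \mathbf{w}) = w_0 + w_1 x$ six times with parameters $w_0, w_1$ sampled at random from that posterior.

    Because there are still few training data, the six functions remain scattered and none is close to $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$. Note, however, that every line passes near the data point (blue circle).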
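  38. 2-3. Bayesian linear regression

    The next graphs show the situation after a second training point has been observed.

    The left graph is again the likelihood $p(t \mid x, \mathbf{w})$ of this new data point.

    The middle graph takes the posterior from the one-point case as the prior and multiplies it by the likelihood. The right graph draws $y(x, \mathbf{w}) = w_0 + w_1 x$ six times with parameters $w_0, w_1$ sampled at random from that posterior.

    The posterior now peaks near $a_0 = -0.3$, $a_1 = 0.5$, the six functions begin to gather around $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$, and every line passes near both data points (blue circles).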
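  39. 2-3. Bayesian linear regression

    The final graphs show the situation after 20 data points have been observed.

    The left graph is the likelihood $p(t \mid x, \mathbf{w})$ of the 20th data point.

    The middle graph is the posterior that incorporates all 20 data points. The right graph draws $y(x, \mathbf{w}) = w_0 + w_1 x$ six times with parameters $w_0, w_1$ sampled at random from that posterior.

    The posterior is sharply peaked near $a_0 = -0.3$, $a_1 = 0.5$, and the six functions are tightly clustered around $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$.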
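    The point-by-point update described across these slides can be sketched as a loop in which each posterior becomes the prior for the next observation. The settings $\alpha = 2.0$, $\beta = 25$, $a_0 = -0.3$, $a_1 = 0.5$ follow the slides; the random seed and the variable names are assumptions, so the sampled data will not match the figures.

```python
import numpy as np

alpha, beta = 2.0, 25.0
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20)
t = -0.3 + 0.5 * x + rng.normal(scale=0.2, size=x.size)    # eqs. (2.31)-(2.32)
Phi = np.column_stack([np.ones_like(x), x])                # phi(x) = (1, x)^T

# Sequential updates: the current posterior acts as the prior for the next point.
m, S_inv = np.zeros(2), alpha * np.eye(2)                  # prior (2.33)
traces = [np.trace(np.linalg.inv(S_inv))]
for phi_n, t_n in zip(Phi, t):
    S_inv_new = S_inv + beta * np.outer(phi_n, phi_n)      # (2.28) with one point
    m = np.linalg.solve(S_inv_new, S_inv @ m + beta * t_n * phi_n)   # (2.27)
    S_inv = S_inv_new
    traces.append(np.trace(np.linalg.inv(S_inv)))

# Batch posterior from (2.27)-(2.28) using all 20 points at once.
S_inv_batch = alpha * np.eye(2) + beta * Phi.T @ Phi
m_batch = np.linalg.solve(S_inv_batch, beta * Phi.T @ t)
```

    The sequential and batch results coincide, and the trace of the posterior covariance decreases with every observation, which is the numerical counterpart of the posterior sharpening in the middle graphs.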
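  40. 2-3. Bayesian linear regression

    Next we use the posterior $p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta)$ and the likelihood $p(t \mid \mathbf{x}, \mathbf{w}, \beta)$ to obtain the probability distribution of the prediction $t$ for an unseen input vector $\mathbf{x}$. (The hyperparameters $\alpha, \beta$ have been restored as arguments of the posterior.)

    From (1.31), the predictive distribution $p(t \mid \mathbf{t}, \alpha, \beta)$ is

    $$p(t \mid \mathbf{t}, \alpha, \beta) = \int p(t \mid \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta)\, d\mathbf{w} \tag{2.34}$$

    Using (2.6), (2.26), and (1.24), $p(t \mid \mathbf{t}, \alpha, \beta)$ becomes (see PRML Exercise 3.10)

    $$p(t \mid \mathbf{t}, \alpha, \beta) = \mathcal{N}(t \mid \mathbf{m}_N^T \boldsymbol{\phi}(\mathbf{x}), \sigma_N^2(\mathbf{x})) \tag{2.35}$$

    where the variance $\sigma_N^2(\mathbf{x})$ of the predictive distribution is

    $$\sigma_N^2(\mathbf{x}) = \frac{1}{\beta} + \boldsymbol{\phi}(\mathbf{x})^T \mathbf{S}_N \boldsymbol{\phi}(\mathbf{x}) \tag{2.36}$$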
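  41. 2-3. Bayesian linear regression

    The first term of $\sigma_N^2(\mathbf{x})$, $1/\beta$, is the variance of the likelihood: the scatter (noise) of the output for a given input.

    The second term, $\boldsymbol{\phi}(\mathbf{x})^T \mathbf{S}_N \boldsymbol{\phi}(\mathbf{x})$, comes from the uncertainty in $\mathbf{w}$ (the variance of the posterior). It is specific to Bayesian estimation, which does not reduce the parameters to a point estimate.

    This second term shrinks when a new training point is added ($N \to N + 1$), that is, $\sigma_{N+1}^2(\mathbf{x}) \le \sigma_N^2(\mathbf{x})$ (see PRML Exercise 3.11).

    This expresses that the prediction of the output becomes more certain as the training data increase.

    Finally, we use an example to see how the predictive uncertainty decreases as the training data grow.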
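    The behaviour of the predictive variance (2.36) can be sketched numerically. For brevity the example reuses the straight-line features $\boldsymbol{\phi}(x) = (1, x)^T$ with $\alpha = 2.0$ and $\beta = 25$ instead of the Gaussian basis used in the final slides; the query input, the seed, and the function name `predictive` are assumptions.

```python
import numpy as np

def predictive(phi_x, mN, SN, beta):
    """Mean and variance of the predictive Gaussian, eqs. (2.35) and (2.36)."""
    return mN @ phi_x, 1.0 / beta + phi_x @ SN @ phi_x

alpha, beta = 2.0, 25.0
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=25)
t = -0.3 + 0.5 * x + rng.normal(scale=0.2, size=x.size)
Phi = np.column_stack([np.ones_like(x), x])
phi_query = np.array([1.0, 0.3])        # phi(x) at an assumed test input x = 0.3

variances = []
for n in (1, 2, 4, 25):                 # growing prefixes of the same data set
    SN = np.linalg.inv(alpha * np.eye(2) + beta * Phi[:n].T @ Phi[:n])   # (2.28)
    mN = SN @ (beta * Phi[:n].T @ t[:n])                                 # (2.27), m_0 = 0
    variances.append(predictive(phi_query, mN, SN, beta)[1])
```

    Each variance stays above the noise floor $1/\beta$, and the sequence does not increase as more points are included, in line with $\sigma_{N+1}^2(\mathbf{x}) \le \sigma_N^2(\mathbf{x})$.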
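  42. 2-3. Bayesian linear regression

    The example is the trigonometric one used for the simple regression earlier.

    As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \dots, x_N)^T$ and the $N$ corresponding target variables $\mathbf{t} = (t_1, t_2, \dots, t_N)^T$.

    Each $t_n$ is $\sin(2\pi x_n)$ plus random noise $\epsilon$ drawn from a Gaussian distribution:

    $$t_n = \sin(2\pi x_n) + \epsilon \tag{2.37}$$

    The mean of the likelihood, $y(x, \mathbf{w})$, is expanded in the Gaussian basis functions (2.4).

    With this setup, the graphs for $N = 1, 2, 4, 25$ training points are shown on the next slide.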
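  43. 2-3. Bayesian linear regression

    The blue circles are the training data, the yellow-green line is the true sine function, the red line is the mean of the predictive distribution $\mathbf{m}_N^T \boldsymbol{\phi}(x)$, and the light red region is the band of $\pm\sigma_N(x)$ around the prediction.

    As the training data increase, the red line approaches the yellow-green line and the light red region shrinks.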