PRML Chapter 6

gucchi
January 21, 2019

Transcript

  1. PRML Chapter 6: Kernel Methods. 2019/01/21, 坂口 諒輔
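  2. Chapters 3 and 4 dealt with linear parametric models for regression and classification. In Chapter 3, for example, the output $y(\mathbf{x}, \mathbf{w})$ had the form
    $$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{3.3}$$
    Here $\mathbf{x} = (x_1, x_2, \dots, x_D)^T$ is a $D$-dimensional input vector, $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \dots, \phi_{M-1}(\mathbf{x}))^T$ is a vector function that maps the input $\mathbf{x}$ into an $M$-dimensional feature space, and $\mathbf{w} = (w_0, w_1, \dots, w_{M-1})^T$ is an $M$-dimensional parameter vector.
    In Chapter 3 we took a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ with the corresponding targets $\{t_1, t_2, \dots, t_N\}$ and used least squares to choose $\mathbf{w}$ so that the output $y(\mathbf{x}_n, \mathbf{w})$ for input $\mathbf{x}_n$ reproduces $t_n$.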
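  3. Roughly speaking, in Chapters 3 and 4 the starting point of model building was to fix the form of the vector function $\boldsymbol{\phi}(\mathbf{x})$ (for example, Gaussian basis functions). (In the neural networks of Chapter 5, $\boldsymbol{\phi}(\mathbf{x})$ itself was made to depend on learned parameters.)
    Many linear parametric models can be rewritten in a dual representation, in which the learned parameters $\mathbf{w}_{ML}$ and the resulting output $y(\mathbf{x}, \mathbf{w}_{ML})$ depend on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel function
    $$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^T \boldsymbol{\phi}(\mathbf{x}') \tag{6.1}$$
    (Section 6.1 explains this in detail.)
    In addition, treating linear parametric models for regression and classification probabilistically shows that they are instances of Gaussian processes. (Section 6.4 explains this in detail.)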
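  4. 6.1 Dual representations. Consider a parametric model whose output is
    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$$
    and minimize the regularized sum-of-squares error
    $$J(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) - t_n\}^2 + \frac{\lambda}{2} \mathbf{w}^T \mathbf{w} \tag{6.2}$$
    Here $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ is the set of inputs and $\{t_1, t_2, \dots, t_N\}$ is the set of targets.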
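  5. 6.1 Dual representations. The stationarity condition $\partial J(\mathbf{w}) / \partial \mathbf{w} = 0$ can be rearranged as follows (see Section 3.1.1):
    $$\mathbf{w} = \sum_{n=1}^{N} a_n \boldsymbol{\phi}(\mathbf{x}_n) = \boldsymbol{\Phi}^T \mathbf{a} \tag{6.3}$$
    where
    $$a_n = -\frac{1}{\lambda} \{\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) - t_n\} \tag{6.4}$$
    with $\mathbf{a} = (a_1, \dots, a_N)^T$, and $\boldsymbol{\Phi} = (\boldsymbol{\phi}(\mathbf{x}_1), \dots, \boldsymbol{\phi}(\mathbf{x}_N))^T$ is the design matrix (3.16).
    Substituting $\mathbf{w} = \boldsymbol{\Phi}^T \mathbf{a}$ rewrites $J(\mathbf{w})$ as a function of the parameter vector $\mathbf{a}$:
    $$J(\mathbf{a}) = \frac{1}{2} \mathbf{a}^T \boldsymbol{\Phi} \boldsymbol{\Phi}^T \boldsymbol{\Phi} \boldsymbol{\Phi}^T \mathbf{a} - \mathbf{a}^T \boldsymbol{\Phi} \boldsymbol{\Phi}^T \mathbf{t} + \frac{1}{2} \mathbf{t}^T \mathbf{t} + \frac{\lambda}{2} \mathbf{a}^T \boldsymbol{\Phi} \boldsymbol{\Phi}^T \mathbf{a} \tag{6.5}$$
    where $\mathbf{t} = (t_1, \dots, t_N)^T$.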
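  6. 6.1 Dual representations. Define the Gram matrix $\mathbf{K} = \boldsymbol{\Phi} \boldsymbol{\Phi}^T$. Its elements $K_{nm}$ can be written with the kernel:
    $$K_{nm} = \boldsymbol{\phi}(\mathbf{x}_n)^T \boldsymbol{\phi}(\mathbf{x}_m) = k(\mathbf{x}_n, \mathbf{x}_m) \tag{6.6}$$
    With the Gram matrix, $J(\mathbf{a})$ in (6.5) becomes
    $$J(\mathbf{a}) = \frac{1}{2} \mathbf{a}^T \mathbf{K} \mathbf{K} \mathbf{a} - \mathbf{a}^T \mathbf{K} \mathbf{t} + \frac{1}{2} \mathbf{t}^T \mathbf{t} + \frac{\lambda}{2} \mathbf{a}^T \mathbf{K} \mathbf{a} \tag{6.7}$$
    The least-squares algorithm can thus be expressed with the parameter $\mathbf{a}$ instead of $\mathbf{w}$. This form is called the dual representation.
    In the dual representation, $J(\mathbf{a})$ depends on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel (6.6); $\boldsymbol{\phi}(\mathbf{x})$ never appears on its own.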
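  7. 6.1 Dual representations. The $\mathbf{a}$ that minimizes $J(\mathbf{a})$ is given below. (It can be obtained either by setting the gradient of $J(\mathbf{a})$ with respect to $\mathbf{a}$ to zero directly or, as in the book, by combining the stationary point (6.3) of $J(\mathbf{w})$ with the relation (6.4) between $\mathbf{a}$ and $\mathbf{w}$.)
    $$\mathbf{a} = (\mathbf{K} + \lambda \mathbf{I}_N)^{-1} \mathbf{t} \tag{6.8}$$
    Here $\mathbf{I}_N$ is the $N \times N$ identity matrix.
    Combining this solution with $\mathbf{w} = \boldsymbol{\Phi}^T \mathbf{a}$ and $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$, the prediction for a new input $\mathbf{x}$ is
    $$y(\mathbf{x}) = \mathbf{a}^T \boldsymbol{\Phi} \boldsymbol{\phi}(\mathbf{x}) = \mathbf{k}(\mathbf{x})^T (\mathbf{K} + \lambda \mathbf{I}_N)^{-1} \mathbf{t} \tag{6.9}$$
    where $\mathbf{k}(\mathbf{x}) = (k(\mathbf{x}_1, \mathbf{x}), k(\mathbf{x}_2, \mathbf{x}), \dots, k(\mathbf{x}_N, \mathbf{x}))^T$.
    The prediction $y(\mathbf{x})$ is therefore also expressed through the kernel function alone.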
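A minimal sketch of equations (6.8) and (6.9) in NumPy, under stated assumptions: the Gaussian kernel, the toy data, the value of `lam`, and the helper names `gaussian_kernel`, `fit_dual`, and `predict_dual` are all illustrative choices, not taken from the slides.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    """Matrix of kernel values exp(-||x - x'||^2 / (2 sigma^2)) (assumed kernel)."""
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_dual(X, t, lam, kernel):
    """Solve (K + lam * I_N) a = t, i.e. equation (6.8)."""
    K = kernel(X, X)
    return np.linalg.solve(K + lam * np.eye(len(X)), t)

def predict_dual(X_train, a, X_new, kernel):
    """Evaluate y(x) = k(x)^T a, i.e. equation (6.9), for each row of X_new."""
    return kernel(X_new, X_train) @ a

# Toy data (hypothetical): noisy samples of sin(x).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 2.0 * np.pi, 20)[:, None]
t = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)

a = fit_dual(X, t, lam=0.1, kernel=gaussian_kernel)
y_train = predict_dual(X, a, X, gaussian_kernel)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly. The feature map never appears in the code, only kernel evaluations.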
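  8. 6.1 Dual representations. By (6.8), finding $\mathbf{a}$ in the dual representation requires inverting an $N \times N$ matrix. ($N$ is the number of training points.)
    In the primal representation (here, the one in terms of $\mathbf{w}$), the solution is
    $$\mathbf{w} = (\lambda \mathbf{I}_M + \boldsymbol{\Phi}^T \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \mathbf{t} \tag{3.28}$$
    which requires inverting an $M \times M$ matrix. ($M$ is the dimension of the feature space.)
    When $N \gg M$ (the most common situation), the primal representation is cheaper to solve.
    The dual representation, on the other hand, can handle feature spaces in which $M$ is infinite. (Section 6.2 gives an example of such a feature space.)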
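The equivalence of the two solutions can be checked numerically. The sketch below assumes polynomial features $\phi_j(x) = x^j$ and synthetic data, both chosen only for illustration; it solves the $M \times M$ primal system (3.28) and the $N \times N$ dual system (6.8), then maps the dual solution back with $\mathbf{w} = \boldsymbol{\Phi}^T \mathbf{a}$.

```python
import numpy as np

def design_matrix(x, M):
    """Polynomial features phi_j(x) = x**j for j = 0..M-1 (an illustrative choice)."""
    return np.vander(x, M, increasing=True)

rng = np.random.default_rng(1)
N, M, lam = 30, 4, 0.5
x = rng.uniform(-1.0, 1.0, size=N)
t = np.sin(3.0 * x) + 0.1 * rng.standard_normal(N)
Phi = design_matrix(x, M)                      # shape (N, M)

# Primal solution (3.28): an M x M linear system.
w_primal = np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Dual solution (6.8): an N x N linear system, then w = Phi^T a from (6.3).
K = Phi @ Phi.T
a = np.linalg.solve(K + lam * np.eye(N), t)
w_dual = Phi.T @ a

max_diff = np.max(np.abs(w_primal - w_dual))   # expected to be at rounding level
```

With $N = 30$ and $M = 4$ the primal system is the smaller one, which matches the remark about $N \gg M$.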
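  9. 6.2 Constructing kernels. This section uses the definition of a kernel function to introduce a variety of kernels.
    A function $k(\mathbf{x}, \mathbf{x}')$ is a kernel function if there exists a mapping $\boldsymbol{\phi}$ from the input $\mathbf{x}$ into a suitable $M$-dimensional feature space such that
    $$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^T \boldsymbol{\phi}(\mathbf{x}') \tag{6.1}$$
    A simple example of a kernel function is
    $$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z})^2 \tag{6.11}$$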
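  10. 6.2 Constructing kernels. For example, with $\mathbf{x} = (x_1, x_2)^T$ and the mapping $\boldsymbol{\phi}(\mathbf{x}) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2)^T$ we can write
    $$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z})^2 = \boldsymbol{\phi}(\mathbf{x})^T \boldsymbol{\phi}(\mathbf{z}) \tag{6.12}$$
    so $k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z})^2$ is a kernel function.
    There is in fact a second definition of a kernel function besides (6.1): the Gram matrix $\mathbf{K}$ with elements $K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m)$ must be positive semidefinite.
    (I proved that these two definitions are equivalent in the article below. Please take a look and give it a like if you find it useful.)
    https://qiita.com/gucchi0403/items/544065345f91144524c4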
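The identity (6.12) is easy to confirm for concrete numbers. The following sketch uses only the standard library; the two input points are arbitrary illustrative values.

```python
import math

def phi(x):
    """Explicit feature map for k(x, z) = (x^T z)^2 with 2-dimensional x."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    """Inner product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def poly_kernel(x, z):
    """Kernel (6.11), evaluated directly in input space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, z)        # (1*3 + 2*(-1))^2 = 1
rhs = dot(phi(x), phi(z))      # 9 - 12 + 4 = 1, up to floating-point rounding
```

The left-hand side needs one inner product in two dimensions, while the right-hand side needs the three-dimensional feature vectors. That is the computational point of working with $k$ directly.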
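  11. 6.2 Constructing kernels. Next, the following rules (PRML (6.13) to (6.22)) generate a new kernel function $k(\mathbf{x}, \mathbf{x}')$ from functions already known to be kernels.
    $$k(\mathbf{x}, \mathbf{x}') = c\, k_1(\mathbf{x}, \mathbf{x}') \tag{6.13}$$
    $$k(\mathbf{x}, \mathbf{x}') = f(\mathbf{x})\, k_1(\mathbf{x}, \mathbf{x}')\, f(\mathbf{x}') \tag{6.14}$$
    $$k(\mathbf{x}, \mathbf{x}') = q(k_1(\mathbf{x}, \mathbf{x}')) \tag{6.15}$$
    $$k(\mathbf{x}, \mathbf{x}') = \exp(k_1(\mathbf{x}, \mathbf{x}')) \tag{6.16}$$
    $$k(\mathbf{x}, \mathbf{x}') = k_1(\mathbf{x}, \mathbf{x}') + k_2(\mathbf{x}, \mathbf{x}') \tag{6.17}$$
    $$k(\mathbf{x}, \mathbf{x}') = k_1(\mathbf{x}, \mathbf{x}')\, k_2(\mathbf{x}, \mathbf{x}') \tag{6.18}$$
    $$k(\mathbf{x}, \mathbf{x}') = k_3(\boldsymbol{\phi}(\mathbf{x}), \boldsymbol{\phi}(\mathbf{x}')) \tag{6.19}$$
    $$k(\mathbf{x}, \mathbf{x}') = \mathbf{x}^T \mathbf{A} \mathbf{x}' \tag{6.20}$$
    $$k(\mathbf{x}, \mathbf{x}') = k_a(\mathbf{x}_a, \mathbf{x}_a') + k_b(\mathbf{x}_b, \mathbf{x}_b') \tag{6.21}$$
    $$k(\mathbf{x}, \mathbf{x}') = k_a(\mathbf{x}_a, \mathbf{x}_a')\, k_b(\mathbf{x}_b, \mathbf{x}_b') \tag{6.22}$$
    Here $k_1(\cdot, \cdot)$ and $k_2(\cdot, \cdot)$ are kernel functions, $c > 0$ is a constant, $f(\cdot)$ is an arbitrary function, $q(\cdot)$ is a polynomial with nonnegative coefficients, $\boldsymbol{\phi}(\cdot)$ is an $M$-dimensional vector function, $k_3(\cdot, \cdot)$ is a kernel function defined on the $M$-dimensional vector space, $\mathbf{A}$ is a symmetric positive semidefinite matrix, $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)$, and $k_a(\cdot, \cdot)$ and $k_b(\cdot, \cdot)$ are kernel functions.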
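  12. 6.2 Constructing kernels. These construction rules show, for example, that the following function is a kernel:
    $$k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^T \mathbf{x}' + c)^M$$
    where $c \geq 0$ is a constant and $M$ is any natural number.
    They also yield the Gaussian kernel, a very important kernel function:
    $$k(\mathbf{x}, \mathbf{x}') = \exp(-\|\mathbf{x} - \mathbf{x}'\|^2 / 2\sigma^2) \tag{6.23}$$
    where $\sigma^2$ is any positive constant.
    The feature vector that corresponds to the Gaussian kernel is infinite-dimensional. (→ Exercise 6.11)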
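As a numerical illustration of the Gram-matrix definition of a kernel, the sketch below builds the Gram matrix of the Gaussian kernel (6.23) on random points and inspects its eigenvalues. The data, the value of `sigma2`, and the helper name `gaussian_gram` are assumptions made for the example; a check on one data set illustrates positive semidefiniteness but does not prove it.

```python
import numpy as np

def gaussian_gram(X, sigma2=1.0):
    """Gram matrix for k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)), equation (6.23)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clamp tiny negative values caused by rounding before exponentiating.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma2))

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 3))      # 50 hypothetical points in 3 dimensions
K = gaussian_gram(X)

# A symmetric matrix is positive semidefinite iff all eigenvalues are >= 0.
eigenvalues = np.linalg.eigvalsh(K)
min_eig = eigenvalues.min()
```

In floating point the smallest eigenvalue can come out as a tiny negative number, so any check should allow a small tolerance.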
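  13. 6.3 RBF networks. This section describes radial basis functions (RBFs), a widely used type of basis function. An RBF depends only on the distance from a center $\boldsymbol{\mu}_j$ and has the form $\phi_j(\mathbf{x}) = h(\|\mathbf{x} - \boldsymbol{\mu}_j\|)$.
    RBFs arise when the input variables contain noise. If the noise $\boldsymbol{\xi}$ has probability distribution $\nu(\boldsymbol{\xi})$, the sum-of-squares error becomes
    $$E = \frac{1}{2} \sum_{n=1}^{N} \int \{y(\mathbf{x}_n + \boldsymbol{\xi}) - t_n\}^2 \nu(\boldsymbol{\xi}) \, d\boldsymbol{\xi} \tag{6.39}$$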
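  14. 6.3 RBF networks. By the calculus of variations, the $y(\mathbf{x})$ that minimizes this sum-of-squares error is
    $$y(\mathbf{x}) = \sum_{n=1}^{N} t_n h(\mathbf{x} - \mathbf{x}_n) \tag{6.40}$$
    where $h(\mathbf{x} - \mathbf{x}_n)$ is given by
    $$h(\mathbf{x} - \mathbf{x}_n) = \frac{\nu(\mathbf{x} - \mathbf{x}_n)}{\sum_{m=1}^{N} \nu(\mathbf{x} - \mathbf{x}_m)} \tag{6.41}$$
    A model of this form is called the Nadaraya-Watson model.
    If the noise is isotropic, that is, $\nu(\boldsymbol{\xi})$ depends only on $\|\boldsymbol{\xi}\|$, the basis functions in (6.41) are isotropic as well, $h(\|\mathbf{x} - \mathbf{x}_n\|)$, and are therefore RBFs.
    (The Nadaraya-Watson model of Section 6.3.1 is not used in later chapters, so it is skipped here.)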
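Equations (6.40) and (6.41) translate almost line by line into code. The sketch below assumes one-dimensional inputs and isotropic Gaussian noise for $\nu$; the data, the width `sigma`, and the function name `nadaraya_watson` are illustrative.

```python
import math

def nadaraya_watson(x, xs, ts, sigma=0.5):
    """Prediction (6.40) with weights (6.41), assuming isotropic Gaussian noise nu."""
    # The Gaussian normalising constant cancels in the ratio (6.41).
    nu = [math.exp(-((x - xn) ** 2) / (2.0 * sigma ** 2)) for xn in xs]
    total = sum(nu)
    h = [v / total for v in nu]           # basis-function values, summing to one
    return sum(tn * hn for tn, hn in zip(ts, h)), h

xs = [0.0, 1.0, 2.0, 3.0]                 # hypothetical training inputs
ts = [0.0, 1.0, 4.0, 9.0]                 # hypothetical targets
y, h = nadaraya_watson(1.5, xs, ts)
```

Because the weights are normalized, the prediction is a weighted average of the targets and always lies between the smallest and largest $t_n$.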
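  15. 6.4 Gaussian processes. In Section 6.1 we looked at a non-probabilistic model for linear regression (the output $y(\mathbf{x}, \mathbf{w})$ is used directly as the prediction) and saw that a kernel appears when the model is rewritten in the dual representation.
    We now turn to a probabilistic model for linear regression (one that derives a probability distribution over the output $y(\mathbf{x}, \mathbf{w})$) and confirm that a kernel arises naturally here as well.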
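  16. 6.4.1 Linear regression revisited. As in Section 6.1, start from the output for a given input $\mathbf{x}$:
    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{6.49}$$
    Next, assume the prior distribution over the parameter vector $\mathbf{w}$
    $$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} | \mathbf{0}, \alpha^{-1} \mathbf{I}) \tag{6.50}$$
    Once $\mathbf{w}$ is given, (6.49) determines a particular function $y(\mathbf{x})$ of $\mathbf{x}$. The probability distribution over $\mathbf{w}$ therefore induces a probability distribution over functions $y(\mathbf{x})$.
    In practice, given a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, the joint distribution $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))$ of the function values follows from the distribution over $\mathbf{w}$ and (6.49). (More precisely, with $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ we consider $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N) | \mathbf{X})$.)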
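  17. 6.4.1 Linear regression revisited. Define $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))^T$. Then (6.49) gives
    $$\mathbf{y} = \boldsymbol{\Phi} \mathbf{w} \tag{6.51}$$
    ($\boldsymbol{\Phi}$ is the design matrix.)
    Since $\mathbf{y}$ is a linear transformation of $\mathbf{w}$, which follows the Gaussian distribution (6.50), $\mathbf{y}$ is Gaussian as well. The mean and covariance are therefore enough to determine the distribution completely:
    $$\mathbb{E}[\mathbf{y}] = \boldsymbol{\Phi} \mathbb{E}[\mathbf{w}] = \mathbf{0} \tag{6.52}$$
    $$\mathrm{cov}[\mathbf{y}] = \mathbb{E}[\mathbf{y} \mathbf{y}^T] = \boldsymbol{\Phi} \mathbb{E}[\mathbf{w} \mathbf{w}^T] \boldsymbol{\Phi}^T = \frac{1}{\alpha} \boldsymbol{\Phi} \boldsymbol{\Phi}^T = \mathbf{K} \tag{6.53}$$
    Here $\mathbf{K}$ is the Gram matrix whose elements are given by the kernel function:
    $$K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) = \frac{1}{\alpha} \boldsymbol{\phi}(\mathbf{x}_n)^T \boldsymbol{\phi}(\mathbf{x}_m) \tag{6.54}$$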
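The result (6.53) can be checked by simulation. The sketch below draws many weight vectors from the prior (6.50), maps them through $\mathbf{y} = \boldsymbol{\Phi} \mathbf{w}$, and compares the empirical second moment with $\boldsymbol{\Phi} \boldsymbol{\Phi}^T / \alpha$. The polynomial features, the inputs, $\alpha$, and the sample size `S` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
alpha = 2.0
x = np.array([-1.0, 0.0, 0.5, 1.0])
Phi = np.vander(x, 3, increasing=True)       # illustrative features (1, x, x^2)

# Analytic covariance (6.53): K = Phi Phi^T / alpha.
K = Phi @ Phi.T / alpha

# Draw weights from the prior (6.50) and push them through y = Phi w (6.51).
S = 200_000
W = rng.standard_normal((S, Phi.shape[1])) / np.sqrt(alpha)
Y = W @ Phi.T                                # each row is one sample of y

K_mc = Y.T @ Y / S                           # Monte Carlo estimate of E[y y^T]
max_err = np.max(np.abs(K_mc - K))           # shrinks roughly like 1 / sqrt(S)
```

Each row of `Y` is one function drawn from the prior, evaluated at the four inputs.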
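  18. 6.4.1 Linear regression revisited. A Gaussian process assumes that, for any given set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, the joint distribution $p(\mathbf{y}) = p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))$ of the function values is Gaussian.
    Its mean is usually assumed to be zero, and its covariance is specified by a kernel:
    $$\mathbb{E}[y(\mathbf{x}_n) y(\mathbf{x}_m)] = k(\mathbf{x}_n, \mathbf{x}_m) \tag{6.55}$$
    Comparing with (6.52) to (6.54), the linear regression model described above is indeed one example of a Gaussian process.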
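  19. 6.4.2 Gaussian processes for regression. Here we apply Gaussian processes to regression.
    Assume the target $t_n$ follows a Gaussian distribution whose mean is the function value $y_n = y(\mathbf{x}_n)$:
    $$p(t_n | y_n) = \mathcal{N}(t_n | y_n, \beta^{-1}) \tag{6.58}$$
    where $\beta$ is a hyperparameter giving the precision of the noise.
    By independence, the distribution of $\mathbf{t} = (t_1, \dots, t_N)^T$ given $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))^T$ is
    $$p(\mathbf{t} | \mathbf{y}) = \mathcal{N}(\mathbf{t} | \mathbf{y}, \beta^{-1} \mathbf{I}_N) \tag{6.59}$$
    By the Gaussian process assumption, the marginal distribution $p(\mathbf{y})$ is a Gaussian with zero mean and the Gram matrix $\mathbf{K}$ as its covariance:
    $$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} | \mathbf{0}, \mathbf{K}) \tag{6.60}$$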
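  20. 6.4.2 Gaussian processes for regression. Using $p(\mathbf{t} | \mathbf{y})$ from (6.59) and $p(\mathbf{y})$ from (6.60), the distribution $p(\mathbf{t})$ of the targets given $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ is
    $$p(\mathbf{t}) = \int p(\mathbf{t} | \mathbf{y}) \, p(\mathbf{y}) \, d\mathbf{y} = \mathcal{N}(\mathbf{t} | \mathbf{0}, \mathbf{C}) \tag{6.61}$$
    where the covariance $\mathbf{C}$ has elements
    $$C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \beta^{-1} \delta_{nm} \tag{6.62}$$
    (This uses equations (2.113) to (2.115).)
    A kernel function commonly used in the covariance $\mathbf{C}$ is
    $$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{ -\frac{\theta_1}{2} \|\mathbf{x}_n - \mathbf{x}_m\|^2 \right\} + \theta_2 + \theta_3 \mathbf{x}_n^T \mathbf{x}_m \tag{6.63}$$
    where $\theta_0, \dots, \theta_3$ are hyperparameters.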
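A short sketch of how (6.62) and (6.63) could be assembled and used to draw samples from $p(\mathbf{t})$ in (6.61). The hyperparameter values, the input grid, and the helper names `kernel_663` and `covariance_C` are assumptions for illustration only.

```python
import numpy as np

def kernel_663(X1, X2, theta):
    """Kernel (6.63): theta0 * exp(-theta1/2 * ||x - x'||^2) + theta2 + theta3 * x^T x'."""
    t0, t1, t2, t3 = theta
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return t0 * np.exp(-0.5 * t1 * sq_dists) + t2 + t3 * X1 @ X2.T

def covariance_C(X, theta, beta):
    """Covariance (6.62): C_nm = k(x_n, x_m) + delta_nm / beta."""
    return kernel_663(X, X, theta) + np.eye(len(X)) / beta

X = np.linspace(-1.0, 1.0, 25)[:, None]
theta = (1.0, 4.0, 0.0, 0.0)        # hypothetical hyperparameter values
C = covariance_C(X, theta, beta=25.0)

# Draw three samples of t from p(t) = N(t | 0, C), equation (6.61).
rng = np.random.default_rng(4)
L = np.linalg.cholesky(C)           # succeeds because C is positive definite
samples = L @ rng.standard_normal((len(X), 3))
```

Multiplying standard normal draws by the Cholesky factor gives samples with covariance $\mathbf{C}$, since $\mathbf{L} \mathbf{L}^T = \mathbf{C}$.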
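  21. 6.4.2 Gaussian processes for regression. What we want is the distribution of the target $t_{N+1}$ for an unseen input $\mathbf{x}_{N+1}$, given the training data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and $\{t_1, t_2, \dots, t_N\}$. That is, with $\mathbf{t}_N = (t_1, \dots, t_N)^T$, we want $p(t_{N+1} | \mathbf{t}_N)$. (The dependence on the input variables is omitted from the notation.)
    To obtain $p(t_{N+1} | \mathbf{t}_N)$, first find the joint distribution $p(\mathbf{t}_{N+1})$, where $\mathbf{t}_{N+1} = (t_1, \dots, t_{N+1})^T$.
    Applying the result (6.61) gives
    $$p(\mathbf{t}_{N+1}) = \mathcal{N}(\mathbf{t}_{N+1} | \mathbf{0}, \mathbf{C}_{N+1}) \tag{6.64}$$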
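  22. 6.4.2 Gaussian processes for regression. The covariance matrix $\mathbf{C}_{N+1}$ is
    $$\mathbf{C}_{N+1} = \begin{pmatrix} \mathbf{C}_N & \mathbf{k} \\ \mathbf{k}^T & c \end{pmatrix} \tag{6.65}$$
    Here $\mathbf{C}_N$ is the $N \times N$ matrix with elements (6.62), $\mathbf{k} = (k(\mathbf{x}_1, \mathbf{x}_{N+1}), k(\mathbf{x}_2, \mathbf{x}_{N+1}), \dots, k(\mathbf{x}_N, \mathbf{x}_{N+1}))^T$, and $c = k(\mathbf{x}_{N+1}, \mathbf{x}_{N+1}) + \beta^{-1}$.
    Combining this with (2.81) and (2.82), $p(t_{N+1} | \mathbf{t}_N)$ is a Gaussian with mean $m(\mathbf{x}_{N+1})$ and variance $\sigma^2(\mathbf{x}_{N+1})$ given by
    $$m(\mathbf{x}_{N+1}) = \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{t}_N \tag{6.66}$$
    $$\sigma^2(\mathbf{x}_{N+1}) = c - \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{k} \tag{6.67}$$
    In other words, the distribution of the target $t_{N+1}$ for an unseen input $\mathbf{x}_{N+1}$ is a Gaussian whose mean and variance both depend on $\mathbf{x}_{N+1}$.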
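The predictive equations (6.66) and (6.67) can be sketched as follows. The kernel keeps only the exponential term of (6.63), and the data, hyperparameters, and function names `rbf` and `gp_predict` are illustrative assumptions.

```python
import numpy as np

def rbf(X1, X2, theta0=1.0, theta1=4.0):
    """Exponential term of (6.63) only (theta2 = theta3 = 0 is assumed)."""
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return theta0 * np.exp(-0.5 * theta1 * sq_dists)

def gp_predict(X, t, x_new, beta, kernel):
    """Return the mean (6.66) and variance (6.67) of p(t_{N+1} | t_N)."""
    C_N = kernel(X, X) + np.eye(len(X)) / beta
    k = kernel(X, x_new[None, :])[:, 0]
    c = kernel(x_new[None, :], x_new[None, :])[0, 0] + 1.0 / beta
    mean = k @ np.linalg.solve(C_N, t)
    var = c - k @ np.linalg.solve(C_N, k)
    return mean, var

rng = np.random.default_rng(5)
X = np.linspace(0.0, 1.0, 10)[:, None]
t = np.sin(2.0 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(10)
beta = 100.0

mean_near, var_near = gp_predict(X, t, np.array([0.5]), beta, rbf)
mean_far, var_far = gp_predict(X, t, np.array([5.0]), beta, rbf)
```

Far from the training inputs $\mathbf{k}$ is close to zero, so the mean falls back to the prior mean of zero and the variance approaches $c$. Near the data the variance is smaller.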
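  23. 6.4.3 Learning the hyperparameters. Kernel methods require a choice of kernel function. Rather than fixing the kernel from scratch, it is sometimes easier to parameterize it, as in (6.63) below, and determine the hyperparameters from the training data.
    $$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{ -\frac{\theta_1}{2} \|\mathbf{x}_n - \mathbf{x}_m\|^2 \right\} + \theta_2 + \theta_3 \mathbf{x}_n^T \mathbf{x}_m \tag{6.63}$$
    To set the hyperparameters, take the logarithm $\ln p(\mathbf{t} | \boldsymbol{\theta})$ of $p(\mathbf{t} | \boldsymbol{\theta})$ from (6.61) and choose the $\boldsymbol{\theta}$ that maximizes it, where
    $$\ln p(\mathbf{t} | \boldsymbol{\theta}) = -\frac{1}{2} \ln |\mathbf{C}_N| - \frac{1}{2} \mathbf{t}^T \mathbf{C}_N^{-1} \mathbf{t} - \frac{N}{2} \ln(2\pi) \tag{6.69}$$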
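A sketch of evaluating (6.69) and picking a hyperparameter by a coarse grid search. The grid search is a stand-in chosen for brevity, not the method of the slides; the kernel keeps only the exponential term of (6.63), and the data and grid values are hypothetical.

```python
import numpy as np

def log_marginal_likelihood(X, t, theta, beta):
    """ln p(t | theta) from (6.69) for the kernel theta0 * exp(-theta1/2 * ||x - x'||^2)."""
    theta0, theta1 = theta
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    C_N = theta0 * np.exp(-0.5 * theta1 * sq_dists) + np.eye(len(X)) / beta
    L = np.linalg.cholesky(C_N)
    # ln|C_N| = 2 * sum(log(diag(L))) and t^T C_N^{-1} t = ||L^{-1} t||^2.
    z = np.linalg.solve(L, t)
    return -np.sum(np.log(np.diag(L))) - 0.5 * z @ z - 0.5 * len(X) * np.log(2.0 * np.pi)

rng = np.random.default_rng(6)
X = np.linspace(0.0, 1.0, 15)[:, None]
t = np.sin(2.0 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(15)

# Crude grid search over theta1 (a stand-in for gradient-based optimisation).
grid = [0.1, 1.0, 10.0, 100.0, 1000.0]
scores = [log_marginal_likelihood(X, t, (1.0, th1), beta=100.0) for th1 in grid]
best_theta1 = grid[int(np.argmax(scores))]
```

The Cholesky factor gives the log-determinant and the quadratic form in one pass, which is usually more stable than computing the determinant and the inverse separately.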
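  24. 6.4.4 Automatic relevance determination. Section 6.4.3 made point estimates of the hyperparameters. The estimated values reveal how important each input variable is for the prediction.
    As an example, consider the kernel
    $$k(\mathbf{x}, \mathbf{x}') = \theta_0 \exp\left\{ -\frac{1}{2} \sum_{i=1}^{2} \eta_i (x_i - x_i')^2 \right\} \tag{6.71}$$
    where $\theta_0, \eta_1, \eta_2$ are hyperparameters.
    With this kernel, consider the prior distribution over $\mathbf{y}$:
    $$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} | \mathbf{0}, \mathbf{K}) \tag{6.60}$$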
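  25. 6.4.4 Automatic relevance determination. The figure on this slide shows samples drawn from the prior over $\mathbf{y}$ for different values of $\eta_1$ and $\eta_2$.
    Making $\eta_i$ smaller makes $y$ less sensitive to changes in $x_i$.
    Consequently, when the hyperparameters are chosen to maximize $p(\mathbf{t} | \boldsymbol{\theta})$, the distribution of the targets $\mathbf{t}$ obtained by adding noise to $\mathbf{y}$, the $\eta_i$ of an input $x_i$ that has little influence on the targets comes out small.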
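The effect of $\eta_i$ can be seen directly in the kernel values. The sketch below evaluates the kernel (6.71), written with the prefactor $1/2$ as in PRML, for a unit step along each input axis; the values of `eta` are hypothetical.

```python
import math

def ard_kernel(x, x_prime, theta0, eta):
    """Kernel (6.71): theta0 * exp(-1/2 * sum_i eta_i * (x_i - x'_i)^2)."""
    s = sum(e * (a - b) ** 2 for e, a, b in zip(eta, x, x_prime))
    return theta0 * math.exp(-0.5 * s)

x = (0.0, 0.0)
moved_along_x1 = (1.0, 0.0)
moved_along_x2 = (0.0, 1.0)
eta = (1.0, 0.01)            # hypothetical values: x2 is nearly irrelevant

k_x1 = ard_kernel(x, moved_along_x1, 1.0, eta)   # exp(-0.5)
k_x2 = ard_kernel(x, moved_along_x2, 1.0, eta)   # exp(-0.005)
```

A kernel value close to $\theta_0$ means the two function values are almost perfectly correlated under the prior, so moving along $x_2$ barely changes $y$.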
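  26. 6.4.5 Gaussian processes for classification. We now use Gaussian processes for classification.
    In regression we assumed, as in (6.60), that the joint distribution $p(\mathbf{y})$ of the function values is Gaussian, so $y_n$ ranges over the whole real line.
    In classification the output should satisfy $0 \leq y_n \leq 1$. We therefore place the joint distribution on the activations $a_n = a(\mathbf{x}_n)$ rather than on the outputs, and set the output to $y_n = \sigma(a_n)$.
    Next we make the model probabilistic. Taking the probability of $t_n = 1$ to be $p(t_n = 1 | a_n) = \sigma(a_n)$ gives $p(t_n = 0 | a_n) = 1 - \sigma(a_n)$, so
    $$p(t_n | a_n) = \sigma(a_n)^{t_n} (1 - \sigma(a_n))^{1 - t_n} \tag{6.73}$$
    As in regression, we use the training data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and $\mathbf{t}_N = (t_1, \dots, t_N)^T$ to find the distribution $p(t_{N+1} | \mathbf{t}_N)$ of the target $t_{N+1}$ for an unseen input $\mathbf{x}_{N+1}$. (The dependence on the input variables is again omitted.)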
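  27. 6.4.5 Gaussian processes for classification. First, let $\mathbf{a}_{N+1} = (a(\mathbf{x}_1), a(\mathbf{x}_2), \dots, a(\mathbf{x}_{N+1}))^T$ and, following the Gaussian process assumption, take the joint distribution of the activations to be
    $$p(\mathbf{a}_{N+1}) = \mathcal{N}(\mathbf{a}_{N+1} | \mathbf{0}, \mathbf{C}_{N+1}) \tag{6.74}$$
    where the covariance matrix $\mathbf{C}_{N+1}$ has elements
    $$(\mathbf{C}_{N+1})_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \nu \delta_{nm} \tag{6.75}$$
    and $\nu$ is a noise term (a small positive constant that keeps $\mathbf{C}_{N+1}$ positive definite).
    What we want is the distribution $p(t_{N+1} | \mathbf{t}_N)$ of the target $t_{N+1}$. For binary classification $p(t_{N+1} = 0 | \mathbf{t}_N) = 1 - p(t_{N+1} = 1 | \mathbf{t}_N)$, so it is enough to find $p(t_{N+1} = 1 | \mathbf{t}_N)$.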
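  28. 6.4.5 Gaussian processes for classification. Writing $a_{N+1} = a(\mathbf{x}_{N+1})$ for the activation at the new input, we have
    $$p(t_{N+1} = 1, \mathbf{t}_N) = \int p(t_{N+1} = 1, \mathbf{t}_N, a_{N+1}) \, da_{N+1}$$
    $$= \int p(t_{N+1} = 1 | \mathbf{t}_N, a_{N+1}) \, p(a_{N+1} | \mathbf{t}_N) \, p(\mathbf{t}_N) \, da_{N+1}$$
    $$= \int p(t_{N+1} = 1 | a_{N+1}) \, p(a_{N+1} | \mathbf{t}_N) \, p(\mathbf{t}_N) \, da_{N+1}$$
    Dividing by $p(\mathbf{t}_N)$ gives
    $$p(t_{N+1} = 1 | \mathbf{t}_N) = \int p(t_{N+1} = 1 | a_{N+1}) \, p(a_{N+1} | \mathbf{t}_N) \, da_{N+1} \tag{6.76}$$
    where $p(t_{N+1} = 1 | a_{N+1}) = \sigma(a_{N+1})$.
    This integral cannot be evaluated analytically, and various approximation methods are used for it. Here we use the Laplace approximation.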
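  29. 6.4.6 Laplace approximation. This section evaluates the integral (6.76) with the Laplace approximation.
    First, rewrite $p(a_{N+1} | \mathbf{t}_N)$ by marginalizing over $\mathbf{a}_N$ and using $p(a_{N+1} | \mathbf{a}_N, \mathbf{t}_N) = p(a_{N+1} | \mathbf{a}_N)$:
    $$p(a_{N+1} | \mathbf{t}_N) = \int p(a_{N+1} | \mathbf{a}_N) \, p(\mathbf{a}_N | \mathbf{t}_N) \, d\mathbf{a}_N \tag{6.77}$$
    where $p(\mathbf{a}_N | \mathbf{t}_N)$ is the posterior distribution.
    By analogy with the regression results (6.66) and (6.67) for $p(t_{N+1} | \mathbf{t}_N)$, the conditional distribution $p(a_{N+1} | \mathbf{a}_N)$ is
    $$p(a_{N+1} | \mathbf{a}_N) = \mathcal{N}(a_{N+1} | \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{a}_N, \; c - \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{k}) \tag{6.78}$$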
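  30. 6.4.6 Laplace approximation. We approximate $p(\mathbf{a}_N | \mathbf{t}_N)$ (the Laplace approximation).
    This requires the point $\mathbf{a}_N = \mathbf{a}_N^\star$ that satisfies
    $$\frac{\partial p(\mathbf{a}_N | \mathbf{t}_N)}{\partial \mathbf{a}_N} = \nabla p(\mathbf{a}_N | \mathbf{t}_N) = 0$$
    and the Hessian $-\nabla \nabla \ln p(\mathbf{a}_N | \mathbf{t}_N)$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^\star$. (See Sections 4.4 and 4.5.)
    The prior $p(\mathbf{a}_N)$ is taken to be $p(\mathbf{a}_N) = \mathcal{N}(\mathbf{a}_N | \mathbf{0}, \mathbf{C}_N)$, which is (6.74) with $N + 1 \to N$.
    By independence of the data points, the likelihood $p(\mathbf{t}_N | \mathbf{a}_N)$ is
    $$p(\mathbf{t}_N | \mathbf{a}_N) = \prod_{n=1}^{N} \sigma(a_n)^{t_n} (1 - \sigma(a_n))^{1 - t_n} = \prod_{n=1}^{N} e^{a_n t_n} \sigma(-a_n) \tag{6.79}$$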
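  31. 6.4.6 Laplace approximation. By Bayes' theorem $p(\mathbf{a}_N | \mathbf{t}_N) \propto p(\mathbf{t}_N | \mathbf{a}_N) \, p(\mathbf{a}_N)$, and carrying out the calculation gives
    $$\mathbf{a}_N^\star = \mathbf{C}_N (\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{6.84}$$
    where $\boldsymbol{\sigma}_N = (\sigma(a_1), \sigma(a_2), \dots, \sigma(a_N))^T$. Because $\boldsymbol{\sigma}_N$ is evaluated at $\mathbf{a}_N^\star$, (6.84) is an implicit equation and has to be solved iteratively.
    The Hessian $\mathbf{H}$ at $\mathbf{a}_N = \mathbf{a}_N^\star$ is
    $$\mathbf{H} = \mathbf{W}^\star + \mathbf{C}_N^{-1} \tag{6.85}$$
    where $\mathbf{W}$ is the diagonal matrix with diagonal elements $\sigma(a_n)(1 - \sigma(a_n))$ and $\mathbf{W}^\star$ is $\mathbf{W}$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^\star$.
    The posterior $p(\mathbf{a}_N | \mathbf{t}_N)$ is therefore approximated (Laplace approximation) by
    $$p(\mathbf{a}_N | \mathbf{t}_N) \simeq \mathcal{N}(\mathbf{a}_N | \mathbf{a}_N^\star, \mathbf{H}^{-1}) \tag{6.86}$$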
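Since (6.84) defines $\mathbf{a}_N^\star$ only implicitly, it has to be found numerically. The sketch below uses the Newton-Raphson update from PRML (6.83), which is not shown on the slides, and then checks the fixed-point condition (6.84). The data, the kernel, the value $\nu = 10^{-6}$, and the fixed iteration count are assumptions for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def laplace_mode(C_N, t, n_iter=50):
    """Newton-Raphson iteration for the mode a* of p(a_N | t_N)."""
    N = len(t)
    a = np.zeros(N)
    for _ in range(n_iter):
        s = sigmoid(a)
        W = np.diag(s * (1.0 - s))
        # Update a <- C_N (I + W C_N)^{-1} (t - sigma + W a), PRML (6.83).
        a = C_N @ np.linalg.solve(np.eye(N) + W @ C_N, t - s + W @ a)
    return a

# Hypothetical 1-D data: class 0 on the left, class 1 on the right.
X = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])[:, None]
t = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
sq_dists = (X - X.T) ** 2
C_N = np.exp(-0.5 * sq_dists) + 1e-6 * np.eye(len(t))   # (6.75) with nu = 1e-6

a_star = laplace_mode(C_N, t)
s_star = sigmoid(a_star)
H = np.diag(s_star * (1.0 - s_star)) + np.linalg.inv(C_N)   # Hessian (6.85)
residual = np.max(np.abs(a_star - C_N @ (t - s_star)))      # fixed point (6.84)
```

A production implementation would normally test for convergence instead of running a fixed number of iterations; the fixed count keeps the sketch short.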
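  32. 6.4.6 Laplace approximation. With (6.78) and (6.86), the integral (6.77) can be approximated as
    $$p(a_{N+1} | \mathbf{t}_N) \simeq \int \mathcal{N}(a_{N+1} | \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{a}_N, \; c - \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{k}) \, \mathcal{N}(\mathbf{a}_N | \mathbf{a}_N^\star, \mathbf{H}^{-1}) \, d\mathbf{a}_N$$
    By (2.115), $p(a_{N+1} | \mathbf{t}_N)$ is a Gaussian with the following mean and variance:
    $$\mathbb{E}[a_{N+1} | \mathbf{t}_N] = \mathbf{k}^T (\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{6.87}$$
    $$\mathrm{var}[a_{N+1} | \mathbf{t}_N] = c - \mathbf{k}^T (\mathbf{W}_N^{-1} + \mathbf{C}_N)^{-1} \mathbf{k} \tag{6.88}$$
    where $\mathbf{W}_N$ is the $\mathbf{W}^\star$ of (6.85).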
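The step from (2.115) to the compact variance (6.88) relies on a matrix-inversion identity. Applying (2.115) directly gives $c - \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{k} + \mathbf{k}^T \mathbf{C}_N^{-1} \mathbf{H}^{-1} \mathbf{C}_N^{-1} \mathbf{k}$, and the sketch below checks numerically that this equals (6.88). All matrices and vectors are random stand-ins, not quantities from a fitted model.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 5
A = rng.standard_normal((N, N))
C_N = A @ A.T + N * np.eye(N)              # random symmetric positive-definite matrix
W = np.diag(rng.uniform(0.05, 0.25, N))    # entries sigma(1 - sigma) lie in (0, 1/4]
k = rng.standard_normal(N)
c = 10.0                                   # hypothetical value of k(x_{N+1}, x_{N+1}) + nu

C_inv = np.linalg.inv(C_N)
H_inv = np.linalg.inv(W + C_inv)           # inverse of the Hessian (6.85)

# Variance from (2.115): conditional variance plus k^T C^-1 H^-1 C^-1 k.
var_from_2115 = c - k @ C_inv @ k + k @ C_inv @ H_inv @ C_inv @ k

# Compact form (6.88).
var_688 = c - k @ np.linalg.solve(np.linalg.inv(W) + C_N, k)
```

The agreement follows from the Woodbury identity $(\mathbf{C}_N + \mathbf{W}^{-1})^{-1} = \mathbf{C}_N^{-1} - \mathbf{C}_N^{-1} (\mathbf{W} + \mathbf{C}_N^{-1})^{-1} \mathbf{C}_N^{-1}$.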
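  33. 6.4.6 Laplace approximation. Now that $p(a_{N+1} | \mathbf{t}_N)$ is known, $p(t_{N+1} = 1 | \mathbf{t}_N)$ in (6.76) can be computed approximately with (4.153).
    As discussed in Section 6.4.3, $\mathbf{C}_N$ contains the hyperparameters $\boldsymbol{\theta}$, so we look for the $\boldsymbol{\theta}$ that maximizes $p(\mathbf{t}_N | \boldsymbol{\theta})$.
    As with (6.61), $p(\mathbf{t}_N | \boldsymbol{\theta})$ is given by
    $$p(\mathbf{t}_N | \boldsymbol{\theta}) = \int p(\mathbf{t}_N | \mathbf{a}_N) \, p(\mathbf{a}_N | \boldsymbol{\theta}) \, d\mathbf{a}_N \tag{6.89}$$
    but this integral cannot be evaluated analytically either.
    The Laplace approximation is again used to find the $\boldsymbol{\theta}$ that maximizes $p(\mathbf{t}_N | \boldsymbol{\theta})$.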
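The approximation (4.153) replaces the integral of a logistic sigmoid against a Gaussian by $\sigma(\kappa(\sigma^2) \mu)$ with $\kappa(\sigma^2) = (1 + \pi \sigma^2 / 8)^{-1/2}$. The sketch below compares it with a brute-force numerical integral. The values of `mu` and `var` are hypothetical stand-ins for (6.87) and (6.88), and the trapezoidal reference is only for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def predictive_probit_approx(mu, var):
    """Approximate the integral of sigmoid(a) * N(a | mu, var) via PRML (4.153)."""
    kappa = 1.0 / math.sqrt(1.0 + math.pi * var / 8.0)
    return sigmoid(kappa * mu)

def predictive_numerical(mu, var, n=20001, width=10.0):
    """Reference value by trapezoidal integration over mu +/- width * std."""
    std = math.sqrt(var)
    lo, hi = mu - width * std, mu + width * std
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        a = lo + i * step
        density = math.exp(-0.5 * ((a - mu) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * sigmoid(a) * density * step
    return total

mu, var = 1.0, 2.0           # hypothetical values of (6.87) and (6.88)
approx = predictive_probit_approx(mu, var)
exact = predictive_numerical(mu, var)
```

A larger variance pulls the predicted probability toward 0.5, which is how uncertainty about the activation shows up in the class probability.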
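  34. 6.4.7 Connection to neural networks. In a Bayesian neural network (→ Section 5.7) too, the output function (network function) $y(\mathbf{x}, \mathbf{w})$ together with the prior over $\mathbf{w}$ yields a prior distribution over output functions.
    Let $M$ be the number of hidden units. As $M \to \infty$, the prior over the network's output function is known to approach the prior over functions of a Gaussian process. (Neal 1996)
    In a neural network the components $y_k(\mathbf{x}, \mathbf{w})$ of the output function $\mathbf{y}(\mathbf{x}, \mathbf{w})$ are not independent, because they share weights.
    The fact that the network approaches a Gaussian process as $M \to \infty$ means that the outputs $y_k(\mathbf{x}, \mathbf{w})$ become independent in that limit.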