PRML (Regression Edition)
gucchi
August 16, 2019
Transcript
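A Seminar for Understanding Machine Learning in Depth Using PRML [Regression Edition]
1 / 47

Contents
1. Introduction
  1-1. A simple regression example (PRML 1.1)
  1-2. Probability and probability distributions
  1-3. Maximum likelihood estimation and Bayesian estimation
2. Linear regression models
  2-1. Linear basis function models
  2-2. Maximum likelihood estimation for linear basis function models
  2-3. Bayesian linear regression
2 / 47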
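1. Introduction
In machine learning, and in supervised learning in particular, we first prepare a set of input data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and the set of corresponding target vectors $\{\mathbf{t}_1, \mathbf{t}_2, \dots, \mathbf{t}_N\}$ (the training data, also called teacher data).
Using the training data, we build a function $\mathbf{y}(\mathbf{x})$ that predicts the target vector from the input (learning).
Once learning is finished, we predict the target vector of unseen data $\mathbf{x}$ with $\mathbf{y}(\mathbf{x})$.
3 / 47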
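1. Introduction
The main tasks of machine learning can be divided into regression and classification.
Regression is the problem of obtaining a continuous output from the input data (for example, the input is a photograph of a person and we predict that person's weight).
Classification is the problem of obtaining a discrete output from the input data (for example, the input is a photograph of a dog or a cat and we predict which of the two it is).
This seminar uses PRML as its text and deals with regression.
Concretely, the goal is a deep understanding of PRML Chapter 3.
4 / 47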
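1-1. A simple regression example
We first explain a simple regression problem intuitively (this corresponds to PRML 1.1).
As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \dots, x_N)^{\mathrm T}$ and the $N$ corresponding target variables $\mathbf{t} = (t_1, t_2, \dots, t_N)^{\mathrm T}$. (Because this is a regression problem, the outputs $t_n$ take continuous values.)
Here the training outputs $t_n$ are generated by adding random noise $\epsilon$, drawn from a Gaussian distribution (explained later), to $\sin(2\pi x_n)$ (see PRML Appendix A):
$$ t_n = \sin(2\pi x_n) + \epsilon \tag{1.1} $$
The goal of regression is to use the training data $(\mathbf{x}, \mathbf{t})$ to predict the output $\hat{t}$ for a new input $\hat{x}$ that is not contained in the training data.
5 / 47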
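The data-generating rule (1.1) can be sketched in a few lines of NumPy. This is only an illustration: the helper name `make_sine_data`, the evenly spaced inputs on [0, 1], the noise standard deviation of 0.3, and the seed are assumptions, since the slide does not state them.

```python
import numpy as np

def make_sine_data(n, noise_std=0.3, seed=0):
    """Return n inputs on [0, 1] and targets t_n = sin(2*pi*x_n) + eps, as in (1.1)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)               # assumed: evenly spaced inputs
    eps = rng.normal(0.0, noise_std, size=n)   # Gaussian noise; the std is an assumption
    t = np.sin(2.0 * np.pi * x) + eps
    return x, t

x_train, t_train = make_sine_data(10)          # N = 10 training points
```

Fixing the seed makes the sample reproducible, which is convenient when comparing fits of different polynomial orders on the same data.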
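1-1. A simple regression example
The figure shows an example with $N = 10$ training points (the blue circles are the training data). The green curve is $\sin(2\pi x)$.
The blue circles do not lie on the green curve because of the Gaussian noise in (1.1).
6 / 47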
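1-1. A simple regression example
We now use the training data to predict the output for an unseen input.
For the time being, in this section we consider using the training data to build a function $y(x)$ that takes an input variable $x$ and returns a prediction of the corresponding target variable $t$.
Concretely, we consider the following polynomial $y(x, \mathbf{w})$:
$$ y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{1.2} $$
The vector $\mathbf{w} = (w_0, w_1, \dots, w_M)^{\mathrm T}$ is called the weight parameter.
Our goal is to use the training data $(\mathbf{x}, \mathbf{t})$ to tune the parameter $\mathbf{w}$ so that $y(x, \mathbf{w})$ becomes a good predictive function.
7 / 47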
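1-1. A simple regression example
Next, we consider intuitively how $\mathbf{w}$ should be tuned so that $y(x, \mathbf{w})$ is suitable as a predictive function.
A good $\mathbf{w}$ should make $y(x_n, \mathbf{w})$ close to the target $t_n$ for every training point.
We therefore look for the $\mathbf{w}$ (denoted $\mathbf{w}^\star$) that minimizes the following error function $E(\mathbf{w})$:
$$ E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 \tag{1.3} $$
$y(x, \mathbf{w}^\star)$ is then regarded as a suitable predictive function.
The concrete method for minimizing $E(\mathbf{w})$ is discussed later (in Section 2 of these slides).
8 / 47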
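1-1. A simple regression example
The figure shows the fitting results for polynomial orders $M = 0, 1, 3, 9$. (Green is $\sin(2\pi x)$ and red is $y(x, \mathbf{w}^\star)$.)
Among these, $M = 3$ appears to fit $\sin(2\pi x)$ best.
For $M = 9$ we have $E(\mathbf{w}^\star) = 0$, because a polynomial with 10 coefficients can pass exactly through all 10 training points, yet the curve does not match $\sin(2\pi x)$ (over-fitting).
9 / 47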
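The behaviour described on this slide can be reproduced with a short least-squares sketch. It minimizes $E(\mathbf{w})$ from (1.3) with `np.linalg.lstsq`, which is a choice made here for illustration and not necessarily how the figure was produced. The data settings (evenly spaced inputs, noise standard deviation 0.3, seed 0) and the helper names are assumptions.

```python
import numpy as np

def fit_polynomial(x, t, M):
    """Least-squares weights w* of an order-M polynomial, minimizing E(w) in (1.3)."""
    A = np.vander(x, M + 1, increasing=True)   # columns x^0, x^1, ..., x^M
    w, *_ = np.linalg.lstsq(A, t, rcond=None)
    return w

def sum_of_squares_error(w, x, t):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2."""
    y = np.vander(x, len(w), increasing=True) @ w
    return 0.5 * np.sum((y - t) ** 2)

rng = np.random.default_rng(0)                 # assumed seed
x = np.linspace(0.0, 1.0, 10)                  # N = 10, assumed evenly spaced
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=10)  # assumed noise std

# Training error E(w*) for each polynomial order on the slide
errors = {M: sum_of_squares_error(fit_polynomial(x, t, M), x, t) for M in (0, 1, 3, 9)}
```

The training error can only decrease as $M$ grows, because each lower-order polynomial is a special case of the higher-order one. At $M = 9$ it vanishes up to rounding, which says nothing about how well the curve matches $\sin(2\pi x)$ between the data points.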
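1-1. A simple regression example
Countermeasures against over-fitting are not covered in this seminar (see PRML 1.1).
Up to this point the discussion of fitting has been very intuitive.
From here on we introduce probability theory and solve pattern recognition problems in a more principled way.
10 / 47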
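1-2. Probability and probability distributions
Uncertainty is a key concept in pattern recognition, and we introduce probability theory to quantify it.
Consider random variables $X$ and $Y$ that take the values $X = x_i$ ($i = 1, 2, \dots, M$) and $Y = y_j$ ($j = 1, 2, \dots, L$). The probability that $X = x_i$ and $Y = y_j$ (the joint probability) is written $p(X = x_i, Y = y_j)$.
The probability $p(X = x_i)$ that $X = x_i$ can be written in terms of $p(X = x_i, Y = y_j)$ as follows (sum rule):
$$ p(X = x_i) = \sum_{j=1}^{L} p(X = x_i, Y = y_j) \tag{1.4} $$
Writing $p(Y = y_j \mid X = x_i)$ for the probability that $Y = y_j$ given $X = x_i$ (the conditional probability), the following relation holds (product rule):
$$ p(X = x_i, Y = y_j) = p(Y = y_j \mid X = x_i)\, p(X = x_i) \tag{1.5} $$
11 / 47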
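1-2. Probability and probability distributions
Bayes' theorem follows from the product rule and the symmetry of the joint probability, $p(X, Y) = p(Y, X)$:
$$ p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)} \tag{1.6} $$
Here $p(Y)$ is called the prior probability (the probability before $X$ is given) and $p(Y \mid X)$ the posterior probability (the probability after $X$ is given).
Bayes' theorem says that multiplying the prior $p(Y)$ by $p(X \mid Y)$ gives the posterior $p(Y \mid X)$, up to normalization ($p(X)$ is the normalization constant that guarantees $p(Y \mid X)$ is normalized with respect to $Y$).
Furthermore, when the joint distribution $p(X, Y)$ can be written as the product of the marginals as below, $X$ and $Y$ are said to be independent:
$$ p(X, Y) = p(X)\, p(Y) \tag{1.7} $$
12 / 47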
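The sum rule, the product rule, and Bayes' theorem can be checked numerically on a small joint table. The numbers in `joint` are made up for illustration and do not come from the slides.

```python
import numpy as np

# Hypothetical joint table p(X = x_i, Y = y_j): rows index i, columns index j.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.05, 0.25]])

p_x = joint.sum(axis=1)                  # sum rule (1.4): marginal of X
p_y = joint.sum(axis=0)                  # sum rule applied to Y
p_y_given_x = joint / p_x[:, None]       # product rule (1.5), solved for p(Y|X)
p_x_given_y = joint / p_y[None, :]       # same rule with the roles of X and Y swapped

# Bayes' theorem (1.6): p(Y|X) = p(X|Y) p(Y) / p(X)
bayes = p_x_given_y * p_y[None, :] / p_x[:, None]

# Independence (1.7) would require joint == outer(p_x, p_y); it fails for this table.
independent = bool(np.allclose(joint, np.outer(p_x, p_y)))
```

Each row of `p_y_given_x` sums to one, which is exactly the normalization that dividing by $p(X)$ guarantees.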
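1-2. Probability and probability distributions
So far we have considered discrete random variables. Next we consider distributions of continuous random variables.
If the probability that a random variable $x$ falls in the range $(x, x + \delta x)$ is given by $p(x)\,\delta x$ as $\delta x \to 0$, then $p(x)$ is called the probability density.
The probability that $x$ lies in the interval $(a, b)$ is then
$$ p(x \in (a, b)) = \int_a^b p(x)\, \mathrm{d}x \tag{1.8} $$
From the non-negativity and normalization of probability, $p(x)$ has the following properties:
$$ p(x) \geq 0 \tag{1.9} $$
$$ \int_{-\infty}^{\infty} p(x)\, \mathrm{d}x = 1 \tag{1.10} $$
13 / 47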
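1-2. Probability and probability distributions
An important operation in probability is the weighted average.
For a continuous random variable $x$, the average of a function $f(x)$ under the distribution $p(x)$ is
$$ \mathbb{E}[f] = \int p(x)\, f(x)\, \mathrm{d}x \tag{1.11} $$
As a notational convention, a subscript indicates which variable is summed (or integrated) over. For example, the following quantity is a sum (or integral) over $x$:
$$ \mathbb{E}_x[f(x, y)] \tag{1.12} $$
14 / 47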
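1-2. Probability and probability distributions
The variance of a function $f(x)$ under the distribution $p(x)$ is given below. (It measures how much $f(x)$ varies around its mean $\mathbb{E}[f(x)]$.)
$$ \mathrm{var}[f] = \mathbb{E}\left[ (f(x) - \mathbb{E}[f(x)])^2 \right] \tag{1.13} $$
In particular, for $f(x) = x$:
$$ \mathrm{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 \tag{1.14} $$
The covariance between two random variables $x$ and $y$ (which expresses how the two variables depend on each other) is defined as
$$ \mathrm{cov}[x, y] = \mathbb{E}_{x,y}\left[ \{x - \mathbb{E}[x]\}\{y - \mathbb{E}[y]\} \right] = \mathbb{E}_{x,y}[xy] - \mathbb{E}[x]\,\mathbb{E}[y] \tag{1.15} $$
When $x$ and $y$ are independent, $\mathrm{cov}[x, y] = 0$.
15 / 47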
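The identities (1.14) and (1.15) can be verified exactly on a small discrete distribution, where expectations are finite sums rather than integrals. The support points and probabilities below are invented for illustration.

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0])           # hypothetical values of x
ys = np.array([-1.0, 1.0])               # hypothetical values of y
joint = np.array([[0.10, 0.20],
                  [0.25, 0.15],
                  [0.05, 0.25]])         # hypothetical p(x, y); entries sum to 1
X, Y = np.meshgrid(xs, ys, indexing="ij")

def expect(values, p=joint):
    """Expectation of an array of values under the joint table p."""
    return np.sum(p * values)

mean_x, mean_y = expect(X), expect(Y)
var_def = expect((X - mean_x) ** 2)                  # definition (1.13) with f(x) = x
var_short = expect(X ** 2) - mean_x ** 2             # shortcut (1.14)
cov_def = expect((X - mean_x) * (Y - mean_y))        # definition in (1.15)
cov_short = expect(X * Y) - mean_x * mean_y          # shortcut in (1.15)

# An independent joint with the same marginals has zero covariance.
indep = np.outer(joint.sum(axis=1), joint.sum(axis=0))
cov_indep = expect(X * Y, indep) - expect(X, indep) * expect(Y, indep)
```

Zero covariance follows from independence, but the converse does not hold in general, so `cov_indep == 0` is a consequence of independence and not a test for it.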
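1-2. Probability and probability distributions
Next we describe the Gaussian distribution, the most important distribution for continuous variables.
The Gaussian distribution is defined as follows. (It has two parameters, the mean $\mu$ and the variance $\sigma^2$.)
$$ \mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\} \tag{1.16} $$
16 / 47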
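1-2. Probability and probability distributions
An important property of the Gaussian distribution is that the mean and variance of $x$ are given by $\mu$ and $\sigma^2$, respectively:
$$ \mathbb{E}[x] = \int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^2)\, x\, \mathrm{d}x = \mu \tag{1.17} $$
$$ \mathrm{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 = \sigma^2 \tag{1.18} $$
17 / 47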
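1-2. Probability and probability distributions
Next we introduce the multivariate Gaussian distribution over a $D$-dimensional vector $\mathbf{x}$:
$$ \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\boldsymbol{\Sigma}|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\mathrm T} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\} \tag{1.19} $$
Here $\boldsymbol{\mu}$ is the $D$-dimensional mean vector and $\boldsymbol{\Sigma}$ is the $D \times D$ covariance matrix.
In this case the mean and covariance satisfy
$$ \mathbb{E}[\mathbf{x}] = \int \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})\, \mathbf{x}\, \mathrm{d}\mathbf{x} = \boldsymbol{\mu} \tag{1.20} $$
$$ \mathrm{cov}[\mathbf{x}] = \mathbb{E}\left[ (\mathbf{x} - \mathbb{E}[\mathbf{x}])(\mathbf{x} - \mathbb{E}[\mathbf{x}])^{\mathrm T} \right] = \boldsymbol{\Sigma} \tag{1.21} $$
18 / 47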
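1-2. Probability and probability distributions
We introduce Gaussian formulas that are used frequently in the rest of the discussion.
Suppose the following marginal distribution $p(\mathbf{x})$ and conditional distribution $p(\mathbf{y} \mid \mathbf{x})$ are given:
$$ p(\mathbf{x}) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1}) \tag{1.22} $$
$$ p(\mathbf{y} \mid \mathbf{x}) = \mathcal{N}(\mathbf{y} \mid \mathbf{A}\mathbf{x} + \mathbf{b}, \mathbf{L}^{-1}) \tag{1.23} $$
Here $\boldsymbol{\mu}$, $\mathbf{A}$, and $\mathbf{b}$ are parameters governing the means, and $\boldsymbol{\Lambda}$ and $\mathbf{L}$ are precision matrices.
Then the marginal distribution $p(\mathbf{y})$ and the conditional distribution $p(\mathbf{x} \mid \mathbf{y})$ are
$$ p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{A}\boldsymbol{\mu} + \mathbf{b}, \mathbf{L}^{-1} + \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^{\mathrm T}) \tag{1.24} $$
$$ p(\mathbf{x} \mid \mathbf{y}) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\Sigma}\{\mathbf{A}^{\mathrm T}\mathbf{L}(\mathbf{y} - \mathbf{b}) + \boldsymbol{\Lambda}\boldsymbol{\mu}\}, \boldsymbol{\Sigma}) \tag{1.25} $$
where $\boldsymbol{\Sigma}$ is defined by
$$ \boldsymbol{\Sigma} = (\boldsymbol{\Lambda} + \mathbf{A}^{\mathrm T}\mathbf{L}\mathbf{A})^{-1} \tag{1.26} $$
(See PRML 2.3.3 for the detailed derivation.)
19 / 47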
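As a sanity check on (1.24) to (1.26), the sketch below evaluates them for arbitrary made-up parameters and compares the result with the standard formula for conditioning a joint Gaussian on a block of its variables. The function name `linear_gaussian` and all numerical values are assumptions for illustration.

```python
import numpy as np

def linear_gaussian(mu, Lam, A, b, L, y):
    """Mean and covariance of p(y) from (1.24), and of p(x|y) from (1.25)-(1.26)."""
    y_mean = A @ mu + b
    y_cov = np.linalg.inv(L) + A @ np.linalg.inv(Lam) @ A.T
    Sigma = np.linalg.inv(Lam + A.T @ L @ A)
    x_post_mean = Sigma @ (A.T @ L @ (y - b) + Lam @ mu)
    return y_mean, y_cov, x_post_mean, Sigma

# Hypothetical parameters: x is 2-dimensional, y is 3-dimensional.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 0.5], [0.5, 1.0]])              # precision of p(x)
A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, -1.0]])
b = np.array([0.5, 0.0, -0.5])
L = np.diag([4.0, 1.0, 2.0])                          # precision of p(y|x)
y = np.array([0.3, -0.2, 1.0])

y_mean, y_cov, x_post_mean, Sigma = linear_gaussian(mu, Lam, A, b, L, y)

# Cross-check: condition the joint Gaussian of (x, y) on y directly.
S_xx = np.linalg.inv(Lam)                 # cov[x]
S_xy = S_xx @ A.T                         # cov[x, y], since y = A x + b + noise
gain = S_xy @ np.linalg.inv(y_cov)
cond_mean = mu + gain @ (y - y_mean)
cond_cov = S_xx - gain @ S_xy.T
```

Both routes give the same mean and covariance for $p(\mathbf{x} \mid \mathbf{y})$. The form in (1.25) and (1.26) only inverts matrices of the size of $\mathbf{x}$, which is why it is convenient for Bayesian linear regression, where $\mathbf{x}$ plays the role of the weight vector.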
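1-2. Probability and probability distributions
Suppose also that the joint distribution $p(\mathbf{x}_a, \mathbf{x}_b)$ is given by
$$ p(\mathbf{x}_a, \mathbf{x}_b) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) \tag{1.27} $$
where $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)^{\mathrm T}$.
The marginal distribution $p(\mathbf{x}_a)$ is then known to be the following Gaussian (see PRML 2.3.2 for the detailed derivation):
$$ p(\mathbf{x}_a) = \int p(\mathbf{x}_a, \mathbf{x}_b)\, \mathrm{d}\mathbf{x}_b = \mathcal{N}(\mathbf{x}_a \mid \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_{aa}) \tag{1.28} $$
Here $\boldsymbol{\mu}_a$ and $\boldsymbol{\Sigma}_{aa}$ are defined through the partition
$$ \boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{aa} & \boldsymbol{\Sigma}_{ab} \\ \boldsymbol{\Sigma}_{ba} & \boldsymbol{\Sigma}_{bb} \end{pmatrix} \tag{1.29} $$
20 / 47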
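1-3. Maximum likelihood estimation and Bayesian estimation
We explain Bayesian estimation using polynomial curve fitting as the example.
In the Bayesian interpretation of probability, before observing any data we first encode our hypotheses about the parameter $\mathbf{w}$ in the form of a prior distribution $p(\mathbf{w})$.
We then use the actual input data $\mathbf{x} = (x_1, x_2, \dots, x_N)^{\mathrm T}$ and target variables $\mathbf{t} = (t_1, t_2, \dots, t_N)^{\mathrm T}$ to obtain the likelihood function $p(\mathbf{t} \mid \mathbf{x}, \mathbf{w})$.
Bayes' theorem then gives the posterior $p(\mathbf{w} \mid \mathbf{t}, \mathbf{x})$, where the normalizing denominator is the probability of the targets given the inputs:
$$ p(\mathbf{w} \mid \mathbf{t}, \mathbf{x}) = \frac{p(\mathbf{t} \mid \mathbf{x}, \mathbf{w})\, p(\mathbf{w})}{p(\mathbf{t} \mid \mathbf{x})} \tag{1.30} $$
21 / 47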
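1-3. Maximum likelihood estimation and Bayesian estimation
In Bayesian estimation, given the training data $\mathbf{x}, \mathbf{t}$ and an unseen input $x$, the probability distribution $p(t \mid x, \mathbf{t}, \mathbf{x})$ of the predicted value $t$ is obtained as
$$ p(t \mid x, \mathbf{t}, \mathbf{x}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{t}, \mathbf{x})\, \mathrm{d}\mathbf{w} \tag{1.31} $$
(The derivation of this predictive distribution is summarized in the Qiita article below. Please take a look, and please give it a like.)
https://qiita.com/gucchi0403/items/bfffd2586272a4c05a73
22 / 47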
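1-3. Maximum likelihood estimation and Bayesian estimation
The role of the likelihood function $p(\mathcal{D} \mid \mathbf{w})$ differs between the frequentist and the Bayesian interpretations of probability.
In the frequentist interpretation, $\mathbf{w}$ is regarded as a fixed parameter, and the $\mathbf{w}$ that maximizes the likelihood function $p(\mathcal{D} \mid \mathbf{w})$ is taken as the estimator. ($\mathbf{w}$ is determined as a single value.)
In the Bayesian interpretation, the likelihood function is used to update the prior distribution into the posterior distribution using the observed data $\mathcal{D}$. (The posterior $p(\mathbf{w} \mid \mathcal{D})$ is a probability distribution over $\mathbf{w}$, so $\mathbf{w}$ carries uncertainty.)
A concrete example of the latter use of the likelihood function is given later.
23 / 47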
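2-1. Linear basis function models
The simple regression model explained at the beginning took the output $y(x, \mathbf{w})$ to be a polynomial in the input variable $x$:
$$ y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{2.1} $$
where $\mathbf{w} = (w_0, w_1, \dots, w_M)^{\mathrm T}$ is the parameter vector.
In this section we generalize: the input becomes a vector $\mathbf{x}$, and the function $y(\mathbf{x}, \mathbf{w})$ is expanded in nonlinear basis functions $\phi_j(\mathbf{x})$ ($j = 1, \dots, M - 1$) as
$$ y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{j=1}^{M-1} w_j \phi_j(\mathbf{x}) \tag{2.2} $$
24 / 47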
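2-1. Linear basis function models
To shorten the expression, set $\phi_0(\mathbf{x}) = 1$ and define $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \dots, \phi_{M-1}(\mathbf{x}))^{\mathrm T}$. Then (2.2) can be written as
$$ y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^{\mathrm T} \boldsymbol{\phi}(\mathbf{x}) \tag{2.3} $$
One example of a basis function $\phi_j(x)$ is the Gaussian basis function
$$ \phi_j(x) = \exp\left\{ -\frac{(x - \mu_j)^2}{2 s^2} \right\} \tag{2.4} $$
This basis function is centered at $x = \mu_j$ and has a spread governed by $s^2$.
From here on we work with general basis functions $\phi_j(\mathbf{x})$.
25 / 47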
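2-2. Maximum likelihood estimation for linear basis function models
In the regression problem of the first section, we fitted a polynomial to the data by minimizing the sum-of-squares error.
Here we assume that the target variable $t$ is the sum of a deterministic function $y(\mathbf{x}, \mathbf{w})$ and a noise term $\epsilon$ that follows a Gaussian distribution $\mathcal{N}(\epsilon \mid 0, \beta^{-1})$ with zero mean and precision $\beta > 0$:
$$ t = y(\mathbf{x}, \mathbf{w}) + \epsilon \tag{2.5} $$
Since $\epsilon = t - y(\mathbf{x}, \mathbf{w})$, the target variable $t$ follows a Gaussian distribution:
$$ p(t \mid \mathbf{x}, \mathbf{w}, \beta) = \mathcal{N}(t - y(\mathbf{x}, \mathbf{w}) \mid 0, \beta^{-1}) = \mathcal{N}(t \mid y(\mathbf{x}, \mathbf{w}), \beta^{-1}) \tag{2.6} $$
26 / 47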
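2-2. Maximum likelihood estimation for linear basis function models
We prepare a set of inputs $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and the set of corresponding target variables $\{t_1, t_2, \dots, t_N\}$, and define the vector $\mathbf{t} = (t_1, t_2, \dots, t_N)^{\mathrm T}$ by stacking the targets in a column.
If the observations $\{t_1, t_2, \dots, t_N\}$ are generated independently from (2.6), the likelihood function is the product of the likelihoods of the individual data points:
$$ p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{2.7} $$
where the Gaussian distribution $\mathcal{N}(x \mid \mu, \sigma^2)$ is
$$ \mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\} \tag{2.8} $$
27 / 47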
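2-2. Maximum likelihood estimation for linear basis function models
Instead of finding the parameters that maximize the likelihood function (2.7), we find the parameters that maximize its logarithm.
First,
$$ \begin{aligned} \ln \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) &= \ln\left[ \frac{\beta^{1/2}}{(2\pi)^{1/2}} \exp\left\{ -\frac{\beta}{2}(t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right\} \right] \\ &= \frac{1}{2}\ln\beta - \frac{1}{2}\ln(2\pi) - \frac{\beta}{2}(t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \end{aligned} \tag{2.9} $$
so $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)$ becomes
$$ \begin{aligned} \ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) &= \sum_{n=1}^{N} \ln \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \\ &= \sum_{n=1}^{N} \left[ \frac{1}{2}\ln\beta - \frac{1}{2}\ln(2\pi) - \frac{\beta}{2}(t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right] \\ &= \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \frac{\beta}{2}\sum_{n=1}^{N}(t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \end{aligned} \tag{2.10} $$
28 / 47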
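2-2. Maximum likelihood estimation for linear basis function models
Defining the sum-of-squares error $E_D(\mathbf{w})$ as
$$ E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{2.11} $$
the log-likelihood becomes
$$ \ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) = \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \beta E_D(\mathbf{w}) \tag{2.12} $$
To find the maximum likelihood solutions $\mathbf{w}_{\mathrm{ML}}$ and $\beta_{\mathrm{ML}}$, we compute the gradient of the log-likelihood $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)$.
The gradient with respect to $\mathbf{w}$ depends on $\beta$ only through an overall positive factor, so the point where it vanishes does not depend on $\beta$. We can therefore find $\mathbf{w}_{\mathrm{ML}}$ first and then use $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta)$ to find $\beta_{\mathrm{ML}}$.
29 / 47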
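The closed form (2.12) can be compared against a direct sum of log Gaussian densities as in (2.7) and (2.8). The targets, model outputs, and the value of $\beta$ below are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import numpy as np

def log_likelihood(t, y, beta):
    """Log-likelihood (2.12) for targets t, model outputs y, and noise precision beta."""
    N = len(t)
    E_D = 0.5 * np.sum((t - y) ** 2)               # sum-of-squares error (2.11)
    return 0.5 * N * np.log(beta) - 0.5 * N * np.log(2.0 * np.pi) - beta * E_D

def log_gaussian(x, mean, var):
    """Logarithm of the univariate Gaussian density (2.8)."""
    return -0.5 * np.log(2.0 * np.pi * var) - (x - mean) ** 2 / (2.0 * var)

t = np.array([0.1, -0.4, 0.9])                     # hypothetical targets
y = np.array([0.0, -0.5, 1.0])                     # hypothetical model outputs y(x_n, w)
beta = 25.0                                        # hypothetical precision

closed_form = log_likelihood(t, y, beta)
direct_sum = np.sum(log_gaussian(t, y, 1.0 / beta))   # log of the product (2.7)
```

Because only the last term of (2.12) involves the model outputs, changing `y` affects the log-likelihood solely through $E_D(\mathbf{w})$.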
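2-2. Maximum likelihood estimation for linear basis function models
First consider maximizing the log-likelihood (2.12) with respect to $\mathbf{w}$. The first and second terms on the right-hand side of (2.12) do not depend on $\mathbf{w}$, so this is equivalent to maximizing the third term, $-\beta E_D(\mathbf{w})$.
Since $\beta > 0$, maximizing the log-likelihood (2.12) with respect to $\mathbf{w}$ is equivalent to minimizing the sum-of-squares error $E_D(\mathbf{w})$ in (2.11) with respect to $\mathbf{w}$.
In 1-1 we minimized the sum-of-squares error (1.3) heuristically. Probability theory shows that minimizing (1.3) is the result of maximum likelihood estimation when the likelihood function is assumed to be Gaussian.
We now compute the gradient of the log-likelihood with respect to $\mathbf{w}$.
30 / 47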
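2-2. Maximum likelihood estimation for linear basis function models
Using $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathrm T}\boldsymbol{\phi}(\mathbf{x})$, the gradient of the log-likelihood with respect to $\mathbf{w}$ is
$$ \begin{aligned} \frac{\partial}{\partial \mathbf{w}} \ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) &= -\beta \frac{\partial}{\partial \mathbf{w}} E_D(\mathbf{w}) \\ &= -\frac{\beta}{2} \sum_{n=1}^{N} \frac{\partial}{\partial \mathbf{w}} (t_n - \mathbf{w}^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n))^2 \\ &= \beta \sum_{n=1}^{N} (t_n - \mathbf{w}^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n))\, \boldsymbol{\phi}(\mathbf{x}_n) \\ &= \beta \left\{ \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n)\boldsymbol{\phi}(\mathbf{x}_n)^{\mathrm T} \mathbf{w} \right\} \end{aligned} \tag{2.13} $$
so the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}$ satisfies
$$ \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n)\boldsymbol{\phi}(\mathbf{x}_n)^{\mathrm T} \mathbf{w}_{\mathrm{ML}} = \mathbf{0} \tag{2.14} $$
31 / 47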
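2-2. Maximum likelihood estimation for linear basis function models
We define the design matrix $\boldsymbol{\Phi}$ as
$$ \boldsymbol{\Phi} = \begin{pmatrix} \phi_0(\mathbf{x}_1) & \phi_1(\mathbf{x}_1) & \cdots & \phi_{M-1}(\mathbf{x}_1) \\ \phi_0(\mathbf{x}_2) & \phi_1(\mathbf{x}_2) & \cdots & \phi_{M-1}(\mathbf{x}_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_0(\mathbf{x}_N) & \phi_1(\mathbf{x}_N) & \cdots & \phi_{M-1}(\mathbf{x}_N) \end{pmatrix} = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}_1)^{\mathrm T} \\ \boldsymbol{\phi}(\mathbf{x}_2)^{\mathrm T} \\ \vdots \\ \boldsymbol{\phi}(\mathbf{x}_N)^{\mathrm T} \end{pmatrix} \tag{2.15} $$
The following identities hold:
$$ \boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi} = \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n)\boldsymbol{\phi}(\mathbf{x}_n)^{\mathrm T} \tag{2.16} $$
$$ \boldsymbol{\Phi}^{\mathrm T}\mathbf{t} = \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) \tag{2.17} $$
Hence (2.14) becomes
$$ \boldsymbol{\Phi}^{\mathrm T}\mathbf{t} - \boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi}\,\mathbf{w}_{\mathrm{ML}} = \mathbf{0} \tag{2.18} $$
32 / 47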
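2-2. Maximum likelihood estimation for linear basis function models
Therefore the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}$ is
$$ \mathbf{w}_{\mathrm{ML}} = (\boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi})^{-1}\boldsymbol{\Phi}^{\mathrm T}\mathbf{t} \tag{2.19} $$
Next, substituting $\mathbf{w}_{\mathrm{ML}}$ and differentiating $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta)$ with respect to $\beta$ gives
$$ \frac{\partial}{\partial \beta} \ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta) = \frac{N}{2}\frac{1}{\beta} - E_D(\mathbf{w}_{\mathrm{ML}}) \tag{2.20} $$
Setting this derivative to zero, the inverse of the maximum likelihood solution $\beta_{\mathrm{ML}}$ is
$$ \frac{1}{\beta_{\mathrm{ML}}} = \frac{2}{N} E_D(\mathbf{w}_{\mathrm{ML}}) = \frac{1}{N} \sum_{n=1}^{N} (t_n - \mathbf{w}_{\mathrm{ML}}^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n))^2 \tag{2.21} $$
33 / 47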
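The maximum likelihood solutions (2.19) and (2.21) translate almost directly into NumPy. The sketch below uses the Gaussian basis functions of (2.4) on one-dimensional inputs. The basis centers, the width `s`, the synthetic data, and the helper names are assumptions made for illustration.

```python
import numpy as np

def gaussian_design_matrix(x, centers, s):
    """Design matrix (2.15): a column of ones (phi_0 = 1) followed by Gaussian bases (2.4)."""
    x = np.asarray(x, dtype=float)
    phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2.0 * s ** 2))
    return np.hstack([np.ones((len(x), 1)), phi])

def fit_ml(Phi, t):
    """Maximum likelihood weights (2.19) and noise precision (2.21)."""
    w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)    # solves the normal equations (2.18)
    beta_ml = 1.0 / np.mean((t - Phi @ w_ml) ** 2)
    return w_ml, beta_ml

rng = np.random.default_rng(0)                        # assumed seed
x = rng.uniform(0.0, 1.0, size=50)                    # assumed synthetic inputs
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=50)

Phi = gaussian_design_matrix(x, centers=np.linspace(0.0, 1.0, 4), s=0.15)
w_ml, beta_ml = fit_ml(Phi, t)
```

Solving the linear system (2.18) with `np.linalg.solve` avoids forming the explicit inverse in (2.19). For badly conditioned design matrices, `np.linalg.lstsq(Phi, t, rcond=None)` is a more robust alternative that returns the same minimizer.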
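2-2. Maximum likelihood estimation for linear basis function models
It follows that, given a new input vector $\mathbf{x}$, the predictive distribution $p(t \mid \mathbf{x}, \mathbf{w}_{\mathrm{ML}}, \beta_{\mathrm{ML}})$ of the target variable $t$ is
$$ p(t \mid \mathbf{x}, \mathbf{w}_{\mathrm{ML}}, \beta_{\mathrm{ML}}) = \mathcal{N}(t \mid y(\mathbf{x}, \mathbf{w}_{\mathrm{ML}}), \beta_{\mathrm{ML}}^{-1}) \tag{2.22} $$
where $\mathbf{w}_{\mathrm{ML}}$ and $\beta_{\mathrm{ML}}$ are given by (2.19) and (2.21).
34 / 47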
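2-3. Bayesian linear regression
Next we treat the linear regression model in a Bayesian way.
We assume the following prior distribution with mean $\mathbf{m}_0$ and covariance $\mathbf{S}_0$:
$$ p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{m}_0, \mathbf{S}_0) \tag{2.23} $$
The likelihood function is
$$ p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{2.24} $$
so by Bayes' theorem the posterior distribution $p(\mathbf{w} \mid \mathbf{t})$ is
$$ \begin{aligned} p(\mathbf{w} \mid \mathbf{t}) &\propto p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)\, p(\mathbf{w}) \\ &\propto \exp\left( -\frac{\beta}{2} \sum_{n=1}^{N} (t_n - \mathbf{w}^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n))^2 \right) \times \exp\left( -\frac{1}{2} (\mathbf{w} - \mathbf{m}_0)^{\mathrm T} \mathbf{S}_0^{-1} (\mathbf{w} - \mathbf{m}_0) \right) \end{aligned} \tag{2.25} $$
35 / 47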
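2-3. Bayesian linear regression
From (2.25), the exponent is quadratic in $\mathbf{w}$, so $p(\mathbf{w} \mid \mathbf{t})$ is a Gaussian distribution.
Concretely, $p(\mathbf{w} \mid \mathbf{t})$ is as follows (see PRML Exercise 3.7):
$$ p(\mathbf{w} \mid \mathbf{t}) = \mathcal{N}(\mathbf{w} \mid \mathbf{m}_N, \mathbf{S}_N) \tag{2.26} $$
where $\mathbf{m}_N$ and $\mathbf{S}_N$ are
$$ \mathbf{m}_N = \mathbf{S}_N (\mathbf{S}_0^{-1}\mathbf{m}_0 + \beta \boldsymbol{\Phi}^{\mathrm T}\mathbf{t}) \tag{2.27} $$
$$ \mathbf{S}_N^{-1} = \mathbf{S}_0^{-1} + \beta \boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi} \tag{2.28} $$
36 / 47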
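A minimal sketch of the posterior update (2.27) and (2.28) follows. The function name `posterior`, the three data points, and the prior settings are illustrative assumptions.

```python
import numpy as np

def posterior(Phi, t, beta, m0, S0):
    """Posterior mean m_N (2.27) and covariance S_N (2.28) of the weights."""
    S0_inv = np.linalg.inv(S0)
    SN = np.linalg.inv(S0_inv + beta * Phi.T @ Phi)
    mN = SN @ (S0_inv @ m0 + beta * Phi.T @ t)
    return mN, SN

# Hypothetical data for a straight-line model y = w0 + w1 * x
x = np.array([-0.5, 0.0, 0.8])
t = np.array([-0.6, -0.2, 0.1])
Phi = np.column_stack([np.ones_like(x), x])     # phi(x) = (1, x)^T

m0, S0, beta = np.zeros(2), np.eye(2) / 2.0, 25.0   # assumed prior and noise precision
mN, SN = posterior(Phi, t, beta, m0, S0)

# With no observations the posterior reduces to the prior.
m_empty, S_empty = posterior(np.zeros((0, 2)), np.zeros(0), beta, m0, S0)
```

Each observation adds a positive semi-definite term to the precision $\mathbf{S}_N^{-1}$, so the posterior covariance can only shrink as data arrive.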
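2-3. Bayesian linear regression
We now examine the relation between the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}$ in (2.19), the mode $\mathbf{w}_{\mathrm{MAP}}$ of the posterior $p(\mathbf{w} \mid \mathbf{t})$ (the mode is the $\mathbf{w}$ that maximizes $p(\mathbf{w} \mid \mathbf{t})$), and the posterior mean $\mathbf{m}_N$.
First, the mode of a Gaussian distribution equals its mean (see PRML Exercise 1.9), so $\mathbf{w}_{\mathrm{MAP}} = \mathbf{m}_N$.
Furthermore, for an infinitely broad prior $\mathbf{S}_0 = \alpha^{-1}\mathbf{I}$ ($\alpha \to 0$) we have
$$ \mathbf{S}_N^{-1} = \mathbf{S}_0^{-1} + \beta \boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi} \to \beta \boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi} \tag{2.29} $$
and therefore
$$ \mathbf{m}_N = \mathbf{S}_N (\mathbf{S}_0^{-1}\mathbf{m}_0 + \beta \boldsymbol{\Phi}^{\mathrm T}\mathbf{t}) \to (\boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi})^{-1}\boldsymbol{\Phi}^{\mathrm T}\mathbf{t} \tag{2.30} $$
so in this limit $\mathbf{w}_{\mathrm{MAP}} = \mathbf{m}_N = \mathbf{w}_{\mathrm{ML}}$.
In other words, with a prior that carries no information (an infinitely broad one), the parameter that maximizes the posterior coincides with the parameter that maximizes the likelihood function.
37 / 47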
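The limit (2.29) and (2.30) can be observed numerically by shrinking $\alpha$ and watching the posterior mean approach $\mathbf{w}_{\mathrm{ML}}$. The sketch assumes a zero-mean prior ($\mathbf{m}_0 = \mathbf{0}$), a straight-line model, and synthetic data. None of these choices come from the slide.

```python
import numpy as np

rng = np.random.default_rng(1)                       # assumed seed
x = rng.uniform(-1.0, 1.0, size=20)                  # assumed synthetic inputs
t = -0.3 + 0.5 * x + rng.normal(0.0, 0.2, size=20)   # assumed line plus Gaussian noise
Phi = np.column_stack([np.ones_like(x), x])          # phi(x) = (1, x)^T
beta = 25.0

w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)       # maximum likelihood solution (2.19)

def posterior_mean(alpha):
    """m_N from (2.27) for the prior N(w | 0, I / alpha), so that S0^{-1} m0 = 0."""
    SN_inv = alpha * np.eye(2) + beta * Phi.T @ Phi  # (2.28)
    return np.linalg.solve(SN_inv, beta * Phi.T @ t)

# Distance between the posterior mean and w_ML for progressively broader priors
gaps = [np.linalg.norm(posterior_mean(a) - w_ml) for a in (2.0, 1e-2, 1e-8)]
```

The entries of `gaps` decrease as $\alpha$ decreases. For a finite $\alpha$ the prior pulls the posterior mean toward zero, which is the same effect as a quadratic penalty on the weights.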
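2-3. Bayesian linear regression
In the previous section we explained that, in the Bayesian treatment, the likelihood function updates the prior distribution into the posterior distribution. We now look at how this update proceeds in an example.
As the setting, the mean of the likelihood function is $y(x, \mathbf{w}) = w_0 + w_1 x$.
For the training data, the inputs $x_n$ are drawn from the uniform distribution on $[-1, 1]$, and the corresponding targets $t_n$ are generated with zero-mean Gaussian noise $\epsilon$ of standard deviation 0.2 as
$$ t_n = f(x_n, a_0 = -0.3, a_1 = 0.5) + \epsilon \tag{2.31} $$
where
$$ f(x, a_0, a_1) = a_0 + a_1 x \tag{2.32} $$
The goal here is therefore to use the training data so that the parameters $w_0, w_1$ recover $a_0 = -0.3$, $a_1 = 0.5$.
38 / 47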
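2-3. Bayesian linear regression
The precision of the likelihood function is assumed known, $\beta = (1/0.2)^2 = 25$. For the prior we use the following isotropic Gaussian distribution with the parameter $\alpha$ set to $\alpha = 2.0$:
$$ p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}) \tag{2.33} $$
With this setting we look at how the posterior distribution is updated as the amount of training data grows.
39 / 47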
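The sequential update described in this example can be sketched as follows, using the stated values $a_0 = -0.3$, $a_1 = 0.5$, $\alpha = 2.0$, and $\beta = 25$. The random seed, the number of points (20), and the function name `update` are assumptions. Each step treats the current posterior as the prior for the next data point, and the final result is compared with the batch formulas (2.27) and (2.28).

```python
import numpy as np

rng = np.random.default_rng(0)                       # assumed seed
a0, a1, alpha, beta = -0.3, 0.5, 2.0, 25.0           # values from the slides
x = rng.uniform(-1.0, 1.0, size=20)                  # inputs from U(-1, 1)
t = a0 + a1 * x + rng.normal(0.0, 0.2, size=20)      # targets from (2.31)

def update(m, S, phi, t_n):
    """One Bayesian update: the current posterior N(w | m, S) acts as the prior."""
    S_inv = np.linalg.inv(S)
    S_new = np.linalg.inv(S_inv + beta * np.outer(phi, phi))
    m_new = S_new @ (S_inv @ m + beta * t_n * phi)
    return m_new, S_new

m, S = np.zeros(2), np.eye(2) / alpha                # prior (2.33)
for x_n, t_n in zip(x, t):
    m, S = update(m, S, np.array([1.0, x_n]), t_n)   # phi(x) = (1, x)^T

# Batch posterior from all 20 points at once, using (2.27)-(2.28)
Phi = np.column_stack([np.ones_like(x), x])
S_batch = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)
m_batch = S_batch @ (beta * Phi.T @ t)
```

Processing the points one at a time and processing them in a single batch yield the same posterior, because the precisions and the precision-weighted means simply add up.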
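2-3. Bayesian linear regression
First, the plots for the stage before any training data are observed.
The left plot is the prior distribution $p(\mathbf{w})$. The right plot draws the function $y(x, \mathbf{w}) = w_0 + w_1 x$ six times, each time with parameters $w_0, w_1$ sampled at random from the prior.
Naturally, because there are no training data, the six functions are scattered, and few of them are close to $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$.
40 / 47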
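2-3. Bayesian linear regression
Next, the plots after one training point has been observed.
The left plot shows the likelihood function $p(t \mid x, \mathbf{w})$ of this data point, plotted as a function of $\mathbf{w}$. The white cross marks the point $a_0 = -0.3$, $a_1 = 0.5$.
The middle plot is the posterior distribution, that is, the prior $p(\mathbf{w})$ multiplied by the likelihood function $p(t \mid x, \mathbf{w})$ and normalized. The right plot draws the function $y(x, \mathbf{w}) = w_0 + w_1 x$ six times, each time with parameters $w_0, w_1$ sampled at random from this posterior.
Because there are still few training data, the six functions are scattered and few are close to $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$. Note, however, that every line passes near the data point (blue circle).
41 / 47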
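2-3. Bayesian linear regression
Next, the plots after the second training point has been observed.
As before, the left plot is the likelihood function $p(t \mid x, \mathbf{w})$ of this data point.
The middle plot takes the posterior from the one-point case as the prior and multiplies it by the likelihood function. The right plot draws the function $y(x, \mathbf{w}) = w_0 + w_1 x$ six times, each time with parameters $w_0, w_1$ sampled at random from this posterior.
The posterior has its maximum near $a_0 = -0.3$, $a_1 = 0.5$, the six functions begin to gather around $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$, and every line passes near the two data points (blue circles).
42 / 47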
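2-3. Bayesian linear regression
Finally, the plots after 20 data points have been observed.
The left plot is the likelihood function $p(t \mid x, \mathbf{w})$ of the 20th data point.
The middle plot is the posterior distribution that incorporates all 20 data points. The right plot draws the function $y(x, \mathbf{w}) = w_0 + w_1 x$ six times, each time with parameters $w_0, w_1$ sampled at random from this posterior.
The posterior has a sharp peak near $a_0 = -0.3$, $a_1 = 0.5$, and the six functions are tightly gathered around $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$.
43 / 47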
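2-3. Bayesian linear regression
Next we use the posterior $p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta)$ and the likelihood function $p(t \mid \mathbf{x}, \mathbf{w}, \beta)$ to obtain the probability distribution of the predicted value $t$ for an unseen input vector $\mathbf{x}$. (The hyperparameters $\alpha, \beta$ have been restored as arguments of the posterior.)
From (1.31), the predictive distribution $p(t \mid \mathbf{t}, \alpha, \beta)$ is
$$ p(t \mid \mathbf{t}, \alpha, \beta) = \int p(t \mid \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta)\, \mathrm{d}\mathbf{w} \tag{2.34} $$
Using (2.6), (2.26), and (1.24), $p(t \mid \mathbf{t}, \alpha, \beta)$ becomes (see PRML Exercise 3.10)
$$ p(t \mid \mathbf{t}, \alpha, \beta) = \mathcal{N}(t \mid \mathbf{m}_N^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}), \sigma_N^2(\mathbf{x})) \tag{2.35} $$
where the variance $\sigma_N^2(\mathbf{x})$ of the predictive distribution is given by
$$ \sigma_N^2(\mathbf{x}) = \frac{1}{\beta} + \boldsymbol{\phi}(\mathbf{x})^{\mathrm T}\mathbf{S}_N\boldsymbol{\phi}(\mathbf{x}) \tag{2.36} $$
44 / 47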
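The predictive mean and variance in (2.35) and (2.36) can be computed as in the sketch below. It uses a straight-line model with three hand-picked data points, $\alpha = 2$, and $\beta = 25$. The data and the function name `predictive` are illustrative assumptions.

```python
import numpy as np

def predictive(phi_x, mN, SN, beta):
    """Mean and variance of the predictive distribution (2.35)-(2.36)."""
    mean = mN @ phi_x
    var = 1.0 / beta + phi_x @ SN @ phi_x
    return mean, var

# Hypothetical data lying exactly on the line -0.3 + 0.5 * x
x = np.array([-0.5, 0.0, 0.5])
t = np.array([-0.55, -0.30, -0.05])
Phi = np.column_stack([np.ones_like(x), x])           # phi(x) = (1, x)^T
alpha, beta = 2.0, 25.0

SN = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)   # (2.28) with S0 = I / alpha
mN = SN @ (beta * Phi.T @ t)                                  # (2.27) with m0 = 0

mean_near, var_near = predictive(np.array([1.0, 0.0]), mN, SN, beta)   # inside the data range
mean_far, var_far = predictive(np.array([1.0, 3.0]), mN, SN, beta)     # far from the data
```

The variance never drops below the noise floor $1/\beta$. In this example it is larger at $x = 3$ than at $x = 0$, because the uncertainty in the slope is multiplied by the distance from the data.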
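2-3. Bayesian linear regression
The first term of $\sigma_N^2(\mathbf{x})$, $1/\beta$, is the variance of the likelihood function and represents the scatter (noise) of the output for a given input.
The second term, $\boldsymbol{\phi}(\mathbf{x})^{\mathrm T}\mathbf{S}_N\boldsymbol{\phi}(\mathbf{x})$, comes from the uncertainty in $\mathbf{w}$ (the variance of the posterior). (This uncertainty is specific to Bayesian estimation, which does not make a point estimate of the parameters.)
The second term gets smaller when a new training point is added ($N \to N + 1$), that is, $\sigma_{N+1}^2(\mathbf{x}) \leq \sigma_N^2(\mathbf{x})$ (see PRML Exercise 3.11).
This means that the certainty of the predicted output increases as the training data grow.
Finally, we use an example to see how the uncertainty of the prediction decreases as the training data grow.
45 / 47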
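The inequality $\sigma_{N+1}^2(\mathbf{x}) \leq \sigma_N^2(\mathbf{x})$ can be made explicit with a short worked step. The following is a sketch of one standard route, using the Sherman-Morrison rank-one update formula, which the slides do not spell out. The symbol $\boldsymbol{\phi}_{N+1} = \boldsymbol{\phi}(\mathbf{x}_{N+1})$ is shorthand introduced here.

```latex
% Adding one data point changes the posterior precision (2.28) by a rank-one term:
\mathbf{S}_{N+1}^{-1} = \mathbf{S}_N^{-1} + \beta\, \boldsymbol{\phi}_{N+1} \boldsymbol{\phi}_{N+1}^{\mathrm T}

% Sherman-Morrison formula for the inverse of a rank-one update:
\mathbf{S}_{N+1} = \mathbf{S}_N
  - \frac{\beta\, \mathbf{S}_N \boldsymbol{\phi}_{N+1} \boldsymbol{\phi}_{N+1}^{\mathrm T} \mathbf{S}_N}
         {1 + \beta\, \boldsymbol{\phi}_{N+1}^{\mathrm T} \mathbf{S}_N \boldsymbol{\phi}_{N+1}}

% Substituting into (2.36):
\sigma_{N+1}^2(\mathbf{x}) = \sigma_N^2(\mathbf{x})
  - \frac{\beta \left( \boldsymbol{\phi}(\mathbf{x})^{\mathrm T} \mathbf{S}_N \boldsymbol{\phi}_{N+1} \right)^2}
         {1 + \beta\, \boldsymbol{\phi}_{N+1}^{\mathrm T} \mathbf{S}_N \boldsymbol{\phi}_{N+1}}
  \;\leq\; \sigma_N^2(\mathbf{x})
```

The subtracted fraction is non-negative because its numerator is a square and its denominator is positive ($\mathbf{S}_N$ is positive definite and $\beta > 0$), so the predictive variance cannot increase when a data point is added.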
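2-3. Bayesian linear regression
The example is the trigonometric function used in the simple regression problem.
As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \dots, x_N)^{\mathrm T}$ and the $N$ corresponding target variables $\mathbf{t} = (t_1, t_2, \dots, t_N)^{\mathrm T}$.
The targets $t_n$ are obtained by adding Gaussian random noise $\epsilon$ to $\sin(2\pi x_n)$:
$$ t_n = \sin(2\pi x_n) + \epsilon \tag{2.37} $$
The mean of the likelihood function, $y(x, \mathbf{w})$, is expanded in the Gaussian basis functions (2.4).
With this setting, the plots for $N = 1, 2, 4, 25$ training points are shown on the next slide.
46 / 47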
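2-3. Bayesian linear regression
The blue circles are the training data, the yellow-green line is the true sine function, the red line is the mean of the predictive distribution $\mathbf{m}_N^{\mathrm T}\boldsymbol{\phi}(x)$, and the light red region is the band of $\pm\sigma_N(x)$ around that mean.
The more training data there are, the closer the red line gets to the yellow-green line and the narrower the light red region becomes.
47 / 47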