PRML Chapter 6
gucchi
January 21, 2019
Transcript
PRML Chapter 6: Kernel Methods
2019/01/21
1 / 34
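Chapters 3 and 4 dealt with linear parametric models for regression and classification. For example, Chapter 3 considered a parametric model whose output $y(\mathbf{x}, \mathbf{w})$ is

$$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}) \tag{3.3}$$

Here $\mathbf{x} = (x_1, x_2, \dots, x_D)^{\mathsf T}$ is a $D$-dimensional input vector, $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \dots, \phi_{M-1}(\mathbf{x}))^{\mathsf T}$ is a vector function that maps the input $\mathbf{x}$ into an $M$-dimensional feature space, and $\mathbf{w} = (w_0, w_1, \dots, w_{M-1})^{\mathsf T}$ is an $M$-dimensional parameter vector.

In Chapter 3 we took a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and the corresponding set of targets $\{t_1, t_2, \dots, t_N\}$, and used least squares to choose $\mathbf{w}$ so that the output $y(\mathbf{x}_n, \mathbf{w})$ for input $\mathbf{x}_n$ reproduces $t_n$.
2 / 34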
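Roughly speaking, in Chapters 3 and 4 the starting point for building a model was to fix the form of the vector function $\boldsymbol{\phi}(\mathbf{x})$ (for example, Gaussian basis functions). (In the neural networks of Chapter 5, $\boldsymbol{\phi}(\mathbf{x})$ itself was made to depend on learned parameters.)

For many linear parametric models, rewriting the model in its dual representation shows that the learned parameters $\mathbf{w}_{\mathrm{ML}}$, and the output $y(\mathbf{x}, \mathbf{w}_{\mathrm{ML}})$ computed from them, depend on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel function

$$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}') \tag{6.1}$$

(Explained in detail in Section 6.1.)

In addition, by treating linear parametric models for regression and classification probabilistically, we will see that these models are instances of Gaussian processes. (Explained in detail in Section 6.4.)
3 / 34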
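6.1 Dual representation

Consider a parametric model whose output $y(\mathbf{x}, \mathbf{w})$ is

$$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x})$$

and consider minimizing the following regularized sum-of-squares error:

$$J(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\left\{\mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}_n) - t_n\right\}^2 + \frac{\lambda}{2}\mathbf{w}^{\mathsf T}\mathbf{w} \tag{6.2}$$

Here $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ is the set of inputs and $\{t_1, t_2, \dots, t_N\}$ is the set of targets.
4 / 34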
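6.1 Dual representation

The stationarity condition $\partial J(\mathbf{w})/\partial \mathbf{w} = \mathbf{0}$ can be rearranged as follows (see Section 3.1.1):

$$\mathbf{w} = \sum_{n=1}^{N} a_n \boldsymbol{\phi}(\mathbf{x}_n) = \boldsymbol{\Phi}^{\mathsf T}\mathbf{a} \tag{6.3}$$

where

$$a_n = -\frac{1}{\lambda}\left\{\mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}_n) - t_n\right\} \tag{6.4}$$

with $\mathbf{a} = (a_1, \dots, a_N)^{\mathsf T}$, and $\boldsymbol{\Phi} = (\boldsymbol{\phi}(\mathbf{x}_1), \dots, \boldsymbol{\phi}(\mathbf{x}_N))^{\mathsf T}$ is the design matrix (3.16).

Using $\mathbf{w} = \boldsymbol{\Phi}^{\mathsf T}\mathbf{a}$ to rewrite $J(\mathbf{w})$ as a function of the parameter vector $\mathbf{a}$ gives

$$J(\mathbf{a}) = \frac{1}{2}\mathbf{a}^{\mathsf T}\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\mathbf{a} - \mathbf{a}^{\mathsf T}\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\mathbf{t} + \frac{1}{2}\mathbf{t}^{\mathsf T}\mathbf{t} + \frac{\lambda}{2}\mathbf{a}^{\mathsf T}\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\mathbf{a} \tag{6.5}$$

where $\mathbf{t} = (t_1, \dots, t_N)^{\mathsf T}$.
5 / 34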
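The two steps on this slide can be spelled out as a short worked derivation. This is only a sketch of the intermediate algebra, using the symbols already defined in (6.2) to (6.5); nothing new is assumed.

```latex
\begin{align*}
\frac{\partial J}{\partial \mathbf{w}}
  &= \sum_{n=1}^{N}\left\{\mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}_n) - t_n\right\}\boldsymbol{\phi}(\mathbf{x}_n) + \lambda\mathbf{w} = \mathbf{0} \\
\Rightarrow\quad \mathbf{w}
  &= -\frac{1}{\lambda}\sum_{n=1}^{N}\left\{\mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}_n) - t_n\right\}\boldsymbol{\phi}(\mathbf{x}_n)
   = \sum_{n=1}^{N} a_n\boldsymbol{\phi}(\mathbf{x}_n)
   = \boldsymbol{\Phi}^{\mathsf T}\mathbf{a} \\[4pt]
J(\mathbf{a})
  &= \frac{1}{2}\left\|\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\mathbf{a} - \mathbf{t}\right\|^2
   + \frac{\lambda}{2}\mathbf{a}^{\mathsf T}\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\mathbf{a}
   \qquad\text{(substituting } \boldsymbol{\Phi}\mathbf{w} = \boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}\mathbf{a}\text{ into (6.2))}
\end{align*}
```

Expanding the squared norm in the last line reproduces the four terms of (6.5).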
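6.1 Dual representation

We now define the Gram matrix $\mathbf{K} = \boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}$. Its elements $K_{nm}$ can be written in terms of the kernel:

$$K_{nm} = \boldsymbol{\phi}(\mathbf{x}_n)^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}_m) = k(\mathbf{x}_n, \mathbf{x}_m) \tag{6.6}$$

Using the Gram matrix $\mathbf{K}$, $J(\mathbf{a})$ in (6.5) becomes

$$J(\mathbf{a}) = \frac{1}{2}\mathbf{a}^{\mathsf T}\mathbf{K}\mathbf{K}\mathbf{a} - \mathbf{a}^{\mathsf T}\mathbf{K}\mathbf{t} + \frac{1}{2}\mathbf{t}^{\mathsf T}\mathbf{t} + \frac{\lambda}{2}\mathbf{a}^{\mathsf T}\mathbf{K}\mathbf{a} \tag{6.7}$$

The least-squares algorithm can thus be expressed in terms of the parameters $\mathbf{a}$ instead of $\mathbf{w}$; this formulation is called the dual representation.

In the dual representation, $J(\mathbf{a})$ depends on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel (6.6). (There is no dependence on the raw $\boldsymbol{\phi}(\mathbf{x})$.)
6 / 34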
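6.1 Dual representation

The $\mathbf{a}$ that minimizes $J(\mathbf{a})$ is given below. (One can either set the gradient of $J(\mathbf{a})$ with respect to $\mathbf{a}$ to zero or, as the book does, combine the stationarity condition (6.3) for $\mathbf{w}$ with the relation (6.4) between $\mathbf{a}$ and $\mathbf{w}$.)

$$\mathbf{a} = (\mathbf{K} + \lambda\mathbf{I}_N)^{-1}\mathbf{t} \tag{6.8}$$

Here $\mathbf{I}_N$ is the $N \times N$ identity matrix.

Combining this solution with $\mathbf{w} = \boldsymbol{\Phi}^{\mathsf T}\mathbf{a}$ and $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x})$, the prediction $y(\mathbf{x})$ for a new input $\mathbf{x}$ is

$$y(\mathbf{x}) = \mathbf{a}^{\mathsf T}\boldsymbol{\Phi}\boldsymbol{\phi}(\mathbf{x}) = \mathbf{k}(\mathbf{x})^{\mathsf T}(\mathbf{K} + \lambda\mathbf{I}_N)^{-1}\mathbf{t} \tag{6.9}$$

where $\mathbf{k}(\mathbf{x}) = (k(\mathbf{x}_1, \mathbf{x}), k(\mathbf{x}_2, \mathbf{x}), \dots, k(\mathbf{x}_N, \mathbf{x}))^{\mathsf T}$.

The prediction $y(\mathbf{x})$ is therefore expressed entirely in terms of the kernel function.
7 / 34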
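As a numerical illustration of (6.8) and (6.9), the following minimal sketch compares the dual prediction with the primal solution (3.28) on toy data. The Gaussian basis functions, their width `s`, the toy sine data, and the value of `lam` are assumptions chosen for this example and do not come from the slides.

```python
import numpy as np

def design_matrix(x, centers, s=0.3):
    """Hypothetical Gaussian basis: phi_j(x) = exp(-(x - mu_j)^2 / (2 s^2))."""
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * s ** 2))

rng = np.random.default_rng(0)
N, M, lam = 20, 6, 0.1
x = np.linspace(0.0, 1.0, N)
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(N)
centers = np.linspace(0.0, 1.0, M)
Phi = design_matrix(x, centers)                      # N x M design matrix

# Dual representation: a = (K + lam I_N)^{-1} t, eq. (6.8)
K = Phi @ Phi.T                                      # Gram matrix, N x N
a = np.linalg.solve(K + lam * np.eye(N), t)

# Primal representation: w = (lam I_M + Phi^T Phi)^{-1} Phi^T t, eq. (3.28)
w = np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Predictions at new inputs, eq. (6.9) versus y = w^T phi(x)
x_new = np.array([0.25, 0.8])
Phi_new = design_matrix(x_new, centers)              # 2 x M
k_new = Phi @ Phi_new.T                              # column i holds k(x_new[i])
y_dual = k_new.T @ a
y_primal = Phi_new @ w
print(y_dual, y_primal)
```

The dual solve involves an N x N system and the primal solve an M x M system, which is the cost trade-off discussed on the next slide.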
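6.1 Dual representation

To obtain the solution $\mathbf{a}$ in the dual representation, (6.8) requires inverting an $N \times N$ matrix. ($N$ is the number of training points.)

In the primal representation (the formulation in terms of $\mathbf{w}$ used so far), the solution is

$$\mathbf{w} = \left(\lambda\mathbf{I}_M + \boldsymbol{\Phi}^{\mathsf T}\boldsymbol{\Phi}\right)^{-1}\boldsymbol{\Phi}^{\mathsf T}\mathbf{t} \tag{3.28}$$

which requires inverting an $M \times M$ matrix. ($M$ is the dimension of the feature space.)

When $N \gg M$ (the most common case), solving in the primal representation is cheaper.

On the other hand, the dual representation can also handle feature spaces with infinite $M$. (Section 6.2 gives an example of such a feature space.)
8 / 34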
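6.2 Constructing kernels

This section uses the definition of a kernel function to introduce a variety of kernels.

A function $k(\mathbf{x}, \mathbf{x}')$ is a kernel function if there exists a mapping $\boldsymbol{\phi}$ from the input $\mathbf{x}$ into some $M$-dimensional feature space such that

$$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}') \tag{6.1}$$

A simple example of a kernel function is

$$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\mathsf T}\mathbf{z})^2 \tag{6.11}$$
9 / 34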
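6.2 Constructing kernels

For example, take $\mathbf{x} = (x_1, x_2)^{\mathsf T}$ and the mapping $\boldsymbol{\phi}(\mathbf{x}) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)^{\mathsf T}$. Then

$$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\mathsf T}\mathbf{z})^2 = \boldsymbol{\phi}(\mathbf{x})^{\mathsf T}\boldsymbol{\phi}(\mathbf{z}) \tag{6.12}$$

so $k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\mathsf T}\mathbf{z})^2$ is a kernel function.

In fact there is a second definition of a kernel function besides (6.1): the Gram matrix $\mathbf{K}$ with elements $K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m)$ must be positive semidefinite for every choice of the set $\{\mathbf{x}_n\}$.

(I proved that these two definitions are equivalent in the article below. Please take a look, and give it a like, if you are interested.)
https://qiita.com/gucchi0403/items/544065345f91144524c4
10 / 34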
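Both characterizations of a kernel can be checked numerically for the example (6.11). The sketch below is only an illustration on random 2-D points (the data and seed are arbitrary assumptions): it compares $(\mathbf{x}^{\mathsf T}\mathbf{z})^2$ with the explicit feature-space inner product and inspects the smallest eigenvalue of the Gram matrix.

```python
import numpy as np

def phi(x):
    """Explicit feature map for 2-D inputs: (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def k(x, z):
    """Kernel (6.11): squared inner product."""
    return float(x @ z) ** 2

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 2))                      # 8 arbitrary 2-D points

K = np.array([[k(xn, xm) for xm in X] for xn in X])
K_feat = np.array([[phi(xn) @ phi(xm) for xm in X] for xn in X])

# Smallest eigenvalue of the symmetric Gram matrix (zero up to rounding,
# because the feature space is only 3-dimensional while N = 8)
min_eig = np.linalg.eigvalsh(K).min()
print(np.allclose(K, K_feat), min_eig)
```

A finite check like this does not prove positive semidefiniteness in general, but it is a cheap sanity test when experimenting with a candidate kernel.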
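6.2 Constructing kernels

Next, we show how to build a new kernel function $k(\mathbf{x}, \mathbf{x}')$ from functions that are already known to be kernels.

Here $k_1(\cdot,\cdot)$ and $k_2(\cdot,\cdot)$ are kernel functions, $c > 0$ is a constant, $f(\cdot)$ is an arbitrary function, $q(\cdot)$ is a polynomial with non-negative coefficients, $\boldsymbol{\phi}(\cdot)$ is an $M$-dimensional vector function, $k_3(\cdot,\cdot)$ is a kernel function defined on the $M$-dimensional vector space, $\mathbf{A}$ is a symmetric positive semidefinite matrix, $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)$, and $k_a(\cdot,\cdot)$, $k_b(\cdot,\cdot)$ are kernel functions.
11 / 34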
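The list of construction rules itself does not appear in the transcript; it was presumably an image on the slide. For reference, the rules as given in PRML, equations (6.13) to (6.22), written with the symbols defined on this slide, are:

```latex
\begin{align}
k(\mathbf{x},\mathbf{x}') &= c\,k_1(\mathbf{x},\mathbf{x}') \tag{6.13}\\
k(\mathbf{x},\mathbf{x}') &= f(\mathbf{x})\,k_1(\mathbf{x},\mathbf{x}')\,f(\mathbf{x}') \tag{6.14}\\
k(\mathbf{x},\mathbf{x}') &= q\bigl(k_1(\mathbf{x},\mathbf{x}')\bigr) \tag{6.15}\\
k(\mathbf{x},\mathbf{x}') &= \exp\bigl(k_1(\mathbf{x},\mathbf{x}')\bigr) \tag{6.16}\\
k(\mathbf{x},\mathbf{x}') &= k_1(\mathbf{x},\mathbf{x}') + k_2(\mathbf{x},\mathbf{x}') \tag{6.17}\\
k(\mathbf{x},\mathbf{x}') &= k_1(\mathbf{x},\mathbf{x}')\,k_2(\mathbf{x},\mathbf{x}') \tag{6.18}\\
k(\mathbf{x},\mathbf{x}') &= k_3\bigl(\boldsymbol{\phi}(\mathbf{x}),\boldsymbol{\phi}(\mathbf{x}')\bigr) \tag{6.19}\\
k(\mathbf{x},\mathbf{x}') &= \mathbf{x}^{\mathsf T}\mathbf{A}\,\mathbf{x}' \tag{6.20}\\
k(\mathbf{x},\mathbf{x}') &= k_a(\mathbf{x}_a,\mathbf{x}_a') + k_b(\mathbf{x}_b,\mathbf{x}_b') \tag{6.21}\\
k(\mathbf{x},\mathbf{x}') &= k_a(\mathbf{x}_a,\mathbf{x}_a')\,k_b(\mathbf{x}_b,\mathbf{x}_b') \tag{6.22}
\end{align}
```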
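6.2 Constructing kernels

Using these construction rules we can see, for example, that the following function is a kernel:

$$k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^{\mathsf T}\mathbf{x}' + c)^M$$

where $c \ge 0$ is a constant and $M$ is an arbitrary natural number.

We can also construct the very important kernel function known as the Gaussian kernel:

$$k(\mathbf{x}, \mathbf{x}') = \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\sigma^2}\right) \tag{6.23}$$

where $\sigma^2$ is an arbitrary positive constant.

Incidentally, the feature vector corresponding to the Gaussian kernel is infinite-dimensional. (→ Exercise 6.11)
12 / 34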
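A brief worked derivation shows why the Gaussian kernel is valid. This follows the standard argument in PRML and uses only the fact that the linear kernel $\mathbf{x}^{\mathsf T}\mathbf{x}'$ is valid together with three closure properties: scaling by a positive constant, exponentiation of a kernel, and multiplication by $f(\mathbf{x})f(\mathbf{x}')$.

```latex
\begin{align*}
\|\mathbf{x}-\mathbf{x}'\|^2
  &= \mathbf{x}^{\mathsf T}\mathbf{x} + (\mathbf{x}')^{\mathsf T}\mathbf{x}' - 2\,\mathbf{x}^{\mathsf T}\mathbf{x}' \\
\exp\!\left(-\frac{\|\mathbf{x}-\mathbf{x}'\|^2}{2\sigma^2}\right)
  &= \underbrace{\exp\!\left(-\frac{\mathbf{x}^{\mathsf T}\mathbf{x}}{2\sigma^2}\right)}_{f(\mathbf{x})}\;
     \underbrace{\exp\!\left(\frac{\mathbf{x}^{\mathsf T}\mathbf{x}'}{\sigma^2}\right)}_{\exp(\text{kernel})}\;
     \underbrace{\exp\!\left(-\frac{(\mathbf{x}')^{\mathsf T}\mathbf{x}'}{2\sigma^2}\right)}_{f(\mathbf{x}')}
\end{align*}
```

The middle factor is the exponential of the valid kernel $\mathbf{x}^{\mathsf T}\mathbf{x}'/\sigma^2$, and the outer factors have the form $f(\mathbf{x})\,(\cdot)\,f(\mathbf{x}')$, so the product is again a kernel.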
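6.3 RBF networks

This section describes radial basis functions (RBFs), a widely used class of basis functions. An RBF is a basis function that depends only on the distance from a center $\boldsymbol{\mu}_j$, so it has the form $\phi_j(\mathbf{x}) = h(\|\mathbf{x} - \boldsymbol{\mu}_j\|)$.

RBFs arise when the input variables are noisy. If the noise $\boldsymbol{\xi}$ has distribution $\nu(\boldsymbol{\xi})$, the sum-of-squares error becomes

$$E = \frac{1}{2}\sum_{n=1}^{N}\int \left\{y(\mathbf{x}_n + \boldsymbol{\xi}) - t_n\right\}^2 \nu(\boldsymbol{\xi})\, d\boldsymbol{\xi} \tag{6.39}$$
13 / 34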
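6.3 RBF networks

Using the calculus of variations, the $y(\mathbf{x})$ that minimizes this sum-of-squares error is found to be

$$y(\mathbf{x}) = \sum_{n=1}^{N} t_n\, h(\mathbf{x} - \mathbf{x}_n) \tag{6.40}$$

where $h(\mathbf{x} - \mathbf{x}_n)$ is given by

$$h(\mathbf{x} - \mathbf{x}_n) = \frac{\nu(\mathbf{x} - \mathbf{x}_n)}{\sum_{m=1}^{N}\nu(\mathbf{x} - \mathbf{x}_m)} \tag{6.41}$$

A model of this form is called the Nadaraya-Watson model.

If the noise distribution is isotropic, that is, a function of $\|\boldsymbol{\xi}\|$ only, then the basis functions in (6.41) are also isotropic, that is, of the form $h(\|\mathbf{x} - \mathbf{x}_n\|)$, and are therefore RBFs.

(The Nadaraya-Watson model of Section 6.3.1 is not used in later chapters, so it is skipped here.)
14 / 34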
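To make (6.40) and (6.41) concrete, here is a minimal sketch for 1-D inputs that assumes the noise distribution $\nu$ is an isotropic Gaussian with standard deviation `sigma`. That choice, the function name `nadaraya_watson`, and the toy data are assumptions for illustration only.

```python
import numpy as np

def nadaraya_watson(x_query, x_train, t_train, sigma=0.1):
    """Predict y(x) = sum_n t_n h(x - x_n) with normalized Gaussian weights."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    nu = np.exp(-d2 / (2.0 * sigma ** 2))        # nu(x - x_n); constants cancel in h
    h = nu / nu.sum(axis=1, keepdims=True)       # eq. (6.41): each row sums to one
    return h @ t_train, h                        # eq. (6.40)

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 15)
t_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(15)
x_query = np.linspace(0.0, 1.0, 5)

y, h = nadaraya_watson(x_query, x_train, t_train)
print(y)
```

Because the weights are non-negative and sum to one, each prediction is a weighted average of the training targets and therefore stays within their range. Queries very far from all training points would make every weight underflow to zero, which this sketch does not guard against.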
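6.4 Gaussian processes

In Section 6.1 we looked at a non-probabilistic model of linear regression (one that uses the output $y(\mathbf{x}, \mathbf{w})$ directly as the prediction) and saw that kernels appear when the model is rewritten in its dual representation.

We now turn to a probabilistic model of linear regression (one that derives a probability distribution over the output $y(\mathbf{x}, \mathbf{w})$) and confirm that kernels arise naturally here as well.
15 / 34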
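6.4.1 Linear regression revisited

As in Section 6.1, we start from the output for a given input $\mathbf{x}$:

$$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}) \tag{6.49}$$

Next, we assume a prior distribution over the parameter vector $\mathbf{w}$:

$$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}) \tag{6.50}$$

For any given $\mathbf{w}$, (6.49) defines a particular function $y(\mathbf{x})$ of $\mathbf{x}$. In other words, a probability distribution over $\mathbf{w}$ induces a probability distribution over functions $y(\mathbf{x})$.

In practice, given a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, the joint distribution $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))$ of the function values follows from the distribution over $\mathbf{w}$ and (6.49). (More precisely, with $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ we consider $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N) \mid \mathbf{X})$.)
16 / 34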
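6.4.1 Linear regression revisited

Defining $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))^{\mathsf T}$, (6.49) gives

$$\mathbf{y} = \boldsymbol{\Phi}\mathbf{w} \tag{6.51}$$

($\boldsymbol{\Phi}$ is the design matrix.)

Since $\mathbf{y}$ is a linear transformation of $\mathbf{w}$, which follows the Gaussian distribution (6.50), $\mathbf{y}$ is also Gaussian. Its distribution is therefore fully determined by its mean and covariance:

$$\mathbb{E}[\mathbf{y}] = \boldsymbol{\Phi}\,\mathbb{E}[\mathbf{w}] = \mathbf{0} \tag{6.52}$$

$$\operatorname{cov}[\mathbf{y}] = \mathbb{E}[\mathbf{y}\mathbf{y}^{\mathsf T}] = \boldsymbol{\Phi}\,\mathbb{E}[\mathbf{w}\mathbf{w}^{\mathsf T}]\,\boldsymbol{\Phi}^{\mathsf T} = \frac{1}{\alpha}\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T} = \mathbf{K} \tag{6.53}$$

Here $\mathbf{K}$ is the Gram matrix whose elements are given by the kernel function:

$$K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) = \frac{1}{\alpha}\boldsymbol{\phi}(\mathbf{x}_n)^{\mathsf T}\boldsymbol{\phi}(\mathbf{x}_m) \tag{6.54}$$
17 / 34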
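The result (6.53) can be checked by simulation. The sketch below draws many weight vectors from the prior (6.50), forms $\mathbf{y} = \boldsymbol{\Phi}\mathbf{w}$, and compares the empirical covariance with $\boldsymbol{\Phi}\boldsymbol{\Phi}^{\mathsf T}/\alpha$. The polynomial basis, the value of `alpha`, and the sample size are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 2.0
x = np.linspace(-1.0, 1.0, 5)

# Hypothetical basis functions (1, x, x^2), giving an N x M design matrix
Phi = np.stack([np.ones_like(x), x, x ** 2], axis=1)

# Analytic covariance of y, eq. (6.53)
K = Phi @ Phi.T / alpha

# Monte Carlo estimate: each row of W is one draw w ~ N(0, I / alpha)
n_samples = 200_000
W = rng.standard_normal((n_samples, Phi.shape[1])) / np.sqrt(alpha)
Y = W @ Phi.T                                    # each row is y = Phi w
K_emp = Y.T @ Y / n_samples                      # empirical E[y y^T]
mean_emp = Y.mean(axis=0)                        # empirical E[y], should be near 0

print(np.abs(K_emp - K).max(), np.abs(mean_emp).max())
```

The agreement is only up to sampling error, which shrinks as the number of draws grows.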
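6.4.1 Linear regression revisited

The linear regression model described above is one example of a Gaussian process.

A Gaussian process is a model that assumes that, for any given set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, the joint distribution of the function values $p(\mathbf{y}) = p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))$ is Gaussian.

The mean is usually assumed to be zero, and the covariance is specified by a kernel:

$$\mathbb{E}[y(\mathbf{x}_n)\, y(\mathbf{x}_m)] = k(\mathbf{x}_n, \mathbf{x}_m) \tag{6.55}$$

Comparing with (6.52) to (6.54), we see that the linear regression model above is indeed an example of a Gaussian process.
18 / 34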
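6.4.2 Gaussian processes for regression

We now apply Gaussian processes to regression.

Assume that the target variable $t_n$ follows a Gaussian distribution whose mean is the function value $y_n = y(\mathbf{x}_n)$:

$$p(t_n \mid y_n) = \mathcal{N}(t_n \mid y_n, \beta^{-1}) \tag{6.58}$$

where $\beta$ is a hyperparameter giving the precision of the noise.

Because the noise is independent for each data point, the distribution of $\mathbf{t} = (t_1, \dots, t_N)^{\mathsf T}$ given $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))^{\mathsf T}$ is

$$p(\mathbf{t} \mid \mathbf{y}) = \mathcal{N}(\mathbf{t} \mid \mathbf{y}, \beta^{-1}\mathbf{I}_N) \tag{6.59}$$

By the Gaussian process assumption, the marginal distribution $p(\mathbf{y})$ is a Gaussian with zero mean and covariance given by the Gram matrix $\mathbf{K}$:

$$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{0}, \mathbf{K}) \tag{6.60}$$
19 / 34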
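6.4.2 Gaussian processes for regression

Using $p(\mathbf{t} \mid \mathbf{y})$ from (6.59) and $p(\mathbf{y})$ from (6.60), the distribution $p(\mathbf{t})$ of the targets given $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ can be computed as

$$p(\mathbf{t}) = \int p(\mathbf{t} \mid \mathbf{y})\, p(\mathbf{y})\, d\mathbf{y} = \mathcal{N}(\mathbf{t} \mid \mathbf{0}, \mathbf{C}) \tag{6.61}$$

where the elements $C_{nm}$ of the covariance matrix $\mathbf{C}$ are

$$C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \beta^{-1}\delta_{nm} \tag{6.62}$$

(This uses equations (2.113) to (2.115).)

A kernel function that is often used in the covariance $\mathbf{C}$ is

$$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{-\frac{\theta_1}{2}\|\mathbf{x}_n - \mathbf{x}_m\|^2\right\} + \theta_2 + \theta_3\,\mathbf{x}_n^{\mathsf T}\mathbf{x}_m \tag{6.63}$$

where $\theta_0, \dots, \theta_3$ are hyperparameters.
20 / 34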
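6.4.2 Gaussian processes for regression

What we want is the distribution of the target $t_{N+1}$ for a new input $\mathbf{x}_{N+1}$, given the training data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and $\{t_1, t_2, \dots, t_N\}$. That is, with $\mathbf{t}_N = (t_1, \dots, t_N)^{\mathsf T}$, we want $p(t_{N+1} \mid \mathbf{t}_N)$. (The dependence on the input variables is omitted from the notation.)

To obtain $p(t_{N+1} \mid \mathbf{t}_N)$, we first find the joint distribution $p(\mathbf{t}_{N+1})$, where $\mathbf{t}_{N+1} = (t_1, \dots, t_{N+1})^{\mathsf T}$.

Using the result (6.61),

$$p(\mathbf{t}_{N+1}) = \mathcal{N}(\mathbf{t}_{N+1} \mid \mathbf{0}, \mathbf{C}_{N+1}) \tag{6.64}$$
21 / 34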
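6.4.2 Gaussian processes for regression

The covariance matrix $\mathbf{C}_{N+1}$ is

$$\mathbf{C}_{N+1} = \begin{pmatrix} \mathbf{C}_N & \mathbf{k} \\ \mathbf{k}^{\mathsf T} & c \end{pmatrix} \tag{6.65}$$

Here $\mathbf{C}_N$ is the $N \times N$ matrix with elements given by (6.62), $\mathbf{k} = (k(\mathbf{x}_1, \mathbf{x}_{N+1}), k(\mathbf{x}_2, \mathbf{x}_{N+1}), \dots, k(\mathbf{x}_N, \mathbf{x}_{N+1}))^{\mathsf T}$ is a vector, and $c = k(\mathbf{x}_{N+1}, \mathbf{x}_{N+1}) + \beta^{-1}$.

Combining this with (2.81) and (2.82), $p(t_{N+1} \mid \mathbf{t}_N)$ is Gaussian with mean $m(\mathbf{x}_{N+1})$ and variance $\sigma^2(\mathbf{x}_{N+1})$ given by

$$m(\mathbf{x}_{N+1}) = \mathbf{k}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{t}_N \tag{6.66}$$

$$\sigma^2(\mathbf{x}_{N+1}) = c - \mathbf{k}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{k} \tag{6.67}$$

In other words, the distribution of the target $t_{N+1}$ for a new input $\mathbf{x}_{N+1}$ is a Gaussian whose mean and variance both depend on $\mathbf{x}_{N+1}$.
22 / 34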
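The predictive equations (6.66) and (6.67) translate almost directly into code. The following is a minimal sketch for 1-D inputs using the kernel (6.63); the hyperparameter values, the noise precision `beta`, the toy sine data, and the function names are all assumptions made for this example.

```python
import numpy as np

def kernel(xa, xb, theta=(1.0, 4.0, 0.0, 0.0)):
    """Kernel (6.63) for 1-D inputs; returns a len(xa) x len(xb) matrix."""
    t0, t1, t2, t3 = theta
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return t0 * np.exp(-0.5 * t1 * d2) + t2 + t3 * xa[:, None] * xb[None, :]

def gp_predict(x_train, t_train, x_new, beta=25.0, theta=(1.0, 4.0, 0.0, 0.0)):
    """Predictive mean (6.66) and variance (6.67) at each point of x_new."""
    C_N = kernel(x_train, x_train, theta) + np.eye(len(x_train)) / beta  # (6.62)
    k = kernel(x_train, x_new, theta)                 # N x Q, one column per query
    c = np.diag(kernel(x_new, x_new, theta)) + 1.0 / beta
    mean = k.T @ np.linalg.solve(C_N, t_train)
    var = c - np.sum(k * np.linalg.solve(C_N, k), axis=0)
    return mean, var

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(10)

# One query inside the training range, one far away from all the data
x_new = np.array([0.5, 10.0])
mean, var = gp_predict(x_train, t_train, x_new)
print(mean, var)
```

Far from the data the vector $\mathbf{k}$ vanishes, so the prediction falls back to the prior: zero mean and variance $c$. Near the data the variance shrinks but never drops below the noise level $\beta^{-1}$. Solving linear systems with `np.linalg.solve` avoids forming $\mathbf{C}_N^{-1}$ explicitly.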
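6.4.3 Learning the hyperparameters

Kernel methods require a kernel function to be chosen. Rather than fixing the kernel from scratch, it is sometimes easier to parameterize it, as in (6.63) below, and to determine the hyperparameters from the training data.

$$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{-\frac{\theta_1}{2}\|\mathbf{x}_n - \mathbf{x}_m\|^2\right\} + \theta_2 + \theta_3\,\mathbf{x}_n^{\mathsf T}\mathbf{x}_m \tag{6.63}$$

To determine the hyperparameters $\boldsymbol{\theta}$, we maximize the logarithm $\ln p(\mathbf{t} \mid \boldsymbol{\theta})$ of the distribution $p(\mathbf{t} \mid \boldsymbol{\theta})$ from (6.61), which is given by

$$\ln p(\mathbf{t} \mid \boldsymbol{\theta}) = -\frac{1}{2}\ln|\mathbf{C}_N| - \frac{1}{2}\mathbf{t}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{t} - \frac{N}{2}\ln(2\pi) \tag{6.69}$$
23 / 34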
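A minimal sketch of evaluating (6.69) is shown below, followed by a crude grid search over $\theta_1$ with the other hyperparameters held fixed. The grid, the fixed values, the noise precision `beta`, and the toy data are assumptions for illustration; in practice one would typically use gradient-based optimization instead.

```python
import numpy as np

def log_marginal_likelihood(x, t, theta, beta):
    """ln p(t | theta), eq. (6.69), for 1-D inputs with kernel (6.63)."""
    t0, t1, t2, t3 = theta
    d2 = (x[:, None] - x[None, :]) ** 2
    C = t0 * np.exp(-0.5 * t1 * d2) + t2 + t3 * np.outer(x, x)
    C = C + np.eye(len(x)) / beta                    # C_N from (6.62)
    L = np.linalg.cholesky(C)                        # C = L L^T
    C_inv_t = np.linalg.solve(L.T, np.linalg.solve(L, t))
    log_det = 2.0 * np.sum(np.log(np.diag(L)))       # ln |C_N|
    return -0.5 * log_det - 0.5 * t @ C_inv_t - 0.5 * len(x) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 15)
t = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(15)

# Hypothetical grid over theta_1 (inverse squared length scale)
grid = [0.1, 1.0, 10.0, 100.0, 1000.0]
scores = [log_marginal_likelihood(x, t, (1.0, th1, 0.0, 0.0), beta=25.0)
          for th1 in grid]
best_theta1 = grid[int(np.argmax(scores))]
print(best_theta1, scores)
```

The Cholesky factorization gives both the log-determinant and the quadratic form in a numerically stable way.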
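6.4.4 Automatic relevance determination

Section 6.4.3 estimated the hyperparameters. The result of this estimation also reveals how important each input variable is for the predictive distribution.

For example, consider the kernel

$$k(\mathbf{x}, \mathbf{x}') = \theta_0 \exp\left\{-\frac{1}{2}\sum_{i=1}^{2}\eta_i (x_i - x_i')^2\right\} \tag{6.71}$$

where $\theta_0, \eta_1, \eta_2$ are hyperparameters.

Using this kernel, consider the prior distribution over $\mathbf{y}$:

$$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{0}, \mathbf{K}) \tag{6.60}$$
24 / 34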
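6.4.4 Automatic relevance determination

[Figure not included in the transcript: samples drawn from the prior over $\mathbf{y}$ for different values of $\eta_1, \eta_2$.]

As $\eta_i$ is made smaller, the change in $y$ caused by a change in $x_i$ becomes smaller.

Consequently, when the hyperparameters are found by maximizing the likelihood $p(\mathbf{t} \mid \boldsymbol{\theta})$ of the targets $\mathbf{t}$ (the function values $\mathbf{y}$ plus noise), an input $x_i$ that has little effect on the targets receives a small value of $\eta_i$.
25 / 34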
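The effect of a small $\eta_i$ can be seen directly in the kernel values. The sketch below evaluates an ARD kernel of the form $\theta_0 \exp\{-\tfrac{1}{2}\sum_i \eta_i (x_i - x_i')^2\}$ for two points that differ only in $x_2$; the specific points and $\eta$ values are assumptions picked for illustration.

```python
import numpy as np

def ard_kernel(X, Z, theta0=1.0, eta=(1.0, 1.0)):
    """ARD kernel: theta0 * exp(-0.5 * sum_i eta_i (x_i - z_i)^2)."""
    eta = np.asarray(eta, dtype=float)
    diff = X[:, None, :] - Z[None, :, :]             # pairwise differences
    return theta0 * np.exp(-0.5 * np.sum(eta * diff ** 2, axis=2))

# Two points that differ only in the second input dimension
A = np.array([[0.0, 0.0]])
B = np.array([[0.0, 3.0]])

k_relevant = ard_kernel(A, B, eta=(1.0, 1.0))[0, 0]      # x2 matters
k_irrelevant = ard_kernel(A, B, eta=(1.0, 1e-4))[0, 0]   # x2 nearly ignored
print(k_relevant, k_irrelevant)
```

With $\eta_2$ close to zero the prior correlation between $y(\mathbf{x})$ and $y(\mathbf{x}')$ stays near $\theta_0$ however far apart the points are along $x_2$, so sampled functions are nearly constant in that direction.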
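6.4.5 Gaussian processes for classification

We now use Gaussian processes for two-class classification.

For regression we assumed, as in (6.60), that the joint distribution $p(\mathbf{y})$ of the function values is Gaussian. In that case $y_n$ takes values over the entire real line.

For classification the output should satisfy $0 \le y_n \le 1$. We therefore place the Gaussian process on the activations $a_n = a(\mathbf{x}_n)$ rather than on the outputs, and define the output as $y_n = \sigma(a_n)$.

Next we make the model probabilistic. Setting the probability of $t_n = 1$ to $p(t_n = 1 \mid a_n) = \sigma(a_n)$ gives $p(t_n = 0 \mid a_n) = 1 - \sigma(a_n)$, and hence

$$p(t_n \mid a_n) = \sigma(a_n)^{t_n}\left(1 - \sigma(a_n)\right)^{1 - t_n} \tag{6.73}$$

As in regression, we use the training data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and $\mathbf{t}_N = (t_1, \dots, t_N)^{\mathsf T}$ to find the distribution $p(t_{N+1} \mid \mathbf{t}_N)$ of the target $t_{N+1}$ for a new input $\mathbf{x}_{N+1}$. (The dependence on the input variables is omitted from the notation.)
26 / 34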
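6.4.5 Gaussian processes for classification

First, with $\mathbf{a}_{N+1} = (a(\mathbf{x}_1), a(\mathbf{x}_2), \dots, a(\mathbf{x}_{N+1}))^{\mathsf T}$, the Gaussian process assumption gives the joint distribution of the activations:

$$p(\mathbf{a}_{N+1}) = \mathcal{N}(\mathbf{a}_{N+1} \mid \mathbf{0}, \mathbf{C}_{N+1}) \tag{6.74}$$

where the elements of the covariance matrix $\mathbf{C}_{N+1}$ are

$$(\mathbf{C}_{N+1})_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \nu\delta_{nm} \tag{6.75}$$

and $\nu$ is a noise term.

What we want is the distribution $p(t_{N+1} \mid \mathbf{t}_N)$ of the target $t_{N+1}$. For two-class classification, $p(t_{N+1} = 0 \mid \mathbf{t}_N) = 1 - p(t_{N+1} = 1 \mid \mathbf{t}_N)$, so it suffices to find $p(t_{N+1} = 1 \mid \mathbf{t}_N)$.
27 / 34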
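6.4.5 Gaussian processes for classification

Writing $a_{N+1} = a(\mathbf{x}_{N+1})$ for the activation at the new input, we have

$$\begin{aligned}
p(t_{N+1} = 1, \mathbf{t}_N) &= \int p(t_{N+1} = 1, \mathbf{t}_N, a_{N+1})\, da_{N+1} \\
&= \int p(t_{N+1} = 1 \mid \mathbf{t}_N, a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, p(\mathbf{t}_N)\, da_{N+1} \\
&= \int p(t_{N+1} = 1 \mid a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, p(\mathbf{t}_N)\, da_{N+1}
\end{aligned}$$

Dividing by $p(\mathbf{t}_N)$, $p(t_{N+1} = 1 \mid \mathbf{t}_N)$ is given by

$$p(t_{N+1} = 1 \mid \mathbf{t}_N) = \int p(t_{N+1} = 1 \mid a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, da_{N+1} \tag{6.76}$$

where $p(t_{N+1} = 1 \mid a_{N+1}) = \sigma(a_{N+1})$.

This integral cannot be evaluated analytically, and various approximation methods have been proposed. Here we use the Laplace approximation.
28 / 34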
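6.4.6 Laplace approximation

This section uses the Laplace approximation to evaluate the integral (6.76).

First, we use Bayes' theorem to rewrite $p(a_{N+1} \mid \mathbf{t}_N)$ as

$$p(a_{N+1} \mid \mathbf{t}_N) = \int p(a_{N+1} \mid \mathbf{a}_N)\, p(\mathbf{a}_N \mid \mathbf{t}_N)\, d\mathbf{a}_N \tag{6.77}$$

where $p(\mathbf{a}_N \mid \mathbf{t}_N)$ is the posterior distribution over the activations at the training inputs.

By analogy with the regression results (6.66) and (6.67) for $p(t_{N+1} \mid \mathbf{t}_N)$, the conditional distribution $p(a_{N+1} \mid \mathbf{a}_N)$ is

$$p(a_{N+1} \mid \mathbf{a}_N) = \mathcal{N}\left(a_{N+1} \mid \mathbf{k}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{a}_N,\; c - \mathbf{k}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{k}\right) \tag{6.78}$$
29 / 34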
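6.4.6 Laplace approximation

We approximate $p(\mathbf{a}_N \mid \mathbf{t}_N)$ by a Gaussian (the Laplace approximation).

This requires the mode $\mathbf{a}_N = \mathbf{a}_N^{\star}$, which satisfies

$$\frac{\partial p(\mathbf{a}_N \mid \mathbf{t}_N)}{\partial \mathbf{a}_N} = \nabla p(\mathbf{a}_N \mid \mathbf{t}_N) = \mathbf{0}$$

and the Hessian $-\nabla\nabla \ln p(\mathbf{a}_N \mid \mathbf{t}_N)$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^{\star}$. (See Sections 4.4 and 4.5.)

The prior $p(\mathbf{a}_N)$ is given by $p(\mathbf{a}_N) = \mathcal{N}(\mathbf{a}_N \mid \mathbf{0}, \mathbf{C}_N)$, which is (6.74) with $N + 1 \to N$.

By independence of the data points, the likelihood $p(\mathbf{t}_N \mid \mathbf{a}_N)$ is

$$p(\mathbf{t}_N \mid \mathbf{a}_N) = \prod_{n=1}^{N}\sigma(a_n)^{t_n}\left(1 - \sigma(a_n)\right)^{1 - t_n} = \prod_{n=1}^{N} e^{a_n t_n}\sigma(-a_n) \tag{6.79}$$
30 / 34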
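The second equality in (6.79) follows from two elementary properties of the logistic sigmoid. A short worked derivation for a single factor:

```latex
\begin{align*}
1 - \sigma(a) &= \sigma(-a), \qquad
\frac{\sigma(a)}{\sigma(-a)} = \frac{1 + e^{a}}{1 + e^{-a}} = e^{a} \\[4pt]
\sigma(a)^{t}\,\bigl(1 - \sigma(a)\bigr)^{1 - t}
  &= \sigma(a)^{t}\,\sigma(-a)^{1 - t}
   = \left(\frac{\sigma(a)}{\sigma(-a)}\right)^{t}\sigma(-a)
   = e^{a t}\,\sigma(-a)
\end{align*}
```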
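6.4.6 Laplace approximation

By Bayes' theorem, $p(\mathbf{a}_N \mid \mathbf{t}_N) \propto p(\mathbf{t}_N \mid \mathbf{a}_N)\, p(\mathbf{a}_N)$. Carrying out the calculation, the mode satisfies

$$\mathbf{a}_N^{\star} = \mathbf{C}_N(\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{6.84}$$

where $\boldsymbol{\sigma}_N = (\sigma(a_1), \sigma(a_2), \dots, \sigma(a_N))^{\mathsf T}$. Because $\boldsymbol{\sigma}_N$ itself depends on $\mathbf{a}_N^{\star}$, this is a self-consistency condition that has to be solved iteratively.

The Hessian $\mathbf{H}$ at $\mathbf{a}_N = \mathbf{a}_N^{\star}$ is

$$\mathbf{H} = \mathbf{W}^{\star} + \mathbf{C}_N^{-1} \tag{6.85}$$

where $\mathbf{W}$ is the diagonal matrix with diagonal elements $\sigma(a_n)(1 - \sigma(a_n))$, and $\mathbf{W}^{\star}$ is $\mathbf{W}$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^{\star}$.

The posterior $p(\mathbf{a}_N \mid \mathbf{t}_N)$ is therefore approximated as follows (Laplace approximation):

$$p(\mathbf{a}_N \mid \mathbf{t}_N) \simeq \mathcal{N}(\mathbf{a}_N \mid \mathbf{a}_N^{\star}, \mathbf{H}^{-1}) \tag{6.86}$$
31 / 34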
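6.4.6 Laplace approximation

From (6.78) and (6.86), the integral (6.77) can be approximated as

$$p(a_{N+1} \mid \mathbf{t}_N) \simeq \int \mathcal{N}\left(a_{N+1} \mid \mathbf{k}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{a}_N,\; c - \mathbf{k}^{\mathsf T}\mathbf{C}_N^{-1}\mathbf{k}\right)\mathcal{N}(\mathbf{a}_N \mid \mathbf{a}_N^{\star}, \mathbf{H}^{-1})\, d\mathbf{a}_N$$

Using (2.115), $p(a_{N+1} \mid \mathbf{t}_N)$ is a Gaussian with the following mean and variance:

$$\mathbb{E}[a_{N+1} \mid \mathbf{t}_N] = \mathbf{k}^{\mathsf T}(\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{6.87}$$

$$\operatorname{var}[a_{N+1} \mid \mathbf{t}_N] = c - \mathbf{k}^{\mathsf T}(\mathbf{W}_N^{-1} + \mathbf{C}_N)^{-1}\mathbf{k} \tag{6.88}$$

where $\mathbf{W}_N$ is the $\mathbf{W}^{\star}$ of (6.85).
32 / 34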
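6.4.6 Laplace approximation

Now that $p(a_{N+1} \mid \mathbf{t}_N)$ is known, $p(t_{N+1} = 1 \mid \mathbf{t}_N)$ in (6.76) can be computed approximately using (4.153).

As discussed in Section 6.4.3, the hyperparameters $\boldsymbol{\theta}$ enter through $\mathbf{C}_N$, so we look for the $\boldsymbol{\theta}$ that maximizes $p(\mathbf{t}_N \mid \boldsymbol{\theta})$.

As with (6.61), $p(\mathbf{t}_N \mid \boldsymbol{\theta})$ is given by

$$p(\mathbf{t}_N \mid \boldsymbol{\theta}) = \int p(\mathbf{t}_N \mid \mathbf{a}_N)\, p(\mathbf{a}_N \mid \boldsymbol{\theta})\, d\mathbf{a}_N \tag{6.89}$$

but this integral cannot be evaluated analytically either, so the Laplace approximation is applied once more to obtain an approximate $p(\mathbf{t}_N \mid \boldsymbol{\theta})$ that can be maximized with respect to $\boldsymbol{\theta}$.
33 / 34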
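The whole prediction pipeline of this section can be sketched in a few lines. The code below finds the mode with the Newton-Raphson update from PRML (6.83), which is not shown on the slides, then evaluates (6.87) and (6.88) and applies the probit-based approximation (4.153), $\int \sigma(a)\,\mathcal{N}(a \mid \mu, s^2)\,da \simeq \sigma\bigl(\mu / \sqrt{1 + \pi s^2/8}\bigr)$. The unit-length-scale RBF kernel, the value of `nu`, the fixed iteration count, and the toy data are assumptions for illustration; hyperparameter learning via (6.89) is not covered.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def rbf(xa, xb):
    """Hypothetical kernel choice: unit-variance, unit-length-scale RBF (1-D)."""
    return np.exp(-0.5 * (xa[:, None] - xb[None, :]) ** 2)

def laplace_gp_classify(x, t, x_new, nu=1e-6, n_iter=50):
    N = len(x)
    I = np.eye(N)
    C = rbf(x, x) + nu * I                           # C_N, eq. (6.75)
    a = np.zeros(N)
    for _ in range(n_iter):
        s = sigmoid(a)
        W = np.diag(s * (1.0 - s))
        # Newton-Raphson step, PRML (6.83): a <- C (I + W C)^{-1} (t - s + W a)
        a = C @ np.linalg.solve(I + W @ C, t - s + W @ a)
    s = sigmoid(a)
    W = np.diag(s * (1.0 - s))
    k = rbf(x, x_new)                                # N x Q
    c = 1.0 + nu                                     # k(x*, x*) + nu for this kernel
    mean = k.T @ (t - s)                             # eq. (6.87)
    # (W^{-1} + C)^{-1} = (I + W C)^{-1} W avoids inverting W explicitly
    var = c - np.sum(k * np.linalg.solve(I + W @ C, W @ k), axis=0)  # eq. (6.88)
    kappa = 1.0 / np.sqrt(1.0 + np.pi * var / 8.0)   # eq. (4.153)
    return sigmoid(kappa * mean), a, C

x = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])
t = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
x_new = np.array([-1.5, 0.0, 1.5])
p, a_star, C = laplace_gp_classify(x, t, x_new)
print(p)
```

On this symmetric toy data set the predicted probability at the midpoint is one half, and it moves toward the label of whichever cluster the query point is closer to. At convergence `a_star` satisfies the fixed-point condition (6.84).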
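6.4.7 Connection to neural networks

In a Bayesian neural network (→ Section 5.7), the output function (network function) $y(\mathbf{x}, \mathbf{w})$ together with a prior distribution over $\mathbf{w}$ yields a prior distribution over the output function.

Let $M$ be the number of hidden units in the network. It is known that, as $M \to \infty$, the prior over the output function approaches the prior over functions of a Gaussian process. (Neal 1996)

In a neural network with finite $M$, the components $y_k(\mathbf{x}, \mathbf{w})$ of the output $\mathbf{y}(\mathbf{x}, \mathbf{w})$ are not independent, because they share the hidden-unit weights.

The fact that the network approaches a Gaussian process as $M \to \infty$ implies that the outputs $y_k(\mathbf{x}, \mathbf{w})$ become independent in that limit.
34 / 34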