PRML (Classification)
gucchi
September 06, 2019
Transcript
A Seminar for Understanding Machine Learning in Depth Using PRML [Classification]

1 / 47
0. About this seminar

This seminar focuses on the linear classification models of PRML Chapter 4.

Note that the equation numbers in these slides differ from the equation numbers in PRML.

2 / 47
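Contents

1. Preliminaries
   1-1. Approaches to classification
2. Fisher's linear discriminant (PRML 4.1.4)
3. Logistic regression (PRML 4.3, 4.4, 4.5)
   3-1. Logistic regression
   3-2. Maximum likelihood estimation for logistic regression
   3-3. Bayesian logistic regression

3 / 47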
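1. Preliminaries

In machine learning, and in supervised learning in particular, we first prepare a set of input data $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding set of target vectors $\{\mathbf{t}_1, \mathbf{t}_2, \cdots, \mathbf{t}_N\}$ (the training data, also called teacher data).

Using the training data, we build a function $y(\mathbf{x})$ that predicts the target vector from the input (learning).

After learning, the target vector of an unseen input $\mathbf{x}$ is predicted by $y(\mathbf{x})$.

When each input vector is assigned to one of a finite number of discrete categories (for example, handwritten character recognition), the problem is called classification. When the output is a continuous variable, the problem is called regression.

This seminar deals with classification.

4 / 47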
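1-1. Approaches to classification

We first introduce two approaches to classification given training data.

The first is to construct a discriminant function $y(\mathbf{x})$ from the training data.

Here $\mathbf{x}$ is a $D$-dimensional input vector (for example $\mathbf{x}_n$, the input of the $n$-th training example).

In this approach, the value of $y(\mathbf{x})$ is used to predict which class an unseen input $\mathbf{x}$ should be assigned to. (Examples follow later; for instance, $\mathbf{x}$ belongs to class 1 if $y(\mathbf{x}) \geq 0$ and to class 2 if $y(\mathbf{x}) < 0$.)

5 / 47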
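1-1. Approaches to classification

In practice, however, the function $y(\mathbf{x})$ is not built from scratch out of the training data.

A common method is to prepare a parameter vector (weights) $\mathbf{w} = (w_1, w_2, \cdots, w_D)^T$ with the same dimension as the input vector, together with a scalar parameter (bias) $w_0$, and to use the function

$$ y(\mathbf{x}, \mathbf{w}, w_0) = f(\mathbf{w}^T\mathbf{x} + w_0) \tag{1.1} $$

Here $f$ is a nonlinear function called the activation function (for example, the logistic sigmoid function).

The parameters $\mathbf{w}, w_0$ are then estimated from the training data. Writing the estimated values as $\mathbf{w}^\star, w_0^\star$, the function $y(\mathbf{x}, \mathbf{w} = \mathbf{w}^\star, w_0 = w_0^\star)$ is the discriminant function. (There are many ways to fit the parameters.)

This is how a discriminant function is constructed. Among such methods, this seminar introduces Fisher's linear discriminant.

6 / 47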
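1-1. Approaches to classification

The second approach does not construct a discriminant function $y(\mathbf{x})$. Instead, it determines the conditional probability $p(C_k|\mathbf{x})$ from the training data.

Here we assume there are $K$ classes (class 1, class 2, $\cdots$, class $K$), and $C_k$ denotes the $k$-th class.

The conditional probability $p(C_k|\mathbf{x})$ therefore gives the probability that a given input $\mathbf{x}$ belongs to the $k$-th class.

In practice, the form of $p(C_k|\mathbf{x})$ is not determined from scratch out of the training data. The form is fixed in advance and its parameters are determined from the training data (by maximum likelihood estimation).

7 / 47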
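1-1. Approaches to classification

Unlike a discriminant function, the functional form assumed for the conditional probability must satisfy the normalization condition

$$ \sum_k p(C_k|\mathbf{x}) = 1 \tag{1.2} $$

Modeling the conditional probability in this way makes it possible to move naturally from maximum likelihood estimation to Bayesian estimation.

As a concrete example, this seminar treats maximum likelihood estimation and Bayesian estimation for logistic regression with $K = 2$.

8 / 47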
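2. Fisher's linear discriminant

As a concrete example of determining a discriminant function from training data, we explain Fisher's linear discriminant.

We first consider two-class classification, with class labels $C_1$ and $C_2$.

As the discriminant function, take (1.1) with the activation function $f$ set to the identity:

$$ y(\mathbf{x}, \mathbf{w}, w_0) = \mathbf{w}^T\mathbf{x} + w_0 \tag{2.1} $$

Let the inputs of the $N$ training examples be $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding target variables be $\{t_1, t_2, \cdots, t_N\}$. For convenience, class $C_1$ corresponds to $t = 1$ and class $C_2$ to $t = 0$.

The class of an input vector is decided as follows: given an input vector $\mathbf{x}_i$, it belongs to class $C_1$ (that is, $t_i = 1$) if $y(\mathbf{x}_i, \mathbf{w}, w_0) \geq 0$, and to class $C_2$ (that is, $t_i = 0$) if $y(\mathbf{x}_i, \mathbf{w}, w_0) < 0$.

9 / 47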
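2. Fisher's linear discriminant

The $D$-dimensional input $\mathbf{x}$ is projected onto the one-dimensional value $y(\mathbf{x}, \mathbf{w}, w_0)$, and the class is decided by the sign of $y$. The class is therefore decided from a quantity ($y$) in which much of the information needed for the decision has been lost through dimensionality reduction.

In other words, even when the class regions are well separated in the $D$-dimensional input space, the separation can disappear (the classes can overlap) depending on how the data are projected onto one dimension.

The next slide shows a concrete example for $D = 2$.

10 / 47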
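2. Fisher's linear discriminant

The blue points are training data belonging to class $C_1$ and the red points are training data belonging to class $C_2$. Both images use exactly the same training data.

Changing the parameters $\mathbf{w}, w_0$ moves the line $y(\mathbf{x}, \mathbf{w}, w_0) = 0$ freely around the $D = 2$ input space.

The side with more blue points is taken as the region $y(\mathbf{x}, \mathbf{w}, w_0) \geq 0$, and the side with more red points as the region $y(\mathbf{x}, \mathbf{w}, w_0) < 0$.

In both graphs the training data are identical and well separated in the input space, but the projection on the left destroys part of that separation.

11 / 47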
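2. Fisher's linear discriminant

Fisher's linear discriminant looks for the projection, that is, the parameters $\mathbf{w}, w_0$, that separates the training data best.

First, let $\mathbf{m}_1, \mathbf{m}_2$ be the class-wise mean vectors of the training inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$:

$$ \mathbf{m}_1 = \frac{1}{N_1}\sum_{n \in C_1} \mathbf{x}_n, \qquad \mathbf{m}_2 = \frac{1}{N_2}\sum_{n \in C_2} \mathbf{x}_n \tag{2.2} $$

Here $N_1$ and $N_2$ are the numbers of training examples belonging to class $C_1$ and class $C_2$ respectively. (Of course, $N_1 + N_2 = N$.)

12 / 47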
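2. Fisher's linear discriminant

By the linearity of $y(\mathbf{x}, \mathbf{w}, w_0)$ in (2.1), the class-wise means after projection are

$$ m_1 = \frac{1}{N_1}\sum_{n \in C_1} y(\mathbf{x}_n, \mathbf{w}, w_0) = \mathbf{w}^T\mathbf{m}_1 + w_0, \qquad m_2 = \frac{1}{N_2}\sum_{n \in C_2} y(\mathbf{x}_n, \mathbf{w}, w_0) = \mathbf{w}^T\mathbf{m}_2 + w_0 \tag{2.3} $$

where the scalars $m_1, m_2$ are the projected means and the bold $\mathbf{m}_1, \mathbf{m}_2$ are the mean vectors of (2.2).

In a well-separated projection, these projected class means should differ greatly.

That is, the larger the value of

$$ m_2 - m_1 = \mathbf{w}^T(\mathbf{m}_2 - \mathbf{m}_1) \tag{2.4} $$

the better the classes should be separated after projection.

13 / 47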
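2. Fisher's linear discriminant

However, even if the difference between the projected class means is large, the projected class regions will overlap if the within-class variance after projection is also large.

We therefore also want the following within-class variances to be small:

$$ s_1^2 = \sum_{n \in C_1} \{y(\mathbf{x}_n, \mathbf{w}, w_0) - m_1\}^2, \qquad s_2^2 = \sum_{n \in C_2} \{y(\mathbf{x}_n, \mathbf{w}, w_0) - m_2\}^2 \tag{2.5} $$

Hence the parameters $\mathbf{w}, w_0$ that maximize the following Fisher criterion are the ones that best preserve the class separation after projection:

$$ J(\mathbf{w}, w_0) = \frac{(m_2 - m_1)^2}{s_1^2 + s_2^2} \tag{2.6} $$

14 / 47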
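2. Fisher's linear discriminant

Let us write the right-hand side of (2.6) in terms of $\mathbf{w}, w_0$.

For the numerator, (2.4) gives

$$ (m_2 - m_1)^2 = \{\mathbf{w}^T(\mathbf{m}_2 - \mathbf{m}_1)\}^2 = \mathbf{w}^T(\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T\mathbf{w} = \mathbf{w}^T\mathbf{S}_B\mathbf{w} \tag{2.7} $$

Here $\mathbf{S}_B$ is called the between-class covariance matrix and is defined by

$$ \mathbf{S}_B = (\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T \tag{2.8} $$

15 / 47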
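2. Fisher's linear discriminant

Next comes the denominator $s_1^2 + s_2^2$ of (2.6). Writing just $s_1^2$ in terms of $\mathbf{w}, w_0$, (2.5) and (2.3) give

$$
\begin{aligned}
s_1^2 &= \sum_{n \in C_1} \{\mathbf{w}^T\mathbf{x}_n + w_0 - m_1\}^2 \\
&= \sum_{n \in C_1} \{\mathbf{w}^T\mathbf{x}_n - \mathbf{w}^T\mathbf{m}_1\}^2 \\
&= \sum_{n \in C_1} \{\mathbf{w}^T(\mathbf{x}_n - \mathbf{m}_1)\}^2 \\
&= \sum_{n \in C_1} \mathbf{w}^T(\mathbf{x}_n - \mathbf{m}_1)(\mathbf{x}_n - \mathbf{m}_1)^T\mathbf{w} \\
&= \mathbf{w}^T\left[\sum_{n \in C_1} (\mathbf{x}_n - \mathbf{m}_1)(\mathbf{x}_n - \mathbf{m}_1)^T\right]\mathbf{w}
\end{aligned}
\tag{2.9}
$$

16 / 47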
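2. Fisher's linear discriminant

Applying the same manipulation to $s_2^2$, the denominator $s_1^2 + s_2^2$ of (2.6) becomes

$$ s_1^2 + s_2^2 = \mathbf{w}^T\mathbf{S}_W\mathbf{w} \tag{2.10} $$

Here $\mathbf{S}_W$ is called the total within-class covariance matrix and is defined by

$$ \mathbf{S}_W = \sum_{n \in C_1} (\mathbf{x}_n - \mathbf{m}_1)(\mathbf{x}_n - \mathbf{m}_1)^T + \sum_{n \in C_2} (\mathbf{x}_n - \mathbf{m}_2)(\mathbf{x}_n - \mathbf{m}_2)^T \tag{2.11} $$

The Fisher criterion (2.6) can therefore be written in terms of $\mathbf{w}$ as

$$ J(\mathbf{w}, w_0) = J(\mathbf{w}) = \frac{\mathbf{w}^T\mathbf{S}_B\mathbf{w}}{\mathbf{w}^T\mathbf{S}_W\mathbf{w}} \tag{2.12} $$

As a result, the Fisher criterion depends only on $\mathbf{w}$ and not on $w_0$.

17 / 47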
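2. Fisher's linear discriminant

We now look for the $\mathbf{w}$ that maximizes the Fisher criterion $J(\mathbf{w})$. Before doing so, note one important property of $J(\mathbf{w})$.

Take a nonzero constant $\alpha$ and compute $J(\alpha\mathbf{w})$:

$$ J(\alpha\mathbf{w}) = \frac{(\alpha\mathbf{w})^T\mathbf{S}_B(\alpha\mathbf{w})}{(\alpha\mathbf{w})^T\mathbf{S}_W(\alpha\mathbf{w})} = \frac{\mathbf{w}^T\mathbf{S}_B\mathbf{w}}{\mathbf{w}^T\mathbf{S}_W\mathbf{w}} = J(\mathbf{w}) \tag{2.13} $$

The Fisher criterion $J(\mathbf{w})$ is thus invariant under multiplying $\mathbf{w}$ by a constant (scale invariance).

By this property, once a maximizer $\mathbf{w} = \mathbf{w}^\star$ of $J(\mathbf{w})$ is found, any constant multiple $\alpha\mathbf{w}^\star$ also maximizes $J(\mathbf{w})$.

Consequently, when maximizing $J(\mathbf{w})$ we only need to know the direction of the maximizing $\mathbf{w}$.

18 / 47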
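2. Fisher's linear discriminant

To actually find the $\mathbf{w}$ that maximizes $J(\mathbf{w})$, differentiate $J(\mathbf{w})$ with respect to $\mathbf{w}$:

$$ \frac{\partial J(\mathbf{w})}{\partial \mathbf{w}} = \frac{2}{(\mathbf{w}^T\mathbf{S}_W\mathbf{w})^2}\left\{\mathbf{S}_B\mathbf{w}(\mathbf{w}^T\mathbf{S}_W\mathbf{w}) - \mathbf{S}_W\mathbf{w}(\mathbf{w}^T\mathbf{S}_B\mathbf{w})\right\} \tag{2.14} $$

A $\mathbf{w}$ that makes this zero satisfies

$$ \mathbf{S}_B\mathbf{w}(\mathbf{w}^T\mathbf{S}_W\mathbf{w}) - \mathbf{S}_W\mathbf{w}(\mathbf{w}^T\mathbf{S}_B\mathbf{w}) = 0 \tag{2.15} $$

Multiplying both sides from the left by $\mathbf{S}_W^{-1}$ and rearranging gives

$$
\begin{aligned}
&\mathbf{S}_W^{-1}\mathbf{S}_B\mathbf{w}(\mathbf{w}^T\mathbf{S}_W\mathbf{w}) - \mathbf{w}(\mathbf{w}^T\mathbf{S}_B\mathbf{w}) = 0 \\
&\rightarrow\ \mathbf{w} = \frac{\mathbf{w}^T\mathbf{S}_W\mathbf{w}}{\mathbf{w}^T\mathbf{S}_B\mathbf{w}}\,\mathbf{S}_W^{-1}\mathbf{S}_B\mathbf{w} \\
&\phantom{\rightarrow\ }\mathbf{w} \propto \mathbf{S}_W^{-1}\mathbf{S}_B\mathbf{w}
\end{aligned}
\tag{2.16}
$$

19 / 47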
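2. Fisher's linear discriminant

Here $\mathbf{S}_B\mathbf{w}$ satisfies

$$ \mathbf{S}_B\mathbf{w} = (\mathbf{m}_2 - \mathbf{m}_1)(\mathbf{m}_2 - \mathbf{m}_1)^T\mathbf{w} \propto \mathbf{m}_2 - \mathbf{m}_1 \tag{2.17} $$

because $(\mathbf{m}_2 - \mathbf{m}_1)^T\mathbf{w}$ is a scalar. Using this, the vector $\mathbf{w}$ that maximizes the Fisher criterion $J(\mathbf{w})$ is

$$ \mathbf{w} \propto \mathbf{S}_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1) \tag{2.18} $$

that is, a vector pointing in the same direction as $\mathbf{S}_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1)$.

In summary: when high-dimensional input data are projected onto one dimension with (2.1) to decide the class, Fisher's linear discriminant says that projecting with a parameter $\mathbf{w}$ in the direction $\mathbf{S}_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1)$ preserves the class separation of the input data as much as possible.

20 / 47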
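The result (2.18) and the scale invariance (2.13) can be checked numerically. The following is a minimal sketch in plain Python for $D = 2$; the toy data points and all helper names (`mean`, `scatter`, `quad`, `J`, `w_fisher`) are assumptions made for illustration and do not come from the slides.

```python
# Toy 2-D data; the points and helper names are illustrative assumptions.
C1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
C2 = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.5)]

def mean(points):
    """Class mean vector, Eq. (2.2)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """Sum of (x - m)(x - m)^T over one class, as a 2x2 nested list."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - m[0], p[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def quad(w, S):
    """Quadratic form w^T S w."""
    return sum(w[i] * S[i][j] * w[j] for i in range(2) for j in range(2))

m1, m2 = mean(C1), mean(C2)
S1, S2 = scatter(C1, m1), scatter(C2, m2)
SW = [[S1[i][j] + S2[i][j] for j in range(2)] for i in range(2)]  # Eq. (2.11)
dm = (m2[0] - m1[0], m2[1] - m1[1])
SB = [[dm[i] * dm[j] for j in range(2)] for i in range(2)]        # Eq. (2.8)

def J(w):
    """Fisher criterion, Eq. (2.12)."""
    return quad(w, SB) / quad(w, SW)

# Fisher direction w = S_W^{-1} (m2 - m1), Eq. (2.18), via the 2x2 inverse.
det = SW[0][0] * SW[1][1] - SW[0][1] * SW[1][0]
w_fisher = ((SW[1][1] * dm[0] - SW[0][1] * dm[1]) / det,
            (-SW[1][0] * dm[0] + SW[0][0] * dm[1]) / det)

print("J(w_fisher)   =", J(w_fisher))
print("J(3 w_fisher) =", J((3 * w_fisher[0], 3 * w_fisher[1])))  # same value
print("J(m2 - m1)    =", J(dm))  # projecting along the mean difference
```

Projecting along the plain mean difference $\mathbf{m}_2 - \mathbf{m}_1$ ignores the within-class scatter, so its criterion value cannot exceed the value at the Fisher direction.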
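3. Logistic regression

So far we have treated the method that determines a discriminant function $y(\mathbf{x})$ from training data. From here on we treat the method that determines the conditional probability $p(C_k|\mathbf{x})$.

As with discriminant functions, we prepare as training data a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding set of target variables $\{t_1, t_2, \cdots, t_N\}$.

Because we treat the case of $K = 2$ classes (binary classification), the target variable $t_n$ takes the discrete value 0 or 1.

As the classification model we introduce the logistic regression model. (Despite the word "regression", it is a model for classification.)

21 / 47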
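3-1. Logistic regression

As with discriminant functions, the conditional probability $p(C_k|\mathbf{x})$ is not built from scratch out of the training data. Instead, a parameter $\mathbf{w}$ is introduced and we consider $p(C_k|\mathbf{x}, \mathbf{w})$.

In logistic regression, the following function $p(C_1|\mathbf{x}, \mathbf{w})$ is used for class 1:

$$ p(C_1|\mathbf{x}, \mathbf{w}) = \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x})) \tag{3.1} $$

22 / 47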
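3-1. Logistic regression

First, the vector function $\boldsymbol{\phi}(\cdot)$ in the argument on the right-hand side of (3.1) stacks nonlinear functions $\phi_j(\mathbf{x})$ $(j = 0, \cdots, M-1)$ into a column vector: $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \cdots, \phi_{M-1}(\mathbf{x}))^T$.

One example of a basis function $\phi_j(x)$ is the Gaussian basis function

$$ \phi_j(x) = \exp\left\{-\frac{(x - \mu_j)^2}{2s^2}\right\} \tag{3.2} $$

This basis function is centered at $x = \mu_j$ and has a spread governed by the variance $s^2$.

From here on the discussion uses general basis functions $\phi_j(\mathbf{x})$.

23 / 47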
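3-1. Logistic regression

Next, the function $\sigma(\cdot)$ is called the logistic sigmoid function and is defined by

$$ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{3.3} $$

[Figure: graph of the logistic sigmoid function]

By definition the logistic sigmoid takes values between 0 and 1, so by (3.1) $p(C_1|\mathbf{x}, \mathbf{w})$ takes values in the range that a probability must satisfy:

$$ 0 < p(C_1|\mathbf{x}, \mathbf{w}) < 1 \tag{3.4} $$

24 / 47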
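As a small illustration of (3.3), here is a sketch of the logistic sigmoid in plain Python. Splitting on the sign of the argument is an implementation choice assumed here (it is not in the slides); it keeps `math.exp` from overflowing for large negative arguments.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, Eq. (3.3), evaluated without overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # a < 0 here, so exp(a) cannot overflow
    return e / (1.0 + e)

for a in (-5.0, 0.0, 5.0):
    # The last column illustrates sigma(a) + sigma(-a) = 1 (up to rounding).
    print(a, sigmoid(a), sigmoid(a) + sigmoid(-a))
```

The identity $\sigma(a) + \sigma(-a) = 1$ is what makes $p(C_2|\mathbf{x}, \mathbf{w}) = 1 - \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x}))$ a valid probability as well. In floating point the function saturates to exactly 0.0 or 1.0 for very large $|a|$, even though the mathematical value stays strictly inside $(0, 1)$.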
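3-1. Logistic regression

For class 2, $p(C_2|\mathbf{x}, \mathbf{w})$ is assumed to be

$$ p(C_2|\mathbf{x}, \mathbf{w}) = 1 - p(C_1|\mathbf{x}, \mathbf{w}) = 1 - \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x})) \tag{3.5} $$

By (3.4), $p(C_2|\mathbf{x}, \mathbf{w})$ also takes values in the range that a probability must satisfy:

$$ 0 < p(C_2|\mathbf{x}, \mathbf{w}) < 1 \tag{3.6} $$

With these definitions, the assumed function $p(C_k|\mathbf{x}, \mathbf{w})$ satisfies the normalization condition (1.2):

$$ \sum_k p(C_k|\mathbf{x}, \mathbf{w}) = p(C_1|\mathbf{x}, \mathbf{w}) + 1 - p(C_1|\mathbf{x}, \mathbf{w}) = 1 \tag{3.7} $$

Since the class with $t = 1$ is $C_1$ and the class with $t = 0$ is $C_2$, the likelihood function $p(t|\mathbf{x}, \mathbf{w})$ is

$$ p(t|\mathbf{x}, \mathbf{w}) = \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x}))^{t}\,\{1 - \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x}))\}^{1-t} \tag{3.8} $$

(A distribution of this form is called a Bernoulli distribution.)

25 / 47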
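3-2. Maximum likelihood estimation for logistic regression

Next we consider maximum likelihood estimation.

As usual, we prepare as training data a set of inputs $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding set of target variables $\mathbf{t} = \{t_1, t_2, \cdots, t_N\}$.

Assuming that each training example is generated independently from (3.8), the likelihood function $p(\mathbf{t}|\mathbf{X}, \mathbf{w})$ is

$$ p(\mathbf{t}|\mathbf{X}, \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n}(1 - y_n)^{1 - t_n} \tag{3.9} $$

where $y_n$ is defined by

$$ y_n = \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x}_n)) \tag{3.10} $$

26 / 47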
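3-2. Maximum likelihood estimation for logistic regression

Finding the $\mathbf{w}$ that maximizes the likelihood function (3.9) is equivalent to finding the $\mathbf{w}$ that minimizes the following negative log-likelihood:

$$
\begin{aligned}
E(\mathbf{w}) &= -\ln p(\mathbf{t}|\mathbf{X}, \mathbf{w}) \\
&= -\sum_{n=1}^{N} \ln\left\{y_n^{t_n}(1 - y_n)^{1 - t_n}\right\} \\
&= -\sum_{n=1}^{N} \left\{t_n \ln y_n + (1 - t_n)\ln(1 - y_n)\right\}
\end{aligned}
\tag{3.11}
$$

This is the error function known as the cross-entropy error, which is widely used in classification.

In probabilistic terms, minimizing the cross-entropy error in classification is the same as maximum likelihood estimation under the assumption that the likelihood is the Bernoulli distribution (3.8).

27 / 47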
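The equivalence between (3.9) and (3.11) can be seen on a handful of numbers. The sketch below is plain Python; the target list `t` and the prediction lists `good` and `bad` are made-up values, and the function names are assumptions for illustration.

```python
import math

def cross_entropy(t, y):
    """E = -sum_n { t_n ln y_n + (1 - t_n) ln(1 - y_n) }, Eq. (3.11)."""
    return -sum(tn * math.log(yn) + (1 - tn) * math.log(1 - yn)
                for tn, yn in zip(t, y))

def likelihood(t, y):
    """Bernoulli likelihood prod_n y_n^t_n (1 - y_n)^(1 - t_n), Eq. (3.9)."""
    p = 1.0
    for tn, yn in zip(t, y):
        p *= yn ** tn * (1 - yn) ** (1 - tn)
    return p

t = [1, 0, 1, 1]
good = [0.9, 0.2, 0.8, 0.7]  # predictions that mostly agree with t
bad = [0.4, 0.6, 0.3, 0.5]   # predictions that mostly disagree with t

# The two printed numbers on the first line agree: E(w) = -ln(likelihood).
print(cross_entropy(t, good), -math.log(likelihood(t, good)))
print(cross_entropy(t, bad))
```

Predictions that agree with the targets give a larger likelihood and therefore a smaller cross-entropy error.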
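3-2. Maximum likelihood estimation for logistic regression

To find the $\mathbf{w}$ that minimizes the negative log-likelihood (3.11), take the gradient of (3.11) with respect to $\mathbf{w}$ (see PRML Exercise 4.13):

$$ \nabla E(\mathbf{w}) = \sum_{n=1}^{N} (y_n - t_n)\boldsymbol{\phi}(\mathbf{x}_n) \tag{3.12} $$

Each term of this gradient is the product of the difference between the prediction $y_n$ and the true label $t_n$ (that is, the error) and the basis function vector $\boldsymbol{\phi}(\mathbf{x}_n)$.

28 / 47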
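3-2. Maximum likelihood estimation for logistic regression

The $\mathbf{w}$ that makes the gradient $\nabla E(\mathbf{w})$ zero cannot be found analytically.

The reason is that the prediction $y = \sigma(\mathbf{w}^T\boldsymbol{\phi}(\mathbf{x}))$ has the logistic function as its activation function.

When the $\mathbf{w}$ that zeroes the gradient $\nabla E(\mathbf{w})$ cannot be found analytically, gradient descent is often used. (This method is widely used for neural networks.)

In gradient descent, the parameters are first set to random initial values $\mathbf{w}^{(0)}$ and are then updated using the gradient of the error function:

$$ \mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \eta\nabla E(\mathbf{w}^{(0)}) \tag{3.13} $$

Here $\eta > 0$ is called the learning rate parameter.

Repeating this update moves the parameters in the direction that decreases $E(\mathbf{w})$ and, for a suitably small $\eta$, they converge to the parameters that minimize $E(\mathbf{w})$.

29 / 47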
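The update (3.13) with the gradient (3.12) can be sketched as follows in plain Python. Everything specific here is an assumption made for illustration: the basis `phi` (a constant plus the raw input), the toy data, the zero initial value instead of a random one, the learning rate `eta = 0.1`, and the fixed number of 2000 iterations.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, Eq. (3.3)."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def phi(x):
    """Hypothetical basis: a constant term plus the raw input."""
    return (1.0, x)

def error_and_grad(w, xs, ts):
    """Cross-entropy error, Eq. (3.11), and its gradient, Eq. (3.12)."""
    E, g = 0.0, [0.0] * len(w)
    for x, t in zip(xs, ts):
        f = phi(x)
        y = sigmoid(sum(wi * fi for wi, fi in zip(w, f)))
        E -= t * math.log(y) + (1 - t) * math.log(1 - y)
        for j, fj in enumerate(f):
            g[j] += (y - t) * fj
    return E, g

# Toy data with overlapping classes, so the minimizer is finite.
xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
ts = [0, 0, 1, 0, 1, 0, 1, 1]

w, eta = [0.0, 0.0], 0.1
history = []
for step in range(2000):
    E, g = error_and_grad(w, xs, ts)
    history.append(E)
    w = [wi - eta * gi for wi, gi in zip(w, g)]  # update rule, Eq. (3.13)

E, g = error_and_grad(w, xs, ts)
print("w =", w, "E =", E, "|grad| =", math.hypot(*g))
```

If the two classes were perfectly separable, the error could be pushed toward zero only by letting $\|\mathbf{w}\|$ grow without bound, so the toy data deliberately contains overlapping labels.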
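Bayesian estimation

So far (in maximum likelihood estimation) we have estimated the parameter $\mathbf{w}$ that maximizes the likelihood function.

Bayesian estimation uses the training data to obtain a probability distribution over the parameter $\mathbf{w}$ (a distribution with some width rather than a single point, called the posterior distribution).

The posterior distribution is then used to obtain the predictive distribution $p(t|\mathbf{x}, \mathbf{t}, \mathbf{X})$ of the output $t$ given an unseen input $\mathbf{x}$.

Using the conditional probability $p(t|\mathbf{x}, \mathbf{w})$ for the unseen data and the posterior distribution $p(\mathbf{w}|\mathbf{t}, \mathbf{X})$, this predictive distribution can be written as

$$ p(t|\mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(t|\mathbf{x}, \mathbf{w})\,p(\mathbf{w}|\mathbf{t}, \mathbf{X})\,d\mathbf{w} \tag{3.14} $$

(See PRML Eq. 1.68.)

For Bayesian estimation in general, see PRML 1.2.3.

30 / 47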
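3-3. Bayesian logistic regression

Next we treat logistic regression in a Bayesian way.

The goal of Bayesian estimation is to obtain the predictive distribution $p(t|\mathbf{x}, \mathbf{t}, \mathbf{X})$ of the output $t$ for an unseen input $\mathbf{x}$.

By (3.14), this predictive distribution can be written as an integral over the weights $\mathbf{w}$:

$$ p(t|\mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(t|\mathbf{x}, \mathbf{w})\,p(\mathbf{w}|\mathbf{t}, \mathbf{X})\,d\mathbf{w} \tag{3.15} $$

Here $p(t|\mathbf{x}, \mathbf{w})$ is the likelihood function and $p(\mathbf{w}|\mathbf{t}, \mathbf{X})$ is the posterior distribution of the parameters.

Since we are considering binary classification, we evaluate only the integral

$$ p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(C_1|\mathbf{x}, \mathbf{w})\,p(\mathbf{w}|\mathbf{t}, \mathbf{X})\,d\mathbf{w} \tag{3.16} $$

and obtain $p(C_2|\mathbf{x}, \mathbf{t}, \mathbf{X})$ from the normalization condition:

$$ p(C_2|\mathbf{x}, \mathbf{t}, \mathbf{X}) = 1 - p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X}) \tag{3.17} $$

31 / 47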
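3-3. Bayesian logistic regression

However, the integral (3.16) cannot be evaluated analytically. (This is a consequence of the logistic sigmoid function.)

We therefore evaluate the integral approximately.

Here (as in PRML Chapter 4) the Laplace approximation is used to approximate the integral.

Specifically, the Laplace approximation is applied to the parameter posterior $p(\mathbf{w}|\mathbf{t}, \mathbf{X})$ in (3.16), which approximates it by a Gaussian distribution.

We now give a brief explanation of the Laplace approximation.

32 / 47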
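3-3. Bayesian logistic regression

First consider a one-dimensional random variable $z$ with a probability distribution of the form

$$ p(z) = \frac{1}{Z}f(z) \tag{3.18} $$

where $Z$ is the normalization constant defined by

$$ Z = \int f(z)\,dz \tag{3.19} $$

The goal of the Laplace approximation is to approximate $p(z)$ by a Gaussian distribution centered on a mode (a value of $z$ at which $dp(z)/dz = 0$).

First find the mode $z = z_0$. By (3.18), the mode is the $z_0$ satisfying

$$ \left.\frac{df(z)}{dz}\right|_{z = z_0} = 0 \tag{3.20} $$

33 / 47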
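3-3. Bayesian logistic regression

Once the mode is found, approximate the function $\ln f(z)$ around $z = z_0$ by its Taylor expansion up to second order:

$$ \ln f(z) \simeq \ln f(z_0) - \frac{1}{2}A(z - z_0)^2 \tag{3.21} $$

where

$$ A = -\left.\frac{d^2}{dz^2}\ln f(z)\right|_{z = z_0} \tag{3.22} $$

By (3.20), the first-order term is absent from the right-hand side of (3.21).

Taking the exponential of both sides of (3.21) gives

$$ f(z) \simeq f(z_0)\exp\left\{-\frac{1}{2}A(z - z_0)^2\right\} \tag{3.23} $$

34 / 47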
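3-3. Bayesian logistic regression

After normalization, $p(z)$ can be approximated as

$$ p(z) \simeq \left(\frac{A}{2\pi}\right)^{1/2}\exp\left\{-\frac{1}{2}A(z - z_0)^2\right\} \tag{3.24} $$

This is the Laplace approximation.

Note that the Gaussian distribution is well defined only if $A > 0$.

35 / 47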
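The steps (3.20) to (3.24) can be sketched on a concrete one-dimensional example. The unnormalized density $f(z) = \exp(-z^2/2)\,\sigma(20z + 4)$ used below is an assumption chosen for illustration, as are the bisection search for the mode and all function names; the slides do not prescribe how the mode is found.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, Eq. (3.3)."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def dlogf(z):
    """d/dz ln f(z) for the assumed f(z) = exp(-z^2 / 2) * sigmoid(20 z + 4)."""
    return -z + 20.0 * (1.0 - sigmoid(20.0 * z + 4.0))

def curvature(z):
    """A = -d^2/dz^2 ln f(z), Eq. (3.22)."""
    s = sigmoid(20.0 * z + 4.0)
    return 1.0 + 400.0 * s * (1.0 - s)

# dlogf is strictly decreasing here, so bisection finds the unique mode z0,
# which is the condition of Eq. (3.20).
lo, hi = -5.0, 5.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if dlogf(mid) > 0:
        lo = mid
    else:
        hi = mid
z0 = 0.5 * (lo + hi)
A = curvature(z0)

def q(z):
    """Laplace approximation N(z | z0, 1/A), Eq. (3.24)."""
    return math.sqrt(A / (2.0 * math.pi)) * math.exp(-0.5 * A * (z - z0) ** 2)

print("mode z0 =", z0, " A =", A, " q(z0) =", q(z0))
```

For this $f$ the curvature is $A = 1 + 400\,\sigma(1 - \sigma) \geq 1$, so the condition $A > 0$ holds automatically. The approximation matches the mode and local curvature of $p(z)$ but not its skewness.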
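3-3. Bayesian logistic regression

Next we extend from a one-dimensional random variable to a vector.

That is, define the probability distribution $p(\mathbf{z})$ by

$$ p(\mathbf{z}) = \frac{1}{Z}f(\mathbf{z}) \tag{3.25} $$

where

$$ Z = \int f(\mathbf{z})\,d\mathbf{z} \tag{3.26} $$

As in the one-dimensional case, find the point $\mathbf{z}_0$ at which the gradient $\nabla f(\mathbf{z})$ is zero.

36 / 47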
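3-3. Bayesian logistic regression

Once the mode is found, approximate $\ln f(\mathbf{z})$ by its Taylor expansion around $\mathbf{z}_0$:

$$ \ln f(\mathbf{z}) \simeq \ln f(\mathbf{z}_0) - \frac{1}{2}(\mathbf{z} - \mathbf{z}_0)^T\mathbf{A}(\mathbf{z} - \mathbf{z}_0) \tag{3.27} $$

Here $\mathbf{A}$ is the $M \times M$ Hessian matrix defined by

$$ \mathbf{A} = -\left.\nabla\nabla \ln f(\mathbf{z})\right|_{\mathbf{z} = \mathbf{z}_0} \tag{3.28} $$

Taking the exponential of both sides of (3.27) gives

$$ f(\mathbf{z}) \simeq f(\mathbf{z}_0)\exp\left\{-\frac{1}{2}(\mathbf{z} - \mathbf{z}_0)^T\mathbf{A}(\mathbf{z} - \mathbf{z}_0)\right\} \tag{3.29} $$

After normalization, $p(\mathbf{z})$ can be approximated by a Gaussian distribution:

$$ p(\mathbf{z}) \simeq \frac{|\mathbf{A}|^{1/2}}{(2\pi)^{M/2}}\exp\left\{-\frac{1}{2}(\mathbf{z} - \mathbf{z}_0)^T\mathbf{A}(\mathbf{z} - \mathbf{z}_0)\right\} = \mathcal{N}(\mathbf{z}|\mathbf{z}_0, \mathbf{A}^{-1}) \tag{3.30} $$

37 / 47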
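3-3. Bayesian logistic regression

We now use the Laplace approximation described above to approximate the integral (3.16):

$$ p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(C_1|\mathbf{x}, \mathbf{w})\,p(\mathbf{w}|\mathbf{t}, \mathbf{X})\,d\mathbf{w} \tag{3.31} $$

First, to obtain the posterior distribution $p(\mathbf{w}|\mathbf{t}, \mathbf{X})$, introduce a prior distribution:

$$ p(\mathbf{w}) = \mathcal{N}(\mathbf{w}|\mathbf{m}_0, \mathbf{S}_0) \tag{3.32} $$

From (3.9), the likelihood function $p(\mathbf{t}|\mathbf{X}, \mathbf{w})$ was

$$ p(\mathbf{t}|\mathbf{X}, \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n}(1 - y_n)^{1 - t_n} \tag{3.33} $$

38 / 47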
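3-3. Bayesian logistic regression

By Bayes' theorem, the posterior distribution $p(\mathbf{w}|\mathbf{t}, \mathbf{X})$ is then

$$ p(\mathbf{w}|\mathbf{t}, \mathbf{X}) \propto p(\mathbf{w})\,p(\mathbf{t}|\mathbf{X}, \mathbf{w}) \tag{3.34} $$

so $\ln p(\mathbf{w}|\mathbf{t}, \mathbf{X})$ is

$$ \ln p(\mathbf{w}|\mathbf{t}, \mathbf{X}) = -\frac{1}{2}(\mathbf{w} - \mathbf{m}_0)^T\mathbf{S}_0^{-1}(\mathbf{w} - \mathbf{m}_0) + \sum_{n=1}^{N}\left\{t_n \ln y_n + (1 - t_n)\ln(1 - y_n)\right\} + \text{const.} \tag{3.35} $$

Find the parameter $\mathbf{w}_{\mathrm{MAP}}$ that maximizes this log posterior $\ln p(\mathbf{w}|\mathbf{t}, \mathbf{X})$ (for example by gradient descent), and evaluate the Hessian matrix at $\mathbf{w}_{\mathrm{MAP}}$:

$$ \mathbf{S}_N^{-1} = -\left.\nabla\nabla \ln p(\mathbf{w}|\mathbf{t}, \mathbf{X})\right|_{\mathbf{w} = \mathbf{w}_{\mathrm{MAP}}} = \mathbf{S}_0^{-1} + \left.\sum_{n=1}^{N} y_n(1 - y_n)\boldsymbol{\phi}_n\boldsymbol{\phi}_n^T\right|_{\mathbf{w} = \mathbf{w}_{\mathrm{MAP}}} \tag{3.36} $$

39 / 47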
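3-3. Bayesian logistic regression

With the Laplace approximation, the posterior distribution $p(\mathbf{w}|\mathbf{t}, \mathbf{X})$ can therefore be approximated as

$$ p(\mathbf{w}|\mathbf{t}, \mathbf{X}) \simeq \mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N) \tag{3.37} $$

and the integral (3.31) can be approximated as

$$ p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X}) \simeq \int \sigma(\mathbf{w}^T\boldsymbol{\phi})\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w} \tag{3.38} $$

where we used $p(C_1|\mathbf{x}, \mathbf{w}) = \sigma(\mathbf{w}^T\boldsymbol{\phi})$.

Next, rewrite the logistic sigmoid function as

$$ \sigma(\mathbf{w}^T\boldsymbol{\phi}) = \int \delta(a - \mathbf{w}^T\boldsymbol{\phi})\,\sigma(a)\,da \tag{3.39} $$

where $\delta(\cdot)$ is the Dirac delta function.

40 / 47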
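3-3. Bayesian logistic regression

With this, (3.38) can be rewritten as

$$
\begin{aligned}
\int \sigma(\mathbf{w}^T\boldsymbol{\phi})\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w}
&= \int\!\!\int \delta(a - \mathbf{w}^T\boldsymbol{\phi})\,\sigma(a)\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,da\,d\mathbf{w} \\
&= \int\left[\int \delta(a - \mathbf{w}^T\boldsymbol{\phi})\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w}\right]\sigma(a)\,da \\
&= \int p(a)\,\sigma(a)\,da
\end{aligned}
\tag{3.40}
$$

where

$$ p(a) = \int \delta(a - \mathbf{w}^T\boldsymbol{\phi})\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w} \tag{3.41} $$

41 / 47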
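3-3. Bayesian logistic regression

In the integral (3.41), integrating over $\mathbf{w}$ in the direction parallel to $\boldsymbol{\phi}$ imposes a linear constraint on those parameters, while integrating over $\mathbf{w}$ in all directions orthogonal to $\boldsymbol{\phi}$ marginalizes the Gaussian distribution $\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)$.

For example, take $\mathbf{w} = (w_1, w_2)^T$ and $\boldsymbol{\phi} = (\phi, 0)^T$. The integral (3.41) can then be written as

$$
\begin{aligned}
p(a) &= \int\!\!\int \delta(a - w_1\phi)\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,dw_1\,dw_2 \\
&= \int \delta(a - w_1\phi)\left[\int \mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,dw_2\right]dw_1
\end{aligned}
\tag{3.42}
$$

The marginal distribution obtained by marginalizing a Gaussian distribution is again Gaussian. The integral over the $w_2$ direction, which is orthogonal to $\boldsymbol{\phi}$, therefore marginalizes the Gaussian, and the resulting marginal distribution is $\mathcal{N}(w_1|(\mathbf{w}_{\mathrm{MAP}})_1, (\mathbf{S}_N)_{11})$.

42 / 47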
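3-3. Bayesian logistic regression

Carrying out the integral over $w_1$, the integral (3.41) becomes

$$
\begin{aligned}
p(a) &= \int \delta(a - w_1\phi)\,\mathcal{N}(w_1|(\mathbf{w}_{\mathrm{MAP}})_1, (\mathbf{S}_N)_{11})\,dw_1 \\
&= \frac{1}{|\phi|}\,\mathcal{N}(a/\phi\,|\,(\mathbf{w}_{\mathrm{MAP}})_1, (\mathbf{S}_N)_{11}) \\
&= \mathcal{N}(a\,|\,\phi(\mathbf{w}_{\mathrm{MAP}})_1, \phi^2(\mathbf{S}_N)_{11})
\end{aligned}
\tag{3.43}
$$

That is, the integral over $w_1$, the direction parallel to $\boldsymbol{\phi}$, imposes the linear constraint $w_1 = a/\phi$ on $w_1$.

It follows that $p(a)$ is a Gaussian distribution in the random variable $a$.

43 / 47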
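3-3. Bayesian logistic regression

A Gaussian distribution is determined uniquely by its mean and variance. The mean $\mu_a$ and variance $\sigma_a^2$ are

$$
\begin{aligned}
\mu_a &= \int p(a)\,a\,da = \int\!\!\int a\,\delta(a - \mathbf{w}^T\boldsymbol{\phi})\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w}\,da \\
&= \int \mathbf{w}^T\boldsymbol{\phi}\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w} = \mathbf{w}_{\mathrm{MAP}}^T\boldsymbol{\phi}
\end{aligned}
\tag{3.44}
$$

$$
\begin{aligned}
\sigma_a^2 &= \int p(a)\,(a^2 - \mu_a^2)\,da = \int\!\!\int (a^2 - \mu_a^2)\,\delta(a - \mathbf{w}^T\boldsymbol{\phi})\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w}\,da \\
&= \int \left((\mathbf{w}^T\boldsymbol{\phi})^2 - (\mathbf{w}_{\mathrm{MAP}}^T\boldsymbol{\phi})^2\right)\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w} \\
&= \boldsymbol{\phi}^T\left[\int (\mathbf{w}\mathbf{w}^T - \mathbf{w}_{\mathrm{MAP}}\mathbf{w}_{\mathrm{MAP}}^T)\,\mathcal{N}(\mathbf{w}|\mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\,d\mathbf{w}\right]\boldsymbol{\phi} \\
&= \boldsymbol{\phi}^T\mathbf{S}_N\boldsymbol{\phi}
\end{aligned}
\tag{3.45}
$$

44 / 47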
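3-3. Bayesian logistic regression

By (3.40), the predictive distribution $p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X})$ then becomes

$$ p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X}) \simeq \int \sigma(a)\,\mathcal{N}(a|\mu_a, \sigma_a^2)\,da \tag{3.46} $$

where $\mu_a$ and $\sigma_a^2$ are the mean and variance computed in (3.44) and (3.45).

This integral (3.46) also cannot be evaluated analytically.

We therefore introduce the inverse probit function $\Phi(a)$:

$$ \Phi(a) = \frac{1}{2}\left\{1 + \mathrm{erf}\left(\frac{a}{\sqrt{2}}\right)\right\} \tag{3.47} $$

where the error function $\mathrm{erf}(a)$ is defined by

$$ \mathrm{erf}(a) = \frac{2}{\sqrt{\pi}}\int_0^a \exp(-\theta^2)\,d\theta \tag{3.48} $$

45 / 47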
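3-3. Bayesian logistic regression

The logistic sigmoid function $\sigma(a)$ can be approximated by the scaled inverse probit function $\Phi\left(\sqrt{\pi/8}\,a\right)$.

[Figure: comparison of the logistic sigmoid function $\sigma(a)$ (red solid line) and the scaled inverse probit function $\Phi\left(\sqrt{\pi/8}\,a\right)$ (blue line)]

46 / 47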
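The quality of this approximation can be checked numerically. The sketch below is plain Python using `math.erf`; the grid on $[-8, 8]$ and the names `inv_probit`, `lam` and `max_gap` are assumptions made for illustration.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, Eq. (3.3)."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def inv_probit(a):
    """Phi(a) = (1/2) {1 + erf(a / sqrt(2))}, Eq. (3.47)."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

lam = math.sqrt(math.pi / 8.0)  # scale factor used in Phi(lam * a)

# Largest absolute gap between the two curves on a grid over [-8, 8].
grid = [i / 100.0 for i in range(-800, 801)]
max_gap = max(abs(sigmoid(a) - inv_probit(lam * a)) for a in grid)
print("max |sigma(a) - Phi(lam a)| on [-8, 8]:", max_gap)

# Both curves have slope 1/4 at the origin (central finite difference).
h = 1e-4
print((sigmoid(h) - sigmoid(-h)) / (2 * h),
      (inv_probit(lam * h) - inv_probit(-lam * h)) / (2 * h))
```

The scale $\sqrt{\pi/8}$ is the value for which the two functions have the same slope at the origin: $\sigma'(0) = 1/4$, and $\frac{d}{da}\Phi(\lambda a)\big|_{a=0} = \lambda/\sqrt{2\pi}$, which equals $1/4$ when $\lambda^2 = \pi/8$.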
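3-3. Bayesian logistic regression

The inverse probit function also has the following property (see PRML Exercise 4.26):

$$ \int \Phi(\lambda a)\,\mathcal{N}(a|\mu, \sigma^2)\,da = \Phi\left(\frac{\mu}{(\lambda^{-2} + \sigma^2)^{1/2}}\right) \tag{3.49} $$

Using these properties, the integral (3.46) is approximated as follows:

$$
\begin{aligned}
p(C_1|\mathbf{x}, \mathbf{t}, \mathbf{X}) &\simeq \int \sigma(a)\,\mathcal{N}(a|\mu_a, \sigma_a^2)\,da \\
&\simeq \int \Phi\left(\sqrt{\frac{\pi}{8}}\,a\right)\mathcal{N}(a|\mu_a, \sigma_a^2)\,da \\
&= \Phi\left(\frac{\mu_a}{(8/\pi + \sigma_a^2)^{1/2}}\right) \\
&\simeq \sigma\left(\sqrt{\frac{8}{\pi}}\,\frac{\mu_a}{(8/\pi + \sigma_a^2)^{1/2}}\right) \\
&= \sigma\left(\frac{\mu_a}{(1 + \pi\sigma_a^2/8)^{1/2}}\right)
\end{aligned}
\tag{3.50}
$$

where $\mu_a$ and $\sigma_a^2$ are given by (3.44) and (3.45).

47 / 47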
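The closed form (3.50) can be compared against a direct numerical evaluation of the integral (3.46). The sketch below is plain Python; the Riemann-sum integrator, the chosen $(\mu_a, \sigma_a^2)$ pairs and all function names are assumptions made for illustration, not part of the slides.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, Eq. (3.3)."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def gaussian(a, mu, var):
    """Density of N(a | mu, var)."""
    return math.exp(-0.5 * (a - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def predictive_approx(mu_a, var_a):
    """Closed form sigma(mu_a / sqrt(1 + pi * var_a / 8)), Eq. (3.50)."""
    return sigmoid(mu_a / math.sqrt(1.0 + math.pi * var_a / 8.0))

def predictive_numeric(mu_a, var_a, n=20001, width=10.0):
    """Riemann-sum estimate of the integral in Eq. (3.46) over +/- width s.d."""
    sd = math.sqrt(var_a)
    h = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        a = mu_a - width * sd + i * h
        total += sigmoid(a) * gaussian(a, mu_a, var_a) * h
    return total

for mu_a, var_a in [(0.0, 1.0), (1.0, 4.0), (-2.0, 0.5)]:
    print(mu_a, var_a,
          predictive_approx(mu_a, var_a), predictive_numeric(mu_a, var_a))
```

Because the argument of $\sigma$ in (3.50) is $\mu_a$ divided by a factor of at least 1, a larger posterior variance $\sigma_a^2$ pulls the predicted probability toward 0.5, while the decision boundary $\mu_a = 0$ stays where it is.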