
Bayesian Deep Learning (Sections 2.2 to 2.4)

catla
January 17, 2020


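Slides covering Sections 2.2 to 2.4 of "ベイズ深層学習" (Bayesian Deep Learning) by Atsushi Suyama (須山 敦志).

The backpropagation computation for RNNs is summarized at the following link.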
https://drive.google.com/file/d/1QkGb-Ra_E5PRSzq2YOzhax524GtBIRgO/view?usp=sharing



Transcript

  1. Bayesian Deep Learning

    Sections 2.2 to 2.4
    ܡɹঘً


  2. Today's topics
    • Neural networks
    • Efficient learning methods
    • Extended neural network models

  3. Neural networks

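  4. Feedforward neural networks
    A two-layer feedforward neural network can be modeled as follows.
    \[ y_n = W^{(2)} \Phi(W^{(1)} x_n) + \epsilon_n \]
    Input: $x_n \in \mathbb{R}^{H_0}$
    Label: $y_n \in \mathbb{R}^{D}$
    Noise: $\epsilon_n \in \mathbb{R}^{D}$
    First-layer parameters: $W^{(1)} \in \mathbb{R}^{H_1 \times H_0}$
    Second-layer parameters: $W^{(2)} \in \mathbb{R}^{D \times H_1}$
    Nonlinear function (activation function): $\Phi(\cdot)$
    Writing $z_n = \Phi(W^{(1)} x_n)$, each element $z_{n,h_1} \in \mathbb{R}$ is called a hidden unit. The weighted sum taken over the hidden units or the inputs, before $\Phi$ is applied, is called the activation.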
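    As a rough illustration of the model above, the sketch below evaluates the noise-free part $W^{(2)} \Phi(W^{(1)} x_n)$ for a single input. The helper names `matvec` and `forward`, the choice of tanh for $\Phi$, and all sizes and weight values are assumptions made for the example, not taken from the slides.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def forward(W1, W2, x, phi=math.tanh):
    """Noise-free output W2 * phi(W1 * x) of a two-layer network."""
    a = matvec(W1, x)            # activations of the hidden layer
    z = [phi(a_h) for a_h in a]  # hidden units z = phi(W1 x)
    return matvec(W2, z), z

# Hypothetical sizes: H0 = 3 inputs, H1 = 2 hidden units, D = 1 output.
W1 = [[0.5, -0.2, 0.1],
      [0.3, 0.8, -0.5]]
W2 = [[1.0, -1.0]]
y, z = forward(W1, W2, [1.0, 2.0, 3.0])
```

    Both hidden activations are about 0.4 for this input, so the two hidden units nearly cancel in the output.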

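  5. Feedforward neural networks
    $\Phi(\cdot)$ is called the activation function; a nonlinear function is normally used.
    [Examples of activation functions]
    • Sigmoid function
    • Hyperbolic tangent function (Tanh)
    • Cumulative distribution function
    • Gaussian error function
    • Ramp function (ReLU)
    • Exponential linear function (ELU)
    • etc.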
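    The listed functions can be written down in a few lines each. This is only a sketch: the function names are chosen here, the cumulative distribution function is assumed to be the standard normal one, and the ELU scale `alpha = 1.0` is an assumed default.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def tanh(a):
    return math.tanh(a)

def gauss_cdf(a):
    """Standard normal CDF, written with the Gaussian error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def relu(a):
    """Ramp function: passes positive inputs, clips the rest to zero."""
    return max(0.0, a)

def elu(a, alpha=1.0):
    """ELU; alpha = 1.0 is an assumed default, not taken from the slides."""
    return a if a > 0 else alpha * (math.exp(a) - 1.0)
```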

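  6. Feedforward neural networks
    Universal approximation theorem
      A two-layer feedforward neural network (one hidden layer) can approximate any continuous function by increasing the number of hidden units.
    Feedforward neural networks with multiple layers
    \[ y_n = W^{(L)} \Phi(W^{(L-1)} \cdots \Phi(W^{(1)} x_n) \cdots) + \epsilon_n \]
      Models with a deep network structure of three or more layers are generally referred to as deep learning. (The book also treats the two-layer neural network as a deep learning model.)
    In a two-layer neural network, fixing all elements of $W^{(2)}$ to 1 makes the model coincide with a generalized linear model (GLM). In that case the link function corresponds to the inverse of the activation function.
    In other words, a multi-layer neural network can be interpreted as a model that repeatedly applies the nonlinear transformation of a GLM.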

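  7. Gradient descent and the Newton-Raphson method
    Problem
      In a feedforward neural network the parameters to be learned sit inside nonlinear functions, so the minimizer cannot be found analytically.
    Solution
      Introduce an optimization method that finds the minimum numerically on a computer.
    The most widely used optimization method is gradient descent.
    Let $E(w)$ be the error function of a model with an $M$-dimensional parameter $w$, and define the gradient as
    \[ \nabla_w E(w) = \frac{\partial E(w)}{\partial w} = \left( \frac{\partial E(w)}{\partial w_1}, \frac{\partial E(w)}{\partial w_2}, \ldots, \frac{\partial E(w)}{\partial w_M} \right)^{T} \]
    This is the direction in which the error function increases most steeply within a small neighborhood measured in Euclidean distance.
    Gradient descent gives $w$ a suitable initial value and optimizes by repeatedly moving the parameters in the direction opposite to the gradient.
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} - \alpha \nabla_w E(w)|_{w = w_{\mathrm{old}}} \quad (\alpha > 0) \]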

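  8. Gradient descent and the Newton-Raphson method
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} - \alpha \nabla_w E(w)|_{w = w_{\mathrm{old}}} \]
    $\alpha$ is called the learning rate; it sets how far the parameters move in one update.
    Learning-rate trade-off
    Learning rate               Large   Small
    Learning speed              ◯       ×
    Stability of convergence    ×       ◯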
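    The trade-off can be seen on a small example. The sketch below applies the update rule to a hypothetical error function $E(w) = \frac{1}{2}(w_1^2 + 10 w_2^2)$; the function, the starting point and both learning rates are assumptions chosen for illustration.

```python
def grad_E(w):
    """Gradient of the toy error E(w) = 0.5 * (w1^2 + 10 * w2^2)."""
    return [w[0], 10.0 * w[1]]

def gradient_descent(w, alpha, steps):
    for _ in range(steps):
        g = grad_E(w)
        # w_new = w_old - alpha * gradient evaluated at w_old
        w = [w_i - alpha * g_i for w_i, g_i in zip(w, g)]
    return w

w_small = gradient_descent([1.0, 1.0], alpha=0.05, steps=100)  # stable but slow
w_large = gradient_descent([1.0, 1.0], alpha=0.25, steps=100)  # overshoots along w2
```

    With $\alpha = 0.05$ both coordinates shrink toward the minimum at the origin, although $w_1$ does so slowly. With $\alpha = 0.25$ each step multiplies $w_2$ by $1 - 0.25 \cdot 10 = -1.5$, so the iterates diverge; for this particular function any $\alpha$ above $0.2$ is unstable.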

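  9. Gradient descent and the Newton-Raphson method
    When the number of parameters $M$ is not large, the second derivatives of the error function can also be used to make optimization more efficient. The Newton-Raphson method is the representative example.
    Approximating the error function to be minimized to second order by a Taylor expansion around some point $\bar{w}$ gives
    \[ E(w) \approx \tilde{E}(w) = E(\bar{w}) + \nabla_w E(w)|_{w=\bar{w}}^{T} (w - \bar{w}) + \frac{1}{2} (w - \bar{w})^{T} \, \nabla_w^2 E(w)|_{w=\bar{w}} \, (w - \bar{w}) \]
    $\nabla_w^2 E$ is the Hessian matrix of the error function $E$, which is a symmetric matrix.
    \[ H = \nabla_w^2 E(w) = \begin{pmatrix} \frac{\partial^2 E(w)}{\partial w_1^2} & \cdots & \frac{\partial^2 E(w)}{\partial w_1 \partial w_M} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 E(w)}{\partial w_M \partial w_1} & \cdots & \frac{\partial^2 E(w)}{\partial w_M^2} \end{pmatrix} \]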

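  10. Gradient descent and the Newton-Raphson method
    \[ E(w) \approx \tilde{E}(w) = E(\bar{w}) + \nabla_w E(w)|_{w=\bar{w}}^{T} (w - \bar{w}) + \frac{1}{2} (w - \bar{w})^{T} \, \nabla_w^2 E(w)|_{w=\bar{w}} \, (w - \bar{w}) \]
    Let $A = \nabla_w E(w)|_{w=\bar{w}}$ and $B = \nabla_w^2 E(w)|_{w=\bar{w}}$, and compute the gradient $\nabla \tilde{E}(w) = \frac{\partial \tilde{E}(w)}{\partial w}$ of $\tilde{E}(w)$.
    \[ \tilde{E}(w) = E(\bar{w}) + A^{T} (w - \bar{w}) + \frac{1}{2} (w - \bar{w})^{T} B (w - \bar{w}) \]
    \[ \frac{\partial \tilde{E}(w)}{\partial w} = \frac{\partial E(\bar{w})}{\partial w} + \frac{\partial A^{T} (w - \bar{w})}{\partial (w - \bar{w})} \frac{\partial (w - \bar{w})}{\partial w} + \frac{1}{2} \frac{\partial (w - \bar{w})^{T} B (w - \bar{w})}{\partial (w - \bar{w})} \frac{\partial (w - \bar{w})}{\partial w} \]
    \[ = 0 + (A^{T})^{T} + \frac{1}{2} \cdot 2 B (w - \bar{w}) = A + B (w - \bar{w}) \]
    The last term uses the identity $\frac{\partial x^{T} C x}{\partial x} = 2 C x$, which holds for a symmetric matrix $C$; here it is applied to the Hessian $B$.
    Setting $\nabla \tilde{E}(w) = 0$ and solving for $w$:
    \[ \nabla \tilde{E}(w) = A + B (w - \bar{w}) = A + B w - B \bar{w} = 0 \iff B w = B \bar{w} - A \]
    \[ w = B^{-1} B \bar{w} - B^{-1} A = \bar{w} - B^{-1} A \quad (B \text{ is assumed to be nonsingular}) \]
    This yields
    \[ w = \bar{w} - \{ \nabla_w^2 E(w)|_{w=\bar{w}} \}^{-1} \nabla_w E(w)|_{w=\bar{w}} \]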

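  11. Gradient descent and the Newton-Raphson method
    Given
    \[ w = \bar{w} - \{ \nabla_w^2 E(w)|_{w=\bar{w}} \}^{-1} \nabla_w E(w)|_{w=\bar{w}} \]
    set $w_{\mathrm{old}} = \bar{w}$ and update repeatedly as follows to minimize $E(w)$.
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} - \{ \nabla_w^2 E(w)|_{w=w_{\mathrm{old}}} \}^{-1} \nabla_w E(w)|_{w=w_{\mathrm{old}}} \]
    Compared with optimization by gradient descent:
    Gradient descent:
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} - \alpha \nabla_w E(w)|_{w=w_{\mathrm{old}}} \]
    Newton-Raphson method:
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} - \{ \nabla_w^2 E(w)|_{w=w_{\mathrm{old}}} \}^{-1} \nabla_w E(w)|_{w=w_{\mathrm{old}}} \]
    The learning rate $\alpha$ in gradient descent corresponds to the inverse Hessian $\{ \nabla_w^2 E(w)|_{w=w_{\mathrm{old}}} \}^{-1}$.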
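    A minimal sketch of the Newton-Raphson update in two dimensions follows. The error function $E(w) = e^{w_1} - w_1 + w_2^2 + w_1 w_2$, the starting point and the helper names are assumptions chosen so that the Hessian stays invertible along the path; the gradient vanishes at $w = (0, 0)$.

```python
import math

def grad(w):
    """Gradient of the toy error E(w) = exp(w1) - w1 + w2^2 + w1*w2."""
    return [math.exp(w[0]) - 1.0 + w[1], 2.0 * w[1] + w[0]]

def hessian(w):
    return [[math.exp(w[0]), 1.0],
            [1.0, 2.0]]

def newton_step(w):
    g = grad(w)
    (a, b), (c, d) = hessian(w)
    det = a * d - b * c  # the Hessian is assumed to be invertible
    # Apply the closed-form 2x2 inverse of the Hessian to the gradient.
    step = [(d * g[0] - b * g[1]) / det, (a * g[1] - c * g[0]) / det]
    return [w[0] - step[0], w[1] - step[1]]

w = [1.0, 1.0]
for _ in range(10):
    w = newton_step(w)
```

    No learning rate appears: the inverse Hessian sets both the step length and its direction. For larger models the explicit inverse would normally be replaced by solving a linear system.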

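  12. Gradient descent and the Newton-Raphson method
    The Newton-Raphson method converges quadratically, so it reaches a solution more efficiently than plain gradient descent. It does have a drawback, however.
    Problem
      When the number of parameters is large, computing the Hessian matrix and its inverse takes a long time.
    Solution
      Compute the Hessian matrix approximately (quasi-Newton methods).

    What is quadratic convergence?
      Roughly, the number of correct digits in the estimate doubles with every update.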

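  13. Error backpropagation
    Error backpropagation (the error backpropagation method) is one of the standard learning methods for feedforward neural networks.
    An $L$-layer feedforward neural network is written as follows.
    \[ y_n = W^{(L)} \phi(W^{(L-1)} \cdots \phi(W^{(1)} x_n) \cdots) + \epsilon_n \]
    \[ y_{n,d} = \sum_{h_{L-1}=1}^{H_{L-1}} w^{(L)}_{d,h_{L-1}} \phi\left( \sum_{h_{L-2}=1}^{H_{L-2}} w^{(L-1)}_{h_{L-1},h_{L-2}} \cdots \phi\left( \sum_{h_0=1}^{H_0} w^{(1)}_{h_1,h_0} x_{n,h_0} \right) \cdots \right) + \epsilon_{n,d} \]
    \[ y_{n,d} = a^{(L)}_{n,d} + \epsilon_{n,d} \]
    \[ a^{(L)}_{n,d} = \sum_{h_{L-1}=1}^{H_{L-1}} w^{(L)}_{d,h_{L-1}} z^{(L-1)}_{n,h_{L-1}}, \qquad z^{(L-1)}_{n,h_{L-1}} = \phi(a^{(L-1)}_{n,h_{L-1}}) \]
    \[ a^{(l)}_{n,h_l} = \sum_{h_{l-1}=1}^{H_{l-1}} w^{(l)}_{h_l,h_{l-1}} z^{(l-1)}_{n,h_{l-1}}, \qquad z^{(l-1)}_{n,h_{l-1}} = \phi(a^{(l-1)}_{n,h_{l-1}}) \]
    \[ a^{(1)}_{n,h_1} = \sum_{h_0=1}^{H_0} w^{(1)}_{h_1,h_0} z^{(0)}_{n,h_0}, \qquad z^{(0)}_{n,h_0} = x_{n,h_0} \]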

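  14. Error backpropagation
    Let $W$ be the set of parameters of the neural network above and $N$ the number of training data points, and define the error function $E(W)$ as follows. We differentiate $E_n(W)$.
    \[ E(W) = \sum_{n=1}^{N} E_n(W) = \sum_{n=1}^{N} \left( \frac{1}{2} \sum_{d=1}^{D} (y_{n,d} - a^{(L)}_{n,d})^2 \right) \]
    Layer $L$
      Forward propagation:
      \[ a^{(L)}_{n,d} = \sum_{h_{L-1}=1}^{H_{L-1}} w^{(L)}_{d,h_{L-1}} z^{(L-1)}_{n,h_{L-1}} = \sum_{h_{L-1}=1}^{H_{L-1}} w^{(L)}_{d,h_{L-1}} \phi(a^{(L-1)}_{n,h_{L-1}}), \qquad z^{(L-1)}_{n,h_{L-1}} = \phi(a^{(L-1)}_{n,h_{L-1}}) \]
      Backpropagation, with $E_n(W) = \frac{1}{2} \sum_{d=1}^{D} (y_{n,d} - a^{(L)}_{n,d})^2$:
      \[ \frac{\partial E_n}{\partial w^{(L)}_{d,h_{L-1}}} = \frac{\partial E_n}{\partial a^{(L)}_{n,d}} \frac{\partial a^{(L)}_{n,d}}{\partial w^{(L)}_{d,h_{L-1}}} = (a^{(L)}_{n,d} - y_{n,d}) \, z^{(L-1)}_{n,h_{L-1}} = \delta^{(L)}_{n,d} \, z^{(L-1)}_{n,h_{L-1}} \]
    Layer $L-1$
      Forward propagation:
      \[ a^{(L-1)}_{n,h_{L-1}} = \sum_{h_{L-2}=1}^{H_{L-2}} w^{(L-1)}_{h_{L-1},h_{L-2}} z^{(L-2)}_{n,h_{L-2}} \]
      Backpropagation:
      \[ \frac{\partial E_n}{\partial w^{(L-1)}_{h_{L-1},h_{L-2}}} = \sum_{d=1}^{D} \frac{\partial E_n}{\partial a^{(L)}_{n,d}} \frac{\partial a^{(L)}_{n,d}}{\partial a^{(L-1)}_{n,h_{L-1}}} \frac{\partial a^{(L-1)}_{n,h_{L-1}}}{\partial w^{(L-1)}_{h_{L-1},h_{L-2}}} \]
      \[ \frac{\partial a^{(L)}_{n,d}}{\partial a^{(L-1)}_{n,h_{L-1}}} = \frac{\partial}{\partial a^{(L-1)}_{n,h_{L-1}}} \left( \sum_{h=1}^{H_{L-1}} w^{(L)}_{d,h} \phi(a^{(L-1)}_{n,h}) \right) = w^{(L)}_{d,h_{L-1}} \phi'(a^{(L-1)}_{n,h_{L-1}}) \]
      \[ \frac{\partial a^{(L-1)}_{n,h_{L-1}}}{\partial w^{(L-1)}_{h_{L-1},h_{L-2}}} = \frac{\partial}{\partial w^{(L-1)}_{h_{L-1},h_{L-2}}} \sum_{h=1}^{H_{L-2}} w^{(L-1)}_{h_{L-1},h} z^{(L-2)}_{n,h} = z^{(L-2)}_{n,h_{L-2}} \]
      \[ \frac{\partial E_n}{\partial w^{(L-1)}_{h_{L-1},h_{L-2}}} = \sum_{d=1}^{D} \delta^{(L)}_{n,d} \left( w^{(L)}_{d,h_{L-1}} \phi'(a^{(L-1)}_{n,h_{L-1}}) \right) z^{(L-2)}_{n,h_{L-2}} = \phi'(a^{(L-1)}_{n,h_{L-1}}) \left( \sum_{d=1}^{D} \delta^{(L)}_{n,d} w^{(L)}_{d,h_{L-1}} \right) z^{(L-2)}_{n,h_{L-2}} = \delta^{(L-1)}_{n,h_{L-1}} z^{(L-2)}_{n,h_{L-2}} \]
    Layer $l$
      Forward propagation:
      \[ a^{(l)}_{n,h_l} = \sum_{h_{l-1}=1}^{H_{l-1}} w^{(l)}_{h_l,h_{l-1}} z^{(l-1)}_{n,h_{l-1}}, \qquad z^{(l)}_{n,h_l} = \phi(a^{(l)}_{n,h_l}) \]
      Backpropagation:
      \[ \delta^{(l)}_{n,h_l} = \begin{cases} a^{(L)}_{n,h_l} - y_{n,h_l}, & \text{if } l = L \\ \phi'(a^{(l)}_{n,h_l}) \sum_{h=1}^{H_{l+1}} \delta^{(l+1)}_{n,h} w^{(l+1)}_{h,h_l}, & \text{if } l \neq L \end{cases} \]
      \[ \frac{\partial E_n}{\partial w^{(l)}_{h_l,h_{l-1}}} = \delta^{(l)}_{n,h_l} z^{(l-1)}_{n,h_{l-1}} \]
    $\phi'$ is the derivative of $\phi$.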

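  15. Error backpropagation
    The error backpropagation algorithm
    Forward propagation:
    \[ E(W) = f(W^{(L)} \phi(W^{(L-1)} \cdots \phi(W^{(1)} x_n) \cdots), y) \]
    Backpropagation:
    \[ \delta^{(l)}_{n,h_l} = \begin{cases} a^{(L)}_{n,h_l} - y_{n,h_l}, & \text{if } l = L \\ \phi'(a^{(l)}_{n,h_l}) \sum_{h=1}^{H_{l+1}} \delta^{(l+1)}_{n,h} w^{(l+1)}_{h,h_l}, & \text{if } l \neq L \end{cases} \]
    Gradient computation:
    \[ \frac{\partial E_n}{\partial w^{(l)}_{h_l,h_{l-1}}} = \delta^{(l)}_{n,h_l} z^{(l-1)}_{n,h_{l-1}} \]
    Parameter update:
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} - \alpha \nabla_w E(w)|_{w=w_{\mathrm{old}}} \]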
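    The forward, backward and gradient steps can be sketched for a single training pair as follows. This is an illustration under assumptions: tanh is used for $\phi$, the output layer is linear as in $a^{(L)}$, and the network sizes, weights and data are hypothetical. One gradient entry is compared against a central finite difference as a sanity check.

```python
import math

def phi(a):
    return math.tanh(a)

def dphi(a):
    return 1.0 - math.tanh(a) ** 2

def forward(Ws, x):
    """Return activations a^(l) and unit outputs z^(l) for every layer."""
    zs, acts = [x], []
    for l, W in enumerate(Ws):
        a = [sum(w * z for w, z in zip(row, zs[-1])) for row in W]
        acts.append(a)
        if l < len(Ws) - 1:
            zs.append([phi(v) for v in a])  # the output layer stays linear
    return acts, zs

def backprop(Ws, x, y):
    """Gradients dE_n/dW^(l) of E_n = 0.5 * sum_d (y_d - a^(L)_d)^2."""
    acts, zs = forward(Ws, x)
    delta = [a - t for a, t in zip(acts[-1], y)]  # delta^(L)
    grads = [None] * len(Ws)
    for l in range(len(Ws) - 1, -1, -1):
        # dE_n/dw_{h,k} for this layer = delta_h * (input k of this layer)
        grads[l] = [[d * z for z in zs[l]] for d in delta]
        if l > 0:
            # delta of the layer below: phi'(a_k) * sum_h delta_h * w_{h,k}
            delta = [dphi(acts[l - 1][k]) *
                     sum(delta[h] * Ws[l][h][k] for h in range(len(delta)))
                     for k in range(len(acts[l - 1]))]
    return grads

def error(Ws, x, y):
    acts, _ = forward(Ws, x)
    return 0.5 * sum((t - a) ** 2 for t, a in zip(y, acts[-1]))

# Hypothetical 3-layer network: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
Ws = [[[0.1, -0.4], [0.7, 0.2], [-0.3, 0.5]],
      [[0.6, -0.1, 0.3], [-0.2, 0.4, 0.8]],
      [[0.5, -0.7]]]
x, y = [1.0, -2.0], [0.3]
grads = backprop(Ws, x, y)

# Central finite difference for one weight of the first layer.
eps = 1e-6
Ws[0][1][0] += eps
e_plus = error(Ws, x, y)
Ws[0][1][0] -= 2 * eps
e_minus = error(Ws, x, y)
Ws[0][1][0] += eps  # restore the original weight
numeric = (e_plus - e_minus) / (2 * eps)
```

    `grads[0][1][0]` and `numeric` should agree to several digits; a mismatch usually points to an indexing error in the $\delta$ recursion.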

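  16. Error backpropagation
    The actual gradient computation is nothing more than an application of the chain rule of differentiation. Many implementations compute the gradients by automatic differentiation.
    An ordinary feedforward neural network has many parameters and can therefore overfit.
    Remedy: introduce a penalty such as the L2 regularization term used in ridge regression.
    \[ J(W) = E(W) + \lambda \Omega_{L2}(W) \]
    Optimization then minimizes $J(W)$. In the neural network field this is called weight decay.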
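    The name "weight decay" becomes visible in a single update step. The slides do not spell out $\Omega_{L2}$, so the sketch below assumes $\Omega_{L2}(W) = \frac{1}{2} \sum_i w_i^2$; the function name and the numbers are illustrative only.

```python
def weight_decay_step(w, grad_E, alpha, lam):
    """One gradient step on J(w) = E(w) + lam * 0.5 * sum(w_i^2).

    The penalty form 0.5 * sum(w_i^2) is an assumption made for this sketch.
    """
    return [w_i - alpha * (g_i + lam * w_i) for w_i, g_i in zip(w, grad_E)]

# With a zero data gradient the weights simply shrink by (1 - alpha * lam).
w = weight_decay_step([2.0, -4.0], grad_E=[0.0, 0.0], alpha=0.1, lam=0.5)
```

    Under this assumption every step multiplies each weight by $1 - \alpha\lambda$ before the data gradient is applied, which pulls the weights toward zero.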

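  17. Learning with the Hessian matrix
    With $M$ parameters, the cost of computing the Hessian matrix is $O(M^2)$, so the computation time becomes enormous when $M$ is large.
    Approximating the Hessian matrix from gradients speeds this up (quasi-Newton methods).
    Example: using the gradient of the output $a^{(L)}_n$ for each input $x_n$,
    \[ H \approx \sum_{n=1}^{N} (\nabla_w a^{(L)}_n)(\nabla_w a^{(L)}_n)^{T} \]
    Note: PRML, Volume 1, pp. … discusses fast computation of the Hessian matrix.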
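    The outer-product approximation is short to write down. In the sketch below the output gradients are passed in directly; the example uses a hypothetical linear model $a_n = w^{T} x_n$ with a scalar output, for which $\nabla_w a_n = x_n$ and, under a squared error, the sum of outer products equals the Hessian exactly.

```python
def outer_product_hessian(output_grads):
    """H ~= sum_n g_n g_n^T, where g_n is the gradient of output a_n w.r.t. w."""
    M = len(output_grads[0])
    H = [[0.0] * M for _ in range(M)]
    for g in output_grads:
        for i in range(M):
            for j in range(M):
                H[i][j] += g[i] * g[j]
    return H

# For a linear model a_n = w^T x_n the output gradient is just x_n.
X = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
H = outer_product_hessian(X)
```

    Only first derivatives of the network output are needed, and those are already available from backpropagation.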

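  18. Training classification models
    The feedforward neural networks described so far apply to regression problems. What about discrimination (classification) problems?
    Binary classification
      Number of classes: 2
      Output dimension: 1
      Output: $a^{(L)}_n \in \mathbb{R}$
      Activation function on the output (sigmoid function): $\mu_n = \mathrm{Sig}(a^{(L)}_n) \in (0, 1)$
      Error function:
      \[ E(W) = -\sum_{n=1}^{N} \{ y_n \log \mu_n + (1 - y_n) \log(1 - \mu_n) \} \]
    Multi-class classification
      Number of classes: $D$
      Output dimension: $D$
      Output: $a^{(L)}_n \in \mathbb{R}^{D}$
      Activation function on the output (softmax function): $\pi_d(a^{(L)}_n)$, with $\sum_{d=1}^{D} \pi_d(a^{(L)}_n) = 1$
      Error function:
      \[ E(W) = -\sum_{n=1}^{N} \sum_{d=1}^{D} y_{n,d} \log \pi_d(a^{(L)}_n) \]
    These error functions are called cross-entropy error functions.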
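    Both error functions can be evaluated directly from the output activations. The sketch below assumes 0/1 labels in the binary case and one-hot label vectors in the multi-class case; the function names and the max-subtraction inside `softmax` (a common numerical-stability step) are not from the slides.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def softmax(a):
    m = max(a)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in a]
    s = sum(e)
    return [v / s for v in e]

def binary_cross_entropy(ys, activations):
    """E = -sum_n [y_n log mu_n + (1 - y_n) log(1 - mu_n)]."""
    total = 0.0
    for y, a in zip(ys, activations):
        mu = sigmoid(a)
        total -= y * math.log(mu) + (1 - y) * math.log(1.0 - mu)
    return total

def categorical_cross_entropy(Y, A):
    """E = -sum_n sum_d y_{n,d} log pi_d(a_n), with one-hot rows in Y."""
    total = 0.0
    for y, a in zip(Y, A):
        pi = softmax(a)
        total -= sum(y_d * math.log(p_d) for y_d, p_d in zip(y, pi))
    return total

bce = binary_cross_entropy([1, 0], [0.0, 0.0])
cce = categorical_cross_entropy([[0, 1, 0]], [[0.0, 0.0, 0.0]])
```

    With all activations at zero the model is maximally uncertain, so `bce` is $2 \log 2$ and `cce` is $\log 3$.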

  19. Efficient learning methods

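  20. Stochastic gradient descent
    Computing the gradient from all the training data at once, as in the gradient descent described earlier, is called batch learning. Because it uses all the data, it is computationally inefficient.
    Remedy: make it more efficient by sample selection.
    Let $\mathcal{D}$ be the set of input-output pairs in the training data and $N$ the number of pairs.
    Sample selection picks $M \, (< N)$ pairs at random from the training data. If $\mathcal{S}$ is the index set of the selected pairs, the selected subset can be written as
    \[ \mathcal{D}_{\mathcal{S}} = \{ x_n, y_n \}_{n \in \mathcal{S}} \]
    The parameters are then updated with the error function
    \[ E_{\mathcal{S}}(W) = \frac{N}{M} \sum_{n \in \mathcal{S}} E_n(W) \]
    This method is called stochastic gradient descent, and the subset $\mathcal{D}_{\mathcal{S}}$ is called a mini-batch. When the pairs are selected uniformly at random, the expected value of $E_{\mathcal{S}}(W)$ equals the error used in batch learning.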
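    The claim about the expected value can be checked numerically. The sketch below uses hypothetical per-sample errors $E_n$ and draws many mini-batches uniformly without replacement; the seed, batch size and number of repetitions are arbitrary choices for the example.

```python
import random

def minibatch_error(per_sample_errors, M, rng):
    """E_S = (N / M) * sum of E_n over M indices drawn without replacement."""
    N = len(per_sample_errors)
    S = rng.sample(range(N), M)
    return N / M * sum(per_sample_errors[n] for n in S)

rng = random.Random(0)                   # fixed seed, chosen only for reproducibility
errors = [float(n) for n in range(10)]   # hypothetical per-sample errors E_n
full = sum(errors)                       # batch error E(W)
estimates = [minibatch_error(errors, 4, rng) for _ in range(20000)]
mean_estimate = sum(estimates) / len(estimates)
```

    Individual mini-batch values scatter widely around `full`, but their average comes out close to it, which is what the $N/M$ scaling is for.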

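  21. Stochastic gradient descent
    Let $\alpha_i$ be the learning rate used in the $i$-th parameter update. If
    \[ \sum_{i=1}^{\infty} \alpha_i = \infty, \qquad \sum_{i=1}^{\infty} \alpha_i^2 < \infty \]
    then mini-batch learning converges with probability 1.
    The method that uses this to search, through mini-batch learning, for the parameters that minimize the loss function over the whole data set is called the Robbins-Monro algorithm.
    To make stochastic gradient descent more efficient, a velocity vector is introduced into the parameter update. This is the momentum method.
    \[ p_{\mathrm{new}} = \beta p_{\mathrm{old}} - \alpha \nabla_w E(w)|_{w=w_{\mathrm{old}}} \]
    \[ w_{\mathrm{new}} = w_{\mathrm{old}} + p_{\mathrm{new}} \]
    $\beta \in [0, 1)$ is a parameter that adjusts the influence of past gradients.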
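    The momentum update can be sketched on a hypothetical quadratic error $E(w) = \frac{1}{2}(w_1^2 + 10 w_2^2)$. The zero initial velocity, the step count and the values of $\alpha$ and $\beta$ are assumptions; the full gradient stands in for a mini-batch gradient to keep the example deterministic.

```python
def momentum_descent(grad, w, alpha, beta, steps):
    p = [0.0] * len(w)  # velocity vector, initialised to zero (an assumption)
    for _ in range(steps):
        g = grad(w)
        p = [beta * p_i - alpha * g_i for p_i, g_i in zip(p, g)]  # p_new
        w = [w_i + p_i for w_i, p_i in zip(w, p)]                 # w_new
    return w

def grad_E(w):
    """Gradient of the toy error E(w) = 0.5 * (w1^2 + 10 * w2^2)."""
    return [w[0], 10.0 * w[1]]

w_plain = momentum_descent(grad_E, [1.0, 1.0], alpha=0.05, beta=0.0, steps=50)
w_momentum = momentum_descent(grad_E, [1.0, 1.0], alpha=0.05, beta=0.5, steps=50)
```

    Setting $\beta = 0$ recovers plain gradient descent. On this example the run with $\beta = 0.5$ ends closer to the minimum along the slowly converging $w_1$ direction; that outcome is specific to the chosen function and settings.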

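  22. Dropout
    Stochastic regularization
      A family of methods that add a small amount of noise to the data, the hidden units and so on during training, with the aim of suppressing overfitting and improving generalization performance.
    Dropout is the representative example of stochastic regularization.
    During training, each hidden unit is treated as absent with a fixed probability, which produces a subgraph called a subnetwork.
    At prediction time all units are used, which is expected to give an ensemble effect similar to bagging and so suppresses overfitting.
    There is also a method called DropConnect, which randomly drops the individual scalar weights that connect to the hidden units.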
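    A minimal sketch of dropout applied to a vector of hidden units is shown below. The slides do not say how training and prediction are kept on the same scale; the $1/(1 - p)$ rescaling used here ("inverted dropout") is a common convention and is an assumption of this example, as are the names and the seed.

```python
import random

def dropout(z, drop_prob, rng, training=True):
    """Zero each hidden unit with probability drop_prob during training.

    The 1 / (1 - drop_prob) rescaling ("inverted dropout") is an assumption
    made here so that prediction can use all units unchanged.
    """
    if not training:
        return list(z)
    keep = 1.0 - drop_prob
    return [z_h / keep if rng.random() < keep else 0.0 for z_h in z]

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
z = [1.0] * 10000
z_train = dropout(z, drop_prob=0.5, rng=rng)
z_test = dropout(z, drop_prob=0.5, rng=rng, training=False)
```

    Each call in training mode samples a different subnetwork, while prediction passes every unit through.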

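  23. Batch normalization
    Introducing batch normalization aims to make optimization more efficient, both through regularization by noise and by stabilizing the distribution of the inputs to each layer.
    For a hidden unit $z$, compute its mean $\mu$ and variance $\sigma^2$ and standardize it.
    \[ \tilde{z} = \frac{z - \mu}{\sqrt{\sigma^2 + \epsilon}} \quad (\epsilon \ll 1) \]
    Simply normalizing the input restricts the representational power to something simple. Adding a linear transformation preserves the representational power.
    \[ \tilde{z}' = \gamma \tilde{z} + \beta \quad (\gamma \in \mathbb{R}, \ \beta \in \mathbb{R}) \]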
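    The two formulas can be applied to the values one hidden unit takes across a mini-batch. In this sketch the statistics are taken over the list passed in, and the value `eps = 1e-5` and the example inputs are assumptions.

```python
import math

def batch_norm(zs, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardise one hidden unit over a mini-batch, then scale and shift."""
    mu = sum(zs) / len(zs)
    var = sum((z - mu) ** 2 for z in zs) / len(zs)
    z_tilde = [(z - mu) / math.sqrt(var + eps) for z in zs]
    return [gamma * z + beta for z in z_tilde]

out = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=0.5)
```

    After the transformation the batch mean equals `beta` and the spread is set by `gamma`, both of which would be learned alongside the weights.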

  24. Extended neural network models

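  25. Convolutional neural networks
    A convolutional neural network (CNN) is a model that incorporates convolution. It is effective for time series and images.
    For the convolution, assume an image input $X \in \mathbb{R}^{H \times W}$ and weight parameters (a filter) $W \in \mathbb{R}^{M \times N}$. The $(i, j)$-th element of the image after convolution (the feature map) $S$ is then
    \[ S_{i,j} = (W * X)_{i,j} = \sum_{m=1}^{M} \sum_{n=1}^{N} W_{m,n} X_{i+m-1, j+n-1} \]
    (see the figure on p. … of the book)
    Whereas a feedforward neural network is fully connected, a CNN is said to be sparsely connected.
    A typical CNN inserts a nonlinear function called pooling (e.g. max pooling) after the convolution.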
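    The sum above translates almost directly into nested loops. The sketch uses 0-based indices and evaluates the filter only where it fits entirely inside the image (no padding); that boundary handling, the function name and the example values are assumptions.

```python
def conv2d(X, W):
    """S[i][j] = sum_{m,n} W[m][n] * X[i+m][j+n] (0-based, no padding)."""
    H, Wd = len(X), len(X[0])
    M, N = len(W), len(W[0])
    return [[sum(W[m][n] * X[i + m][j + n]
                 for m in range(M) for n in range(N))
             for j in range(Wd - N + 1)]
            for i in range(H - M + 1)]

X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
K = [[1, 0],
     [0, -1]]
S = conv2d(X, K)
```

    Each output element depends on only $M \times N$ input pixels and the same filter is reused at every position, which is the sparse connectivity mentioned above.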

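  26. Recurrent neural networks
    A recurrent neural network (RNN) is a neural network that represents the sequential information in data.
    With $z_n$ the hidden units at time $n$ and $x_n$ the input data, the hidden units are given by
    \[ z_n = \phi(W_{zx} x_n + W_{zz} z_{n-1} + b_z) \]
    The parameters $W_{zx}, W_{zz}, b_z$ are shared across the whole model, and $\phi$ is an element-wise nonlinear function.
    With $\pi$ denoting the softmax function, the output $\pi_n$ at time $n$ is computed as
    \[ \pi_n = \pi(W_{yz} z_n + b_y) \]
    Writing $\Theta$ for the set of model parameters, the error over the whole sequence is therefore
    \[ E(\Theta) = \sum_{n=1}^{N} E_n(\Theta) = \sum_{n=1}^{N} \left( -\sum_{d=1}^{D} y_{n,d} \log \pi_{n,d} \right) \]
    and optimization is carried out to minimize it.
    (see the figure on p. … of the book)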
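    A forward pass through the recurrence and the sequence error can be sketched as follows. The use of tanh for $\phi$, the zero initial state $z_0$, one-hot targets, and all sizes and parameter values are assumptions for the example.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(a):
    m = max(a)
    e = [math.exp(v - m) for v in a]
    s = sum(e)
    return [v / s for v in e]

def rnn_forward(xs, W_zx, W_zz, b_z, W_yz, b_y):
    """Return the output distributions pi_n for an input sequence xs."""
    z = [0.0] * len(b_z)  # z_0 = 0 is an assumed initial state
    outputs = []
    for x in xs:
        pre = [u + v + b for u, v, b in zip(matvec(W_zx, x), matvec(W_zz, z), b_z)]
        z = [math.tanh(p) for p in pre]  # the same parameters are reused at every step
        outputs.append(softmax([a + b for a, b in zip(matvec(W_yz, z), b_y)]))
    return outputs

def sequence_error(outputs, ys):
    """E = sum_n ( -sum_d y_{n,d} log pi_{n,d} ) with one-hot targets ys."""
    return sum(-sum(y_d * math.log(p_d) for y_d, p_d in zip(y, pi))
               for y, pi in zip(ys, outputs))

# Hypothetical sizes: 2 inputs, 2 hidden units, 3 output classes.
W_zx = [[0.5, -0.3], [0.1, 0.8]]
W_zz = [[0.2, 0.0], [-0.4, 0.3]]
b_z = [0.0, 0.1]
W_yz = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_y = [0.0, 0.0, 0.0]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
outputs = rnn_forward(xs, W_zx, W_zz, b_z, W_yz, b_y)
E = sequence_error(outputs, ys)
```

    Training would differentiate `E` with respect to the shared parameters, which is where the RNN backpropagation notes linked at the top of the page come in.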

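  27. Autoencoders
    An autoencoder is a form of unsupervised learning, which does not require labels for the input data (in contrast to supervised learning).
    It uses two neural networks, an encoder and a decoder. The observed data $X$ is passed through the encoder and converted into a latent variable $Z$. The decoder, in turn, takes the latent variable $Z$ as input and aims to reconstruct the original data $X$.
    To avoid learning the identity mapping, a regularization term may be added on the latent variable.
    \[ X \xrightarrow{\text{Encoder}} Z \xrightarrow{\text{Decoder}} \hat{X}, \qquad X \approx \hat{X} \]
    In general, $\dim x_n > \dim z_n$.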