
Watanabe 6.3

Naoya Umezaki
October 25, 2018


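Presentation material for a reading seminar on Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory (Cambridge Monographs on Applied and Computational Mathematics). It covers a review of Chapter 4 and Section 6.3, and studies the asymptotic behavior of the generalization error.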


Transcript

  1. October 25, 2018. Watanabe, 6.3 @unaoya

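  2. Statistical learning
     1. We want to predict the true distribution $q(x)$, or the samples generated from it.
     2. Set up a model $p(x|w)$ and a parameter space $W$.
     3. From the given samples, determine a measure on the parameter space and a predictive distribution $\hat{p}(x)$.

     We want to evaluate the model.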
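  3. Bayes quartet
     Four evaluation criteria for a model:

                       Bayes estimation   Gibbs estimation
     Distribution      $B_g$              $G_g$
     Sample            $B_t$              $G_t$

     These are random variables that depend on the sample $D_n$.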
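  4. Goals of Section 6.3. We study:
     • the asymptotic behavior as the sample size $n \to \infty$,
     • expectations over the sample,
     • the relations among the four quantities of the Bayes quartet.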

  5. Today's topic is $G_g$.

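  6. Gibbs estimation
     Sample a parameter $\hat{w}$ from the posterior distribution and use $\hat{p}(x) = p(x|\hat{w})$ as the predictive distribution.

     Generalization error $G_g$: the KL divergence between $q(x)$ and $\hat{p}(x)$, integrated over $w$ with respect to the posterior $p(w|D_n)$:
     $$G_g = \int_W K(w)\, p(w|D_n)\, dw$$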
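
The definition of $G_g$ can be tried out on a toy case. The sketch below assumes a regular model that does not appear in the slides: $q(x) = N(0,1)$, $p(x|w) = N(w,1)$, an improper flat prior and $\beta = 1$, so that $K(w) = w^2/2$ and the posterior is $N(\bar{x}, 1/n)$. The function names are hypothetical, and the Monte Carlo average is only one possible way to approximate the integral.

```python
import math
import random


def gibbs_generalization_error(xs, draws=100_000, seed=0):
    """Monte Carlo estimate of G_g = E_posterior[K(w)] for the toy model.

    Assumptions (illustrative, not from the slides): q = N(0, 1),
    p(x|w) = N(w, 1), flat prior, beta = 1, so the posterior is
    N(mean(xs), 1/n) and K(w) = w**2 / 2.
    """
    rng = random.Random(seed)
    n = len(xs)
    xbar = sum(xs) / n
    sd = 1.0 / math.sqrt(n)
    total = 0.0
    for _ in range(draws):
        w_hat = rng.gauss(xbar, sd)   # Gibbs estimation: draw w_hat from the posterior
        total += 0.5 * w_hat * w_hat  # K(w_hat) = KL(q || p(.|w_hat))
    return total / draws


def closed_form(xs):
    """Exact posterior mean of K(w) under the same assumptions."""
    n = len(xs)
    xbar = sum(xs) / n
    return 0.5 * (xbar * xbar + 1.0 / n)


if __name__ == "__main__":
    data_rng = random.Random(1)
    xs = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
    print("Monte Carlo G_g:", gibbs_generalization_error(xs))
    print("closed form G_g:", closed_form(xs))
```

In this toy case $nG_g = (n\bar{x}^2 + 1)/2$, and $\sqrt{n}\,\bar{x}$ is standard normal, so $nG_g$ is a random variable of order one.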
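  7. Problem
     As $n \to \infty$, to what random variable does $nG_g$ converge?

     Main term:
     $$G_g(\epsilon) = \frac{\int_{K(w)\le\epsilon} K(w)\, p(w|D_n)\, dw}{\int_{K(w)\le\epsilon} p(w|D_n)\, dw}$$
     Lemma 1 (Lemma 6.3). $nG_g - nG_g(\epsilon)$ converges to 0 in probability.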
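
Lemma 6.3 can be watched numerically in a toy case. The sketch below assumes a regular model that does not appear in the slides: $q = N(0,1)$, $p(x|w) = N(w,1)$, a uniform prior on $[-1,1]$ and $\beta = 1$, so that $K(w) = w^2/2$ and $K_n(w) = w^2/2 - w\bar{x}$. The sample enters only through the sample mean, passed as the hypothetical argument `xbar`; the function name and the midpoint quadrature are illustrative choices.

```python
import math


def gibbs_errors(n, xbar, eps, grid=20_000):
    """Return (G_g, G_g(eps)) for the toy model by midpoint quadrature on [-1, 1].

    Assumptions (illustrative): K(w) = w**2 / 2, K_n(w) = w**2 / 2 - w * xbar,
    uniform prior on [-1, 1], beta = 1.
    """
    h = 2.0 / grid
    num = den = num_eps = den_eps = 0.0
    for i in range(grid):
        w = -1.0 + (i + 0.5) * h
        k = 0.5 * w * w
        weight = math.exp(-n * (k - w * xbar))  # unnormalised posterior density
        num += k * weight
        den += weight
        if k <= eps:                            # restrict to the region K(w) <= eps
            num_eps += k * weight
            den_eps += weight
    return num / den, num_eps / den_eps


if __name__ == "__main__":
    for n in (25, 100, 400, 1600):
        gg, gg_eps = gibbs_errors(n, 1.0 / math.sqrt(n), eps=0.1)
        print(n, n * gg, n * (gg - gg_eps))
```

With `xbar = 1/sqrt(n)` the gap `n * (gg - gg_eps)` shrinks rapidly as `n` grows while `n * gg` stays of order one, which is the behavior the lemma describes. A single toy model is of course no substitute for the proof.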
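  8. Evaluating the main term
     We want to evaluate
     $$G_g(\epsilon) = \frac{\int_{K(w)\le\epsilon} K(w)\, p(w|D_n)\, dw}{\int_{K(w)\le\epsilon} p(w|D_n)\, dw} = E[K(w) \mid K(w)\le\epsilon],$$
     but doing so directly is difficult. We use resolution of singularities.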
  9. Standard form
     Write
     $$f(x, g(u)) = \log\frac{q(x)}{p(x|g(u))} = a(x,u)\,u^k,$$
     so that
     $$K(g(u)) = u^{2k}, \qquad K_n(g(u)) = u^{2k} - \frac{1}{\sqrt{n}}\,u^k\,\xi_n(u),$$
     $$\xi_n(u) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\{E_X[a(X,u)] - a(X_i,u)\}.$$
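
The form of $K_n$ can be checked in a few lines. This is a sketch that assumes $K_n(w)$ denotes the empirical average $\frac{1}{n}\sum_i f(X_i, w)$ and $K(w) = E_X[f(X,w)]$, with $\xi_n(u) = \frac{1}{\sqrt{n}}\sum_i \{E_X[a(X,u)] - a(X_i,u)\}$, the sign for which the identity holds:

```latex
\begin{align*}
K_n(g(u)) &= \frac{1}{n}\sum_{i=1}^{n} f(X_i, g(u))
           = \frac{u^k}{n}\sum_{i=1}^{n} a(X_i,u), \\
E_X[a(X,u)]\,u^k &= E_X[f(X, g(u))] = K(g(u)) = u^{2k}, \\
K_n(g(u)) &= u^{2k} - \frac{u^k}{n}\sum_{i=1}^{n}\bigl\{E_X[a(X,u)] - a(X_i,u)\bigr\}
           = u^{2k} - \frac{1}{\sqrt{n}}\,u^k\,\xi_n(u).
\end{align*}
```

The second line gives $E_X[a(X,u)] = u^k$ wherever $u^k \neq 0$, so $\xi_n(u)$ is a centered, $\sqrt{n}$-normalized sum.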
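  10. $\xi_n(u)$ is a stochastic process that depends on the sample $D_n$, and $\xi_n$ converges in law to a Gaussian process $\xi$.
      Lemma 2 (6.51). Define $G_g^*(\xi_n) = E_{y,t}[t \mid \xi_n]$. Then
      $$nG_g(\epsilon) - G_g^*(\xi_n) \xrightarrow{P} 0.$$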
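  11. The integrals $E_{y,t}$ and $E_u$ on the resolution of singularities $M$
      • $\xi(u)$: a function of class $C^1$ on $M$ (the stochastic process of the sample)
      • $f(u)$: a function on $M$ (for $K(w)$, $f(u) = u^{2k}$)
      • $0 \le \sigma \le 1$
      $$E_u^{\sigma}[f(u) \mid \xi] = \frac{\sum_{\alpha\in A}\int_{[0,b]^d} f(u)\,Z(u,\xi)\,du}{\sum_{\alpha\in A}\int_{[0,b]^d} Z(u,\xi)\,du}$$
      $A$ is the (finite) set of coordinate neighborhoods.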
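  12. $Z(u,\xi)$ is
      $$u^h\,\phi^*(u)\,\exp\bigl(-\beta n u^{2k} + \beta\sqrt{n}\,u^k\,\xi(u) - \sigma u^k a(X,u)\bigr).$$
      Relation to the posterior $p(w|D_n)$:
      • $u^h\phi^*(u)$ corresponds to the prior $\phi(w)$.
      • With $\sigma = 0$, up to the prior factor,
      $$Z_n^0\, p(w|D_n) = \exp(-n\beta K_n(w)) = \exp\bigl(-\beta n u^{2k} + \beta\sqrt{n}\,u^k\,\xi_n(u)\bigr).$$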
  13. Lemma 3 (6.41).
      $$G_g(\epsilon) = E_u^0[u^{2k} \mid \xi_n]$$
      Recall that $u^{2k} = K(g(u))$.
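  14. Essential part
      Take coordinates $u = (x, y)$ and the essential part $A^* \subset A$ (determined by the resolution of singularities of $K(w)$).
      $$E_{y,t}[f(y,t) \mid \xi] = \frac{\sum_{\alpha\in A^*}\int dt\int_{[0,b]^{d-m}} f(y,t)\,Z_0(y,t,\xi)\,dy}{\sum_{\alpha\in A^*}\int dt\int_{[0,b]^{d-m}} Z_0(y,t,\xi)\,dy}$$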
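  15. Here
      $$Z_0(y,t,\xi) = \gamma_b\, y^{\mu}\, t^{\lambda-1} \exp\bigl(-\beta t + \beta\sqrt{t}\,\xi_0(y)\bigr)\,\phi_0^*(y).$$
      Lemma 4 (Lemma 6.6, $p = 1$, $f = 1$, $\xi = \xi_n$).
      $$\bigl|E_u^0[n u^{2k} \mid \xi_n] - E_{y,t}[t \mid \xi_n]\bigr| \le \frac{D(\xi_n, 1, \phi^*)}{\log n}$$
      The proof uses the computation of the partition function from Chapter 4.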
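  16. Constructing the limit
      Definition 1 (6.46). For a function $\psi(u)$ on $M$, define
      $$G_g^*(\psi) = E_{y,t}[t \mid \psi].$$
      Using this, the previous lemma can be rewritten as follows.
      Lemma 5.
      $$\bigl|nG_g(\epsilon) - G_g^*(\xi_n)\bigr| \le \frac{D(\xi_n, 1, \phi^*)}{\log n}$$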
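  17. Conclusion
      Since $\xi_n$ converges in law to $\xi$, we obtain:
      • $nG_g(\epsilon) - G_g^*(\xi_n) \to 0$, using Lemma 4;
      • $G_g^*(\xi_n) - G_g^*(\xi) \to 0$.
      Putting everything together proves $nG_g - G_g^*(\xi) \to 0$. Here $G_g^*(\xi)$ is a random variable that depends on $\xi$.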
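
The way the pieces combine can be written as a telescoping sum. This is a sketch of the bookkeeping only; the modes of convergence noted in the comments are a reading of the lemmas cited on the slides, and the last step additionally assumes that $\psi \mapsto G_g^*(\psi)$ is continuous in a suitable sense.

```latex
\begin{align*}
nG_g - G_g^*(\xi)
  &= \bigl(nG_g - nG_g(\epsilon)\bigr)          % -> 0 in probability (Lemma 6.3)
   + \bigl(nG_g(\epsilon) - G_g^*(\xi_n)\bigr)  % bounded by D(\xi_n,1,\phi^*) / \log n
   + \bigl(G_g^*(\xi_n) - G_g^*(\xi)\bigr)      % from \xi_n -> \xi in law
\end{align*}
```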
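  18. Review of Chapter 4: the zeta function
      For a nonnegative analytic function $K(w)$ on an open set $U \subset \mathbb{R}^d$ and a compactly supported $C^\infty$ function $\phi(w)$, define
      $$\zeta(z) = \int K(w)^z\,\phi(w)\,dw.$$
      What information do the positions of its poles and their orders carry?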
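  19. State density function
      The inverse Mellin transform of $\zeta(z)$ is the state density function
      $$v(t) = \int \delta(t - K(w))\,\phi(w)\,dw.$$
      By the theory of the Mellin transform, the order of its divergence corresponds to the positions of the poles of $\zeta(z)$.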
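
A one-dimensional worked example makes the correspondence concrete. It assumes the monomial case $K(w) = w^{2k}$ with weight $\phi(w) = w^h$ on $[0,b]$; this weight is not compactly supported and smooth, so it is a simplification standing in for one local coordinate, not the general setting of the slides.

```latex
\begin{align*}
\zeta(z) &= \int_0^b w^{2kz+h}\,dw = \frac{b^{\,2kz+h+1}}{2kz+h+1}
  \quad (\operatorname{Re} z > -\lambda), \qquad \lambda = \frac{h+1}{2k}, \\
v(t) &= \int_0^b \delta(t - w^{2k})\,w^h\,dw = \frac{1}{2k}\,t^{\lambda-1}
  \quad (0 < t < b^{2k}), \\
\int_0^{b^{2k}} t^{z}\,v(t)\,dt &= \frac{1}{2k}\cdot\frac{b^{\,2k(z+\lambda)}}{z+\lambda} = \zeta(z).
\end{align*}
```

Here $\zeta(z)$ continues to a meromorphic function with a simple pole at $z = -\lambda$, and the same $\lambda$ is the exponent in $v(t) \propto t^{\lambda-1}$ as $t \to 0$.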

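  20. Partition function
      The Laplace transform of the state density function $v(t)$,
      $$Z(n) = \int \exp(-nK(w))\,\phi(w)\,dw,$$
      is called the partition function. This is what was studied in the first half of Chapter 6.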
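
The decay rate of $Z(n)$ can be estimated numerically in a simple case. The sketch below assumes $K(w) = w^{2k}$ and $\phi = 1$ on $[0,1]$ (an illustrative choice, not from the slides), for which $Z(n)$ behaves like $C\,n^{-\lambda}$ with $\lambda = 1/(2k)$; the function names are hypothetical.

```python
import math


def partition_function(n, k, grid=20_000):
    """Midpoint-rule value of Z(n) = integral over [0, 1] of exp(-n * w**(2k))."""
    h = 1.0 / grid
    return h * sum(math.exp(-n * ((i + 0.5) * h) ** (2 * k)) for i in range(grid))


def decay_exponent(k, n=1_000.0):
    """Estimate lambda in Z(n) ~ C * n**(-lambda) by comparing Z(n) and Z(4n)."""
    ratio = partition_function(4 * n, k) / partition_function(n, k)
    return -math.log(ratio) / math.log(4.0)


if __name__ == "__main__":
    for k in (1, 2):
        print("k =", k, "estimated lambda =", decay_exponent(k), "expected", 1 / (2 * k))
```

The estimate for $k = 2$ is smaller than for $k = 1$: a flatter minimum of $K$ gives a more slowly decaying partition function.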

  21. Singular learning theory
      As in Remark 4.4, we want to study the behavior as $n \to \infty$ of
      $$Z = \int \exp\bigl(-n\beta K(w) + \beta\sqrt{nK(w)}\,\xi(w)\bigr)\,\phi(w)\,dw.$$
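  22. By resolution of singularities for $K$, $Z$ can be written in terms of the normal crossing integrals $Z(n,\xi,\phi)$ as
      $$Z = \sum_{\alpha} Z\bigl(n,\ \xi\circ g_\alpha,\ \phi\circ g_\alpha\,|g'_\alpha|\bigr),$$
      so the goal of Section 4.4 is to study $Z(n,\xi,\phi)$.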
  23. Define
      $$Z^p(n,\xi,\phi) = \int_{[0,b]^r}dx\int_{[0,b]^s}dy\; K(x,y)^p\,x^h y^{h'}\,\phi(x,y)\,\exp\bigl(-n\beta K(x,y)^2 + \sqrt{n}\,\beta K(x,y)\,\xi(x,y)\bigr).$$
  24. Furthermore, setting $\xi = 0$ and $\phi = 1$ here, we write
      $$Z^p(n) = \int_{[0,b]^r}dx\int_{[0,b]^s}dy\; K(x,y)^p\,x^h y^{h'}\,\exp\bigl(-n\beta K(x,y)^2\bigr).$$
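  25. Theorem 1 (Theorem 4.7). Suppose $\frac{h_i+1}{2k_i} = \lambda$ is constant and $\frac{h'_j+1}{2k'_j} > \lambda$. When $K(x,y) = x^k y^{k'}$, there exist $a_1, a_2 > 0$ such that for every $n$
      $$a_1\,\frac{(\log n)^{r-1}}{n^{\lambda+p}} \le Z^p(n) \le a_2\,\frac{(\log n)^{r-1}}{n^{\lambda+p}}.$$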
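
The logarithmic factor can be seen numerically in the smallest case with $r = 2$. The sketch below assumes $p = 0$, $\beta = 1$, $b = 1$, $h = (0,0)$ and $K(x_1,x_2) = x_1 x_2$ with no $y$-type coordinates, so both coordinates attain $\lambda = 1/2$; none of these choices comes from the slides. The $x_1$ integral is done in closed form with the error function, leaving $Z^0(n) = \frac{\sqrt{\pi}}{2\sqrt{n}}\int_0^{\sqrt{n}} \frac{\operatorname{erf}(s)}{s}\,ds$. The function names are hypothetical.

```python
import math


def erf_over_s_integral(upper, steps=20_000):
    """Midpoint-rule value of the integral of erf(s)/s over [0, upper], upper >= 1."""
    h = 1.0 / steps
    head = h * sum(math.erf((i + 0.5) * h) / ((i + 0.5) * h) for i in range(steps))
    # On [1, upper] substitute s = exp(v), so erf(s)/s ds becomes erf(exp(v)) dv.
    hv = math.log(upper) / steps
    tail = hv * sum(math.erf(math.exp((i + 0.5) * hv)) for i in range(steps))
    return head + tail


def z0(n):
    """Z^0(n): double integral over [0, 1]^2 of exp(-n * x1**2 * x2**2)."""
    root = math.sqrt(n)
    return math.sqrt(math.pi) / (2.0 * root) * erf_over_s_integral(root)


def scaled(n):
    """Z^0(n) * n**lambda / (log n)**(r - 1) with lambda = 1/2 and r = 2."""
    return z0(n) * math.sqrt(n) / math.log(n)


if __name__ == "__main__":
    for n in (1e2, 1e4, 1e6):
        print(n, z0(n) * math.sqrt(n), scaled(n))
```

Without the $\log n$ divisor the quantity $\sqrt{n}\,Z^0(n)$ keeps growing, while `scaled(n)` stays between fixed positive bounds over this range, consistent with two-sided bounds of the form stated in the theorem.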
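  26. For the mean error function $K(w)$ and the prior $\phi(w)$, define the zeta function
      $$\zeta(z) = \int K(w)^z\,\phi(w)\,dw.$$
      From the information about its poles, the real log canonical threshold of the singularities (?) of $K$ is obtained. This in turn gives the theoretical values of the free energy and the generalization loss.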

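  27. 0.1 Questions
      Work through the computation in the regular case. In particular, where is regularity used? In the regular case, is a large sample size being assumed?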