Watanabe 6.3

Naoya Umezaki
October 25, 2018

Slides for a seminar on Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory (Cambridge Monographs on Applied and Computational Mathematics). They review Chapter 4 and cover Section 6.3, which studies the asymptotic behavior of the generalization error.

Transcript

  1. October 25, 2018. Watanabe, 6.3. @unaoya

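  2. Statistical learning
    1. We want to predict the true distribution $q(x)$, or the samples generated from it.
    2. Set up a model $p(x|w)$ and a parameter space $W$.
    3. From the given samples, determine a measure on the parameter space and a predictive distribution $\hat{p}(x)$.
    We want to evaluate the model.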
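  3. Bayes quartet: four criteria for evaluating a model.
                      Bayes estimation    Gibbs estimation
    Distribution      $B_g$               $G_g$
    Sample            $B_t$               $G_t$
    These are random variables that depend on the sample $D_n$.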
  4. Goals of 6.3. We investigate:
    • the asymptotic behavior as the sample size $n \to \infty$
    • expectations over the sample
    • the relations among these four quantities

  5. Today's topic is $G_g$.

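  6. Gibbs estimation: sample a parameter $\hat{w}$ from the posterior distribution and take $\hat{p}(x) = p(x|\hat{w})$ as the predictive distribution.
    Generalization error $G_g$: the KL divergence between $q(x)$ and $\hat{p}(x)$, integrated over $w$ with respect to the posterior $p(w|D_n)$:
    $G_g = \int_W K(w)\, p(w|D_n)\, dw$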
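  7. Problem: as $n \to \infty$, to what random variable does $nG_g$ converge?
    Main term:
    $G_g(\epsilon) = \frac{\int_{K(w)\le\epsilon} K(w)\, p(w|D_n)\, dw}{\int_{K(w)\le\epsilon} p(w|D_n)\, dw}$
    Lemma 1 (Lemma 6.3). $nG_g - nG_g(\epsilon)$ converges to 0 in probability.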
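  8. Evaluating the main term. We want to evaluate
    $G_g(\epsilon) = \frac{\int_{K(w)\le\epsilon} K(w)\, p(w|D_n)\, dw}{\int_{K(w)\le\epsilon} p(w|D_n)\, dw} = E[K(w) \mid K(w)\le\epsilon]$,
    but doing so directly is difficult. We use resolution of singularities.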
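  9. Standard form. Write
    $f(x, g(u)) = \log\frac{q(x)}{p(x|g(u))} = a(x,u)\,u^k$,
    so that $E_X[a(X,u)] = u^k$ and
    $K(g(u)) = u^{2k}$
    $K_n(g(u)) = u^{2k} - \frac{1}{\sqrt{n}}\, u^k \xi_n(u)$
    $\xi_n(u) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \{E_X[a(X,u)] - a(X_i,u)\}$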
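  10. $\xi_n(u)$ is a stochastic process that depends on the sample $D_n$. $\xi_n$ converges in law to a Gaussian process $\xi$.
    Lemma 2 (6.51). Define $G_g^*(\xi_n) = E_{y,t}[t|\xi_n]$. Then
    $nG_g(\epsilon) - G_g^*(\xi_n) \to^{P} 0$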
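  11. The integrals $E_{y,t}$ and $E_u$ on the resolution of singularities $M$
    • $\xi(u)$: a function of class $C^1$ on $M$ (the stochastic process of the sample)
    • $f(u)$: a function on $M$ (for $K(w)$, $f(u) = u^{2k}$)
    • $0 \le \sigma \le 1$
    $E_u^{\sigma}[f(u)|\xi] = \frac{\sum_{\alpha\in A} \int_{[0,b]^d} f(u)\, Z(u,\xi)\, du}{\sum_{\alpha\in A} \int_{[0,b]^d} Z(u,\xi)\, du}$
    $A$ is the (finite) set of coordinate neighborhoods.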
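  12. $Z(u,\xi)$ is
    $u^h \varphi^*(u) \exp(-\beta n u^{2k} + \beta\sqrt{n}\, u^k \xi(u) - \sigma u^k a(X,u))$
    Relation to the posterior $p(w|D_n)$:
    • $u^h \varphi^*(u)$ corresponds to the prior $\varphi(w)$.
    • With $\sigma = 0$, and omitting the prior factor, $Z_n^0\, p(w|D_n) = \exp(-n\beta K_n(w)) = \exp(-\beta n u^{2k} + \beta\sqrt{n}\, u^k \xi_n(u))$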
  13. Lemma 3 (6.41). $G_g(\epsilon) = E_u^0[u^{2k}|\xi_n]$
    Recall that $u^{2k} = K(g(u))$.
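  14. Essential part. Take coordinates $u = (x, y)$ and the essential part $A^* \subset A$ (determined by the resolution of singularities of $K(w)$).
    $E_{y,t}[f(y,t)|\xi] = \frac{\sum_{\alpha\in A^*} \int dt \int_{[0,b]^{d-m}} f(y,t)\, Z_0(y,t,\xi)\, dy}{\sum_{\alpha\in A^*} \int dt \int_{[0,b]^{d-m}} Z_0(y,t,\xi)\, dy}$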
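  15. $Z_0(y,t,\xi) = \gamma_b\, y^{\mu} t^{\lambda-1} \exp(-\beta t + \beta\sqrt{t}\, \xi_0(y))\, \varphi_0^*(y)$
    Lemma 4 (Lemma 6.6 with $p = 1$, $f = 1$, $\xi = \xi_n$).
    $|E_u^0[n u^{2k}|\xi_n] - E_{y,t}[t|\xi_n]| \le \frac{D(\xi_n, 1, \varphi^*)}{\log n}$
    The proof uses the computation of the partition function from Chapter 4.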
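  16. Constructing the limit.
    Definition 1 (6.46). For a function $\psi(u)$ on $M$, set $G_g^*(\psi) = E_{y,t}[t|\psi]$.
    Rewriting the previous lemma in this notation gives
    Lemma 5. $|nG_g(\epsilon) - G_g^*(\xi_n)| \le \frac{D(\xi_n, 1, \varphi^*)}{\log n}$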
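  17. Conclusion. Since $\xi_n$ converges in law to $\xi$, we obtain
    • $nG_g(\epsilon) - G_g^*(\xi_n) \to 0$, using Lemma 4
    • $G_g^*(\xi_n) - G_g^*(\xi) \to 0$
    Putting everything together proves $nG_g - G_g^*(\xi) \to 0$.
    $G_g^*(\xi)$ is a random variable that depends on $\xi$.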
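  18. Review of Chapter 4: the zeta function. For a nonnegative analytic function $K(w)$ on an open set $U \subset \mathbb{R}^d$ and a compactly supported $C^\infty$ function $\varphi(w)$, define
    $\zeta(z) = \int K(w)^z \varphi(w)\, dw$
    What information do the positions and orders of its poles carry?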
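  19. Density of states. The inverse Mellin transform of $\zeta(z)$ is the density of states
    $v(t) = \int \delta(t - K(w))\, \varphi(w)\, dw$
    By the theory of the Mellin transform, the order of its divergence corresponds to the positions of the poles of $\zeta(z)$.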

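  20. Partition function. The Laplace transform of the density of states $v(t)$,
    $Z(n) = \int \exp(-nK(w))\, \varphi(w)\, dw$,
    is called the partition function. This is what the first half of Chapter 6 studied.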
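
A minimal numerical sketch of how such a partition function scales with n. The toy setting is an assumption for illustration and is not taken from the slides: one dimension, K(w) = w^{2k} on [0, 1], and a constant weight equal to 1. Substituting t = n w^{2k} gives the leading term Γ(λ)/(2k) · n^{-λ} with λ = 1/(2k). The function names are hypothetical.

```python
import math


def partition_function(n, k, num_points=200_000):
    """Midpoint-rule estimate of Z(n) = integral over [0, 1] of exp(-n * w**(2k)) dw."""
    step = 1.0 / num_points
    total = 0.0
    for i in range(num_points):
        w = (i + 0.5) * step
        total += math.exp(-n * w ** (2 * k))
    return total * step


def leading_term(n, k):
    """Leading term Gamma(lam) / (2k) * n**(-lam), with lam = 1 / (2k) (toy assumption)."""
    lam = 1.0 / (2 * k)
    return math.gamma(lam) / (2 * k) * n ** (-lam)


if __name__ == "__main__":
    for k in (1, 2):
        for n in (100, 10_000):
            z = partition_function(n, k)
            # The ratio should be close to 1 once n is large.
            print(k, n, z, z / leading_term(n, k))
```

In this toy case the exponent λ = 1/(2k) becomes smaller as the zero of K at w = 0 becomes more degenerate, so Z(n) decays more slowly.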

  21. Singular learning theory. As in Remark 4.4, we want to study the behavior as $n \to \infty$ of
    $Z = \int \exp(-n\beta K(w) + \beta\sqrt{nK(w)}\, \xi(w))\, \varphi(w)\, dw$
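  22. By resolution of singularities for $K$, $Z$ can be written in terms of the normal crossing integrals $Z(n, \xi, \varphi)$ as
    $Z = \sum_{\alpha} Z(n,\ \xi\circ g_\alpha,\ \varphi\circ g_\alpha\, |g'_\alpha|)$,
    so the goal of 4.4 is to study $Z(n, \xi, \varphi)$.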
  23. Define
    $Z^p(n, \xi, \varphi) = \int_{[0,b]^r} dx \int_{[0,b]^s} dy\; K(x,y)^p\, x^h y^{h'} \varphi(x,y) \exp(-n\beta K(x,y)^2 + \sqrt{n}\,\beta K(x,y)\, \xi(x,y))$
  24. Setting $\xi = 0$ and $\varphi = 1$ in this definition, we write
    $Z^p(n) = \int_{[0,b]^r} dx \int_{[0,b]^s} dy\; K(x,y)^p\, x^h y^{h'} \exp(-n\beta K(x,y)^2)$
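  25. Theorem 1 (Theorem 4.7). Assume that $\frac{h_i + 1}{2k_i} = \lambda$ is constant and that $\frac{h'_j + 1}{2k'_j} > \lambda$. When $K(x,y) = x^k y^{k'}$, there exist $a_1, a_2 > 0$ such that for every $n$
    $a_1 \frac{(\log n)^{r-1}}{n^{\lambda+p}} \le Z^{2p}(n) \le a_2 \frac{(\log n)^{r-1}}{n^{\lambda+p}}$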
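  26. For the mean error function $K(w)$ and the prior $\varphi(w)$, define the zeta function
    $\zeta(z) = \int K(w)^z \varphi(w)\, dw$
    The information on its poles yields the real log canonical threshold of the singularities (?) of $K$. This determines the theoretical values of the free energy and the generalization loss.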
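
A small sketch of how a pole of the zeta function encodes a real log canonical threshold. The toy case is an assumption for illustration, not from the slides: one dimension, K(w) = w^{2k} on [0, 1], and prior equal to 1. For Re z > -1/(2k) the integral equals 1/(2kz + 1), which continues analytically to a function with a single simple pole at z = -λ, λ = 1/(2k). The function names are hypothetical.

```python
def zeta_closed_form(z, k):
    """1 / (2kz + 1): analytic continuation of the integral over [0, 1] of w**(2kz) dw."""
    return 1.0 / (2 * k * z + 1)


def zeta_numeric(z, k, num_points=100_000):
    """Midpoint-rule estimate of the defining integral; intended for z >= 0 only."""
    step = 1.0 / num_points
    return sum(((i + 0.5) * step) ** (2 * k * z) for i in range(num_points)) * step


def pole_and_residue(k):
    """Pole position -lam and residue 1/(2k) of zeta_closed_form, with lam = 1/(2k)."""
    lam = 1.0 / (2 * k)
    return -lam, 1.0 / (2 * k)


if __name__ == "__main__":
    k = 2
    print(zeta_numeric(1.0, k), zeta_closed_form(1.0, k))  # both approximate 1/5
    print(pole_and_residue(k))
```

Here the pole closest to the origin sits at z = -λ and has order 1; in higher-dimensional normal crossing cases the order of the pole can exceed 1.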

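  27. 0.1 Open questions. Work out the computation in the regular case. In particular, where is regularity used in that computation? In the regular case, is a large number of samples assumed?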
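
As one concrete regular example, here is a hedged sketch of the Gibbs generalization error in a toy model. Everything in it is an assumption for illustration and does not come from the slides or the book: true distribution q = N(0, 1), model p(x|w) = N(w, 1), a flat (improper) prior, and β = 1. Then K(w) = w²/2, the posterior is N(x̄, 1/n) for sample mean x̄, and G_g = (x̄² + 1/n)/2, so n G_g = ((√n x̄)² + 1)/2. The function names are hypothetical.

```python
import math


def kl_divergence(w):
    """K(w) = KL(N(0, 1) || N(w, 1)) = w**2 / 2 for the toy model."""
    return 0.5 * w * w


def gibbs_error_closed_form(sample_mean, n):
    """G_g = E[K(w)] under the toy posterior N(sample_mean, 1/n)."""
    return 0.5 * (sample_mean ** 2 + 1.0 / n)


def gibbs_error_numeric(sample_mean, n, num_points=20_000, width=10.0):
    """Midpoint-rule integral of K(w) against the posterior density N(sample_mean, 1/n)."""
    sd = 1.0 / math.sqrt(n)
    lower = sample_mean - width * sd
    step = 2.0 * width * sd / num_points
    total = 0.0
    for i in range(num_points):
        w = lower + (i + 0.5) * step
        density = math.exp(-0.5 * ((w - sample_mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += kl_divergence(w) * density
    return total * step


if __name__ == "__main__":
    sample_mean, n = 0.3, 50
    print(gibbs_error_closed_form(sample_mean, n))
    print(gibbs_error_numeric(sample_mean, n))
```

In this toy model √n x̄ is exactly standard normal under q for every n, so n G_g has the same law as (ξ² + 1)/2 with ξ standard normal at every sample size. This is a property of the chosen Gaussian example only and says nothing about regular models in general.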