Naoya Umezaki
October 25, 2018

# Watanabe 6.3

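Presentation material for a seminar on Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory (Cambridge Monographs on Applied and Computational Mathematics). It reviews Chapter 4 and covers Section 6.3, which studies the asymptotic behavior of the generalization error.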


## Transcript

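2. ### Statistical learning

1. We want to predict the true distribution $q(x)$, or samples generated from it.
2. Choose a model $p(x|w)$ and a parameter space $W$.
3. From the given samples, determine a measure on the parameter space and a predictive distribution $\hat{p}(x)$.

We then want to evaluate the model.
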
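3. ### Bayes quartet

Four criteria for evaluating a model:

| | Bayes estimation | Gibbs estimation |
|---|---|---|
| Distribution | $B_g$ | $G_g$ |
| Sample | $B_t$ | $G_t$ |

These are random variables that depend on the sample $D_n$.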

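6. ### Gibbs estimation

Sample a parameter $\hat{w}$ from the posterior distribution and take $\hat{p}(x) = p(x|\hat{w})$ as the predictive distribution.

Generalization error $G_g$: the KL divergence between $q(x)$ and $\hat{p}(x)$, integrated over $w$ with respect to the posterior $p(w|D_n)$,

$$G_g = \int_W K(w)\, p(w|D_n)\, dw,$$

where $K(w)$ denotes the KL divergence from $q(x)$ to $p(x|w)$.
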
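
The integral defining $G_g$ can be approximated numerically on a small example. The following is a minimal sketch under assumptions that are not in the slides: a toy model $p(x|a,b) = N(ab, 1)$ with true distribution $q = N(0,1)$, a uniform prior on $[-1,1]^2$, and a plain grid approximation of the posterior. The function name `gibbs_generalization_error` is hypothetical.

```python
import numpy as np

def gibbs_generalization_error(samples, grid_size=201, beta=1.0):
    """Grid approximation of G_g = ∫ K(w) p(w|D_n) dw for a toy model.

    Assumed toy setting (not from the slides): q = N(0, 1),
    p(x|a, b) = N(a*b, 1), uniform prior on [-1, 1]^2.
    """
    a = np.linspace(-1.0, 1.0, grid_size)
    A, B = np.meshgrid(a, a, indexing="ij")
    mean = A * B                      # model mean at each grid point w = (a, b)
    K = 0.5 * mean**2                 # KL(N(0,1) || N(mean,1)) = mean^2 / 2
    n = len(samples)
    # sum_i log p(X_i|w), dropping terms that do not depend on w
    log_lik = mean * samples.sum() - 0.5 * n * mean**2
    log_post = beta * log_lik         # uniform prior adds only a constant
    log_post -= log_post.max()        # stabilize the exponential
    post = np.exp(log_post)
    post /= post.sum()                # normalized posterior weights on the grid
    return float((K * post).sum())

rng = np.random.default_rng(0)
for n in (100, 1000):
    x = rng.standard_normal(n)
    print(n, n * gibbs_generalization_error(x))
```

The printed values of $nG_g$ change from one sample to the next, which illustrates that $nG_g$ is a random variable rather than a constant.
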
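7. ### Problem

To what random variable does $nG_g$ converge as $n \to \infty$?

Main term:

$$G_g(\epsilon) = \frac{\int_{K(w)\le\epsilon} K(w)\, p(w|D_n)\, dw}{\int_{K(w)\le\epsilon} p(w|D_n)\, dw}$$

Lemma 1 (Lemma 6.3). $nG_g - nG_g(\epsilon)$ converges to 0 in probability.
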
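8. ### Evaluating the main term

We want to evaluate

$$G_g(\epsilon) = \frac{\int_{K(w)\le\epsilon} K(w)\, p(w|D_n)\, dw}{\int_{K(w)\le\epsilon} p(w|D_n)\, dw} = E[K(w) \mid K(w)\le\epsilon],$$

but doing so directly is difficult, so we use resolution of singularities.
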
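9. ### Standard form

Using a resolution of singularities $w = g(u)$, write

$$f(x, g(u)) = \log\frac{q(x)}{p(x|g(u))} = a(x,u)\,u^k.$$

Then, with $K_n(w) = \frac{1}{n}\sum_{i=1}^n f(X_i, w)$,

$$K(g(u)) = u^{2k}, \qquad K_n(g(u)) = u^{2k} - \frac{1}{\sqrt{n}}\,u^k\,\xi_n(u),$$

$$\xi_n(u) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \left\{E_X[a(X,u)] - a(X_i,u)\right\}.$$

Here $E_X[a(X,u)] = u^k$, because $K(g(u)) = E_X[f(X, g(u))] = u^{2k}$.
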
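10. ### Convergence of $\xi_n$

$\xi_n(u)$ is a stochastic process that depends on the sample $D_n$, and $\xi_n$ converges in law to a Gaussian process $\xi$.

Lemma 2 (6.51). Define $G_g^*(\xi_n) = E_{y,t}[t \mid \xi_n]$. Then

$$nG_g(\epsilon) - G_g^*(\xi_n) \xrightarrow{P} 0.$$
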
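11. ### The integrals $E_{y,t}$ and $E_u$ on the resolution of singularities $M$

- $\xi(u)$: a function of class $C^1$ on $M$ (the stochastic process of the sample)
- $f(u)$: a function on $M$ ($f(u) = u^{2k}$ for $K(w)$)
- $0 \le \sigma \le 1$

$$E_u^\sigma[f(u) \mid \xi] = \frac{\sum_{\alpha\in A}\int_{[0,b]^d} f(u)\,Z(u,\xi)\,du}{\sum_{\alpha\in A}\int_{[0,b]^d} Z(u,\xi)\,du}$$

$A$ is the (finite) set of coordinate neighborhoods.
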
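12. ### Definition of $Z(u, \xi)$

$$Z(u,\xi) = u^h\,\varphi^*(u)\,\exp\!\left(-\beta n u^{2k} + \beta\sqrt{n}\,u^k\,\xi(u) - \sigma u^k a(X,u)\right)$$

Relation to the posterior $p(w|D_n)$:

- $u^h\varphi^*(u)$ corresponds to the prior $\varphi(w)$.
- With $\sigma = 0$, the exponential factor is the one in the posterior: apart from the prior, $Z_n^0\, p(w|D_n)$ is $\exp(-n\beta K_n(w)) = \exp(-\beta n u^{2k} + \beta\sqrt{n}\,u^k\,\xi_n(u))$.
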
13. ### Lemma 3 (6.41)

$$G_g(\epsilon) = E_u^0[u^{2k} \mid \xi_n]$$

Recall that $u^{2k} = K(g(u))$.

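14. ### Essential part

Take coordinates $u = (x, y)$ and the essential part $A^* \subset A$ (determined by the resolution of singularities of $K(w)$).

$$E_{y,t}[f(y,t) \mid \xi] = \frac{\sum_{\alpha\in A^*}\int dt\int_{[0,b]^{d-m}} f(y,t)\,Z_0(y,t,\xi)\,dy}{\sum_{\alpha\in A^*}\int dt\int_{[0,b]^{d-m}} Z_0(y,t,\xi)\,dy}$$
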
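15. ### Definition of $Z_0$ and Lemma 4

$$Z_0(y,t,\xi) = \gamma_b\, y^\mu\, t^{\lambda-1} \exp\!\left(-\beta t + \beta\sqrt{t}\,\xi_0(y)\right)\varphi_0^*(y)$$

Lemma 4 (Lemma 6.6 with $p = 1$, $f = 1$, $\xi = \xi_n$).

$$\left|E_u^0[n u^{2k} \mid \xi_n] - E_{y,t}[t \mid \xi_n]\right| \le \frac{D(\xi_n, 1, \varphi^*)}{\log n}$$

The proof uses the computation of the partition function from Chapter 4.
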
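
The expectation $E_{y,t}[t \mid \xi]$ can be made concrete in the simplest case. The sketch below assumes, purely for illustration, that there is no $y$ dependence and that $\xi$ is a fixed real number, so the weight reduces to $t^{\lambda-1}\exp(-\beta t + \beta\sqrt{t}\,\xi)$ on $t \ge 0$. The function name `limit_gibbs_error`, the truncation `t_max`, and the substitution $s = t^\lambda$ (which removes the endpoint singularity) are choices made here, not taken from the slides.

```python
import numpy as np

def limit_gibbs_error(xi, lam, beta=1.0, t_max=200.0, num=200_000):
    """E_t[t | xi] under the weight t^(lam-1) exp(-beta t + beta sqrt(t) xi).

    Assumes no y-dependence and a scalar xi (illustrative simplification).
    Substituting s = t^lam turns t^(lam-1) dt into ds / lam, so the
    remaining integrand is bounded near s = 0.
    """
    s_max = t_max**lam
    ds = s_max / num
    s = (np.arange(num) + 0.5) * ds          # midpoint rule in s
    t = s ** (1.0 / lam)
    log_w = -beta * t + beta * np.sqrt(t) * xi
    w = np.exp(log_w - log_w.max())          # common factors cancel in the ratio
    return float((t * w).sum() / w.sum())

for xi in (-1.0, 0.0, 1.0):
    print(xi, limit_gibbs_error(xi, lam=0.5))
```

For $\xi = 0$ the weight is a Gamma density with shape $\lambda$ and rate $\beta$, so the value should be close to $\lambda/\beta$; for other $\xi$ the result depends on $\xi$, in line with the limit being a random variable.
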
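16. ### Constructing the limit

Definition 1 (6.46). For a function $\psi(u)$ on $M$, define

$$G_g^*(\psi) = E_{y,t}[t \mid \psi].$$

With this definition, the previous lemma can be rewritten as follows.

Lemma 5.

$$\left|nG_g(\epsilon) - G_g^*(\xi_n)\right| \le \frac{D(\xi_n, 1, \varphi^*)}{\log n}$$
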
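17. ### Conclusion

Since $\xi_n$ converges in law to $\xi$, we obtain:

- $nG_g(\epsilon) - G_g^*(\xi_n) \to 0$, using Lemma 4
- $G_g^*(\xi_n) - G_g^*(\xi) \to 0$

Putting everything together proves $nG_g - G_g^*(\xi) \to 0$. Here $G_g^*(\xi)$ is a random variable that depends on $\xi$.
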
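18. ### Review of Chapter 4: the zeta function

For a non-negative analytic function $K(w)$ on an open set $U \subset \mathbb{R}^d$ and a $C^\infty$ function $\varphi(w)$ with compact support, define

$$\zeta(z) = \int K(w)^z\,\varphi(w)\,dw.$$

What information do the positions of its poles and their orders carry?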
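
A small example may help make the question concrete. The sketch below assumes $K(x,y) = x^2y^2$ on $[0,1]^2$ with $\varphi = 1$ (an illustrative choice that ignores the compact-support condition). In that case $\zeta(z) = 1/(2z+1)^2$ wherever the integral converges, so the continuation has a single pole at $z = -1/2$ of order 2. The code only checks the closed form numerically for real $z \ge 0$; the function names are hypothetical.

```python
import numpy as np

def zeta_numeric(z, num=1000):
    """Midpoint-rule estimate of ∫∫ K(x,y)^z dx dy on [0,1]^2, K = x^2 y^2."""
    x = (np.arange(num) + 0.5) / num
    X, Y = np.meshgrid(x, x, indexing="ij")
    K = (X * Y) ** 2
    return float((K ** z).mean())     # the cell area 1/num^2 is absorbed by mean()

def zeta_closed_form(z):
    """1 / (2z + 1)^2: one pole at z = -1/2, of order 2."""
    return 1.0 / (2.0 * z + 1.0) ** 2

for z in (0.25, 0.5, 1.0, 2.0):
    print(z, zeta_numeric(z), zeta_closed_form(z))
```

In this example the pole position gives the exponent $1/2$ and the pole order gives the multiplicity 2, which are the quantities that appear as $\lambda$ and as the power of $\log n$ in the asymptotics of the partition function.
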

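21. ### Singular learning theory

As in Remark 4.4, we want to study the behavior as $n \to \infty$ of

$$Z = \int \exp\!\left(-n\beta K(w) + \beta\sqrt{nK(w)}\,\xi(w)\right)\varphi(w)\,dw.$$
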
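22. ### Reduction to the normal crossing case

By resolution of singularities for $K$, the integral can be written in terms of the normal crossing integrals $Z(n, \xi, \varphi)$ as

$$Z = \sum_\alpha Z\!\left(n,\ \xi\circ g_\alpha,\ \varphi\circ g_\alpha\,|g'_\alpha|\right),$$

so the goal of Section 4.4 is to study $Z(n, \xi, \varphi)$.
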
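23. ### Definition of $Z^p(n, \xi, \varphi)$

Define

$$Z^p(n,\xi,\varphi) = \int_{[0,b]^r} dx \int_{[0,b]^s} dy\ K(x,y)^p\, x^h y^{h'}\, \varphi(x,y)\, \exp\!\left(-n\beta K(x,y)^2 + \sqrt{n}\,\beta K(x,y)\,\xi(x,y)\right).$$
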
24. ### Definition of $Z^p(n)$

Setting $\xi = 0$ and $\varphi = 1$ in this definition, we write

$$Z^p(n) = \int_{[0,b]^r} dx \int_{[0,b]^s} dy\ K(x,y)^p\, x^h y^{h'} \exp\!\left(-n\beta K(x,y)^2\right).$$

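25. ### Theorem 1 (Theorem 4.7)

Assume that $\dfrac{h_i + 1}{2k_i} = \lambda$ is constant in $i$ and that $\dfrac{h'_j + 1}{2k'_j} > \lambda$ for every $j$. When $K(x,y) = x^k y^{k'}$, there exist $a_1, a_2 > 0$ such that for every $n$

$$a_1\,\frac{(\log n)^{r-1}}{n^{\lambda+p}} \le Z^{2p}(n) \le a_2\,\frac{(\log n)^{r-1}}{n^{\lambda+p}}.$$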
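
The power of $n$ can be checked numerically in the simplest case $r = 1$, $s = 0$, where the integral is one-dimensional: $\int_0^b x^{kp+h}\exp(-n\beta x^{2k})\,dx$. Rescaling $x$ by $n^{1/(2k)}$ shows that this integral decays like $n^{-(h+kp+1)/(2k)}$, with no logarithmic factor since $r - 1 = 0$. The sketch below is an illustration under these assumptions; the function names and the parameter values $k=2$, $h=1$, $p=1$ are chosen here and do not come from the slides.

```python
import math
import numpy as np

def partition_integral(n, k=2, h=1, p=1, beta=1.0, b=1.0, num=200_000):
    """Midpoint-rule value of ∫_0^b K^p x^h exp(-n beta K^2) dx with K = x^k."""
    dx = b / num
    x = (np.arange(num) + 0.5) * dx
    K = x ** k
    integrand = K ** p * x ** h * np.exp(-n * beta * K ** 2)
    return float(integrand.sum() * dx)

def scaling_exponent(k=2, h=1, p=1):
    """Decay exponent (h + k p + 1) / (2k) obtained by rescaling x."""
    return (h + k * p + 1) / (2 * k)

e = scaling_exponent()
# Leading constant of the large-n asymptotics for beta = 1, k = 2
constant = math.gamma(e) / (2 * 2)
for n in (1e3, 1e4, 1e5):
    print(n, partition_integral(n) * n ** e, constant)
```

If the rescaled values stay close to the same constant as $n$ grows, the decay rate matches the exponent $(h+kp+1)/(2k)$, that is, $\lambda + p/2$ with $\lambda = (h+1)/(2k)$.
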