
Pattern Recognition and Machine Learning, Chapter 1 #PRML学ぼう PRML Seminar #2 / an introduction to machine learning

Shunya Ueta
April 10, 2015


These are the slides for a study group on pattern recognition and machine learning held at the University of Tsukuba.
We are running a reading circle on the book known as PRML (Pattern Recognition and Machine Learning).
This deck summarizes the contents of Chapter 1.
https://github.com/hurutoriya/prml-seminar/tree/master/chapter1
The materials are maintained on GitHub.


Transcript

  1. PRML Seminar #1
    1.1–1.6.1 #PRML学ぼう
    Shunya Ueta
    Graduate School of SIE, Univ. of Tsukuba
    Department of Computer Science
    April 13, 2015

  2. Introduction
    Contents of PRML Seminar #2
    Chapter 1: Introduction
    1.1 Polynomial curve fitting
    1.2 Probability theory
    1.4 The curse of dimensionality
    1.5 Decision theory
    1.6 Information theory

  3. Self-introduction
    ▶ Name: Shunya Ueta (@hurutoriya)
    ▶ First-year graduate student at the University of Tsukuba, heading for the doctoral course :)
    ▶ Member of the Mathematical Informatics Laboratory
    ▶ Research interests: image recognition and machine learning

  4. About this study group
    ▶ This is a reading circle on pattern recognition and machine learning.
      We hold the seminar with the goal of understanding the foundations of
      machine learning and pattern recognition and using them at a practical level.
    ▶ We plan to finish one pass through the book by the end of 2015.
    ▶ Participants are assumed to have basic knowledge of calculus, linear
      algebra, and probability and statistics.
    ▶ Sample code in the materials is written in Python.
    ▶ Information about the study group is posted under the hashtag #PRML学ぼう.

  5. Contents of PRML Seminar #2
    Today's coverage: Sections 1 → 1.6

  6. Chapter 1: Introduction
    What is machine learning?
    Pattern recognition: automatically discovering regularities in data by
    means of computer algorithms, and using those regularities to classify
    the data into different categories.
    Example: recognition of handwritten digits
    The input is a 28 × 28 image of a handwritten digit, which can be
    represented as a 784-dimensional real-valued vector x. We want to build
    a machine that takes the vector x as input and outputs which of the
    digits 0, . . . , 9 it represents.
    Figure 1: examples of handwritten digit images

  7. Approach (1)
    Hand-crafted rules → the number of rules explodes (infeasible)
    The machine learning approach:
    ▶ Training set: prepare a large set of N handwritten digits {x1, . . . , xN}
    ▶ Target vector t: a vector representing the category of each digit
    The result is a function y(x): feeding a digit image x into this
    function returns an output vector y, encoded in the same way as the
    target vectors.

  8. Approach (2)
    ▶ Training (learning) phase: the model is determined from the training set alone
    ▶ Test set: data outside the training set (unseen data)
    ▶ Generalization: making the model applicable to data outside the
      training set (unseen data)
    In practice, input data exhibit great variability → generalization is
    the central challenge.

  9. Example: handwriting recognition using Ajax
    Figure 2: handwriting recognition using Ajax
    Reference: http://chasen.org/~taku/software/ajax/hwr/

  10. Preprocessing
    In the real world, input variables are preprocessed to make the problem
    easier to solve.
    Example: handwritten digits
    ▶ Transform the digit images (affine transformations) and scale them to
      a common size → reduces the variability of the input data
    The preprocessing stage is also called feature extraction. Besides
    reducing variability, it is often used to speed up computation.

  11. Categories of machine learning
    1. Supervised learning: problems in which the training data are labelled
      ▶ Classification: assigning each input vector to one of a finite
        number of discrete categories
      ▶ Regression: problems where the desired output consists of one or
        more continuous variables
    2. Unsupervised learning: problems in which the training data are unlabelled
      ▶ Clustering: discovering groups of similar examples
      ▶ Density estimation: determining the distribution of data in the
        input space
    3. Semi-supervised learning: problems in which labelled and unlabelled
      training data are mixed

  12. Reinforcement learning (1)
    Reinforcement learning: finding suitable actions that maximize a reward
    under given conditions.
    Learning proceeds through interaction with the environment over a
    sequence of states and actions (actions are judged not only by the
    immediate reward but also with reference to past actions).
    Difference from supervised learning: the optimal answers are not given;
    the learning algorithm discovers them itself through trial and error.
    Reinforcement learning applied to backgammon (Tesauro 1994):
    a neural network (Chapter 5) must play millions of games against copies
    of itself. The number of possible moves is enormous, but a reward can
    only be given in the form of a win. Credit must therefore be attributed
    accurately to the moves that contributed to the win (the credit
    assignment problem).

  13. Reinforcement learning (2)
    Reinforcement learning balances the following two activities (a trade-off):
    ▶ Exploration: trying new actions to see how effective they are
    ▶ Exploitation: taking actions known to yield a high reward

  14. Three important tools introduced in Chapter 1
    1. Probability theory
    2. Decision theory
    3. Information theory

  15. 1.1 Polynomial curve fitting
    Training data
    Input: N observations arranged as x = (x1, . . . , xN)^T
    Output: the corresponding observations t = (t1, . . . , tN)^T
    We want to predict the target variable t for a new, unseen input x
    (generalization).
    The underlying model is sin 2πx; the training data are generated with
    added noise for N = 10.
    Figure 3: training data given for N = 10

  16. 1.1 Polynomial curve fitting
    y(x, w) = w0 + w1 x + · · · + wM x^M = Σ_{j=0}^{M} wj x^j
    M: the order of the polynomial
    w: the polynomial coefficients, collected into the vector w
    E(w) = (1/2) Σ_{n=1}^{N} {y(xn, w) − tn}^2
    Fitting the polynomial to the training data determines the coefficient
    values → we aim to minimize the error function E(w).
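Since the seminar's sample code is in Python (slide 4), the error-function minimization above can be sketched as follows. The data generation (noisy sin 2πx, the seed, and the noise level) is an illustrative assumption, not taken from the slides:

```python
import numpy as np

# Assumed setup: N = 10 samples of sin(2*pi*x) plus Gaussian noise,
# mimicking the training data of figure 3.
rng = np.random.default_rng(0)
N, M = 10, 3
x = np.linspace(0.0, 1.0, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=N)

# y(x, w) = sum_{j=0}^{M} w_j * x**j; the design matrix has Phi[n, j] = x_n**j.
Phi = np.vander(x, M + 1, increasing=True)

# Least squares minimizes E(w) = 1/2 * sum_n (y(x_n, w) - t_n)**2.
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
E = 0.5 * np.sum((Phi @ w - t) ** 2)
```

The minimized E(w) can never exceed the error of any other coefficient vector, including w = 0.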

  17. Fitting order: M = 0, 1
    Figure 4: M = 0
    Figure 5: M = 1

  18. Fitting order: M = 3, 9
    Figure 6: M = 3
    Figure 7: M = 9

  19. Over-fitting and the root-mean-square error
    Fitting with a 9th-order polynomial leads to over-fitting.
    The goal of machine learning: accurate predictions on unseen data
    (generalization).
    E_RMS = √(2 E(w) / N)
    Root-mean-square (RMS) error:
    ▶ dividing by N removes the effect of the sample size
    ▶ taking the square root restores the original scale
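A small sketch of how the training-set E_RMS behaves as the order M grows; the data-generation details are again assumptions for illustration:

```python
import numpy as np

# Assumed setup: 10 noisy samples of sin(2*pi*x), as in the fitting example.
rng = np.random.default_rng(1)
N = 10
x = np.linspace(0.0, 1.0, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=N)

def rms_error(M):
    """Fit an order-M polynomial by least squares; return E_RMS = sqrt(2 E(w) / N)."""
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    E = 0.5 * np.sum((Phi @ w - t) ** 2)
    return float(np.sqrt(2.0 * E / N))

# Training error can only decrease with M; M = 9 interpolates all 10 points,
# which is exactly the over-fitting regime the slide describes.
errors = {M: rms_error(M) for M in (0, 1, 3, 9)}
```

The shrinking training error says nothing about unseen data: the test-set E_RMS is what reveals over-fitting.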

  20. Over-fitting as the data-set size changes
    Figure 8: M = 9, N = 1000

  21. Regularization of the error function to control over-fitting
    E(w) = (1/2) Σ_{n=1}^{N} {y(xn, w) − tn}^2 + (λ/2) ||w||^2
    With a quadratic regularizer, this is known as ridge regression.
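The regularized error function above has the closed-form minimizer w = (ΦᵀΦ + λI)⁻¹ Φᵀt, which a short sketch can demonstrate (data setup assumed, as before):

```python
import numpy as np

# Assumed setup: M = 9 fit to 10 noisy samples of sin(2*pi*x),
# the over-fitting case from the earlier slides.
rng = np.random.default_rng(2)
N, M = 10, 9
x = np.linspace(0.0, 1.0, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=N)
Phi = np.vander(x, M + 1, increasing=True)

def ridge_fit(lam):
    """Minimize 1/2 * ||Phi w - t||^2 + lam/2 * ||w||^2 in closed form."""
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)

w_small = ridge_fit(1e-6)  # weak regularization: large coefficients
w_large = ridge_fit(1.0)   # strong regularization: coefficients shrink
```

Increasing λ trades a slightly worse fit on the training data for much smaller coefficients, which is how over-fitting is tamed.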

  22. Validation set
    Validation set: a separate data set used to determine w (model selection).
    Also called a hold-out set.
    Drawback: it wastes valuable data.

  23. 1.2 Probability theory
    Data always carry noise, and data sets are finite in size, so
    uncertainty becomes a key concept.
    A simple illustration of the concepts of probability:
    Figure 9: there are two boxes, red and blue. The red box contains 2
    apples and 6 oranges; the blue box contains 3 apples and 1 orange. The
    red box is chosen 40% of the time, the blue box 60%, and a fruit is
    drawn from the chosen box with equal probability.

  24. Basic rules of probability
    Sum rule:
    p(X) = Σ_Y p(X, Y)
    Product rule:
    p(X, Y) = p(Y|X) p(X)
    Bayes' theorem:
    p(Y|X) = p(X|Y) p(Y) / p(X)

  25. An intuitive account of Bayes' theorem
    Which box was chosen?
    Prior probability: the probability p(Box) available beforehand.
    Posterior probability: the probability p(Box|Fruit) settled after the
    fruit has been drawn.
    Once we know the fruit is an orange, the red box becomes more likely,
    since it contains more oranges → the probability that the red box was
    chosen increases.
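The box-and-fruit example above can be worked through numerically (the counts and box probabilities are the ones stated in figure 9):

```python
# Red box: 2 apples, 6 oranges; blue box: 3 apples, 1 orange.
# p(red) = 0.4, p(blue) = 0.6; a fruit is drawn uniformly from the chosen box.
p_box = {"red": 0.4, "blue": 0.6}
p_fruit_given_box = {
    "red": {"apple": 2 / 8, "orange": 6 / 8},
    "blue": {"apple": 3 / 4, "orange": 1 / 4},
}

def posterior(box, fruit):
    """p(box | fruit) via Bayes' theorem: p(fruit|box) p(box) / p(fruit)."""
    p_fruit = sum(p_fruit_given_box[b][fruit] * p_box[b] for b in p_box)
    return p_fruit_given_box[box][fruit] * p_box[box] / p_fruit
```

Observing an orange raises the probability of the red box from the prior 0.4 to the posterior 2/3, exactly the intuition on the slide.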

  26. 1.2.1 Extension to continuous variables
    Probability density — for continuous values:
    p(x) is defined so that the probability of the continuous variable x
    falling in the interval (x, x + δx) is p(x)δx as δx → 0, with
    1. p(x) ≥ 0
    2. ∫_{−∞}^{∞} p(x) dx = 1
    Probability mass — for discrete sets:
    p(x) is the probability that the discrete variable x takes a particular
    value.

  27. 1.2.2 Expectation and variance
    Expectation: the average of a function f(x) under the probability
    distribution p(x)
    Discrete: E[f] = Σ_x p(x) f(x)
    Continuous: E[f] = ∫ p(x) f(x) dx
    Variance:
    var[f] = E[(f(x) − E[f(x)])^2]
           = E[f(x)^2 − 2 f(x) E[f(x)] + E[f(x)]^2]
           = E[f(x)^2] − 2 E[f(x)] E[f(x)] + E[f(x)]^2
           = E[f(x)^2] − E[f(x)]^2
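The identity var[f] = E[f²] − E[f]² can be checked numerically on a small discrete distribution; the values and probabilities here are made up for illustration:

```python
import numpy as np

# A toy discrete distribution over four values (illustrative numbers only).
x = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([0.1, 0.2, 0.3, 0.4])

E = np.sum(p * x)                      # E[x] = sum_x p(x) x
E_sq = np.sum(p * x ** 2)              # E[x^2]
var_direct = np.sum(p * (x - E) ** 2)  # E[(x - E[x])^2]
var_shortcut = E_sq - E ** 2           # E[x^2] - E[x]^2
```

Both routes give the same variance, as the derivation on the slide shows.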

  28. Covariance
    cov[x, y] = E_{x,y}[(x − E[x])(y − E[y])]
              = E_{x,y}[xy − x E[y] − y E[x] + E[x]E[y]]
              = E_{x,y}[xy] − E[x]E[y] − E[y]E[x] + E[x]E[y]
              = E_{x,y}[xy] − E[x]E[y]
    If x and y are independent, E_{x,y}[xy] = E[x]E[y], and so
    cov[x, y] = 0.

  29. 1.2.3 Frequentist vs. Bayesian
    Frequentist: probability is viewed as the frequency of random,
    repeatable trials.
    Bayesian: probability quantifies a degree of uncertainty.
    Example: consider events that cannot be repeated many times in this
    world, such as the loss of the Antarctic ice sheet. By obtaining
    information about how much ice melts per year and updating on that
    information, we can assess the uncertainty of the ice sheet being lost.

  30. 1.2.3 The likelihood function in the frequentist and Bayesian views
    Likelihood function: p(Data|param), which evaluates the data and can be
    viewed as a function of the parameters.
    Frequentist: the parameters are considered fixed; they are determined
    by considering the distribution of the data.
    Bayesian: the data are fixed and unique; the uncertainty about the
    parameters is expressed as a probability distribution over w.

  31. Advantages of the Bayesian view
    Prior knowledge can be incorporated naturally.
    Example: a frequentist analysis with a biased sample of observations —
    toss a fair coin 3 times and get heads every time. Classical maximum
    likelihood then estimates the probability of heads as 1.
    Likelihood:
    Likelihood is numerically the same as a probability. For example, the
    probability of rolling three 1s in a row with a die equals the
    likelihood of that same outcome. The difference is that probability is
    the "probability of an event," whereas likelihood is the likelihood for
    a hypothesis given a set of observations.

  32. 1.2.4 The Gaussian distribution
    N(x|µ, σ^2) = (1 / √(2πσ^2)) exp{−(1 / 2σ^2)(x − µ)^2}
    Parameters: µ, σ^2
    Figure 10: µ = 0, σ = 1
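A minimal sketch of the Gaussian density above, with a numerical check that it integrates to one (the grid and integration range are chosen for illustration):

```python
import numpy as np

def gaussian(x, mu, sigma2):
    """N(x | mu, sigma^2) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# Numerically integrate the standard Gaussian over [-8, 8]; the tail mass
# beyond 8 standard deviations is negligible.
xs = np.linspace(-8.0, 8.0, 10001)
pdf = gaussian(xs, mu=0.0, sigma2=1.0)
area = np.trapz(pdf, xs)  # should be very close to 1
```

The density peaks at x = µ with height 1/√(2πσ²), matching figure 10.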

  33. 1.2.5 Curve fitting revisited
    Curve fitting from the Bayesian point of view:
    p(t|x, w, β) = N(t|y(x, w), β^{−1})
    where β is a parameter corresponding to the inverse variance of the
    distribution. We use maximum likelihood to determine the unknown
    parameters w and β from the training data x, t:
    p(t|x, w, β) = Π_{n=1}^{N} N(tn|y(xn, w), β^{−1})

  34. Maximizing the likelihood function
    To maximize the likelihood function it is convenient to use logarithms.
    Advantages:
    1. products become sums
    2. probabilities are small numbers, so multiplying many of them causes
       frequent numerical underflow; logarithms avoid this
    p(t|x, w, β) = Π_{n=1}^{N} N(tn|y(xn, w), β^{−1})
    ln p(t|x, w, β) = −(β/2) Σ_{n=1}^{N} {y(xn, w) − tn}^2
                      + (N/2) ln β − (N/2) ln 2π

  35. Predictive distribution
    First determine the parameter vector wML, then estimate βML:
    1/βML = (1/N) Σ_{n=1}^{N} {y(xn, wML) − tn}^2
    Having determined βML, we can consider the probability distribution of
    t in the form of a predictive distribution:
    p(t|x, wML, βML) = N(t|y(x, wML), βML^{−1})
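The two-step maximum-likelihood fit above can be sketched as follows: wML comes from least squares (as on the earlier slides), and 1/βML is the mean squared residual. The data setup is an assumption for illustration:

```python
import numpy as np

# Assumed setup: 30 noisy samples of sin(2*pi*x), fitted with a cubic polynomial.
rng = np.random.default_rng(3)
N, M = 30, 3
x = np.linspace(0.0, 1.0, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=N)

# Step 1: w_ML maximizes the likelihood, equivalently minimizes sum-of-squares.
Phi = np.vander(x, M + 1, increasing=True)
w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)

# Step 2: 1/beta_ML = (1/N) * sum_n (y(x_n, w_ML) - t_n)^2, the ML noise variance.
inv_beta_ml = float(np.mean((Phi @ w_ml - t) ** 2))
```

The predictive distribution for a new x is then the Gaussian N(t | y(x, wML), 1/βML).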

  36. 1.4 The curse of dimensionality
    Application to real-world problems
    In the curve-fitting example there was a single input variable; in real
    problems it is natural to have many input variables.
    We live in three dimensions, and geometric intuition fails us in spaces
    of four or more dimensions.
    Figure 11: scatter plot of a data set with 3 classes, projected onto a
    2-dimensional subspace for clarity

  37. A proposed assignment method
    Figure 12: assign to each grid cell the class most frequent within it
    Two properties that remain effective in high dimensions:
    1. Even in real data, the meaningful variation is concentrated in a
       lower-dimensional subspace.
    2. Data are often (locally) smooth, so new inputs can be handled by
       interpolation; a complex learned model is not always required.

  38. 1.5 Decision theory
    We have an input vector x and a target variable t, and want to predict
    t for a new value of x.
    Regression: t is a continuous variable.
    Classification: t is a class label.
    Inference: determining the joint distribution p(x, t) from the training
    data x, t.
    The subject of decision theory is how to use the distribution obtained
    by inference to predict t and make a decision.

  39. Key references on decision theory
    ▶ Berger, James O. Statistical Decision Theory and Bayesian Analysis.
      Springer Science & Business Media, 1985.
    ▶ Bather, John. Decision Theory: An Introduction to Dynamic Programming
      and Sequential Decisions. 2000.
    Example: given a vector x representing a patient's X-ray image, we want
    to classify the patient as C1 (cancer) or C2 (no cancer); that is, we
    want p(Ck|x):
    p(Ck|x) = p(x|Ck) p(Ck) / p(x)
    p(Ck) is the prior probability of a person having cancer before the
    X-ray is taken; p(Ck|x) is the posterior probability, revised via
    Bayes' theorem using the information obtained from the X-ray image.

  40. Some decision rules
    Loss function: a function expressing the loss incurred by a decision or
    action.
    In the cancer classification example, which error inflicts the greater
    loss on the patient?
    ▶ diagnosing a healthy person as having cancer
    ▶ diagnosing a person who has cancer as healthy
    Reject option: when a decision is difficult, the option of avoiding
    making a decision.
    Figure 13: posterior probabilities p(C1|x) and p(C2|x), with the
    rejection region defined by a threshold θ

  41. 1.5.4 Inference and decision
    Three approaches to the inference stage → decision stage:
    Generative models: solve the inference problem of determining the
    class-conditional densities p(x|Ck) for each class, also infer the
    priors p(Ck), and obtain the posteriors p(Ck|x); then use decision
    theory to assign classes. Because the inputs as well as the outputs are
    modelled, it is also possible to generate data.
    Discriminative models: solve the inference problem of determining the
    posterior class probabilities p(Ck|x) directly, then use decision
    theory to assign each new x to a class.
    Discriminant functions: decide directly, without inference, by mapping
    each input x straight onto a class label.

  42. 1.6 Information theory
    Information content and entropy
    Let x be a discrete random variable.
    h(x) = −log2 p(x): the information content of observing x
    H[x] = −Σ_x p(x) log2 p(x): the entropy of the random variable x

  43. 1.6 Information theory
    Consider a random variable x with 8 possible states, each equally
    likely. To transmit its value to a receiver we need a message 3 bits
    long. The entropy of this variable is
    H[x] = −8 × (1/8) log2 (1/8) = 3 bits
    Now suppose x has 8 possible states {a, b, c, d, e, f, g, h} with
    probabilities {1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64}. Its
    entropy is
    H[x] = −(1/2) log2 (1/2) − (1/4) log2 (1/4) − (1/8) log2 (1/8)
           − (1/16) log2 (1/16) − (4/64) log2 (1/64) = 2 bits
    We see that a uniform distribution has higher entropy than a
    non-uniform one; entropy is closely tied to the probability
    distribution.
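The two entropy calculations above can be reproduced directly in Python:

```python
import numpy as np

def entropy_bits(p):
    """H[x] = -sum_x p(x) log2 p(x), in bits (assumes all p(x) > 0)."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log2(p)))

# Uniform distribution over 8 states: entropy is 3 bits.
uniform = np.full(8, 1 / 8)

# The non-uniform distribution from the slide: entropy is 2 bits.
skewed = np.array([1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64])
```

The uniform distribution attains the higher entropy, as the slide concludes.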