A First Look at 『はじめてのパターン認識』 (Introduction to Pattern Recognition), Chapter 8
This is a general introduction to support vector machines (SVMs) for machine learning, covering the hard-margin SVM, the soft-margin SVM, and the kernel trick.
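Hard-margin SVM

The distance $d$ between the line $w_1 x_1 + w_2 x_2 + b = 0$ and the point $(x_{1i}, x_{2i})$ can be written as

$$d = \frac{|w_1 x_{1i} + w_2 x_{2i} + b|}{\sqrt{w_1^2 + w_2^2}} = \frac{|\mathbf{w}^T \mathbf{x}_i + b|}{\|\mathbf{w}\|}, \qquad \mathbf{w}^T = (w_1 \; w_2), \quad \mathbf{x}_i^T = (x_{1i} \; x_{2i}).$$

Margin: let $D$ be the distance from the linear discriminant plane to the nearest data point. The distance $d$ of every data point is then at least $D$:

$$d = \frac{|\mathbf{w}^T \mathbf{x}_i + b|}{\|\mathbf{w}\|} \ge D.$$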
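The distance formula and the margin can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `distance_to_hyperplane`, the weights, and the sample points are made up for illustration and do not come from the slides.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Distance from point x to the hyperplane w^T x + b = 0."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b  # w^T x + b
    norm = math.sqrt(sum(wj * wj for wj in w))        # ||w||
    return abs(score) / norm


# Hypothetical line 3*x1 + 4*x2 - 5 = 0 and three sample points.
w = [3.0, 4.0]
b = -5.0
points = [(3.0, 4.0), (0.0, 0.0), (1.0, 3.0)]

distances = [distance_to_hyperplane(w, b, p) for p in points]
margin = min(distances)  # D: distance to the nearest data point

print(distances)  # [4.0, 1.0, 2.0]
print(margin)     # 1.0
```

Every distance in `distances` is at least `margin`, which is the inequality $d \ge D$ stated above.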
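Condition for correct classification

Because $\mathbf{w}$ and $b$ can be rescaled so that the nearest data point satisfies $|\mathbf{w}^T \mathbf{x}_i + b| = 1$, the point-to-line distance formula gives

$$\mathbf{w}^T \mathbf{x}_i + b \ge 1 \quad (\mathbf{w}^T \mathbf{x}_i + b \ge 0), \qquad \mathbf{w}^T \mathbf{x}_i + b \le -1 \quad (\mathbf{w}^T \mathbf{x}_i + b < 0).$$

For class $C_1$ the correct label is $t_i = 1$, and $\mathbf{w}^T \mathbf{x}_i + b \ge 1$ holds in this class. For class $C_2$ the correct label is $t_i = -1$, and $\mathbf{w}^T \mathbf{x}_i + b \le -1$ holds in this class. In both classes, therefore,

$$t_i (\mathbf{w}^T \mathbf{x}_i + b) \ge 1.$$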
The condition for correct classification, $t_i (\mathbf{w}^T \mathbf{x}_i + b) \ge 1$, is rewritten in the form used as the optimization constraint:

$$t_i (\mathbf{w}^T \mathbf{x}_i + b) - 1 \ge 0.$$
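To make the constraint concrete, the sketch below evaluates $t_i(\mathbf{w}^T \mathbf{x}_i + b) - 1$ for a small hand-made data set. The data, the candidate $(\mathbf{w}, b)$, and the helper name `constraint_value` are illustrative assumptions, not values from the slides.

```python
def constraint_value(w, b, x, t):
    """Return t * (w^T x + b) - 1; non-negative means the constraint holds."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return t * score - 1.0


# Hypothetical 2-class toy data: label +1 for class C1, -1 for class C2.
X = [(2.0, 0.0), (1.0, 1.0), (-1.0, 0.0), (-2.0, -1.0)]
T = [1, 1, -1, -1]

# Candidate discriminant line x1 = 0, i.e. w = (1, 0), b = 0.
w = [1.0, 0.0]
b = 0.0

values = [constraint_value(w, b, x, t) for x, t in zip(X, T)]
feasible = all(v >= 0.0 for v in values)

# Points where the constraint holds with equality lie exactly on the margin.
on_margin = [x for x, v in zip(X, values) if abs(v) < 1e-12]

print(values)     # [1.0, 0.0, 0.0, 1.0]
print(feasible)   # True
print(on_margin)  # [(1.0, 1.0), (-1.0, 0.0)]
```

The points for which the value is exactly zero are the ones that would become support vectors for this choice of $\mathbf{w}$ and $b$.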
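Primal problem:

$$f(\mathbf{w}) = \min_{\mathbf{w}} \frac{1}{2} \mathbf{w}^T \mathbf{w} \quad \text{subject to} \quad t_i (\mathbf{w}^T \mathbf{x}_i + b) - 1 \ge 0.$$

Using the method of Lagrange multipliers, the primal problem (in $\mathbf{w}$) is reduced to the dual problem (in $\boldsymbol{\alpha}$), whose solution gives $\mathbf{w}$ and hence the equation of the linear discriminant plane.

Summary

Hard-margin classifier: a method that, when the data are linearly separable, classifies two classes using the linear discriminant plane that achieves the maximum margin.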
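The reduction from the primal to the dual can be sketched as follows. This is the standard textbook derivation rather than a transcription of the slide; the names $L_p$, $L_d$, $N$, and $H$ are assumptions introduced here for illustration.

```latex
\begin{align}
L_p(\mathbf{w}, b, \boldsymbol{\alpha})
  &= \frac{1}{2}\mathbf{w}^T\mathbf{w}
     - \sum_{i=1}^{N} \alpha_i \left[ t_i(\mathbf{w}^T\mathbf{x}_i + b) - 1 \right],
     \qquad \alpha_i \ge 0 \\
\frac{\partial L_p}{\partial \mathbf{w}} = \mathbf{0}
  &\;\Rightarrow\; \mathbf{w}_0 = \sum_{i=1}^{N} \alpha_i t_i \mathbf{x}_i \\
\frac{\partial L_p}{\partial b} = 0
  &\;\Rightarrow\; \sum_{i=1}^{N} \alpha_i t_i = \boldsymbol{\alpha}^T\mathbf{t} = 0 \\
L_d(\boldsymbol{\alpha})
  &= -\frac{1}{2}\boldsymbol{\alpha}^T H \boldsymbol{\alpha} + \boldsymbol{\alpha}^T\mathbf{1},
     \qquad H_{ij} = t_i t_j\, \mathbf{x}_i^T\mathbf{x}_j
\end{align}
```

Substituting the two stationarity conditions back into $L_p$ eliminates $\mathbf{w}$ and $b$, leaving a problem in $\boldsymbol{\alpha}$ alone: maximize $L_d(\boldsymbol{\alpha})$ subject to $\boldsymbol{\alpha}^T\mathbf{t} = 0$ and $\alpha_i \ge 0$.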
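Soft-margin SVM

Dual problem: maximize

$$-\frac{1}{2} \boldsymbol{\alpha}^T H \boldsymbol{\alpha} + \boldsymbol{\alpha}^T \mathbf{1} \quad \text{subject to} \quad \boldsymbol{\alpha}^T \mathbf{t} = 0, \quad C \ge \alpha_i \ge 0.$$

The dual problem is solved with sequential minimal optimization (SMO). Once all $\alpha_i$ have been obtained, the optimal bias $b_0$ is found from a free support vector $\mathbf{x}_{sv}$. A free support vector satisfies

$$t_i (\mathbf{w}_0^T \mathbf{x}_i + b) - 1 = 0,$$

so setting $\mathbf{x}_i = \mathbf{x}_{sv}$ and $t_i = t_{sv}$ gives

$$t_{sv} (\mathbf{w}_0^T \mathbf{x}_{sv} + b_0) - 1 = 0, \qquad b_0 = \frac{1}{t_{sv}} - \mathbf{w}_0^T \mathbf{x}_{sv}.$$

A free support vector is used because it has no slack variable $\xi_i$, which differs from data point to data point, and is therefore easy to handle. An upper-bound support vector instead satisfies

$$t_i (\mathbf{w}^T \mathbf{x}_i + b) - 1 + \xi_i = 0,$$

which makes it hard to use.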
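The bias recovery step can be sketched in plain Python. This is an illustration under stated assumptions, not the deck's implementation: the function name `recover_w_and_b`, the toy data, and the hand-picked `alpha` are hypothetical, $\mathbf{w}_0 = \sum_i \alpha_i t_i \mathbf{x}_i$ is the standard stationarity condition, and averaging $b_0$ over all free support vectors is a common numerical convenience that the slide does not mention.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def recover_w_and_b(X, T, alpha, C, tol=1e-9):
    """Recover w0 and b0 from dual variables alpha (assumed already solved)."""
    dim = len(X[0])
    # w0 = sum_i alpha_i * t_i * x_i (standard stationarity condition)
    w0 = [sum(a * t * x[j] for a, t, x in zip(alpha, T, X)) for j in range(dim)]
    # Free support vectors: 0 < alpha_i < C, so their slack xi_i is zero.
    free = [i for i, a in enumerate(alpha) if tol < a < C - tol]
    if not free:
        raise ValueError("no free support vector found")
    # b0 = 1 / t_sv - w0^T x_sv, averaged over the free support vectors
    # (the averaging is an assumption added here for robustness).
    biases = [1.0 / T[i] - dot(w0, X[i]) for i in free]
    b0 = sum(biases) / len(biases)
    return w0, b0, free


# Hypothetical toy problem with hand-picked alpha satisfying alpha^T t = 0.
X = [(1.0, 0.0), (-1.0, 0.0), (3.0, 1.0)]
T = [1, -1, 1]
alpha = [0.5, 0.5, 0.0]
C = 1.0

w0, b0, free = recover_w_and_b(X, T, alpha, C)
print(w0, b0, free)  # [1.0, 0.0] 0.0 [0, 1]
```

The third point has $\alpha_i = 0$, so it is not a support vector and plays no part in either $\mathbf{w}_0$ or $b_0$.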