Slide 1

Slide 1 text

Yusuke Miyake / Pepabo R&D Institute, GMO Pepabo, Inc. 2018.03.28 Fukuoka.go#10 Implementing an Online Outlier Detection Engine, SmartSifter, in Go

Slide 2

Slide 2 text

Principal Engineer Yusuke Miyake @monochromegane GMO Pepabo, Inc. / Pepabo R&D Institute http://blog.monochromegane.com

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Anomaly Detection

Slide 5

Slide 5 text

Anomaly Detection

Slide 6

Slide 6 text

What counts as an anomaly in anomaly detection?
• Needless to say, the opposite of normal
• Normal means the usual state, so an anomaly is anything that is not the usual state
• It is not necessarily fraud or misuse
• Then what condition separates the usual state from everything else?
• And can the two even be separated cleanly in the first place?

Slide 7

Slide 7 text

Statistical anomaly detection
• Learn a probability model of the data-generating distribution from the input data, and use that model to score how anomalous each data point is, or how anomalously the model itself changes (a minimal sketch follows below).

Classification of anomaly detection in "Anomaly Detection with Data Mining" (Kenji Yamanishi), excerpt:
Outlier detection: input = multidimensional vectors, probability model = independent model, detection target = outliers
Change point detection: input = multidimensional time series, probability model = time series model, detection target = abrupt changes in the series
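As a rough illustration of this idea (not taken from the slides or the book), a minimal Go sketch could fit a one-dimensional Gaussian to past data and score new points by their negative log-likelihood; all names below are hypothetical.

package main

import (
	"fmt"
	"math"
)

// gaussian holds the parameters of a fitted 1-D normal distribution.
type gaussian struct{ mean, variance float64 }

// fit estimates mean and variance from past observations.
func fit(data []float64) gaussian {
	var sum, sq float64
	for _, v := range data {
		sum += v
	}
	mean := sum / float64(len(data))
	for _, v := range data {
		sq += (v - mean) * (v - mean)
	}
	return gaussian{mean: mean, variance: sq / float64(len(data))}
}

// score returns the negative log-likelihood of x under the fitted model;
// the higher the score, the more anomalous the point.
func (g gaussian) score(x float64) float64 {
	return 0.5*math.Log(2*math.Pi*g.variance) + (x-g.mean)*(x-g.mean)/(2*g.variance)
}

func main() {
	model := fit([]float64{9.8, 10.1, 10.0, 9.9, 10.2})
	fmt.Printf("score(10.0)=%.2f score(15.0)=%.2f\n", model.score(10.0), model.score(15.0))
}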

Slide 8

Slide 8 text

Anomaly detection
Outlier detection: assume an independent model and detect data points that are relatively singular
Change point detection: assume a time series model and detect abrupt changes

Slide 9

Slide 9 text

Anomaly detection
Outlier detection: assume an independent model and detect data points that are relatively singular

Slide 10

Slide 10 text

Online Outlier Detection Engine: SmartSifter. Proposed by Yamanishi, K., Takeuchi, J., Williams, G. et al. (2004)

Slide 11

Slide 11 text

Outlier detection and its challenges
• Fixed decision criteria that detect anomalies with thresholds or whitelists can only judge within the range of what is already known
• If an anomaly can be regarded as data that differs from the other, ordinary data, then unknown events should be detectable too -> outlier detection
• Simple statistical outlier detection (Mahalanobis distance, etc.) assumes that the data-generating mechanism does not change (see the sketch below)
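To make the last point concrete, here is a minimal, hedged sketch of a classic Mahalanobis-style score with a diagonal covariance; the function name and the simplification are my own, not from the slides or the referenced material.

package sketch

import "math"

// mahalanobisDiag is a sketch of the fixed-criterion approach mentioned above:
// the Mahalanobis distance of x from a mean estimated once up front, assuming
// a diagonal covariance (variances). Because mean and variances are never
// updated, the method implicitly assumes the data-generating mechanism is static.
func mahalanobisDiag(x, mean, variances []float64) float64 {
	var d2 float64
	for i := range x {
		diff := x[i] - mean[i]
		d2 += diff * diff / variances[i]
	}
	return math.Sqrt(d2)
}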

Slide 12

Slide 12 text

SmartSifter
• An online outlier detection engine
• Learns adaptively against a data-generating mechanism that changes over time, and scores the data
• Learning and scoring are performed sequentially, online, for every incoming data point (see the loop sketched below)
On-line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms. Proposed by Yamanishi, K., Takeuchi, J., Williams, G. et al. (2004) Refs: http://cs.fit.edu/~pkc/id/related/yamanishi-kdd00.pdf
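The per-point flow the bullets describe can be pictured as a score-then-update loop: each arriving point is scored against the model learned from the data seen so far, and only then folded into the model. The interface and names below are a hypothetical illustration, not the paper's notation.

package sketch

// onlineModel is a hypothetical interface standing in for SmartSifter's
// adaptive density model.
type onlineModel interface {
	Score(x []float64) float64 // anomaly score under the model learned so far
	Update(x []float64)        // discounted (forgetting) update with the new point
}

// sift scores every incoming point before folding it into the model,
// so each score reflects the model learned from data up to time t-1.
func sift(m onlineModel, stream <-chan []float64, scores chan<- float64) {
	for x := range stream {
		scores <- m.Score(x)
		m.Update(x)
	}
}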

Slide 13

Slide 13 text

SmartSifter (architecture)
Input: a pair ( x , y ), where x is a discrete-value vector and y is a continuous-value vector.
SDLE (Sequentially Discounting Laplace Estimation) learns p( x ) as a histogram density over cells.
SDEM (Sequentially Discounting Expectation and Maximizing) or SPDU (Sequentially Discounting Prototype Updating) learns p( y | x ) as a Gaussian mixture; a model exists for each cell identified by SDLE, and only the model of the cell the input falls into is updated.
The score is the logarithmic loss (or the Hellinger score): S_L(x_t, y_t) = -log p^(t-1)(x_t, y_t), computed with the model learned from the data up to time t-1 (sketched below).
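Putting the two learned densities together, the logarithmic loss above is just the negative log of their product; a minimal sketch (the function and argument names are assumptions, not the repository's API):

package sketch

import "math"

// logLossScore sketches the score on the slide:
// S_L(x_t, y_t) = -log p^(t-1)(x_t, y_t) = -log( p(x_t) * p(y_t | x_t) ).
// px would come from the SDLE histogram (the probability of the cell x falls
// into) and pyGivenX from that cell's Gaussian mixture learned by SDEM; both
// arguments are plain numbers here for illustration.
func logLossScore(px, pyGivenX float64) float64 {
	return -math.Log(px * pyGivenX)
}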

Slide 14

Slide 14 text

SmartSifter Refs: http://cs.fit.edu/~pkc/id/related/yamanishi-kdd00.pdf

Slide 15

Slide 15 text

Written in Golang!

Slide 16

Slide 16 text

monochromegane/go-smartsifter https://github.com/monochromegane/smartsifter

Slide 17

Slide 17 text

monochromegane/go-smartsifter

r := 0.1        // Discounting parameter.
alpha := 1.5    // Hyper parameter for continuous variables.
beta := 1.0     // Hyper parameter for categorical variables.
cellNum := 0    // Only continuous variables.
mixtureNum := 2 // Number of mixtures for GMM.
dim := 2        // Number of dimensions for GMM.
ss := smartsifter.NewSmartSifter(r, alpha, beta, cellNum, mixtureNum, dim)
logLoss := ss.Input(nil, []float64{0.1, 0.2}, true)
fmt.Printf("Score using logLoss: %f\n", logLoss)
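Continuing the snippet above, in a streaming setting one would presumably call Input once per arriving point and compare the returned log loss against a cut-off; the data and the threshold below are made up for illustration, and the true argument is passed exactly as on the slide.

threshold := 5.0 // made-up cut-off; in practice this needs tuning
for _, y := range [][]float64{{0.1, 0.2}, {0.2, 0.1}, {5.0, 5.0}} {
	score := ss.Input(nil, y, true) // nil: no discrete part, since cellNum == 0
	if score > threshold {
		fmt.Printf("possible outlier %v (score %.2f)\n", y, score)
	}
}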

Slide 18

Slide 18 text

SmartSifter

Slide 19

Slide 19 text

SmartSifter - SDLE
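The original slide shows the SDLE code itself; as a rough, hedged sketch of the idea only (not the repository's actual implementation), a sequentially discounting histogram update with Laplace smoothing might look like the following. Field names and the exact smoothing form are assumptions; see the paper for the precise update.

package sketch

// sdle sketches Sequentially Discounting Laplace Estimation:
// exponentially discounted counts per cell plus Laplace smoothing give p(x).
type sdle struct {
	r     float64   // discounting (forgetting) rate
	beta  float64   // Laplace smoothing hyper parameter
	count []float64 // discounted count per cell
}

// update discounts all old counts and adds the new observation's cell.
func (s *sdle) update(cell int) {
	for i := range s.count {
		s.count[i] *= 1 - s.r
	}
	s.count[cell]++
}

// prob returns the smoothed probability of a cell.
func (s *sdle) prob(cell int) float64 {
	total := 0.0
	for _, c := range s.count {
		total += c
	}
	k := float64(len(s.count))
	return (s.count[cell] + s.beta) / (total + k*s.beta)
}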

Slide 20

Slide 20 text

SmartSifter - SDEM
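Likewise, the slide shows the SDEM code; as a hedged sketch only (variable names, the 1-D simplification, and the omitted responsibility regularization are assumptions, not the repository's code), one discounted EM step for a single Gaussian component roughly follows this pattern.

package sketch

// sdemComponent sketches one Gaussian component tracked by SDEM
// (Sequentially Discounting Expectation and Maximizing), 1-D for brevity.
// Initialization of the discounted statistics is omitted.
type sdemComponent struct {
	weight, mu, muBar, sqBar float64
}

// sdemStep performs one discounted update for the component: gamma is the
// component's responsibility for the new point y, r is the discounting rate.
// Old statistics decay by (1-r), the new point enters with weight r*gamma,
// and mean and variance are re-derived from the discounted sufficient statistics.
func sdemStep(c *sdemComponent, y, gamma, r float64) (mean, variance float64) {
	c.weight = (1-r)*c.weight + r*gamma
	c.muBar = (1-r)*c.muBar + r*gamma*y
	c.sqBar = (1-r)*c.sqBar + r*gamma*y*y
	c.mu = c.muBar / c.weight
	return c.mu, c.sqBar/c.weight - c.mu*c.mu
}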

Slide 21

Slide 21 text

monochromegane/go-smartsifter
• Vector and matrix operations use gonum
• Discrete-value vectors are limited to one dimension for now
• The nonparametric SPDU and the Hellinger score are not implemented yet

Slide 22

Slide 22 text

How shall we put online outlier detection to use?
• In the source book it is used for network intrusion detection based on sent/received packet information, and as preprocessing for detecting suspicious medical data
• Looks effective for data you want to split by categorical variables whose data-generating characteristics differ
• e.g. in front of a database: parameter values and traffic volume per table or column
• Recommendation based on behavioral differences in ephemeral states, such as a single web request
• Tuning the optimal hyperparameters, such as the discounting (forgetting) coefficient, probably depends on experience
• Let's discuss!

Slide 23

Slide 23 text

We are actively hiring researchers! http://rand.pepabo.com/