In Search of an Adaptive Window: Survey and Implementation, Go Edition

Fukuoka.go#15 + Kagoshima Gophers (held online)
https://fukuokago.connpass.com/event/164350/

monochromegane

March 02, 2020

Transcript

  1. Yusuke Miyake / Pepabo R&D Institute, GMO Pepabo, Inc. 2020.03.02 Fukuoka.go#15 + Kagoshima Gophers

    In Search of an Adaptive Window: Survey and Implementation, Go Edition
  2. Principal engineer Yusuke MIYAKE @monochromegane Pepabo R&D Institute, GMO Pepabo, Inc. https://blog.monochromegane.com

  3. Received the Engineer Friendly City Fukuoka Award < Thank you!!

  4. Contents

    1. Introduction 2. ADWIN in Go 3. Exponential Histograms in Go 4. Summary
  5. 1. Introduction

  6. Time series data and Windowing

    • Time series data: a value is observed at every time step t
    • Examples: server access frequency per minute, click-through rate per hour, etc.
    Past up to the present: 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0 | Future: ?, ?, ?, ?, ?, ?, ?, …
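  7. Time series data and Windowing

    • Keeping and analyzing every observation of an endless time series is not realistic
    • Instead, keep and analyze a window that holds the data of the most recent n steps
    Past: 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0 | Future: ?, ?, ?, ?, ?, ?, ?, …
    With n=7 at t=15, the window covers t=9 to t=15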
  8. Time series data and Windowing

    • Keeping and analyzing every observation of an endless time series is not realistic
    • Instead, keep and analyze a window that holds the data of the most recent n steps
    1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0 | ?, ?, ?, ?, ?, ?, ?, …
    With n=7 at t=15, the window covers t=9 to t=15
    At t=16 it slides forward and covers t=10 to t=16
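A fixed-size sliding window like the one on slides 7 and 8 can be sketched in a few lines of Go. This is a minimal illustration, not code from the talk; the type and method names (`SlidingWindow`, `Add`, `Mean`) are made up for the example.

```go
package main

import "fmt"

// SlidingWindow keeps only the most recent n observations.
// The names here are illustrative and do not come from the talk.
type SlidingWindow struct {
	n      int
	values []float64
	sum    float64
}

func NewSlidingWindow(n int) *SlidingWindow {
	return &SlidingWindow{n: n}
}

// Add appends x and evicts the oldest value once more than n values are held.
func (w *SlidingWindow) Add(x float64) {
	w.values = append(w.values, x)
	w.sum += x
	if len(w.values) > w.n {
		w.sum -= w.values[0]
		w.values = w.values[1:]
	}
}

// Len returns how many values are currently in the window.
func (w *SlidingWindow) Len() int { return len(w.values) }

// Mean returns the average of the values in the window (0 when empty).
func (w *SlidingWindow) Mean() float64 {
	if len(w.values) == 0 {
		return 0
	}
	return w.sum / float64(len(w.values))
}

func main() {
	// The 15 observations shown on the slide, with n = 7.
	stream := []float64{1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0}
	w := NewSlidingWindow(7)
	for _, x := range stream {
		w.Add(x)
	}
	fmt.Println(w.Len(), w.Mean())
}
```

After the 15th observation the window holds t=9 to t=15, so the mean is taken over six 1s and one 0.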
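  9. What is the optimal window size?

    • The dilemma of choosing a window size
    • For example, when using the mean of the values in the window:
    • If the window is too small, past information is lost
    • If the window is too large, the most recent information loses its weight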
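  10. Kaburaya autoscaler

    • The proposed autoscaling control system [1] uses the response time per server as its performance metric
    • Over what period should the average response time be taken? = choosing the window size
    • Moving average?
    • Weighted moving average?
    • Exponential moving average?
    [1] Yusuke Miyake, Kentaro Kuribayashi, "Kaburaya AutoScaler: An Autonomous Adaptive Autoscaling Control System Designed for Operability across Multiple Environments" (in Japanese), Proceedings of the Internet and Operation Technology Symposium, pp. …, Nov. …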
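For reference, the three candidates named on this slide can be sketched as follows. This is only an illustration of the textbook definitions; the function names and the linear weighting used for the weighted average are assumptions, not details from the Kaburaya paper. All three assume a non-empty slice ordered oldest first.

```go
package main

import "fmt"

// SMA is the simple moving average: every value in the window counts equally.
func SMA(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// WMA is a linearly weighted moving average: the oldest value gets weight 1
// and the newest gets weight len(xs). The linear weights are an assumption.
func WMA(xs []float64) float64 {
	num, den := 0.0, 0.0
	for i, x := range xs {
		w := float64(i + 1)
		num += w * x
		den += w
	}
	return num / den
}

// EMA is the exponential moving average with smoothing factor alpha in (0, 1].
// Older values decay geometrically instead of being cut off by a window edge.
func EMA(xs []float64, alpha float64) float64 {
	s := xs[0]
	for _, x := range xs[1:] {
		s = alpha*x + (1-alpha)*s
	}
	return s
}

func main() {
	// Hypothetical response times in milliseconds, ending with a spike.
	rt := []float64{100, 100, 100, 300}
	fmt.Println(SMA(rt), WMA(rt), EMA(rt, 0.5))
}
```

All three still need a tuning choice (window length, weights, or alpha), which is the dilemma the talk addresses next.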
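  11. Adaptive windowing

    • The dilemma of choosing a window size
    • If the window is too small, past information is lost
    • If the window is too large, the most recent information loses its weight
    • While nothing changes, keep information from as long a period as possible
    • When a change occurs, discard past information and give priority to recent information
    Today's approach: Adaptive Windowing (an adaptive window)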
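  12. Why Go?

    • An algorithm in a paper cannot be understood just by reading it (at least not by me)
    • Implementations in other languages end up being read side by side with the paper anyway, which takes about as long as implementing it yourself
    • Implementing it yourself greatly broadens your understanding of the paper
    • Once it is implemented, converting it to the language best suited to the service is easy
    • In Go, it is relatively easy to build a parallel simulator, for example to average the results of simulations that involve random numbers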
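The last point can be illustrated with goroutines and `sync.WaitGroup`. The sketch below is an assumption about what such a parallel simulator might look like, not the speaker's code: each trial gets its own seeded generator so that the averaged result is reproducible, and `simulate` is a stand-in for a real experiment.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// simulate runs one trial: the mean of `steps` draws from N(0.8, 0.01).
// It stands in for a real simulation; the body is purely illustrative.
func simulate(seed int64, steps int) float64 {
	rng := rand.New(rand.NewSource(seed)) // one generator per trial, no sharing
	sum := 0.0
	for i := 0; i < steps; i++ {
		sum += rng.NormFloat64()*0.01 + 0.8
	}
	return sum / float64(steps)
}

// averageTrials runs the trials concurrently, one goroutine per trial,
// and averages their results in a fixed order.
func averageTrials(trials, steps int) float64 {
	results := make([]float64, trials)
	var wg sync.WaitGroup
	for i := 0; i < trials; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = simulate(int64(i), steps) // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	total := 0.0
	for _, r := range results {
		total += r
	}
	return total / float64(trials)
}

func main() {
	fmt.Println(averageTrials(8, 1000))
}
```

Because every goroutine writes to a distinct slice element and the final sum runs after `wg.Wait()`, no mutex is needed.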
  13. 2. ADWIN in Go

  14. ADWIN: Adaptive Windowing Algorithm

    • “Learning from time-changing data with adaptive windowing”, Bifet, Albert, and Ricard Gavalda; Proceedings of the 2007 SIAM international conference on data mining. Society for Industrial and Applied Mathematics, 2007
    • ADWIN: an algorithm that determines the optimal window size dynamically, on the premise that the distribution of the time series changes over time
    • ADWIN2: introduces a data structure called exponential histograms to keep computation and memory usage down even when the window grows large
  15. ADWIN in Go https://github.com/monochromegane/adwin • ADWIN implemented in Go

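  16. ADWIN: Adaptive Windowing Algorithm

    • Random numbers with mean 0.8 and standard deviation 0.01 are generated as the time series
    • Partway through, the mean is changed to 0.4
    • Watch how the window size changes:
    ① Stable period: the window size grows
    ② Change point: the window size shrinks
    ③ Stable period: the window size grows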
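Input like the one described on this slide can be generated as below. This is a sketch under assumptions: the series length, the seed, and the function names are made up, and only the means (0.8, then 0.4) and the standard deviation (0.01) come from the slide.

```go
package main

import (
	"fmt"
	"math/rand"
)

// generate returns n draws from N(mean1, sd) followed by n draws from N(mean2, sd).
func generate(seed int64, n int, mean1, mean2, sd float64) []float64 {
	rng := rand.New(rand.NewSource(seed))
	out := make([]float64, 0, 2*n)
	for i := 0; i < 2*n; i++ {
		mean := mean1
		if i >= n {
			mean = mean2 // the distribution changes halfway through
		}
		out = append(out, rng.NormFloat64()*sd+mean)
	}
	return out
}

// average returns the arithmetic mean of xs.
func average(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	data := generate(1, 1000, 0.8, 0.4, 0.01)
	fmt.Println(average(data[:1000]), average(data[1000:]))
}
```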
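  17. How to achieve adaptive windowing?

    ① Detecting the change point
    • Split the window into a past part and a present part, and look for the point where the tendency of the values changed
    • > Whenever two “large enough” sub windows of W exhibit “distinct enough” average,
    • An approach in the spirit of statistical hypothesis testing
    Figure: a window such as 1, 1, 0, 1 is cut at each position into an older and a newer sub-window (1 | 1, 0, 1; 1, 1 | 0, 1; …)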
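  18. How to achieve adaptive windowing?

    ① Detecting the change point
    • For example, think of comparing the two sub-windows with an unpaired t-test (the actual method is more involved; see Appendix A of the paper)
    11.01, 12.02, 11.01, 12.02, 11.01 vs. 11.00, 12.01, 11.02, 12.01, 11.02: t = 0.0057
    11.00, 12.01, 11.02, 12.01, 11.02 vs. 9.92, 9.01, 9.82, 9.01, 9.12: t = 6.4083
    * t-value for … degrees of freedom: …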
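The two t-values on this slide can be reproduced with a small function. The sketch computes the two-sample t statistic in its Welch form, which coincides with the pooled (Student) form here because both samples have five values; the function names are illustrative. As a reference point not taken from the slide, the pooled test on 5 + 5 values has 8 degrees of freedom, and the standard two-sided 5% critical value for that is 2.306.

```go
package main

import (
	"fmt"
	"math"
)

// meanVar returns the sample mean and the unbiased sample variance of xs.
func meanVar(xs []float64) (mean, variance float64) {
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	for _, x := range xs {
		d := x - mean
		variance += d * d
	}
	variance /= float64(len(xs) - 1)
	return mean, variance
}

// tStatistic returns the absolute unpaired two-sample t statistic (Welch form).
func tStatistic(a, b []float64) float64 {
	ma, va := meanVar(a)
	mb, vb := meanVar(b)
	se := math.Sqrt(va/float64(len(a)) + vb/float64(len(b)))
	return math.Abs(ma-mb) / se
}

func main() {
	s1 := []float64{11.01, 12.02, 11.01, 12.02, 11.01}
	s2 := []float64{11.00, 12.01, 11.02, 12.01, 11.02}
	s3 := []float64{9.92, 9.01, 9.82, 9.01, 9.12}
	fmt.Println(tStatistic(s1, s2)) // nearly identical samples: tiny t
	fmt.Println(tStatistic(s2, s3)) // shifted mean: large t
}
```

The first pair stays far below the critical value, so no change is reported; the second pair exceeds it clearly.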
  19. How to achieve adaptive windowing?

    ② Shrinking the window
    • When a change point is detected, drop one element from the oldest end of the window
    • Repeat until no change is detected
    11.00, 12.01, 11.02, 12.01, 11.02 | 9.92, 9.01, 9.82, 9.01, 9.12
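Steps ① and ② together give a naive adaptive window. The sketch below is not the code from the author's repository. Instead of the t-test used for explanation on the slides, it uses the Hoeffding-style threshold that, to my understanding, the ADWIN paper gives for values in [0, 1]: ε_cut = sqrt(ln(4n/δ) / (2m)), with m the harmonic-mean term 1/(1/n0 + 1/n1). Treat both the formula as transcribed here and the names (`AdaptiveWindow`, `delta`) as assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// AdaptiveWindow is a naive ADWIN-style window backed by a plain slice.
// Values are assumed to lie in [0, 1].
type AdaptiveWindow struct {
	delta  float64 // confidence parameter (assumed name)
	values []float64
}

// changed reports whether some split into an older and a newer sub-window
// has means that differ by more than the cut threshold.
func (w *AdaptiveWindow) changed() bool {
	n := len(w.values)
	if n < 2 {
		return false
	}
	total := 0.0
	for _, v := range w.values {
		total += v
	}
	logTerm := math.Log(4 * float64(n) / w.delta)
	head := 0.0 // running sum of the older sub-window
	for i := 1; i < n; i++ {
		head += w.values[i-1]
		n0, n1 := float64(i), float64(n-i)
		m := 1 / (1/n0 + 1/n1)
		epsCut := math.Sqrt(logTerm / (2 * m))
		if math.Abs(head/n0-(total-head)/n1) > epsCut {
			return true
		}
	}
	return false
}

// Add appends x, then drops the oldest value for as long as a change is detected.
func (w *AdaptiveWindow) Add(x float64) {
	w.values = append(w.values, x)
	for w.changed() {
		w.values = w.values[1:]
	}
}

func (w *AdaptiveWindow) Len() int { return len(w.values) }

func (w *AdaptiveWindow) Mean() float64 {
	sum := 0.0
	for _, v := range w.values {
		sum += v
	}
	return sum / float64(len(w.values))
}

func main() {
	w := &AdaptiveWindow{delta: 0.01}
	for i := 0; i < 1000; i++ {
		w.Add(0.8)
	}
	fmt.Println("stable:", w.Len(), w.Mean())
	for i := 0; i < 1000; i++ {
		w.Add(0.4)
	}
	fmt.Println("after change:", w.Len(), w.Mean())
}
```

Every insertion scans all split points, so the cost per step grows with the window, which is the problem the following slides address.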
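  20. ADWIN: Adaptive Windowing Algorithm

    • If nothing changes, the window size keeps growing
    • In proportion to the window size:
      • Memory usage increases
      • Computation increases (the number of window splits and the number of tests)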
  21. 3. Exponential Histograms in Go

  22. Exponential histograms: data structure for sliding windows

    • “Maintaining Stream Statistics over Sliding Windows”, M.Datar, A.Gionis, P.Indyk, R.Motwani; ACM-SIAM, 2002
    • A data structure for large windows
    • Past data is summarized into buckets to reduce memory and computation
    • In exchange, aggregates over the values become approximations
    • Pure exponential histograms handle only time series of Bits (0 or 1) or Positive Integers (ADWIN2 extends them to handle real numbers as well)
  23. Exponential histograms: data structure for sliding windows

    Time  Bit  Buckets (timestamps)
    1     1    1
    2     1    1 2
    3     1    1 2 3 → 2 3 (merge)
    4     1    2 3 4
    5     0    2 3 4 (ignore)
    6     1    2 3 4 6 → 2 4 6 (merge)
    7     1    2 4 6 7
    8     1    2 4 6 7 8 → 2 4 7 8 → 4 7 8 (merge, then merge again)
    Merge size: …
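  24. Exponential histograms: data structure for sliding windows

    • A bucket's timestamp means that, between it and the timestamp of the previous bucket, there are as many 1s as the bucket's size
    • A bucket whose timestamp falls outside the window size (an expired bucket) is deleted
    • In practice, two variables are kept up to date: TOTAL (the sum of the sizes of the unexpired buckets) and LAST (the largest size, 2^i, among the unexpired buckets)
    • The count can be approximated by TOTAL - LAST/2 (the absolute error is LAST/2)
    • The parameter Epsilon controls how much error is tolerated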
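The bucket rules above can be sketched as follows for a stream of bits. This is a simplified illustration, not the code in the author's repository: the number of buckets allowed per size is set directly through a hypothetical `maxPerSize` field, whereas the paper derives it from Epsilon, and TOTAL and LAST are recomputed on demand rather than maintained incrementally.

```go
package main

import "fmt"

// bucket summarizes a run of 1s.
type bucket struct {
	timestamp int // arrival time of the most recent 1 in the bucket
	size      int // number of 1s summarized (a power of two)
}

// ExpHist is a simplified exponential histogram over a bit stream.
type ExpHist struct {
	window     int // window size in time steps
	maxPerSize int // merge when more buckets than this share a size (assumed knob)
	now        int
	buckets    []bucket // oldest first
}

// Add advances time by one step and records the bit.
func (h *ExpHist) Add(bit int) {
	h.now++
	// Delete expired buckets.
	for len(h.buckets) > 0 && h.buckets[0].timestamp <= h.now-h.window {
		h.buckets = h.buckets[1:]
	}
	if bit == 0 {
		return // zeros are ignored
	}
	h.buckets = append(h.buckets, bucket{timestamp: h.now, size: 1})
	// Cascade merges from the smallest size upward.
	for size := 1; ; size *= 2 {
		first, count := -1, 0
		for i, b := range h.buckets {
			if b.size == size {
				if first < 0 {
					first = i
				}
				count++
			}
		}
		if count <= h.maxPerSize {
			return
		}
		// Merge the two oldest buckets of this size; keep the newer timestamp.
		h.buckets[first+1].size = size * 2
		h.buckets = append(h.buckets[:first], h.buckets[first+1:]...)
	}
}

// Total is TOTAL: the sum of all bucket sizes.
func (h *ExpHist) Total() int {
	t := 0
	for _, b := range h.buckets {
		t += b.size
	}
	return t
}

// Estimate approximates the number of 1s in the window as TOTAL - LAST/2,
// where LAST is the size of the oldest (largest) bucket.
func (h *ExpHist) Estimate() float64 {
	if len(h.buckets) == 0 {
		return 0
	}
	return float64(h.Total()) - float64(h.buckets[0].size)/2
}

func main() {
	h := &ExpHist{window: 100, maxPerSize: 2}
	for _, bit := range []int{1, 1, 1, 1, 0, 1, 1, 1} {
		h.Add(bit)
	}
	fmt.Println(h.buckets, h.Total(), h.Estimate())
}
```

With at most two buckets per size, the eight bits from slide 23 end with bucket timestamps 4, 7, 8, matching the trace on that slide.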
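  25. Exponential histograms: data structure for sliding windows

    • The paper also covers Positive Integers
    • Put simply, a value of N is handled as N data insertions
    • The computational cost therefore grows with the largest integer that can occur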
  26. Exponential Histograms in Go https://github.com/monochromegane/exponential-histograms

    • Exponential Histograms (Bits, Positive Integers) implemented in Go
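  27. ADWIN2: Adaptive Windowing using ExpHist

    • The window is changed from an array to (a variant of) ExpHist
    • Real numbers are stored, and each bucket holds their sum
    • The test for a difference between sub-windows is reduced to comparing the oldest bucket (= the longest period) with everything else
    • When a change is detected, that oldest bucket is deleted and the step ends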
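The description on this slide can be turned into a rough sketch. It follows the slide's wording (compare only the oldest bucket with the rest, drop it once), not the author's repository, which has not been checked here. The threshold is the same Hoeffding-style ε_cut assumed for values in [0, 1], and `maxPerSize` and `delta` are hypothetical parameter names.

```go
package main

import (
	"fmt"
	"math"
)

// bucket summarizes `count` consecutive real values by their sum.
type bucket struct {
	sum   float64
	count int
}

// Window is an ADWIN2-style window: new values enter as size-1 buckets and
// older ones are merged into buckets whose counts are powers of two.
type Window struct {
	delta      float64
	maxPerSize int
	buckets    []bucket // oldest first
}

func (w *Window) totals() (sum float64, n int) {
	for _, b := range w.buckets {
		sum += b.sum
		n += b.count
	}
	return sum, n
}

func (w *Window) Len() int { _, n := w.totals(); return n }

func (w *Window) Mean() float64 {
	sum, n := w.totals()
	if n == 0 {
		return 0
	}
	return sum / float64(n)
}

// compress merges the two oldest buckets of a size whenever more than
// maxPerSize buckets share that size, cascading to larger sizes.
func (w *Window) compress() {
	for size := 1; ; size *= 2 {
		first, count := -1, 0
		for i, b := range w.buckets {
			if b.count == size {
				if first < 0 {
					first = i
				}
				count++
			}
		}
		if count <= w.maxPerSize {
			return
		}
		w.buckets[first+1].sum += w.buckets[first].sum
		w.buckets[first+1].count += w.buckets[first].count
		w.buckets = append(w.buckets[:first], w.buckets[first+1:]...)
	}
}

// Add inserts x, compares the oldest bucket with everything newer, and
// drops that bucket once if the two means differ by more than epsCut.
func (w *Window) Add(x float64) {
	w.buckets = append(w.buckets, bucket{sum: x, count: 1})
	w.compress()
	if len(w.buckets) < 2 {
		return
	}
	sum, n := w.totals()
	old := w.buckets[0]
	n0, n1 := float64(old.count), float64(n-old.count)
	m := 1 / (1/n0 + 1/n1)
	epsCut := math.Sqrt(math.Log(4*float64(n)/w.delta) / (2 * m))
	if math.Abs(old.sum/n0-(sum-old.sum)/n1) > epsCut {
		w.buckets = w.buckets[1:]
	}
}

func main() {
	w := &Window{delta: 0.01, maxPerSize: 2}
	for i := 0; i < 1000; i++ {
		w.Add(0.8)
	}
	for i := 0; i < 1000; i++ {
		w.Add(0.4)
	}
	fmt.Println(w.Len(), len(w.buckets), w.Mean())
}
```

The window now holds a logarithmic number of buckets instead of every value, at the price of dropping old data a whole bucket at a time.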
  28. ADWIN2 in Go https://github.com/monochromegane/adwin • ADWIN2 implemented in Go

  29. ADWIN2: Adaptive Windowing using ExpHist. With ExpHist, data is removed from the window in batches, but that does not seem to be a problem in this example

  30. 4. Summary

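  31. Summary

    • Implemented ADWIN and Exponential Histograms in Go in order to understand the algorithms in the papers
    • Introduced ADWIN, which realizes an adaptive window
    • Introduced Exponential Histograms, which realize large sliding windows with little memory
    • Introduced ADWIN2, which combines the two
    • Implementing them in Go (probably) helps widen the range of people who can understand the algorithms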
  32. None