
Fundamentals and Practice of Anomaly Detection: Anomaly Detection with the Normal Distribution

tsurubee
September 20, 2017

Theory of anomaly detection based on the one-dimensional normal distribution, and an implementation in Python


Transcript

  1. Fundamentals and Practice of Anomaly Detection

    Math Study Group for Programmers @ Fukuoka

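  2. Self-introduction
    Tsuruta
    Twitter: @tsurubee
    Engineering experience: … months
    Career
    Earned a master's degree in chemistry
    Became a firefighter (fire brigade, ambulance crew, fire investigation team)
    Changed careers to IT engineer (… onward)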

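  3. Contents
    ~ Fundamentals ~
    - What is anomaly detection?
    - Applications of anomaly detection
    - Examples of anomalous data
    - Approaches to anomaly detection
    - The idea behind statistical anomaly detection
    ~ Practice ~
    - Anomaly detection with Hotelling's theory
    - Implementation in Python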

  4. What is anomaly detection?

  5. A technique for detecting data that behaves differently from the vast majority of the data

  6. A technique for detecting data that behaves differently from the vast majority of the data
    A mountain of data
    Data mining
    Regularities / Anomalies

  7. Applications of anomaly detection

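  8. Applications of anomaly detection
    Security: early detection of computer viruses and DoS attacks
    Machinery: detecting early signs of failure
    Marketing: detecting trends, discovering new topics, detecting changes in user behavior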

  9. Examples of anomalous data

  10. Anomalous data example 1

  11. Anomalous data example 1
    Anomaly!

  12. Anomalous data example 1
    Anomaly!
    if value > 120:
        print('ERROR!')
    Looks like an if statement could detect this!
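
The rule on this slide can be applied to a whole series in a few lines of Python. This is only a minimal sketch: the limit of 120 comes from the slide, while the helper name `find_over_limit` and the sample readings are hypothetical and made up for illustration.

```python
def find_over_limit(values, limit=120):
    """Return (index, value) pairs for every value above the fixed limit."""
    return [(i, v) for i, v in enumerate(values) if v > limit]


# Hypothetical readings, made up for illustration only.
readings = [98, 101, 97, 135, 102, 99]

for index, value in find_over_limit(readings):
    print(f"ERROR! index={index}, value={value}")
```

A fixed limit like this works when a single spike is the only kind of anomaly, which is the case the slide shows.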

  13. Anomalous data example 2

  14. Anomalous data example 2
    Something anomalous is happening
    Anomalies come in many kinds

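  15. Classification of anomaly detection
    Anomaly detection
    Outlier detection: detects values that differ greatly from the others (independent model)
    Change-point detection: detects abrupt changes in behavior (time-series model)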

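  16. Classification of anomaly detection
    Anomaly detection
    Outlier detection: detects values that differ greatly from the others (independent model)
    Change-point detection: detects abrupt changes in behavior (time-series model)
    This talk is about this one!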

  17. Approaches to anomaly detection

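  18. Approaches to anomaly detection
    Note: the categories partly overlap.
    Type of approach: example methods
    Rule-based approach
    Distance-based approach: k-nearest neighbors, singular spectrum transformation, etc.
    Statistical approach: Hotelling's theory, kernel density estimation, etc.
    Machine-learning approach: support vector machines (SVM), neural networks, etc.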

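  19. Approaches to anomaly detection
    The knowledge required differs somewhat from approach to approach.
    Note: the categories partly overlap.
    Type of approach: example methods
    Rule-based approach
    Distance-based approach: k-nearest neighbors, singular spectrum transformation, etc.
    Statistical approach: Hotelling's theory, kernel density estimation, etc.
    Machine-learning approach: support vector machines (SVM), neural networks, etc.
    This time it's this one!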

  20. In other words, this talk is about
    "outlier detection by statistical anomaly detection"

  21. The idea behind statistical anomaly detection

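  22. The idea behind statistical anomaly detection
    Statistical anomaly detection assumes that the observed data are generated from a specific probability model (probability distribution).
    Population (true distribution) -> sampling -> sample -> learning -> distribution under normal conditions
    Build a model of what is normal from the data, and treat whatever deviates from it as anomalous.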

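  23. Basic steps of statistical anomaly detection
    Input data -> learn the probability model -> compute scores -> output
    1. Learn a probability model of data generation from the observed data
       ① Assume a probability distribution with unknown parameters
       ② Estimate the unknown parameters from the data
    2. Score the degree of anomaly based on the learned model
    3. Set a threshold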
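
The three steps can be sketched as three small functions. This is an illustrative skeleton, not the speaker's code: it assumes a normal distribution as the probability model and the negative log-likelihood as the score (both choices appear later in the deck), and the function names, the training values, and the score threshold of 5.0 are all hypothetical.

```python
import numpy as np
from scipy import stats


def fit_normal(train):
    """Step 1: assume a normal distribution and estimate its parameters."""
    train = np.asarray(train, dtype=float)
    return train.mean(), train.std()  # maximum-likelihood estimates (ddof=0)


def anomaly_score(x, mu, sigma):
    """Step 2: score observations by their negative log-likelihood."""
    return -stats.norm.logpdf(np.asarray(x, dtype=float), loc=mu, scale=sigma)


def is_anomalous(scores, score_threshold):
    """Step 3: flag observations whose score exceeds the threshold."""
    return scores > score_threshold


# Hypothetical data and threshold, for illustration only.
train = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
mu, sigma = fit_normal(train)
scores = anomaly_score([10.05, 14.0], mu, sigma)
flags = is_anomalous(scores, score_threshold=5.0)
print(flags)
```

Keeping the steps separate makes it easy to swap the probability model or the thresholding rule without touching the rest.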

  24. ~ Practice ~
    Anomaly detection with Hotelling's theory

  25. Data used
    Davis data: people's sex, measured height and weight, and self-reported height and weight
    https://vincentarelbundock.github.io/Rdatasets/datasets.html

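  26. Data used
    Davis data: people's sex, measured height and weight, and self-reported height and weight
    https://vincentarelbundock.github.io/Rdatasets/datasets.html
    We will work carefully through the theory for a single variable.
    Measured weight only!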

  27. Visualizing the data
    Distribution of the people's weight data

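  28. Visualizing the data
    Distribution of the people's weight data
    Outlier
    Outlier
    How can these be detected at a statistically objective level?
    Statistical anomaly detection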

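  29. Basic steps of statistical anomaly detection (repeated)
    Input data -> learn the probability model -> compute scores -> output
    1. Learn a probability model of data generation from the observed data
       ① Assume a probability distribution with unknown parameters
       ② Estimate the unknown parameters from the data
    2. Score the degree of anomaly based on the learned model
    3. Set a threshold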

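  30. 1. Learn a probability model of data generation from the observed data
    * Assume that the data contain no anomalous observations or, if they do, that their influence is negligible.
    When there are $N$ observations, the data are written collectively as $\mathcal{D}$.
    ① Assume a probability distribution with unknown parameters
    Assume a normal distribution as the probability model (Hotelling's theory):
    $$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$$
    $\mu$: mean
    $\sigma^2$: variance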

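  31. 1. Learn a probability model of data generation from the observed data
    ② Estimate the unknown parameters from the data
    Estimate the mean $\mu$ and the variance $\sigma^2$ from the observed data by maximum likelihood estimation.
    If the $N$ observations $x^{(1)}, \dots, x^{(N)}$ are mutually independent, the likelihood function is
    $$p(\mathcal{D} \mid \mu, \sigma^2) = \prod_{n=1}^{N} \mathcal{N}(x^{(n)} \mid \mu, \sigma^2)$$
    For computational convenience, we maximize the log-likelihood function, the natural logarithm of the likelihood function.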

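  32. 1. Learn a probability model of data generation from the observed data
    Writing the log-likelihood function as $L(\mu, \sigma^2 \mid \mathcal{D})$, for $N$ observations $x^{(1)}, \dots, x^{(N)}$ it is
    $$L(\mu, \sigma^2 \mid \mathcal{D}) = -\frac{N}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{n=1}^{N}\left(x^{(n)} - \mu\right)^2$$
    To find the maximizing parameters, take the partial derivatives with respect to $\mu$ and $\sigma^2$ and set each to zero:
    $$\frac{\partial L}{\partial \mu} = 0, \qquad \frac{\partial L}{\partial \sigma^2} = 0$$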
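
The step of differentiating and setting to zero can be written out as follows. This is a sketch of the standard maximum-likelihood calculation for a normal distribution; the symbols $N$, $x^{(n)}$, $\mathcal{D}$ and $L$ are notational assumptions, since the slide's own formulas are not preserved in the transcript.

```latex
\begin{align*}
L(\mu, \sigma^2 \mid \mathcal{D})
  &= -\frac{N}{2}\ln(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{n=1}^{N}\left(x^{(n)} - \mu\right)^2 \\
\frac{\partial L}{\partial \mu}
  &= \frac{1}{\sigma^2}\sum_{n=1}^{N}\left(x^{(n)} - \mu\right) = 0
  \;\Longrightarrow\;
  \hat{\mu} = \frac{1}{N}\sum_{n=1}^{N} x^{(n)} \\
\frac{\partial L}{\partial \sigma^2}
  &= -\frac{N}{2\sigma^2}
     + \frac{1}{2\sigma^4}\sum_{n=1}^{N}\left(x^{(n)} - \mu\right)^2 = 0
  \;\Longrightarrow\;
  \hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N}\left(x^{(n)} - \hat{\mu}\right)^2
\end{align*}
```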

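  33. 1. Learn a probability model of data generation from the observed data
    $$\hat{\mu} = \frac{1}{N}\sum_{n=1}^{N} x^{(n)}, \qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N}\left(x^{(n)} - \hat{\mu}\right)^2$$
    This means adopting the sample mean and sample variance of the $N$ observations $x^{(n)}$ as the estimates of the mean and variance.
    * A hat (^) is added to make explicit that these are estimates.
    This yields the learned probability model (predictive distribution):
    $$p(x \mid \mathcal{D}) = \mathcal{N}(x \mid \hat{\mu}, \hat{\sigma}^2)$$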
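
In NumPy, these estimates are just the mean and the variance with the default divisor. The sketch below uses hypothetical observations and checks that the hand-written sums agree with `np.mean` and `np.var`; note that `np.var` divides by N by default (`ddof=0`), which is the maximum-likelihood estimate, not the unbiased one.

```python
import numpy as np

# Hypothetical observations, for illustration only.
x = np.array([58.0, 61.0, 60.0, 59.0, 62.0])

n = x.size
mu_hat = x.sum() / n                        # sample mean
sigma2_hat = ((x - mu_hat) ** 2).sum() / n  # sample variance, dividing by N

# np.mean and np.var (default ddof=0) compute the same quantities.
assert np.isclose(mu_hat, np.mean(x))
assert np.isclose(sigma2_hat, np.var(x))

# The unbiased estimate divides by N - 1 instead and is slightly larger.
unbiased = np.var(x, ddof=1)
print(mu_hat, sigma2_hat, unbiased)
```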

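  34. 2. Score the degree of anomaly based on the learned model
    The negative log-likelihood is adopted as the definition of the anomaly score.
    For a new observation $x'$, with estimated mean $\hat{\mu}$ and variance $\hat{\sigma}^2$, it is
    $$-\ln \mathcal{N}(x' \mid \hat{\mu}, \hat{\sigma}^2) = \frac{(x' - \hat{\mu})^2}{2\hat{\sigma}^2} + \frac{1}{2}\ln(2\pi\hat{\sigma}^2)$$
    The second term does not depend on the observation $x'$, so it is ignored. Multiplying the whole by 2 gives the anomaly score
    $$a(x') = \left(\frac{x' - \hat{\mu}}{\hat{\sigma}}\right)^2$$
    Numerator: the size of the deviation from the sample mean
    Denominator: data with a large spread to begin with are treated leniently for modest deviations
    Close to intuition!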
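
A vectorized sketch of the score follows, assuming it is the squared deviation from the sample mean divided by the sample variance (the same expression the deck's `hotelling_1d` code computes). The function name `anomaly_scores` and the data are hypothetical.

```python
import numpy as np


def anomaly_scores(train, new_values):
    """Score new values as (x' - mu_hat)**2 / sigma2_hat."""
    train = np.asarray(train, dtype=float)
    mu_hat = train.mean()
    sigma2_hat = train.var()  # ddof=0, the maximum-likelihood estimate
    new_values = np.asarray(new_values, dtype=float)
    return (new_values - mu_hat) ** 2 / sigma2_hat


# Hypothetical training data with mean 60 and variance 2.
train = [58.0, 61.0, 60.0, 59.0, 62.0]
scores = anomaly_scores(train, [60.0, 62.0, 66.0])
print(scores)
```

The score grows with the squared distance from the mean, and the same distance counts for less when the training data are more spread out.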

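  35. 3. Set a threshold
    Setting a threshold on the anomaly score makes anomaly judgments possible!
    But we want to choose the threshold on as objective a basis as possible...
    Anomaly score in Hotelling's theory: the probability distribution that the anomaly score follows can be derived explicitly!
    In other words, anomalies can be judged by percentile values!
    For example: "If something is so rare that only … in … people show it, let's judge it anomalous."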

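  36. 3. Set a threshold
    When the number of data points is sufficiently large, the anomaly score $a(x')$ follows a chi-square distribution with 1 degree of freedom.
    Chi-square distribution with $n$ degrees of freedom: the distribution of
    $$\chi^2 = Z_1^2 + Z_2^2 + \dots + Z_n^2$$
    $Z_i$: independent random variables following the standard normal distribution $\mathcal{N}(0, 1)$
    For example, what is the probability that a value of $a(x') = 2.0$ can occur?
    Chi-square distribution with 1 degree of freedom: integrate this region!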
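
The integral in question can be evaluated with SciPy. This sketch assumes the region meant is the upper tail, that is, the probability of a score of 2.0 or more under a chi-square distribution with 1 degree of freedom. As a cross-check it uses the definition on this slide: with one degree of freedom the score is the square of a single standard normal variable, so the same probability is the two-sided normal tail beyond the square root of 2.0.

```python
import math

from scipy import stats

score = 2.0

# Upper-tail probability of the chi-square distribution (1 degree of freedom).
p_chi2 = stats.chi2.sf(score, df=1)

# Same probability via the standard normal: P(Z**2 >= score) = 2 * P(Z >= sqrt(score)).
p_normal = 2 * stats.norm.sf(math.sqrt(score))

print(f"P(a >= {score}) = {p_chi2:.4f} (chi-square), {p_normal:.4f} (normal)")
```

A score of 2.0 is therefore fairly ordinary, occurring in roughly 16% of cases under the model.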

  37. Implementation in Python

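  38. What we want to build
    A function that takes the data and a threshold for the anomaly score as input, and returns the anomalous values and their index numbers as output.
    Language: Python
    Libraries: NumPy; SciPy (used to compute integrals of the chi-square distribution)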

  39. import numpy as np
    from scipy import stats


    def hotelling_1d(data, threshold):
        """
        Parameters
        ----------
        data : Numpy array
        threshold : float

        Returns
        -------
        List of tuples where each tuple contains index number
        and anomalous value.
        """
        # Convert raw data into the degree of abnormality
        avg = np.average(data)
        var = np.var(data)
        data_abn = [(x - avg)**2 / var for x in data]

        # Set the threshold of abnormality: upper end of the equal-tailed
        # chi-square (1 degree of freedom) interval covering 1 - threshold
        abn_th = stats.chi2.interval(1-threshold, 1)[1]

        # Abnormality determination
        result = []
        for (index, x) in enumerate(data_abn):
            if x > abn_th:
                result.append((index, data[index]))
        return result
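
One detail of the cutoff is worth checking. `stats.chi2.interval` returns an equal-tailed interval, so its upper end leaves `threshold / 2` in the upper tail, not `threshold`. The sketch below compares it with the one-sided quantile; presenting the one-sided form as an alternative is a suggestion made here, not something stated in the deck.

```python
from scipy import stats

threshold = 0.01

# Upper end of the equal-tailed interval holding 1 - threshold of the mass.
upper_interval = stats.chi2.interval(1 - threshold, 1)[1]
# The same number written as a quantile: the upper tail is threshold / 2.
upper_two_sided = stats.chi2.ppf(1 - threshold / 2, 1)
# One-sided alternative: the upper tail is exactly threshold.
upper_one_sided = stats.chi2.ppf(1 - threshold, 1)

print(upper_interval, upper_two_sided, upper_one_sided)
```

With a threshold of 0.01 the interval-based cutoff is therefore somewhat stricter (about 7.88) than the one-sided 1% cutoff (about 6.63).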

  40. Implementation in Python
    hotelling_1d(data, 0.01)
    #-> [(11, 166), (20, 119)]
    Threshold (the second argument, 0.01)

  41. Implementation in Python
    hotelling_1d(data, 0.01)
    #-> [(11, 166), (20, 119)]
    Threshold (the second argument, 0.01)
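
For readers without the Davis data at hand, the same procedure can be tried on a small made-up series. The sketch below is a vectorized variant of `hotelling_1d` written for illustration; the name `hotelling_1d_vectorized` and the data are hypothetical, and the cutoff rule is copied from the slide's function.

```python
import numpy as np
from scipy import stats


def hotelling_1d_vectorized(data, threshold):
    """Return (index, value) pairs whose anomaly score exceeds the cutoff."""
    data = np.asarray(data, dtype=float)
    scores = (data - data.mean()) ** 2 / data.var()
    # Same cutoff rule as the slide's hotelling_1d.
    cutoff = stats.chi2.interval(1 - threshold, 1)[1]
    return [(int(i), float(data[i])) for i in np.flatnonzero(scores > cutoff)]


# Hypothetical data: twenty ordinary values and one extreme value at the end.
data = [59, 61] * 10 + [100]
print(hotelling_1d_vectorized(data, 0.01))
```

Only the last value is reported, since its score is far above the cutoff while the others stay well below 1.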

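  42. Limits of Hotelling's theory
    Because it assumes that the observations are mutually independent and that each follows a single normal distribution:
    1. It is hard to apply to complex systems with multiple modes
    2. It is hard to apply to time-series data whose values change dynamically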
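
The first limitation can be seen on synthetic two-mode data. In this sketch (hypothetical data, with a one-sided 1% chi-square cutoff chosen for illustration), a value halfway between two clusters, where no observation lies, receives the lowest anomaly score of all because it sits at the mean of the single fitted normal distribution.

```python
import numpy as np
from scipy import stats

# Hypothetical two-mode data: one cluster near 0, another near 10.
data = np.concatenate([np.linspace(-1, 1, 50), np.linspace(9, 11, 50)])

mu_hat, sigma2_hat = data.mean(), data.var()
cutoff = stats.chi2.ppf(0.99, df=1)

# A value halfway between the clusters, where no training data lies.
midpoint_score = (5.0 - mu_hat) ** 2 / sigma2_hat
train_scores = (data - mu_hat) ** 2 / sigma2_hat

print(midpoint_score, train_scores.min(), cutoff)
```

A model with more than one component, or a density estimate without a single-mode assumption, would be needed to flag such a point.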

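  43. Summary
    Hotelling's theory assumes that the observed data follow a normal distribution.
    Because the anomaly score defined this way follows a chi-square distribution, anomalies can be judged with a threshold on the score computed from the chi-square distribution.
    Anomaly detection for multidimensional and time-series data will be covered on another occasion (PyFukuoka, or a data science study group?)!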