Fundamentals and Practice of Anomaly Detection: Anomaly Detection with the Normal Distribution

tsurubee
September 20, 2017

Theory of anomaly detection based on the one-dimensional normal distribution, with an implementation in Python

Transcript

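  1. Contents
     Fundamentals
     - What is anomaly detection?
     - Application examples of anomaly detection
     - Examples of anomalous data
     - Approaches to anomaly detection
     - The idea behind statistical anomaly detection
     Practice
     - Anomaly detection based on Hotelling's theory
     - Implementation in Python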
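  2. Basic steps of statistical anomaly detection
     Flow: input data → learn a probabilistic model → compute scores → output
     Step 1: Learn a probabilistic model of the data-generating process from the observed data
       ① Assume a probability distribution that contains unknown parameters
       ② Estimate the unknown parameters from the data
     Step 2: Score the degree of anomaly based on the learned model
     Step 3: Set the threshold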
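  3. (Recap) Basic steps of statistical anomaly detection
     Flow: input data → learn a probabilistic model → compute scores → output
     Step 1: Learn a probabilistic model of the data-generating process from the observed data
       ① Assume a probability distribution that contains unknown parameters
       ② Estimate the unknown parameters from the data
     Step 2: Score the degree of anomaly based on the learned model
     Step 3: Set the threshold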
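     For the one-dimensional normal case that this deck covers, the three steps can be written out as below. This is a sketch in notation chosen here for illustration (the symbols x_n, N, a(x') and a_th are assumptions, not taken from the slides). The estimators use the 1/N form, which matches the default behavior of np.var.

     ```latex
     % Step 1: assume a 1-D normal model and estimate its parameters by maximum likelihood
     x \sim \mathcal{N}(\mu, \sigma^2), \qquad
     \hat{\mu} = \frac{1}{N}\sum_{n=1}^{N} x_n, \qquad
     \hat{\sigma}^2 = \frac{1}{N}\sum_{n=1}^{N} (x_n - \hat{\mu})^2

     % Step 2: anomaly score of an observation x'
     a(x') = \frac{(x' - \hat{\mu})^2}{\hat{\sigma}^2}

     % Step 3: for N \gg 1 the score approximately follows a chi-squared
     % distribution with 1 degree of freedom, so the threshold a_th is taken
     % as a chosen upper quantile of that distribution
     a(x') \sim \chi^2(1) \ \text{(approximately)}, \qquad
     x' \text{ is flagged as anomalous if } a(x') > a_{\mathrm{th}}
     ```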
  4.
     import numpy as np
     from scipy import stats

     def hotelling_1d(data, threshold):
         """
         Parameters
         ----------
         data : Numpy array
         threshold : float

         Returns
         -------
         List of tuples where each tuple contains index number and anomalous value.
         """
         # Convert raw data into the degree of abnormality
         avg = np.average(data)
         var = np.var(data)
         data_abn = [(x - avg)**2 / var for x in data]

         # Set the threshold of abnormality.
         # interval() is equal-tailed, so [1] is the chi-squared quantile at 1 - threshold/2.
         abn_th = stats.chi2.interval(1-threshold, 1)[1]

         # Abnormality determination
         result = []
         for (index, x) in enumerate(data_abn):
             if x > abn_th:
                 result.append((index, data[index]))
         return result
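A minimal usage sketch for the detector on slide 4. To stay self-contained, the block restates the function in vectorized form under the hypothetical name `hotelling_1d_vectorized`, keeping the same threshold rule as the slide. The synthetic data, the random seed, and the injected outlier at index 120 are assumptions made for illustration and do not come from the deck.

```python
import numpy as np
from scipy import stats


def hotelling_1d_vectorized(data, threshold):
    """Vectorized restatement of the slide's detector (hypothetical name).

    Returns a list of (index, value) pairs whose anomaly score exceeds
    the chi-squared threshold.
    """
    data = np.asarray(data, dtype=float)
    avg = data.mean()
    var = data.var()  # population variance (ddof=0), the np.var default
    scores = (data - avg) ** 2 / var
    # Same cut-off as the slide: upper end of the equal-tailed interval,
    # i.e. the chi-squared(1) quantile at 1 - threshold/2.
    abn_th = stats.chi2.interval(1 - threshold, 1)[1]
    idx = np.flatnonzero(scores > abn_th)
    return [(int(i), float(data[i])) for i in idx]


# Synthetic data (assumption): 200 draws from N(50, 2^2) plus one injected outlier.
rng = np.random.default_rng(0)
data = rng.normal(loc=50.0, scale=2.0, size=200)
data[120] = 75.0

anomalies = hotelling_1d_vectorized(data, threshold=0.01)
print(anomalies)
```

Because the mean and variance are estimated from the same data that is being scored, a large outlier inflates the variance and slightly lowers every score. If a strictly one-sided tail probability equal to `threshold` is wanted, `stats.chi2.ppf(1 - threshold, 1)` is the corresponding cut-off.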