
Machine Learning for Anomaly Detection

TB
August 09, 2020

Talk @ Data Science Indonesia (DSI Chapter Riau 2020/08/09)

Transcript

  1. Data Science Indonesia - DSI Riau, T. Budianto, 2020/08/09. Machine Learning for Anomaly Detection: Concept and Application
  2. • Concept • Defining anomaly detection • Detection approaches • Technical section • Algorithms • Application domains • Conclusion Contents
  3. Fishing Analogy: a fishing company. Fish in our criteria: PASS!! Control and monitoring team: only fish in some criteria allowed: PASS!! Help this guy by using an anomaly detection system.
  4. Defining anomaly detection • Dataset: { x(1), x(2), … , x(m) } • Fish features: x_1: color, x_2: shape, … • New fish: x_test → OK or Anomaly. Figure: anomalies in a two-dimensional data set; the region containing the majority of observations is the normal data. Use case: preserving fish and sea creatures. Adapted from Prof. Andrew Ng (Coursera)
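The density-estimation view from Andrew Ng's course can be sketched in a few lines: fit a Gaussian to each feature of the normal data and flag x_test as anomalous when its estimated density p(x) falls below a threshold ε. This is a minimal illustration, not code from the talk; the fish feature values and the choice of ε are made up.

```python
import math

def fit_gaussian(data):
    """Estimate per-feature mean and variance from normal training data."""
    m, n = len(data), len(data[0])
    mu = [sum(x[j] for x in data) / m for j in range(n)]
    var = [sum((x[j] - mu[j]) ** 2 for x in data) / m for j in range(n)]
    return mu, var

def density(x, mu, var):
    """p(x) as a product of independent univariate Gaussians."""
    p = 1.0
    for j in range(len(x)):
        p *= math.exp(-(x[j] - mu[j]) ** 2 / (2 * var[j])) \
             / math.sqrt(2 * math.pi * var[j])
    return p

def is_anomaly(x, mu, var, epsilon):
    return density(x, mu, var) < epsilon

# Normal fish described by (color, shape) features, clustered around (1.0, 2.0)
normal = [(1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.0, 2.1), (1.05, 1.95)]
mu, var = fit_gaussian(normal)
print(is_anomaly((1.0, 2.0), mu, var, 1e-3))  # False: typical fish
print(is_anomaly((5.0, 9.0), mu, var, 1e-3))  # True: far from the normal region
```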
  5. Anomaly vs Novelty Novelty detection is the identification of novel (new) or previously unobserved patterns. Any data point outside the decision threshold is considered an anomaly.
  6. • Removing anomalies to train a more reliable model. • Obtaining critical information from the data. • Minimizing error by providing an early alarm system. Why find anomalies?
  7. • Supervised • Training data labeled with “normal” and “anomaly”. • Similar to classification with high class imbalance. • Semi-supervised • Training uses only “normal” data; test data is contaminated with “anomaly” points. • Unsupervised • No label assumption. • Based on the assumption that anomalies are very rare compared to normal data. Detection approaches Goldstein, Markus, and Seiichi Uchida. "A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data." PloS one 11.4 (2016): e0152173.
  8. • Point anomalies • An individual data instance is anomalous with respect to the data. • Contextual anomalies • An individual data instance is anomalous within a context. • Also referred to as conditional anomalies*. • Collective anomalies • A collection of related data instances is anomalous. • Requires a relationship among data instances. • The individual instances within a collective anomaly are not anomalous by themselves. Structure of anomalies * Xiuyao Song, Mingxi Wu, Christopher Jermaine, Sanjay Ranka, Conditional Anomaly Detection, IEEE Transactions on Data and Knowledge Engineering, 2006.
  9. • Label – Each test instance is given a normal or anomaly label. – This is especially true of classification-based approaches. • Score – Each test instance is assigned an anomaly score. • Allows the output to be ranked. • Requires an additional threshold parameter. Output of anomaly detection
  10. • Given a dataset D, find all the data points x ∈ D with anomaly scores greater than some threshold t. • Given a dataset D, find all the data points x ∈ D having the top-n largest anomaly scores. Variants of the anomaly detection problem
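Both variants are one-liners once anomaly scores exist. A minimal sketch (the scores below are invented for illustration):

```python
def above_threshold(scores, t):
    """Variant 1: indices of points whose anomaly score exceeds threshold t."""
    return [i for i, s in enumerate(scores) if s > t]

def top_n(scores, n):
    """Variant 2: indices of the n points with the largest anomaly scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n]

scores = [0.1, 0.9, 0.3, 0.8, 0.2]
print(above_threshold(scores, 0.5))  # [1, 3]
print(top_n(scores, 2))              # [1, 3]
```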
  11. • Accuracy is not a sufficient metric for evaluation. • A trivial classifier that labels everything with the normal class can achieve 99.9% accuracy!! Evaluation of anomaly detection • Focus on both recall and precision – Recall (R) = TP/(TP + FN) – Precision (P) = TP/(TP + FP) • F-measure = 2*R*P/(R + P) Recall: proportion of actual positives identified correctly. Precision: proportion of positive identifications actually correct.
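These metrics follow directly from the confusion-matrix counts; a small sketch with made-up labels (1 = anomaly, 0 = normal):

```python
def evaluate(y_true, y_pred):
    """Compute recall, precision and F-measure, treating 'anomaly' as positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * recall * precision / (recall + precision)
    return recall, precision, f_measure

y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # 3 anomalies among 8 points
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]  # 2 caught, 1 missed, 1 false alarm
r, p, f = evaluate(y_true, y_pred)
print(r, p, f)  # recall = precision = F = 2/3
```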
  12. Evaluation of anomaly detection Figure: ROC curves for different outlier detection techniques (detection rate vs. false alarm rate; an ideal ROC curve approaches the top-left corner, maximizing AUC). • Standard measures for evaluating anomaly detection problems. • Recall (detection rate) – ratio between the number of correctly detected anomalies and the total number of anomalies. • False alarm (false positive) rate – ratio between the number of data records from the normal class that are misclassified as anomalies and the total number of data records from the normal class. • The ROC curve is a trade-off between detection rate and false alarm rate. • It tells how capable the model is of distinguishing between classes.
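The ROC curve and its AUC can be computed by sweeping a threshold over the anomaly scores; a small pure-Python sketch with invented labels and scores:

```python
def roc_points(y_true, scores):
    """Sweep the threshold over all scores; return (false_alarm_rate, detection_rate) pairs."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(y == 1 and s >= t for y, s in zip(y_true, scores))
        fp = sum(y == 0 and s >= t for y, s in zip(y_true, scores))
        pts.append((fp / neg, tp / pos))
    return pts

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

y_true = [0, 0, 1, 1]            # 1 = anomaly
scores = [0.1, 0.4, 0.35, 0.8]   # anomaly scores from some detector
print(auc(roc_points(y_true, scores)))  # 0.75
```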
  13. • SVMs are based on the concept of finding a boundary that maximizes the margin between two classes. • In one-class problems, information about the anomaly class is unavailable. • Tackles the absence of anomaly-class data by maximizing the boundary with respect to the origin. Anomaly detection by a One-class SVM (OCSVM)
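As an illustration (not the speaker's code), scikit-learn's `OneClassSVM` implements this idea: fit on normal data only, then `predict` returns +1 for inliers and -1 for outliers. The toy "fish feature" data below is made up:

```python
from sklearn.svm import OneClassSVM

# Train on "normal" data only; no anomaly labels are needed.
normal = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [1.0, 2.1],
          [1.05, 1.95], [0.95, 2.05], [1.02, 2.0], [0.98, 1.98]]
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(normal)

# predict() returns +1 for inliers (normal) and -1 for outliers (anomalies).
print(ocsvm.predict([[1.0, 2.0]]))  # a typical normal point
print(ocsvm.predict([[5.0, 9.0]]))  # far from all training data
```

The `nu` parameter upper-bounds the fraction of training points allowed outside the boundary, which is how the method controls its tolerance for contamination.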
  14. Anomaly detection using Autoencoder • A dimensionality-reduction-based approach to anomaly detection. Figure: the histogram of residuals for normal (blue) and anomalous (red) pixels. Baur, Christoph, et al. "Deep autoencoding models for unsupervised anomaly segmentation in brain MR images." International MICCAI Brainlesion Workshop. Springer, Cham, 2018.
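Training a full autoencoder is beyond a snippet, but the reconstruct-and-compare idea can be illustrated with PCA, the linear analogue of an autoencoder: "encode" by projecting onto a learned subspace, "decode" by projecting back, and use the residual (reconstruction error) as the anomaly score. The synthetic data below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Normal data lies near a 1-D line inside 2-D space.
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t]) + 0.05 * rng.normal(size=(200, 2))

# "Encoder": project onto the top principal component; "decoder": project back.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
V = Vt[:1].T                      # 2x1 basis of the learned subspace

def residual(x):
    """Reconstruction error: large for points off the normal subspace."""
    z = (x - mean) @ V            # encode to 1-D
    x_hat = z @ V.T + mean        # decode back to 2-D
    return float(np.linalg.norm(x - x_hat))

print(residual(np.array([1.0, 2.0])))   # small: consistent with normal data
print(residual(np.array([2.0, -1.0])))  # large: off the subspace, anomalous
```

A neural autoencoder replaces the linear projection with a learned nonlinear encoder/decoder, but the anomaly score is the same residual.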
  15. Anomaly detection by One-class neural nets • An autoencoder is trained to obtain representative features of the input. • The encoder of the pre-trained autoencoder is copied and used as input to a feed-forward network with one hidden layer. Chalapathy, Raghavendra, Aditya Krishna Menon, and Sanjay Chawla. "Anomaly detection using one-class neural networks." arXiv preprint arXiv:1802.06360 (2018).
  16. • Generative adversarial networks for anomaly detection. • Train generator G and discriminator D through a minimax game. • Detect anomalies by measuring residual loss and discriminator loss. Anomaly detection by GAN Schlegl, Thomas, et al. "Unsupervised anomaly detection with generative adversarial networks to guide marker discovery." International conference on information processing in medical imaging. Springer, Cham, 2017.
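In Schlegl et al.'s AnoGAN, the anomaly score combines the two losses as a weighted sum, A(x) = (1 − λ)·R(x) + λ·D(x) with λ = 0.1. A trivial sketch of that combination (the loss values below are invented):

```python
def anomaly_score(residual_loss, discrimination_loss, lam=0.1):
    """AnoGAN-style score: weighted sum of residual and discriminator losses.

    residual_loss: ||x - G(z)||, how well the generator reconstructs x.
    discrimination_loss: feature-matching loss from the discriminator D.
    lam: weight between the two terms (0.1 in Schlegl et al.).
    """
    return (1 - lam) * residual_loss + lam * discrimination_loss

print(anomaly_score(0.2, 0.1))  # low score: well reconstructed, likely normal
print(anomaly_score(5.0, 3.0))  # high score: poorly reconstructed, anomalous
```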
  17. • Intrusion detection • Fraud detection • Anomalies in healthcare • Malware detection • Time series and video anomaly detection Application domains
  18. • Defining a representative normal region. • The boundary between normal and abnormal behavior is often not precisely defined. • Availability of training samples. • Imbalanced datasets. • Normal data behavior, and the library of normal examples, is also evolving. Challenges
  19. • Anomaly detection can help in finding critical information in data. • The anomaly detection problem depends on the application domain. • Different problems and different data availability require different approaches. • More advanced methods have been proposed using deep learning techniques for complex, high-dimensional data. Conclusion
  20. • Chalapathy, Raghavendra, and Sanjay Chawla. "Deep learning for anomaly detection: A survey." arXiv preprint arXiv:1901.03407 (2019). • Baur, Christoph, et al. "Deep autoencoding models for unsupervised anomaly segmentation in brain MR images." International MICCAI Brainlesion Workshop. Springer, Cham, 2018. • Chandola, Varun, Arindam Banerjee, and Vipin Kumar. "Anomaly detection: A survey." ACM Computing Surveys (CSUR) 41.3 (2009): 1-58. • Goldstein, Markus, and Seiichi Uchida. "A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data." PloS one 11.4 (2016): e0152173. • Andrew Ng, "Machine Learning Course", Coursera (accessed in 2020). • Nick Radcliffe, "Anomaly Detection", PyData London, 2018. • Sanjay Chawla and Varun Chandola, "Anomaly Detection: A Tutorial", 2011. References