
Machine Learning for Anomaly Detection

TB
August 09, 2020

Talk @ Data Science Indonesia (DSI Chapter Riau 2020/08/09)

Transcript

  1. Data Science Indonesia - DSI Riau, T. Budianto, 2020/08/09. Machine Learning for Anomaly Detection: Concept and Application
  2. • Concept • Defining anomaly detection • Detection approaches • Technical section • Algorithms • Application domains • Conclusion Contents
  3. Fishing Analogy: a fishing company. Fish in our criteria: PASS!! Control and monitoring team: only fish in some criteria allowed: PASS!! Help this guy by using an anomaly detection system.
  4. Defining anomaly detection • Dataset: { x(1), x(2), … , x(m) } • Fish features: x_1: color, x_2: shape, … • New fish: x_test → OK or Anomaly. Figure: anomalies in a two-dimensional data set; the region containing the majority of observations is the normal data. Use case: preserving fish and sea creatures. Adapted from Prof. Andrew Ng (Coursera)
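The density-estimation view from Andrew Ng's course can be sketched in a few lines: fit a Gaussian to each feature of the normal data and flag x_test as anomalous when its estimated density p(x) falls below a threshold ε. This is a minimal illustration, not code from the talk; the fish feature values and the choice of ε are made up.

```python
import math

def fit_gaussian(data):
    """Estimate per-feature mean and variance from normal training data."""
    m, n = len(data), len(data[0])
    mu = [sum(x[j] for x in data) / m for j in range(n)]
    var = [sum((x[j] - mu[j]) ** 2 for x in data) / m for j in range(n)]
    return mu, var

def density(x, mu, var):
    """p(x) as a product of independent univariate Gaussians."""
    p = 1.0
    for j in range(len(x)):
        p *= math.exp(-(x[j] - mu[j]) ** 2 / (2 * var[j])) \
             / math.sqrt(2 * math.pi * var[j])
    return p

def is_anomaly(x, mu, var, epsilon):
    return density(x, mu, var) < epsilon

# Normal fish described by (color, shape) features, clustered around (1.0, 2.0)
normal = [(1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.0, 2.1), (1.05, 1.95)]
mu, var = fit_gaussian(normal)
print(is_anomaly((1.0, 2.0), mu, var, 1e-3))  # False: typical fish
print(is_anomaly((5.0, 9.0), mu, var, 1e-3))  # True: far from the normal region
```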
  5. Anomaly vs Novelty Novelty detection is the identification of novel (new) or previously unobserved patterns. Any data point outside the decision threshold is considered an anomaly.
  6. • Removing anomalies to train a more reliable model. • Obtaining critical information from the data. • Minimizing error by providing an early alarm system. Why find anomalies?
  7. • Supervised • Training data labeled with “normal” and “anomaly”. • Similar to classification with high class imbalance. • Semi-supervised • Training uses only “normal” data; test data is contaminated with “anomaly” points. • Unsupervised • No label assumption. • Based on the assumption that anomalies are very rare compared to normal data. Detection approaches Goldstein, Markus, and Seiichi Uchida. "A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data." PloS one 11.4 (2016): e0152173.
  8. • Point anomalies • An individual data instance is anomalous with respect to the data. • Contextual anomalies • An individual data instance is anomalous within a context. • Also referred to as conditional anomalies*. • Collective anomalies • A collection of related data instances is anomalous. • Requires a relationship among data instances. • The individual instances within a collective anomaly are not anomalous by themselves. Structure of anomalies * Xiuyao Song, Mingxi Wu, Christopher Jermaine, Sanjay Ranka, Conditional Anomaly Detection, IEEE Transactions on Data and Knowledge Engineering, 2006.
  9. • Label – Each test instance is given a normal or anomaly label. – This is especially true of classification-based approaches. • Score – Each test instance is assigned an anomaly score. • Allows the output to be ranked. • Requires an additional threshold parameter. Output of anomaly detection
  10. • Given a dataset D, find all the data points x ∈ D with anomaly scores greater than some threshold t. • Given a dataset D, find all the data points x ∈ D having the top-n largest anomaly scores. Variants of the anomaly detection problem
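Both variants are one-liners once anomaly scores exist. A minimal sketch (the scores below are invented for illustration):

```python
def above_threshold(scores, t):
    """Variant 1: indices of points whose anomaly score exceeds threshold t."""
    return [i for i, s in enumerate(scores) if s > t]

def top_n(scores, n):
    """Variant 2: indices of the n points with the largest anomaly scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n]

scores = [0.1, 0.9, 0.3, 0.8, 0.2]
print(above_threshold(scores, 0.5))  # [1, 3]
print(top_n(scores, 2))              # [1, 3]
```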
  11. • Accuracy is not a sufficient metric for evaluation. • A trivial classifier that labels everything with the normal class can achieve 99.9% accuracy!! Evaluation of anomaly detection • Focus on both recall and precision – Recall (R) = TP/(TP + FN) – Precision (P) = TP/(TP + FP) • F-measure = 2*R*P/(R + P) Recall: proportion of actual positives identified correctly. Precision: proportion of positive identifications actually correct.
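These metrics follow directly from the confusion-matrix counts; a small sketch with made-up labels (1 = anomaly, 0 = normal):

```python
def evaluate(y_true, y_pred):
    """Compute recall, precision and F-measure, treating 'anomaly' as positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * recall * precision / (recall + precision)
    return recall, precision, f_measure

y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # 3 anomalies among 8 points
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]  # 2 caught, 1 missed, 1 false alarm
r, p, f = evaluate(y_true, y_pred)
print(r, p, f)  # recall = precision = F = 2/3
```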
  12. Evaluation of anomaly detection Figure: ROC curves for different outlier detection techniques (detection rate vs. false alarm rate; an ideal ROC curve approaches the top-left corner, maximizing AUC). • Standard measures for evaluating anomaly detection problems. • Recall (detection rate) – ratio between the number of correctly detected anomalies and the total number of anomalies. • False alarm (false positive) rate – ratio between the number of data records from the normal class that are misclassified as anomalies and the total number of data records from the normal class. • The ROC curve is a trade-off between detection rate and false alarm rate. • It tells how capable the model is of distinguishing between classes.
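The ROC curve and its AUC can be computed by sweeping a threshold over the anomaly scores; a small pure-Python sketch with invented labels and scores:

```python
def roc_points(y_true, scores):
    """Sweep the threshold over all scores; return (false_alarm_rate, detection_rate) pairs."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(y == 1 and s >= t for y, s in zip(y_true, scores))
        fp = sum(y == 0 and s >= t for y, s in zip(y_true, scores))
        pts.append((fp / neg, tp / pos))
    return pts

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

y_true = [0, 0, 1, 1]            # 1 = anomaly
scores = [0.1, 0.4, 0.35, 0.8]   # anomaly scores from some detector
print(auc(roc_points(y_true, scores)))  # 0.75
```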
  13. • SVMs are based on the concept of finding a boundary that maximizes the margin between two classes. • In one-class problems, information about the anomaly class is unavailable. • Tackles the absence of anomaly-class data by maximizing the boundary with respect to the origin. Anomaly detection by a One-class SVM (OCSVM)
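As an illustration (not the speaker's code), scikit-learn's `OneClassSVM` implements this idea: fit on normal data only, then `predict` returns +1 for inliers and -1 for outliers. The toy "fish feature" data below is made up:

```python
from sklearn.svm import OneClassSVM

# Train on "normal" data only; no anomaly labels are needed.
normal = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [1.0, 2.1],
          [1.05, 1.95], [0.95, 2.05], [1.02, 2.0], [0.98, 1.98]]
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(normal)

# predict() returns +1 for inliers (normal) and -1 for outliers (anomalies).
print(ocsvm.predict([[1.0, 2.0]]))  # a typical normal point
print(ocsvm.predict([[5.0, 9.0]]))  # far from all training data
```

The `nu` parameter upper-bounds the fraction of training points allowed outside the boundary, which is how the method controls its tolerance for contamination.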
  14. Anomaly detection using Autoencoder • A dimensionality-reduction-based approach to anomaly detection. Figure: the histogram of residuals for normal (blue) and anomalous (red) pixels. Baur, Christoph, et al. "Deep autoencoding models for unsupervised anomaly segmentation in brain MR images." International MICCAI Brainlesion Workshop. Springer, Cham, 2018.
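Training a full autoencoder is beyond a snippet, but the reconstruct-and-compare idea can be illustrated with PCA, the linear analogue of an autoencoder: "encode" by projecting onto a learned subspace, "decode" by projecting back, and use the residual (reconstruction error) as the anomaly score. The synthetic data below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Normal data lies near a 1-D line inside 2-D space.
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t]) + 0.05 * rng.normal(size=(200, 2))

# "Encoder": project onto the top principal component; "decoder": project back.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
V = Vt[:1].T                      # 2x1 basis of the learned subspace

def residual(x):
    """Reconstruction error: large for points off the normal subspace."""
    z = (x - mean) @ V            # encode to 1-D
    x_hat = z @ V.T + mean        # decode back to 2-D
    return float(np.linalg.norm(x - x_hat))

print(residual(np.array([1.0, 2.0])))   # small: consistent with normal data
print(residual(np.array([2.0, -1.0])))  # large: off the subspace, anomalous
```

A neural autoencoder replaces the linear projection with a learned nonlinear encoder/decoder, but the anomaly score is the same residual.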
  15. Anomaly detection by One-class neural nets • An autoencoder is trained to obtain representative features of the input. • The encoder of the pre-trained autoencoder is copied and used as input to a feed-forward network with one hidden layer. Chalapathy, Raghavendra, Aditya Krishna Menon, and Sanjay Chawla. "Anomaly detection using one-class neural networks." arXiv preprint arXiv:1802.06360 (2018).
  16. • Generative adversarial networks for anomaly detection. • Train generator G and discriminator D through a minimax game. • Detect anomalies by measuring residual loss and discriminator loss. Anomaly detection by GAN Schlegl, Thomas, et al. "Unsupervised anomaly detection with generative adversarial networks to guide marker discovery." International conference on information processing in medical imaging. Springer, Cham, 2017.
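In Schlegl et al.'s AnoGAN, the anomaly score combines the two losses as a weighted sum, A(x) = (1 − λ)·R(x) + λ·D(x) with λ = 0.1. A trivial sketch of that combination (the loss values below are invented):

```python
def anomaly_score(residual_loss, discrimination_loss, lam=0.1):
    """AnoGAN-style score: weighted sum of residual and discriminator losses.

    residual_loss: ||x - G(z)||, how well the generator reconstructs x.
    discrimination_loss: feature-matching loss from the discriminator D.
    lam: weight between the two terms (0.1 in Schlegl et al.).
    """
    return (1 - lam) * residual_loss + lam * discrimination_loss

print(anomaly_score(0.2, 0.1))  # low score: well reconstructed, likely normal
print(anomaly_score(5.0, 3.0))  # high score: poorly reconstructed, anomalous
```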
  17. • Intrusion detection • Fraud detection • Anomalies in healthcare • Malware detection • Time series and video anomaly detection Application domains
  18. • Defining a representative normal region. • The boundary between normal and abnormal behavior is often not precisely defined. • Availability of training samples. • Imbalanced datasets. • Normal data behavior, and the library of normal examples, is also evolving. Challenges
  19. • Anomaly detection can help in finding critical information in data. • The anomaly detection problem depends on the application domain. • Different problems and different data availability require different approaches. • More advanced methods have been proposed using deep learning techniques for complex, high-dimensional data. Conclusion
  20. • Chalapathy, Raghavendra, and Sanjay Chawla. "Deep learning for anomaly detection: A survey." arXiv preprint arXiv:1901.03407 (2019). • Baur, Christoph, et al. "Deep autoencoding models for unsupervised anomaly segmentation in brain MR images." International MICCAI Brainlesion Workshop. Springer, Cham, 2018. • Chandola, Varun, Arindam Banerjee, and Vipin Kumar. "Anomaly detection: A survey." ACM Computing Surveys (CSUR) 41.3 (2009): 1-58. • Goldstein, Markus, and Seiichi Uchida. "A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data." PloS one 11.4 (2016): e0152173. • Andrew Ng, "Machine Learning Course", Coursera (accessed in 2020). • Nick Radcliffe, "Anomaly Detection", PyData London, 2018. • Sanjay Chawla and Varun Chandola, "Anomaly Detection: A Tutorial", 2011. References