Anomaly Detection. Part 1 – Basics

Slide 1

Slide 1 text

AI Testing Talks
BUILD SOFTWARE TO TEST SOFTWARE
exactpro.com

ANOMALY DETECTION FOR AI TESTING
Lecture 1. Anomaly Detection Basics
20 MAY | 10.00 GET | 11.30 SLST
Rostislav Yavorski, Head of Research, Exactpro

Slide 2

Slide 2 text

Anomaly, also known as outlier or novelty
● Data points, events, or observations that deviate from normal behaviour
● Instances or collections of data that occur very rarely in the data set
● Observations which appear to be inconsistent with the remainder of the data

Slide 3

Slide 3 text

Challenges in Anomaly Detection
● Defining normal behaviour is extremely challenging
● Noise in the data is not an anomaly
● The definition of an anomaly is domain-specific
● Anomalies evolve over time
● Getting a set of labeled anomalous instances is difficult

Slide 4

Slide 4 text

Sometimes anomalies are discarded as waste
● To compute the mean or standard deviation
● To improve linear regression models for better predictions
● To boost the performance of machine learning algorithms

Slide 5

Slide 5 text

Sometimes anomalies are most desirable
● fraud detection
● fault detection
● system health monitoring
● event detection in sensor networks
● detecting ecosystem disturbances
● defect detection in images
● medical diagnosis
● law enforcement
● cyber-security intrusion detection

Slide 6

Slide 6 text

#1 Mean and Standard Deviation

Slide 8

Slide 8 text

Computing Mean and Standard Deviation

Name       Salary
Maria      41
Jose       43
Ahmed      44
Anna       45
Carlos     45
Patricia   46

$$M = \frac{41 + 43 + 44 + 45 + 45 + 46}{6} = 44$$

$$\sigma = \sqrt{\frac{3^2 + 1^2 + 0^2 + 1^2 + 1^2 + 2^2}{6}} \approx 1.6$$

Slide 10

Slide 10 text

Computing Mean and Standard Deviation

Name             Salary
Joe the Intern   9
Maria            41
Jose             43
Ahmed            44
Anna             45
Carlos           45
Patricia         46

$$M = \frac{41 + 43 + 44 + 45 + 45 + 46 + 9}{7} = 39$$

$$\sigma = \sqrt{\frac{2^2 + 4^2 + 5^2 + 6^2 + 6^2 + 7^2 + 30^2}{7}} \approx 12.3$$

Slide 11

Slide 11 text

Computing Mean and Standard Deviation

Name             Salary
Joe the Intern   9
Maria            41
Jose             43
Ahmed            44
Anna             45
Carlos           45
Patricia         46

$$M = \frac{41 + 43 + 44 + 45 + 45 + 46 + 9}{7} = 39$$

$$\sigma = \sqrt{\frac{2^2 + 4^2 + 5^2 + 6^2 + 6^2 + 7^2 + 30^2}{7}} \approx 12.3$$

● Joe the Intern's salary (9): outlier, very rare value
● M and σ: meaningless results, hardly interpretable
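
The effect of the single outlier can be reproduced with a few lines of Python. This is a minimal sketch, assuming the slide uses the population standard deviation (division by n, which is what the formula on the slide shows); the variable and function names are illustrative only.

```python
from statistics import mean, pstdev

# Salaries from the slides (units are not stated in the source).
team = [41, 43, 44, 45, 45, 46]
team_with_intern = team + [9]  # adds Joe the Intern

def summarize(values):
    """Return (mean, population standard deviation) of a list of numbers."""
    return mean(values), pstdev(values)

m, s = summarize(team)
m_out, s_out = summarize(team_with_intern)

print(f"without outlier: M = {m:.1f}, sigma = {s:.2f}")
print(f"with outlier:    M = {m_out:.1f}, sigma = {s_out:.2f}")
```

For the six regular salaries the mean is 44 and the population standard deviation is about 1.63; adding the value 9 moves the mean to 39 and inflates the standard deviation to about 12.3, so neither number describes the typical salary any more.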

Slide 12

Slide 12 text

#2 Linear Regression

Slide 13

Slide 13 text

https://www.product-pro.com/preparing-for-mass-production/

Slide 14

Slide 14 text

Performance Prediction with Linear Regression

Units      Time
2 units    14 min
4 units    19 min
6 units    30 min
8 units    38 min
10 units   44 min

Time = 4.0 min × Units + 5.3 min (k = 4.0 ± 0.3, b = 5.3 ± 1.7)

Slide 15

Slide 15 text

Performance Prediction with Linear Regression

Units      Time
2 units    14 min
4 units    19 min
6 units    30 min
8 units    38 min
10 units   44 min

Time = 4.0 min × Units + 5.3 min (k = 4.0 ± 0.3, b = 5.3 ± 1.7)

Prediction error is rather small

Slide 16

Slide 16 text

Performance Prediction with Linear Regression

Units      Time
2 units    14 min
3 units    41 min
4 units    19 min
6 units    30 min
8 units    38 min
10 units   44 min

Time = 2.7 min × Units + 16.2 min (k = 2.7 ± 1.5, b = 16.2 ± 9.0)

Slide 17

Slide 17 text

Performance Prediction with Linear Regression

Units      Time
2 units    14 min
3 units    41 min
4 units    19 min
6 units    30 min
8 units    38 min
10 units   44 min

Time = 2.7 min × Units + 16.2 min (k = 2.7 ± 1.5, b = 16.2 ± 9.0)

● 3 units, 41 min: inconsistent observation
● Fitted line: poor prediction quality
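
The two fits can be recomputed with an ordinary least squares sketch in plain Python. It assumes that the ± figures on the slides are the standard errors of the slope k and intercept b (the computed values are consistent with that reading, but the source does not say so); `fit_line` is an illustrative helper, not something from the lecture.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares fit y = k*x + b; returns (k, b, se_k, se_b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    k = sxy / sxx
    b = y_mean - k * x_mean
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((y - (k * x + b)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)
    se_k = sqrt(s2 / sxx)
    se_b = sqrt(s2 * (1 / n + x_mean ** 2 / sxx))
    return k, b, se_k, se_b

# (units, minutes) pairs from the slides.
clean = [(2, 14), (4, 19), (6, 30), (8, 38), (10, 44)]
with_outlier = clean + [(3, 41)]  # the inconsistent observation

for name, data in [("clean", clean), ("with (3, 41)", with_outlier)]:
    xs, ys = zip(*data)
    k, b, se_k, se_b = fit_line(xs, ys)
    print(f"{name}: k = {k:.2f} +/- {se_k:.2f}, b = {b:.2f} +/- {se_b:.2f}")
```

Up to rounding, the coefficients agree with the slide values. One extra point is enough to shift the slope from about 4.0 to about 2.7 and to increase the uncertainty of both coefficients roughly fivefold.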

Slide 18

Slide 18 text

#3 Fraud Detection

Slide 19

Slide 19 text

https://www.zoho.com/books/articles/payment-fraud.html

Slide 20

Slide 20 text

Fraud
A deliberate act aimed at obtaining an unauthorised benefit:
● Theft or misappropriation of funds placed in one's trust
● Forgery or alteration of documents or computer files
● Authorisation of payment for services not performed
● Receipt of unearned wages or benefits
● Identity theft

Slide 21

Slide 21 text

Indicators
● Excessive number of checking accounts
● Frequent changes in banking accounts
● Behavioural changes: drugs, alcohol, gambling
● Lifestyle changes: expensive cars, jewelry, homes, clothes

Slide 22

Slide 22 text

Indicators
● Excessive number of checking accounts
● Frequent changes in banking accounts
● Behavioural changes: drugs, alcohol, gambling
● Lifestyle changes: expensive cars, jewelry, homes, clothes

All of these deviate from normal behaviour

Slide 23

Slide 23 text

Rule Based
The traditional approach to identifying fraudulent activities through known past behaviour

Machine Learning
The machine learning approach models a user’s banking patterns and detects anomalous behaviour

[Diagram labels: Fraudster, Commits, Fraud, Rules, Detection, Human Analysis, User, Performs, Transaction, ML Model, Train, Improve]

https://sdk.finance/all-you-need-to-know-about-machine-learning-based-fraud-detection-systems/
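
The contrast between the two approaches can be illustrated with a toy sketch. Everything here is hypothetical: the fixed limit, the transaction amounts, and the function names are made up for illustration, and the "model" is only a per-user mean and standard deviation standing in for a real machine learning model.

```python
from statistics import mean, pstdev

# Hypothetical fixed rule: flag any transaction above a global limit.
RULE_LIMIT = 1000.0  # illustrative threshold, not from the source

def rule_based_flag(amount):
    """Rule-based check: compares the amount with a hand-written limit."""
    return amount > RULE_LIMIT

def train_user_profile(history):
    """'Train' a toy per-user model: mean and spread of past amounts."""
    return mean(history), pstdev(history)

def model_based_flag(amount, profile, k=3.0):
    """Flag amounts more than k standard deviations from the user's mean."""
    m, s = profile
    return abs(amount - m) > k * s

# Hypothetical history of one user's transaction amounts.
history = [20.0, 25.0, 22.0, 30.0, 18.0, 27.0]
profile = train_user_profile(history)

new_amount = 400.0
print("rule-based flag: ", rule_based_flag(new_amount))
print("model-based flag:", model_based_flag(new_amount, profile))
```

With these made-up numbers the transaction of 400 stays under the global limit, so the fixed rule does not flag it, while the per-user profile flags it because it lies far outside this user's usual range.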

Slide 25

Slide 25 text

#4 Medical Diagnosis

Slide 27

Slide 27 text

Medical Diagnosis
Anomaly detection captures unique characteristics of the physiological data that could offer information about the patient

Slide 28

Slide 28 text

[Figure labels: Ward admission, Physiological data, Streaming electronic health records, Wearables, Machine learning, Clinical assessment, Early recovery, Delayed recovery, Rehabilitation, Discharge home, Decompensation, Early recognition, Rapid response, Cardiac arrest]

Figure captions:
● Early risk stratification guides initial triage to ward vs. intensive care unit
● Efficient, automated, wireless data acquisition
● Accurate phenotyping, augmented decision-making
● Automated alerts, augmented decision-making, early intervention

Loftus, Tyler J., et al. "Opportunities for machine learning to improve surgical ward safety." The American Journal of Surgery 220.4 (2020): 905-913.

Slide 29

Slide 29 text

#5 Fault Detection

Slide 30

Slide 30 text

https://climatix-group.com/wp-content/uploads/2020/01/HVAC-cotractor-Leeds-1.jpg

Slide 31

Slide 31 text

Fault Detection
Monitoring a system, identifying when a fault has occurred, and pinpointing the type of fault and its location.

https://camatsystem.com/

Slide 32

Slide 32 text

Fault Management Systems

A. DESCRIPTIVE ANALYTICS. Detect whether an item is functioning well or not by comparing the information received from it with historical data.
B. DIAGNOSTIC ANALYTICS. Identify the causes of the fault. This process should consider trends in health history and operational context.
C. PREDICTIVE ANALYTICS. Predict the future state of the item to detect any possible fault beforehand.
D. PRESCRIPTIVE ANALYTICS. Elaborate maintenance plans based on the previous predictions to reduce faults.

[Diagram labels: MANAGER, SYSTEM MONITORING, FAULT DETECTION, FAULT PREDICTION, ROOT CAUSE ANALYSIS, FAULT PREVENTION AND RECOVERY]

https://www.cloudmantra.net/blog/fault-detection-using-machine-learning-techniques/
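
Stages A and C can be sketched in a few lines. This is a deliberately simple illustration under stated assumptions: the sensor readings, the limit, and both helper functions are hypothetical, the descriptive check is a plain range comparison against history, and the predictive check is a naive linear extrapolation rather than any method named in the source.

```python
def is_within_history(reading, history, tolerance=0.1):
    """Descriptive check: is the reading inside the historical range,
    widened by a relative tolerance?"""
    lo, hi = min(history), max(history)
    margin = tolerance * (hi - lo)
    return lo - margin <= reading <= hi + margin

def steps_until_limit(readings, limit):
    """Predictive check: extrapolate the average step between consecutive
    readings and estimate how many more steps remain before `limit`.
    Returns None if the readings are not rising or the limit is already
    reached."""
    avg_step = (readings[-1] - readings[0]) / (len(readings) - 1)
    if avg_step <= 0 or readings[-1] >= limit:
        return None
    return (limit - readings[-1]) / avg_step

# Hypothetical temperature readings; all values are made up.
history = [60.0, 61.5, 59.8, 60.7, 61.0]
recent = [61.0, 63.0, 65.0, 67.0]
LIMIT = 75.0

print("latest reading looks normal:", is_within_history(recent[-1], history))
print("estimated steps until limit:", steps_until_limit(recent, LIMIT))
```

Here the latest reading falls outside the historical range, and the steady rise of 2 units per step suggests the limit would be reached in about 4 more steps. A real system would replace both helpers with models fitted to the equipment in question.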

Slide 33

Slide 33 text

Conclusion

Slide 34

Slide 34 text

Terms
● An outlier is a data point that differs significantly from other observations.
● Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour.

Slide 35

Slide 35 text

Challenges in Anomaly Detection
● Defining normal behaviour is extremely challenging
● Noise in the data is not an anomaly
● The definition of an anomaly is domain-specific
● Anomalies evolve over time
● Getting a checklist of all possible anomalies is difficult

Slide 36

Slide 36 text

Thank You!