Slide 1

Slide 1 text

BUILD SOFTWARE TO TEST SOFTWARE
AI Testing Talks
Lecture 2. Statistical Methods
ANOMALY DETECTION FOR AI TESTING
Rostislav Yavorski, Head of Research, Exactpro
30 MAY | 10.00 GET | 11.30 SLST

Slide 2

Slide 2 text

Terms
An outlier is a data point that differs significantly from other observations.
Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour.

Slide 3

Slide 3 text

Plan
1. Graphical Methods
2. Interquartile Range
3. Tukey's Fences
4. Seasonal and Trend Decomposition (STL)
5. Statistical Hypothesis Test
6. p-value and t-statistic
7. SciPy library

Slide 4

Slide 4 text

Graphical Methods

Slide 5

Slide 5 text

Histogram
First, divide the entire range of values into a series of intervals ("bins" or "buckets"), then count how many values fall into each interval.
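The binning step can be sketched in a few lines of plain Python. This is a minimal illustration rather than the lecture's own code: the function name `histogram_counts`, the choice of four equal-width bins, and the reuse of the sample values from the quartile slides are assumptions made for the example.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case (assumption): all values identical, put them in the first bin.
        return [len(values)] + [0] * (n_bins - 1)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin instead of opening a new one.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Sample values borrowed from the quartile slides of this lecture.
values = [3, 2, 3, 4, 9, 2, 10, 6, 8, 9, 3, 9, 8, 4, 10]
counts = histogram_counts(values, n_bins=4)  # bins: [2,4), [4,6), [6,8), [8,10]
print(counts)  # [5, 2, 1, 7]
```

A bin whose count is far from its neighbours is a candidate for closer inspection; the number of bins is a free choice and changes what the picture shows.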

Slide 6

Slide 6 text


Slide 7

Slide 7 text

Scatter Plot
A scatter chart displays the relationship between two numeric variables. The position of each dot on the horizontal and vertical axes indicates the values for one data point.
Temperature °C    Ice Cream Sales
14.2°             $215
16.4°             $325
11.9°             $185
18.5°             $406
22.1°             $522
19.4°             $412
25.1°             $614
23.4°             $544
15.2°             $332
18.1°             $421
22.6°             $445
17.2°             $408

Slide 8

Slide 8 text


Slide 9

Slide 9 text

Interquartile Range

Slide 10

Slide 10 text

Quartile
Q1 is the middle number between the minimum and the median of the data set.
Q2 (the median) is the value separating the higher half from the lower half of the data set.
Q3 is the middle value between the median and the maximum of the data set.

Slide 11

Slide 11 text

Quartile
Q1, the first quartile: 25% of the data lies below this point.
Q2, the second quartile: 50% of the data lies below this point (it is the median).
Q3, the third quartile: 75% of the data lies below this point.
[Figure: Q1 (1st quartile), Q2 (median) and Q3 (3rd quartile) split the ordered data into four parts, each holding ¼ of the data.]
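As a quick check of the definition, the three cut points can be computed with Python's standard `statistics` module. This is only a sketch: the data are the sample values used in the following slides, and the choice of `method="inclusive"` is an assumption, since several quartile conventions exist and they can disagree on small samples.

```python
from statistics import quantiles

# Sample values from the quartile slides of this lecture.
data = [3, 2, 3, 4, 9, 2, 10, 6, 8, 9, 3, 9, 8, 4, 10]

# n=4 asks for the three cut points that split the data into four parts.
# method="inclusive" treats the data as the whole population (an assumption);
# the default "exclusive" method can give different values for other samples.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)
```

For this sample the cut points are 3, 6 and 9, which agrees with the values of Q1 and Q3 used on the Tukey's Fences slide.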

Slide 12

Slide 12 text

Quartile
Raw data: 3, 2, 3, 4, 9, 2, 10, 6, 8, 9, 3, 9, 8, 4, 10
Ordered data: 2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10
[Figure: the ordered data with the lower half and the upper half marked.]

Slide 13

Slide 13 text

Quartile
Ordered data: 2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10
[Figure: the ordered data with min, Q1, median (Q2), Q3 and max marked, and the lower and upper halves labelled.]

Slide 14

Slide 14 text

Quartile
Ordered data: 2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10
[Figure: the ordered data with min, Q1, median (Q2), Q3 and max marked, and the lower and upper halves labelled.]
Five-number summary
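The five-number summary (min, Q1, median, Q3, max) of the sample can be reproduced with the "median of each half" rule from the quartile definitions. The sketch below assumes that the median itself is left out of both halves when the number of values is odd; the function name `five_number_summary` is chosen for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values below the median position
    upper = data[half + n % 2:]  # values above it; skips the median when n is odd
    return data[0], median(lower), median(data), median(upper), data[-1]


raw = [3, 2, 3, 4, 9, 2, 10, 6, 8, 9, 3, 9, 8, 4, 10]
summary = five_number_summary(raw)
print(summary)  # (2, 3, 6, 9, 10)
```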

Slide 15

Slide 15 text

Tukey's Fences
An outlier is any observation outside the range
[ Q1 - k(Q3 - Q1), Q3 + k(Q3 - Q1) ]
where
● Q1 and Q3 are the lower and upper quartiles
● k is some non-negative constant
John Tukey proposed that
● k = 1.5 indicates an "outlier", and
● k = 3 indicates data that is "far out"
[Photo: John Wilder Tukey (1915 – 2000)]

Slide 16

Slide 16 text

Tukey's Fences
Ordered data: 2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10
Q1 = 3, Q3 = 9
Interquartile range: Q3 - Q1 = 9 - 3 = 6
Lower outlier limit = Q1 - 1.5(Q3 - Q1) = 3 - 1.5 × 6 = -6
Upper outlier limit = Q3 + 1.5(Q3 - Q1) = 9 + 1.5 × 6 = 18
General form: [ Q1 - k(Q3 - Q1), Q3 + k(Q3 - Q1) ]
Every value in the sample lies inside [-6, 18], so none of them is flagged as an outlier.
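The fence calculation is easy to wrap in a helper. The sketch below uses the quartiles Q1 = 3 and Q3 = 9 from the slide; the function names and the extra observation 25 are hypothetical, added only to show a value being flagged.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences for quartiles q1, q3 and constant k."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values that fall outside Tukey's fences."""
    lower, upper = tukey_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


sample = [2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10]
Q1, Q3 = 3, 9  # quartiles taken from the slide

fences = tukey_fences(Q1, Q3)                        # (-6.0, 18.0)
in_sample = find_outliers(sample, Q1, Q3)            # []: nothing is flagged
# Hypothetical new observation, checked against fences from the reference sample.
new_outliers = find_outliers(sample + [25], Q1, Q3)  # [25]
far_out = find_outliers(sample + [25], Q1, Q3, k=3)  # []: 25 is inside [-15, 27]
```

Keeping the fences fixed from a reference sample, rather than recomputing them with the suspicious value included, is a design choice made here so that a single extreme observation cannot widen its own acceptance range.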

Slide 17

Slide 17 text

Boxplot: five-number summary

Slide 18

Slide 18 text

Boxplot: five-number summary
[Figure: boxplot on an axis from 0 to 2.0, with Minimum, First Quartile, Median, Third Quartile and Maximum marked; the IQR spans the box.]

Slide 19

Slide 19 text

Boxplot: five-number summary
Ordered data: 2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10
[Figure: the ordered data with min, Q1, median (Q2), Q3 and max marked, and the lower and upper halves labelled.]

Slide 20

Slide 20 text

Boxplot: five-number summary

Slide 21

Slide 21 text

Seasonal and Trend Decomposition (STL)

Slide 22

Slide 22 text

Seasonal-Trend Decomposition using LOESS (STL)
STL decomposes a time series into three components using the LOESS method:
● trend
● seasonal
● residual (noise)
LOESS = LOcally EStimated Scatterplot Smoothing
[Figure: LOESS curve approximation]

Slide 23

Slide 23 text

[Figure: Raw data = Trend + Seasonal + Remainder, each component shown as a separate panel.]

Slide 24

Slide 24 text

Anomaly Detection using STL
Example: monthly airline passengers during the years 1949-1960

Slide 25

Slide 25 text

Anomaly Detection using STL
Example: web traffic data
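The idea behind these examples (split the series into trend, seasonal and remainder, then look for unusually large remainders) can be illustrated without any external library. The sketch below is a simplified stand-in and not STL itself: it replaces LOESS with a centred moving average for the trend and per-position medians for the seasonal part. The synthetic series, the period of 12 and the injected spike are all hypothetical.

```python
from math import pi, sin
from statistics import mean, median


def decompose(series, period):
    """Additive split into trend, seasonal and remainder; assumes an even period."""
    n, half = len(series), period // 2
    trend = {}
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        # Centred moving average over one period; the two end points get half weight.
        trend[i] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    detrended = {i: series[i] - trend[i] for i in trend}
    # Seasonal value for each position in the cycle; the median resists single spikes.
    by_position = [median(d for i, d in detrended.items() if i % period == p)
                   for p in range(period)]
    centre = mean(by_position)
    seasonal = {i: by_position[i % period] - centre for i in trend}
    remainder = {i: detrended[i] - seasonal[i] for i in trend}
    return trend, seasonal, remainder


# Synthetic series (hypothetical): linear trend + 12-step cycle + small wiggle.
PERIOD, SPIKE_AT = 12, 40
series = [100 + 0.5 * i + 10 * sin(2 * pi * i / PERIOD) + ((i * 37) % 11 - 5) / 5
          for i in range(72)]
series[SPIKE_AT] += 30  # injected anomaly

trend, seasonal, remainder = decompose(series, PERIOD)
# The point with the largest remainder is the strongest anomaly candidate.
suspect = max(remainder, key=lambda i: abs(remainder[i]))
print(suspect, round(remainder[suspect], 1))
```

The moving average leaves the first and last half-period without a trend estimate, and it smears a spike into the neighbouring trend values; LOESS-based STL with robust weights handles both issues better. The remainder values can be passed to an outlier rule such as Tukey's fences instead of simply taking the maximum.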

Slide 26

Slide 26 text

Statistical Hypothesis Test

Slide 27

Slide 27 text

Statistical Hypothesis
A statistical hypothesis test is a method used to decide whether the data at hand support a particular hypothesis.
The null hypothesis (H0) and the alternative hypothesis (HA):
● H0: The observed difference is due to chance alone. There are no anomalies.
● HA: Parameters of the distribution have changed. There is an anomaly.

Slide 28

Slide 28 text

p-value
The p-value is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct.
The p-value is used to quantify the statistical significance of a result.
A small p-value means that the observed outcome would be unlikely under the null hypothesis, so small p-values are strong evidence against the null hypothesis.
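A small numeric illustration may help. The sketch below assumes that, under the null hypothesis, the measured quantity follows a normal distribution with known mean and standard deviation, which is a strong simplification. The response-time numbers and the function name `two_sided_p_value` are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(observed, null_mean, null_std):
    """P(result at least as extreme as `observed`) under a normal null model."""
    z = (observed - null_mean) / null_std
    # Both tails count as "at least as extreme", hence the factor of 2.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical null model: response time ~ Normal(mean=100 ms, std=10 ms).
p_typical = two_sided_p_value(105, null_mean=100, null_std=10)
p_extreme = two_sided_p_value(135, null_mean=100, null_std=10)
print(f"{p_typical:.3f} {p_extreme:.5f}")
```

An observation half a standard deviation from the mean gives a p-value of about 0.62, which is no evidence against H0. An observation 3.5 standard deviations away gives a p-value below 0.001, which makes it a plausible anomaly.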

Slide 29

Slide 29 text

p-value
[Figure: probability density over the set of possible results. The more likely observations lie in the centre, the two tails are labelled "Anomaly", and the p-value is marked at the observed data point.]

Slide 30

Slide 30 text

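t-statistic
The ratio of the departure of the estimated value of a parameter from its hypothesised value to its standard error:
$t = \frac{\hat{\beta} - \beta_0}{\operatorname{SE}(\hat{\beta})}$
where $\hat{\beta}$ is the estimate, $\beta_0$ is the hypothesised value and $\operatorname{SE}(\hat{\beta})$ is the standard error of the estimate.
It is used together with the p-value when running hypothesis tests: the p-value tells us how likely a result at least this extreme would be if the null hypothesis were true.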
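For the common case where the parameter is a mean, the ratio can be computed directly. The sketch below is a one-sample t-statistic on the sample values used earlier in the lecture; the hypothesised mean of 5 and the function name `t_statistic` are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, hypothesised_mean):
    """(sample mean - hypothesised mean) / standard error of the mean."""
    standard_error = stdev(sample) / sqrt(len(sample))  # stdev uses n - 1
    return (mean(sample) - hypothesised_mean) / standard_error


sample = [2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10]  # sample mean is 6
t_shifted = t_statistic(sample, hypothesised_mean=5)  # about 1.25
t_exact = t_statistic(sample, hypothesised_mean=6)    # 0.0: no departure at all
```

The further the estimate is from the hypothesised value, measured in standard errors, the larger the absolute value of t and the smaller the corresponding p-value.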

Slide 31

Slide 31 text

SciPy: algorithms for
● optimisation
● integration
● interpolation
● eigenvalue problems
● algebraic equations
● differential equations
● statistics
and many other classes of problems.
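From the statistics part of SciPy, `scipy.stats.ttest_1samp` returns both the t-statistic and the p-value of a one-sample t-test. The sketch below applies it to the sample values from the quartile slides; the hypothesised mean of 5 and the 0.05 threshold are assumptions chosen for illustration, not values from the lecture.

```python
from scipy import stats

sample = [2, 2, 3, 3, 3, 4, 4, 6, 8, 8, 9, 9, 9, 10, 10]

# H0: the sample comes from a distribution with mean 5 (hypothetical value).
result = stats.ttest_1samp(sample, popmean=5)
t_value, p_value = result.statistic, result.pvalue

# 0.05 is a conventional significance threshold, used here as an assumption.
anomaly_suspected = p_value < 0.05
print(t_value, p_value, anomaly_suspected)
```

Here the p-value is well above 0.05, so the data give no reason to reject H0 and no anomaly is reported.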

Slide 32

Slide 32 text

Thank you! Questions?