Slide 1

Slide 1 text

Machine learning under test PostDoc @ University of Salerno, Italy EuroPython 2015 @ Bilbao Valerio Maggio @leriomaggio

Slide 2

Slide 2 text

Testing Machine Learning Code & Algorithms (a.k.a. Models)
• Part 1: Common Risks and Pitfalls (related to learning models)
• Part 2: Testing Machine Learning Code
  • What does it mean?
  • Which tools am I required to use?

Slide 3

Slide 3 text

Please answer five (actually THREE) questions
• Do you already know what Machine Learning is?
• Do you already know about, or practise, testing?
• Have you ever used Scikit-Learn?

Slide 4

Slide 4 text

So, what is Machine Learning?
"Machine learning is the systematic study of algorithms and systems that improve their knowledge or performance with experience"
T. Mitchell, 1997

Slide 5

Slide 5 text

ML at a glance

Slide 6

Slide 6 text

Example(1): (Linear) Regression

Slide 7

Slide 7 text

Example(2): Classification

Slide 8

Slide 8 text

Example(3): clustering

Slide 9

Slide 9 text

Example(3): clustering

Slide 10

Slide 10 text

@jakevdp

Slide 11

Slide 11 text

@jakevdp

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

So Basically…
"Machine learning teaches machines how to carry out tasks by themselves. It is that simple. The complexity comes with the details."
W. Richert & L.P. Coelho, 2013, Building Machine Learning Systems with Python

Slide 14

Slide 14 text

How to choose your model?

Slide 15

Slide 15 text

Risks with Machine Learning (a.k.a. What to test?)
• Unstable data
  • programming faults (despite outlier reduction)
• Underfitting
  • the learning function does not take into account enough information to accurately model the phenomenon
• Overfitting
  • the learning function does not generalise enough to properly model the phenomenon
• Unpredictable future
  • we don't actually know whether our model is working or not (runtime checking)

Slide 16

Slide 16 text

How to cope with risks

Slide 17

Slide 17 text

Deal with Unstable Data

Slide 18

Slide 18 text

import unittest

Slide 19

Slide 19 text

import unittest
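A minimal sketch of what a unittest-based check might look like. The function `mean` is a hypothetical stand-in for code under test (it does not appear in the slides); the test case covers one known value and one invalid input.

```python
import unittest


def mean(values):
    """Hypothetical function under test: arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


class TestMean(unittest.TestCase):
    def test_mean_of_known_values(self):
        # A hand-computed expectation: (1 + 2 + 3) / 3 == 2
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_empty_input_is_rejected(self):
        # Invalid data should fail loudly instead of producing a silent NaN
        with self.assertRaises(ValueError):
            mean([])


# Run the test case programmatically and keep the result object
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMean)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

From a command line, the same test case could also be discovered with `python -m unittest`.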

Slide 20

Slide 20 text

(scikit) Data Representation

Slide 21

Slide 21 text

from numpy import testing

Slide 22

Slide 22 text

Assert almost equal
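A small sketch of why "almost equal" matters for floating point results, using `numpy.testing.assert_almost_equal` with its `decimal` parameter. The numbers are illustrative only.

```python
from numpy import testing

total = 0.1 + 0.2

# Exact comparison fails because of binary rounding error
exact_match = (total == 0.3)

# Comparison up to 7 decimal places passes silently
testing.assert_almost_equal(total, 0.3, decimal=7)

# A genuine difference is still reported as an AssertionError
try:
    testing.assert_almost_equal(1.0, 1.001, decimal=7)
    difference_detected = False
except AssertionError:
    difference_detected = True
```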

Slide 23

Slide 23 text

Assert array equal / allclose: |a - b| <= (atol + rtol * |b|)
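The tolerance formula above can be sketched with `numpy.testing.assert_allclose`, where `a` is the actual array and `b` the desired one. The arrays and tolerances below are made up for illustration.

```python
import numpy as np
from numpy import testing

desired = np.array([1.0, 100.0, 10000.0])
actual = desired * (1 + 1e-8)  # a tiny relative perturbation

rtol, atol = 1e-7, 0.0

# The same check written out by hand: |a - b| <= atol + rtol * |b|
within_tolerance = bool(
    np.all(np.abs(actual - desired) <= atol + rtol * np.abs(desired))
)

# Passes silently: the relative error is about 1e-8, below rtol
testing.assert_allclose(actual, desired, rtol=rtol, atol=atol)

# An absolute shift of 1e-3 exceeds the tolerance for the small entries
try:
    testing.assert_allclose(desired + 1e-3, desired, rtol=rtol, atol=atol)
    shift_detected = False
except AssertionError:
    shift_detected = True
```

A relative tolerance scales with the magnitude of the expected value, which is usually what is wanted when the features span several orders of magnitude.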

Slide 24

Slide 24 text

Floating Point Comparison ULP: Unit of Least Precision (a.k.a. Unit in the Last Place)
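A sketch of a ULP-based comparison, assuming `numpy.testing.assert_array_max_ulp` is the kind of check meant here. `np.spacing` gives the size of one ULP at a value and `np.nextafter` gives the neighbouring representable number.

```python
import numpy as np
from numpy import testing

a = np.array([1.0, 2.0, 3.0])

# One ULP at 1.0 for float64 is the machine epsilon
ulp_at_one = np.spacing(np.float64(1.0))

# Each element moved to the next representable double: exactly 1 ULP apart
one_ulp_away = np.nextafter(a, np.inf)
testing.assert_array_max_ulp(a, one_ulp_away, maxulp=1)

# Ten ULPs apart: rejected when at most 1 ULP is allowed
ten_ulps_away = a + 10 * np.spacing(a)
try:
    testing.assert_array_max_ulp(a, ten_ulps_away, maxulp=1)
    too_far_detected = False
except AssertionError:
    too_far_detected = True
```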

Slide 25

Slide 25 text

np.testing.decorators

Slide 26

Slide 26 text

unittest.mock (Python 2: pip install mock)
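A sketch of how `unittest.mock` could isolate evaluation code from a real, slow-to-train model. The helper `evaluate` and the fake predictions are hypothetical; only the `predict` method name follows the scikit-learn convention.

```python
from unittest import mock


def evaluate(model, samples, labels):
    """Hypothetical helper: fraction of samples the model labels correctly."""
    predictions = model.predict(samples)
    hits = sum(1 for p, t in zip(predictions, labels) if p == t)
    return hits / len(labels)


samples = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 1, 0, 0]

# Stand-in for a trained model: predict() returns canned answers
fake_model = mock.Mock()
fake_model.predict.return_value = [0, 1, 1, 0]

accuracy = evaluate(fake_model, samples, labels)

# Verify the interaction as well as the result
fake_model.predict.assert_called_once_with(samples)
```

Three of the four canned predictions match the labels, so the helper is tested without any training step.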

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Deal with Unstable Data
• Know your data
• Choose the most important features

Slide 30

Slide 30 text

iris dataset

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Discriminative features

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Model Generalisation & Model Optimism

Slide 35

Slide 35 text

Model Generalisation & Overfitting

Slide 36

Slide 36 text

Model Generalisation & Overfitting (figure labels: Results, Underfitting, Bad Performance, Overfitting)

Slide 37

Slide 37 text

1
• All 150 training examples are correctly identified
• Polynomial Degree 4 for features
• This does not mean that our model is perfect!
• Indeed, it is far from that!
• We can simulate this by splitting our data into a training set and a testing set.
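The train/test split idea can be sketched as follows. This is not the example from the talk: it assumes synthetic one-dimensional regression data and `np.polyfit` in place of a scikit-learn model, and compares polynomial degrees 1 and 4 on held-out data.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic data (an assumption for illustration): 150 noisy samples
x = rng.uniform(-3, 3, size=150)
y = 0.5 * x + np.sin(x) + rng.normal(scale=0.3, size=150)

# Shuffle the indices, then hold out one third for testing
indices = rng.permutation(150)
train_idx, test_idx = indices[:100], indices[100:]


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


scores = {}
for degree in (1, 4):
    # Fit on the training portion only
    coeffs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
    scores[degree] = {
        "train": rmse(y[train_idx], np.polyval(coeffs, x[train_idx])),
        "test": rmse(y[test_idx], np.polyval(coeffs, x[test_idx])),
    }
```

The training error can only go down as the degree grows, so it says little on its own; the error on the held-out samples is the figure that indicates how well the model generalises.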

Slide 38

Slide 38 text

Learning Curve

Slide 39

Slide 39 text

Polynomial Degree 1 Polynomial Degree 4

Slide 40

Slide 40 text

How to evaluate the performance of a Regression Model More accurate when comparing multiple models!

Slide 41

Slide 41 text

Evaluate (our) Regression Model(s)

Slide 42

Slide 42 text

RMSE: The closer to zero, the better the model performance
R2 Score: The closer to one, the better the model performance
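Both scores can be written out directly with NumPy. The sketch below uses made-up target and prediction values; scikit-learn offers comparable functions in `sklearn.metrics` (for example `r2_score`), which are not used here so the block stays self-contained.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error: 0 means a perfect fit."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 means a perfect fit."""
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)


y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

model_rmse = rmse(y_true, y_pred)  # sqrt(0.375), about 0.612
model_r2 = r2(y_true, y_pred)      # 1 - 1.5 / 29.1875, about 0.949
```

RMSE is expressed in the units of the target, whereas R2 is unit-free, which makes it convenient when comparing several models.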

Slide 43

Slide 43 text

Conclusions

Slide 44

Slide 44 text

Something is (always) better than nothing

Slide 45

Slide 45 text

(one of) the interesting things left behind
Fuzz testing or fuzzing is an (automated) software testing technique that involves providing invalid, unexpected, or random data to the inputs (source: Wikipedia)
Check out Hypothesis: https://hypothesis.readthedocs.org/en/latest/
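A hand-rolled sketch of the fuzzing idea using only the standard library `random` module, not Hypothesis itself. The function `min_max_scale` is a hypothetical preprocessing step; the loop feeds it random inputs and checks a property that must hold for every one of them.

```python
import random


def min_max_scale(values):
    """Hypothetical function under test: rescale values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuzz(trials=1000, seed=0):
    """Feed random inputs to min_max_scale and check its invariants."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randint(1, 20)
        values = [rng.uniform(-1e6, 1e6) for _ in range(size)]
        scaled = min_max_scale(values)
        # Properties that should hold for any input
        assert len(scaled) == len(values)
        assert all(0.0 <= s <= 1.0 for s in scaled)
    return trials


completed = fuzz(trials=200)
```

Hypothesis automates this pattern: it generates the inputs from declared strategies and shrinks a failing input to a minimal counterexample.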

Slide 46

Slide 46 text

Thanks a lot for your kind attention
+ValerioMaggio it.linkedin.com/in/valeriomaggio @leriomaggio