Slide 1

Slide 1 text

Teaching Machine Learning
Piotr Migdał, PhD
http://p.migdal.pl @pmigdal
PyData Warsaw Conference, 19 Oct 2017

Slide 2

Slide 2 text

PhD in quantum physics theory (2014, ICFO, Barcelona)
Data scientist (deepsense.ai / consultant)
Machine learning, deep learning, data-viz (D3.js)

Slide 3

Slide 3 text

I teach…

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Outline
• Everything is easy
• Typical problems
• Getting pragmatic

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

A person first learns classical mechanics
• …by playing with balls and blocks?
• …by learning Newton's laws and differential calculus?

Slide 8

Slide 8 text

A person first learns natural numbers
• …by counting apples and toys?
• …by the von Neumann construction?

Slide 9

Slide 9 text

$$i\hbar\,\dot{\psi} = \left(-\frac{\hbar^2}{2m}\nabla^2 + V(x)\right)\psi$$

$$\hat{H}\psi = E\psi$$

Slide 10

Slide 10 text

A person first learns quantum mechanics
• …by learning linear algebra and complex numbers?

Slide 11

Slide 11 text

http://p.migdal.pl/2016/08/15/quantum-mechanics-for-high-school-students.html

Slide 12

Slide 12 text

http://quantumgame.io/

Slide 13

Slide 13 text

A person first learns machine learning
• …by studying computer science, mathematics and statistics for years?

Slide 14

Slide 14 text

You don’t understand machine learning unless you can teach it with pen & paper (or at least with JavaScript).

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

https://generalabstractnonsense.com/2017/03/A-quick-look-at-Support-Vector-Machines/

Slide 17

Slide 17 text

Machine Learning https://twitter.com/b0rk/status/821922905890103298/photo/1

Slide 18

Slide 18 text

Decision trees, visually http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

Slide 19

Slide 19 text

Word2vec
http://p.migdal.pl/2017/01/06/king-man-woman-queen-why.html
Julia Bazińska’s talk “Exploring word2vec vector space”, tomorrow at 15:00
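The "king - man + woman ≈ queen" idea from the linked post can be tried with pen-and-paper-sized numbers. The sketch below is only an illustration: the three-dimensional vectors are hand-made toy values (not output of a trained word2vec model), and the helper names `cosine` and `analogy` are hypothetical.

```python
import math

# Hand-made toy vectors, NOT real word2vec embeddings.
# The three axes loosely stand for: royalty, maleness, femaleness.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c):
    """Return the word closest to a - b + c, skipping the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman"))
```

With real embeddings the vocabulary is large and the nearest word is not guaranteed to be the expected one; the toy values here are chosen so the arithmetic is easy to follow by hand.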

Slide 20

Slide 20 text

Matrix factorization (work in progress)
http://p.migdal.pl/matrix-decomposition-viz/

Slide 21

Slide 21 text

https://distill.pub/

Slide 22

Slide 22 text

…but at least Deep Learning is hard, isn’t it?

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Spreadsheet-based deep learning http://www.deepexcel.net/

Slide 27

Slide 27 text

http://setosa.io/ev/image-kernels/
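Image kernels of the kind visualized at the link above fit in a few lines of plain Python. This is a minimal sketch, assuming a grayscale image stored as a list of rows; the function name `apply_kernel` and the tiny 4×5 image are made up for illustration. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def apply_kernel(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy image: dark on the left, bright on the right (a vertical edge).
image = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]

# Sobel-style kernel that responds to left-to-right brightness changes.
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

for row in apply_kernel(image, sobel_x):
    print(row)
```

The output is zero where the image is flat and large where the window covers the edge, which is the whole intuition behind an edge-detecting filter.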

Slide 28

Slide 28 text

Things which are not problems
• “I don’t know Python”
• “They are only in high school”
• “They are not from STEM”

Slide 29

Slide 29 text

Trypophobia detector by high-school students https://github.com/cytadela8/trypophobia

Slide 30

Slide 30 text

Trypophobia detector by high-school students https://github.com/cytadela8/trypophobia

Slide 31

Slide 31 text

Problems with teachers
• Too many mathematical details and too little insight
• Too much historical inertia
• No plots
• Too little real data (e.g. everything is np.random.randn(n, m))

Slide 32

Slide 32 text

Real data, plots > arrays

Slide 33

Slide 33 text

Pragmatic algorithms
• kNN
• Linear + Logistic Regression
• Random Forest
• XGBoost
• Neural Networks
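The first algorithm on the list, kNN, is small enough to write from scratch, which makes it a natural starting point. The sketch below is one possible version using only the standard library (it needs Python 3.8+ for `math.dist`); the function name `knn_predict` and the six labeled points are hypothetical toy values, so for actual teaching a real dataset would be preferable.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2D data: two well-separated groups (illustrative values only).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (2.0, 2.0)))
print(knn_predict(points, labels, (6.0, 7.0)))
```

There is no training step: the "model" is just the stored data plus a distance function, which is why kNN works well as a first example before moving on to regression or tree-based methods.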

Slide 34

Slide 34 text

Problems with clients
• Everyone is an “expert”
• Squeezing one semester (or a few) into a few days (or just one)
• “Deep learning will solve all our problems”
• Installation!!!

Slide 35

Slide 35 text

Logistic Regression vs Random Forest vs Deep Learning
https://github.com/szilard/benchm-ml
http://datascience.la/benchmarking-random-forest-implementations/

Slide 36

Slide 36 text

Easy setup
• Python 3 with Anaconda
• Jupyter Notebook
• Neptune.ML

Slide 37

Slide 37 text

Thank you! Questions?
More on my blog: http://p.migdal.pl/ @pmigdal