Slide 1

Slide 1 text

Machine Learning 101
Ali Akbar Septiandri
Universitas Al Azhar Indonesia

Slide 2

Slide 2 text

Previously...

Slide 3

Slide 3 text

Cross Industry Standard Process for Data Mining (CRISP-DM)

Slide 4

Slide 4 text

Data Science Venn Diagram

Slide 5

Slide 5 text

What is the role of machine learning algorithms?

Slide 6

Slide 6 text

“Fundamentally, machine learning involves building mathematical models to help understand data.” - Jake VanderPlas

Slide 7

Slide 7 text

Tasks in Machine Learning
1. Predicting stock prices
2. Differentiating cat vs. dog pictures
3. Spam identification
4. Community detection
5. Mimicking famous painting styles
6. Mastering the games of Go and chess
7. etc.

Slide 8

Slide 8 text

Task Categories
1. Supervised learning
   a. Predicting stock prices
   b. Differentiating cat vs. dog pictures
   c. Spam identification
2. Unsupervised learning
   a. Community detection
   b. Mimicking famous painting styles
3. Reinforcement learning
   a. Mastering the games of Go and chess

Slide 9

Slide 9 text

Let’s take an example dataset...
- Iris Dataset
- by R.A. Fisher (1936)
- 4 attributes: sepal length, sepal width, petal length, petal width
- 3 labels: Iris Setosa, Iris Versicolour, Iris Virginica

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Nearest Neighbour
- Finding the closest reference
- What does “closest” mean?
- Humans comprehend visualisations very well
- Can computers do the same?

Slide 16

Slide 16 text

At the lowest level, computers only understand 0 or 1

Slide 17

Slide 17 text

Euclidean Distance

Slide 18

Slide 18 text

Euclidean Distance
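
The slides name Euclidean distance without spelling it out. For two points a and b with n attributes, it is the square root of the summed squared differences: d(a, b) = sqrt((a_1 - b_1)^2 + ... + (a_n - b_n)^2). A minimal Python sketch follows; the function name `euclidean_distance` and the sample points are illustrative assumptions, not taken from the deck.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of attributes."""
    if len(a) != len(b):
        raise ValueError("both points must have the same number of attributes")
    # Square each per-attribute difference, add them up, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 2-D points chosen so the arithmetic is easy to check by hand:
# sqrt(3^2 + 4^2) = sqrt(25) = 5
print(euclidean_distance((0, 0), (3, 4)))
```

The same function works unchanged for the four Iris attributes, since it only assumes both points have equal length.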

Slide 19

Slide 19 text

Are you sure?

Slide 20

Slide 20 text

k-Nearest Neighbours
1. Find some k closest references
2. Use majority vote
3. We need to compute pairwise distances
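
The three k-nearest-neighbours steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `knn_predict` is made up, and the reference measurements are invented (petal length, petal width) pairs, not actual records from the Iris dataset. It uses `math.dist` from the standard library (Python 3.8+) for the Euclidean distance.

```python
import math
from collections import Counter


def knn_predict(references, query, k=3):
    """Predict a label for `query` from a list of (features, label) references."""
    # Step 3 on the slide: compute the distance from the query to every reference.
    by_distance = sorted(references, key=lambda item: math.dist(item[0], query))
    # Step 1: keep only the k closest references.
    nearest = by_distance[:k]
    # Step 2: majority vote over the labels of those k references.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented (petal length, petal width) values, for illustration only.
references = [
    ((1.4, 0.2), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.5, 1.5), "versicolour"),
    ((4.7, 1.4), "versicolour"),
    ((4.4, 1.3), "versicolour"),
]

print(knn_predict(references, (1.6, 0.4)))
print(knn_predict(references, (4.6, 1.4)))
```

Because every query is compared against every stored reference, the cost grows with the size of the reference set, which is why the slide stresses the pairwise distance computation.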

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Conventional statistics cannot do that

Slide 23

Slide 23 text

We need high computational power

Slide 24

Slide 24 text

What if we only want to see the subgroups in the data?

Slide 25

Slide 25 text

Clustering
- Finding subgroups in the data
- Your neighbours in the same housing complex regardless of their class
- Unsupervised learning

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

k-Means Clustering

Slide 28

Slide 28 text

k-Means Clustering
1. Uses Euclidean distance as well
2. k = number of clusters
3. Centroids to represent clusters
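
To make the role of the centroids concrete, here is a minimal sketch of the usual assign-then-update loop (often called Lloyd's algorithm), which the slides do not spell out. The function name `k_means`, the fixed starting centroids, and the sample points are all assumptions for illustration; practical implementations typically choose starting centroids at random and run several restarts.

```python
import math


def k_means(points, initial_centroids, max_iterations=100):
    """Group points into len(initial_centroids) clusters; returns (centroids, clusters)."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        # (Euclidean distance via math.dist, Python 3.8+).
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its old centroid
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is stable
        centroids = new_centroids
    # `clusters` reflects the most recent assignment step.
    return centroids, clusters


# Invented 2-D points forming two obvious groups; k = 2 starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are used anywhere in the loop, which is what makes this unsupervised: the groups come only from distances between points.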

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Deep Learning

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Digit Recognition MNIST Dataset

Slide 35

Slide 35 text

Classifying objects from pictures [Krizhevsky, 2009]

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

A neural network [Nielsen, 2016]

Slide 39

Slide 39 text

Logistic Regression

$y = \sigma(w_0 + w_1 x_1)$
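
The logistic regression formula can be evaluated directly in Python. In this sketch σ is taken to be the standard logistic sigmoid, σ(z) = 1 / (1 + e^(-z)); the function names and the weight values are hypothetical, picked only so the result is easy to verify by hand.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, written so that math.exp never overflows for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(x1, w0, w1):
    """y = sigmoid(w0 + w1 * x1): the model's output for one input value x1."""
    return sigmoid(w0 + w1 * x1)


# Hypothetical weights: w0 + w1 * x1 = 0 at x1 = 2, so the output there is exactly 0.5.
w0, w1 = -4.0, 2.0
for x1 in (0.0, 2.0, 4.0):
    print(x1, predict_probability(x1, w0, w1))
```

Outputs below 0.5 fall on one side of the point where w0 + w1 * x1 = 0 and outputs above 0.5 on the other, which is how the model separates two classes.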

Slide 40

Slide 40 text

Predicting traffic jams from CCTV pictures

Slide 41

Slide 41 text

Mimicking famous paintings

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

Other Machine Learning Algorithms

Slide 44

Slide 44 text

Naive Bayes

Slide 45

Slide 45 text

Decision trees

Slide 46

Slide 46 text

Linear regression with polynomial basis functions

Slide 47

Slide 47 text

“No free lunch”

Slide 48

Slide 48 text

Thank you