Slide 1

Slide 1 text

Machine Learning: Decision Trees

Slide 2

Slide 2 text

Outline
● Decision Trees
● Tree Overview
● Types of Decision Trees
● Supervised Learning
● Unsupervised Learning
● Use cases
● Advantages & Disadvantages
● Building Decision Trees

Slide 3

Slide 3 text

Decision Tree Uses
● Classification
  – E.g.: male vs. female
● Prediction
  – E.g.: Will it rain?
[Diagram: two example trees, one with a "Male?" node (Yes/No) and one with "Cloudy?" and "Windy?" nodes (Yes/No)]

Slide 4

Slide 4 text

Trees
● Structure
  – Nodes
  – Root
  – Leaves
  – Parent & Children
● Types
  – Binary
  – Ternary
  – Quaternary
  – Red-black
[Diagram: example tree with "Cloudy?" and "Windy?" nodes (Yes/No)]
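The structure terms above (root, leaves, parent and children) can be made concrete with a small binary tree. This is only a sketch: the `Node` class, the helper functions, and the leaf outcomes of the "Will it rain?" tree are assumptions made for illustration, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary tree; a node with no children is a leaf."""
    label: str
    yes: Optional["Node"] = None  # child followed when the answer is "yes"
    no: Optional["Node"] = None   # child followed when the answer is "no"

    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None


def count_leaves(node: Optional[Node]) -> int:
    """Count the nodes that have no children."""
    if node is None:
        return 0
    if node.is_leaf():
        return 1
    return count_leaves(node.yes) + count_leaves(node.no)


def depth(node: Optional[Node]) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(depth(node.yes), depth(node.no))


# Hypothetical "Will it rain?" tree; the leaf outcomes are assumed.
root = Node(
    "Cloudy?",
    yes=Node("Windy?", yes=Node("No rain"), no=Node("Rain")),
    no=Node("No rain"),
)

print(count_leaves(root))
print(depth(root))
```

Here "Cloudy?" is the root and the parent of "Windy?", and the three outcome nodes are the leaves.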

Slide 5

Slide 5 text

Types of Decision Trees
● Classification
● Regression
● Random Forest
● Rotation Forest

Slide 6

Slide 6 text

Supervised Learning
● Inferring meaning from labeled data
  – Dataset D: {l1: x1, l2: x2, …, ln: xn}
    ● {Name: John, Gender: Male, Age: 21, Income: 30k}
    ● {Name: Jane, Gender: Female, Age: 20, Income: 12k}
    ● {Name: Buhari, Gender: Male, Age: Ancient, Income: 30m}

Slide 7

Slide 7 text

Unsupervised Learning
● Principal Component Analysis (PCA)
● Clustering
● Anomaly detection

Slide 8

Slide 8 text

Use Cases
● Anything (almost)

Gender  Age  Income (k)  Going to the local bar?
M       22   10
F       36   20
F       20   100
M       36   80
F       27   60

[Diagram: decision tree with nodes "Male?", "Under 26?", "Under 40?" and "Income over 25?" and Yes/No branches]
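A decision tree like the one on this slide reads as a chain of nested conditions. The sketch below uses the node questions from the diagram, but the branch layout and the Yes/No outcomes at the leaves are assumptions, since the slide text does not preserve them. The function name `goes_to_bar` is also hypothetical.

```python
def goes_to_bar(gender: str, age: int, income_k: int) -> bool:
    """Walk a hypothetical tree; the Yes/No leaves are assumptions."""
    if gender == "M":           # "Male?"
        return age < 26         # "Under 26?"
    if age < 40:                # "Under 40?"
        return income_k > 25    # "Income over 25?"
    return False


# (gender, age, income in k) rows from the table on this slide.
people = [("M", 22, 10), ("F", 36, 20), ("F", 20, 100), ("M", 36, 80), ("F", 27, 60)]
for person in people:
    print(person, goes_to_bar(*person))
```

Every prediction can be traced back to the exact questions that produced it, which is what makes a tree a white box model.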

Slide 9

Slide 9 text

Advantages
● Simple
● Combining trees
● White box model
● Efficient processing with large datasets

Slide 10

Slide 10 text

Disadvantages
● Hard to build a tree
● Large trees
● Overfitting (deep trees)

Slide 11

Slide 11 text

Generating Trees
● Iterative Dichotomiser 3 (ID3)
● C4.5
● PCA

Slide 12

Slide 12 text

Iterative Dichotomiser 3 (ID3)
● Compute the entropy of each attribute in the dataset
● Split the dataset into subsets on the attribute with the least entropy
● Add that attribute as a node in the decision tree
● Repeat on each subset with the remaining attributes
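The steps above can be sketched in a few lines of Python. This sketch assumes the textbook formulation of ID3, in which entropy is measured on the class labels of each subset and the attribute whose split leaves the lowest weighted entropy (equivalently, the highest information gain) is chosen. The function names and the toy "rain" dataset are hypothetical and not taken from the slides.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def split_entropy(rows, attribute, target):
    """Weighted entropy of `target` after splitting `rows` on `attribute`."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    return sum(len(s) / len(rows) * entropy(s) for s in subsets.values())


def id3(rows, attributes, target):
    """Return a nested-dict tree; leaves are class labels."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority label
    # Pick the attribute whose split leaves the least entropy.
    best = min(attributes, key=lambda a: split_entropy(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical toy data (not from the slides): will it rain?
rows = [
    {"cloudy": "yes", "windy": "no", "rain": "yes"},
    {"cloudy": "yes", "windy": "no", "rain": "yes"},
    {"cloudy": "yes", "windy": "yes", "rain": "no"},
    {"cloudy": "no", "windy": "no", "rain": "no"},
    {"cloudy": "no", "windy": "yes", "rain": "no"},
    {"cloudy": "no", "windy": "no", "rain": "no"},
]
print(id3(rows, ["cloudy", "windy"], "rain"))
```

On this toy data, splitting on "cloudy" leaves less entropy than splitting on "windy", so "cloudy" becomes the root and "windy" is used only inside the cloudy branch.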

Slide 13

Slide 13 text

Example
● Gender: M or F
● Income: Over 50k, under 50k
● Age: Over 30, under 30
● Entropy (1st round)
  – Gender: 1
  – Income: 0.99277
  – Age: 0.934068

Gender  Income (k)  Age
M       40          45
F       40          15
F       100         46
F       40          34
M       20          37
F       20          36
F       100         20
M       100         36
M       60          48
M       10          33
F       80          18
M       70          43
M       60          26
M       90          36
F       60          35
M       40          18
F       100         28
M       20          16
F       20          48
F       90          41
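The first-round figures are consistent with the Shannon entropy of each attribute's own two-way split (M vs. F, over vs. under 50k, over vs. under 30). The sketch below recomputes them from the table on this slide. The helper names `binary_entropy` and `round_entropies` are hypothetical.

```python
import math

# (gender, income in k, age) rows copied from the table on this slide.
ROWS = [
    ("M", 40, 45), ("F", 40, 15), ("F", 100, 46), ("F", 40, 34), ("M", 20, 37),
    ("F", 20, 36), ("F", 100, 20), ("M", 100, 36), ("M", 60, 48), ("M", 10, 33),
    ("F", 80, 18), ("M", 70, 43), ("M", 60, 26), ("M", 90, 36), ("F", 60, 35),
    ("M", 40, 18), ("F", 100, 28), ("M", 20, 16), ("F", 20, 48), ("F", 90, 41),
]


def binary_entropy(flags):
    """Shannon entropy (in bits) of a list of booleans."""
    p = sum(flags) / len(flags)
    if p in (0.0, 1.0):
        return 0.0  # a one-sided split carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def round_entropies(rows):
    """Entropy of each attribute's two-way split over the given rows."""
    return {
        "Gender": binary_entropy([g == "M" for g, _, _ in rows]),
        "Income": binary_entropy([inc > 50 for _, inc, _ in rows]),
        "Age": binary_entropy([age > 30 for _, _, age in rows]),
    }


for name, h in round_entropies(ROWS).items():
    print(f"{name}: {h:.6f}")
```

Age has the lowest entropy of the three, which is why it becomes the first split.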

Slide 14

Slide 14 text

Example
● Entropy (2nd round)
  – Under 30
    ● Gender: 0.985228
    ● Income: 0.985228
  – Over 30
    ● Gender: 0.995727
    ● Income: 0.995727
[Diagram: tree with root "Under 30?" and child nodes "Income > 50k", "Female?", "Male?", "Income > 50k"]
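The second round repeats the same calculation inside each age subset. The sketch below splits the Slide 13 rows at age 30 and prints the Gender and Income entropies per subset so they can be checked against the figures on this slide. The helper names are hypothetical, and entropy is again taken over each attribute's own two-way split.

```python
import math

# (gender, income in k, age) rows copied from the Slide 13 table.
ROWS = [
    ("M", 40, 45), ("F", 40, 15), ("F", 100, 46), ("F", 40, 34), ("M", 20, 37),
    ("F", 20, 36), ("F", 100, 20), ("M", 100, 36), ("M", 60, 48), ("M", 10, 33),
    ("F", 80, 18), ("M", 70, 43), ("M", 60, 26), ("M", 90, 36), ("F", 60, 35),
    ("M", 40, 18), ("F", 100, 28), ("M", 20, 16), ("F", 20, 48), ("F", 90, 41),
]


def binary_entropy(flags):
    """Shannon entropy (in bits) of a list of booleans."""
    p = sum(flags) / len(flags)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def second_round(rows):
    """Entropy of the remaining attributes within each age subset."""
    splits = (("Under 30", lambda age: age < 30), ("Over 30", lambda age: age > 30))
    result = {}
    for name, keep in splits:
        subset = [row for row in rows if keep(row[2])]
        result[name] = {
            "rows": len(subset),
            "Gender": binary_entropy([g == "M" for g, _, _ in subset]),
            "Income": binary_entropy([inc > 50 for _, inc, _ in subset]),
        }
    return result


for name, stats in second_round(ROWS).items():
    print(name, stats["rows"], f"{stats['Gender']:.6f}", f"{stats['Income']:.6f}")
```

Within each subset, Gender and Income tie, so either can serve as the next node on that branch.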

Slide 15

Slide 15 text

Questions?
● Thank You