forLoop
August 22, 2016

# Decision Trees

Nnamdi's talk on Decision Trees at the last forLoop Machine Learning event in Lagos.

## Transcript

2. ### Outline

• Decision Trees
• Tree Overview
• Types of Decision Trees
• Supervised Learning
• Unsupervised Learning
• Use cases
• Advantages & Disadvantages
• Building Decision Trees

3. ### Decision Tree Uses

• Classification
  – E.g.: male vs female
• Prediction
  – E.g.: Will it rain?

[Diagram: example trees with the nodes "Male?", "Cloudy?" and "Windy?" and Yes/No branches]

4. ### Trees

• Structure
  – Nodes
  – Root
  – Leaves
  – Parent & Children
• Types
  – Binary
  – Ternary
  – Quaternary
  – Red and Black

[Diagram: tree with the nodes "Cloudy?" and "Windy?" and Yes/No branches]

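
The structure above (a root, internal nodes with children, and leaves) can be sketched as a small binary tree in Python. This is only an illustration: the `Node` class and `predict` function are hypothetical names, and while the slides name the "Cloudy?" and "Windy?" nodes, the outcome attached to each branch here is an assumption.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Node:
    """A binary decision node: one yes/no question with two children."""
    question: str              # key looked up in the sample
    yes: Union["Node", str]    # subtree or leaf label for a "yes" answer
    no: Union["Node", str]     # subtree or leaf label for a "no" answer


def predict(tree, sample):
    """Walk from the root down to a leaf, following the sample's answers."""
    while isinstance(tree, Node):
        tree = tree.yes if sample[tree.question] else tree.no
    return tree


# Assumed layout: the branch outcomes below are guesses for illustration.
rain_tree = Node(
    question="cloudy",
    yes=Node(question="windy", yes="No", no="Yes"),
    no="No",
)

print(predict(rain_tree, {"cloudy": True, "windy": False}))  # Yes
```

The root is the `cloudy` node, the `windy` node is its child, and the plain strings are leaves.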
5. ### Types of Decision Trees

• Classification
• Regression
• Random Forest
• Rotation Forest

6. ### Supervised Learning

• Inferring meaning from labeled data
  – Dataset $D = \{l_1: x_1, l_2: x_2, \ldots, l_n: x_n\}$
• {Name: John, Gender: Male, Age: 21, Income: 30k}
• {Name: Jane, Gender: Female, Age: 20, Income: 12k}
• {Name: Buhari, Gender: Male, Age: Ancient, Income: 30m}

7. ### Unsupervised Learning

• Principal Component Analysis (PCA)
• Clustering
• Anomaly detection

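8. ### Use Cases

• Anything (almost)

| Gender | Age | Income (k) | Going to the local bar? |
|--------|-----|------------|-------------------------|
| M      | 22  | 10         |                         |
| F      | 36  | 20         |                         |
| F      | 20  | 100        |                         |
| M      | 36  | 80         |                         |
| F      | 27  | 60         |                         |

[Diagram: decision tree with the nodes "Male?", "Under 26?", "Under 40?" and "Income over 25?" and Yes/No branches]
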
9. ### Advantages

• Simple
• Combining trees
• White box model
• Efficient processing with large datasets

10. ### Disadvantages

• Hard to build a tree
• Large trees
• Overfitting (deep trees)

PCA
12. ### Iterative Dichotomiser 3 (ID3)

• Compute the entropy of each attribute in the dataset
• Split the dataset into subsets on the attribute with the least entropy
• Add that least-entropy attribute as a node in the decision tree
• Repeat on each subset with the remaining attributes

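
The slides do not spell out the entropy formula. Assuming the usual Shannon entropy in bits, which is consistent with the figures in the example that follows (an even gender split scores exactly 1), the entropy of an attribute $A$ whose values $v$ occur with relative frequency $p_v$ would be:

$$
H(A) = -\sum_{v} p_v \log_2 p_v
$$

Note that textbook ID3 usually ranks attributes by information gain with respect to a class label. The steps listed here rank attributes by their own entropy, as the talk does.
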
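13. ### Example

• Gender: M or F
• Income: Over 50k, under 50k
• Age: Over 30, under 30
• Entropy (1st round)
  – Gender: 1
  – Income: 0.99277
  – Age: 0.934068

| Gender | Income (k) | Age |
|--------|------------|-----|
| M      | 40         | 45  |
| F      | 40         | 15  |
| F      | 100        | 46  |
| F      | 40         | 34  |
| M      | 20         | 37  |
| F      | 20         | 36  |
| F      | 100        | 20  |
| M      | 100        | 36  |
| M      | 60         | 48  |
| M      | 10         | 33  |
| F      | 80         | 18  |
| M      | 70         | 43  |
| M      | 60         | 26  |
| M      | 90         | 36  |
| F      | 60         | 35  |
| M      | 40         | 18  |
| F      | 100        | 28  |
| M      | 20         | 16  |
| F      | 20         | 48  |
| F      | 90         | 41  |
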
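
The first-round figures can be checked with a short Python sketch. It assumes Shannon entropy in bits over each attribute's two-way split (M/F, income over 50k, age over 30). The `ROWS` list and the `entropy` helper are names introduced here for illustration, not part of the talk.

```python
from collections import Counter
from math import log2

# (gender, income in thousands, age), copied from the example slide's data.
ROWS = [
    ("M", 40, 45), ("F", 40, 15), ("F", 100, 46), ("F", 40, 34),
    ("M", 20, 37), ("F", 20, 36), ("F", 100, 20), ("M", 100, 36),
    ("M", 60, 48), ("M", 10, 33), ("F", 80, 18), ("M", 70, 43),
    ("M", 60, 26), ("M", 90, 36), ("F", 60, 35), ("M", 40, 18),
    ("F", 100, 28), ("M", 20, 16), ("F", 20, 48), ("F", 90, 41),
]


def entropy(values):
    """Shannon entropy (base 2) of the distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


first_round = {
    "Gender": entropy(g for g, _, _ in ROWS),
    "Income": entropy(income > 50 for _, income, _ in ROWS),
    "Age": entropy(age > 30 for _, _, age in ROWS),
}

for name, h in first_round.items():
    print(f"{name}: {h:.6f}")

# The attribute with the least entropy becomes the first node.
print("Split on:", min(first_round, key=first_round.get))
```

With 10 men and 10 women, 11 incomes over 50k, and 13 ages over 30, Age has the lowest entropy and is chosen for the first split.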
14. ### Example

• Entropy (2nd round)
  – Under 30
    • Gender: 0.985228
    • Income: 0.985228
  – Over 30
    • Gender: 0.995727
    • Income: 0.995727

[Diagram: resulting tree with root "Under 30?" and child nodes "Income > 50k", "Female?", "Male?" and "Income > 50k"]
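
The second round can be sketched the same way: split the rows on age, then recompute the entropy of the remaining attributes inside each subset. As before, this assumes Shannon entropy in bits, and `ROWS`, `entropy`, and `second_round` are hypothetical names used only for this illustration.

```python
from collections import Counter
from math import log2

# (gender, income in thousands, age), copied from the example slide's data.
ROWS = [
    ("M", 40, 45), ("F", 40, 15), ("F", 100, 46), ("F", 40, 34),
    ("M", 20, 37), ("F", 20, 36), ("F", 100, 20), ("M", 100, 36),
    ("M", 60, 48), ("M", 10, 33), ("F", 80, 18), ("M", 70, 43),
    ("M", 60, 26), ("M", 90, 36), ("F", 60, 35), ("M", 40, 18),
    ("F", 100, 28), ("M", 20, 16), ("F", 20, 48), ("F", 90, 41),
]


def entropy(values):
    """Shannon entropy (base 2) of the distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def second_round(rows):
    """Entropy of the attributes left after the age split, for one subset."""
    return {
        "Gender": entropy(g for g, _, _ in rows),
        "Income": entropy(income > 50 for _, income, _ in rows),
    }


# No one in the data is exactly 30, so these two subsets cover every row.
under_30 = [row for row in ROWS if row[2] < 30]
over_30 = [row for row in ROWS if row[2] > 30]

for label, subset in (("Under 30", under_30), ("Over 30", over_30)):
    scores = {k: round(v, 6) for k, v in second_round(subset).items()}
    print(label, f"({len(subset)} rows):", scores)
```

Within each subset Gender and Income happen to split the rows in the same proportions, so their entropies tie and either attribute can serve as the next node.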