Decision Trees

forLoop
August 22, 2016

Nnamdi's talk on decision trees at the last forLoop Machine Learning event in Lagos.

Transcript

  1. Machine Learning Decision Trees

  2. Outline • Decision Trees • Tree Overview • Types of Decision Trees • Supervised Learning • Unsupervised Learning • Use cases • Advantages & Disadvantages • Building Decision Trees
  3. Decision Tree Uses • Classification – E.g.: male vs female • Prediction – E.g.: Will it rain? • Tree diagrams: Male? (Yes/No); Cloudy?, Windy? (Yes/No branches)
  4. Trees • Structure – Nodes – Root – Leaves – Parent & Children • Types – Binary – Ternary – Quaternary – Red-black • Tree diagram nodes: Cloudy?, Windy? (Yes/No branches)
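
The structure terms on this slide (root, nodes, leaves, parent and children) can be sketched in a few lines of Python. This is only an illustration: the `Node`, `Leaf` and `decide` names are hypothetical, and the outcomes placed on the leaves of the "Cloudy? / Windy?" tree are assumptions, since the transcript does not preserve how the diagram's branches connect.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # the decision reached at the end of a branch


@dataclass
class Node:
    question: str               # yes/no test applied at this node
    yes: Union["Node", Leaf]    # child followed when the answer is True
    no: Union["Node", Leaf]     # child followed when the answer is False


def decide(tree, answers):
    """Walk from the root to a leaf, following the answer to each question."""
    while isinstance(tree, Node):
        tree = tree.yes if answers[tree.question] else tree.no
    return tree.label


# Assumed shape for the "Will it rain?" example: the root asks "Cloudy?",
# its "yes" child asks "Windy?", and the leaf labels are made up.
rain_tree = Node(
    "Cloudy?",
    yes=Node("Windy?", yes=Leaf("No"), no=Leaf("Yes")),
    no=Leaf("No"),
)

print(decide(rain_tree, {"Cloudy?": True, "Windy?": False}))
print(decide(rain_tree, {"Cloudy?": False}))
```

Here "Cloudy?" is the root, "Windy?" is its child, and the three `Leaf` objects are the leaves. Because every node has exactly two children, this is a binary tree.
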
  5. Types of Decision Trees • Classification • Regression • Random Forest • Rotation Forest
  6. Supervised Learning • Inferring meaning from labeled data – Dataset D: {l1: x1, l2: x2, …, ln: xn} • {Name: John, Gender: Male, Age: 21, Income: 30k} • {Name: Jane, Gender: Female, Age: 20, Income: 12k} • {Name: Buhari, Gender: Male, Age: Ancient, Income: 30m}
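
As a rough illustration of the labeled records above, each one can be held as a Python dictionary mapping attribute names to values. The slide does not say which attribute is the one to be predicted, so treating `Gender` as the target is purely an assumption, and `split_features_and_target` is a hypothetical helper.

```python
# Records copied from the slide; each maps attribute names to values.
dataset = [
    {"Name": "John", "Gender": "Male", "Age": 21, "Income": "30k"},
    {"Name": "Jane", "Gender": "Female", "Age": 20, "Income": "12k"},
    {"Name": "Buhari", "Gender": "Male", "Age": "Ancient", "Income": "30m"},
]

TARGET = "Gender"  # assumption: the slide does not name a target attribute


def split_features_and_target(records, target):
    """Separate the attribute to predict from the remaining attributes."""
    features = [{k: v for k, v in r.items() if k != target} for r in records]
    targets = [r[target] for r in records]
    return features, targets


features, targets = split_features_and_target(dataset, TARGET)
print(targets)      # ['Male', 'Female', 'Male']
print(features[0])  # {'Name': 'John', 'Age': 21, 'Income': '30k'}
```
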
  7. Unsupervised Learning • Principal Component Analysis (PCA) • Clustering • Anomaly detection
  8. Use Cases • Anything (almost) • Table columns: Gender, Age, Income (k), Going to the local bar? • Rows (Gender, Age, Income): M 22 10; F 36 20; F 20 100; M 36 80; F 27 60 • Tree diagram nodes: Male?, Under 26?, Under 40?, Income over 25? (Yes/No branches)
  9. Advantages • Simple • Combining trees • White box model • Efficient processing with large datasets
  10. Disadvantages • Hard to build a tree • Large trees • Overfitting (deep trees)
  11. Generating Trees • Iterative Dichotomiser 3 (ID3) • C4.5 • PCA
  12. Iterative Dichotomiser 3 (ID3) • Find the entropy of the attributes in the dataset • Based on the attribute with the least entropy, divide the dataset into subsets • Add the least entropy attribute as a node to the decision tree
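
The slides name entropy without writing it out. Assuming the usual Shannon entropy in bits, which reproduces the figures on the example slides, the entropy of an attribute A whose outcomes occur in proportions p_i is shown below, followed by a worked case for a 13/7 split of 20 rows (the over-30 / under-30 split in the example):

```latex
H(A) = -\sum_{i} p_i \log_2 p_i
\qquad
H(\text{Age}) = -\tfrac{13}{20}\log_2\tfrac{13}{20} - \tfrac{7}{20}\log_2\tfrac{7}{20} \approx 0.934068
```

A perfectly even two-way split gives H = 1, and the value falls toward 0 as the split becomes more lopsided.
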
  13. Example • Gender: M or F • Income: Over 50k, under 50k • Age: Over 30, under 30 • Entropy (1st round) – Gender: 1 – Income: 0.99277 – Age: 0.934068 • Data (Gender, Income in k, Age): M 40 45; F 40 15; F 100 46; F 40 34; M 20 37; F 20 36; F 100 20; M 100 36; M 60 48; M 10 33; F 80 18; M 70 43; M 60 26; M 90 36; F 60 35; M 40 18; F 100 28; M 20 16; F 20 48; F 90 41
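
The first-round figures can be reproduced from the slide's table with a short script. This is a sketch under assumptions: entropy is taken to be Shannon entropy in bits over each attribute's own two-way split, and `parse`, `ATTRIBUTES`, `entropy` and `attribute_entropies` are illustrative names, not part of the talk.

```python
from collections import Counter
from math import log2

# Rows copied from the slide's table, flattened as Gender, Income (k), Age.
RAW = (
    "M 40 45 F 40 15 F 100 46 F 40 34 M 20 37 F 20 36 F 100 20 "
    "M 100 36 M 60 48 M 10 33 F 80 18 M 70 43 M 60 26 M 90 36 "
    "F 60 35 M 40 18 F 100 28 M 20 16 F 20 48 F 90 41"
)


def parse(raw):
    """Turn the flattened table into (gender, income_k, age) tuples."""
    t = raw.split()
    return [(t[i], int(t[i + 1]), int(t[i + 2])) for i in range(0, len(t), 3)]


# The slide's three two-way tests, one per attribute.
ATTRIBUTES = {
    "Gender": lambda row: row[0] == "M",
    "Income": lambda row: row[1] > 50,
    "Age": lambda row: row[2] > 30,
}


def entropy(values):
    """Shannon entropy, in bits, of a list of outcomes."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def attribute_entropies(rows):
    """Entropy of each attribute's split over the given rows."""
    return {name: entropy([test(r) for r in rows])
            for name, test in ATTRIBUTES.items()}


rows = parse(RAW)
first_round = attribute_entropies(rows)
for name, h in first_round.items():
    print(f"{name}: {h:.6f}")
print("split on:", min(first_round, key=first_round.get))
```

Age has the least entropy, so it becomes the first node, which agrees with the "Under 30?" root in the tree on the next slide. Note that this measures how balanced each attribute's own split is, as the slides do; Quinlan's ID3 is usually described in terms of information gain with respect to a target class.
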
  14. Example • Entropy (2nd round) – Under 30 • Gender: 0.985228 • Income: 0.985228 – Over 30 • Gender: 0.995727 • Income: 0.995727 • Tree diagram nodes: Under 30?, Income > 50k, Female?, Male?, Income > 50k
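
The second round repeats the same calculation inside each Age subset. Recomputing from the table as transcribed on slide 13 gives about 0.985228 for both attributes in the 7-row under-30 subset and about 0.995727 in the 13-row over-30 subset. The sketch below assumes Shannon entropy in bits; `entropy` and `second_round` are illustrative names.

```python
from collections import Counter
from math import log2

# Rows copied from slide 13, flattened as Gender, Income (k), Age.
RAW = (
    "M 40 45 F 40 15 F 100 46 F 40 34 M 20 37 F 20 36 F 100 20 "
    "M 100 36 M 60 48 M 10 33 F 80 18 M 70 43 M 60 26 M 90 36 "
    "F 60 35 M 40 18 F 100 28 M 20 16 F 20 48 F 90 41"
)
tokens = RAW.split()
rows = [(tokens[i], int(tokens[i + 1]), int(tokens[i + 2]))
        for i in range(0, len(tokens), 3)]


def entropy(values):
    """Shannon entropy, in bits, of a list of outcomes."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def second_round(subset):
    """Entropy of the two remaining attributes within one Age subset."""
    return {
        "Gender": entropy([gender == "M" for gender, _, _ in subset]),
        "Income": entropy([income > 50 for _, income, _ in subset]),
    }


# No row has an age of exactly 30, so the two subsets cover the table.
under_30 = [r for r in rows if r[2] < 30]
over_30 = [r for r in rows if r[2] > 30]

for label, subset in (("Under 30", under_30), ("Over 30", over_30)):
    scores = second_round(subset)
    print(label, len(subset), {k: round(v, 6) for k, v in scores.items()})
```

Within each subset Gender and Income tie, so the least-entropy rule alone does not decide between them; either attribute can serve as the next node.
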
  15. Questions? • Thank You