Make Machine Learning Boring Again: Best Practices for Using Machine Learning in Businesses - Albuquerque Machine Learning Meetup (Online) - Aug 2020

szilard
August 16, 2020

Transcript

  1. Make Machine Learning Boring Again: Best
    Practices for Using Machine Learning in
    Businesses
    Szilard Pafka, PhD
    Chief Scientist, Epoch
    Albuquerque Machine Learning Meetup (Online)
    Aug 2020

  3. Disclaimer:
    I am not representing my employer (Epoch) in this talk.
    I can neither confirm nor deny whether Epoch is using any of the methods,
    tools, results etc. mentioned in this talk.

  9. y = f (x1, x2, ... , xn)
    Source: Hastie et al., The Elements of Statistical Learning, 2nd ed.

  10. y = f (x1, x2, ... , xn)
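
The formula above is the core of supervised machine learning: estimate an approximation of y = f(x1, ..., xn) from labeled examples. A minimal sketch in plain Python, assuming the simplest possible case (one feature, linear f, closed-form least squares); real projects would of course use scikit-learn, XGBoost, etc.

```python
# Learn y = f(x) from labeled examples -- ordinary least squares in one
# dimension, no libraries, purely to make the idea concrete.

def fit_ols_1d(xs, ys):
    """Closed-form least squares fit for y ~ a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]              # generated by y = 2x + 1
a, b = fit_ols_1d(xs, ys)
print(a, b)                    # recovers slope 2 and intercept 1
```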

  15. #1 Use the Right Algo

  16. Source: Andrew Ng
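
"Use the right algo" in practice means benchmarking candidates on your own data and a proper holdout, rather than trusting hype. A minimal model-selection harness in pure Python with hypothetical toy models; in a real project the candidates would be e.g. logistic regression, random forest and gradient boosting (scikit-learn / XGBoost / LightGBM).

```python
# Pick the best of several candidate models by holdout accuracy.

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def pick_best(models, holdout):
    scores = {name: accuracy(m, holdout) for name, m in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# toy task: classify whether x > 3
holdout = [(x, int(x > 3)) for x in range(8)]
models = {
    "always_zero": lambda x: 0,            # trivial baseline
    "threshold_2": lambda x: int(x > 2),   # slightly wrong rule
    "threshold_3": lambda x: int(x > 3),   # the right rule
}
best, scores = pick_best(models, holdout)
print(best, scores)
```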

  35. #2 Use Open Source

  41. in 2006:
    - cost was not a factor!
    - data.frame
    - 800 packages

  49. #3 Simple > Complex

  51. 10x

  60. #4 Incorporate Domain Knowledge
    Do Feature Engineering (Still)
    Explore Your Data
    Clean Your Data
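
Domain knowledge typically enters a model as engineered features. A common example is expanding a raw timestamp into features a model can actually use; a stdlib-only sketch (with pandas this would be `dt.dayofweek` and friends):

```python
# Derive model-ready features from a raw timestamp.
from datetime import datetime

def datetime_features(ts: datetime) -> dict:
    return {
        "hour": ts.hour,
        "dayofweek": ts.weekday(),           # Monday = 0
        "is_weekend": int(ts.weekday() >= 5),
        "month": ts.month,
    }

f = datetime_features(datetime(2020, 8, 16, 14, 30))  # a Sunday afternoon
print(f)
```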

  71. #5 Do Proper Validation
    Avoid: Overfitting, Data Leakage
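
One frequent source of leakage: random train/test splits on time-ordered data let the model peek into the future. Split on time instead. A minimal sketch, assuming a hypothetical record layout of (timestamp, features, label):

```python
# Time-based split: train strictly on the past, validate on the future.

def time_split(records, cutoff):
    """Train on rows with timestamp < cutoff, validate on the rest."""
    train = [r for r in records if r[0] < cutoff]
    valid = [r for r in records if r[0] >= cutoff]
    return train, valid

records = [(t, {"x": t}, t % 2) for t in range(10)]   # toy time-stamped rows
train, valid = time_split(records, cutoff=7)
print(len(train), len(valid))
```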

  86. #5+ Model Debugging
    Un-Black Boxing/Understanding,
    Interpretability, Fairness

  87. Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital
    30-day Readmission - Rich Caruana et al.
    "On one of the pneumonia datasets, the rule-based system learned the rule
    HasAsthma(x) ⇒ LowerRisk(x), i.e., that patients who have a history of
    asthma have lower risk of dying from pneumonia than the general population."
    "Patients with a history of asthma usually were admitted not only to the
    hospital but directly to the ICU (Intensive Care Unit). [...] The aggressive
    care received by asthmatic patients was so effective that it lowered their
    risk of dying from pneumonia compared to the general population."
    "Models trained on the data incorrectly learn that asthma lowers risk, when
    in fact asthmatics have much higher risk (if not hospitalized)."
    "The logistic regression model also learned that having asthma lowered risk,
    but this could easily be corrected by changing the weight on the asthma
    feature from negative to positive (or to zero)."

  92. #6 Batch or Real-Time Scoring?
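
The same trained model can be wrapped either way: score the whole table on a schedule, or score one record per request. A toy sketch (the model function is a hypothetical stand-in); the real trade-off is latency and serving infrastructure vs. the simplicity of a nightly job.

```python
# Batch vs. real-time scoring around one model function.

def model(features):
    # stand-in for a trained model's predict function
    return 0.3 * features["x"] + 0.1

def score_batch(rows):
    """Nightly-style job: score every row at once."""
    return [model(r) for r in rows]

def score_realtime(row):
    """Request-time scoring: one record, low latency required."""
    return model(row)

rows = [{"x": 1.0}, {"x": 2.0}]
batch = score_batch(rows)
print(batch, score_realtime({"x": 1.0}))
```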

  94. https://medium.com/@HarlanH/patterns-for-connecting-predictive-models-to-software-products-f9b6e923f02d

  95. https://medium.com/@dvelsner/deploying-a-simple-machine-learning-model-in-a-modern-web-application-flask-angular-docker-a657db075280
    your app

  98. R/Python:
    - Slow(er)
    - Encoding of categorical variables
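
The encoding pitfall: the category-to-number mapping must be frozen at training time and reused at scoring time, or live traffic with new or reordered categories silently shifts the feature values. A minimal stdlib encoder sketch:

```python
# Fit a categorical encoding once, reuse it unchanged at scoring time.

class CategoryEncoder:
    def __init__(self):
        self.mapping = {}

    def fit(self, values):
        # freeze the category -> integer mapping from training data
        self.mapping = {v: i for i, v in enumerate(sorted(set(values)))}
        return self

    def transform(self, values, unknown=-1):
        # unseen categories at scoring time get a sentinel, not a crash
        return [self.mapping.get(v, unknown) for v in values]

enc = CategoryEncoder().fit(["US", "DE", "US", "FR"])
print(enc.transform(["US", "FR", "BR"]))   # BR was never seen in training
```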

  99. #7 Do Online Validation as Well

  103. https://www.oreilly.com/ideas/evaluating-machine-learning-models/page/2/orientation
    https://www.slideshare.net/FaisalZakariaSiddiqi/netflix-recommendations-feature-engineering-with-time-travel

  104. #8 Monitor Your Models
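
One cheap monitoring signal is drift in the score distribution: compare recent predicted scores against the training-time baseline and alert past a threshold. A sketch with an illustrative threshold and made-up numbers; production monitoring would also track input distributions and, where labels arrive, realized performance.

```python
# Alert when the mean predicted score drifts from the training baseline.

def score_drift_alert(baseline_scores, recent_scores, max_shift=0.05):
    base = sum(baseline_scores) / len(baseline_scores)
    recent = sum(recent_scores) / len(recent_scores)
    shift = abs(recent - base)
    return shift > max_shift, shift

alert, shift = score_drift_alert(
    baseline_scores=[0.1, 0.2, 0.3, 0.2],   # mean 0.2 at training time
    recent_scores=[0.4, 0.5, 0.45, 0.5],    # recent batch scores much higher
)
print(alert, shift)
```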

  106. https://www.retentionscience.com/blog/automating-machine-learning-monitoring-rs-labs/

  109. 20% / 80% (my guess)

  111. #9 Business Value
    Seek / Measure / Sell

  117. #10 Make it Reproducible

  127. #11 Use the Cloud (Virtual Servers)

  129. ML training:
    lots of CPU cores
    lots of RAM
    limited time
    ML scoring:
    separate servers

  130. #12 Don’t Use ML (cloud) services
    (MLaaS)

  132. “The people that know what they’re doing just use open source, and the
    people that don’t will not get anything to work, ever, even with APIs.”
    https://bradfordcross.com/five-ai-startup-predictions-for-2017/

  133. #13 Use High-Level APIs
    but not GUIs

  136. #14 Kaggle Doesn’t Matter (Mostly)

  138. - already pre-processed data
    - less domain knowledge (or deliberately hidden)
    - 0.0001 AUC increases deemed "relevant"
    - no business metric
    - no actual deployment
    - models too complex
    - no online evaluation
    - no monitoring
    - data leakage

  139. #15 GPUs (Depends)

  140. [Benchmark charts: aggregation, 100M rows / 1M groups; join, 100M rows
    x 1M rows; axes show time in seconds]

  141. [Same benchmark charts, captioned “Motherfucka!”]

  143. #16 Tuning and AutoML (Depends)

  144. Ben Recht, Kevin Jamieson: http://www.argmin.net/2016/06/20/hypertuning/
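
The Recht/Jamieson post linked above argues random search is a hard-to-beat baseline for hyperparameter tuning. A sketch on a made-up objective with an assumed optimum at lr=0.1, depth=5; in practice the objective would be cross-validated model performance.

```python
# Random search over a hyperparameter space, minimizing an objective.
import random

def random_search(objective, space, n_iter=200, seed=42):
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def objective(p):
    # hypothetical loss with optimum at lr=0.1, depth=5
    return (p["lr"] - 0.1) ** 2 + (p["depth"] - 5) ** 2

space = {"lr": (0.01, 1.0), "depth": (1, 10)}
best, score = random_search(objective, space)
print(best, score)
```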

  145. https://arxiv.org/pdf/1907.00909.pdf

  147. “There is no AutoML system which consistently
    outperforms all others. On some datasets, the performance
    differences can be significant, but on others the AutoML
    methods are only marginally better than a Random Forest.
    On 2 datasets, all frameworks perform worse than a
    Random Forest.”

  148. Winner stability in data
    science competitions
    Test Set N=100K, Models M=1000

  149. Winner stability in data
    science competitions
    Test Set N=100K, Models M=3000

  150. Winner stability in data
    science competitions
    Test Set N=10K, Models M=1000

  151. Winner stability in data
    science competitions
    Test Set N=10K, Models M=3000
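
The four slides above vary test-set size N and number of models M. A small version of that experiment (assumed setup: M models with identical true accuracy, so any leaderboard gap is pure test-set noise) shows how much the "winner" overstates its skill, and how the effect shrinks only as N grows:

```python
# Winner stability: max observed accuracy of equally-skilled models
# scored on test sets of different sizes.
import random

def winner_margin(n_test, n_models, true_acc=0.9, seed=0):
    """How far the best observed accuracy exceeds the true accuracy."""
    rng = random.Random(seed)
    observed = [
        sum(rng.random() < true_acc for _ in range(n_test)) / n_test
        for _ in range(n_models)
    ]
    return max(observed) - true_acc

small_n = winner_margin(n_test=1_000, n_models=50)
large_n = winner_margin(n_test=100_000, n_models=50)
print(small_n, large_n)   # the winner's edge shrinks as the test set grows
```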

  152. Meta: Ignore the Hype

  154. Is This AI?

  158. How to Start?
