Introduction to Machine Learning (Nusantech Webinar)

Introduction to Machine Learning for Nusantech Webinar. Demo materials: bit.ly/intro-ml-nusantech

Galuh Sahid

June 28, 2020

Transcript

  1. Introduction to Machine Learning. Galuh Sahid (@galuhsahid | galuh.me). Nusantech Webinar - June 28, 2020

  2. Hi! I’m Galuh. • Data Scientist at Gojek • Google Developer Expert in Machine Learning • Co-host of the podcast Kartini Teknologi (kartiniteknologi.id)

  3. https://bit.ly/intro-ml-nusantech

  4. Outline • Definition of machine learning & how it differs from traditional programming • Machine learning flow • Defining the ML problem • Acquiring, getting to know, & preparing your data • Training your model • Making predictions • Tools & resources • Demo

  5. Photo by Bram Van Oost from Unsplash

  6. Photo by Frank V. from Unsplash

  7. Photo by Bence Boros from Unsplash

  8. Photo by Nordwood from Unsplash

  9. Photo by Krsto Jevtic from Unsplash

  10. It’s an exciting time to learn about machine learning!

  11. But… what is machine learning?

  12. “A field of study that gives computers the ability to learn without being explicitly programmed.” Arthur Samuel (1959)

  13. How is machine learning different from traditional programming?

  14. Traditional programming: rules and data go in, answers come out

  15. if pixel[5][7] is black and pixel[5][6] is black and pixel[5][8] is black and …: if pixel[6][7] is black and pixel[6][7] is black and …: return “panda” … … … else: return “cat” Photo by Damian Patowski from Unsplash

  16. if pixel[5][7] is black and pixel[5][6] is black and pixel[5][7] is black and …: if pixel[6][7] is black and pixel[6][7] is black and …: return “panda” … … … else: return “not cat” Photo by Dušan Smetana from Unsplash

  17. Machine learning: answers and data go in, rules come out

  18. Answers and data: photos labeled Panda, Cat, Cat, Panda. Photos by Max Baskalov and Zane Lee from Unsplash

  19. ? Cat ? Photo by Cyrus Chew from Unsplash

  20. Machine learning? Artificial intelligence? Neural networks? Deep learning?

  21. Source

  22. Rule engine, knowledge graphs. Source

  23. Regression, decision tree, random forest. Source

  24. Machine learning flow

  25. • Define your machine learning problem • Acquire, get to know, & prepare your data • Train your model • Use the model to make predictions. Adapted from Introduction to ML Problem Framing

  26. Step #1 Define your machine learning problem

  27. Type #1 Binary classification: classifies input into one of two categories

  28. Binary classification example: spam or not spam

  29. Type #2 Multi-class classification: classifies input into one of more than two categories

  30. Multi-class classification example: language prediction

  31. Type #3 Regression: predicts a value on a continuous scale

  32. Regression example: visit duration & wait time estimates

  33. Type #4 Catalog organization: produces a set of results to present to users

  34. Catalog organization example: recommender system

  35. Type #5 Generative model: focuses on generating data rather than classifying or organizing it

  36. Generative model example: generating a new face

  37. Generative model example: generating new text (talktotransformer.com)

  38. Step #2 Acquire, get to know, & prepare your data

  39. Step #2 Acquire, get to know, & prepare your data. You need to know: • What types of data you can use • Where to get them • How to get to know your data • How to prepare your data

  40. Data Type #1: Tabular

  41. Data Type #2: Text

  42. Data Type #3: Sound

  43. Data Type #4: Image

  44. Data: where to get the data? - Use a ready-to-use dataset - Extract the data yourself - Collect and build your own dataset from scratch

  45. Data Source #1: Google Research https://research.google/tools/datasets/

  46. Data Source #2: Google Dataset Search https://datasetsearch.research.google.com

  47. Data Source #3: Kaggle Datasets https://www.kaggle.com/datasets

  48. Getting to know your data: Exploratory Data Analysis - Analyze your data to summarize its main characteristics - Examples include checking basic statistics (e.g. mean, median), missing data, and outliers
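The checks named on this slide can be sketched with Python's standard library alone. This is only an illustration: the `areas` list is made-up data, `None` is assumed to mark a missing value, and the 1.5 * IQR rule for flagging outliers is one common convention, not something the talk prescribes.

```python
import statistics

# Hypothetical house areas in square meters; None marks a missing value.
areas = [70, 85, None, 90, 100, 95, 80, 1000]

# Missing data: count the entries that have no value.
present = [a for a in areas if a is not None]
missing_count = len(areas) - len(present)

# Basic statistics on the values that are present.
mean_area = statistics.mean(present)
median_area = statistics.median(present)

# Outliers: flag values far outside the interquartile range (IQR).
q1, _, q3 = statistics.quantiles(present, n=4)
iqr = q3 - q1
outliers = [a for a in present if a < q1 - 1.5 * iqr or a > q3 + 1.5 * iqr]

print("missing:", missing_count)
print("mean:", round(mean_area, 1), "median:", median_area)
print("outliers:", outliers)
```

Note how the single extreme value pulls the mean far above the median, which is one reason to look at both.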
  49. Preparing your data: Feature engineering - Handling categorical data
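One common way to handle categorical data is one-hot encoding, where each category becomes its own 0/1 column. The slide does not say which technique the talk used, so treat this as an assumed example; the function name `one_hot_encode` and the `house_types` values are hypothetical.

```python
def one_hot_encode(values):
    """Return (categories, rows): one 0/1 list per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


# Hypothetical categorical feature for a house price dataset.
house_types = ["apartment", "house", "villa", "house"]
categories, encoded = one_hot_encode(house_types)

print(categories)
for house_type, row in zip(house_types, encoded):
    print(house_type, row)
```

Libraries such as pandas and scikit-learn ship their own encoders; the hand-written version is only meant to show what the transformation does.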

  50. Preparing your data: Feature engineering - Handling outliers
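There are several ways to handle outliers (removing them, transforming the feature, or capping them). As a minimal sketch of capping, the hypothetical helper below clips values to the 1.5 * IQR bounds. The data and the choice of rule are assumptions made for illustration, not the method shown in the talk.

```python
import statistics


def clip_outliers(values, k=1.5):
    """Cap each value at the [q1 - k*IQR, q3 + k*IQR] bounds."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical house areas with one extreme value.
areas = [70, 80, 85, 90, 95, 100, 1000]
clipped = clip_outliers(areas)

print(clipped)
```

Whether capping is appropriate depends on the data: an extreme value can be a measurement error or a genuine, informative example.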

  51. Step #3 Train your model

  52. Step #3 Train your model. You need to know: • What a feature is • What a model is • How the training process works • How loss helps our model get better • How evaluation metrics help us know if our model is good enough

  53. Features: examples of features for house price prediction

  54. Features: examples of features for house price prediction. We want to predict this…

  55. Features: examples of features for house price prediction. We want to predict this… …using these features

  56. Features: what are the features for an image classification task?

  57. Features: what are the features for an image classification task? Source
  58. Model: what is a model? - A model maps examples to predicted labels - It is defined by weights that are learned during the training process - Once trained, you can use it to make predictions about data that it has never seen before
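To make "a model is defined by weights" concrete, here is a minimal sketch of a linear model as a plain function. The weights and the example house are hypothetical; in practice the weights would come out of training rather than being typed in by hand.

```python
def predict_price(features, weights, bias=0.0):
    """Linear model: weighted sum of the feature values plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical weights: one for number of floors, one for area size.
weights = [2.0, 3.0]

# A house the model has never seen: 2 floors, area size 100.
unseen_house = [2, 100]
predicted = predict_price(unseen_house, weights)

print(predicted)
```

The function itself never changes; only the weights do, which is why training amounts to searching for good weight values.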
  59. Model: what is a model? - There are many algorithms that you can use: • Linear regression • Logistic regression • Decision tree • Support Vector Machine (SVM) • Naive Bayes • kNN • …
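As a small illustration of one algorithm from this list, the sketch below implements k-nearest neighbors (kNN) in plain Python: a new example gets the majority label of its k closest training examples. The two features (body weight and share of black-and-white fur) and all data points are invented for this example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training examples nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: ((weight in kg, black-and-white fur share), label)
train = [
    ((4.0, 0.10), "cat"),
    ((5.0, 0.20), "cat"),
    ((3.5, 0.00), "cat"),
    ((100.0, 0.90), "panda"),
    ((110.0, 1.00), "panda"),
    ((95.0, 0.95), "panda"),
]

print(knn_predict(train, (90.0, 0.9)))
print(knn_predict(train, (4.5, 0.1)))
```

In a real project the features would usually be scaled first, since here the weight in kilograms dominates the distance.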
  60. Model: the training process - Iteration 1: 2*number of floors + 3*area size = predicted house price. Diagram labels: Model, Data, Predictions. House #1: predicted: 200 million, actual: 500 million, difference: 300 million

  61. Model: the training process - Iteration 1: 2*number of floors + 3*area size = predicted house price - Iteration 2: 4*number of floors + 6*area size = predicted house price. Diagram labels: Model, Data, Predictions. House #1: predicted: 400 million, actual: 500 million, difference: 100 million

  62. Model: the training process - Iteration 1: 2*number of floors + 3*area size = predicted house price - Iteration 2: 4*number of floors + 6*area size = predicted house price. Diagram labels: Model, Data, Predictions. House #1: predicted: 400 million, actual: 500 million, difference: 100 million. Our model does not get smart right away - it needs to be “trained”
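The slides do not say how the weights are updated between iterations. One widely used approach is gradient descent, sketched below under that assumption. The training examples are hypothetical and were generated from price = 4*floors + 6*area, so a working training loop should move the weights from the starting guess (2, 3) toward (4, 6).

```python
def predict(weights, features):
    """Linear model without a bias term: weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features))


def train(examples, weights, learning_rate=0.1, iterations=1000):
    """Batch gradient descent on mean squared error."""
    n = len(examples)
    for _ in range(iterations):
        gradients = [0.0] * len(weights)
        for features, actual in examples:
            error = predict(weights, features) - actual
            for j, x in enumerate(features):
                gradients[j] += 2 * error * x / n
        # Move each weight a small step against its gradient.
        weights = [w - learning_rate * g for w, g in zip(weights, gradients)]
    return weights


# Hypothetical examples: ([number of floors, area size], price)
examples = [
    ([1, 1.0], 10.0),
    ([2, 1.5], 17.0),
    ([1, 2.0], 16.0),
    ([3, 2.5], 27.0),
]

start = [2.0, 3.0]  # the first guess, as in iteration 1 on the slide
learned = train(examples, start)

print([round(w, 3) for w in learned])
```

The learning rate and iteration count are hand-picked for this tiny dataset; other data would likely need different values.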
  63. Model: how loss helps our model get better. High loss vs. low loss - Arrows represent loss - Blue lines represent predictions. Adapted from Machine Learning Crash Course
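Loss can be computed as a single number that summarizes how far the predictions are from the actual values. The sketch below uses mean squared error, a common choice for regression; the talk does not state which loss function it assumes, and all prices here are made up.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical house prices in millions.
actual = [500, 300, 400]
poor_predictions = [200, 600, 100]    # far from the actual values
better_predictions = [450, 320, 390]  # close to the actual values

high_loss = mean_squared_error(actual, poor_predictions)
low_loss = mean_squared_error(actual, better_predictions)

print("high loss:", high_loss)
print("low loss:", low_loss)
```

During training, the model keeps adjusting its weights in whatever direction makes this number smaller.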
  64. Model: how evaluation metrics help us know that our model is good enough - Evaluation metrics: • Accuracy • Mean Absolute Error • Root Mean Squared Error • … and more

                            Actual Spam   Actual Not Spam
      Predicted Spam             15              10
      Predicted Not Spam          5              30

      Accuracy: (correctly classified spam emails + correctly classified not spam emails) / total emails = (15 + 30) / (15 + 10 + 5 + 30) = 75%
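The three metrics named on this slide can be worked through in a few lines. The confusion-matrix counts are the ones from the slide; the house prices used for Mean Absolute Error and Root Mean Squared Error are hypothetical, since the slide gives no regression example.

```python
import math

# Confusion-matrix counts from the slide.
predicted_spam_actual_spam = 15          # correct
predicted_spam_actual_not_spam = 10      # wrong
predicted_not_spam_actual_spam = 5       # wrong
predicted_not_spam_actual_not_spam = 30  # correct

correct = predicted_spam_actual_spam + predicted_not_spam_actual_not_spam
total = (correct + predicted_spam_actual_not_spam
         + predicted_not_spam_actual_spam)
accuracy = correct / total


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; penalizes large errors more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical house prices in millions.
actual = [500, 300, 400]
predicted = [450, 320, 390]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)

print(f"accuracy: {accuracy:.0%}")
print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")
```

Accuracy fits classification problems, while MAE and RMSE fit regression problems such as house price prediction.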
  65. Step #4 Use the model to make predictions

  66. Tools & resources

  67. Tools & resources: programming languages - Python or R is usually the go-to programming language - However, you can now train your own machine learning models using JavaScript thanks to TensorFlow.js

  68. Tools & resources: libraries - Data manipulation: numpy, pandas - NLP: NLTK, spaCy - Image processing: PIL, OpenCV - Machine learning: scikit-learn, TensorFlow, TensorFlow Lite

  69. Tools & resources: Teachable Machine

  70. Tools & resources: TensorFlow Hub

  71. Demo #1: Predicting house prices

  72. https://bit.ly/intro-ml-nusantech

  73. Demo #2: Image classification

  74. Image classification https://main-suit.glitch.me

  75. Image classification https://main-suit.glitch.me

  76. References

  77. More machine learning • On building ML projects: First Steps Towards Your First Machine Learning Project • On ML with JavaScript: Machine Learning on the Web • On ML with TensorFlow: A Whirlwind Tour of Machine Learning with TensorFlow

  78. Learning resources • Deep Learning with Python (book) by François Chollet • Machine Learning Glossary • Machine Learning Crash Course • TensorFlow Tutorials • Teachable Machine Tutorials (1, 2, 3) • But what is a neural network? (video)

  79. Thank you!