Introduction to Machine Learning (Nusantech Webinar)

Introduction to Machine Learning for Nusantech Webinar. Demo materials: bit.ly/intro-ml-nusantech

Galuh Sahid

June 28, 2020


Transcript

  1. Introduction to
    Machine Learning
    Galuh Sahid
    @galuhsahid | galuh.me
    Nusantech Webinar - June 28, 2020


  2. Hi! I’m Galuh.
    • Data Scientist at Gojek
    • Google Developer Expert in Machine Learning
    • Co-host of the podcast Kartini Teknologi (kartiniteknologi.id)


  3. https://bit.ly/intro-ml-nusantech

  4. Outline
    • Definition of machine learning & how it differs from traditional programming
    • Machine learning flow
    • Defining the ML problem
    • Acquiring, getting to know, & preparing your data
    • Training your model
    • Making predictions
    • Tools & resources
    • Demo


  5. Photo by Bram Van Oost from Unsplash

  6. Photo by Frank V. from Unsplash

  7. Photo by Bence Boros from Unsplash

  8. Photo by Nordwood from Unsplash

  9. Photo by Krsto Jevtic from Unsplash

  10. It’s an exciting time to learn about
    machine learning!

  11. But… what is machine learning?

  12. A field of study that gives computers the ability to learn without being
    explicitly programmed.
    Arthur Samuel (1959)

  13. How is machine learning different
    from traditional programming?

  14. Traditional Programming: Rules + Data → Answers


  15. if pixel[5][7] is black and pixel[5][6] is black and pixel[5][8] is black and …:
        if pixel[6][7] is black and pixel[6][7] is black and …:
          return "panda"
      else:
        return "cat"
    Photo by Damian Patowski from Unsplash


  16. if pixel[5][7] is black and pixel[5][6] is black and pixel[5][7] is black and …:
        if pixel[6][7] is black and pixel[6][7] is black and …:
          return "panda"
      else:
        return "not cat"
    Photo by Dušan Smetana from Unsplash


  17. Machine Learning: Answers + Data → Rules


  18. Data (photos) with their answers (labels): Panda, Cat, Cat, Panda
    Photo by Max Baskalov and Zane Lee from Unsplash


  19. ?
    Cat
    ?
    Photo by Cyrus Chew from Unsplash

  20. Machine learning? Artificial
    intelligence? Neural networks? Deep
    learning?

  21. Source

  22. Rule engine
    Knowledge Graphs
    Source

  23. Regression
    Decision Tree
    Random Forest
    Source

  24. Machine learning flow

  25. Step #1: Define your machine learning problem
    Step #2: Acquire, get to know, & prepare your data
    Step #3: Train your model
    Step #4: Use the model to make predictions
    Adapted from Introduction to ML Problem Framing


  26. Step #1
    Define your machine learning
    problem

  27. Type #1
    Binary classification
    Classifies input into one of two categories

  28. Spam or not spam
    Binary Classification

  29. Type #2
    Multi-class classification
    Classifies input into one of more than two categories

  30. Language prediction
    Multi-class Classification

  31. Type #3
    Regression
    Predicts a value on a continuous scale

  32. Visit duration & wait time estimates
    Regression

  33. Type #4
    Catalog organization
    Produces a set of results to present to users

  34. Recommender System
    Catalog Organization

  35. Type #5
    Generative model
    Focuses on generating data rather than classifying or organizing it

  36. Generating a new face
    Generative Model

  37. Generating new text
    Generative Model
    talktotransformer.com


  38. Step #2
    Acquire, get to know, & prepare your
    data

  39. Step #2
    Acquire, get to know, & prepare your
    data
    You need to know:
    • What types of data you can use
    • Where to get them
    • How to get to know your data
    • How to prepare your data


  40. Data Type #1
    Tabular

  41. Data Type #2
    Text

  42. Data Type #3
    Sound

  43. Data Type #4
    Image

  44. Where to get the data?
    Data
    - Use a ready-to-use dataset
    - Extract the data yourself
    - Collect and build your own dataset from scratch


  45. Data Source #1
    Google Research
    https://research.google/tools/datasets/

  46. Data Source #2
    Google Dataset Search
    https://datasetsearch.research.google.com

  47. Data Source #3
    Kaggle Datasets
    https://www.kaggle.com/datasets

  48. Exploratory Data Analysis
    Getting to know your data
    - Analyze your data to summarize its main characteristics
    - Examples: checking basic statistics (e.g. mean, median), missing data, and outliers
    @galuhsahid
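
A minimal sketch of these checks using only Python's standard library. The price values are made up for illustration and are not from the talk; in practice a library such as pandas would usually handle this.

```python
import statistics

# Hypothetical house prices in millions; None marks a missing value.
prices = [200, 250, 230, None, 260, 240, 2000]

# Missing data: count the gaps, then work with the observed values only.
observed = [p for p in prices if p is not None]
missing_count = len(prices) - len(observed)

# Basic statistics: a mean far above the median hints at extreme values.
mean_price = statistics.mean(observed)
median_price = statistics.median(observed)

# Range check: the minimum and maximum give a first look at possible outliers.
lowest, highest = min(observed), max(observed)

print(f"missing={missing_count}, mean={mean_price}, median={median_price}")
print(f"min={lowest}, max={highest}")
```

Here the mean sits well above the median, which suggests that a single very large value (2000) deserves a closer look before training.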


  49. Feature engineering
    Preparing your data
    - Handling categorical data
    @galuhsahid
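
One common way to handle categorical data is one-hot encoding, where each category becomes its own 0/1 column. The slide does not name a specific technique, so this is an illustrative assumption; the `one_hot_encode` helper and the district values are hypothetical.

```python
def one_hot_encode(values):
    """Turn each category into a 0/1 vector with a single 1 at its position."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1
        encoded.append(row)
    return categories, encoded

# Hypothetical column: the district each house is located in.
districts = ["south", "north", "south", "east"]
categories, encoded = one_hot_encode(districts)

print(categories)
for district, row in zip(districts, encoded):
    print(district, row)
```

Most models expect numbers rather than strings, which is why a step like this usually comes before training.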


  50. Feature engineering
    Preparing your data
    - Handling outliers
    @galuhsahid
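
One possible way to handle outliers is to clip values that fall outside a range derived from the interquartile range (IQR). The slide does not say which method it has in mind, so the 1.5 * IQR rule, the `clip_outliers` helper, and the area values below are assumptions for illustration.

```python
import statistics

def clip_outliers(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest bound."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(value, low), high) for value in values]

# Hypothetical house areas; 900 is far from the rest.
areas = [50, 60, 55, 65, 58, 900]
clipped = clip_outliers(areas)

print(clipped)
```

Clipping keeps the row in the dataset while limiting its influence; dropping the row or investigating the value first are equally valid choices depending on the data.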


  51. Step #3
    Train your model

  52. Step #3
    Train your model
    You need to know:
    • What a feature is
    • What a model is
    • How the training process works
    • How loss helps our model get better
    • How evaluation metrics help us know if our model is good enough


  53. Examples of features for house price
    prediction
    Features

  54. Examples of features for house price
    prediction
    Features
    We want to predict this…

  55. Examples of features for house price
    prediction
    Features
    We want to predict this…
    …using these features

  56. What are the features for an image
    classification task?
    Features

  57. What are the features for an image
    classification task?
    Features
    Source

  58. What is a model?
    Model
    - A model maps examples to predicted labels
    - It is defined by weights that are learned during the training process
    - Once trained, you can use it to make predictions about data that it has
    never seen before

  59. What is a model?
    Model
    - There are many algorithms that you can use:
    • Linear regression
    • Logistic regression
    • Decision tree
    • Support Vector Machine (SVM)
    • Naive Bayes
    • kNN
    • …

  60. The training process
    Model
    - Iteration 1: 2*number of floors + 3*area size = predicted house price
    Data → Model → Predictions
    House #1:
    predicted: 200 million
    actual: 500 million
    difference: 300 million

  61. The training process
    Model
    - Iteration 1: 2*number of floors + 3*area size = predicted house price
    - Iteration 2: 4*number of floors + 6*area size = predicted house price
    Data → Model → Predictions
    House #1 (iteration 2):
    predicted: 400 million
    actual: 500 million
    difference: 100 million

  62. The training process
    Model
    - Iteration 1: 2*number of floors + 3*area size = predicted house price
    - Iteration 2: 4*number of floors + 6*area size = predicted house price
    Data → Model → Predictions
    House #1 (iteration 2):
    predicted: 400 million
    actual: 500 million
    difference: 100 million
    Our model does not get smart right away - it needs to be “trained”
    @galuhsahid
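
The idea of adjusting weights over several iterations can be sketched with a tiny linear model. The slides do not say how the weights are updated, so gradient descent is assumed here as one standard method; the house data, the learning rate, and the function names are hypothetical.

```python
# Hypothetical training data: (number of floors, area size) -> price in millions.
houses = [
    ((1, 50), 200.0),
    ((2, 100), 400.0),
    ((2, 120), 460.0),
    ((3, 150), 600.0),
]

def predict(weights, features):
    """Weighted sum of the features, the same shape as the slide's formula."""
    return sum(w * x for w, x in zip(weights, features))

def mean_squared_error(weights, data):
    """Average squared difference between predicted and actual prices."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def train(data, weights, learning_rate=1e-5, iterations=1000):
    """Gradient descent: repeatedly nudge each weight to reduce the loss."""
    history = [mean_squared_error(weights, data)]
    for _ in range(iterations):
        gradients = [0.0] * len(weights)
        for features, actual in data:
            error = predict(weights, features) - actual
            for i, value in enumerate(features):
                gradients[i] += 2 * error * value / len(data)
        weights = [w - learning_rate * g for w, g in zip(weights, gradients)]
        history.append(mean_squared_error(weights, data))
    return weights, history

# Start from the iteration-1 weights shown on the slide (2 and 3).
weights, history = train(houses, weights=[2.0, 3.0])
print(f"loss before: {history[0]:.1f}, loss after: {history[-1]:.1f}")
print(f"learned weights: {[round(w, 2) for w in weights]}")
```

Each pass compares predictions with actual prices and moves the weights slightly in the direction that shrinks the difference, which is the sense in which the model is "trained".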


  63. How loss helps our model get better
    Model
    [Figure: two panels, High Loss and Low Loss]
    - Arrows represent loss
    - Blue lines represent predictions
    Adapted from Machine Learning Crash Course
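
To make "high loss" and "low loss" concrete, here is a small sketch using mean squared error, which is one common loss for regression; the slide does not specify which loss it illustrates. The price values are invented for the example.

```python
def mean_squared_error(predictions, actuals):
    """Average of the squared gaps between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

# Hypothetical actual house prices (in millions) and two sets of predictions.
actual = [200, 400, 500]
poor_predictions = [100, 150, 900]   # far from the actual values
good_predictions = [210, 390, 480]   # close to the actual values

high_loss = mean_squared_error(poor_predictions, actual)
low_loss = mean_squared_error(good_predictions, actual)

print(f"high loss: {high_loss}, low loss: {low_loss}")
```

The larger the gaps between predictions and actual values (the arrows in the figure), the larger the loss, so training aims to make this number smaller.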


  64. How evaluation metrics help us know that our model is
    good enough
    Model
    - Evaluation metrics:
    • Accuracy
    • Mean Absolute Error
    • Root Mean Squared Error
    • … and more
                           Actual Spam    Actual Not Spam
    Predicted Spam         15             10
    Predicted Not Spam     5              30
    Accuracy:
    (Correctly classified spam emails + correctly classified not spam
    emails)/total emails = (15 + 30)/(15+10+5+30) = 75%
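
The accuracy calculation can be reproduced in a few lines, together with the two regression metrics listed on the slide. The spam counts come from the table above; the house price values used for MAE and RMSE are hypothetical.

```python
import math

# Counts from the spam table above.
true_positive = 15   # predicted spam, actually spam
false_positive = 10  # predicted spam, actually not spam
false_negative = 5   # predicted not spam, actually spam
true_negative = 30   # predicted not spam, actually not spam

total = true_positive + false_positive + false_negative + true_negative
accuracy = (true_positive + true_negative) / total

def mean_absolute_error(predicted, actual):
    """Average size of the errors, ignoring their direction."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def root_mean_squared_error(predicted, actual):
    """Square root of the average squared error; penalizes large misses more."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical house prices in millions.
predicted_prices = [210, 390, 480]
actual_prices = [200, 400, 500]
mae = mean_absolute_error(predicted_prices, actual_prices)
rmse = root_mean_squared_error(predicted_prices, actual_prices)

print(f"accuracy: {accuracy:.0%}")
print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")
```

Accuracy suits classification tasks such as spam detection, while MAE and RMSE suit regression tasks such as price prediction.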


  65. Step #4
    Use the model to make predictions

  66. Tools & resources

  67. Programming languages
    Tools & resources
    - Python or R is usually the go-to programming language
    - However, you can now train your own machine learning models using
    JavaScript thanks to TensorFlow.js

  68. Libraries
    Tools & resources
    - Data manipulation: numpy, pandas
    - NLP: NLTK, spaCy
    - Image processing: PIL, OpenCV
    - Machine learning: scikit-learn, TensorFlow, TensorFlow Lite

  69. Teachable Machine
    Tools & resources

  70. TensorFlow Hub
    Tools & resources

  71. Demo #1: Predicting house prices

  72. https://bit.ly/intro-ml-nusantech

  73. Demo #2: Image classification

  74. Image classification
    https://main-suit.glitch.me

  75. Image classification
    https://main-suit.glitch.me

  76. References


  77. More machine learning
    • On building ML projects: First Steps Towards Your First Machine
    Learning Project
    • On ML with JavaScript: Machine Learning on the Web
    • On ML with TensorFlow: A Whirlwind Tour of Machine Learning
    with TensorFlow

  78. Learning resources
    • Deep Learning with Python (book) by François Chollet
    • Machine Learning Glossary
    • Machine Learning Crash Course
    • TensorFlow Tutorials
    • Teachable Machine Tutorials (1, 2, 3)
    • But what is a neural network? (video)

  79. Thank you!