First Steps Towards Your First Machine Learning Project

Girls in ICT Brunei 2020

Galuh Sahid

April 26, 2020

Transcript

  1. First Steps Towards Your First Machine Learning Project
    Galuh Sahid (Twitter: @galuhsahid)
    Data Scientist at Gojek
    Google Developer Expert in Machine Learning

  2. How to get started with machine learning?

  3. Ideas for your first project
    Formulating your machine learning problem
    Tools that you can use

  4. Ideas for your first project

  5. It has to be… something you enjoy

  6. Spotify playlists

  7. Google Home (photo credit: unsplash.com)

  8. I love…
    • Sports
      - Idea: build a model that predicts championship winners based on past results
    • Literature
      - Idea: build a text generator in the style of famous authors
    • Movies
      - Idea: classify the sentiment of movie reviews

  9. It has to be… something you enjoy

  10. It has to be… something you enjoy
    It does not have to be… something complicated or novel

  11. Formulating your machine learning problem

  12. Formulating an ML problem
    Supervised Learning

  13. Formulating an ML problem
    Supervised Learning
    Leaf Width   Leaf Length   Species
    2.7          4.9           small-leaf
    3.2          5.5           big-leaf
    2.9          5.1           small-leaf
    3.4          6.8           big-leaf
    Adapted from Google’s Machine Learning Problem Framing

  14. Formulating an ML problem
    Supervised Learning
    Leaf Width   Leaf Length   Species
    2.7          4.9           small-leaf
    3.2          5.5           big-leaf
    2.9          5.1           small-leaf
    3.4          6.8           big-leaf
    Leaf Width and Leaf Length are the features; Species is the label.
    The features and their corresponding labels are fed into an algorithm in a process called training.
    What happens during training? The algorithm gradually determines the relationship between the features and their corresponding labels.
    This relationship is called the model.
    Adapted from Google’s Machine Learning Problem Framing

  15. Formulating an ML problem
    Supervised Learning
    Adapted from Google’s Machine Learning Problem Framing

  16. Formulating an ML problem
    Supervised Learning
    Adapted from Google’s Machine Learning Problem Framing

  17. Formulating an ML problem
    Supervised Learning
    Leaf Width   Leaf Length   Species
    2.7          4.9           small-leaf
    3.2          5.5           big-leaf
    2.9          5.1           small-leaf
    3.4          6.8           big-leaf
    2.1          3.4           ? (our prediction: small-leaf)
    Adapted from Google’s Machine Learning Problem Framing
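
To make the training and prediction steps concrete, here is a minimal sketch in plain Python. It uses the leaf table from the slides and a 1-nearest-neighbor rule, which is an assumption chosen for simplicity and not the algorithm used in the talk. The names `TRAINING_DATA` and `predict_species` are hypothetical.

```python
import math

# Training data copied from the slide: (leaf width, leaf length) -> species label.
TRAINING_DATA = [
    ((2.7, 4.9), "small-leaf"),
    ((3.2, 5.5), "big-leaf"),
    ((2.9, 5.1), "small-leaf"),
    ((3.4, 6.8), "big-leaf"),
]


def predict_species(features, training_data=TRAINING_DATA):
    """Return the label of the training example closest to `features`.

    This is 1-nearest-neighbor, picked only for illustration: the "model"
    here is simply the stored examples plus a distance rule.
    """
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], features),
    )
    return nearest_label


if __name__ == "__main__":
    # The unlabeled leaf from the slide (width 2.1, length 3.4).
    print(predict_species((2.1, 3.4)))
```

For the unlabeled leaf on the slide, this sketch returns `small-leaf`, matching the prediction shown there. A library such as scikit-learn would normally replace the hand-written rule.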

  18. Formulating an ML problem
    Supervised Learning
    • Classification

  19. Formulating an ML problem
    Supervised Learning
    • Classification
      - Binary
        Is this email spam or not spam?
        Is the sentiment of this movie review positive or negative?
        Is this a picture of a cat or a dog?

  20. Formulating an ML problem
    Supervised Learning
    • Classification
      - Binary
      - Multi-class
        Is this a comedy, horror, or drama movie?
        Is this the sound of a dog, a bird, a cat, or a grasshopper?
        Is this a picture of a shirt, a book, or food?

  21. Formulating an ML problem
    Supervised Learning
    • Classification
      - Binary
      - Multi-class
    • Regression
        What is the price of a house that has 2 floors, 4 bedrooms, and 2 bathrooms?
        What will the temperature in Paris be tomorrow?
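
Unlike classification, regression predicts a number. The sketch below fits a straight line by ordinary least squares to a tiny house-price table. The data, the single `bedrooms` feature, and the helper name `fit_line` are all invented for illustration; a real project would use more features and a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: number of bedrooms -> price (in thousands).
bedrooms = [1, 2, 3, 5]
prices = [150.0, 200.0, 250.0, 350.0]

slope, intercept = fit_line(bedrooms, prices)

# Predict the price of a 4-bedroom house, which is not in the training data.
predicted_price = slope * 4 + intercept
print(f"Predicted price: {predicted_price:.1f}")
```

Because the made-up prices lie exactly on a line, the fitted slope is 50 and the intercept is 100, so the 4-bedroom prediction is 300.0.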

  22. Formulating an ML problem
    Supervised Learning
    Unsupervised Learning

  23. Formulating an ML problem
    Unsupervised Learning
    Adapted from Google’s Machine Learning Problem Framing

  24. Formulating an ML problem
    Unsupervised Learning
    Adapted from Google’s Machine Learning Problem Framing
    Sports
    Entertainment
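
In unsupervised learning the algorithm receives no labels and has to find the groups by itself. The following sketch shows one common way to do that, a basic k-means loop in plain Python. The 2-D points are hypothetical stand-ins for article features, and names such as "Sports" or "Entertainment" would be attached by a person only after inspecting the resulting clusters.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Group `points` into clusters with a basic k-means loop.

    Returns one cluster index per point.
    """
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments


# Hypothetical, unlabeled feature vectors (one per article).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]

# Fixed starting centroids keep the sketch deterministic.
clusters = k_means(points, initial_centroids=[points[0], points[3]])
print(clusters)
```

With these points, the first three land in cluster 0 and the last three in cluster 1.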

  25. Formulating an ML problem
    Supervised Learning
    Unsupervised Learning

  26. Formulating an ML problem
    Identifying Data Sources
    • Use a ready-to-use dataset
    • Collect and build your own dataset from scratch
    • Extract the data from existing data sources

  27. Formulating an ML problem
    Ready-to-use dataset
    Google Research (https://research.google/tools/datasets/)

  28. Formulating an ML problem
    Ready-to-use dataset
    Google Dataset Search (datasetsearch.research.google.com)

  29. Formulating an ML problem
    Ready-to-use dataset
    Google Dataset Search (datasetsearch.research.google.com)

  30. Formulating an ML problem
    Ready-to-use dataset
    Kaggle Datasets (https://www.kaggle.com/datasets)

  31. Formulating an ML problem
    Ready-to-use dataset
    • Data cleaning, manipulation, and transformation are often still necessary
    • You need to know the labels that you expect
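
As a rough illustration of both points, the sketch below cleans a handful of rows from a made-up movie-review dataset: it trims whitespace, normalizes label casing, and drops rows with empty text or a label outside the expected set. The rows, `EXPECTED_LABELS`, and `clean_rows` are hypothetical.

```python
# Hypothetical raw rows, invented for this sketch: (review text, label).
raw_rows = [
    ("Great movie!", "Positive"),
    ("  Terrible plot.  ", "negative "),
    ("", "positive"),                   # empty text: nothing to learn from
    ("Not sure about it", "neutral"),   # a label we did not expect
    ("Loved it", "POSITIVE"),
]

# The labels this project expects to see.
EXPECTED_LABELS = {"positive", "negative"}


def clean_rows(rows, expected_labels=EXPECTED_LABELS):
    """Normalize text and labels, dropping rows that cannot be used."""
    cleaned = []
    for text, label in rows:
        text = text.strip()
        label = label.strip().lower()
        if text and label in expected_labels:
            cleaned.append((text, label))
    return cleaned


cleaned_rows = clean_rows(raw_rows)
for row in cleaned_rows:
    print(row)
```

Three of the five rows survive. In practice a library such as pandas would handle this kind of cleanup on a real file.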

  32. Formulating an ML problem
    Ready-to-use dataset
    Example: predicting the sentiment of movie reviews (positive or negative)

  33. Formulating an ML problem
    Extract the data yourself
    - Scraping Twitter data
      Example applications:
      - Sentiment analysis (the data must be labeled)
      - Topic detection, e.g. 50% of the tweets are about Chris Evans’ new TV series, 25% about Avengers, and 25% about the Golden Globes
    - Scraping news websites

  34. Formulating an ML problem
    Extract the data by yourself
    • Might be time-consuming

  35. Formulating an ML problem
    Build your own dataset from scratch
    • Might be time-consuming, especially if you need a lot of data
    • You need to ensure that the way you collect your data reflects real-world conditions
    - Example: building a bird audio dataset by recording the sounds of different birds

  36. Formulating an ML problem
    Now you know:
    - The problem statement of your project (e.g. “Our problem is best framed as 3-class classification, which predicts whether a video
    will be in one of three classes—{very popular, somewhat popular, not popular}—28 days after being uploaded”)
    - What data you need to process (text? images?)
    - Whether you need labeled data or not
    - Possible algorithms for your project
    Adapted from Google’s Machine Learning Problem Framing
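
Once the problem is framed as 3-class classification, each training example needs one of the three labels. The sketch below shows one possible way to derive them, assuming popularity is measured by view count 28 days after upload. The function name and both thresholds are hypothetical and would have to be chosen for the actual project.

```python
def popularity_class(views_after_28_days, somewhat_threshold=1_000, very_threshold=100_000):
    """Map a 28-day view count to one of the three classes.

    Both thresholds are made-up values for illustration only.
    """
    if views_after_28_days >= very_threshold:
        return "very popular"
    if views_after_28_days >= somewhat_threshold:
        return "somewhat popular"
    return "not popular"


# Hypothetical view counts for three videos.
for views in (250, 12_000, 2_500_000):
    print(views, "->", popularity_class(views))
```

These labels become the target that a supervised classifier learns to predict from features available at upload time.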

  37. Tools & resources

  38. Tools & resources
    Programming languages
    • Python is usually the go-to programming language
    • However, you can now train your own machine learning models using JavaScript thanks to TensorFlow.js

  39. Tools & resources
    Libraries
    • Data manipulation: NumPy, pandas
    • NLP: NLTK, spaCy
    • Image processing: PIL, OpenCV
    • Machine learning: scikit-learn, TensorFlow, TensorFlow Lite

  40. Tools & resources
    Teachable Machine

  41. Tools & resources
    Teachable Machine

  42. Tools & resources
    TensorFlow Hub

  43. Tools & resources
    TensorFlow Hub
    Example: MobileBERT, a compressed version of BERT (Bidirectional Encoder Representations from Transformers)

  44. Tools & resources
    TensorFlow Hub

  45. Tools & resources
    Learning Resources
    • Deep Learning with Python (book) by François Chollet
    • Machine Learning Glossary
    • Machine Learning Crash Course
    • TensorFlow Tutorials
    • Teachable Machine Tutorials (1, 2, 3)

  46. Takeaways
    • Building machine learning projects can be a great way to learn machine learning
    • ML projects don’t have to be super fancy or complicated; they just have to be something you enjoy :)
    • The process of formulating your ML problem can help you figure out your next steps (e.g. what data to collect, which tools to use, and which algorithms are possible)
    • There are lots of resources and tools out there to help you!

  47. Thank you!
    @galuhsahid
