Machine Learning 101

Ali Akbar S.
December 18, 2017

Transcript

  1. Machine Learning 101, by Ali Akbar Septiandri (Universitas Al Azhar Indonesia)

  2. Previously...

  3. Cross Industry Standard Process for Data Mining (CRISP-DM)

  4. Data Science Venn Diagram

  5. What is the role of machine learning algorithms?

  6. “Fundamentally, machine learning involves building mathematical models to help understand data.” - Jake VanderPlas
  7. Tasks in Machine Learning
    1. Predicting stock prices
    2. Differentiating cat vs. dog pictures
    3. Spam identification
    4. Community detection
    5. Mimicking famous painting styles
    6. Mastering the games of Go and chess
    7. etc.
  8. Task Categories
    1. Supervised learning
      a. Predicting stock prices
      b. Differentiating cat vs. dog pictures
      c. Spam identification
    2. Unsupervised learning
      a. Community detection
      b. Mimicking famous painting styles
    3. Reinforcement learning
      a. Mastering the games of Go and chess
  9. Let’s take an example dataset...
    - Iris Dataset
    - by R.A. Fisher (1936)
    - 4 attributes: sepal length, sepal width, petal length, petal width
    - 3 labels: Iris Setosa, Iris Versicolour, Iris Virginica
  15. Nearest Neighbour
    - Finding the closest reference
    - What do we mean by “closest”?
    - Humans comprehend visualisations very well
    - Can computers do the same?
  16. At the lowest level, computers only understand 0 or 1

  17. Euclidean Distance

  18. Euclidean Distance
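
For reference, the Euclidean distance between two samples is usually written as below. The symbols p, q, and n are notation assumed here rather than taken from the slides; for the four Iris attributes, n = 4.

```latex
d(\mathbf{p}, \mathbf{q}) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```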

  19. Are you sure?

  20. k-Nearest Neighbours
    1. Find the k closest references
    2. Use majority vote
    3. We need to compute pairwise distances
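
The k-nearest-neighbours steps can be sketched in a few lines of plain Python. This is only a minimal illustration, not the speaker's implementation: the function names (`euclidean`, `knn_predict`) and the toy measurements are made up for the example.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(references, labels, query, k=3):
    """Label a query point by majority vote among its k closest references."""
    # Compute the distance from the query to every reference, closest first.
    distances = sorted(
        (euclidean(ref, query), label) for ref, label in zip(references, labels)
    )
    # Keep the labels of the k closest references and take the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical (petal length, petal width) values, invented for illustration.
references = [(1.4, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (6.0, 2.5)]
labels = ["setosa", "setosa", "versicolour", "versicolour", "virginica"]

prediction = knn_predict(references, labels, (1.6, 0.25), k=3)
print(prediction)
```

Every prediction computes the distance to all references, which is why the cost grows with the size of the reference set.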
  22. Conventional statistics cannot do that

  23. We need high computational power

  24. What if we only want to see the subgroups in the data?
  25. Clustering
    - Finding subgroups in the data
    - Your neighbours in the same housing complex regardless of their class
    - Unsupervised learning
  27. k-Means Clustering

  28. k-Means Clustering
    1. Uses Euclidean distance as well
    2. k = number of clusters
    3. Centroids to represent clusters
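
A rough sketch of k-means in plain Python follows. It is not the speaker's code: the function names and the toy points are hypothetical, and the starting centroids are passed in explicitly (rather than chosen at random) so that the example is deterministic.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign(points, centroids):
    """Group each point under the index of its closest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid updates until the centroids stop moving."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # New centroid = mean of the cluster members; an empty cluster keeps its old centroid.
        new_centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Two well-separated, made-up groups of 2-D points; k = 2 clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, clusters = k_means(points, initial_centroids=[(1.0, 1.0), (5.0, 5.0)])
print(centroids)
```

The result depends on the starting centroids: a poor choice can converge to a poor grouping, which is why practical implementations usually try several initialisations.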
  32. Deep Learning

  34. Digit Recognition MNIST Dataset

  35. Classifying objects from pictures [Krizhevsky, 2009]

  38. A neural network [Nielsen, 2016]

  39. Logistic Regression: $y = \sigma(w_0 + w_1 x_1)$
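
The logistic regression formula can be evaluated directly, as in the sketch below. The weight values are hypothetical and chosen only to make the arithmetic easy to follow; in practice they would be learned from data. The same computation can also be read as a single neuron with a sigmoid activation.

```python
import math


def sigmoid(z):
    """Logistic function: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x1, w0, w1):
    """Evaluate y = sigmoid(w0 + w1 * x1) for a single input feature x1."""
    return sigmoid(w0 + w1 * x1)


# Hypothetical weights, not learned from any data.
w0, w1 = -4.0, 2.0

# With these weights, x1 = 2.0 gives w0 + w1 * x1 = 0, the decision boundary.
boundary = predict_proba(2.0, w0, w1)
above = predict_proba(3.0, w0, w1)
below = predict_proba(1.0, w0, w1)
print(boundary, above, below)
```

Outputs above 0.5 are typically read as the positive class and outputs below 0.5 as the negative class.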
  40. Predicting traffic jams from CCTV pictures

  41. Mimicking famous paintings

  43. Other Machine Learning Algorithms

  44. Naive Bayes

  45. Decision trees

  46. Linear regression with polynomial basis functions

  47. “No free lunch”

  48. Thank you