Word embeddings under the hood - Applied Machine Learning Conference

Slides from the talk "Word embeddings under the hood: How neural networks learn from language" as presented on April 12, 2018 during the Applied Machine Learning Conference at the Tom Tom Founders Festival in Charlottesville, Virginia.

https://tomtomfest.com/machine-learning/

Patrick Harrison

April 12, 2018

Transcript

  1. Word Embeddings Under the Hood: How Neural Networks Learn from Language
  2. None
  3. Good news! Text data is everywhere.

  4. Bad news… there is way too much. We need computers to help!
  5. We started with the scallop dish as an appetizer, followed by the spaghetti with tomato sauce and duck and foie gras ravioli.
     How do we represent data like this?
  6. One-Hot Encoding

               1   2   3   …   V
     we        1   0   0   …   0
     started   0   1   0   …   0
     with      0   0   1   …   0
     …         …   …   …   …   …
     ravioli   0   0   0   …   1
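The one-hot table above can be sketched in a few lines of Python. This is a minimal illustration, assuming the vocabulary is just the distinct words of the example sentence in order of first appearance; the helper name `one_hot` is made up for this sketch.

```python
# Example sentence from the slides, lowercased and stripped of punctuation.
sentence = ("we started with the scallop dish as an appetizer followed by "
            "the spaghetti with tomato sauce and duck and foie gras ravioli")

# Vocabulary in order of first appearance; V is its size.
vocab = list(dict.fromkeys(sentence.split()))
index = {word: i for i, word in enumerate(vocab)}
V = len(vocab)

def one_hot(word):
    """Return a length-V list with a single 1 at the word's index."""
    vec = [0] * V
    vec[index[word]] = 1
    return vec

print(V)                      # number of distinct words in the sentence
print(one_hot("we")[:3])      # first three columns of the row for "we"
print(one_hot("ravioli")[-1]) # last column of the row for "ravioli"
```

Every pair of distinct words ends up exactly as far apart as any other pair, which is one way to see why this encoding carries no notion of similarity.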
  7. …but one-hot encoding leaves a lot to be desired. Are better word representations possible?
  8. [Scatter plot, x and y axes from -2 to 2, showing the words: beer, wine, cocktail, spoon, fork, knife, spaghetti, pasta, lasagna]
  9. [Same scatter plot, with the coordinates of each word listed]

                   x      y
     spaghetti    1.0    1.5
     pasta        1.2    1.3
     …            …      …
     fork         0.0   -0.7
     spoon       -0.5   -1.5
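To make the appeal of coordinates like these concrete, here is a small sketch that stores the four rows listed on the slide and compares words by straight-line distance. The function names are invented for the example, and Euclidean distance is an assumed choice; the transcript does not say which measure the talk uses.

```python
import math

# Coordinates copied from the slide's table (only these four rows are listed).
embedding = {
    "spaghetti": (1.0, 1.5),
    "pasta": (1.2, 1.3),
    "fork": (0.0, -0.7),
    "spoon": (-0.5, -1.5),
}

def distance(a, b):
    """Euclidean distance between two words' coordinates."""
    return math.dist(embedding[a], embedding[b])

def nearest(word):
    """Return the other word whose coordinates lie closest to `word`."""
    others = [w for w in embedding if w != word]
    return min(others, key=lambda w: distance(word, w))

print(nearest("spaghetti"))  # pasta
print(nearest("fork"))       # spoon
```

Unlike one-hot vectors, these coordinates place related words near each other, so "closest word" becomes a meaningful question.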
  10. Postulate #1: “You shall know a word by the company it keeps.” (J.R. Firth, 1957)
  11. Postulate #2: “Neural networks learn useful, new data representations.” (Rumelhart, Hinton & Williams, 1986, paraphrased)
  12. context clues + neural networks = ?

  13. Context clues as training data?

  14. None
  15. None
  16. spaghetti   followed    1
      spaghetti   by          1
      …           …           …
      spaghetti   sauce       1
  17. spaghetti   followed    1
      spaghetti   by          1
      …           …           …
      spaghetti   sauce       1
      spaghetti   we          0
      spaghetti   parking     0
      …           …           …
      spaghetti   sushi       0
  18. with        by          1
      with        the         1
      …           …           …
      with        and         1
      with        appetizer   0
      with        loud        0
      …           …           …
      with        up          0
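The tables above pair a focus word with a second word and a label: 1 for words that really appear nearby, 0 for words that do not. A rough sketch of how such rows could be generated follows. The window of three words on each side is an assumption inferred from the rows shown, the extra vocabulary words are the ones that appear in the slide tables, and uniform random sampling is a simplification chosen for this sketch.

```python
import random

sentence = ("we started with the scallop dish as an appetizer followed by "
            "the spaghetti with tomato sauce and duck and foie gras ravioli").split()

# Words beyond the sentence that appear as 0-labelled rows on the slides.
vocabulary = sorted(set(sentence) | {"parking", "sushi", "loud", "up"})

def positive_clues(tokens, window=3):
    """Yield (focus, context, 1) for each word within `window` positions of the focus."""
    for i, focus in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield (focus, tokens[j], 1)

def negative_clues(focus, vocab, true_contexts, k, rng):
    """Return k (focus, word, 0) rows, with words drawn uniformly from non-contexts."""
    candidates = sorted(set(vocab) - set(true_contexts) - {focus})
    return [(focus, word, 0) for word in rng.sample(candidates, k)]

spaghetti_contexts = [c for f, c, _ in positive_clues(sentence) if f == "spaghetti"]
spaghetti_negatives = negative_clues("spaghetti", vocabulary, spaghetti_contexts,
                                     k=3, rng=random.Random(0))
print(spaghetti_contexts)
print(spaghetti_negatives)
```

With this window, the positive rows for "spaghetti" run from "followed" to "sauce", matching the first and last rows on the slide.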
  19. Minimum Viable Introduction to Neural Networks

  20. None
  21. None
  22. None
  23. None
  24. None
  25. “Sigmoid” Activation Function (Weighted Input, Activation Value): $\sigma(z) = \frac{1}{1 + e^{-z}}$
  26. “Sigmoid” Activation Function (Weighted Input, Activation Value: 0.88): $\sigma(z) = \frac{1}{1 + e^{-z}}$
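As a quick check on the sigmoid formula, the sketch below evaluates it in plain Python. The branch on the sign of `z` is only there to avoid overflow for very negative inputs. The weighted input of 2 is an assumed value that happens to give about 0.88; the transcript does not record which input the slide used.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

print(round(sigmoid(0.0), 2))  # 0.5
print(round(sigmoid(2.0), 2))  # 0.88
```

Whatever the weighted input, the activation value stays between 0 and 1, which is what lets it be read as a probability later on.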
  27. None
  28. None
  29. None
  30. None
  31. None
  32. None
  33. None
  34. None
  35. None
  36. None
  37. A neural network for learning context clues?

  38. (spaghetti, tomato, 1)

  39. None
  40. None
  41. None
  42. None
  43. None
  44. (weight matrix for the hidden layer)

  45. (weight matrix for the hidden layer)

  46. (weight matrix for the output layer)

  47. (weight matrix for the output layer)

  48. None
  49. None
  50. Training our network on the first context clue

  51. Training our network on the first context clue (there will be lots of these)
  52. 1. Make a prediction

  53. 1. Make a prediction 2. Measure how wrong we are

  54. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
  55. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  56. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  57. None
  58. None
  59. None
  60. None
  61. None
  62. None
  63. None
  64. None
  65. None
  66. Weighted Input Activation Value

  67. Weighted Input Activation Value 0.51

  68. None
  69. “forward pass”
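The diagrams for these slides did not survive in the transcript, so the following is only a sketch of what a forward pass could look like, assuming the usual skip-gram layout: the hidden-layer weight matrix holds one row per focus word, the output-layer weight matrix holds one row per context word, and the prediction is the sigmoid of their dot product. The sizes, the word ids and the weight scale are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def init_matrix(rows, cols, rng, scale=0.01):
    """Small random starting weights (the scale is an arbitrary choice)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def forward(hidden, output, focus_id, context_id):
    """Predict the probability that (focus, context) is a true context clue."""
    h = hidden[focus_id]                  # one-hot input times hidden matrix = row lookup
    u = output[context_id]                # output-layer weights for the context word
    z = sum(a * b for a, b in zip(h, u))  # weighted input
    return sigmoid(z)                     # activation value

rng = random.Random(0)
vocab_size, dim = 19, 4                   # hypothetical sizes
hidden = init_matrix(vocab_size, dim, rng)
output = init_matrix(vocab_size, dim, rng)
prediction = forward(hidden, output, focus_id=11, context_id=12)
print(round(prediction, 2))  # 0.5
```

Before any training the weights are tiny, so the weighted input is close to zero and the prediction sits near 0.5, in the same neighborhood as the 0.51 shown on the slide.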

  70. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  71. None
  72. “Loss” Function (Model Prediction, Penalty), right answer: 1: $L(\hat{y}) = -\ln(\hat{y})$
  73. “Loss” Function (Model Prediction, Penalty), right answer: 0: $L(\hat{y}) = -\ln(1 - \hat{y})$
  74. “Loss” Function (Model Prediction, Penalty), right answer: 1: $L(\hat{y}) = -\ln(\hat{y})$
  75. “Loss” Function (Model Prediction: 0.51, Penalty: 0.67), right answer: 1: $L(\hat{y}) = -\ln(\hat{y})$
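The two loss formulas can be folded into one small function. This is a minimal sketch; the name `loss` and the `label` argument are chosen for the example.

```python
import math

def loss(y_hat, label):
    """Penalty for predicting y_hat when the right answer is `label` (1 or 0)."""
    if label == 1:
        return -math.log(y_hat)
    return -math.log(1.0 - y_hat)

print(round(loss(0.51, 1), 2))  # 0.67
print(loss(0.99, 1) < loss(0.51, 1) < loss(0.01, 1))  # True
```

The penalty shrinks toward zero as the prediction approaches the right answer and grows without bound as it approaches the wrong one.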
  76. 1. Make a prediction
      2. Measure how wrong we are
      3. Tune the model to become slightly less wrong
      4. Repeat with the next context clue
  77. “Loss” Function (Model Prediction: 0.51, Penalty: 0.67), right answer: 1: $L(\hat{y}) = -\ln(\hat{y})$
  78. “Loss” Function (Model Prediction: 0.51, Penalty: 0.67), right answer: 1: $L(\hat{y}) = -\ln(\hat{y})$, $\frac{\partial L(\hat{y})}{\partial \hat{y}} = -\frac{1}{\hat{y}}$
  79. Move in the opposite direction from the gradient
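A tiny numeric illustration of moving against the gradient, using the slide's prediction of 0.51 with right answer 1. The step size is a made-up value. In a real network the step is applied to the weights rather than to the prediction directly, so this only shows the direction of the nudge.

```python
import math

def loss(y_hat):
    """Penalty when the right answer is 1."""
    return -math.log(y_hat)

def loss_gradient(y_hat):
    """Derivative of -ln(y_hat) with respect to y_hat."""
    return -1.0 / y_hat

y_hat = 0.51
step_size = 0.1  # hypothetical value, chosen only for illustration

# The gradient is negative here, so stepping against it raises the prediction.
nudged = y_hat - step_size * loss_gradient(y_hat)

print(loss_gradient(y_hat) < 0)    # True
print(nudged > y_hat)              # True
print(loss(nudged) < loss(y_hat))  # True
```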

  80. None
  81. None
  82. weights → prediction → penalty
  83. weights → prediction → penalty
      better weights → better prediction → lower penalty
  84. 1. How do we know which direction to nudge each weight?
  85. 1. How do we know which direction to nudge each weight?
      2. How can we calculate this automatically for all the weights?
  86. None
  87. None
  88. None
  89. None
  90. None
  91. None
  92. None
  93. None
  94. None
  95. None
  96. None
  97. None
  98. None
  99. None
  100. None
  101. None
  102. None
  103. “back propagation”
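The slides that walk through back propagation are image-only, so the derivation below is a reconstruction under assumptions rather than the talk's own working. It assumes the prediction is $\hat{y} = \sigma(z)$ with weighted input $z = \mathbf{h} \cdot \mathbf{u}$, where $\mathbf{h}$ (hidden-layer values for the focus word) and $\mathbf{u}$ (output-layer weights for the context word) are symbols introduced here, and it takes the case where the right answer is 1. The chain rule passes the penalty's slope back through each stage:

```latex
\begin{align*}
z &= \mathbf{h} \cdot \mathbf{u}, \qquad \hat{y} = \sigma(z), \qquad
L(\hat{y}) = -\ln(\hat{y}) \\
\frac{\partial L}{\partial \hat{y}} &= -\frac{1}{\hat{y}}, \qquad
\frac{\partial \hat{y}}{\partial z} = \hat{y}\,(1 - \hat{y}) \\
\frac{\partial L}{\partial z} &= \frac{\partial L}{\partial \hat{y}} \cdot
\frac{\partial \hat{y}}{\partial z} = \hat{y} - 1 \\
\frac{\partial L}{\partial \mathbf{u}} &= (\hat{y} - 1)\,\mathbf{h}, \qquad
\frac{\partial L}{\partial \mathbf{h}} = (\hat{y} - 1)\,\mathbf{u}
\end{align*}
```

When the right answer is 0, the same steps with $L(\hat{y}) = -\ln(1 - \hat{y})$ give $\partial L / \partial z = \hat{y}$, so both cases can be written as prediction minus right answer.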

  104. None
  105. None
  106. None
  107. None
  108. None
  109. Did it work?

  110. None
  111. None
  112. “stochastic gradient descent”

  113. Context clues trained: 1

  114. 1. Make a prediction
       2. Measure how wrong we are
       3. Tune the model to become slightly less wrong
       4. Repeat with the next context clue
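The four steps can be put together into one small training loop. This is a sketch under assumptions, not the talk's implementation: it assumes a skip-gram style model with one weight row per focus word and one per context word, uses "prediction minus right answer" as the slope of the penalty with respect to the weighted input, and picks the dimension, learning rate and epoch count arbitrarily. Words are represented by integer ids.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(hidden, output, focus, context):
    """Step 1: make a prediction for one (focus, context) pair."""
    return sigmoid(sum(a * b for a, b in zip(hidden[focus], output[context])))

def train(clues, vocab_size, dim=8, lr=0.1, epochs=200, seed=0):
    """Stochastic gradient descent over (focus_id, context_id, label) triples."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
    output = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
    clues = list(clues)  # work on a copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(clues)
        for focus, context, label in clues:      # step 4: repeat with the next clue
            h, u = hidden[focus], output[context]
            y_hat = predict(hidden, output, focus, context)
            error = y_hat - label                 # step 2: slope of the penalty w.r.t. z
            for k in range(dim):                  # step 3: nudge against the gradient
                h_k = h[k]
                h[k] -= lr * error * u[k]
                u[k] -= lr * error * h_k
    return hidden, output

# Toy data: word 0 appears with word 1 (label 1) but not with word 2 (label 0).
toy_clues = [(0, 1, 1), (0, 2, 0)]
hidden, output = train(toy_clues, vocab_size=3)
print(predict(hidden, output, 0, 1) > 0.5)  # True
print(predict(hidden, output, 0, 2) < 0.5)  # True
```

Each update looks at a single context clue, which is the "stochastic" part; the rows of `hidden` are what would be kept as word vectors once training is done.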
  115. None
  116. None
  117. None
  118. None
  119. Learning to predict Learning to represent

  120. None
  121. we can measure distances!

  122. Context clues trained: 1
       “amazing” as focus word: 0
           topics discerning masked sweets carmelized shelly cue prepare
       “server” as focus word: 0
           cheerful succulent adjusting pop antenna suggesting vinegary brothers
       “spaghetti” as focus word: 1
           ignorant sop refrigerators bags recliner introduce covered petco
  123. Fast forward…

  124. Context clues trained: 2,000,000
       “amazing” as focus word: 1,854
           awesome delicious super here ) $ customer excellent
       “server” as focus word: 780
           thru along crab tacos / windows chef 1
       “spaghetti” as focus word: 84
           dollar rings loves = opened wrapped form provided
  125. Fast forward…

  126. Context clues trained: 100,000,000
       “amazing” as focus word: 87,864
           incredible awesome outstanding excellent phenomenal fabulous superb fantastic
       “server” as focus word: 48,492
           waiter waitress bartender hostess guide technician cashier barista
       “spaghetti” as focus word: 3,600
           risotto veal katsu goat turkey enchilada raspberry meatloaf
  127. None
  128. None
  129. bun - american + mexican ≈ tortilla
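Word arithmetic of this kind can be sketched with ordinary vector addition and a nearest-neighbor lookup. The three-dimensional vectors below are hand-made for illustration and are not learned embeddings, the extra words "burger" and "taco" are made-up distractors, and cosine similarity is an assumed choice of measure.

```python
import math

# Hand-made toy vectors; the axes loosely mean (american, mexican, bread).
vectors = {
    "bun":      (1.0, 0.0, 1.0),
    "american": (1.0, 0.0, 0.0),
    "mexican":  (0.0, 1.0, 0.0),
    "tortilla": (0.0, 1.0, 1.0),
    "burger":   (1.0, 0.0, 0.3),
    "taco":     (0.0, 1.0, 0.2),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(start, minus, plus):
    """Return the word closest to start - minus + plus, excluding the inputs."""
    query = tuple(s - m + p for s, m, p in
                  zip(vectors[start], vectors[minus], vectors[plus]))
    candidates = [w for w in vectors if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

print(analogy("bun", "american", "mexican"))  # tortilla
```

With learned embeddings the match is approximate rather than exact, which is why the slide uses ≈ instead of =.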
  130. “word2vec skip-gram negative sampling”

  131. No magical black box AI… Just context clues and some arithmetic! Bonus: now you know the fundamentals of all neural network learning
  132. careers.spglobal.com