Bayes is BAE

Before programming, before formal probability, there was Bayes. He introduced the notion that multiple related but uncertain estimates could be combined into a more certain estimate. It turns out that this extremely simple idea has a profound impact on how we write programs and how we think about life. The applications range from machine learning and robotics to determining cancer treatments. In this talk we'll take an in-depth look at Bayes' rule and how it can be applied to solve problems in programming and beyond.

Richard Schneeman

May 08, 2017

Transcript

  1. WELCOME

  2. Bayes is BAE

  3. None
  4. Introducing our Protagonist

  5. None
  6. None
  7. None
  8. None
  9. None
  10. None
  11. Divine Benevolence, or an Attempt to Prove That the Principal

    End of the Divine Providence and Government is the Happiness of His Creatures
  12. &

  13. An Introduction to the Doctrine of Fluxions, and a Defence

    of the Mathematicians Against the Objections of the Author of The Analyst
  14. Harry Potter & the Sorcerer’s Stone

  15. Why do we care?

  16. 1720

  17. None
  18. 1720s

  19. 1720

  20. None
  21. None
  22. None
  23. None
  24. No

  25. None
  26. None
  27. None
  28. Machine learning

  29. Artificial Intelligence

  30. They Call me @Schneems

  31. Maintain Sprockets

  32. Georgia Tech Online Masters

  33. Georgia Tech Online Masters

  34. None
  35. None
  36. Automatic Certificate Management

  37. SSL

  38. Heroku CI

  39. Review Apps

  40. Self Promotion

  41. None
  42. None
  43. None
  44. None
  45. None
  46. None
  47. None
  48. "But wait Schneems, what can we do?"

  49. Call your state representatives

  50. "But wait Schneems, what can we do more?"

  51. degerrymandertexas.org

  52. Un-Patriotic Un-Texan

  53. Back to Bayes

  54. Artificial Intelligence

  55. None
  56. None
  57. None
  58. None
  59. None
  60. None
  61. None
  62. Low Information state

  63. None
  64. Predict

  65. Measure

  66. Measure + Predict

  67. Convolution

  68. Kalman Filter

  69. None
  70. None
  71. None
  72. Do you like money?

  73. None
  74. None
  75. None
  76. None
  77. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  78. Probability: P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  79. Probability of $3.7 mil given Heads: P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  80. Probability of $3.7 mil given Heads: P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  81. P(A ∣ B) = P(B ∣ A) P(A) / P(B); probability of heads: P(B) = ?

  82. probability of heads: P(B) = ? (coin faces: H H / H T)

  83. probability of heads: P(B) = 0.5 * 0.5 + 0.5 * 1 = 0.75 (coin faces: H H / H T)

  84. P(B) = 0.75; P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  85. P(A ∣ B) = P(B ∣ A) P(A) / P(B), P(B) = 0.75; probability of $3.7 million: P(A) = ?

  86. probability of $3.7 million: P(A) = ? ($$$ / Nope)

  87. probability of $3.7 million: P(A) = 0.5 ($$$ / Nope)

  88. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(A) = 0.50, P(B) = 0.75

  89. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(A) = 0.50, P(B) = 0.75; probability of heads given $3.7: P(B ∣ A) = ?

  90. probability of heads given $3.7: P(B ∣ A) = ? (H / T)

  91. P(B ∣ A) = 0.5; P(A ∣ B) = P(B ∣ A) P(A) / P(B) = 0.5 * 0.5 / 0.75

  92. $3.7 mil given Heads: P(A ∣ B) = P(B ∣ A) P(A) / P(B) = 0.5 * 0.5 / 0.75 = 1/3 ≈ 0.3333
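
    The arithmetic on slides 79-92 is easy to check in a few lines of Python. This sketch is not from the talk; the variable names are mine, and the setup is the one on the slides: a fair coin worth $3.7 mil, a double-headed coin worth $0, one picked at random, and a single heads observed.

        # Sketch: Bayes' rule for the coin puzzle (slides 79-92)
        p_money = 0.5                      # P(A): prior on holding the $3.7 mil coin
        p_heads_given_money = 0.5          # P(B | A): fair coin shows heads half the time
        p_heads = 0.5 * 0.5 + 0.5 * 1.0    # P(B): heads overall, fair or double-headed

        p_money_given_heads = p_heads_given_money * p_money / p_heads
        print(p_money_given_heads)         # => 0.333..., i.e. 1/3
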
  93. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  94. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  95. YouTube Channel: Art of the Problem

  96. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  97. I lied about Bayes Rule

  98. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  99. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / ∑j P(B ∣ Aj) P(Aj)

  100. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / ∑j P(B ∣ Aj) P(Aj)
  101. Total Probability

  102. $3.7 mil $0

  103. $3.7 mil $0 Heads

  104. $3.7 mil $0 Tails Heads

  105. $3.7 mil $0 Heads Tails

  106. $3.7 mil $0 Heads Tails

  107. $3.7 mil $0 Heads Tails

  108. P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0) ($3.7 mil / $0, Heads / Tails)

  109. P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0) ($3.7 mil / $0, Heads / Tails)

  110. P(B) = ∑j P(B ∣ Aj) P(Aj) (Total Probability)

  111. probability of heads: P(B) = 0.5 * 0.5 + 0.5 * 1 = 0.75 (coin faces: H H / H T)

  112. P(B) = ∑j P(B ∣ Aj) P(Aj); P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0) (Total Probability)
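
    Slides 101-112 expand the denominator P(B) with the law of total probability: sum, over every hypothesis, the chance of the evidence under that hypothesis weighted by its prior. A sketch of mine using the slides' numbers:

        # Sketch: total probability, P(B) = sum_j P(B | Aj) P(Aj)
        coins = {
            "$$$": {"prior": 0.5, "p_heads": 0.5},  # fair coin, worth $3.7 mil
            "$0":  {"prior": 0.5, "p_heads": 1.0},  # double-headed coin, worth $0
        }
        p_heads = sum(c["p_heads"] * c["prior"] for c in coins.values())
        print(p_heads)  # => 0.75, matching slide 111
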
  113. Let’s make it tougher

  114. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / ∑j P(B ∣ Aj) P(Aj)

  115. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)

  116. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); P(HH ∣ Coini) = 0.5 * 0.5

  117. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); P(Coini) = 0.5

  118. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); ∑j P(B ∣ Aj) P(Aj) = P(HH ∣ $$$) P($$$) + P(HH ∣ $0) P($0) = 0.25(0.5) + 1.0(0.5)

  119. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); ∑j P(B ∣ Aj) P(Aj) = P(HH ∣ $$$) P($$$) + P(HH ∣ $0) P($0) = 0.25(0.5) + 1.0(0.5)

  120. P(Coin$$$ ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2; P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)

  121. P(Coin$$$ ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2; P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)

  122. P(Coini ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2; P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)
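
    The tougher problem on slides 113-122 conditions on two heads in a row. Another sketch of mine, assuming independent flips as the slides do; note how the extra evidence pushes the posterior for the fair coin down from 1/3 to 1/5:

        # Sketch: P(Coin_$$$ | HH) with a total-probability denominator
        p_hh_given_money = 0.5 * 0.5   # fair coin: two independent heads
        p_hh_given_zero = 1.0 * 1.0    # double-headed coin always lands heads
        prior = 0.5                    # each coin equally likely up front

        p_hh = p_hh_given_money * prior + p_hh_given_zero * prior  # = 0.625
        posterior = p_hh_given_money * prior / p_hh
        print(posterior)               # => 0.2, i.e. 1/5
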
  123. Who is ready for a break?

  124. Let's take a break from math

  125. With more math

  126. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  127. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / P(B)

  128. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / P(B) (Prior)

  129. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / P(B) (Posterior)
  130. The Kalman filter is a recursive Bayes estimator

  131. Prediction/ Prior

  132. Measure/ Posterior

  133. None
  134. Simon D. Levy

  135. altitude(current time) = 0.75 * altitude(previous time)

  136. altitude(current time) = 0.75 * altitude(previous time)

  137. a = rate_of_decent = 0.75
       x = initial_position = 1000
       r = measure_error = x * 0.20

  138. x_guess = measure_array[0]
       p = estimate_error = 1
       x_guess_array = []
  139. for k in range(10):
           measure = measure_array[k]

  140. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess

  141. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a

  142. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + gain * (measure - x_guess)

  143. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + gain * (measure - x_guess)
       Low Predict Error, low gain

  144. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 0 * (measure - x_guess)
       Low Predict Error, low gain

  145. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 0 * (measure - x_guess)
       Low Predict Error, low gain

  146. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 1 * (measure - x_guess)
       High Predict Error, High gain

  147. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 1 * (measure - x_guess)
       High Predict Error, High gain
  148. Prediction less certain

  149. Prediction more certain

  150. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + gain * (measure - x_guess)
           p = (1 - gain) * p
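
    Assembled from slides 137-150, here is a self-contained, runnable version of the falling-object filter. The slides never show where measure_array comes from, so the simulated noisy measurements below are my own assumption; the loop body follows the slides, including the final p = (1 - gain) * p variance update.

        import random

        a = rate_of_decent = 0.75      # altitude decays to 75% each step (slide 135)
        x = initial_position = 1000
        r = measure_error = x * 0.20

        # Assumption: the talk starts from a given measure_array; here we simulate one
        # by adding Gaussian noise to the true decaying altitude.
        truth = [x * a ** k for k in range(10)]
        measure_array = [t + random.gauss(0, r) for t in truth]

        x_guess = measure_array[0]
        p = estimate_error = 1
        x_guess_array = []

        for k in range(10):
            measure = measure_array[k]
            # Predict: project the estimate and its error forward one step
            x_guess = a * x_guess
            p = a * p * a
            # Update: blend in the measurement, weighted by the gain
            gain = p / (p + r)         # high prediction error => high gain
            x_guess = x_guess + gain * (measure - x_guess)
            p = (1 - gain) * p
            x_guess_array.append(x_guess)

        print(x_guess_array)
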
  151. None
  152. None
  153. None
  154. None
  155. None
  156. None
  157. None
  158. That’s it for Kalman Filters

  159. Bayes Rule

  160. Two most important parts

  161. None
  162. None
  163. None
  164. None
  165. None
  166. None
  167. None
  168. None
  169. Algorithms to Live By

  170. The Signal and the Noise

  171. Audio: Mozart Requiem in D minor https://www.youtube.com/watch?v=sPlhKP0nZII

  172. http://bit.ly/kalman-tutorial

  173. http://bit.ly/kalman-notebook

  174. Udacity & Georgia Tech

  175. BAE

  176. BAE

  177. BAE

  178. BAE

  179. Questions?

  180. Questions?

  181. Test Audio

  182. Test Audio 2

  183. Simon D. Levy

  184. None
  185. None
  186. None
  187. None
  188. None
  189. None
  190. None
  191. None
  192. None
  193. None
  194. What is g?

  195. None
  196. Prediction

  197. Measurement

  198. Convolution

  199. Prediction less certain

  200. Prediction more certain

  201. Prediction error is not constant

  202. What is g?

  203. None
  204. None
  205. What is g?

  206. None
  207. Introducing r

  208. None
  209. None
  210. None
  211. None
  212. Prediction + Measurement

  213. i.e. Prediction + Update

  214. Prediction Update

  215. Prediction Update ✅

  216. Prediction

  217. Prediction

  218. Prediction Update ✅ ✅

  219. $3.7 mil $0

  220. $3.7 mil $0