Bayes is BAE

Before programming, before formal probability, there was Bayes. He introduced the notion that multiple related, uncertain estimates can be combined to form a more certain estimate. It turns out that this extremely simple idea has a profound impact on how we write programs and how we can think about life. The applications range from machine learning and robotics to determining cancer treatments. In this talk we'll take an in-depth look at Bayes' rule and how it can be applied to solve problems in programming and beyond.

Richard Schneeman

May 08, 2017

Transcript

  1. WELCOME

  2. Bayes
    is BAE

  4. Introducing
    our
    Protagonist

  11. Divine Benevolence, or an
    Attempt to Prove That the
    Principal End of the Divine
    Providence and
    Government is the
    Happiness of His Creatures

  12. &

  13. An Introduction to the
    Doctrine of Fluxions,
    and a Defence of the
    Mathematicians Against
    the Objections of the
    Author of The Analyst

  14. Harry
    Potter & the
    Sorcerer’s Stone

  15. Why do we care?

  16. 1720

  18. 1720s

  19. 1720

  24. No

  28. Machine
    learning

  29. Artificial
    Intelligence

  30. They
    Call me
    @Schneems

  31. Maintain
    Sprockets

  32. Georgia
    Tech
    Online
    Masters

  33. Georgia
    Tech
    Online
    Masters

  36. Automatic
    Certificate
    Management

  37. SSL

  38. Heroku
    CI

  39. Review
    Apps

  40. Self Promotion

  48. But wait
    Schneems,
    what can we
    do?

  49. Call your state
    representatives

  50. But wait
    Schneems,
    what can we
    do more?

  51. degerrymander
    texas
    .org

  52. Un-Patriotic
    Un-Texan

  53. Back to Bayes

  54. Artificial
    Intelligence

  62. Low
    Information
    state

  64. Predict

  65. Measure

  66. Measure +
    Predict

  67. Convolution

  68. Kalman
    Filter

  72. Do you like
    money?

  77. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  78. Probability
    P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  79. Probability of $3.7 mil given Heads
    P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  80. Probability of $3.7 mil given Heads
    P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  81. probability of heads
    P(A ∣ B) = P(B ∣ A) P(A) / P(B)
    P(B) =

  82. probability of heads
    P(B) =
    (coins: H T and H H)

  83. probability of heads
    (coins: H T and H H)
    P(B) = 0.5 * 0.5 + 0.5 * 1
    P(B) = 0.75

  84. P(A ∣ B) = P(B ∣ A) P(A) / P(B)
    P(B) = 0.75

  85. probability of $3.7 million
    P(A ∣ B) = P(B ∣ A) P(A) / 0.75
    P(A) =

  86. probability of $3.7 million
    ($$$ or Nope)
    P(A) =

  87. probability of $3.7 million
    ($$$ or Nope)
    P(A) = 0.5

  88. P(A ∣ B) = P(B ∣ A) * 0.50 / 0.75
    P(A) = 0.50

  89. probability of heads given $3.7
    P(A ∣ B) = P(B ∣ A) * 0.50 / 0.75
    P(B ∣ A) =

  90. probability of heads given $3.7
    (coin: H T)
    P(B ∣ A) =

  91. P(B ∣ A) = 0.5
    P(A ∣ B) = 0.5 * 0.5 / 0.75

  92. $3.7 mil given Heads
    P(A ∣ B) = 0.5 * 0.5 / 0.75
    P(A ∣ B) = 1/3 = 0.3333

  93. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  94. P(A ∣ B) = P(B ∣ A) P(A) / P(B)
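The arithmetic on slides 77–92 can be checked in a few lines of Python. This is a sketch of the talk's two-coin setup: a fair coin worth $3.7 mil and a double-headed coin worth $0, each equally likely to be the one in hand; the variable names are my own.

```python
# Two-coin example from the slides: a fair coin (the $3.7 mil coin)
# and a double-headed coin (the $0 coin), each picked with probability 0.5.
p_money = 0.5              # P(A): prior probability we hold the $3.7 mil coin
p_heads_given_money = 0.5  # P(B | A): the fair coin shows heads half the time
p_heads_given_zero = 1.0   # the double-headed coin always shows heads

# Total probability of heads: P(B) = sum over coins of P(B | coin) P(coin)
p_heads = p_heads_given_money * p_money + p_heads_given_zero * (1 - p_money)

# Bayes' rule: P(A | B) = P(B | A) P(A) / P(B)
p_money_given_heads = p_heads_given_money * p_money / p_heads

print(p_heads)              # 0.75
print(p_money_given_heads)  # 0.3333...
```

Seeing one heads drops the probability of holding the money coin from 0.5 to 1/3, matching slide 92.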

  95. YouTube Channel:
    Art of the Problem

  96. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  97. I lied about Bayes
    Rule

  98. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  99. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / Σj P(B ∣ Aj) P(Aj)

  100. P(A ∣ B) = P(B ∣ A) P(A) / P(B)
    P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / Σj P(B ∣ Aj) P(Aj)

  101. Total Probability

  102. $3.7 mil / $0

  103. $3.7 mil / $0 / Heads

  104. $3.7 mil / $0 / Tails, Heads

  105. $3.7 mil / $0 / Heads, Tails

  106. $3.7 mil / $0 / Heads, Tails

  107. $3.7 mil / $0 / Heads, Tails

  108. P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0)
    ($3.7 mil / $0 / Heads, Tails)

  109. P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0)
    ($3.7 mil / $0 / Heads, Tails)

  110. Total Probability
    P(B) = Σj P(B ∣ Aj) P(Aj)

  111. probability of heads
    (coins: H T and H H)
    P(B) = 0.5 * 0.5 + 0.5 * 1
    P(B) = 0.75

  112. Total Probability
    P(B) = Σj P(B ∣ Aj) P(Aj)
    P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0)

  113. Let’s make it
    tougher

  114. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / Σj P(B ∣ Aj) P(Aj)

  115. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / Σj P(HH ∣ Coinj) P(Coinj)

  116. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / Σj P(HH ∣ Coinj) P(Coinj)
    P(HH ∣ Coini) = 0.5 * 0.5

  117. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / Σj P(HH ∣ Coinj) P(Coinj)
    P(Coini) = 0.5

  118. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / Σj P(HH ∣ Coinj) P(Coinj)
    Σj P(B ∣ Aj) P(Aj) = P(HH ∣ $$$) P($$$) + P(HH ∣ $0) P($0)
    Σj P(B ∣ Aj) P(Aj) = 0.25(0.5) + 1.0(0.5)

  119. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / Σj P(HH ∣ Coinj) P(Coinj)
    Σj P(B ∣ Aj) P(Aj) = P(HH ∣ $$$) P($$$) + P(HH ∣ $0) P($0)
    Σj P(B ∣ Aj) P(Aj) = 0.25(0.5) + 1.0(0.5)

  120. P(Coin$$$ ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2

  121. P(Coin$$$ ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2

  122. P(Coini ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2
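The tougher example on slides 113–122 also checks out numerically. A sketch, with the same two coins as before but now observing two heads in a row (variable names are my own):

```python
# Tougher example: we observe two heads (HH). Which coin do we hold?
p_money = 0.5                  # P(Coin$$$): prior for the fair ($3.7 mil) coin
p_hh_given_money = 0.5 * 0.5   # fair coin: heads twice = 0.25
p_hh_given_zero = 1.0 * 1.0    # double-headed coin: heads twice = 1.0

# Total probability of HH over both coins (the denominator)
p_hh = p_hh_given_money * p_money + p_hh_given_zero * (1 - p_money)

# Posterior probability that we hold the $3.7 mil coin
p_money_given_hh = p_hh_given_money * p_money / p_hh

print(p_hh)             # 0.625
print(p_money_given_hh) # 0.2
```

Each additional heads shrinks the odds of holding the fair coin: from 1/2 before flipping, to 1/3 after one heads, to 1/5 after two.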

  123. Who is ready for a
    break?

  124. Let’s take a break
    from math

  125. With more math

  126. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  127. P(A ∣ B) = P(B ∣ A) P(A) / P(B)
    P(Ai ∣ B) = [P(B ∣ Ai) / P(B)] P(Ai)

  128. P(Ai ∣ B) = [P(B ∣ Ai) / P(B)] P(Ai)
    Prior

  129. P(Ai ∣ B) = [P(B ∣ Ai) / P(B)] P(Ai)
    Posterior
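The prior/posterior reading on slides 128–129 is what makes the update recursive: each flip's posterior becomes the next flip's prior. A sketch reusing the two-coin setup, assuming we flip the same mystery coin repeatedly and see heads every time (names are my own):

```python
# Recursive Bayes update: after each observed heads, the posterior
# probability of holding the double-headed ($0) coin becomes the new prior.
p_zero = 0.5  # prior P(Coin$0) before any flips

for flip in range(3):
    likelihood_zero = 1.0   # P(heads | double-headed coin)
    likelihood_money = 0.5  # P(heads | fair coin)
    # Total probability of heads under the current prior
    p_heads = likelihood_zero * p_zero + likelihood_money * (1 - p_zero)
    # Bayes' rule; this posterior is the prior for the next flip
    p_zero = likelihood_zero * p_zero / p_heads
    print(flip + 1, round(p_zero, 4))
```

After three straight heads the belief that the coin is double-headed climbs 0.5 → 2/3 → 0.8 → 8/9, without ever reprocessing earlier flips.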

  130. The Kalman filter
    is a recursive
    Bayes estimation

  131. Prediction/
    Prior

  132. Measure/
    Posterior

  134. Simon D. Levy

  135. altitude(current time) = 0.75 × altitude(previous time)

  136. altitude(current time) = 0.75 × altitude(previous time)

  137. a = rate_of_descent = 0.75
    x = initial_position = 1000
    r = measure_error = x * 0.20

  138. x_guess = measure_array[0]
    p = estimate_error = 1
    x_guess_array = []

  139. for k in range(10):
        measure = measure_array[k]

  140. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess

  141. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess
        p = a * p * a

  142. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess
        p = a * p * a
        # Update
        gain = p / (p + r)
        x_guess = x_guess + gain * (measure - x_guess)

  143. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess
        p = a * p * a
        # Update
        gain = p / (p + r)
        x_guess = x_guess + gain * (measure - x_guess)
    Low predict error, low gain

  144. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess
        p = a * p * a
        # Update
        gain = p / (p + r)
        x_guess = x_guess + 0 * (measure - x_guess)
    Low predict error, low gain

  146. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess
        p = a * p * a
        # Update
        gain = p / (p + r)
        x_guess = x_guess + 1 * (measure - x_guess)
    High predict error, high gain

  148. Prediction less
    certain

  149. Prediction more
    certain

  150. for k in range(10):
        measure = measure_array[k]
        # Predict
        x_guess = a * x_guess
        p = a * p * a
        # Update
        gain = p / (p + r)
        x_guess = x_guess + gain * (measure - x_guess)
        p = (1 - gain) * p
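Assembled into one runnable piece, the loop built up on slides 137–150 looks roughly like this. It is a sketch based on the slide code: the simulated measurements of the descending object are my own invention (the slides read theirs from a prepared `measure_array`), and I seed the noise so the run is repeatable.

```python
import random

random.seed(42)

a = 0.75            # rate of descent: altitude decays to 75% each step
x_true = 1000.0     # true initial altitude
r = 1000.0 * 0.20   # measurement error (20% of initial altitude)

# Simulate noisy altitude measurements of the descending object
measure_array = []
for k in range(10):
    x_true = a * x_true
    measure_array.append(x_true + random.gauss(0, r))

x_guess = measure_array[0]  # initial state estimate
p = 1.0                     # estimate error (uncertainty)
x_guess_array = []

for k in range(10):
    measure = measure_array[k]
    # Predict: push the previous estimate through the motion model,
    # which also grows/shrinks the estimate uncertainty
    x_guess = a * x_guess
    p = a * p * a
    # Update: blend prediction and measurement, weighted by the gain;
    # gain near 0 trusts the prediction, gain near 1 trusts the sensor
    gain = p / (p + r)
    x_guess = x_guess + gain * (measure - x_guess)
    p = (1 - gain) * p
    x_guess_array.append(x_guess)

print(x_guess_array)
```

With `p` starting small relative to `r`, the gain stays low and the filter leans on its prediction, smoothing out the noisy measurements.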

  158. That’s it for
    Kalman Filters

  159. Bayes Rule

  160. Two most
    important
    parts

  169. Algorithms
    to
    Live
    By

  170. The Signal
    and
    the
    Noise

  171. Audio: Mozart
    Requiem in D
    minor
    https://www.youtube.com/watch?v=sPlhKP0nZII

  172. http://bit.ly/kalman-tutorial

  173. http://bit.ly/kalman-notebook

  174. Udacity &
    Georgia Tech

  175. BAE


  179. Questions?

  181. Test Audio

  182. Test Audio 2

  183. Simon D. Levy

  194. What is g?

  196. Prediction

  197. Measurement

  198. Convolution

  199. Prediction less
    certain

  200. Prediction more
    certain

  201. Prediction error
    is not constant

  202. What is g?

  205. What is g?

  207. Introducing r

  212. Prediction +
    Measurement

  213. i.e.
    Prediction +
    Update

  214. Prediction
    Update

  215. Prediction
    Update

  216. Prediction

  217. Prediction

  218. Prediction
    Update

  219. $3.7 mil / $0

  220. $3.7 mil / $0