Bayes is BAE

Before programming, before formal probability, there was Bayes. He introduced the notion that multiple related but uncertain estimates could be combined into a more certain estimate. It turns out that this extremely simple idea has a profound impact on how we write programs and how we think about life. The applications range from machine learning and robotics to determining cancer treatments. In this talk we'll take an in-depth look at Bayes' rule and how it can be applied to solve problems in programming and beyond.

Richard Schneeman

May 08, 2017

Transcript

  1. WELCOME

  2. Bayes is BAE

  3. None
  4. Introducing our Protagonist

  5. None
  6. None
  7. None
  8. None
  9. None
  10. None
  11. Divine Benevolence, or an Attempt to Prove That the Principal

    End of the Divine Providence and Government is the Happiness of His Creatures
  12. &

  13. An Introduction to the Doctrine of Fluxions, and a Defence

    of the Mathematicians Against the Objections of the Author of The Analyst
  14. Harry Potter & the Sorcerer’s Stone

  15. Why do we care?

  16. 1720

  17. None
  18. 1720s

  19. 1720

  20. None
  21. None
  22. None
  23. None
  24. No

  25. None
  26. None
  27. None
  28. Machine learning

  29. Artificial Intelligence

  30. They Call me @Schneems

  31. Maintain Sprockets

  32. Georgia Tech Online Masters

  33. Georgia Tech Online Masters

  34. None
  35. None
  36. Automatic Certificate Management

  37. SSL

  38. Heroku CI

  39. Review Apps

  40. Self Promotion

  41. None
  42. None
  43. None
  44. None
  45. None
  46. None
  47. None
  48. "But wait Schneems, what can we do?"

  49. Call your state representatives

  50. "But wait Schneems, what can we do more?"

  51. degerrymandertexas.org

  52. Un-Patriotic Un-Texan

  53. Back to Bayes

  54. Artificial Intelligence

  55. None
  56. None
  57. None
  58. None
  59. None
  60. None
  61. None
  62. Low Information state

  63. None
  64. Predict

  65. Measure

  66. Measure + Predict

  67. Convolution

  68. Kalman Filter

  69. None
  70. None
  71. None
  72. Do you like money?

  73. None
  74. None
  75. None
  76. None
  77. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  78. Probability: P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  79. Probability of $3.7 mil given Heads: P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  80. Probability of $3.7 mil given Heads: P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  81. P(A ∣ B) = P(B ∣ A) P(A) / P(B); probability of heads: P(B) = ?

  82. probability of heads: P(B) = ? (coin faces: H H / H T)

  83. probability of heads: P(B) = 0.5 * 0.5 + 0.5 * 1 = 0.75 (coin faces: H H / H T)

  84. P(B) = 0.75; P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  85. P(A ∣ B) = P(B ∣ A) P(A) / P(B), P(B) = 0.75; probability of $3.7 million: P(A) = ?

  86. probability of $3.7 million: P(A) = ? ($$$ / Nope)

  87. probability of $3.7 million: P(A) = 0.5 ($$$ / Nope)

  88. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(A) = 0.50, P(B) = 0.75

  89. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(A) = 0.50, P(B) = 0.75; probability of heads given $3.7: P(B ∣ A) = ?

  90. probability of heads given $3.7: P(B ∣ A) = ? (H / T)

  91. P(B ∣ A) = 0.5; P(A ∣ B) = P(B ∣ A) P(A) / P(B) = 0.5 * 0.5 / 0.75

  92. $3.7 mil given Heads: P(A ∣ B) = P(B ∣ A) P(A) / P(B) = 0.5 * 0.5 / 0.75 = 1/3 ≈ 0.3333
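
    The arithmetic on slides 79-92 is easy to check in a few lines of Python. This sketch is not from the talk; the variable names are mine, and the setup is the one on the slides: a fair coin worth $3.7 mil, a double-headed coin worth $0, one picked at random, and a single heads observed.

        # Sketch: Bayes' rule for the coin puzzle (slides 79-92)
        p_money = 0.5                      # P(A): prior on holding the $3.7 mil coin
        p_heads_given_money = 0.5          # P(B | A): fair coin shows heads half the time
        p_heads = 0.5 * 0.5 + 0.5 * 1.0    # P(B): heads overall, fair or double-headed

        p_money_given_heads = p_heads_given_money * p_money / p_heads
        print(p_money_given_heads)         # => 0.333..., i.e. 1/3
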
  93. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  94. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  95. YouTube Channel: Art of the Problem

  96. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  97. I lied about Bayes Rule

  98. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  99. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / ∑j P(B ∣ Aj) P(Aj)

  100. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / ∑j P(B ∣ Aj) P(Aj)
  101. Total Probability

  102. $3.7 mil $0

  103. $3.7 mil $0 Heads

  104. $3.7 mil $0 Tails Heads

  105. $3.7 mil $0 Heads Tails

  106. $3.7 mil $0 Heads Tails

  107. $3.7 mil $0 Heads Tails

  108. P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0) ($3.7 mil / $0, Heads / Tails)

  109. P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0) ($3.7 mil / $0, Heads / Tails)

  110. P(B) = ∑j P(B ∣ Aj) P(Aj) (Total Probability)

  111. probability of heads: P(B) = 0.5 * 0.5 + 0.5 * 1 = 0.75 (coin faces: H H / H T)

  112. P(B) = ∑j P(B ∣ Aj) P(Aj); P(Heads) = P(Heads ∣ $$$) P($$$) + P(Heads ∣ $0) P($0) (Total Probability)
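
    Slides 101-112 expand the denominator P(B) with the law of total probability: sum, over every hypothesis, the chance of the evidence under that hypothesis weighted by its prior. A sketch of mine using the slides' numbers:

        # Sketch: total probability, P(B) = sum_j P(B | Aj) P(Aj)
        coins = {
            "$$$": {"prior": 0.5, "p_heads": 0.5},  # fair coin, worth $3.7 mil
            "$0":  {"prior": 0.5, "p_heads": 1.0},  # double-headed coin, worth $0
        }
        p_heads = sum(c["p_heads"] * c["prior"] for c in coins.values())
        print(p_heads)  # => 0.75, matching slide 111
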
  113. Let’s make it tougher

  114. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / ∑j P(B ∣ Aj) P(Aj)

  115. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)

  116. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); P(HH ∣ Coini) = 0.5 * 0.5

  117. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); P(Coini) = 0.5

  118. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); ∑j P(B ∣ Aj) P(Aj) = P(HH ∣ $$$) P($$$) + P(HH ∣ $0) P($0) = 0.25(0.5) + 1.0(0.5)

  119. P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj); ∑j P(B ∣ Aj) P(Aj) = P(HH ∣ $$$) P($$$) + P(HH ∣ $0) P($0) = 0.25(0.5) + 1.0(0.5)

  120. P(Coin$$$ ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2; P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)

  121. P(Coin$$$ ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2; P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)

  122. P(Coini ∣ HH) = 0.25(0.5) / 0.625 = 1/5 = 0.2; P(Coini ∣ HH) = P(HH ∣ Coini) P(Coini) / ∑j P(HH ∣ Coinj) P(Coinj)
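
    The tougher problem on slides 113-122 conditions on two heads in a row. Another sketch of mine, assuming independent flips as the slides do; note how the extra evidence pushes the posterior for the fair coin down from 1/3 to 1/5:

        # Sketch: P(Coin_$$$ | HH) with a total-probability denominator
        p_hh_given_money = 0.5 * 0.5   # fair coin: two independent heads
        p_hh_given_zero = 1.0 * 1.0    # double-headed coin always lands heads
        prior = 0.5                    # each coin equally likely up front

        p_hh = p_hh_given_money * prior + p_hh_given_zero * prior  # = 0.625
        posterior = p_hh_given_money * prior / p_hh
        print(posterior)               # => 0.2, i.e. 1/5
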
  123. Who is ready for a break?

  124. Let's take a break from math

  125. With more math

  126. P(A ∣ B) = P(B ∣ A) P(A) / P(B)

  127. P(A ∣ B) = P(B ∣ A) P(A) / P(B); P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / P(B)

  128. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / P(B) (Prior)

  129. P(Ai ∣ B) = P(B ∣ Ai) P(Ai) / P(B) (Posterior)
  130. The Kalman filter is a recursive Bayes estimator

  131. Prediction/ Prior

  132. Measure/ Posterior

  133. None
  134. Simon D. Levy

  135. altitude(current time) = 0.75 * altitude(previous time)

  136. altitude(current time) = 0.75 * altitude(previous time)

  137. a = rate_of_decent = 0.75
       x = initial_position = 1000
       r = measure_error = x * 0.20

  138. x_guess = measure_array[0]
       p = estimate_error = 1
       x_guess_array = []
  139. for k in range(10):
           measure = measure_array[k]

  140. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess

  141. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a

  142. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + gain * (measure - x_guess)

  143. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + gain * (measure - x_guess)
       Low Predict Error, low gain

  144. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 0 * (measure - x_guess)
       Low Predict Error, low gain

  145. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 0 * (measure - x_guess)
       Low Predict Error, low gain

  146. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 1 * (measure - x_guess)
       High Predict Error, High gain

  147. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + 1 * (measure - x_guess)
       High Predict Error, High gain
  148. Prediction less certain

  149. Prediction more certain

  150. for k in range(10):
           measure = measure_array[k]
           # Predict
           x_guess = a * x_guess
           p = a * p * a
           # Update
           gain = p / (p + r)
           x_guess = x_guess + gain * (measure - x_guess)
           p = (1 - gain) * p
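
    Assembled from slides 137-150, here is a self-contained, runnable version of the falling-object filter. The slides never show where measure_array comes from, so the simulated noisy measurements below are my own assumption; the loop body follows the slides, including the final p = (1 - gain) * p variance update.

        import random

        a = rate_of_decent = 0.75      # altitude decays to 75% each step (slide 135)
        x = initial_position = 1000
        r = measure_error = x * 0.20

        # Assumption: the talk starts from a given measure_array; here we simulate one
        # by adding Gaussian noise to the true decaying altitude.
        truth = [x * a ** k for k in range(10)]
        measure_array = [t + random.gauss(0, r) for t in truth]

        x_guess = measure_array[0]
        p = estimate_error = 1
        x_guess_array = []

        for k in range(10):
            measure = measure_array[k]
            # Predict: project the estimate and its error forward one step
            x_guess = a * x_guess
            p = a * p * a
            # Update: blend in the measurement, weighted by the gain
            gain = p / (p + r)         # high prediction error => high gain
            x_guess = x_guess + gain * (measure - x_guess)
            p = (1 - gain) * p
            x_guess_array.append(x_guess)

        print(x_guess_array)
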
  151. None
  152. None
  153. None
  154. None
  155. None
  156. None
  157. None
  158. That’s it for Kalman Filters

  159. Bayes Rule

  160. Two most important parts

  161. None
  162. None
  163. None
  164. None
  165. None
  166. None
  167. None
  168. None
  169. Algorithms to Live By

  170. The Signal and the Noise

  171. Audio: Mozart Requiem in D minor https://www.youtube.com/watch?v=sPlhKP0nZII

  172. http://bit.ly/kalman-tutorial

  173. http://bit.ly/kalman-notebook

  174. Udacity & Georgia Tech

  175. BAE

  176. BAE

  177. BAE

  178. BAE

  179. Questions?

  180. Questions?

  181. Test Audio

  182. Test Audio 2

  183. Simon D. Levy

  184. None
  185. None
  186. None
  187. None
  188. None
  189. None
  190. None
  191. None
  192. None
  193. None
  194. What is g?

  195. None
  196. Prediction

  197. Measurement

  198. Convolution

  199. Prediction less certain

  200. Prediction more certain

  201. Prediction error is not constant

  202. What is g?

  203. None
  204. None
  205. What is g?

  206. None
  207. Introducing r

  208. None
  209. None
  210. None
  211. None
  212. Prediction + Measurement

  213. i.e. Prediction + Update

  214. Prediction Update

  215. Prediction Update ✅

  216. Prediction

  217. Prediction

  218. Prediction Update ✅ ✅

  219. $3.7 mil $0

  220. $3.7 mil $0