Slide 1

Speedier Simulations: Probability Density Estimation with Quasi-Monte Carlo Methods
Fred J. Hickernell, Illinois Institute of Technology
May 28, 2024

Slide 2

Overview
• Where does this arise in practice?
• How to choose $x_1, x_2, \ldots$?
• How to choose $n$ to get the desired accuracy?
• How to estimate the probability density of $Y = f(X)$?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

Slide 3

Where does this arise in practice?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

• $Y = f(X)$ = option payoff → option price $= \mu$
• $Y = f(X)$ = underground water pressure with random rock porosity → average water pressure $= \mu$
• $Y = f(X)$ = pixel intensity from a random ray → average pixel intensity $= \mu$
• The dimension $d$ may be dozens or hundreds

Slide 4

How to choose $x_1, x_2, \ldots$?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

• Grids do not fill space well
• Grids are hard to extend
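As a quick illustration (not from the slides) of why grids are hard to extend, this minimal NumPy sketch counts the points in a tensor-product grid on $[0,1]^6$: each refinement multiplies the cost, so there is no way to add just a few more points.

```python
import numpy as np
from itertools import product

# A tensor-product grid on [0,1]^d with m points per axis needs m**d points.
# In d = 6 even m = 2 already costs 64 points, and the next refinement
# (m = 3) jumps to 729 -- there is no way to add "just a few more" points.
d = 6
for m in (2, 3, 4):
    grid = np.array(list(product(np.linspace(0, 1, m), repeat=d)))
    print(f"{m} points per axis -> {grid.shape[0]} grid points in d = {d}")
```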

Slide 5

How to choose $x_1, x_2, \ldots$?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

IID points fill space better than grids.

[Figure: 64 independent and identically distributed (IID) points, $d = 6$; pairwise projections $x_{i1}$ vs. $x_{i2}$, $x_{i1}$ vs. $x_{i3}$, $x_{i1}$ vs. $x_{i4}$]

Slide 6

How to choose $x_1, x_2, \ldots$?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

Low discrepancy points fill space even better!

[Figure: 64 low discrepancy points, $d = 6$; pairwise projections $x_{i1}$ vs. $x_{i2}$, $x_{i1}$ vs. $x_{i3}$, $x_{i1}$ vs. $x_{i4}$]
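A minimal sketch of the comparison in the two figures above, using scipy.stats.qmc as a stand-in for the QMCPy generators mentioned on the next slide; the seed and the choice of scrambled Sobol' points are illustrative assumptions, not what was used to make the slides.

```python
import numpy as np
from scipy.stats import qmc

n, d = 64, 6
rng = np.random.default_rng(7)

# 64 IID uniform points in [0,1]^6 (as on slide 5) ...
x_iid = rng.random((n, d))
# ... versus 64 scrambled Sobol' low discrepancy points (as on slide 6).
x_ld = qmc.Sobol(d=d, scramble=True, seed=7).random(n)

# scipy's centered L2-discrepancy is one numerical measure of space filling;
# the low discrepancy set should score noticeably lower.
print("IID discrepancy:", qmc.discrepancy(x_iid))
print("LD  discrepancy:", qmc.discrepancy(x_ld))
```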

Slide 7

How to choose $n$ to get the desired accuracy?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

$|\mu - \hat{\mu}_n| \le \mathrm{discrepancy}(\{x_i\}_{i=1}^{n}) \cdot \mathrm{variation}(f)$, and we want $|\mu - \hat{\mu}_n| \le \varepsilon$

• There is an explicit formula for the discrepancy, but the variation is hard to compute in practice
• $\mathrm{discrepancy}(\{x_i\}_{i=1}^{n})$ is $\mathcal{O}(n^{-1})$ for low discrepancy points versus $\mathcal{O}(n^{-1/2})$ for IID points
• The QMCPy software library has rigorous data-based rules for choosing $n$
• QMCPy also has routines for generating low discrepancy points
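The $\mathcal{O}(n^{-1})$ versus $\mathcal{O}(n^{-1/2})$ behavior can be seen empirically. This sketch uses a toy integrand with a known integral (my choice, not from the slides) and scipy's Sobol' generator; it illustrates the rates only and is not QMCPy's data-based stopping rule.

```python
import numpy as np
from scipy.stats import qmc

# Toy integrand (my choice, not from the slides): each factor integrates to 1,
# so the exact answer is mu = 1 and the error |mu - mu_hat_n| is easy to track.
d = 6
f = lambda x: np.prod(1.0 + (x - 0.5), axis=1)
mu = 1.0
rng = np.random.default_rng(0)

for m in range(7, 15):                     # n = 2**7, ..., 2**14
    n = 2 ** m
    err_iid = abs(f(rng.random((n, d))).mean() - mu)
    err_ld = abs(f(qmc.Sobol(d=d, scramble=True, seed=0).random(n)).mean() - mu)
    print(f"n = {n:6d}   IID error = {err_iid:.2e}   LD error = {err_ld:.2e}")
```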

Slide 8

How to estimate the probability density of $Y = f(X)$?

$\mu := \underbrace{\mathbb{E}[f(X)]}_{\text{expectation},\ X \sim \mathcal{U}([0,1]^d)} = \underbrace{\int_{[0,1]^d} f(x)\,\mathrm{d}x}_{\text{integral}} \approx \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(x_i)}_{\text{sample mean}} =: \hat{\mu}_n$

Kernel density estimation (KDE):
$\varrho(y) \approx \int_{-\infty}^{\infty} \frac{\tilde{k}\bigl((y - z)/h\bigr)}{h}\,\varrho(z)\,\mathrm{d}z \approx \frac{1}{n}\sum_{i=1}^{n} \frac{\tilde{k}\bigl((y - f(x_i))/h\bigr)}{h} =: \hat{\varrho}(y)$

[Figure: the Gaussian kernel $\exp(-(y/h)^2)/(\sqrt{\pi}\,h)$ at bandwidths $h = 0.5$, $1.0$, and $2.0$]
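A direct implementation of the estimator $\hat{\varrho}(y)$ above with the Gaussian kernel $\tilde{k}(u) = \exp(-u^2)/\sqrt{\pi}$ from the figure; the standard-normal test samples, grid, and seed are illustrative assumptions.

```python
import numpy as np

def kde(y_grid, y_samples, h):
    """rho_hat(y) = (1/n) * sum_i k((y - Y_i)/h) / h with the Gaussian
    kernel k(u) = exp(-u**2) / sqrt(pi), which integrates to one."""
    u = (y_grid[:, None] - y_samples[None, :]) / h
    return np.exp(-u**2).sum(axis=1) / (np.sqrt(np.pi) * h * len(y_samples))

# Sanity check on standard normal samples at the bandwidths from the figure:
# each estimate should integrate to roughly one, and larger h means smoother.
rng = np.random.default_rng(1)
y_samples = rng.standard_normal(1000)
y_grid = np.linspace(-8, 8, 401)
dy = y_grid[1] - y_grid[0]
for h in (0.5, 1.0, 2.0):
    rho_hat = kde(y_grid, y_samples, h)
    print(f"h = {h}: integral of rho_hat = {rho_hat.sum() * dy:.3f}")
```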

Slide 9

How to estimate the probability density of $Y = f(X)$?

Kernel density estimation (KDE):
$\varrho(y) \approx \int_{-\infty}^{\infty} \frac{\tilde{k}\bigl((y - z)/h\bigr)}{h}\,\varrho(z)\,\mathrm{d}z \approx \frac{1}{n}\sum_{i=1}^{n} \frac{\tilde{k}\bigl((y - f(x_i))/h\bigr)}{h} =: \hat{\varrho}(y)$

$f(x) = 10 \exp(-w_1 x_1 - \cdots - w_d x_d)\,\sin(w_1 x_1 + \cdots + w_d x_d)$

[Figure: kernel density estimates of $Y = f(X)$ at bandwidths $h = 0.05$ and $h = 0.20$]
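A sketch that evaluates the slide's test function at scrambled Sobol' points and forms $\hat{\varrho}$ at the two bandwidths shown in the figure. The weights $w_j = 1/j$, the sample size, and the seed are hypothetical choices, since the slide does not specify them.

```python
import numpy as np
from scipy.stats import qmc

# Test function from the slide: f(x) = 10 * exp(-w.x) * sin(w.x).
# The slide does not give the weights; w_j = 1/j is a hypothetical choice,
# as are the sample size and seed.
d = 6
w = 1.0 / np.arange(1, d + 1)
f = lambda x: 10.0 * np.exp(-(x @ w)) * np.sin(x @ w)

y_samples = f(qmc.Sobol(d=d, scramble=True, seed=3).random(2 ** 10))
y_grid = np.linspace(y_samples.min() - 1, y_samples.max() + 1, 400)

# Form rho_hat at the two bandwidths shown on the slide.
for h in (0.05, 0.20):
    u = (y_grid[:, None] - y_samples[None, :]) / h
    rho_hat = np.exp(-u**2).sum(axis=1) / (np.sqrt(np.pi) * h * len(y_samples))
    print(f"h = {h:.2f}: peak of rho_hat = {rho_hat.max():.3f}")
```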

Slide 10

How to estimate the probability density of $Y = f(X)$?

Kernel density estimation (KDE):
$\varrho(y) \approx \int_{-\infty}^{\infty} \frac{\tilde{k}\bigl((y - z)/h\bigr)}{h}\,\varrho(z)\,\mathrm{d}z \approx \frac{1}{n}\sum_{i=1}^{n} \frac{\tilde{k}\bigl((y - f(x_i))/h\bigr)}{h} =: \hat{\varrho}(y)$

• Guillem has implemented scipy's KDE with low discrepancy points, but its accuracy is not yet understood
• My initial thoughts on the theory are in this Overleaf, which also includes some recent articles by others
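For reference, a minimal sketch of feeding low discrepancy samples of $Y$ into scipy's gaussian_kde, roughly in the spirit of the implementation mentioned above; the test function, weights, and bandwidth rule here are my assumptions, not Guillem's code.

```python
import numpy as np
from scipy.stats import gaussian_kde, qmc

# Same hypothetical test function and weights as in the previous sketch.
d = 6
w = 1.0 / np.arange(1, d + 1)
f = lambda x: 10.0 * np.exp(-(x @ w)) * np.sin(x @ w)

# Y-values at scrambled Sobol' points, fed straight into scipy's Gaussian KDE.
y_samples = f(qmc.Sobol(d=d, scramble=True, seed=5).random(2 ** 10))

# gaussian_kde picks its bandwidth automatically (Scott's rule by default);
# bw_method can be set explicitly to study the bandwidth/accuracy trade-off.
kde = gaussian_kde(y_samples, bw_method="scott")
y_grid = np.linspace(y_samples.min(), y_samples.max(), 200)
print("peak of the estimated density:", kde(y_grid).max())
```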

Slide 11

Next steps today and tomorrow
• Visit qmcpy.org to understand more about our research
• Look at the Colab notebook used to generate the figures (https://tinyurl.com/SURE2024KickoffColab) and tinker with it
• Join our Speedy Simulations Slack group
• Email me at [email protected] with questions, or message me on Slack

Slide 12

Steps for the next 10 weeks
• We will explore experimentally how the choices of kernels, bandwidths, and numbers of points affect accuracy
• We will develop theory to show when low discrepancy sequences are better
• We will learn good collaboration and software development practices with the Speedy Simulations team
• We may present our results at conferences