Slide 1

Surrogate modeling for stochastic simulators: an overview and recent developments
Xujia Zhu
Mar. 8, 2024

Slide 2

Risk, Safety & Uncertainty Quantification
Prof. Dr. Bruno Sudret, Prof. Dr. Marco Broccardo, Dr. Nora Lüthen
Gif-sur-Yvette, 08.03.24

Slide 3

Outline
1. Stochastic simulators
2. Review of stochastic emulators
3. Generalized lambda model
4. Stochastic polynomial chaos expansions
5. Conclusion & outlook

Slide 4

Computational model
▶ Physical laws: conservation of mass, conservation of momentum, conservation of energy
▶ Quantitative modeling: geometry, material properties, boundary & initial conditions
▶ Analytical/numerical solver: discretization, numerical algorithms, implementation

Slide 5

Deterministic vs. stochastic simulators

Deterministic simulators
▶ A given set of input parameters has a unique corresponding output value: $\mathcal{M}_d: \mathcal{D}_X \subset \mathbb{R}^M \to \mathbb{R}$

Stochastic simulators
▶ A given set of input parameters can lead to different values of the output
▶ $Y_x = \mathcal{M}_s(x)$ is a random variable
▶ Source of randomness: $Y_x = \mathcal{M}_d(x, \Xi)$

Slide 6

Deterministic vs. stochastic simulators

Deterministic simulators
▶ Each set of input variables has a unique corresponding output: $\mathcal{M}_d: \mathcal{D}_X \subset \mathbb{R}^M \to \mathbb{R}$

Stochastic simulators
▶ A given set of input parameters can lead to different values of the output
▶ $Y_x = \mathcal{M}_s(x)$ is a random variable
▶ Source of randomness: $Y_x = \mathcal{M}_d(x, \Xi)$
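To make the distinction concrete, here is a minimal Python sketch (not taken from the slides): a deterministic toy model returns the same value for repeated calls with the same x, whereas a toy stochastic simulator draws a hidden variable Ξ internally and therefore returns different values. The model forms and the use of NumPy are illustrative assumptions.

```python
import numpy as np

def deterministic_simulator(x):
    """Deterministic map M_d: each x gives exactly one output value."""
    return np.sin(3.0 * x) + 0.5 * x

def stochastic_simulator(x, rng):
    """Toy stochastic simulator M_s: the latent variable Xi is drawn internally,
    so repeated calls with the same x give different outputs."""
    xi = rng.standard_normal()                      # hidden source of randomness Xi
    return np.sin(3.0 * x) + 0.5 * x + 0.3 * (1.0 + x**2) * xi

rng = np.random.default_rng(0)
x = 0.7
print(deterministic_simulator(x), deterministic_simulator(x))   # identical values
print(stochastic_simulator(x, rng), stochastic_simulator(x, rng))  # different values
```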

Slide 7

Why stochastic simulators?

Slide 8

Why stochastic simulators?
▶ The relevant variables are extremely high-dimensional

Example: simulation-based seismic fragility analysis
[Figure: workflow from ground motion parameters (effective duration, Arias intensity, main frequency, ...) to stochastic ground motion, structural dynamics, and fragility analysis]

Rezaeian & Der Kiureghian (2010). Simulation of synthetic ground motions for specified earthquake and site characteristics. Earthquake Engng. Struct. Dyn.

Slide 9

Why stochastic simulators?
▶ The relevant variables are extremely high-dimensional
▶ Some variables do not have significant physical meaning or interest

Example: agent-based model
▶ Input variables: initial configurations, population characteristics, intervention, etc.
▶ Latent variables: detailed interactions between individuals, microscopic events, etc.

Cuevas (2020). An agent-based model to evaluate the COVID-19 transmission risks in facilities. Comput. Biol. Med.

Slide 10

Why stochastic simulators?
▶ The relevant variables are extremely high-dimensional
▶ Some variables do not have significant physical meaning or interest
▶ Some uncertainty sources cannot be accessed or even controlled

Example: hybrid simulation
[Figure: hybrid simulation coupling a physical substructure (PS) with numerical substructures (NS); parameters include E, I, A, L, ρ, α, stiffnesses K1-K3, inertias J1, J2, mass M3, damping C3, and loading F(t), θ(t)]

Tsokanas et al. (2021). A global sensitivity analysis framework for hybrid simulation with stochastic substructures. Front. Built Environ.

Slide 11

Computational costs induced by stochastic simulators
▶ Replications are needed to estimate the probability distribution of $Y_x$ (i.e., $Y \mid X = x$)
▶ Various values of $X$ should be investigated for optimization, uncertainty quantification, etc.
▶ Realistic simulators (e.g., for wind turbine design) are costly

Slide 12

Stochastic surrogate models

Slide 13

Outline
1. Stochastic simulators
2. Review of stochastic emulators
   - Random field representation
   - Replication-based approach
   - Statistical models
3. Generalized lambda model
4. Stochastic polynomial chaos expansions
5. Conclusion & outlook

Slide 14

Random field representation

Main idea
▶ Consider the stochastic simulator as a random field indexed by the input variables: $Y_x(\omega) = \mathcal{M}_d(x, \Xi(\omega))$
▶ Fixing the internal stochasticity, i.e., $\xi^{(1)} = \Xi(\omega^{(1)})$, gives access to trajectories $x \mapsto \mathcal{M}_d(x, \xi^{(1)})$

Slide 15

Random field representation

Main idea
▶ Consider the stochastic simulator as a random field indexed by the input variables: $Y_x(\omega) = \mathcal{M}_d(x, \Xi(\omega))$
▶ Fixing the internal stochasticity, i.e., $\xi^{(1)} = \Xi(\omega^{(1)})$, gives access to trajectories $x \mapsto \mathcal{M}_d(x, \xi^{(1)})$

Literature
▶ Trajectory emulation with PCE: Azzi et al. (2019). Surrogate modeling of stochastic functions - application to computational electromagnetic dosimetry. Int. J. Uncertainty Quantification.
▶ Trajectory emulation with Kriging: Pearce et al. (2022). Bayesian optimization allowing for common random numbers. Oper. Res.
▶ Full random field construction: Lüthen et al. (2023). A spectral surrogate model for stochastic simulators computed from trajectory samples. Comput. Methods Appl. Mech. Eng.
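A minimal sketch of the trajectory idea, assuming a toy simulator and NumPy's seedable random generator: re-seeding the generator identically for every x freezes ξ = Ξ(ω), so each seed yields one deterministic trajectory that could then be fitted with an ordinary surrogate, as in the PCE- and Kriging-based works cited above.

```python
import numpy as np

def simulator(x, rng):
    """Toy stochastic simulator: all randomness enters through xi = Xi(omega)."""
    xi = rng.standard_normal(2)
    return np.exp(-x) * xi[0] + np.sin(2 * np.pi * x) * xi[1]

def trajectory(x_grid, seed):
    """Re-seeding identically for every x freezes the internal stochasticity,
    so x -> M_d(x, xi) becomes one deterministic trajectory of the random field."""
    return np.array([simulator(x, np.random.default_rng(seed)) for x in x_grid])

x_grid = np.linspace(0.0, 1.0, 50)
traj_1 = trajectory(x_grid, seed=1)   # one realization (trajectory) of the random field
traj_2 = trajectory(x_grid, seed=2)   # an independent trajectory
# Each trajectory is an ordinary deterministic function of x and could now be
# fitted with a standard surrogate (PCE, Kriging), as in the references above.
print(traj_1[:3], traj_2[:3])
```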

Slide 16

Replication-based approach

Main idea
▶ Estimate distributions/QoIs based on replications
▶ Apply standard regression techniques to the estimated quantities

Slide 17

Replication-based approach

Main idea
▶ Estimate distributions/QoIs based on replications
▶ Apply standard regression techniques to the estimated quantities

Literature
▶ Stochastic Kriging: Ankenman et al. (2010). Stochastic Kriging for simulation metamodeling. Oper. Res.
▶ PCE-based mean-variance estimation: Murcia et al. (2018). Uncertainty propagation through an aeroelastic wind turbine model using polynomial surrogates. Renew. Energy.
▶ Quantile Kriging: Plumlee & Tuo (2014). Building accurate emulators for stochastic simulations via quantile Kriging. Technometrics.
▶ Kernel density estimation: Moutoussamy et al. (2015). Emulators for stochastic simulation codes. ESAIM: Math. Model. Num. Anal.
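A sketch of the replication-based workflow under simplifying assumptions (a made-up 1-d simulator, plain polynomial least squares standing in for the stochastic-Kriging and PCE regressions cited above): replicate at each design point, estimate mean and variance, then regress those estimates on x.

```python
import numpy as np

def simulator(x, rng):
    """Toy stochastic simulator with x-dependent mean and variance."""
    return x**2 + (0.2 + 0.5 * x) * rng.standard_normal()

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 20)           # experimental design
R = 50                                  # replications per design point

# Step 1: estimate the quantities of interest from replications.
reps = np.array([[simulator(x, rng) for _ in range(R)] for x in X])
mean_hat = reps.mean(axis=1)
var_hat = reps.var(axis=1, ddof=1)

# Step 2: apply standard (deterministic) regression to the estimates.
# Ordinary polynomial least squares stands in for stochastic Kriging / PCE here.
mean_coef = np.polyfit(X, mean_hat, deg=3)
logvar_coef = np.polyfit(X, np.log(var_hat), deg=3)

x_new = 0.35
print("predicted mean:", np.polyval(mean_coef, x_new))
print("predicted std :", np.exp(0.5 * np.polyval(logvar_coef, x_new)))
```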

Slide 18

Statistical models

Statistical assumptions
▶ Data generation process, e.g., linear models: $Y = aX + b + \epsilon$

Estimation method
▶ A framework to infer the model
▶ Regression method: loss function, e.g., mean-squared error, check function loss

Distribution estimation
▶ Parametric family: normal distribution, exponential family
▶ Non-parametric models: kernel estimation, logistic Gaussian process
▶ Latent variable models: variational auto-encoder, GAN, normalizing flow, diffusion model

Slide 19

Assuming normality

Main idea
▶ Response distributions are normal
▶ The mean function $\mu(x)$ and the log-variance function $\log(V(x))$ are modeled by Gaussian processes

Literature
▶ Full Bayesian setup: Goldberg et al. (1997). Regression with input-dependent noise: a Gaussian process treatment. NIPS 10.
▶ Variational Bayes: Lazaro-Gredilla & Titsias (2011). Variational heteroscedastic Gaussian process regression. ICML.
▶ MAP leveraging replications: Binois et al. (2018). Practical heteroscedastic Gaussian process modeling for large simulation experiments. J. Comput. Graph. Stat.
▶ Iterative fitting: Marrel et al. (2012). Global sensitivity analysis of stochastic computer models with joint metamodels. Stat. Comput.
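A simplified two-stage sketch of the normality assumption, using scikit-learn (a library choice not made in the talk): one GP is fitted to replication means and another to the log of replication variances. The cited approaches treat the two functions jointly (full Bayes, variational inference, or MAP with replications); this stand-in only illustrates the modeling assumption.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 15)[:, None]      # experimental design (15 points)
R = 30                                       # replications per design point
reps = X**2 + (0.1 + 0.4 * X) * rng.standard_normal((15, R))

# Two-stage stand-in for the heteroscedastic-normal assumption: mu(x) and
# log V(x) each get an independent GP fitted to replication-based estimates.
gp_mean = GaussianProcessRegressor(RBF(0.2) + WhiteKernel(1e-3)).fit(X, reps.mean(axis=1))
gp_logvar = GaussianProcessRegressor(RBF(0.2) + WhiteKernel(1e-3)).fit(
    X, np.log(reps.var(axis=1, ddof=1)))

x_new = np.array([[0.5]])
mu = gp_mean.predict(x_new)[0]
sigma = np.exp(0.5 * gp_logvar.predict(x_new)[0])
print(f"Y | x=0.5  ~  N({mu:.3f}, {sigma:.3f}^2) under the normality assumption")
```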

Slide 20

Kernel estimation

Main idea
▶ Estimate the joint distribution of $(X, Y)$ by kernel smoothing, then compute the conditional PDF:
$$\hat{f}(y \mid x) = \frac{\hat{f}(x, y)}{\hat{f}(x)}$$

Literature
▶ Density estimation: Hall et al. (2004). Cross-validation and the estimation of conditional probability densities. J. Amer. Stat. Assoc.
▶ CDF and quantile estimations: Li et al. (2013). Optimal bandwidth selection for nonparametric conditional distribution and quantile functions. J. Bus. Econ. Stat.
▶ Local regression: Fan et al. (1996). Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems. Biometrika.
▶ Dimension reduction: Hall & Yao (2005). Conditional distribution function approximation, and prediction, using dimension reduction. Ann. Stats.
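A sketch of the ratio estimator above using SciPy's gaussian_kde with its default bandwidth rule; the toy data are made up, and the cited papers are precisely about selecting these bandwidths more carefully (e.g., by cross-validation).

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
N = 2000
x = rng.uniform(0.0, 1.0, N)
y = np.sin(2 * np.pi * x) + (0.1 + 0.3 * x) * rng.standard_normal(N)

# Joint and marginal KDEs with scipy's default bandwidth rule; the papers
# cited above focus on choosing these bandwidths (e.g., by cross-validation).
kde_xy = gaussian_kde(np.vstack([x, y]))
kde_x = gaussian_kde(x)

def conditional_pdf(y_grid, x0):
    """f_hat(y | x0) = f_hat(x0, y) / f_hat(x0)."""
    pts = np.vstack([np.full_like(y_grid, x0), y_grid])
    return kde_xy(pts) / kde_x(x0)

y_grid = np.linspace(-2.0, 2.0, 201)
pdf = conditional_pdf(y_grid, x0=0.25)
print("approx. conditional mean at x=0.25:",
      np.sum(y_grid * pdf) * (y_grid[1] - y_grid[0]))
```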

Slide 21

Latent variable models

Main idea
▶ Introduce explicit latent variables $Z$ into a deterministic model to emulate the random nature of stochastic simulators:
$$Y_x \overset{d}{\approx} \tilde{\mathcal{M}}_\theta(x, Z)$$

Literature
▶ Conditional VAE: Sohn et al. (2015). Learning structured output representation using deep conditional generative models. NIPS 2015.
▶ Diffusion model: Rombach et al. (2022). High-resolution image synthesis with latent diffusion models. CVPR.
▶ Conditional GAN: Yan & Perdikaris (2019). Conditional deep surrogate models for stochastic, high-dimensional, and multi-fidelity systems. Comput. Mech.
▶ Kernel embedding: Thakur & Chakraborty (2022). A deep learning based surrogate model for stochastic simulators. Probabilistic Eng. Mech.

Slide 22

Recap of different approaches

Random field representation
▶ Produces the entire random field
▶ Requires seed control

Replication-based approach
▶ Allows reusing existing deterministic surrogates
▶ Necessitates replications
▶ Demands a trade-off between exploration and replication

Statistical models
▶ Run the simulator as it is
▶ Encompass a wide array of methods
▶ Require a balance between model flexibility and sample size

Slide 23

Outline
1. Stochastic simulators
2. Review of stochastic emulators
3. Generalized lambda model
   - Generalized lambda distribution
   - Generalized lambda model
4. Stochastic polynomial chaos expansions
5. Conclusion & outlook

Slide 24

Generalized lambda distribution

The response probability distribution of $Y_x$ can be approximated by the generalized lambda distribution (GLD).

[Figure: analytical PDF vs. GLD approximation for the standard normal, Student's t(5), Exponential(1), and Weibull(1,2) distributions]

Slide 25

Generalized lambda distribution

The response probability distribution of $Y_x$ can be approximated by the generalized lambda distribution (GLD).
▶ It can approximate many common parametric distributions
▶ The quantile function reads
$$Q(u) \overset{\text{def}}{=} F^{-1}(u) = \lambda_1 + \frac{1}{\lambda_2}\left(\frac{u^{\lambda_3} - 1}{\lambda_3} - \frac{(1-u)^{\lambda_4} - 1}{\lambda_4}\right)$$
▶ The probability density function (PDF) is implicitly given by
$$f_Y(y) = \frac{\lambda_2}{u^{\lambda_3 - 1} + (1-u)^{\lambda_4 - 1}}\,\mathbf{1}_{[0,1]}(u), \quad \text{where } y = Q(u)$$
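A sketch of the two formulas above in Python (the parameterization shown on the slide, assuming λ3, λ4 ≠ 0): the quantile function is explicit, while evaluating the PDF at a given y requires solving y = Q(u) numerically. The λ values below are illustrative, not taken from the talk.

```python
import numpy as np
from scipy.optimize import brentq

def gld_quantile(u, lam):
    """Quantile function Q(u) of the GLD (assumes lambda3, lambda4 != 0)."""
    l1, l2, l3, l4 = lam
    return l1 + ((u**l3 - 1.0) / l3 - ((1.0 - u)**l4 - 1.0) / l4) / l2

def gld_pdf(y, lam):
    """PDF at y: solve y = Q(u) for u, then f(y) = lambda2 / (u**(l3-1) + (1-u)**(l4-1))."""
    l1, l2, l3, l4 = lam
    eps = 1e-12
    u = brentq(lambda v: gld_quantile(v, lam) - y, eps, 1.0 - eps)
    return l2 / (u**(l3 - 1.0) + (1.0 - u)**(l4 - 1.0))

lam = (0.0, 1.0, 0.2, 0.2)          # illustrative symmetric case (lambda3 = lambda4 > 0)
print(gld_quantile(0.5, lam))       # median = lambda1 = 0 for this symmetric GLD
# With lambda3, lambda4 > 0 the support is bounded:
# [lambda1 - 1/(lambda2*lambda3), lambda1 + 1/(lambda2*lambda4)]
u = np.linspace(1e-6, 1.0 - 1e-6, 1001)
ys = gld_quantile(u, lam)
pdf = np.array([gld_pdf(v, lam) for v in ys])
print(np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(ys)))   # integrates to ~1
```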

Slide 26

Properties of GLD
▶ $\lambda_3$ and $\lambda_4$ control the shape and boundedness:
$$B_l(\boldsymbol{\lambda}) = \begin{cases} -\infty, & \lambda_3 \le 0 \\ \lambda_1 - \dfrac{1}{\lambda_2 \lambda_3}, & \lambda_3 > 0 \end{cases} \qquad B_u(\boldsymbol{\lambda}) = \begin{cases} +\infty, & \lambda_4 \le 0 \\ \lambda_1 + \dfrac{1}{\lambda_2 \lambda_4}, & \lambda_4 > 0 \end{cases}$$
▶ Rich tail behaviors
▶ Moments, quantiles, and superquantiles can be computed analytically

[Figure: regions of the $(\lambda_3, \lambda_4)$ plane where moments of order $k$ do not exist]

Slide 27

Generalized lambda model (GLaM)

General setting
$$Y_x \sim \mathrm{GLD}\bigl(\lambda_1(x), \lambda_2(x), \lambda_3(x), \lambda_4(x)\bigr)$$

Polynomial chaos expansions of the distribution parameters
$$\lambda_k(x) \approx \lambda_k^{\mathrm{PCE}}(x; c) = \sum_{\alpha \in \mathcal{A}_k} c_{k,\alpha}\, \psi_\alpha(x), \quad k = 1, 3, 4$$
$$\lambda_2(x) \approx \lambda_2^{\mathrm{PCE}}(x; c) = \exp\left(\sum_{\alpha \in \mathcal{A}_2} c_{2,\alpha}\, \psi_\alpha(x)\right)$$
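A sketch of how the λk(x) maps could be evaluated from PCE coefficients, assuming a 1-d input, a (non-normalized) Legendre basis on [-1, 1], and made-up coefficient vectors; the exp transform keeps λ2(x) positive, as in the formula above.

```python
import numpy as np
from numpy.polynomial import legendre

def lambda_pce(x, coeffs, positive=False):
    """Evaluate lambda_k(x) from its coefficients in a Legendre basis on [-1, 1];
    the exp transform enforces positivity (used for lambda_2)."""
    val = legendre.legval(x, coeffs)
    return np.exp(val) if positive else val

# Made-up coefficient vectors for a 1-d input (illustration only); the
# truncation sets A_k may differ for each lambda_k.
c = {
    1: np.array([0.5, 1.0, 0.3]),    # lambda_1(x): location
    2: np.array([-0.2, 0.4]),        # log of lambda_2(x): scale-like parameter
    3: np.array([0.1]),              # lambda_3: low-degree (here constant)
    4: np.array([0.15]),             # lambda_4: low-degree (here constant)
}

x = 0.3
lam_x = [lambda_pce(x, c[k], positive=(k == 2)) for k in (1, 2, 3, 4)]
print("GLD parameters at x = 0.3:", lam_x)
```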

Slide 28

Model construction

Data generation
▶ Experimental design (ED) of size $N$ in the $X$-space: $\mathcal{X} = \{x^{(1)}, x^{(2)}, \ldots, x^{(N)}\}$
▶ The model is evaluated only once per ED point (i.e., no replications): $y^{(i)} \overset{\text{def}}{=} \mathcal{M}_d(x^{(i)}, \xi^{(i)})$

Selection of PCE bases
▶ Modified feasible generalized least squares for $\lambda_1^{\mathrm{PCE}}$ and $\lambda_2^{\mathrm{PCE}}$
▶ Low-degree polynomials for $\lambda_3^{\mathrm{PCE}}$ and $\lambda_4^{\mathrm{PCE}}$

Maximum likelihood estimation
$$\hat{c} = \arg\max_c \frac{1}{N} \sum_{i=1}^{N} \log f^{\mathrm{GLD}}_{Y \mid X}\bigl(y^{(i)}; \boldsymbol{\lambda}^{\mathrm{PCE}}(x^{(i)}; c)\bigr)$$
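A heavily simplified sketch of the maximum likelihood step, assuming a toy 1-d simulator, λ1 and log λ2 linear in x, constant λ3, λ4, and a generic Nelder-Mead optimizer; the actual GLaM construction uses modified feasible generalized least squares for basis selection and a dedicated fitting scheme. Evaluating the GLD log-PDF requires inverting y = Q(u) for every data point, which makes this didactic implementation slow.

```python
import numpy as np
from scipy.optimize import brentq, minimize

rng = np.random.default_rng(0)
N = 100
x = rng.uniform(0.0, 1.0, N)                        # experimental design, no replications
y = x + (0.3 + 0.5 * x) * rng.standard_normal(N)    # one toy model run per design point

def gld_logpdf(yi, lam):
    """GLD log-PDF at yi; -inf outside the support or for degenerate shape parameters."""
    l1, l2, l3, l4 = lam
    if min(abs(l3), abs(l4)) < 1e-6:
        return -np.inf
    Q = lambda u: l1 + ((u**l3 - 1.0) / l3 - ((1.0 - u)**l4 - 1.0) / l4) / l2
    eps = 1e-10
    if not (Q(eps) < yi < Q(1.0 - eps)):
        return -np.inf
    u = brentq(lambda v: Q(v) - yi, eps, 1.0 - eps)   # invert y = Q(u)
    return np.log(l2) - np.log(u**(l3 - 1.0) + (1.0 - u)**(l4 - 1.0))

def neg_loglik(c):
    """lambda_1 and log(lambda_2) linear in x; lambda_3, lambda_4 constant."""
    total = 0.0
    for xi, yi in zip(x, y):
        lam = (c[0] + c[1] * xi, np.exp(c[2] + c[3] * xi), c[4], c[5])
        total -= gld_logpdf(yi, lam)
    return total

c0 = np.array([y.mean(), 0.0, 0.0, 0.0, 0.1, 0.1])   # rough but feasible starting point
res = minimize(neg_loglik, c0, method="Nelder-Mead", options={"maxiter": 1000})
print("fitted coefficients:", res.x)
```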

Slide 29

Examples
▶ Comparison with one of the state-of-the-art kernel conditional density estimators (KCDE)¹

[Figure: comparison for a 2-d geometric Brownian motion and a 2-d stochastic SIR model]

¹ Hayfield & Racine (2008). Nonparametric Econometrics: The np Package. J. Stat. Softw.

Slide 30

Examples
▶ Comparison with one of the state-of-the-art kernel conditional density estimators (KCDE)¹
▶ Normalized Wasserstein distance as a performance indicator:
$$\varepsilon \overset{\text{def}}{=} \mathbb{E}_X\!\left[\frac{d_{\mathrm{WS}}^2\bigl(Y_X, \tilde{Y}_X\bigr)}{\operatorname{Var}[Y_X]}\right],$$
where $d_{\mathrm{WS}}$ is the Wasserstein distance of order 2:
$$d_{\mathrm{WS}}^2(Y, \tilde{Y}) \overset{\text{def}}{=} \bigl\|Q_Y - Q_{\tilde{Y}}\bigr\|_{L^2}^2 = \int_0^1 \bigl(Q_Y(u) - Q_{\tilde{Y}}(u)\bigr)^2 \, du$$

¹ Hayfield & Racine (2008). Nonparametric Econometrics: The np Package. J. Stat. Softw.
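A sketch of how the error measure above can be estimated from samples: the order-2 Wasserstein distance between two empirical distributions is the L2 distance between their empirical quantile functions, and the outer expectation over X is replaced by an average over test points. The toy simulator/"emulator" pair below is made up for illustration.

```python
import numpy as np

def wasserstein2_sq(samples_a, samples_b, n_quantiles=1000):
    """Squared order-2 Wasserstein distance via empirical quantile functions."""
    u = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qa = np.quantile(samples_a, u)
    qb = np.quantile(samples_b, u)
    return np.mean((qa - qb) ** 2)

def normalized_ws_error(simulator, emulator, x_test, n_rep=1000, seed=0):
    """Monte Carlo estimate of E_X[ d_WS^2(Y_X, Y~_X) / Var(Y_X) ]."""
    rng = np.random.default_rng(seed)
    errs = []
    for x in x_test:
        y_true = np.array([simulator(x, rng) for _ in range(n_rep)])
        y_emu = np.array([emulator(x, rng) for _ in range(n_rep)])
        errs.append(wasserstein2_sq(y_true, y_emu) / np.var(y_true))
    return np.mean(errs)

# Toy check: an "emulator" with a slightly biased mean (illustration only).
simulator = lambda x, rng: x + (0.3 + 0.5 * x) * rng.standard_normal()
emulator = lambda x, rng: 0.05 + x + (0.3 + 0.5 * x) * rng.standard_normal()
print(normalized_ws_error(simulator, emulator, x_test=np.linspace(0.1, 0.9, 9)))
```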

Slide 31

Examples
▶ Comparison with one of the state-of-the-art kernel conditional density estimators (KCDE)¹
▶ Normalized Wasserstein distance as a performance indicator

[Figure: results for a 2-d geometric Brownian motion and a 2-d stochastic SIR model]

¹ Hayfield & Racine (2008). Nonparametric Econometrics: The np Package. J. Stat. Softw.

Slide 32

Outline
1. Stochastic simulators
2. Review of stochastic emulators
3. Generalized lambda model
4. Stochastic polynomial chaos expansions
5. Conclusion & outlook

Slide 33

Stochastic PCE
$$Y_x \overset{d}{\approx} \tilde{Y}_x \overset{\text{def}}{=} \sum_{\alpha \in \mathcal{A}} c_\alpha\, \psi_\alpha(x, Z) + \epsilon$$
▶ $Z$ is an artificial latent variable, and $\epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2)$ is a noise variable
▶ $Z$ and $\epsilon$ are used to mimic the intrinsic stochasticity of the stochastic simulator

Slide 34

Stochastic PCE
$$Y_x \overset{d}{\approx} \tilde{Y}_x \overset{\text{def}}{=} \sum_{\alpha \in \mathcal{A}} c_\alpha\, \psi_\alpha(x, Z) + \epsilon$$
▶ $Z$ is an artificial latent variable, and $\epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2)$ is a noise variable
▶ $Z$ and $\epsilon$ are used to mimic the intrinsic stochasticity of the stochastic simulator
▶ The response distribution is given by
$$f_{\tilde{Y} \mid X}(y \mid x) = \int_{\mathcal{D}_Z} \frac{1}{\sqrt{2\pi}\,\sigma_\epsilon} \exp\!\left(-\frac{\bigl(y - \sum_{\alpha \in \mathcal{A}} c_\alpha \psi_\alpha(x, z)\bigr)^2}{2\sigma_\epsilon^2}\right) f_Z(z)\, dz$$
▶ Some important properties can be computed analytically by simple post-processing
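A sketch of sampling from an already-fitted SPCE, assuming a 1-d input, a latent Z uniform on [-1, 1], a small tensor-product Legendre basis, and made-up coefficients: draw Z and ε, then evaluate the expansion at (x, z).

```python
import numpy as np
from numpy.polynomial import legendre

# Made-up coefficients c_alpha on a small tensor-product Legendre basis in
# (x, z); both x and the latent Z are assumed uniform on [-1, 1].
C = np.array([[0.8, 0.6, 0.1],     # entry (i, j)  <->  P_i(x) * P_j(z)
              [0.4, 0.3, 0.0],
              [0.1, 0.0, 0.0]])
sigma_eps = 0.05

def spce_sample(x, n_samples, rng):
    """Draw samples of Y~_x = sum_alpha c_alpha psi_alpha(x, Z) + eps."""
    z = rng.uniform(-1.0, 1.0, n_samples)             # latent variable Z
    eps = sigma_eps * rng.standard_normal(n_samples)  # regularizing noise term
    # legval2d evaluates sum_{i,j} C[i, j] * P_i(x) * P_j(z)
    return legendre.legval2d(np.full(n_samples, x), z, C) + eps

rng = np.random.default_rng(0)
samples = spce_sample(x=0.3, n_samples=10_000, rng=rng)
print("mean:", samples.mean(), "  std:", samples.std())
```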

Slide 35

Estimation without replication

Maximum likelihood estimation
▶ The conditional likelihood for a data point $(x, y)$ is
$$l(c, \sigma_\epsilon; x, y) = \int_{\mathcal{D}_Z} \frac{1}{\sqrt{2\pi}\,\sigma_\epsilon} \exp\!\left(-\frac{\bigl(y - \sum_{\alpha \in \mathcal{A}} c_\alpha \psi_\alpha(x, z)\bigr)^2}{2\sigma_\epsilon^2}\right) f_Z(z)\, dz$$
▶ Numerical integration by 1-d quadrature: $l(c, \sigma_\epsilon; x, y) \approx \tilde{l}(c, \sigma_\epsilon; x, y)$
▶ Maximum likelihood to estimate the coefficients:
$$\hat{c} = \arg\max_c \sum_{i=1}^{N} \log \tilde{l}\bigl(c, \sigma_\epsilon; x^{(i)}, y^{(i)}\bigr)$$

Cross-validation
▶ The likelihood is unbounded for $\sigma_\epsilon = 0$: $\sigma_\epsilon$ is a hyperparameter that can be selected by cross-validation
▶ The cross-validation score is also used to find a suitable distribution for $Z$ and a truncation scheme
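A sketch of the quadrature step, under the same illustrative assumptions as the previous block (Z uniform on [-1, 1], made-up coefficients): the integral over z is approximated by Gauss-Legendre quadrature, yielding the approximate likelihood of one data point.

```python
import numpy as np
from numpy.polynomial import legendre

def spce_likelihood(x, y, C, sigma_eps, n_quad=64):
    """Approximate l(c, sigma_eps; x, y) by Gauss-Legendre quadrature over z,
    with Z ~ Uniform(-1, 1), i.e. f_Z(z) = 1/2 on [-1, 1].
    The smaller sigma_eps is, the more quadrature nodes are needed."""
    z_nodes, w = legendre.leggauss(n_quad)
    mean_z = legendre.legval2d(np.full(n_quad, x), z_nodes, C)
    gauss = np.exp(-0.5 * ((y - mean_z) / sigma_eps) ** 2) / (np.sqrt(2 * np.pi) * sigma_eps)
    return np.sum(w * gauss * 0.5)

# Same made-up coefficients as in the sampling sketch above.
C = np.array([[0.8, 0.6, 0.1],
              [0.4, 0.3, 0.0],
              [0.1, 0.0, 0.0]])
print(spce_likelihood(x=0.3, y=1.0, C=C, sigma_eps=0.05))
# Summing log-likelihoods over the experimental design and maximizing over the
# coefficients (e.g., with scipy.optimize.minimize) gives c_hat; sigma_eps itself
# is a hyperparameter selected by cross-validation, as stated on the slide.
```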

Slide 36

Examples
▶ Comparison with GLaM and KCDE

[Figure: results for a 4-d stochastic SIR model and a 1-d bimodal example]

Slide 37

Examples
▶ Comparison with GLaM and KCDE
▶ Normalized Wasserstein distance as the performance indicator

[Figure: results for a 4-d stochastic SIR model and a 1-d bimodal example]

Slide 38

Applications

Seismic fragility analysis

Wind turbine simulations
[Figure: wind turbine subject to mean wind speed, turbulence, drag, lift, and wave loading, with yaw and pitch control]

Sensitivity analysis
▶ Study of Sobol' indices: three potential extensions
▶ Estimation of the indices related to the statistical dependence between the input and output

Slide 39

Conclusion & outlook

Conclusion
▶ Surrogate modeling for stochastic simulators is an active multidisciplinary research field
▶ Two stochastic emulators have been developed, without the need for replications
▶ The consistency of the maximum likelihood estimator for both models has been investigated

Outlook
▶ Theoretical development
  - Asymptotic properties and bootstrap consistency of the maximum likelihood estimation
  - Error bound under model misspecification
▶ Methodological development
  - Sparse models for high-dimensional problems: $\hat{c} = \arg\min_c L(c) + \mathrm{pen}_\theta(c)$
  - Other loss functions, e.g., f-divergence, Wasserstein distance
  - Sequential design of experiments
  - Multiple outputs
  - ...

Slide 40

Thank you very much for your attention!

https://xkcd.com/2048/