Slide 1

Slide 1 text

Bayesian Cyclic Networks, Mutual Information and Reduced-Order Bayesian Inference
Laboratoire des signaux et systèmes, CNRS-Centrale Supelec-Univ Paris Sud, 17 July 2015
Robert K. Niven, UNSW Canberra, ACT, Australia. [email protected]
Bernd R. Noack, Institut PPrime, Poitiers, France
Eurika Kaiser, Institut PPrime, Poitiers, France
Lou Cattafesta, Florida State University, USA
Laurent Cordier, Institut PPrime, Poitiers, France
Markus Abel, Ambrosys GmbH / Univ. of Potsdam, Germany
Funding from ARC, Go8/DAAD, CNRS, Region Poitou-Charentes

Slide 2

Slide 2 text

© R.K. Niven
Bayesian Updating
Discrete: $p(H_j \mid D_i) = \dfrac{p(H_j)\, p(D_i \mid H_j)}{p(D_i)}$
Continuous: $p(\theta \mid D) = \dfrac{p(\theta)\, p(D \mid \theta)}{p(D)}$
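Not from the slides: a minimal numerical sketch of the discrete Bayesian update, with an illustrative three-hypothesis prior and likelihood (all values are assumptions for illustration).

```python
import numpy as np

# Discrete Bayesian updating: p(H_j | D_i) = p(H_j) p(D_i | H_j) / p(D_i).
# Prior and likelihood values below are illustrative assumptions.
prior = np.array([0.5, 0.3, 0.2])        # p(H_j) over three hypotheses
likelihood = np.array([0.9, 0.4, 0.1])   # p(D_i | H_j) for one observed datum D_i

evidence = prior @ likelihood            # p(D_i) = sum_j p(H_j) p(D_i | H_j)
posterior = prior * likelihood / evidence

print(posterior)   # the datum favours the first hypothesis, raising it above its prior
```

The posterior is just the prior reweighted by the likelihood and renormalised; hypotheses the datum disfavours lose probability mass.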

Slide 3

Slide 3 text

[Diagram: Reduced Order Model (ROM); ROM Inversion; ROM-Bayesian Updating; Bayesian Updating]

Slide 4

Slide 4 text

Contents
• Cluster-based reduced-order modelling
  - algorithm
  - examples
• "Bayesian cyclic networks"
  - concept
  - mathematical implications
• Application: reduced-order Bayesian inference
  - modelling of turbulent flows
  - turbulent flow control

Slide 5

Slide 5 text

Cluster-based Reduced-Order Modelling (Kaiser et al., J. Fluid Mech. 754: 365-414, 2014)
- time-series data are partitioned into similar clusters
- compute probability transition matrix → clustered dynamical model

Slide 6

Slide 6 text

Clustering Algorithm (Kaiser et al. 2014)
1. Time-series data (e.g. flow snapshots) are classified by a distance metric, e.g. the k-means algorithm with the Euclidean metric
$d_{mn} = \| \mathbf{x}_m - \mathbf{x}_n \|$, induced by the inner product $(\mathbf{x}_m, \mathbf{x}_n) = \int_{\Omega} \mathbf{x}_m(s) \cdot \mathbf{x}_n(s)\, ds$
2. Data are partitioned into K equally weighted Voronoi cells ("clusters")
3. Cluster allocations are optimised by minimising an objective function representing the intracluster variances,
$J = \sum_{k=1}^{K} \sum_{\mathbf{x}_m \in \mathcal{C}_k} \| \mathbf{x}_m - \mathbf{c}_k \|^2$, where $\mathbf{c}_k$ is the centroid of cluster $\mathcal{C}_k$
→ cluster-based reduced-order model of the data (CROM)
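A minimal sketch of the three steps in plain NumPy (Lloyd's k-means iteration); the synthetic 2-D "snapshots" and the deterministic one-per-blob initialisation are my assumptions, not the paper's data or algorithmic details.

```python
import numpy as np

# Synthetic "snapshots": three well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.3, size=(50, 2)) for m in (0.0, 3.0, 6.0)])
K = 3                                   # number of clusters, fixed in advance

centroids = X[[0, 50, 100]].copy()      # deterministic initialisation, one per blob
for _ in range(100):
    # step 1: classify each snapshot by the Euclidean metric d_mn
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)           # step 2: Voronoi cells ("clusters")
    # step 3: move each centroid c_k to the mean of its cluster C_k
    new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                    else centroids[k] for k in range(K)])
    if np.allclose(new, centroids):
        break
    centroids = new

# objective: intracluster variance J = sum_k sum_{x_m in C_k} ||x_m - c_k||^2
J = sum(np.sum((X[labels == k] - centroids[k]) ** 2) for k in range(K))
print(J)
```

The choice of K is external to the iteration, which is exactly the limitation flagged later in the talk.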

Slide 7

Slide 7 text

Clustered Dynamical Model (Kaiser et al. 2014)
1. Calculate the stepwise probability transition matrix $\mathbf{P}$, based on the frequencies of transitions:
$\mathbf{P} = [P_{j|i}]$ with $P_{j|i} = \dfrac{n_{j|i}}{\sum_j n_{j|i}}$
2. → Markov model for the probability vector of clusters at the $\ell$-th time step:
$\mathbf{p}^{\ell} = \mathbf{P}\,\mathbf{p}^{\ell-1} = \mathbf{P}^{\ell}\,\mathbf{p}^{0}$, with asymptotic limits $\mathbf{p}^{\infty} = \lim_{\ell\to\infty} \mathbf{P}^{\ell}\,\mathbf{p}^{0}$ and $\mathbf{P}^{\infty} = \lim_{\ell\to\infty} \mathbf{P}^{\ell}$
3. → Clustered dynamical system model

Slide 8

Slide 8 text

Example 1: Lorenz Attractor (Kaiser et al. 2014)

Slide 9

Slide 9 text

Example 2: Mixing Layer

Slide 10

Slide 10 text

Example 2: Mixing Layer (cont'd) (Kaiser et al. 2014)
[Figures: 1-step transition matrix; simplified dynamical model]

Slide 11

Slide 11 text

Example 2: Mixing Layer (cont'd) (Kaiser et al. 2014)
[Figure: Voronoi plot]

Slide 12

Slide 12 text

Example 2: Mixing Layer (cont'd)
[Figures: transition matrices for $\ell$ = (a) 1; (b) 10; (c) 100; (d) 1000]

Slide 13

Slide 13 text

Example 3: Ahmed Body (Kaiser et al. 2014)
[Figures: instantaneous isosurface (pressure coefficient); 1-step transition matrix; simplified dynamical model]

Slide 14

Slide 14 text

Example 3: Ahmed Body (Kaiser et al. 2014)
[Figure: Voronoi plot]

Slide 15

Slide 15 text

Example 4: Engine Combustion Cycle (Cao et al. 2015)
[Figure: Voronoi plot]

Slide 16

Slide 16 text

Advantages
1. Clear representation of transitions → simplified dynamical model
2. Dramatic reduction in order
3. Computationally efficient (although a Galerkin ROM was used)
Disadvantages
1. Purely "data-driven":
   - inference on the data space only
   - does not incorporate any information on the model space, or any uncertainties
2. Number of clusters chosen in advance (not optimised)
3. Dynamical model is oversimplified (not probabilistic), e.g. what is the space of possible clustered dynamical models?

Slide 17

Slide 17 text

Q1: How can we combine the advantages of clustering for (data) reduction and simplification with a more robust framework for probabilistic inference (= Bayes)?
Q2: How can we build on this framework for flow control?

Slide 18

Slide 18 text

Theoretical Framework

Slide 19

Slide 19 text

Functional Analysis
But here we want to represent probabilistic connections

Slide 20

Slide 20 text

Bayesian Networks = acyclic probability networks - not so useful here!

Slide 21

Slide 21 text

Markov Chains = networks of probabilities - assumed independent of history - almost what we want!

Slide 22

Slide 22 text

"Bayesian Cyclic Networks"
Here defined as a probabilistic network which:
- includes probabilistic cycles (complete graph)
- includes all prior probabilities
Here assumed Markovian, but this can be extended if necessary

Slide 23

Slide 23 text

2-D Bayesian Cyclic Network (Discrete)
$p(i,j) = p(i)\,p(j \mid i) = p(j)\,p(i \mid j)$
$\Rightarrow \dfrac{p(i,j)}{p(i)} = p(j \mid i), \quad \dfrac{p(i,j)}{p(j)} = p(i \mid j)$
$\Rightarrow p(j \mid i) = \dfrac{p(j)\, p(i \mid j)}{p(i)}$
Consider $i = D_i$, $j = H_j$ ⇒ extended Bayes' theorem
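A quick numerical check of the two-node identities above, on an assumed joint table p[i, j] (any valid joint distribution would do; the table here is arbitrary).

```python
import numpy as np

# Assumed joint distribution p(i, j) over two binary variables.
p = np.array([[0.10, 0.25],
              [0.30, 0.35]])
pi = p.sum(axis=1)                      # marginal p(i)
pj = p.sum(axis=0)                      # marginal p(j)

p_j_given_i = p / pi[:, None]           # p(j | i) = p(i,j) / p(i)
p_i_given_j = p / pj[None, :]           # p(i | j) = p(i,j) / p(j)

# Bayes: p(j | i) = p(j) p(i | j) / p(i)
bayes = pj[None, :] * p_i_given_j / pi[:, None]
print(np.allclose(p_j_given_i, bayes))  # prints True
```

The identity holds for any joint table, which is the point of the slide: Bayes' theorem is just the consistency of the two factorisations of $p(i,j)$.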

Slide 24

Slide 24 text

2-D Bayesian Cyclic Network (Continuous)
$p(x,y)\,dx\,dy = p(x)\,dx\; p(y \mid x)\,dy = p(y)\,dy\; p(x \mid y)\,dx$
$\Rightarrow \dfrac{p(x,y)}{p(x)} = p(y \mid x), \quad \dfrac{p(x,y)}{p(y)} = p(x \mid y)$
$\Rightarrow p(y \mid x) = \dfrac{p(y)\, p(x \mid y)}{p(x)}$
Consider $x = D$, $y = \theta$ ⇒ continuous Bayes' theorem

Slide 25

Slide 25 text

3-D Bayesian Cyclic Network (Discrete)
$p(i,j,k) = p(i)\,p(j \mid i)\,p(k \mid j) = p(i)\,p(k \mid i)\,p(j \mid k) = p(j)\,p(k \mid j)\,p(i \mid k) = p(j)\,p(i \mid j)\,p(k \mid i) = p(k)\,p(i \mid k)\,p(j \mid i) = p(k)\,p(j \mid k)\,p(i \mid j)$  (3! = 6 relations)

Slide 26

Slide 26 text

3-D Bayesian Cyclic Network (Discrete)
From $p(i,j,k)$ and Bayes →
$\dfrac{p(i \mid j)}{p(i)} = \dfrac{p(i \mid k)}{p(i)} = \dfrac{p(j \mid i)}{p(j)} = \dfrac{p(j \mid k)}{p(j)} = \dfrac{p(k \mid i)}{p(k)} = \dfrac{p(k \mid j)}{p(k)}$
→ $\dfrac{p(i,j)}{p(i)\,p(j)} = \dfrac{p(i,k)}{p(i)\,p(k)} = \dfrac{p(j,k)}{p(j)\,p(k)}$
→ $\sum_i \sum_j p(i,j)\ln\dfrac{p(i,j)}{p(i)\,p(j)} = \sum_i \sum_k p(i,k)\ln\dfrac{p(i,k)}{p(i)\,p(k)} = \sum_j \sum_k p(j,k)\ln\dfrac{p(j,k)}{p(j)\,p(k)}$
This is the mutual information! $I(\Upsilon_i,\Upsilon_j) = I(\Upsilon_i,\Upsilon_k) = I(\Upsilon_j,\Upsilon_k)$
→ the mutual information between any pair of parameters is identical
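The equal-MI result above is specific to the slide's Markovian cyclic network; as a generic illustration only (the function name and joint table are mine), this is how each pairwise mutual information in the sums would be evaluated from a joint distribution, together with its symmetry and the independent case.

```python
import numpy as np

def mutual_info(p):
    """I = sum_ij p(i,j) ln[ p(i,j) / (p(i) p(j)) ], with 0 ln 0 := 0."""
    pi = p.sum(axis=1, keepdims=True)   # marginal over the first index
    pj = p.sum(axis=0, keepdims=True)   # marginal over the second index
    mask = p > 0
    return float(np.sum(p[mask] * np.log((p / (pi * pj))[mask])))

p_ij = np.array([[0.3, 0.1],
                 [0.1, 0.5]])                           # assumed joint table
print(mutual_info(p_ij))                                # > 0: dependent variables
print(mutual_info(np.outer([0.4, 0.6], [0.5, 0.5])))    # independent -> 0
```

Mutual information is symmetric in its two arguments and vanishes exactly when the joint table factorises into its marginals.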

Slide 27

Slide 27 text

3-D Bayesian Cyclic Network (Continuous)
From $p(x,y,z)\,dx\,dy\,dz$ and Bayes → same relations
→ $I(\Upsilon_x,\Upsilon_y) = I(\Upsilon_x,\Upsilon_z) = I(\Upsilon_y,\Upsilon_z)$, with
$I(\Upsilon_x,\Upsilon_y) = \int_{\Omega_x}\int_{\Omega_y} p(x,y)\ln\dfrac{p(x,y)}{p(x)\,p(y)}\, dx\, dy$

Slide 28

Slide 28 text

4-D Bayesian Cyclic Network (Discrete)
$p(i,j,k,\ell) = p(i)\,p(j \mid i)\,p(k \mid j)\,p(\ell \mid k)$, etc. (4! = 24 relations)

Slide 29

Slide 29 text

4-D Bayesian Cyclic Network (Discrete)
From $p(i,j,k,\ell)$ and Bayes →
$\dfrac{p(\alpha,\beta)}{p(\alpha)\,p(\beta)} = \text{constant}$ for $\alpha,\beta \in \{i,j,k,\ell\}$, so $I(\Upsilon_\alpha,\Upsilon_\beta) = \text{constant}$
→ the mutual information between any pair of parameters is identical
The same result holds for the continuous case, and similarly for any Markovian Bayesian cyclic network with n nodes

Slide 30

Slide 30 text

Reduced-Order Bayesian Inference

Slide 31

Slide 31 text

Bayesian Updating
Discrete: $p(H_j \mid D_i) = \dfrac{p(H_j)\, p(D_i \mid H_j)}{p(D_i)}$
Continuous: $p(\theta \mid D) = \dfrac{p(\theta)\, p(D \mid \theta)}{p(D)}$

Slide 32

Slide 32 text

Bayesian Updating
Discrete: $p(H_j \mid D_i) = \dfrac{p(H_j)\, p(D_i \mid H_j)}{p(D_i)}$; Continuous: $p(\theta \mid D) = \dfrac{p(\theta)\, p(D \mid \theta)}{p(D)}$
[Diagram: data space ↔ model space; computationally expensive!]

Slide 33

Slide 33 text

Reduced-Order Bayesian Inference
[Diagram: clustering (or ROM); clustered dynamical model; Bayesian updating; declustering]

Slide 34

Slide 34 text

Reduced-Order Bayesian Inference
[Diagram: continuous (or dense) vs reduced-order representations of the data space and model space]

Slide 35

Slide 35 text

Reduced-Order Bayesian Inference
Note: mix of continuous and discrete variables (omit diagonals)
There will be a loss of information due to clustering:
$I(\Upsilon_x,\Upsilon_\theta)_{\text{direct}} \ge I(\Upsilon_x,\Upsilon_\theta)_{\text{loop}}$
→ a measure of the uncertainty in the algorithm:
$\Delta I_{x,\theta} = I(\Upsilon_x,\Upsilon_\theta)_{\text{direct}} - I(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} \ge 0$
Compare to the computational "costs":
$\Delta C_{x,\theta} = C(\Upsilon_x,\Upsilon_\theta)_{\text{direct}} - C(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} \ge 0$
Together: $\min_N (\Delta C_{x,\theta} + \Delta I_{x,\theta}) = \max_N \left( C(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} + I(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} \right)$
→ optimal criterion!
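A toy demonstration (all numbers assumed) of why the information loss is non-negative: coarse-graining the data axis of a joint distribution, as clustering does, cannot increase the mutual information (a data-processing inequality).

```python
import numpy as np

def mutual_info(p):
    # I = sum p(x,t) ln[ p(x,t) / (p(x) p(t)) ], with 0 ln 0 := 0
    px = p.sum(axis=1, keepdims=True)
    pt = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log((p / (px * pt))[mask])))

# Assumed joint p(x, theta): 4 data states x, 2 model states theta.
p_direct = np.array([[0.20, 0.02],
                     [0.15, 0.05],
                     [0.05, 0.15],
                     [0.03, 0.35]])

# Cluster the data axis: x in {0,1} -> cluster 0, x in {2,3} -> cluster 1.
cluster = [0, 0, 1, 1]
p_loop = np.zeros((2, 2))
for x, k in enumerate(cluster):
    p_loop[k] += p_direct[x]             # merge rows within each cluster

dI = mutual_info(p_direct) - mutual_info(p_loop)   # Delta I_{x,theta}
print(dI)                                # >= 0: clustering discards information
```

Choosing a finer clustering shrinks this loss at higher computational cost, which is the trade-off the optimality criterion above balances.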

Slide 36

Slide 36 text

Synthesis of Bayes and ROM for Flow Control

Slide 37

Slide 37 text

Flow Control
$\dfrac{d\xi(t)}{dt} = f(\xi(t),u(t))$  (dynamical system)
$y(t) = g(\xi(t),u(t))$  (sensor system)
$u(t) = K(y(t))$  (control operator)
where $\xi(t)$ = parameter(s), $y(t)$ = sensor signals, $u(t)$ = control signals
We want the models $f$ and $g$; commonly $K$ is found by minimising an objective function $J(\xi,u)$
[Diagram: plant → y(t) → controller → u(t) → plant; other outputs]
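Not from the slides: a toy closed-loop instance of the three equations, with a scalar linear plant, identity sensor, and proportional control law (all parameter values are illustrative assumptions).

```python
# Toy instance of the flow-control equations: scalar linear plant
# dxi/dt = f(xi, u) = a*xi + b*u, sensor y = g(xi, u) = xi, and a
# proportional control operator u = K(y) = -k*y.  Values are illustrative.
a, b = 0.5, 1.0       # open-loop unstable plant (a > 0)
k = 2.0               # feedback gain; closed loop: dxi/dt = (a - b*k)*xi
dt, steps = 0.01, 2000

xi = 1.0              # initial state
J = 0.0               # running quadratic objective J = sum (xi^2 + u^2) dt
for _ in range(steps):
    y = xi            # sensor signal
    u = -k * y        # control signal
    J += (xi ** 2 + u ** 2) * dt
    xi += dt * (a * xi + b * u)   # explicit Euler step of the dynamics

print(xi, J)          # xi decays toward 0: the controller stabilises the plant
```

With $a - bk < 0$ the closed loop is stable even though the open loop is not; choosing $k$ to minimise $J$ is the simplest version of the objective-function approach mentioned above.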

Slide 38

Slide 38 text

Flow Control Framework
$\dfrac{d\xi(t)}{dt} = f(\xi(t),u(t)), \quad y(t) = g(\xi(t),u(t)), \quad u(t) = K(y(t))$
Propose the same framework, but now with
$x = \{D(\xi(t),u(t))_m,\; y(t),\; u(t)\}$ and $\theta = \{f, g, K\}$

Slide 39

Slide 39 text

Conclusions
• Cluster-based reduced-order modelling
  - algorithm
  - examples: Lorenz, mixing layer, Ahmed body, engine cycle
• "Bayesian cyclic networks" = cyclic probabilistic network (complete graph)
  - Markovian → mutual information is equivalent between pairs of variables
• Application: reduced-order Bayesian inference
  - flow modelling
  - inequality in mutual information (non-Markovian) → criterion for optimal choice of ROM
  - flow control

Slide 40

Slide 40 text

Merci! (Thank you!)

Slide 41

Slide 41 text
