
Robert Niven

(University of New South Wales, Canberra, Australia)

https://s3-seminar.github.io/seminars/robert-niven

Title — Bayesian Cyclic Networks, Mutual Information and Reduced-Order Bayesian Inference

Abstract — A branch of Bayesian inference involves the analysis of so-called "Bayesian networks", defined as directed acyclic networks composed of probabilistic connections [e.g. 1-2]. We extend this class of networks to consider cyclic Bayesian networks, which incorporate every pair of inverse conditional probabilities or probability density functions, thereby enabling the application of Bayesian updating around the network. The networks are assumed Markovian, although this assumption can be relaxed when necessary. The analysis of probabilistic cycles reveals a deep connection to the mutual information between pairs of variables on the network. Analysis of a four-parameter network, of the form of a commutative diagram, is shown to enable the development of a new branch of Bayesian inference using a reduced-order model (coarse-graining) framework.

S³ Seminar

June 17, 2015

Transcript

  1. Bayesian Cyclic Networks, Mutual Information and Reduced-Order Bayesian Inference
     Laboratoire des signaux et systèmes, CNRS-Centrale Supelec-Univ Paris Sud, 17 July 2015
     Robert K. Niven, UNSW Canberra, ACT, Australia. [email protected]
     Bernd R. Noack, Institut PPrime, Poitiers, France
     Eurika Kaiser, Institut PPrime, Poitiers, France
     Lou Cattafesta, Florida State University, USA
     Laurent Cordier, Institut PPrime, Poitiers, France
     Markus Abel, Ambrosys GmbH / Univ. of Potsdam, Germany
     Funding from ARC, Go8/DAAD, CNRS, Region Poitou-Charentes
  2. Reduced Order Model (ROM) (© R.K. Niven)
     [Diagram: commutative square linking ROM, ROM Inversion, ROM-Bayesian Updating and Bayesian Updating]
  3. Contents
     • Cluster-based reduced-order modelling - algorithm - examples
     • "Bayesian cyclic networks" - concept - mathematical implications
     • Application: reduced-order Bayesian inference - modelling of turbulent flows - turbulent flow control
  4. Cluster-based Reduced-Order Modelling (Kaiser et al., J. Fluid Mech. 754: 365-414, 2014)
     - time-series data are partitioned into similar clusters
     - compute probability transition matrix → clustered dynamical model
  5. Clustering Algorithm (Kaiser et al. 2014)
     1. Time-series data (e.g. flow snapshots) are classified by a distance metric, e.g. the k-means algorithm with the Euclidean metric $d_{mn} = \| x^m - x^n \|$, induced by the inner product $(x^m, x^n)_\Omega = \int_\Omega x^m(s) \cdot x^n(s)\, ds$
     2. Data are partitioned into $K$ equally weighted Voronoi cells ("clusters")
     3. Cluster allocations are optimised by minimising an objective function representing the intra-cluster variances, $J = \sum_{k=1}^{K} \sum_{x^m \in \mathcal{C}_k} \| x^m - c_k \|^2$, where $c_k$ is the centroid of cluster $\mathcal{C}_k$
     → cluster-based reduced-order model of the data (CROM)
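The clustering step above can be sketched in a few lines. The following is a minimal, self-contained illustration of Lloyd's k-means algorithm on synthetic snapshot data; the variable names and the toy data are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def kmeans(X, K, iters=100, seed=0):
    """Plain Lloyd's algorithm: partition snapshot rows x^m of X into K clusters."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), K, replace=False)]  # initialise from data
    for _ in range(iters):
        # step 1: assign each snapshot to its nearest centroid (Euclidean metric d_mn)
        labels = np.argmin(((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        # step 3: update centroids c_k as the mean of each cluster C_k
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                        else centroids[k] for k in range(K)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# toy data: three well-separated blobs of 50 snapshots each, 3 features
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, size=(50, 3)) for c in (0.0, 5.0, 10.0)])
labels, centroids = kmeans(X, K=3)

# objective J: sum over clusters of intra-cluster variances (what k-means minimises)
J = sum(((X[labels == k] - centroids[k]) ** 2).sum() for k in range(3))
print(labels.shape, J)
```

The assignment/update loop is exactly the minimisation of J above; production code would use a library implementation (e.g. scikit-learn's KMeans) with multiple restarts.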
  6. Clustered Dynamical Model (Kaiser et al. 2014)
     1. Calculate the stepwise probability transition matrix $P = [P_{j|i}]$, based on the frequencies of transitions, with $P_{j|i} = n_{j|i} / \sum_j n_{j|i}$
     2. → Markov model for the vector of cluster probabilities at the $\ell$th time step, $p^\ell = P\, p^{\ell-1} = P^\ell p^0$, with asymptotic limits $p^\infty = \lim_{\ell \to \infty} P^\ell p^0$ and $P^\infty = \lim_{\ell \to \infty} P^\ell$
     3. → clustered dynamical system model
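Steps 1-2 can be sketched as follows on a toy sequence of cluster visits. The column-stochastic convention and the toy label sequence are assumptions for illustration:

```python
import numpy as np

def transition_matrix(labels, K):
    """Column-stochastic P with P[j, i] = n_{j|i} / sum_j n_{j|i}:
    probability of moving to cluster j given current cluster i."""
    n = np.zeros((K, K))
    for i, j in zip(labels[:-1], labels[1:]):  # count observed transitions i -> j
        n[j, i] += 1
    col_sums = n.sum(axis=0, keepdims=True)
    return n / np.where(col_sums == 0, 1, col_sums)

labels = np.array([0, 1, 2, 0, 1, 2, 0, 0, 1, 2])  # toy sequence of cluster visits
P = transition_matrix(labels, K=3)

p0 = np.array([1.0, 0.0, 0.0])                 # start in cluster 0
p_ell = np.linalg.matrix_power(P, 50) @ p0     # p^ell = P^ell p^0, here ell = 50
print(P)
print(p_ell)                                   # approaches the asymptotic limit p^inf
```

Repeated powers of P are exactly what slide 8's transition matrices at ℓ = 1, 10, 100, 1000 show.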
  7. Example 2: Mixing Layer (cont'd) (Kaiser et al. 2014)
     [Figures: 1-step transition matrix; simplified dynamical model]
  8. Example 2: Mixing Layer (cont'd)
     Transition matrices $P^\ell$: $\ell$ = (a) 1; (b) 10; (c) 100; (d) 1000
  9. Example 3: Ahmed Body (Kaiser et al. 2014)
     [Figures: instantaneous isosurface (pressure coefficient); transition matrix (1 step); simplified dynamical model]
  10. Advantages
      1. Clear representation of transitions → simplified dynamical model
      2. Dramatic reduction in order
      3. Computationally efficient (although a Galerkin ROM was used)
      Disadvantages
      1. Purely "data-driven": inference on the data space only; does not incorporate any information on the model space, or any uncertainties
      2. Number of clusters chosen in advance (not optimised)
      3. Dynamical model is oversimplified (not probabilistic), e.g. what is the space of possible clustered dynamical models?
  11. Q1: How can we combine the advantages of clustering for (data) reduction and simplification with a more robust framework for probabilistic inference (= Bayes)?
      Q2: How can we build on this framework for flow control?
  12. Functional Analysis
      But here we want to represent probabilistic connections
  13. Markov Chains = networks of probabilities
      - assume independence of history
      - almost what we want!
  14. "Bayesian Cyclic Networks"
      Here defined as a probabilistic network which:
      - includes probabilistic cycles (complete graph)
      - includes all prior probabilities
      Here assumed Markovian, but this can be extended if necessary
  15. 2-D Bayesian Cyclic Network (Discrete)
      $p(i,j) = p(i)\,p(j|i) = p(j)\,p(i|j)$
      $\Rightarrow \frac{p(i,j)}{p(i)} = p(j|i), \quad \frac{p(i,j)}{p(j)} = p(i|j)$
      $\Rightarrow p(j|i) = \frac{p(j)\,p(i|j)}{p(i)}$
      Consider $i = D_i$, $j = H_j$ ⇒ extended Bayes' theorem
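The two-node identity above is easy to verify numerically. A minimal sketch, assuming an arbitrary random joint distribution p(i, j) (the table sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p_ij = rng.random((4, 5))
p_ij /= p_ij.sum()                  # joint p(i, j), normalised

p_i = p_ij.sum(axis=1)              # marginal p(i)
p_j = p_ij.sum(axis=0)              # marginal p(j)

p_j_given_i = p_ij / p_i[:, None]   # p(j|i) = p(i,j) / p(i)
p_i_given_j = p_ij / p_j[None, :]   # p(i|j) = p(i,j) / p(j)

# Bayes' theorem around the 2-node cycle: p(j|i) = p(j) p(i|j) / p(i)
lhs = p_j_given_i
rhs = p_j[None, :] * p_i_given_j / p_i[:, None]
print(np.allclose(lhs, rhs))  # True
```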
  16. 2-D Bayesian Cyclic Network (Continuous)
      $p(x,y)\,dx\,dy = p(x)\,dx\;p(y|x)\,dy = p(y)\,dy\;p(x|y)\,dx$
      $\Rightarrow \frac{p(x,y)}{p(x)} = p(y|x), \quad \frac{p(x,y)}{p(y)} = p(x|y)$
      $\Rightarrow p(y|x) = \frac{p(y)\,p(x|y)}{p(x)}$
      Consider $x = D$, $y = \theta$ ⇒ continuous Bayes' theorem
  17. 3-D Bayesian Cyclic Network (Discrete)
      $p(i,j,k) = p(i)\,p(j|i)\,p(k|j) = p(i)\,p(k|i)\,p(j|k) = p(j)\,p(k|j)\,p(i|k) = p(j)\,p(i|j)\,p(k|i) = p(k)\,p(i|k)\,p(j|i) = p(k)\,p(j|k)\,p(i|j)$ (3! = 6 relations)
  18. 3-D Bayesian Cyclic Network (Discrete)
      From $p(i,j,k)$ and Bayes →
      $\frac{p(i|j)}{p(i)} = \frac{p(i|k)}{p(i)} = \frac{p(j|i)}{p(j)} = \frac{p(j|k)}{p(j)} = \frac{p(k|i)}{p(k)} = \frac{p(k|j)}{p(k)}$
      → $\frac{p(i,j)}{p(i)p(j)} = \frac{p(i,k)}{p(i)p(k)} = \frac{p(j,k)}{p(j)p(k)}$
      → $\sum_i \sum_j p(i,j) \ln \frac{p(i,j)}{p(i)p(j)} = \sum_i \sum_k p(i,k) \ln \frac{p(i,k)}{p(i)p(k)} = \sum_j \sum_k p(j,k) \ln \frac{p(j,k)}{p(j)p(k)}$
      This is the mutual information! $I(\Upsilon_i,\Upsilon_j) = I(\Upsilon_i,\Upsilon_k) = I(\Upsilon_j,\Upsilon_k)$
      → the mutual information between any pair of parameters is identical
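The mutual-information sums above can be computed directly from a joint probability table. A minimal sketch; note the toy distribution below is only a one-way Markov chain i → j → k, so it exhibits the data-processing inequality rather than the stronger equality that the cyclic (all-pairs Bayes) assumption yields:

```python
import numpy as np

def mutual_information(p_xy):
    """I = sum_{x,y} p(x,y) ln[ p(x,y) / (p(x) p(y)) ], with 0 ln 0 := 0."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

# toy Markov factorisation p(i,j,k) = p(i) p(j|i) p(k|j) over 3 states each
rng = np.random.default_rng(2)
p_i = rng.dirichlet(np.ones(3))
p_j_i = rng.dirichlet(np.ones(3), size=3).T   # column i holds p(j|i)
p_k_j = rng.dirichlet(np.ones(3), size=3).T   # column j holds p(k|j)
p_ijk = p_i[:, None, None] * p_j_i.T[:, :, None] * p_k_j.T[None, :, :]

I_ij = mutual_information(p_ijk.sum(axis=2))  # I(i; j) from p(i,j)
I_jk = mutual_information(p_ijk.sum(axis=0))  # I(j; k) from p(j,k)
I_ik = mutual_information(p_ijk.sum(axis=1))  # I(i; k) from p(i,k)
print(I_ij, I_jk, I_ik)  # here I_ik <= I_ij and I_ik <= I_jk (data processing)
```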
  19. 3-D Bayesian Cyclic Network (Continuous)
      From $p(x,y,z)\,dx\,dy\,dz$ and Bayes → same relations
      → $I(\Upsilon_x,\Upsilon_y) = I(\Upsilon_x,\Upsilon_z) = I(\Upsilon_y,\Upsilon_z)$, with $I(\Upsilon_x,\Upsilon_y) = \int_{\Omega_x} \int_{\Omega_y} p(x,y) \ln \frac{p(x,y)}{p(x)p(y)}\, dx\, dy$
  20. 4-D Bayesian Cyclic Network (Discrete)
      $p(i,j,k,\ell) = p(i)\,p(j|i)\,p(k|j)\,p(\ell|k)$, etc. (4! = 24 relations)
  21. 4-D Bayesian Cyclic Network (Discrete)
      From $p(i,j,k,\ell)$ and Bayes → $\frac{p(\alpha,\beta)}{p(\alpha)p(\beta)} =$ constant for $\alpha,\beta \in \{i,j,k,\ell\}$, so $I(\Upsilon_\alpha,\Upsilon_\beta) =$ constant
      → the mutual information between any pair of parameters is identical
      Same result for the continuous case; similarly for any Markovian Bayesian cyclic network with n nodes
  22. Reduced-Order Bayesian Inference
      [Diagram: cycle of Clustering (or ROM) → Clustered dynamical model → Declustering → Bayesian Updating]
  23. Reduced-Order Bayesian Inference
      Note: mix of continuous and discrete variables (omit diagonals)
      There will be a loss of information due to clustering: $I(\Upsilon_x,\Upsilon_\theta)_{\text{direct}} \ge I(\Upsilon_x,\Upsilon_\theta)_{\text{loop}}$
      → measure of uncertainty in the algorithm: $\Delta I_{x,\theta} = I(\Upsilon_x,\Upsilon_\theta)_{\text{direct}} - I(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} \ge 0$
      - compare to computational "costs": $\Delta C_{x,\theta} = C(\Upsilon_x,\Upsilon_\theta)_{\text{direct}} - C(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} \ge 0$
      Together: $\min_N (\Delta C_{x,\theta} + \Delta I_{x,\theta}) = \max_N \big( C(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} + I(\Upsilon_x,\Upsilon_\theta)_{\text{loop}} \big)$ → optimal criterion!
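The inequality between the direct and loop mutual information can be illustrated by coarse-graining one variable of a joint distribution: merging states (as clustering does) can only reduce mutual information, by the data-processing inequality. A minimal sketch with hypothetical state counts and a hypothetical cluster map:

```python
import numpy as np

def mutual_information(p_xy):
    """I = sum p(x,y) ln[ p(x,y) / (p(x) p(y)) ], with 0 ln 0 := 0."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

rng = np.random.default_rng(3)
p = rng.random((6, 4))
p /= p.sum()                          # toy joint p(x, theta): 6 data states, 4 models

# "loop" path: cluster the 6 data states into 3 coarse clusters (deterministic map)
cluster_of = np.array([0, 0, 1, 1, 2, 2])
p_coarse = np.zeros((3, 4))
for x, k in enumerate(cluster_of):
    p_coarse[k] += p[x]               # coarse-grained joint p(cluster, theta)

I_direct = mutual_information(p)
I_loop = mutual_information(p_coarse)
dI = I_direct - I_loop                # Delta I >= 0: clustering can only lose information
print(I_direct, I_loop, dI)
```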
  24. Flow Control
      $\frac{d\xi(t)}{dt} = f(\xi(t), u(t))$ (dynamical system)
      $y(t) = g(\xi(t), u(t))$ (sensor system)
      $u(t) = K(y(t))$ (control operator)
      where $\xi(t)$ = parameter(s), $y(t)$ = sensor signals, $u(t)$ = control signals
      We want the models $f$ and $g$; commonly $K$ is found by minimising an objective function $J(\xi, u)$
      [Diagram: plant-controller loop with signals $y(t)$, $u(t)$, $\xi(t)$ and other outputs]
  25. Flow Control Framework
      $\frac{d\xi(t)}{dt} = f(\xi(t), u(t))$, $y(t) = g(\xi(t), u(t))$, $u(t) = K(y(t))$
      Propose the same framework, but now with $x = \{D(\xi(t),u(t))_m,\, y(t),\, u(t)\}$ and $\theta = \{f, g, K\}$
  26. Conclusions
      • Cluster-based reduced-order modelling - algorithm - examples: Lorenz, mixing layer, Ahmed body, engine cycle
      • "Bayesian cyclic networks" = cyclic probabilistic network (complete graph) - Markovian → mutual information is equivalent between pairs of variables
      • Application: reduced-order Bayesian inference - flow modelling - inequality in mutual information (non-Markovian) → criterion for optimal choice of ROM - flow control