
John Skilling (Maximum Entropy Data Consultants Ltd, UK)

S³ Seminar
September 25, 2015

https://s3-seminar.github.io/seminars/john-skilling

Title — Bayesian Tomography

Bio — John Skilling was awarded his PhD in radio astronomy in 1969. Through the 1970s and 1980s he was a lecturer in applied mathematics at Cambridge University, specialising in data analysis. He left to concentrate on consultancy work, originally using maximum entropy methods but moving to Bayesian methodology when algorithms became sufficiently powerful. John has been a prominent contributor to the “MaxEnt” conferences since their beginning in 1981. He is the discoverer of the nested sampling algorithm, which performs integration over spaces of arbitrary dimension, the basic operation dictated by the sum rule of Bayesian calculus.



Transcript

  1. John Skilling ([email protected]). Bayesian Tomography. (Paris, March 2015, Tomographie au Laplace.) An object of density $\rho(x)$ is scanned by line integrals $\int_{\text{line}} \rho(x)\,dx = D$. Ignore finite width, fan-beam shape, refraction, scattering. Include noise $\pm\sigma$. Seek to infer $\rho(x)$ and compute $D$.
  2. Contents (17 slides, mostly pictures):
     1. The computing grid (hexagonal)
     2. The prior (cartoon pixel colours and bond energies)
     3. The likelihood (chi, not chi-squared, data misfit)
     4. MCMC requirements (simple moves)
     5. Bayes (by nested sampling)
     6. Results (display cartoon, posterior samples, mock data)
  3. 1. The Computing Grid. Computation needs a finite grid. [Figure: square grid with scan lines at 0° and 45°.] Nodes (•) represent square pixels; bonds quantify gradients or discontinuities. This is badly anisotropic: pixel density along a scan line varies by $\sqrt{2}$. Diagonal edges will be harder to reconstruct. That is bad.
  4. [Figure: hexagonal grid with scan lines at 0° and 30°, $x$ and $y$ axes.] A hexagonal grid (6 neighbours) is better: pixel density along a scan line varies by only $2/\sqrt{3}$ (less than half as bad). (In 3 dimensions, use the 12-neighbour face-centred cubic lattice.)
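A quick numeric check of the two anisotropy factors as reconstructed above; a minimal Python sketch comparing each excess over the isotropic ideal of 1:

```python
import math

# Square grid: scan-line pixel spacing at 45° vs 0° differs by sqrt(2).
square_ratio = math.sqrt(2)

# Hexagonal grid: worst-case directions (30° vs 0°) differ by only 2/sqrt(3).
hex_ratio = 2 / math.sqrt(3)

# "Less than half as bad": compare the excesses over the isotropic ideal 1.
print(f"square: {square_ratio:.3f}  excess {square_ratio - 1:.3f}")  # 1.414, 0.414
print(f"hex:    {hex_ratio:.3f}  excess {hex_ratio - 1:.3f}")        # 1.155, 0.155
```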
  5. 4 neighbours: Sad (mediocre). 3 neighbours: Bad. 6 neighbours: Best (the very best).
  6. [Figure: hexagonal grid of 44 cells numbered 0–43, stored as a single 1-D array with rows of 11.] Each of the six bond directions is an additive constant on the cell index: East $+1$, West $-1$, North-East $+(n+1)$, North-West $+n$, South-East $-n$, South-West $-(n+1)$. Get wraparound periodicity.
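A minimal sketch of this additive-constant indexing, assuming cells are stored row-major with row length $n+1$ and periodic wraparound modulo the grid size $N$ (function and parameter names are mine, not the talk's):

```python
# Sketch of the slide's additive-constant indexing on a hexagonal grid,
# assuming cells stored row-major with row length n+1, wraparound mod N.
def hex_neighbours(cell, n, N):
    """The six hexagonal neighbours of `cell` as additive constants."""
    offsets = {
        "East":       +1,
        "West":       -1,
        "North-East": +(n + 1),
        "North-West": +n,
        "South-East": -n,
        "South-West": -(n + 1),
    }
    # Wraparound periodicity: indices are taken modulo the grid size N.
    return {d: (cell + off) % N for d, off in offsets.items()}

# Cell 0 on the slide's 44-cell grid (rows of 11, so n = 10):
print(hex_neighbours(0, n=10, N=44))
```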
  7. 2. The Prior. [Diagram: Object and Cartoon linked through Opacity, Prior(opacity), Prior(materials), Data scans, and Bayes, interpreted for the User.] Cartoons enable data fusion, where different properties (X-ray opacity, nuclear spin, sonar reflectivity, ...) of the same material can be combined into a single multi-technique reconstruction.
  8. Type of material = “colour”: Air, Fluid, Muscle, Bone. Prob(colour) = expected prior proportions. We do NOT want uncorrelated Prob(colour): at high resolution that gives, almost surely, an almost uniform grey. We DO want coherent patches with reduced boundaries. So Prior $\propto$ Prob(colours) $\times\, e^{-\text{Energy(bonds)}}$, with bond energy $\lambda$ ($1/\lambda$ = expected perimeter). [Figure: example cartoons and a table of bond energies between the colour pairs Air, Fluid, Muscle, Bone.]
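A minimal sketch of such a cartoon prior, written as a Potts-style model; the proportions and the unit unlike-colour bond energy are illustrative placeholders, not the talk's values:

```python
import math

# Hypothetical prior proportions for the four "colours" (materials).
PROB = {"Air": 0.4, "Fluid": 0.1, "Muscle": 0.4, "Bone": 0.1}

def bond_energy(a, b):
    # Illustrative choice: like-colour bonds are free, unlike cost 1,
    # so low-energy cartoons form coherent patches with short boundaries.
    return 0.0 if a == b else 1.0

def log_prior(colours, bonds, beta=1.0):
    """log Prior = sum_cells log Prob(colour) - beta * Energy(bonds)."""
    log_p = sum(math.log(PROB[c]) for c in colours)
    energy = sum(bond_energy(colours[i], colours[j]) for i, j in bonds)
    return log_p - beta * energy

# Two-cell example: a like-coloured bond beats an unlike-coloured one.
print(log_prior(["Air", "Air"], bonds=[(0, 1)]))   # no bond penalty
print(log_prior(["Air", "Bone"], bonds=[(0, 1)]))  # penalised boundary
```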
  9. 3. The Likelihood. [Diagram: Object → Opacity → Actual data; Trial cartoon → Trial opacity → Mock data.] The likelihood is based on Residuals = (Mock data $-$ Actual data) / Noise magnitude.
  10. Commonly, Likelihood $= \exp(-\tfrac{1}{2}\chi^2)$ with $\chi^2 = \sum_{k=1}^{N} \text{Residual}_k^2$. But the cartoon is an idealisation, not completely faithful: we expect residuals $\gtrsim 1$, and the mismatch is systematic, not random noise. A 10-sigma residual should not be 100 times worse than 1 sigma, and should not attract an $e^{-50}$ penalty. So instead of $\log L = -\tfrac{1}{2}\sum_k R_k^2$ (too steep), use $\log L = -\sum_k \log\cosh R_k$ or $-\sum_k |R_k|$, i.e. $L = \exp(-\sum_k |\text{Residual}_k|)$. Hope to get $\chi^2 \approx N$ (each residual $\approx 1$). (Note: Bayes with $\chi^2$ often gives posteriors that are wrongly placed and too definitive, because of imperfect modelling.) [Plots: $\log L$ versus residual for the $\tfrac12 R^2$, $\log\cosh R$, and $|R|$ penalties.]
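A quick numeric comparison of the three penalties at 1-sigma and 10-sigma residuals; this sketch just evaluates the formulas above:

```python
import math

# -log L per datum under the three penalties discussed above.
for R in (1.0, 10.0):
    gauss   = 0.5 * R ** 2              # chi-squared: e^-50 at 10 sigma
    logcosh = math.log(math.cosh(R))    # ~ |R| - log 2 for large |R|
    l1      = abs(R)                    # L = exp(-sum_k |R_k|)
    print(f"R = {R:4.1f}:  0.5 R^2 = {gauss:5.1f}   "
          f"log cosh R = {logcosh:5.2f}   |R| = {l1:4.1f}")
```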
  11. 4. MCMC requirements. Start: sample from the prior. Evolve: move faithfully to the prior. [Figure: sample cartoons with $E = 13$, $E = 8$, $E = 5$.] Select a random trial cell; select a trial colour $\sim$ Prob(colour); accept if $e^{-\Delta E} \geq$ Uniform(0,1). Try individual colour changes; could also exchange colours; whatever you choose. Moves will later be modulated by data (rejected by mismatch), so moves must be small and sympathetic to the data. Simplicity = generality + power: no clever mathematics, no differentials, no differential geometry, no barycentre. (If derivatives are wanted, hexagonal neighbours give all of $f, f_x, f_y, f_{xx}, f_{xy}, f_{yy}$.)
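A minimal sketch of one such move: a Metropolis step faithful to the prior, with the trial colour drawn from Prob(colour) so the acceptance ratio reduces to $e^{-\Delta E}$. The colour table, unit bond energy, and tiny 3-cell chain are illustrative placeholders:

```python
import math, random

PROB = {"Air": 0.4, "Fluid": 0.1, "Muscle": 0.4, "Bone": 0.1}  # illustrative

def energy_delta(colours, neighbours, cell, new_colour):
    """Change in bond energy if `cell` flips to `new_colour`
    (unit penalty per unlike-colour bond, as before)."""
    old = colours[cell]
    return sum((new_colour != colours[nb]) - (old != colours[nb])
               for nb in neighbours[cell])

def prior_move(colours, neighbours, beta=1.0):
    """One small move, faithful to the prior: random cell, trial colour
    drawn ~ Prob(colour), accepted if exp(-beta*dE) >= Uniform(0,1)."""
    cell = random.randrange(len(colours))
    trial = random.choices(list(PROB), weights=PROB.values())[0]
    dE = energy_delta(colours, neighbours, cell, trial)
    if math.exp(-beta * dE) >= random.random():
        colours[cell] = trial   # accept; otherwise keep the old colour

# Tiny 3-cell chain as a demo; `neighbours` would come from the hex grid.
colours = ["Air", "Bone", "Air"]
neighbours = {0: [1], 1: [0, 2], 2: [1]}
for _ in range(1000):
    prior_move(colours, neighbours)
print(colours)
```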
  12. 5. Bayes is just sum and product rules. Pr(object) = Prior $\pi(x)$; Pr(Data $|$ object) = Likelihood $L(x)$; Pr(Data) = Evidence $Z$; Pr(object $|$ Data) = Posterior $P(x)$. Then $P(x) = \pi(x)\,L(x)\,/\,Z$ with $Z = \int L(x)\,\pi(x)\,d^n x$. [Figure: prior (drawn uniform) with likelihood contours; the posterior is tiny!] The Riemann sum $\int L\,dX$ is expensive (German). Use Lebesgue $\int X\,dL$ instead (French). Here $X(L)$ = prior mass with likelihood $> L$. Cannot compute $X$ exactly; must get $X$ statistically.
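Spelled out, this is the standard nested-sampling change of variable, consistent with the slide's definitions:

$$X(L) = \int_{L(x) > L} \pi(x)\, d^n x, \qquad Z = \int L(x)\,\pi(x)\, d^n x = \int_0^1 L(X)\, dX,$$

where $L(X)$ is the decreasing inverse of $X(L)$: the $n$-dimensional evidence integral becomes one-dimensional over prior mass $X$, however large $n$ is.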
  13. Start with random $x_1$; get $L_1 = L(x_1)$. Then $X_1 = X(L_1) \sim$ Uniform$(0,1) \approx \tfrac12$. Next, move to $x_2$ random within $L > L_1$; get $L_2 = L(x_2)$. Then $X_2 = X(L_2) \sim$ Uniform$(0, X_1) \approx \tfrac14$. And so on with $x_3, x_4, x_5, \ldots$ [Figure: $L$ versus $X$ with successive points at $X \approx \tfrac12, \tfrac14, \tfrac18, \ldots$] Compress exponentially to reach the exponentially tiny posterior. (Yes or no!)
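A single-walker toy sketch of this compression; the Gaussian toy likelihood and the rejection-based constrained sampler are illustrative stand-ins (real runs use constrained MCMC moves, and more walkers for accuracy):

```python
import math, random

# Toy 1-D problem: prior Uniform(0,1), Gaussian-shaped likelihood.
def L(x):
    return math.exp(-0.5 * ((x - 0.5) / 0.1) ** 2)

def sample_above(L_min):
    """Draw from the prior subject to L(x) > L_min (rejection sampling;
    fine for a toy, replaced by constrained MCMC moves in practice)."""
    while True:
        x = random.random()
        if L(x) > L_min:
            return x

# Single-walker compression: the enclosed prior mass X halves on average
# each step (X_1 ~ 1/2, X_2 ~ 1/4, ...), as on the slide.
x, X, Z = sample_above(0.0), 1.0, 0.0
for r in range(15):
    Lr = L(x)
    Z += Lr * (X / 2)        # strip of width dX ~ X/2 at height L_r
    X /= 2
    x = sample_above(Lr)     # move to a new point inside the contour L > L_r

print("Z estimate:", Z)      # true Z ~ 0.1*sqrt(2*pi) ~ 0.251 (estimate is noisy)
```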
  14. Get the sequence {location $x_r$, likelihood $L_r$, enclosed prior mass $X_r$}. Lebesgue compression is nested sampling. Terminate when the information $H$ (“$\sum P \log P$”) saturates; convergence to the truth is guaranteed if $H < \infty$. For more accuracy, use more samples. [Figure: $L$ versus $X$ staircase of samples.] Start with random colours at $\beta = 0$ (disorder); allow $\beta$ to increase appropriately as the data impose order ($\beta = \infty$ is uniform colour). To set $\beta$, use the relationship between $\beta$ and the average energy $\langle E \rangle = \int E\,n(E)\,e^{-\beta E}\,dE \,\big/ \int n(E)\,e^{-\beta E}\,dE$, with $n(E)\,dE$ pre-computed from the prior model of colours and bonds on the assigned grid. Best: get $n(E)\,dE$ as nested sampling's prior mass $dX$ in a calibration run, then simulate $e^{-\beta E}$.
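A sketch of the $\beta$–$\langle E\rangle$ relationship, with a made-up density of states $n(E)$ standing in for the pre-computed calibration run:

```python
import math

# Made-up density of states n(E): binomial counts on an integer energy
# grid, standing in for the calibration run's pre-computed n(E) dE.
E_grid = list(range(101))
n_of_E = [math.comb(100, E) for E in E_grid]

def mean_energy(beta):
    """<E> = ∫ E n(E) e^(-beta E) dE / ∫ n(E) e^(-beta E) dE, discretised."""
    w = [n * math.exp(-beta * E) for n, E in zip(n_of_E, E_grid)]
    return sum(E * wi for E, wi in zip(E_grid, w)) / sum(w)

# beta = 0 is disorder (prior mean energy); larger beta imposes order,
# driving <E> down towards the uniform-colour limit E = 0 at beta = infinity.
for beta in (0.0, 0.5, 2.0, 10.0):
    print(f"beta = {beta:4.1f}:  <E> = {mean_energy(beta):7.2f}")
```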
  15. 6. Results. [Figure: $L$ versus $X$ staircase of samples $(X_r, L_r)$; per-cell histograms Pr(colour) and Pr(opacity).] Posterior weight $P_r$ = proportional area $L_r\,\Delta X_r$; Evidence $Z$ = total area $\sum_r L_r\,\Delta X_r$. Each cell accumulates its own posterior colour distribution. The best cartoon uses the most probable colour; the best opacity uses the mean value, thence mock data.
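A sketch of this final bookkeeping, given the nested-sampling sequence $\{L_r, X_r\}$ (function and variable names are mine):

```python
def posterior_weights(L_seq, X_seq):
    """Posterior weight P_r ∝ area L_r * dX_r; evidence Z = total area."""
    # Strip widths dX_r between successive enclosed prior masses
    # (X_seq descends from 1 towards 0, as produced by nested sampling).
    dX = [x_prev - x for x_prev, x in zip([1.0] + X_seq[:-1], X_seq)]
    areas = [L * d for L, d in zip(L_seq, dX)]
    Z = sum(areas)
    return [a / Z for a in areas], Z

# Each weighted sample then deposits P_r into per-cell colour histograms:
# best cartoon = most probable colour per cell; best opacity = mean value.
weights, Z = posterior_weights([0.1, 0.5, 0.9], [0.5, 0.25, 0.125])
print("weights:", weights, " Z =", Z)
```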