Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Bayesian Inference For PTAs

Bayesian Inference For PTAs

Invited overview talk at "Black Holes Across The Gravitational Wave Spectrum" workshop in Natal, Brazil.

Dr. Stephen R. Taylor

August 04, 2017
Tweet

More Decks by Dr. Stephen R. Taylor

Other Decks in Science

Transcript

  1. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 © 2017 California

    Institute of Technology. Government sponsorship acknowledged Stephen Taylor Bayesian Inference for PTAs JET PROPULSION LABORATORY, CALIFORNIA INSTITUTE OF TECHNOLOGY
  2. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Building the PTA

    likelihood Search strategies for stochastic and deterministic signals Sampling & model selection Assessing detection significance Overview
  3. Stephen Taylor IIP 2017, Natal, Brazil, 08-03-2017 good timing-solution error

    in frequency derivative error in position unmodeled proper motion Lorimer & Kramer (2005)
  4. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Timing residuals Deviations

    around best-fit of timing model White noise TOA measurement uncertainties Extra unaccounted white-noise from receivers “Jitter”
  5. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Timing residuals Deviations

    around best-fit of timing model White noise TOA measurement uncertainties Extra unaccounted white-noise from receivers “Jitter” Intrinsic low-frequency (“red”) processes Rotational instabilities lead to random-walk behavior in phase, period, period-derivative
  6. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Timing residuals Deviations

    around best-fit of timing model White noise TOA measurement uncertainties Extra unaccounted white-noise from receivers “Jitter” Intrinsic low-frequency (“red”) processes Rotational instabilities lead to random-walk behavior in phase, period, period-derivative Spatially-correlated low-frequency processes Stochastic variations in time standards Solar-system ephemeris errors Gravitational-wave background!!
  7. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Timing residuals Delays

    caused by all unmodeled phenomena. We treat all stochastic processes as Gaussian stationary.
  8. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Timing residuals GWB

    Delays caused by all unmodeled phenomena. We treat all stochastic processes as Gaussian stationary. White noise Intrinsic low-frequency (“red”) processes h tai tbj i = 2 WN ij ab h tai tbj i = Cred(|tai tbj |) ab h tai tbj i = abCgwb(|tai tbj |)
  9. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 How do we

    build a stochastic signal from these binaries? Quick detour: Sources
  10. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 How do we

    build a stochastic signal from these binaries? Phinney (2001) hc(f)2 / X i h2 i f Quick detour: Sources
  11. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 How do we

    build a stochastic signal from these binaries? Phinney (2001) hc(f)2 / X i h2 i f C i r c u l a r, G W - d r i v e n population Quick detour: Sources Characteristic strain-spectrum — PSD of timing residuals — S(f) = hc(f)2 12⇡2 = A2 gwb 12⇡2 ✓ f yr 1 ◆ 13/3 yr3 hc(f) = Agwb ✓ f yr 1 ◆ 2/3 Time-domain covariance of residuals — C ( ti, tj)gwb = Z f high f low d f S ( f ) cos(2 ⇡f|ti tj | ) f3
  12. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 van Haasteren &

    Levin (2013) The Pulsar-timing Likelihood p ( t|⌘ ) = 1 det (2 ⇡GT CG ) exp ✓ 1 2 tT ( GT C ( ⌘ ) G ) 1 t ◆ The way we used to do it… Matrix which projects all quantities into a space orthogonal to the timing-model. This accounts for the possible fitting-out of GW power. G = C = Cred + Cgwb + N 1/2
  13. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 van Haasteren &

    Levin (2013) The Pulsar-timing Likelihood p ( t|⌘ ) = 1 det (2 ⇡GT CG ) exp ✓ 1 2 tT ( GT C ( ⌘ ) G ) 1 t ◆ The way we used to do it… Matrix which projects all quantities into a space orthogonal to the timing-model. This accounts for the possible fitting-out of GW power. G = C = Cred + Cgwb + N 1/2 O(NpsrNTOA) With a GWB, all pulsars are correlated spatially and temporally…the covariance matrix is dense and of dimension Bottleneck in likelihood is not matrix multiplications, but rather inversion. Fastest way is via Cholesky decomposition, but that is still O(N3 psr N3 TOA )
  14. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 van Haasteren &

    Levin (2013) The Pulsar-timing Likelihood p ( t|⌘ ) = 1 det (2 ⇡GT CG ) exp ✓ 1 2 tT ( GT C ( ⌘ ) G ) 1 t ◆ The way we used to do it… Matrix which projects all quantities into a space orthogonal to the timing-model. This accounts for the possible fitting-out of GW power. G = C = Cred + Cgwb + N 1/2 O(NpsrNTOA) With a GWB, all pulsars are correlated spatially and temporally…the covariance matrix is dense and of dimension Bottleneck in likelihood is not matrix multiplications, but rather inversion. Fastest way is via Cholesky decomposition, but that is still O(N3 psr N3 TOA ) 10k observations likelihood time ~ 20 seconds
  15. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 van Haasteren &

    Levin (2013) The Pulsar-timing Likelihood p ( t|⌘ ) = 1 det (2 ⇡GT CG ) exp ✓ 1 2 tT ( GT C ( ⌘ ) G ) 1 t ◆ The way we used to do it… Matrix which projects all quantities into a space orthogonal to the timing-model. This accounts for the possible fitting-out of GW power. G = C = Cred + Cgwb + N 1/2 O(NpsrNTOA) With a GWB, all pulsars are correlated spatially and temporally…the covariance matrix is dense and of dimension Bottleneck in likelihood is not matrix multiplications, but rather inversion. Fastest way is via Cholesky decomposition, but that is still O(N3 psr N3 TOA ) 10k observations likelihood time ~ 20 seconds Current J1713+0747 ~30k observations!! Full pulsar array ~ 60k observations
  16. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 van Haasteren &

    Levin (2013) The Pulsar-timing Likelihood p ( t|⌘ ) = 1 det (2 ⇡GT CG ) exp ✓ 1 2 tT ( GT C ( ⌘ ) G ) 1 t ◆ The way we used to do it… Matrix which projects all quantities into a space orthogonal to the timing-model. This accounts for the possible fitting-out of GW power. G = C = Cred + Cgwb + N 1/2 O(NpsrNTOA) With a GWB, all pulsars are correlated spatially and temporally…the covariance matrix is dense and of dimension Bottleneck in likelihood is not matrix multiplications, but rather inversion. Fastest way is via Cholesky decomposition, but that is still O(N3 psr N3 TOA ) 10k observations likelihood time ~ 20 seconds Current J1713+0747 ~30k observations!! Full pulsar array ~ 60k observations UNACCEPTABLE!
  17. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 t = M✏

    + Fa + Uj + n The Pulsar-timing Likelihood Describe all processes in rank-reduced formalism Lentati et al. (inc Taylor) (2013) van Haasteren & Vallisneri (2014a,b)
  18. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 t = M✏

    + Fa + Uj + n The Pulsar-timing Likelihood Describe all processes in rank-reduced formalism small linear perturbations around best-fit timing solution low-frequency processes in Fourier basis jitter white noise Lentati et al. (inc Taylor) (2013) van Haasteren & Vallisneri (2014a,b)
  19. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 t = M✏

    + Fa + Uj + n The Pulsar-timing Likelihood Describe all processes in rank-reduced formalism small linear perturbations around best-fit timing solution low-frequency processes in Fourier basis jitter white noise M ! [NTOA ⇥ NTM] ✏ ! NTM F ! [NTOA ⇥ 2Nfreqs] a ! 2Nfreqs U ! [NTOA ⇥ Nepochs] j ! N epochs n ! NTOA Lentati et al. (inc Taylor) (2013) van Haasteren & Vallisneri (2014a,b)
  20. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 t = M✏

    + Fa + Uj + n The Pulsar-timing Likelihood Describe all processes in rank-reduced formalism small linear perturbations around best-fit timing solution low-frequency processes in Fourier basis jitter white noise M ! [NTOA ⇥ NTM] ✏ ! NTM F ! [NTOA ⇥ 2Nfreqs] a ! 2Nfreqs U ! [NTOA ⇥ Nepochs] j ! N epochs n ! NTOA ~ few tens ~ few tens ~ couple of hundred “M” is matrix of TOA derivatives wrt timing- model parameters “F” has alternating columns of sines and cosines for each frequency “U” has block diagonal structure, with ones filling each block Lentati et al. (inc Taylor) (2013) van Haasteren & Vallisneri (2014a,b)
  21. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 t = M✏

    + Fa + Uj + n The Pulsar-timing Likelihood Describe all processes in rank-reduced formalism small linear perturbations around best-fit timing solution low-frequency processes in Fourier basis jitter white noise M ! [NTOA ⇥ NTM] ✏ ! NTM F ! [NTOA ⇥ 2Nfreqs] a ! 2Nfreqs U ! [NTOA ⇥ Nepochs] j ! N epochs n ! NTOA ~ few tens ~ few tens ~ couple of hundred “M” is matrix of TOA derivatives wrt timing- model parameters “F” has alternating columns of sines and cosines for each frequency “U” has block diagonal structure, with ones filling each block Lentati et al. (inc Taylor) (2013) van Haasteren & Vallisneri (2014a,b)
  22. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    p ( n ) = exp 1 2 nT N 1n p det(2 ⇡N ) White noise is Gaussian N2 ij,k = E2 k W2 ij + Q2 k EFAC EQUAD
  23. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    p ( n ) = exp 1 2 nT N 1n p det(2 ⇡N ) White noise is Gaussian N2 ij,k = E2 k W2 ij + Q2 k EFAC EQUAD p ( t|✏, a, j, ⌘ ) = exp 1 2 ( t M✏ Fa Uj ) T N 1 ( t M✏ Fa Uj ) p det(2 ⇡N )
  24. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    p ( n ) = exp 1 2 nT N 1n p det(2 ⇡N ) White noise is Gaussian N2 ij,k = E2 k W2 ij + Q2 k EFAC EQUAD p ( t|✏, a, j, ⌘ ) = exp 1 2 ( t M✏ Fa Uj ) T N 1 ( t M✏ Fa Uj ) p det(2 ⇡N ) IS THIS ALL WE KNOW?
  25. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    T = ⇥ M F U ⇤ b = 2 4 ✏ a j 3 5 Tb = M✏ + Fa + Uj p ( t|b ) = exp 1 2 ( t Tb ) TN 1 ( t Tb ) p det(2 ⇡N )
  26. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    T = ⇥ M F U ⇤ b = 2 4 ✏ a j 3 5 Tb = M✏ + Fa + Uj p ( b|⌘ ) = exp 1 2 bT B 1b p det(2 ⇡B ) B = 2 4 1 0 0 0 0 0 0 J 3 5 p ( t|b ) = exp 1 2 ( t Tb ) TN 1 ( t Tb ) p det(2 ⇡N ) everything is a random Gaussian process!
  27. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    p(⌘| t) = Z p(⌘, b| t)db p(⌘, b| t) / p( t|b)p(b|⌘)p(⌘) hierarchical modelling (analytically!) marginalize over coefficients
  28. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    p(⌘| t) = Z p(⌘, b| t)db p(⌘, b| t) / p( t|b)p(b|⌘)p(⌘) p ( ⌘| t ) / exp ⇣ 1 2 tT C 1 t ⌘ p det(2 ⇡C ) p ( ⌘ ) C = N + TBTT hierarchical modelling (analytically!) marginalize over coefficients marginalized likelihood
  29. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    C = N + TBTT what are we actually doing here? this is just the Wiener- Khinchin theorem! [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | )
  30. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    C = N + TBTT what are we actually doing here? this is just the Wiener- Khinchin theorem! [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) C 1 = (N + TBTT ) = N 1 N 1T(B 1 + TT N 1T) 1TT N 1 Woodbury lemma 1
  31. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    C = N + TBTT what are we actually doing here? this is just the Wiener- Khinchin theorem! [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) C 1 = (N + TBTT ) = N 1 N 1T(B 1 + TT N 1T) 1TT N 1 easy! Woodbury lemma 1
  32. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    Without cross-pulsar correlations [~ tens of ms] With cross-pulsar correlations [~few seconds]
  33. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    Without cross-pulsar correlations [~ tens of ms] With cross-pulsar correlations [~few seconds] courtesy J. Ellis
  34. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) [ ](ak),(bl) = ab⇢k kl + ak ab kl
  35. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    ORF GWB PSD Intrinsic red- noise PSD [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) [ ](ak),(bl) = ab⇢k kl + ak ab kl
  36. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 The Pulsar-timing Likelihood

    ORF GWB PSD Intrinsic red- noise PSD [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) [ ](ak),(bl) = ab⇢k kl + ak ab kl ⇢k = S(fk) f ! A2 gwb 12⇡2 T obs ✓ fk yr 1 ◆ yr2
  37. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Parametrizing the GWB

    spectrum S(f) = Simple power-law Free spectrum Turnover spectrum Np = 2 Np = Nfreqs Np = 4
  38. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Parametrizing the GWB

    spectrum S(f) = Simple power-law Free spectrum Turnover spectrum Np = 2 Np = Nfreqs Np = 4 Sampson, Cornish, McWilliams (2015) Arzoumanian et al. (inc Taylor) (2014)
  39. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Parametrizing the GWB

    angular power ab / (1 + ab) Z d2 ˆ ⌦P(ˆ ⌦) h F(ˆ ⌦)+ a F(ˆ ⌦)+ b + F(ˆ ⌦)⇥ a F(ˆ ⌦)⇥ b i
  40. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Parametrizing the GWB

    angular power ab / (1 + ab) Z d2 ˆ ⌦P(ˆ ⌦) h F(ˆ ⌦)+ a F(ˆ ⌦)+ b + F(ˆ ⌦)⇥ a F(ˆ ⌦)⇥ b i = R · P · RT R ! [N psr ⇥ 2N pix ] P ! diag(2N pix ) ! [Npsr ⇥ Npsr] R = pulsar response matrix [fixed] P = power in each pixel [parametrize] Mingarelli et al. (2013) Taylor & Gair (2013) Taylor & van Haasteren (in prep.)
  41. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Parametrizing the GWB

    angular power Spherical harmonics Disk anisotropy Point source Reconstruction from correlation elements Search for all elements of matrix We know Solve for R P
  42. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Searching for single

    GW sources t ! t s(t) Deterministic signal 0.4 0.3 0.2 0.1 0.0 0.1 0.2 0.3 Post-fit residual [µs] 0.04 s] 0.4 0.3 0.2 0.1 0.0 0.1 0.2 0.3 Post-fit residual [µs] 53000 53500 54000 54500 55000 55500 56000 56500 0.03 0.02 0.01 0.00 0.01 0.02 0.03 0.04 = srec.(t) strue(t) [µs] 1662 J. B. Wang et al. circular binary eccentric binary burst with memory burst Ellis (2013) Taylor et al. (2016) van Haasteren & Levin (2010) Madison et al. (2014) Finn & Lommen (2010) Ellis & Cornish (in prep.)
  43. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 MCMC Sampling Use

    tuned parallel-tempering MCMC samplers to explore ~50-100-D parameter spaces. (Formally, ~1000-D, but we have analytically marginalized) PTMCMCSampler Uses a cocktail of parallel-tempering MCMC (proposals between temperatures), with in-temperature proposal mixture of differential evolution, adaptive MCMC, single-component adaptive Metropolis (SCAM), prior draws, and custom sampling groupings.
  44. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection Detection is

    a model-selection problem. We need to prove the presence of spatial correlations between pulsars. Compare Bayesian evidence for a model with Hellings and Downs correlations versus no correlations.
  45. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection Detection is

    a model-selection problem. We need to prove the presence of spatial correlations between pulsars. Compare Bayesian evidence for a model with Hellings and Downs correlations versus no correlations. P12 = p(H1 |d) p(H2 |d) = p(d|H1) p(d|H2) p(H1) p(H2) Posterior odds ratio Bayes factor Prior odds ratio
  46. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection Detection is

    a model-selection problem. We need to prove the presence of spatial correlations between pulsars. Compare Bayesian evidence for a model with Hellings and Downs correlations versus no correlations. P12 = p(H1 |d) p(H2 |d) = p(d|H1) p(d|H2) p(H1) p(H2) Posterior odds ratio Bayes factor Prior odds ratio MultiNest Thermodynamic integration RJMCMC Savage-Dickey ratio Product space
  47. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection — Savage-Dickey

    Ratio Works great for nested model comparisons, i.e. signal+noise versus noise. Need a parameter (or region of parameter space) that “switches off” the signal. In our case, we can take the pdf value of the GWB amplitude at very low values. Our sampling schemes are efficient enough to get good sampling coverage. Becomes inefficient in strong signal regime.
  48. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection — Product-Space

    Method Product space sampling is ridiculously easy. [Carlin & Chib (1995), but more recently Veitch (2007, thesis), Hee et al. (2016)] Search parameters are the union of all model parameter spaces. Add a model indexing variable to parameter space. Current location of indexing variable says which likelihood is “switched on” Relative time spent at each value of model index is the posterior odds ratio.
  49. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection — Estimating

    Significance Detection statistic (whether frequentist or Bayesian) needs context. How often can noise alone produce this? Could run many noise simulations to determine p-value of measured statistic… Better to operate on real dataset.
  50. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Cgwb = F'gwbFT

    Phase Shifting Sky Scrambles …also see Cornish & Sampson (2016) for discussions of how to make robust detections Detection — Estimating Significance Detection statistic (whether frequentist or Bayesian) needs context. How often can noise alone produce this? Could run many noise simulations to determine p-value of measured statistic… Better to operate on real dataset. “All correlations must die” , Taylor et al. (2017a), arXiv:1606.09180
  51. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Detection -60 -50

    -40 -30 -20 -10 0 lnB 0.00 0.02 0.04 0.06 0.08 0.10 Common uncorrelated red process GWB signal, scrambled ORF GWB signal, phase shifts Phase-shifting: add random phases to all basis functions of GWB signal. Sky-scrambling: scramble pulsar positions when constructing the Hellings & Downs curve in our model, to be orthogonal to the true Hellings & Downs curve.
  52. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 PTA Graphical Model

    tTOA yWN yTM yRN yGW WN ⇥TM aRN aGW ⇢GW ⇢RN ARN, RN AGW, e0, ... pulsars P(ˆ ⌦)GW clm
  53. Stephen Taylor IIP 2017, Natal, Brazil, 08-04-2017 Summary Our Bayesian

    techniques use sophisticated hierarchical modeling and MCMC strategies to search for correlations. Bayes factors are now quite cheap to compute. [Savage-Dickey, Product-space sampling] Phase-shifting and sky-scrambling give context to these Bayes factors. [“All correlations must die”, arXiv:1606.09180] We can also build physically-sophisticated spectral models by training Gaussian Processes on populations of binaries. Sometimes its easier to simulate the Universe than write down an equation. [Taylor et al. (2017b), arXiv:1612.02817]