Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Introduction To NANOGrav Data Analysis

Introduction To NANOGrav Data Analysis

Student workshop talk on data analysis methods for pulsar-timing arrays. Presented at NANOGrav Spring 2017 student workshop.

Dr. Stephen R. Taylor

April 17, 2017
Tweet

More Decks by Dr. Stephen R. Taylor

Other Decks in Science

Transcript

  1. Stephen Taylor WVU, 04-17-2017 © 2017 California Institute of Technology.

    Government sponsorship acknowledged Stephen R. Taylor Introduction To NANOGrav Data Analysis JET PROPULSION LABORATORY, CALIFORNIA INSTITUTE OF TECHNOLOGY
  2. Stephen Taylor WVU, 04-17-2017 Bayesian inference Search strategies for stochastic

    and deterministic signals Assessing detection significance Overview
  3. Stephen Taylor WVU, 04-17-2017 0 20 40 60 80 100

    120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 50 Image credit: David Champion
  4. Stephen Taylor WVU, 04-17-2017 Bayesian Inference p(A, B) = p(A)p(B|A)

    = p(B)p(A|B) p(B|A) = p(A|B)p(B) p(A) Bayes’ Theorem
  5. Stephen Taylor WVU, 04-17-2017 Bayesian Inference p(A, B) = p(A)p(B|A)

    = p(B)p(A|B) p(B|A) = p(A|B)p(B) p(A) Bayes’ Theorem A = data B = parameters we care about
  6. Stephen Taylor WVU, 04-17-2017 Bayesian Inference p(A, B) = p(A)p(B|A)

    = p(B)p(A|B) p(B|A) = p(A|B)p(B) p(A) Bayes’ Theorem A = data B = parameters we care about p(~ ✓|~ d) = p(~ d|~ ✓)p(~ ✓) p(~ d)
  7. Stephen Taylor WVU, 04-17-2017 Bayesian Inference p(A, B) = p(A)p(B|A)

    = p(B)p(A|B) p(B|A) = p(A|B)p(B) p(A) Bayes’ Theorem A = data B = parameters we care about p(~ ✓|~ d) = p(~ d|~ ✓)p(~ ✓) p(~ d) posterior probability likelihood prior knowledge “Evidence” (more later)
  8. Stephen Taylor WVU, 04-17-2017 Bayesian Inference Bayesian inference recovers probability

    distributions — measures the spread in our belief. Frequentist inference recovers frequency distributions — measures the long-timescale spread of experiments.
  9. Stephen Taylor WVU, 04-17-2017 Bayesian Inference Example: • A disease

    test is accurate 99% of the time. • But the disease is quite rare: only affects 1 in 10,000 people. • If a test comes back positive, what is the probability that you actually have the disease? • Intuition might lead us to think that, since the test is 99% accurate, then there is a good chance we are infected!
  10. Stephen Taylor WVU, 04-17-2017 p (infected | positive test) =

    p (positive test | infected) p (infected) p (positive test) Bayesian Inference
  11. Stephen Taylor WVU, 04-17-2017 p (infected | positive test) =

    p (positive test | infected) p (infected) p (positive test) Bayesian Inference p (infected | positive) = 0 . 99 ⇥ 0 . 0001 0 . 99 ⇥ 0 . 0001 + 0 . 01 ⇥ 0 . 9999
  12. Stephen Taylor WVU, 04-17-2017 p (infected | positive test) =

    p (positive test | infected) p (infected) p (positive test) Bayesian Inference p (infected | positive) = 0 . 99 ⇥ 0 . 0001 0 . 99 ⇥ 0 . 0001 + 0 . 01 ⇥ 0 . 9999 p (infected | positive) = 0 . 99 ⇥ 0 . 0001 0 . 99 ⇥ 0 . 0001 + 0 . 01 ⇥ 0 . 9999 ⇠ 0 . 01 ONLY 1% !!!
  13. Stephen Taylor WVU, 04-17-2017 The Pulsar-timing Likelihood Two basic types

    of signals • (1) stochastic — characterize through statistical properties, e.g. standard deviation • (2) deterministic — characterize through amplitude, phase, etc.
  14. Stephen Taylor WVU, 04-17-2017 The Pulsar-timing Likelihood Two basic types

    of signals • (1) stochastic — characterize through statistical properties, e.g. standard deviation • (2) deterministic — characterize through amplitude, phase, etc. p ( t|n ) = exp ( 1 2 ( t s ) T N 1 ( t s )) p det(2 ⇡N )
  15. Stephen Taylor WVU, 04-17-2017 The Pulsar-timing Likelihood Two basic types

    of signals • (1) stochastic — characterize through statistical properties, e.g. standard deviation • (2) deterministic — characterize through amplitude, phase, etc. p ( t|n ) = exp ( 1 2 ( t s ) T N 1 ( t s )) p det(2 ⇡N ) The only difference between treating stochastic signals and deterministic signals is through our prior on s
  16. Stephen Taylor WVU, 04-17-2017 Searching for single GW sources t

    ! t s(t) Deterministic signal 0.4 0.3 0.2 0.1 0.0 0.1 0.2 0.3 Post-fit residual [µs] 0.04 s] 0.4 0.3 0.2 0.1 0.0 0.1 0.2 0.3 Post-fit residual [µs] 53000 53500 54000 54500 55000 55500 56000 56500 0.03 0.02 0.01 0.00 0.01 0.02 0.03 0.04 = srec.(t) strue(t) [µs] 1662 J. B. Wang et al. circular binary eccentric binary burst with memory burst Ellis (2013) Taylor et al. (2016) van Haasteren & Levin (2010) Madison et al. (2014) Finn & Lommen (2010) Ellis & Cornish (in prep.)
  17. Stephen Taylor WVU, 04-17-2017 12 10 8 10 7 Frequency

    [Hz] 10 15 10 14 10 13 10 12 10 11 Strain Amplitude Bayesian fixed noise Bayesian varying noise Fp statistic Figure 5. Sky-averaged upper limit on the strain amplitude, h 0 as a function of GW frequency. The Bayesian upper limits are computed using a fixed-noise model (thick black(blue)) and a varying noise model (thin black(purple)) and the frequentist upper limit (gray(red)) is computed using the Fp-statistic. The dashed curves indicate lines of constant chirp mass for a source with a distance to the Virgo cluster (16.5 Mpc) and chirp mass of 109M (lower) and 1010M (upper). The gray(green) squares show the strain amplitude of the loudest GW sources in 1000 monte-carlo realizations using an optimistic phenomenological Arzoumanian et al. (2014) Searching for single GW sources
  18. Stephen Taylor WVU, 04-17-2017 D95 ⇥ ⇣ M 109 M

    ⌘5/3 ⇣ fgw 10 8Hz ⌘2/3 [Mpc] Figure 6. 95% lower limit on the luminosity distance as a function of sky location computed using the Fp-statistic plotted in equatorial coordinates . The values in the colorbar are calculated assuming a chirp mass of M = 109M and a GW frequency f gw = 1 ⇥ 10-8 Hz. The white diamonds denote the locations of the pulsars in the sky and the black(white) stars denote possible SMBHBs or clusters possibly containing SMBHBs. As expected from the antenna pattern functions of the pulsars, we are most sensitive to GWs from sky locations near the pulsars. The luminosity distances to the potential sources are 92.3, 1575.5, 2161.7, 16.5, 104.5, and 19 Mpc for 3C66B, OJ287, J002444-003221, Virgo Cluster, Coma Cluster, and Fornax Cluster, respectively. (Color figure available in the online version.) 6 7 9 10 12 13 15 16 18 D95 ⇥ ⇣ M 109 M ⌘5/3 ⇣ fgw 10 8Hz ⌘2/3 [Mpc] Coma Virgo OJ287 Fornax 3C66B J002444-003221 Figure 7. 95% lower limit on the luminosity distance as a function of sky location computed using the Bayesian method including the noise model. The values in the colorbar are calculated assuming a chirp mass of M = 109M and a GW frequency f gw = 1 ⇥ 10-8 Hz. The white diamonds denote the locations of the Arzoumanian et al. (2014) Searching for single GW sources
  19. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    We have lots of binaries, so we can’t track all of them.
  20. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    We have lots of binaries, so we can’t track all of them. • Look at statistical properties of the signal, e.g. the variance.
  21. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    We have lots of binaries, so we can’t track all of them. • Look at statistical properties of the signal, e.g. the variance. • All information on the background is in signal variance and cross correlations between pulsars.
  22. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    We have lots of binaries, so we can’t track all of them. • Look at statistical properties of the signal, e.g. the variance. • All information on the background is in signal variance and cross correlations between pulsars.
  23. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    We have lots of binaries, so we can’t track all of them. • Look at statistical properties of the signal, e.g. the variance. • All information on the background is in signal variance and cross correlations between pulsars. • Let’s do a simple Fourier analysis of the background.
  24. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    We have lots of binaries, so we can’t track all of them. • Look at statistical properties of the signal, e.g. the variance. • All information on the background is in signal variance and cross correlations between pulsars. • Let’s do a simple Fourier analysis of the background. s = Tb
  25. Stephen Taylor WVU, 04-17-2017 p ( t|b ) = exp

    1 2 ( t Tb ) TN 1 ( t Tb ) p det(2 ⇡N ) Searching for a GW background • We have lots of binaries, so we can’t track all of them. • Look at statistical properties of the signal, e.g. the variance. • All information on the background is in signal variance and cross correlations between pulsars. • Let’s do a simple Fourier analysis of the background. s = Tb
  26. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    Put a Gaussian prior on the signal amplitude coefficients.
  27. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    Put a Gaussian prior on the signal amplitude coefficients. • Variance is proportional to the power spectrum of the background.
  28. Stephen Taylor WVU, 04-17-2017 Searching for a GW background •

    Put a Gaussian prior on the signal amplitude coefficients. • Variance is proportional to the power spectrum of the background. • We can parameterize that power spectrum whichever way we want…
  29. Stephen Taylor WVU, 04-17-2017 Searching for a GW background p

    ( b|⌘ ) = exp 1 2 bT B 1b p det(2 ⇡B ) • Put a Gaussian prior on the signal amplitude coefficients. • Variance is proportional to the power spectrum of the background. • We can parameterize that power spectrum whichever way we want…
  30. Stephen Taylor WVU, 04-17-2017 p(⌘| t) = Z p(⌘, b|

    t)db p(⌘, b| t) / p( t|b)p(b|⌘)p(⌘) hierarchical modelling (analytically!) marginalize over coefficients Searching for a GW background
  31. Stephen Taylor WVU, 04-17-2017 p(⌘| t) = Z p(⌘, b|

    t)db p(⌘, b| t) / p( t|b)p(b|⌘)p(⌘) p ( ⌘| t ) / exp ⇣ 1 2 tT C 1 t ⌘ p det(2 ⇡C ) p ( ⌘ ) C = N + TBTT hierarchical modelling (analytically!) marginalize over coefficients marginalized likelihood Searching for a GW background
  32. Stephen Taylor WVU, 04-17-2017 C = N + TBTT what

    are we actually doing here? this is just the Wiener- Khinchin theorem! [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) Searching for a GW background
  33. Stephen Taylor WVU, 04-17-2017 C = N + TBTT what

    are we actually doing here? this is just the Wiener- Khinchin theorem! [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) C 1 = (N + TBTT ) = N 1 N 1T(B 1 + TT N 1T) 1TT N 1 Woodbury lemma 1 Searching for a GW background
  34. Stephen Taylor WVU, 04-17-2017 C = N + TBTT what

    are we actually doing here? this is just the Wiener- Khinchin theorem! [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) C 1 = (N + TBTT ) = N 1 N 1T(B 1 + TT N 1T) 1TT N 1 easy! Woodbury lemma 1 Searching for a GW background
  35. Stephen Taylor WVU, 04-17-2017 The Pulsar-timing Likelihood [ F FT

    ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) [ ](ak),(bl) = ab⇢k kl + ak ab kl
  36. Stephen Taylor WVU, 04-17-2017 The Pulsar-timing Likelihood ORF GWB PSD

    Intrinsic red- noise PSD [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) [ ](ak),(bl) = ab⇢k kl + ak ab kl
  37. Stephen Taylor WVU, 04-17-2017 The Pulsar-timing Likelihood ORF GWB PSD

    Intrinsic red- noise PSD [ F FT ]ij ' Z d fS ( f ) cos(2 ⇡f|ti tj | ) ⇢k = S(fk) f ! A2 gwb 12⇡2 T obs ✓ fk yr 1 ◆ yr2 [ ](ak),(bl) = ab⇢k kl + ak ab kl
  38. Stephen Taylor WVU, 04-17-2017 Fit for GW background, correlated clock-noise,

    dipole process due to imprecise solar-system ephemerides, and intrinsic low-frequency pulsar noise. Investigated consistency of constraints with astrophysical predictions from Sesana (2013). Upper limit cuts out 5% of plausible amplitude distribution. Detailed analysis of constraints on possible cosmic-string network and primordial GWs Detailed investigations of consistency of upper limits with Sesana (2013) and McWilliams, Ostriker, Pretorious (2014) predictions. Searched with a generalized turnover model to investigate “final-parsec” processes. Detailed analysis of constraints on possible cosmic-string network and primordial GWs Best published constraints to date. Used 4 pulsars, but limit dominated by J1909-3744 which is exceptionally well-timed with no measured low-frequency noise. Limit is in tension with our basic astrophysical predictions. Excludes 91 - 99.7% of range of basic predictions. Lentati, Taylor, Mingarelli et al. (2015) Arzoumanian et al. (2016) Shannon et al. (2015)
  39. Stephen Taylor WVU, 04-17-2017 Parametrizing the GWB angular power ab

    / (1 + ab) Z d2 ˆ ⌦P(ˆ ⌦) h F(ˆ ⌦)+ a F(ˆ ⌦)+ b + F(ˆ ⌦)⇥ a F(ˆ ⌦)⇥ b i = [Npsr ⇥ Npsr]
  40. Stephen Taylor WVU, 04-17-2017 Parametrizing the GWB angular power ab

    / (1 + ab) Z d2 ˆ ⌦P(ˆ ⌦) h F(ˆ ⌦)+ a F(ˆ ⌦)+ b + F(ˆ ⌦)⇥ a F(ˆ ⌦)⇥ b i = R · P · RT R ! [N psr ⇥ 2N pix ] P ! diag(2N pix ) R = pulsar response matrix [fixed] P = power in each pixel [parametrize] Mingarelli et al. (2013) Taylor & Gair (2013) Taylor & van Haasteren (in prep.) = [Npsr ⇥ Npsr]
  41. Stephen Taylor WVU, 04-17-2017 Detection Detection is a model-selection problem.

    We need to prove the presence of spatial correlations between pulsars. Compare Bayesian evidence for a model with Hellings and Downs correlations versus no correlations.
  42. Stephen Taylor WVU, 04-17-2017 Detection Detection is a model-selection problem.

    We need to prove the presence of spatial correlations between pulsars. Compare Bayesian evidence for a model with Hellings and Downs correlations versus no correlations. P12 = p(H1 |d) p(H2 |d) = p(d|H1) p(d|H2) p(H1) p(H2) Posterior odds ratio Bayes factor Prior odds ratio
  43. Stephen Taylor WVU, 04-17-2017 Detection Detection is a model-selection problem.

    We need to prove the presence of spatial correlations between pulsars. Compare Bayesian evidence for a model with Hellings and Downs correlations versus no correlations. P12 = p(H1 |d) p(H2 |d) = p(d|H1) p(d|H2) p(H1) p(H2) Posterior odds ratio Bayes factor Prior odds ratio MultiNest Thermodynamic integration RJMCMC Savage-Dickey ratio Product space
  44. Stephen Taylor WVU, 04-17-2017 0 20 40 60 80 100

    120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 1 Npairs = N(N 1)/2 Detection
  45. Stephen Taylor WVU, 04-17-2017 0 20 40 60 80 100

    120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 1 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 4 Npairs = N(N 1)/2 Detection
  46. Stephen Taylor WVU, 04-17-2017 0 20 40 60 80 100

    120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 1 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 4 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 10 Npairs = N(N 1)/2 Detection
  47. Stephen Taylor WVU, 04-17-2017 0 20 40 60 80 100

    120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 1 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 4 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 10 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 30 Npairs = N(N 1)/2 Detection
  48. Stephen Taylor WVU, 04-17-2017 0 20 40 60 80 100

    120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 1 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 4 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 10 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 30 0 20 40 60 80 100 120 140 160 180 pulsar angular separation [deg] -0.2 0.0 0.2 0.4 0.6 0.8 1.0 arrival time correlation N = 50 Npairs = N(N 1)/2 Detection
  49. Stephen Taylor WVU, 04-17-2017 Summary Pulsar-timing is a very rich

    area for statistical inference. We can do single-source searches and stochastic background searches all together in our pipelines. Current limits are cutting into astrophysically interesting territory. We can probe the dynamics and environments of supermassive black-holes binaries. Detection within 10 years (or even sooner) looks good.
  50. Stephen Taylor WVU, 04-17-2017 Code Demo We use TEMPO2 (https://bitbucket.org/psrsoft/

    tempo2.git) to read in “par” and “tim” files and to perform the timing-model analysis. Libstempo (https://github.com/vallis/libstempo) is a python module that wraps around TEMPO2. All our Bayesian codes interact with TEMPO2 using libstempo. Our production level codes include PAL2 (https:// github.com/jellis18/PAL2), NX01 (https://github.com/ stevertaylor/NX01), and Piccard (not in active development), and soon ENTERPRISE (https:// github.com/nanograv/enterprise).